18 — Well-Architected Framework

The 6 Pillars

AWS's framework for designing and operating workloads that are secure, reliable, efficient, cost-effective, and sustainable.

┌──────────────────────────────────────────┐
│        Well-Architected Framework        │
├────────┬────────┬────────┬───────────────┤
│Security│Reliab- │Perform-│    Cost       │
│        │ility   │ance    │ Optimization  │
├────────┴────────┴────────┴───────────────┤
│  Operational Excellence │ Sustainability │
└─────────────────────────┴────────────────┘

1. Operational Excellence

Run and monitor systems to deliver business value, and continually improve supporting processes and procedures.

Practice	How
IaC	CloudFormation, CDK, Terraform
CI/CD	Automated build, test, deploy
Observability	CloudWatch metrics, logs, alarms, X-Ray
Runbooks	Documented procedures for incidents
Small changes	Frequent, small deployments (not big-bang)
Learn from failure	Post-incident reviews, game days
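
To make the observability row concrete, here is a minimal sketch of an alarm definition kept in code, so it can be reviewed and versioned like any other change. The function name, the threshold, and the instance ID and SNS topic ARN are illustrative assumptions; the keys follow the parameters of CloudWatch's PutMetricAlarm API.

```python
def cpu_alarm_params(instance_id: str, sns_topic_arn: str,
                     threshold_pct: float = 80.0) -> dict:
    """Build keyword arguments for CloudWatch's PutMetricAlarm API."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,            # seconds per datapoint
        "EvaluationPeriods": 3,   # three consecutive breaches trigger ALARM
        "Threshold": threshold_pct,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # e.g. notify the on-call topic
    }


# Placeholder identifiers, not real resources.
params = cpu_alarm_params(
    "i-0123456789abcdef0",
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",
)
print(params["AlarmName"])
```

With boto3 installed and credentials configured, the dictionary could be applied with `boto3.client("cloudwatch").put_metric_alarm(**params)`; in practice the same alarm would more often be declared in CloudFormation, CDK, or Terraform.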

2. Security

Protect information, systems, and assets.

Identity:        IAM roles, least privilege, MFA
Detection:       CloudTrail, GuardDuty, AWS Config
Infrastructure:  VPC, security groups, WAF, Shield
Data:            Encryption at rest (KMS) + in transit (TLS)
Incident:        Automated response, runbooks

Key principle: Security at EVERY layer
  Edge (CloudFront + WAF)
    → VPC (NACLs + Security Groups)
      → Application (auth + input validation)
        → Data (encryption + access control)
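
At the data layer, least privilege usually means a policy that names one action on one resource. The sketch below builds such an IAM policy document and adds a small check for wildcard grants. The helper names, bucket name, and prefix are hypothetical; the document structure follows the standard IAM JSON policy format.

```python
import json


def read_only_bucket_policy(bucket: str, prefix: str = "") -> dict:
    """Identity policy that allows only object reads under one bucket prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadObjectsOnly",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            }
        ],
    }


def has_wildcard_action(policy: dict) -> bool:
    """Flag statements that grant '*' or 'service:*' (a common over-grant)."""
    for statement in policy["Statement"]:
        actions = statement["Action"]
        if isinstance(actions, str):
            actions = [actions]
        if any(a == "*" or a.endswith(":*") for a in actions):
            return True
    return False


policy = read_only_bucket_policy("example-reports-bucket", "monthly/")
print(json.dumps(policy, indent=2))
print(has_wildcard_action(policy))
```

A check like `has_wildcard_action` is only a rough lint; IAM Access Analyzer performs far more thorough policy validation.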

3. Reliability

Recover from failures and meet demand.

Practice	Implementation
Multi-AZ	Deploy across 2+ AZs
Auto-scaling	EC2 Auto Scaling groups, ECS Service Auto Scaling; Lambda scales automatically
Health checks	ALB health checks, Route 53 failover
Backups	RDS automated backups, S3 versioning
Chaos engineering	Simulate failures to test resilience
Loose coupling	SQS, EventBridge between services

Design for failure:
  Single AZ failure    → Multi-AZ deployment
  Single region failure → Multi-region (DR)
  Service failure       → Circuit breaker, retries, DLQ
  Data loss            → Backups, replication, versioning
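
The "service failure" row can be sketched in a few lines. This is a minimal, in-process illustration of retries with exponential backoff and jitter wrapped around a simple circuit breaker. All class and function names, thresholds, and delays are assumptions for the example; a message that exhausts its retries would normally be routed to a dead-letter queue, which is not modeled here.

```python
import random
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and calls are rejected immediately."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # cool-down in seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        self.failures = 0
        self.opened_at = None
        return result


def retry_with_backoff(fn, attempts=4, base_delay=0.1, max_delay=2.0, sleep=time.sleep):
    """Retry fn, sleeping a random time up to an exponentially growing cap."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # do not hammer a dependency the breaker has isolated
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(random.uniform(0, min(max_delay, base_delay * 2 ** attempt)))


# Usage: a dependency that fails twice, then recovers.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


breaker = CircuitBreaker(failure_threshold=5)
result = retry_with_backoff(lambda: breaker.call(flaky), sleep=lambda s: None)
print(result, calls["n"])
```

Retries absorb short transient faults, while the breaker stops callers from piling onto a dependency that is failing persistently.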

4. Performance Efficiency

Use resources efficiently as demand changes.

Area	Strategy
Compute	Right-size instances, use Graviton (ARM), serverless
Storage	Match S3 storage class and EBS volume type (gp3 vs io2) to access patterns
Database	Read replicas, caching (ElastiCache/DAX), Aurora Serverless
Networking	CloudFront CDN, VPC endpoints, Global Accelerator

Caching layers:
  Browser cache → CloudFront → API Gateway cache → ElastiCache/DAX → Database

Reduce latency at every layer.
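
The layered lookup above can be sketched as a read-through chain: check the nearest cache first, fall back layer by layer, and backfill every layer that missed. The classes below are in-memory stand-ins, not clients for CloudFront, API Gateway, or ElastiCache; names and TTL values are assumptions.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._items[key]  # expired: treat as a miss
            return None
        return value

    def put(self, key, value):
        self._items[key] = (self.clock() + self.ttl, value)


def read_through(key, layers, load_from_db):
    """Check each cache layer in order; on a full miss, load from the database.

    Note: a stored value of None is indistinguishable from a miss here.
    """
    missed = []
    for cache in layers:
        value = cache.get(key)
        if value is not None:
            break
        missed.append(cache)
    else:
        value = load_from_db(key)  # every layer missed
    for cache in missed:
        cache.put(key, value)  # backfill the layers closer to the caller
    return value


# Usage: the second read never reaches the database.
db_reads = []


def load_from_db(key):
    db_reads.append(key)
    return f"row-for-{key}"


edge = TTLCache(ttl_seconds=60)    # short TTL near the user
app = TTLCache(ttl_seconds=300)    # longer TTL near the data
first = read_through("user:42", [edge, app], load_from_db)
second = read_through("user:42", [edge, app], load_from_db)
print(first, len(db_reads))
```

Shorter TTLs on the outer layers limit how long stale data can be served, at the cost of more requests reaching the inner layers.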

5. Cost Optimization

Avoid unnecessary costs.

Right-sizing:
  ✅ Use CloudWatch to identify underutilized resources
  ✅ Downsize oversized instances
  ✅ Use Graviton instances where the workload runs on ARM (AWS cites up to 20% lower cost than comparable x86 instances, often with better price-performance)
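
As a rough illustration of the first step, the sketch below flags instances whose average and peak CPU both stay low. The utilization samples are made-up data standing in for CloudWatch CPUUtilization datapoints, and the thresholds are assumptions, not AWS recommendations.

```python
from statistics import mean


def rightsizing_candidates(cpu_by_instance, avg_threshold=20.0, peak_threshold=40.0):
    """Return IDs of instances whose average AND peak CPU are below the thresholds."""
    candidates = []
    for instance_id, samples in cpu_by_instance.items():
        if not samples:
            continue  # no data: do not guess
        if mean(samples) < avg_threshold and max(samples) < peak_threshold:
            candidates.append(instance_id)
    return sorted(candidates)


# Hypothetical CPU utilization samples (percent).
usage = {
    "i-web-1": [12.0, 9.5, 15.2, 11.1],    # consistently idle
    "i-web-2": [55.0, 71.3, 64.8, 80.2],   # busy
    "i-batch-1": [5.0, 4.2, 95.0, 6.1],    # idle on average, but spiky
}
candidates = rightsizing_candidates(usage)
print(candidates)  # ['i-web-1']
```

Checking the peak as well as the average avoids downsizing spiky workloads such as the batch instance above. CPU alone is an incomplete signal; memory, network, and disk metrics should be reviewed before changing instance size.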

Pricing models:
  ✅ Reserved Instances / Savings Plans for steady workloads
  ✅ Spot for fault-tolerant batch jobs
  ✅ Serverless for variable traffic (pay-per-use)

Eliminate waste:
  ✅ Delete unused EBS volumes, EIPs, snapshots
  ✅ S3 lifecycle policies for old data
  ✅ Auto-stop dev/test environments off-hours
  ✅ Use AWS Cost Explorer + Budgets + anomaly detection
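
An S3 lifecycle policy for old data might look like the sketch below: move objects to cheaper storage classes as they age, then expire them. The function name, prefix, and day counts are illustrative assumptions; the dictionary shape matches what S3's PutBucketLifecycleConfiguration API expects.

```python
def archive_then_expire_rule(prefix, ia_after_days=30, glacier_after_days=90,
                             expire_after_days=365):
    """Build a lifecycle configuration: Standard-IA, then Glacier, then delete."""
    if not ia_after_days < glacier_after_days < expire_after_days:
        raise ValueError("day counts must increase: IA < Glacier < expiration")
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix.strip('/') or 'all'}",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # only objects under this prefix
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


config = archive_then_expire_rule("logs/")
print(config["Rules"][0]["ID"])
```

With boto3 installed and credentials configured, this could be applied with `boto3.client("s3").put_bucket_lifecycle_configuration(Bucket="your-bucket", LifecycleConfiguration=config)`, where `your-bucket` is a placeholder.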

6. Sustainability

Minimize environmental impact.

Practice	Implementation
Right-size	Don't over-provision
Serverless	Resources used only when needed
Efficient code	Optimize algorithms, reduce compute time
Graviton	More efficient ARM processors
Regions	Prefer Regions with lower carbon intensity, where latency and compliance allow
Data lifecycle	Delete unused data, compress, archive

Architecture Review Checklist

□ Security: IAM least privilege, encryption, MFA, WAF
□ Reliability: Multi-AZ, backups, health checks, auto-scaling
□ Performance: Caching, CDN, right-sized resources
□ Cost: Reserved/Savings Plans, no idle resources, lifecycle policies
□ Operations: IaC, CI/CD, monitoring, alerting, runbooks
□ Sustainability: Right-sized, serverless where possible

Tools:
  - AWS Well-Architected Tool (free review in console)
  - AWS Trusted Advisor (best-practice checks; the available set depends on your support plan)
  - AWS Cost Explorer (cost analysis)

Key Takeaways

6 pillars: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, Sustainability
Security at every layer — edge, network, application, data
Design for failure — Multi-AZ, auto-scaling, backups, loose coupling
Right-size everything — use CloudWatch data to match resources to actual usage
Automate everything — IaC, CI/CD, scaling, incident response
Use the AWS Well-Architected Tool to review your workloads against best practices
Cost optimization is ongoing — review monthly with Cost Explorer and Trusted Advisor
