18 — Well-Architected Framework
The 6 Pillars
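The AWS Well-Architected Framework is a set of design principles and best practices, organized into six pillars, for building workloads that are secure, reliable, efficient, cost-effective, sustainable, and well operated.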
┌──────────────────────────────────────────┐
│        Well-Architected Framework        │
├────────┬────────┬────────┬───────────────┤
│Security│Reliab- │Perform-│     Cost      │
│        │ility   │ance    │ Optimization  │
├────────┴────────┴────────┼───────────────┤
│  Operational Excellence  │Sustainability │
└──────────────────────────┴───────────────┘
1. Operational Excellence
Run and monitor systems to deliver business value.
| Practice | How |
|---|---|
| IaC | CloudFormation, CDK, Terraform |
| CI/CD | Automated build, test, deploy |
| Observability | CloudWatch metrics, logs, alarms, X-Ray |
| Runbooks | Documented procedures for incidents |
| Small changes | Frequent, small deployments (not big-bang) |
| Learn from failure | Post-incident reviews, game days |
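As an illustration of the Observability row, the sketch below builds the arguments for a CloudWatch alarm on EC2 CPU. The function name `cpu_alarm_params`, the instance ID, the SNS topic ARN, and the thresholds are all hypothetical; only the parameter names follow CloudWatch's PutMetricAlarm API. It does not call AWS, so it runs without credentials.

```python
from typing import Any, Dict


def cpu_alarm_params(instance_id: str, sns_topic_arn: str,
                     threshold_percent: float = 80.0) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch's PutMetricAlarm API.

    The alarm fires when average CPU stays above the threshold for three
    consecutive 5-minute periods. All defaults here are illustrative.
    """
    if not 0 < threshold_percent <= 100:
        raise ValueError("threshold_percent must be in (0, 100]")
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,              # seconds per evaluation period
        "EvaluationPeriods": 3,     # consecutive breaching periods required
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # notify this SNS topic on ALARM
    }


# Placeholder identifiers, not real resources.
params = cpu_alarm_params(
    "i-0123456789abcdef0",
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",
)
# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

Keeping the alarm definition in code rather than clicking it together in the console is the same idea as the IaC row: the definition can be reviewed, versioned, and reapplied.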
2. Security
Protect information, systems, and assets.
Identity: IAM roles, least privilege, MFA
Detection: CloudTrail, GuardDuty, Config
Infrastructure: VPC, security groups, WAF, Shield
Data: Encryption at rest (KMS) + in transit (TLS)
Incident: Automated response, runbooks
Key principle: Security at EVERY layer
Edge (CloudFront + WAF)
→ VPC (NACLs + Security Groups)
→ Application (auth + input validation)
→ Data (encryption + access control)
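To make "least privilege" concrete at the data layer, here is a minimal sketch of an IAM policy document that grants read access to a single S3 prefix and nothing else. The helper name `read_only_prefix_policy` and the bucket and prefix values are hypothetical; the policy structure (Version, Statement, Effect, Action, Resource, Condition) is standard IAM JSON.

```python
import json


def read_only_prefix_policy(bucket: str, prefix: str) -> dict:
    """Return an IAM policy allowing reads under one S3 prefix only."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket, narrowed by prefix.
                "Sid": "ListOnlyThisPrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Object reads are granted only on keys under the prefix.
                "Sid": "ReadObjectsUnderPrefix",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


# Hypothetical bucket and prefix, for illustration only.
policy = read_only_prefix_policy("example-reports-bucket", "monthly/")
print(json.dumps(policy, indent=2))
```

The policy names specific actions and a specific resource path instead of `s3:*` on `*`, so a leaked credential exposes one prefix rather than the whole account.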
3. Reliability
Recover from failures and meet demand.
| Practice | Implementation |
|---|---|
| Multi-AZ | Deploy across 2+ AZs |
| Auto-scaling | EC2 Auto Scaling groups, ECS Service Auto Scaling, Lambda (scales automatically) |
| Health checks | ALB health checks, Route 53 failover |
| Backups | RDS automated backups, S3 versioning |
| Chaos engineering | Simulate failures to test resilience |
| Loose coupling | SQS, EventBridge between services |
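A rough way to see why the Multi-AZ row matters is to estimate combined availability. The sketch below assumes each AZ deployment fails independently and that any one healthy copy can serve traffic; real AZs only approximate independence, and the 99% single-copy figure is a hypothetical input, not an AWS number.

```python
def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant deployments, any one of which suffices.

    Assumes independent failures: the system is down only if every copy is down.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - (1.0 - single) ** copies


def downtime_minutes_per_year(availability: float) -> float:
    """Expected unavailable minutes in a 365-day year."""
    return (1.0 - availability) * 365 * 24 * 60


for n in (1, 2, 3):
    a = combined_availability(0.99, n)  # 0.99 is an assumed per-AZ figure
    print(f"{n} AZ(s): availability={a:.6f}, "
          f"downtime={downtime_minutes_per_year(a):.1f} min/year")
```

Under these assumptions each added AZ cuts expected downtime by a factor of 100, which is why going from one AZ to two is usually the largest single reliability gain.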
Design for failure:
Single AZ failure → Multi-AZ deployment
Single region failure → Multi-region (DR)
Service failure → Circuit breaker, retries, DLQ
Data loss → Backups, replication, versioning
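The "Service failure" line can be sketched in plain Python: retries with exponential backoff and jitter for transient errors, and a circuit breaker that fails fast once a dependency looks down. All class and function names below are illustrative, not from an AWS SDK, and the thresholds are assumptions. A DLQ is out of scope here, since it would be configured on the queue or function.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and calls are rejected without trying."""


class CircuitBreaker:
    """Opens after `max_failures` consecutive errors; allows a trial call
    once `reset_after` seconds have passed."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        self._failures = 0          # success closes the circuit
        self._opened_at = None
        return result


def retry_with_backoff(func: Callable[[], T], attempts: int = 4,
                       base_delay: float = 0.2,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `func`, retrying failures with exponential backoff and full jitter."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise                   # retrying an open circuit is pointless
        except Exception:
            if attempt == attempts - 1:
                raise               # out of attempts: surface the error
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

A typical combination is `retry_with_backoff(lambda: breaker.call(fetch))`, where `fetch` is whatever remote call is being protected: transient errors are retried, while an open circuit stops the retries immediately.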
4. Performance Efficiency
Use resources efficiently as demand changes.
| Area | Strategy |
|---|---|
| Compute | Right-size instances, use Graviton (ARM), serverless |
| Storage | Right storage class (S3 tiers), EBS type (gp3 vs io2) |
| Database | Read replicas, caching (ElastiCache/DAX), Aurora Serverless |
| Networking | CloudFront CDN, VPC endpoints, Global Accelerator |
Caching layers:
Browser cache → CloudFront → API Gateway cache → ElastiCache/DAX → Database
Reduce latency at every layer.
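Each hop in that chain follows the same cache-aside pattern: check the fast tier, fall back to the slower one on a miss, then store the result with an expiry. A minimal in-process sketch follows; `TTLCache` is a hypothetical stand-in for a tier such as ElastiCache, and `slow_lookup` is a made-up function standing in for a database query.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny cache-aside helper with per-entry expiry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]              # hit: skip the slower tier entirely
        value = loader(key)              # miss or expired: ask the slower tier
        self._items[key] = (now + self._ttl, value)
        return value


def slow_lookup(key: str) -> str:
    """Hypothetical stand-in for a database read."""
    return f"row-for-{key}"


cache = TTLCache(ttl_seconds=60)
first = cache.get_or_load("user:42", slow_lookup)   # loads from slow_lookup
second = cache.get_or_load("user:42", slow_lookup)  # served from the cache
```

The TTL is the main design choice: a longer value raises the hit rate and lowers load on the database, at the cost of serving staler data.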
5. Cost Optimization
Avoid unnecessary costs.
Right-sizing:
✅ Use CloudWatch to identify underutilized resources
✅ Downsize oversized instances
✅ Use Graviton instances (typically up to about 20% cheaper than comparable x86 instances, often with better price-performance; benchmark your own workload)
Pricing models:
✅ Reserved Instances / Savings Plans for steady workloads
✅ Spot for fault-tolerant batch jobs
✅ Serverless for variable traffic (pay-per-use)
Eliminate waste:
✅ Delete unused EBS volumes, EIPs, snapshots
✅ S3 lifecycle policies for old data
✅ Auto-stop dev/test environments off-hours
✅ Use AWS Cost Explorer + Budgets + anomaly detection
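The right-sizing step can be sketched as a filter over utilization history, such as daily CPU statistics exported from CloudWatch. The data class, the function name, and the thresholds (10% average, 40% peak, 14 days of history) are assumptions chosen for illustration; pick values that match your own workloads.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class InstanceUsage:
    instance_id: str
    daily_avg_cpu: List[float]  # percent, one value per day
    daily_max_cpu: List[float]  # percent, one value per day


def downsize_candidates(fleet: List[InstanceUsage],
                        avg_below: float = 10.0,
                        peak_below: float = 40.0,
                        min_days: int = 14) -> List[str]:
    """Return IDs whose average AND peak CPU stayed low over enough days."""
    flagged = []
    for usage in fleet:
        if (len(usage.daily_avg_cpu) < min_days
                or len(usage.daily_max_cpu) < min_days):
            continue  # not enough history to judge safely
        if (mean(usage.daily_avg_cpu) < avg_below
                and max(usage.daily_max_cpu) < peak_below):
            flagged.append(usage.instance_id)
    return flagged


# Made-up sample data, not real measurements.
fleet = [
    InstanceUsage("i-idle", [4.0] * 14, [12.0] * 14),
    InstanceUsage("i-bursty", [6.0] * 14, [35.0] * 13 + [95.0]),
    InstanceUsage("i-new", [3.0] * 5, [8.0] * 5),
]
print(downsize_candidates(fleet))  # ['i-idle']
```

Checking the peak as well as the average keeps bursty instances off the list: `i-bursty` looks idle on average but would be starved during its spike if downsized.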
6. Sustainability
Minimize environmental impact.
| Practice | Implementation |
|---|---|
| Right-size | Don't over-provision |
| Serverless | Resources used only when needed |
| Efficient code | Optimize algorithms, reduce compute time |
| Graviton | More efficient ARM processors |
| Regions | Prefer Regions with lower-carbon energy where latency and compliance requirements allow |
| Data lifecycle | Delete unused data, compress, archive |
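The Data lifecycle row maps naturally onto an S3 lifecycle configuration. The sketch below builds one rule that moves objects to cheaper storage classes and eventually deletes them. The helper name, rule ID, prefix, and day counts are hypothetical; the dictionary keys follow the S3 lifecycle configuration format. It only constructs the configuration and does not call AWS.

```python
from typing import Any, Dict


def lifecycle_rule(rule_id: str, prefix: str,
                   ia_after_days: int = 30,
                   archive_after_days: int = 90,
                   expire_after_days: int = 365) -> Dict[str, Any]:
    """Build one S3 lifecycle rule: Standard-IA, then Glacier, then delete."""
    if ia_after_days < 30:
        # S3 requires objects to be at least 30 days old before
        # transitioning to Standard-IA.
        raise ValueError("ia_after_days must be at least 30")
    if not ia_after_days < archive_after_days < expire_after_days:
        raise ValueError("day counts must be strictly increasing")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},   # apply only to keys under this prefix
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": archive_after_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_after_days},
    }


# Hypothetical rule for log objects.
config = {"Rules": [lifecycle_rule("tier-and-expire-logs", "logs/")]}
# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=config)
```

Because the rule runs inside S3, old data is archived and deleted without any scheduled job of your own consuming compute.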
Architecture Review Checklist
□ Security: IAM least privilege, encryption, MFA, WAF
□ Reliability: Multi-AZ, backups, health checks, auto-scaling
□ Performance: Caching, CDN, right-sized resources
□ Cost: Reserved/Savings Plans, no idle resources, lifecycle policies
□ Operations: IaC, CI/CD, monitoring, alerting, runbooks
□ Sustainability: Right-sized, serverless where possible
Tools:
- AWS Well-Architected Tool (free review in console)
- AWS Trusted Advisor (best-practice checks; the full set requires a Business or Enterprise Support plan)
- AWS Cost Explorer (cost analysis)
Key Takeaways
- 6 pillars: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, Sustainability
- Security at every layer — edge, network, application, data
- Design for failure — Multi-AZ, auto-scaling, backups, loose coupling
- Right-size everything — use CloudWatch data to match resources to actual usage
- Automate everything — IaC, CI/CD, scaling, incident response
- Use the AWS Well-Architected Tool to review your workloads against best practices
- Cost optimization is ongoing — review monthly with Cost Explorer and Trusted Advisor