18 — Well-Architected Framework

The 6 Pillars

AWS's framework for building and operating secure, reliable, efficient, cost-effective, and sustainable workloads.

┌───────────────────────────────────────────────────────┐
│               Well-Architected Framework              │
├──────────┬─────────────┬─────────────┬────────────────┤
│ Security │ Reliability │ Performance │      Cost      │
│          │             │ Efficiency  │  Optimization  │
├──────────┴─────────────┼─────────────┴────────────────┤
│ Operational Excellence │        Sustainability        │
└────────────────────────┴──────────────────────────────┘

1. Operational Excellence

Run and monitor systems to deliver business value.

Practice            How
IaC                 CloudFormation, CDK, Terraform
CI/CD               Automated build, test, deploy
Observability       CloudWatch metrics, logs, alarms, X-Ray
Runbooks            Documented procedures for incidents
Small changes       Frequent, small deployments (not big-bang)
Learn from failure  Post-incident reviews, game days
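
To make the IaC and observability rows concrete, here is a minimal sketch that builds a CloudFormation template containing one CloudWatch alarm. The function name, logical ID, instance ID, SNS topic ARN, and the 80% threshold are all illustrative assumptions, not values from this chapter; the block only produces JSON and does not call AWS.

```python
import json


def cpu_alarm_template(instance_id: str, topic_arn: str, threshold: float = 80.0) -> dict:
    """Build a CloudFormation template with one CloudWatch CPU alarm."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "High-CPU alarm for a single EC2 instance (illustrative).",
        "Resources": {
            "HighCpuAlarm": {  # logical ID chosen for this example
                "Type": "AWS::CloudWatch::Alarm",
                "Properties": {
                    "AlarmDescription": "Average CPU above threshold for 10 minutes",
                    "Namespace": "AWS/EC2",
                    "MetricName": "CPUUtilization",
                    "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
                    "Statistic": "Average",
                    "Period": 300,           # seconds per datapoint
                    "EvaluationPeriods": 2,  # 2 x 300 s = 10 minutes
                    "Threshold": threshold,
                    "ComparisonOperator": "GreaterThanThreshold",
                    "AlarmActions": [topic_arn],  # e.g. an SNS topic that pages on-call
                },
            }
        },
    }


if __name__ == "__main__":
    # Placeholder identifiers; substitute real ones before deploying.
    template = cpu_alarm_template(
        "i-0123456789abcdef0",
        "arn:aws:sns:us-east-1:123456789012:ops-alerts",
    )
    print(json.dumps(template, indent=2))
```

Keeping the alarm in a template means it is reviewed, versioned, and deployed through the same CI/CD pipeline as the rest of the stack.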

2. Security

Protect information, systems, and assets.

Identity:        IAM roles, least privilege, MFA
Detection:       CloudTrail, GuardDuty, AWS Config
Infrastructure:  VPC, security groups, WAF, Shield
Data:            Encryption at rest (KMS) + in transit (TLS)
Incident:        Automated response, runbooks

Key principle: Security at EVERY layer
  Edge (CloudFront + WAF)
    → VPC (NACLs + Security Groups)
      → Application (auth + input validation)
        → Data (encryption + access control)
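
As a rough illustration of least privilege at the identity layer, the sketch below scans an IAM policy document for Allow statements that use a bare `*` for Action or Resource. The policy, bucket name, and helper names are hypothetical, and the check is deliberately simple: it would not catch service-wide wildcards such as `s3:*`.

```python
from typing import Any


def as_list(value: Any) -> list:
    """IAM accepts a single string or a list for Action and Resource."""
    return value if isinstance(value, list) else [value]


def find_overbroad_statements(policy: dict) -> list[str]:
    """Return findings for Allow statements that use a bare '*' wildcard."""
    findings = []
    for index, stmt in enumerate(as_list(policy.get("Statement", []))):
        if stmt.get("Effect") != "Allow":
            continue
        label = stmt.get("Sid", f"statement #{index}")
        if "*" in as_list(stmt.get("Action", [])):
            findings.append(f"{label}: Action is '*'")
        if "*" in as_list(stmt.get("Resource", [])):
            findings.append(f"{label}: Resource is '*'")
    return findings


# Hypothetical policy: one scoped read-only statement and one overbroad statement.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadReports",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/reports/*",
        },
        {
            "Sid": "TooBroad",
            "Effect": "Allow",
            "Action": "*",
            "Resource": "*",
        },
    ],
}

for finding in find_overbroad_statements(policy):
    print(finding)
```

Only the `TooBroad` statement is reported; the scoped statement passes because its resource is limited to one prefix. In practice, IAM Access Analyzer performs far more thorough policy validation than this sketch.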

3. Reliability

Recover from failures and meet demand.

Practice           Implementation
Multi-AZ           Deploy across 2+ AZs
Auto-scaling       EC2 Auto Scaling groups, ECS Service Auto Scaling; Lambda scales automatically
Health checks      ALB health checks, Route 53 failover
Backups            RDS automated backups, S3 versioning
Chaos engineering  Simulate failures to test resilience
Loose coupling     SQS, EventBridge between services

Design for failure:
  Single AZ failure     → Multi-AZ deployment
  Single region failure → Multi-region (DR)
  Service failure       → Circuit breaker, retries, DLQ
  Data loss             → Backups, replication, versioning
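
The "retries, DLQ" response to a service failure can be sketched in plain Python. This is a simplified, in-process illustration under stated assumptions: the function and parameter names are invented for the example, the dead-letter queue is just a list, and a circuit breaker is left out for brevity. With SQS, the redrive policy on the queue handles the dead-letter step for you.

```python
import random
import time
from typing import Callable


def process_with_retries(
    message: str,
    handler: Callable[[str], None],
    dead_letters: list,
    max_attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try handler up to max_attempts times; park the message if all attempts fail."""
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
            return True
        except Exception as error:  # broad on purpose: any failure triggers a retry
            if attempt == max_attempts:
                dead_letters.append((message, repr(error)))
                return False
            # Exponential backoff with full jitter: wait 0..base_delay * 2^(attempt-1)
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
    return False


def make_flaky(failures: int) -> Callable[[str], None]:
    """Return a handler that raises for its first `failures` calls, then succeeds."""
    remaining = {"count": failures}

    def handler(message: str) -> None:
        if remaining["count"] > 0:
            remaining["count"] -= 1
            raise ConnectionError("simulated downstream outage")

    return handler


def no_sleep(seconds: float) -> None:
    """Skip real waiting in this demo."""


dlq: list = []
print(process_with_retries("order-1", make_flaky(2), dlq, sleep=no_sleep))  # recovers on the 3rd try
print(process_with_retries("order-2", make_flaky(5), dlq, sleep=no_sleep))  # exhausted, parked in dlq
print(dlq)
```

Jitter matters because many clients retrying on the same schedule can overload a service that is just recovering.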

4. Performance Efficiency

Use resources efficiently as demand changes.

Area        Strategy
Compute     Right-size instances, use Graviton (ARM), serverless
Storage     Right storage class (S3 tiers), EBS type (gp3 vs io2)
Database    Read replicas, caching (ElastiCache/DAX), Aurora Serverless
Networking  CloudFront CDN, VPC endpoints, Global Accelerator

Caching layers:
  Browser cache → CloudFront → API Gateway cache → ElastiCache/DAX → Database

Reduce latency at every layer.
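
Each layer in the chain above follows the same cache-aside idea: answer from the fast layer when possible, otherwise fall through to the slower one and remember the result for a limited time. The sketch below shows that pattern with an in-memory dictionary standing in for ElastiCache; the class name, TTL value, and `slow_database_lookup` are assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Tiny in-memory cache-aside helper with per-entry expiry."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, object]] = {}  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key: str, loader: Callable[[str], object]) -> object:
        entry = self._entries.get(key)
        now = self.clock()
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired: fall through to the slower layer, then cache the result.
        self.misses += 1
        value = loader(key)
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


def slow_database_lookup(key: str) -> str:
    """Stand-in for the database at the end of the chain."""
    return f"row-for-{key}"


cache = TtlCache(ttl_seconds=60)
cache.get_or_load("user:42", slow_database_lookup)  # miss: loads from the "database"
cache.get_or_load("user:42", slow_database_lookup)  # hit: served from memory
print(cache.hits, cache.misses)
```

The TTL is the main design knob: longer values raise the hit rate but let stale data live longer.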

5. Cost Optimization

Avoid unnecessary costs.

Right-sizing:
  ✅ Use CloudWatch to identify underutilized resources
  ✅ Downsize oversized instances
  ✅ Use Graviton instances (AWS cites up to 20% lower cost than comparable x86; benchmark your workload)
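
A minimal sketch of how utilization data might drive a right-sizing decision. The CPU samples would normally come from CloudWatch's `CPUUtilization` metric; here they are hard-coded, and the thresholds (20% average, 40% peak, 90% peak) and instance names are hypothetical, not AWS recommendations.

```python
from statistics import mean


def rightsizing_advice(cpu_samples: list[float], low_avg: float = 20.0, low_peak: float = 40.0) -> str:
    """Classify an instance from CPU utilization percentages (0-100)."""
    if not cpu_samples:
        return "no data"
    average, peak = mean(cpu_samples), max(cpu_samples)
    if average < low_avg and peak < low_peak:
        return "downsize candidate"
    if peak > 90.0:
        return "consider upsizing or scaling out"
    return "keep as is"


# Made-up samples for three instances.
fleet = {
    "web-1": [5.0, 8.0, 12.0, 30.0],
    "batch-1": [60.0, 95.0, 70.0],
    "api-1": [35.0, 50.0, 45.0],
}

for name, samples in fleet.items():
    print(name, "->", rightsizing_advice(samples))
```

CPU alone is an incomplete signal; memory, network, and disk metrics should be checked before resizing.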

Pricing models:
  ✅ Reserved Instances / Savings Plans for steady workloads
  ✅ Spot for fault-tolerant batch jobs
  ✅ Serverless for variable traffic (pay-per-use)
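
Why commitments suit steady workloads comes down to simple arithmetic, sketched below. The hourly rates are hypothetical placeholders, not real AWS prices, and the model is simplified: it treats a commitment as a discounted rate billed for every hour of a 730-hour month, whether or not the instance runs.

```python
def monthly_cost(
    hours_used: float,
    on_demand_rate: float,
    committed_rate: float,
    hours_in_month: float = 730.0,
) -> dict:
    """Compare paying per hour used against paying a committed rate for every hour."""
    on_demand = hours_used * on_demand_rate
    committed = hours_in_month * committed_rate  # billed whether used or not
    return {
        "on_demand": round(on_demand, 2),
        "committed": round(committed, 2),
        "cheaper": "committed" if committed < on_demand else "on_demand",
    }


def break_even_utilization(on_demand_rate: float, committed_rate: float) -> float:
    """Fraction of the month the instance must run for the commitment to pay off."""
    return committed_rate / on_demand_rate


# Hypothetical rates in USD per hour.
ON_DEMAND, COMMITTED = 0.10, 0.06

print(monthly_cost(730, ON_DEMAND, COMMITTED))  # always-on service
print(monthly_cost(200, ON_DEMAND, COMMITTED))  # dev box used part-time
print(round(break_even_utilization(ON_DEMAND, COMMITTED), 2))
```

With these placeholder rates, the commitment wins only when the instance runs more than about 60% of the month, which is why part-time and spiky workloads fit on-demand, Spot, or serverless better.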

Eliminate waste:
  ✅ Delete unused EBS volumes, EIPs, snapshots
  ✅ S3 lifecycle policies for old data
  ✅ Auto-stop dev/test environments off-hours
  ✅ Use AWS Cost Explorer + Budgets + anomaly detection
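
An S3 lifecycle policy for old data might look like the following sketch. The dictionary mirrors the shape that boto3's `put_bucket_lifecycle_configuration` accepts as `LifecycleConfiguration`, but the block only builds and prints the configuration; the `logs/` prefix, the day counts, and the helper name are assumptions for illustration.

```python
import json


def lifecycle_rule(prefix: str, ia_after: int, archive_after: int, expire_after: int) -> dict:
    """Build one lifecycle rule: Standard -> Standard-IA -> Glacier -> delete."""
    if ia_after < 30:
        # S3 rejects transitions to STANDARD_IA earlier than 30 days after creation.
        raise ValueError("ia_after must be at least 30 days")
    if not ia_after < archive_after < expire_after:
        raise ValueError("expected ia_after < archive_after < expire_after")
    return {
        "ID": f"tier-and-expire-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after, "StorageClass": "STANDARD_IA"},
            {"Days": archive_after, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_after},
        # Also clean up failed multipart uploads, a common source of hidden cost.
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
    }


config = {"Rules": [lifecycle_rule("logs/", ia_after=30, archive_after=90, expire_after=365)]}
print(json.dumps(config, indent=2))
```

Validating the day ordering locally catches mistakes before the API call does, and keeps the policy reviewable as code.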

6. Sustainability

Minimize environmental impact.

Practice        Implementation
Right-size      Don't over-provision
Serverless      Resources used only when needed
Efficient code  Optimize algorithms, reduce compute time
Graviton        More efficient ARM processors
Regions         Choose regions with renewable energy
Data lifecycle  Delete unused data, compress, archive

Architecture Review Checklist

□ Security: IAM least privilege, encryption, MFA, WAF
□ Reliability: Multi-AZ, backups, health checks, auto-scaling
□ Performance: Caching, CDN, right-sized resources
□ Cost: Reserved/Savings Plans, no idle resources, lifecycle policies
□ Operations: IaC, CI/CD, monitoring, alerting, runbooks
□ Sustainability: Right-sized, serverless where possible

Tools:
  - AWS Well-Architected Tool (free review in console)
  - AWS Trusted Advisor (checks for best practices)
  - AWS Cost Explorer (cost analysis)
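
For teams that want to track the checklist above between formal reviews, one option is to keep it as data and report what is still open. This is only a sketch: the `CHECKLIST` structure and `open_items` helper are invented for the example and are unrelated to the AWS Well-Architected Tool's own data model.

```python
CHECKLIST: dict[str, list[str]] = {
    "Security": ["IAM least privilege", "encryption", "MFA", "WAF"],
    "Reliability": ["Multi-AZ", "backups", "health checks", "auto-scaling"],
    "Performance": ["caching", "CDN", "right-sized resources"],
    "Cost": ["Reserved/Savings Plans", "no idle resources", "lifecycle policies"],
    "Operations": ["IaC", "CI/CD", "monitoring", "alerting", "runbooks"],
    "Sustainability": ["right-sized", "serverless where possible"],
}


def open_items(done: dict[str, set[str]]) -> dict[str, list[str]]:
    """Return, per pillar, the checklist items not yet marked done."""
    gaps = {}
    for pillar, items in CHECKLIST.items():
        missing = [item for item in items if item not in done.get(pillar, set())]
        if missing:
            gaps[pillar] = missing
    return gaps


# Hypothetical review state for one workload.
done = {
    "Security": {"IAM least privilege", "encryption", "MFA", "WAF"},
    "Reliability": {"Multi-AZ", "backups"},
}

for pillar, missing in open_items(done).items():
    print(f"{pillar}: {', '.join(missing)}")
```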

Key Takeaways

  • 6 pillars: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, Sustainability
  • Security at every layer — edge, network, application, data
  • Design for failure — Multi-AZ, auto-scaling, backups, loose coupling
  • Right-size everything — use CloudWatch data to match resources to actual usage
  • Automate everything — IaC, CI/CD, scaling, incident response
  • Use the AWS Well-Architected Tool to review your workloads against best practices
  • Cost optimization is ongoing — review monthly with Cost Explorer and Trusted Advisor