By the end of this chapter you will:
It is technically possible to store a file as a BYTEA column in PostgreSQL. Don't. There's an entire category of infrastructure, object storage, designed for one job: store bytes, retrieve bytes, charge by GB. Use the right tool.
| Problem | What goes wrong |
|---|---|
| Backup bloat | A 50 GB DB suddenly becomes 500 GB; nightly backups fall over. |
| Slow queries | SELECT * accidentally returns megabytes per row. |
| Memory pressure | PostgreSQL's TOAST mechanism pages large blobs in/out — wasted I/O. |
| Caching impossible | You cannot put a CDN in front of a database. |
| Scaling blocked | Read replicas lag because they replicate huge blobs. |
The DB stores only the metadata pointer. The object storage stores the bytes.
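In application code, this split might look roughly like the sketch below. The `FileMetadata` type and `toMetadata` helper are hypothetical names, with fields that mirror the example row; the point is that the record holds a pointer, a size and a checksum, and never the bytes.

```typescript
// Hypothetical shape of the metadata row; field names mirror the example row.
interface FileMetadata {
  id: string;
  ownerId: string;
  s3Key: string;
  sizeBytes: number;
  sha256: string;
  uploadedAt: string; // ISO 8601 timestamp
}

// Builds the record that goes into PostgreSQL. The file bytes are never part
// of it; only their location, size and checksum are.
function toMetadata(
  id: string,
  ownerId: string,
  s3Key: string,
  sizeBytes: number,
  sha256: string,
  uploadedAt: Date,
): FileMetadata {
  if (!Number.isInteger(sizeBytes) || sizeBytes < 0) {
    throw new Error('sizeBytes must be a non-negative integer');
  }
  return { id, ownerId, s3Key, sizeBytes, sha256, uploadedAt: uploadedAt.toISOString() };
}
```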
PostgreSQL row:
id: a3f1...
owner_id: restaurant-123
s3_key: menu/restaurant-123/a3f1.jpg
size_bytes: 482311
sha256: a92c...
uploaded_at: 2026-05-02T10:23:01Z
S3 object:
Bucket: quickbite-assets-prod
Key: menu/restaurant-123/a3f1.jpg
Body: <binary bytes>
App servers are ephemeral: every deploy spins up new containers, and the old disk is gone. App servers are also horizontally scaled: Box A wrote the file, but the next request lands on Box B, which can't see it. Local disk has no replication, no encryption at rest, and no lifecycle rules.
12-factor (Chapter 16): treat your processes as stateless. The disk under your app is scratch space, not a file system.
Buckets are a flat namespace. Renaming a bucket means copying every object and updating every reference. Decide the layout once.
quickbite-assets-dev
quickbite-assets-staging
quickbite-assets-prod
Never share a bucket between dev and prod: a bug in dev that lists all objects will find production data.
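One way to keep environments apart is to derive the bucket name from the environment instead of typing it by hand. This is a minimal sketch; `assetsBucketFor` and `parseEnvironment` are hypothetical helpers, not part of any SDK.

```typescript
type Environment = 'dev' | 'staging' | 'prod';

// Derives the bucket name from the deployment environment, so no environment
// can accidentally be pointed at another environment's bucket.
function assetsBucketFor(env: Environment): string {
  return `quickbite-assets-${env}`;
}

// Fails fast on an unknown or missing value instead of silently falling back
// to a default bucket.
function parseEnvironment(value: string | undefined): Environment {
  if (value === 'dev' || value === 'staging' || value === 'prod') {
    return value;
  }
  throw new Error(`Unknown environment: ${String(value)}`);
}
```

Throwing on an unrecognised value is deliberate: a missing environment variable should stop the deploy, not quietly write to prod.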
quickbite-assets-prod/
├── menu/{restaurant_id}/{uuid}.{ext} ← menu item photos
├── profiles/{user_id}/{uuid}.{ext} ← profile pictures
├── receipts/{year}/{month}/{order_id}.pdf ← order receipts
├── exports/{user_id}/{job_id}.csv ← data exports
└── temp/{user_id}/{uuid} ← lifecycle rule deletes after 24h
S3 has no real folders; the / in the key is only a convention. But that convention enables prefix-based IAM policies, lifecycle rules, and listing.
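To show what "prefix-based IAM policy" means in practice, here is a sketch that builds one policy statement scoped to a single restaurant's menu prefix. The `menuPrefixStatement` helper and the chosen actions are illustrative assumptions; the statement fields follow the standard IAM policy grammar.

```typescript
// Plain-object shape of one IAM policy statement.
interface PolicyStatement {
  Effect: 'Allow' | 'Deny';
  Action: string[];
  Resource: string[];
}

// Grants read/write access only to objects under one restaurant's menu prefix.
function menuPrefixStatement(bucket: string, restaurantId: string): PolicyStatement {
  // Reject IDs containing "/" or "*" so a caller cannot widen the prefix.
  if (!/^[A-Za-z0-9-]+$/.test(restaurantId)) {
    throw new Error('restaurantId contains unsupported characters');
  }
  return {
    Effect: 'Allow',
    Action: ['s3:GetObject', 's3:PutObject'],
    Resource: [`arn:aws:s3:::${bucket}/menu/${restaurantId}/*`],
  };
}
```

This only works because the owner ID sits at a fixed position in the key, which is one more reason to decide the layout once.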
❌ burger.jpg — collision-prone, leaks the content type in the key
❌ Mario Pizza Margherita 2026.jpg — spaces, PII, original filename
✅ menu/rst-123/a3f12c8e-uuid.jpg — owner-scoped, opaque, UUID-based
S3 keys show up in CloudTrail logs and access logs. Never put PII (names, emails) in them.
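The naming rules above can be enforced in one small function. This is a sketch under assumptions: `menuPhotoKey` is a hypothetical helper, and the extension allowlist is illustrative. The UUID is generated by the caller, and the user's original filename is never passed in at all.

```typescript
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/;
const ALLOWED_EXTENSIONS = ['jpg', 'png', 'webp']; // assumed allowlist

// Builds an owner-scoped, opaque key. Because the original filename is not an
// input, names, emails and other PII cannot reach the logs through the key.
function menuPhotoKey(restaurantId: string, uuid: string, ext: string): string {
  if (!/^[a-z0-9-]+$/.test(restaurantId)) throw new Error('invalid restaurantId');
  if (!UUID_PATTERN.test(uuid)) throw new Error('invalid uuid');
  if (!ALLOWED_EXTENSIONS.includes(ext)) throw new Error('unsupported extension');
  return `menu/${restaurantId}/${uuid}.${ext}`;
}
```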
Enable bucket-level default encryption:
| Mode | When to use |
|---|---|
| SSE-S3 (AES-256, AWS-managed key) | Default. Free. Good enough for most data. |
| SSE-KMS (AWS KMS key) | Compliance-driven (PCI, HIPAA, SOC2). Gives you key rotation and per-key audit. |
| SSE-C (customer-supplied key) | You hold the key, AWS forgets it. Almost never the right answer. |
For a food delivery app with PCI-adjacent data (card tokens, order receipts): SSE-KMS with a dedicated CMK per environment.
Force TLS with a bucket policy:
{
  "Sid": "DenyInsecureTransport",
  "Effect": "Deny",
  "Principal": "*",
  "Action": "s3:*",
  "Resource": ["arn:aws:s3:::quickbite-assets-prod/*"],
  "Condition": { "Bool": { "aws:SecureTransport": "false" } }
}
Every bucket should be created with Block Public Access turned ON at the account level.
Most public S3 leaks did NOT happen because someone "hacked AWS". They happened because someone clicked "make public" to debug something and forgot. All four Block Public Access toggles: ON.
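"All four toggles ON" is easy to check automatically. The sketch below assumes the settings have already been fetched by some other code and only checks them; `disabledToggles` is a hypothetical helper, while the four field names follow S3's PublicAccessBlockConfiguration.

```typescript
const TOGGLES = [
  'BlockPublicAcls',
  'IgnorePublicAcls',
  'BlockPublicPolicy',
  'RestrictPublicBuckets',
] as const;

type PublicAccessBlock = Record<(typeof TOGGLES)[number], boolean>;

// Returns the names of any toggles that are switched off.
// An empty result means the bucket or account is fully locked down.
function disabledToggles(config: PublicAccessBlock): string[] {
  return TOGGLES.filter((toggle) => !config[toggle]);
}
```

A check like this could run in CI or a scheduled job, so that a "temporary" debugging change does not stay unnoticed.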
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
@Injectable()
export class S3Service {
  private readonly s3 = new S3Client({ region: this.config.get('AWS_REGION') });
  // ...
}
Notice: bucket name is in config (never hard-coded), ContentType is set explicitly, encryption is set per-object as defence-in-depth.
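The three points in that note can be isolated in a pure function that builds the upload request. This is a sketch, not the service's actual code: `StorageConfig` and `buildPutObjectInput` are hypothetical names, and the interface lists only the PutObject fields used here.

```typescript
// Subset of the PutObject request fields used in this sketch.
interface PutObjectInput {
  Bucket: string;
  Key: string;
  Body: Uint8Array;
  ContentType: string;
  ServerSideEncryption: 'AES256' | 'aws:kms';
  SSEKMSKeyId?: string;
}

// Hypothetical config shape; values would come from configuration, not code.
interface StorageConfig {
  bucket: string;
  kmsKeyId?: string;
}

function buildPutObjectInput(
  config: StorageConfig,
  key: string,
  body: Uint8Array,
  contentType: string,
): PutObjectInput {
  // Refuse to upload without an explicit content type.
  if (contentType.trim() === '') {
    throw new Error('contentType must be set explicitly');
  }
  const base = { Bucket: config.bucket, Key: key, Body: body, ContentType: contentType };
  // Per-object encryption as defence-in-depth: SSE-KMS when a key is
  // configured, SSE-S3 otherwise.
  return config.kmsKeyId
    ? { ...base, ServerSideEncryption: 'aws:kms', SSEKMSKeyId: config.kmsKeyId }
    : { ...base, ServerSideEncryption: 'AES256' };
}
```

The returned object can be passed to `new PutObjectCommand(...)` and sent with the client's `send` method. Keeping the request construction separate makes it testable without touching AWS.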
For product images, marketing PDFs, public branding — put CloudFront (or Cloudflare) in front of S3:
Browser → CloudFront edge cache (cached 24h) → S3 origin (only on miss)
Benefits: latency drops from ~200 ms to ~20 ms, S3 GET costs disappear on hits, you can revoke URLs without touching S3.
For private user content (order receipts, profile photos), use CloudFront + signed URLs (Chapter 20).
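One common way to tell a CDN how long to keep an object is the Cache-Control header stored with it at upload time (the `CacheControl` field of a PutObject request). The sketch below is an assumption about how that choice could be made, not a description of this system: the prefix list and the `private, no-store` value for everything else are illustrative.

```typescript
// Prefixes assumed to hold public, CDN-cacheable assets (illustrative list).
const PUBLIC_PREFIXES = ['menu/'];

// 24 hours, matching the edge cache duration mentioned above.
const ONE_DAY_SECONDS = 24 * 60 * 60;

// Picks the Cache-Control value to store with an object at upload time.
function cacheControlFor(key: string): string {
  const isPublic = PUBLIC_PREFIXES.some((prefix) => key.startsWith(prefix));
  return isPublic ? `public, max-age=${ONE_DAY_SECONDS}` : 'private, no-store';
}
```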
Restaurant owners upload high-res food photos. Don’t serve raw originals to users — generate variants:
menu/rst-123/{uuid}.jpg ← 4 MB, original
menu/rst-123/{uuid}_thumb_200.jpg ← 8 KB, menu list view
menu/rst-123/{uuid}_thumb_800.jpg ← 60 KB, item detail view
On upload, fire an async job (Chapter 15) that uses sharp to resize and writes variants back to S3:
// resize.worker.ts
import sharp from 'sharp';
await Promise.all([
  sharp(buffer).resize(200, 200, { fit: 'cover' }).jpeg({ quality: 80 }).toBuffer(),
  // ...
]);
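The worker also needs to know where each variant goes. Here is a sketch of deriving the variant key from the original key, following the `{uuid}_thumb_{width}.jpg` naming shown earlier; `variantKey` is a hypothetical helper.

```typescript
// Derives the key of a resized variant from the original object's key.
// Variants always end in .jpg because the worker encodes them as JPEG.
function variantKey(originalKey: string, width: number): string {
  const dot = originalKey.lastIndexOf('.');
  const slash = originalKey.lastIndexOf('/');
  // The last dot must come after the last slash and after at least one
  // filename character; otherwise there is no extension to strip.
  if (dot <= slash + 1) {
    throw new Error('key has no file extension');
  }
  if (!Number.isInteger(width) || width <= 0) {
    throw new Error('width must be a positive integer');
  }
  return `${originalKey.slice(0, dot)}_thumb_${width}.jpg`;
}
```

Deriving the name from the original key means the variant locations never need their own database columns.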
quickbite-assets-prod/temp/* → DELETE after 1 day
quickbite-assets-prod/exports/* → DELETE after 30 days
quickbite-assets-prod/receipts/* → TRANSITION to Glacier after 90 days
→ DELETE after 7 years (compliance)
quickbite-assets-prod/menu/* → no lifecycle rule (keep forever)
These rules run for free, every night, in the background. Storage costs grow forever unless you configure them.
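Expressed as data, the rules above might look like the sketch below. The interface lists only the lifecycle rule fields needed here, the rule IDs are made up, and "7 years" is approximated as 7 × 365 days.

```typescript
// Subset of the S3 lifecycle rule shape needed for the rules above.
interface LifecycleRule {
  ID: string;
  Status: 'Enabled' | 'Disabled';
  Filter: { Prefix: string };
  Expiration?: { Days: number };
  Transitions?: { Days: number; StorageClass: 'GLACIER' }[];
}

const lifecycleRules: LifecycleRule[] = [
  { ID: 'expire-temp', Status: 'Enabled', Filter: { Prefix: 'temp/' }, Expiration: { Days: 1 } },
  { ID: 'expire-exports', Status: 'Enabled', Filter: { Prefix: 'exports/' }, Expiration: { Days: 30 } },
  {
    ID: 'archive-receipts',
    Status: 'Enabled',
    Filter: { Prefix: 'receipts/' },
    Transitions: [{ Days: 90, StorageClass: 'GLACIER' }],
    Expiration: { Days: 7 * 365 }, // roughly 7 years; ignores leap days
  },
  // menu/ deliberately has no rule: those objects are kept forever.
];

// Finds the rule that applies to a key, if any (handy in tests and reviews).
function ruleFor(key: string): LifecycleRule | undefined {
  return lifecycleRules.find((rule) => key.startsWith(rule.Filter.Prefix));
}
```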
Enable bucket versioning. Each PUT to the same key keeps the old version too. If a bug overwrites an object, you can restore it.
Versioning has a cost — every old version still occupies storage. Combine with a lifecycle rule that deletes non-current versions after 30 days.
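Restoring after a bad overwrite starts with finding the version to go back to. The sketch below assumes the version list has already been fetched; the field names mirror what S3 returns when listing object versions, and `previousVersion` is a hypothetical helper.

```typescript
// Fields mirror the entries returned when listing object versions.
interface ObjectVersion {
  Key: string;
  VersionId: string;
  IsLatest: boolean;
  LastModified: Date;
}

// Returns the most recent non-current version of `key`, or undefined if the
// object has never been overwritten.
function previousVersion(versions: ObjectVersion[], key: string): ObjectVersion | undefined {
  return versions
    .filter((v) => v.Key === key && !v.IsLatest) // filter() copies, so the input is not reordered
    .sort((a, b) => b.LastModified.getTime() - a.LastModified.getTime())[0];
}
```

One way to restore is then to copy that version back over the current key. Keep in mind that a 30-day rule on non-current versions also sets how long you have to notice the bug.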
Turn on S3 Server Access Logging or CloudTrail Data Events for the bucket. Both record every GetObject, PutObject, DeleteObject with the IP, IAM principal, and timestamp.
Without these logs, after a security incident you cannot answer "did the attacker download every customer's passport, or just one?" That distinction changes whether you're notifying 1 person or 100,000.
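With logs in hand, the "one person or 100,000" question becomes a small query. The sketch below assumes the log lines have already been parsed into records; the `AccessRecord` shape is hypothetical and far simpler than real access logs or CloudTrail events.

```typescript
// Hypothetical, already-parsed log record. Real logs carry more fields and
// need parsing first.
interface AccessRecord {
  operation: 'GetObject' | 'PutObject' | 'DeleteObject';
  key: string;
  principal: string;
  sourceIp: string;
  timestamp: string;
}

// Returns the distinct object keys that a given principal downloaded.
function keysDownloadedBy(records: AccessRecord[], principal: string): string[] {
  const keys = new Set<string>();
  for (const record of records) {
    if (record.principal === principal && record.operation === 'GetObject') {
      keys.add(record.key);
    }
  }
  return Array.from(keys).sort();
}
```

The length of the result is the number of objects exposed, and the owner segment in each key tells you which customers to notify.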
Lifecycle rules clean up temp/* and exports/*. The DB stores metadata (owner, size, s3_key, upload time). S3 stores bytes. Never put PII in the S3 key; use UUIDs only. Set Block Public Access on day one, because it's much easier than explaining to customers why their passport was publicly accessible.