Warming up the neural circuits...
Database design is harder to change than code. A well-designed schema saves months of refactoring. This module teaches you how to design schemas that scale.
organizes data to reduce redundancy and improve integrity.
Each column contains atomic values (no arrays or nested data), and each row is unique.
-- ❌ Violates 1NF: multi-valued column
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
products TEXT -- "laptop, mouse, keyboard"
);
-- ✅ 1NF: separate rows
CREATE TABLE order_items (
id SERIAL PRIMARY KEY,
order_id INTEGER REFERENCES orders(id),
product_name VARCHAR(200)
);Every non-key column must depend on the whole primary key (relevant for composite keys).
-- ❌ Violates 2NF: product_name depends only on product_id, not on (order_id, product_id)
CREATE TABLE order_items (
order_id INTEGER,
product_id INTEGER,
product_name VARCHAR(200), -- ❌ depends only on product_id
quantity INTEGER,
PRIMARY KEY (order_id, product_id)
);
-- ✅ 2NF: split into two tables
CREATE TABLE products (id SERIAL PRIMARY KEY
Every non-key column must depend only on the primary key — not on other non-key columns.
-- ❌ Violates 3NF: department_name depends on department_id
CREATE TABLE employees (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
department_id INTEGER,
department_name VARCHAR(100), -- ❌ depends on department_id, not on id
department_head VARCHAR(100)
);
-- ✅ 3NF: departments in their own table
CREATE
Normalization isn't free. Sometimes denormalization is the right choice.
| Situation | Strategy |
|---|---|
| Read-heavy dashboards | Pre-calculated aggregate columns |
| Real-time counters | Cached counts in parent table |
| High-traffic read paths | Redundant but faster lookups |
| Reporting/analytics | Materialized views |
-- Denormalized counter (avoids COUNT(*) query on every page load)
ALTER TABLE posts ADD COLUMN comment_count INTEGER DEFAULT 0;
-- Update counter via trigger
CREATE OR REPLACE FUNCTION update_comment_count()
RETURNS TRIGGER AS $$
BEGIN
IF TG_OP = 'INSERT' THEN
UPDATE posts SET comment_count = comment_count +
| SERIAL | UUID | |
|---|---|---|
| Size | 4 bytes | 16 bytes |
| Predictable | Yes (sequential) | No |
| Merge-safe | No | Yes |
| Performance | Slightly faster (B-tree smaller) | Slightly slower |
| URL-safe | Exposes row count | Safe |
Recommendation: Use UUID for distributed systems, SERIAL for single-server apps. Or use gen_random_uuid() for the best of both worlds.
-- Adding a column with a default (watch out: locks table!)
ALTER TABLE users ADD COLUMN timezone VARCHAR(50) DEFAULT 'UTC';
-- Better: add nullable first, backfill, then set NOT NULL
ALTER TABLE users ADD COLUMN timezone VARCHAR(50);
-- Backfill in batches (don't update all at once on large tables)
UPDATE users SET timezone = '