nonobie — Real dev skills, explained like a friend would

What you'll learn

This module goes far beyond basic JOINs. You'll write queries that would make most developers' heads spin — and understand exactly how they work.

Beyond basic JOINs

In the free course, you learned INNER JOIN and LEFT JOIN. Real production queries use:

Self-joins — joining a table to itself
Multi-way joins — 5+ tables in a single query
Subqueries — queries within queries
EXISTS / NOT EXISTS — efficient existence checks
LATERAL — per-row subqueries

1. Self-joins

A self-join joins a table to itself. Use this for hierarchical data.

Employee-manager hierarchy

sql

CREATE TABLE employees (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    manager_id INTEGER REFERENCES employees(id)
);
 
INSERT INTO employees (name, manager_id) VALUES
    ('CEO', NULL),
    ('VP Engineering', 1),
    ('VP Marketing', 1),
    ('Engineering Manager', 2),
    ('Senior Developer', 4),
    ('Junior Developer', 4),
    ('Marketing Lead', 3);

Find everyone's manager

sql

SELECT
    e.name AS employee,
    m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;

Output:

plaintext

     employee      |     manager
-------------------+-----------------
 CEO               | NULL
 VP Engineering    | CEO
 VP Marketing      | CEO
 Engineering Mgr   | VP Engineering
 Senior Developer  | Engineering Mgr
 Junior Developer  | Engineering Mgr
 Marketing Lead    | VP Marketing

Find all direct reports of a manager

sql

SELECT e.name AS direct_report
FROM employees e
JOIN employees m ON e.manager_id = m.id
WHERE m.name = 'VP Engineering';

Find the org depth (who reports to whom recursively)

sql

-- PostgreSQL recursive CTE
WITH RECURSIVE org_chain AS (
    -- Base case: top-level managers
    SELECT id, name, manager_id, 1 AS depth
    FROM employees
    WHERE manager_id IS NULL
 
    UNION ALL
 
    -- Recursive case: employees who report to someone in the chain
    SELECT e.id, e.name,

Tip

Recursive CTEs are PostgreSQL's secret weapon for tree structures — org charts, category trees, comment threads, and more.

2. Subqueries

A subquery is a query inside another query. They can appear in SELECT, FROM, or WHERE.

Subquery in WHERE

sql

-- Find employees who earn more than the average
SELECT name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

Subquery in SELECT (scalar subquery)

sql

-- Show each employee's salary and how it compares to average
SELECT
    name,
    salary,
    (SELECT AVG(salary) FROM employees) AS avg_salary,
    salary - (SELECT AVG(salary) FROM employees) AS difference
FROM employees;

Subquery in FROM (derived table)

sql

-- Find the top earner in each department
SELECT department, max_salary
FROM (
    SELECT
        department,
        MAX(salary) AS max_salary
    FROM employees
    GROUP BY department
) AS dept_max
ORDER BY max_salary DESC;

Watch out

Performance trap: Correlated subqueries run once per outer row. On a table with 10,000 rows, that's 10,000 subquery executions. Often a JOIN or window function is faster.

3. EXISTS and NOT EXISTS

EXISTS checks whether a subquery returns any rows. It's often faster than JOIN for existence checks.

Find authors who have published at least one post

sql

SELECT u.username, u.email
FROM users u
WHERE EXISTS (
    SELECT 1 FROM posts p
    WHERE p.user_id = u.id
      AND p.published = true
);

Find authors with NO published posts

sql

SELECT u.username, u.email
FROM users u
WHERE NOT EXISTS (
    SELECT 1 FROM posts p
    WHERE p.user_id = u.id
      AND p.published = true
);

EXISTS vs JOIN: When to use which

Use case	Winner
"Find X that have Y"	EXISTS (stops at first match)
"Find X with Y data"	JOIN (needs the data)
"Find X without Y"	NOT EXISTS (avoids NULL handling)
"Count X per Y"	JOIN with GROUP BY

sql

-- EXISTS — more efficient for large datasets
SELECT * FROM users u
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id);
 
-- Same result with JOIN — but requires DISTINCT and scans more rows
SELECT DISTINCT u.* FROM users u
JOIN orders o ON o.user_id = u.

Example

Real-world use case: Finding users who haven't logged in for 30 days:

sql

SELECT email
FROM users u
WHERE NOT EXISTS (
    SELECT 1 FROM activity_log a
    WHERE a.user_id = u.id
      AND a.event = 'login'
      AND a

4. LATERAL joins

LATERAL lets a subquery reference columns from preceding tables in the same FROM clause.

sql

-- Top 2 most recent posts per author
SELECT
    u.username,
    recent_posts.title,
    recent_posts.created_at
FROM users u
CROSS JOIN LATERAL (
    SELECT title, created_at
    FROM posts
    WHERE posts.user_id = u.id
    ORDER BY created_at DESC
    LIMIT 2
) AS

Without LATERAL, you'd need a complex window function or multiple queries.

Real-world LATERAL examples

Nearest locations:

sql

SELECT u.name, nearby_branch.name AS nearest_branch
FROM users u
CROSS JOIN LATERAL (
    SELECT name, ST_Distance(location, u.location) AS dist
    FROM branches
    ORDER BY dist
    LIMIT 1
) AS nearby_branch;

Latest order per customer:

sql

SELECT c.name, last_order.total, last_order.ordered_at
FROM customers c
CROSS JOIN LATERAL (
    SELECT total, ordered_at
    FROM orders
    WHERE orders.customer_id = c.id
    ORDER BY ordered_at DESC
    LIMIT 1
) AS last_order;

Tip

LATERAL is PostgreSQL's superpower. It's like a correlated subquery that returns multiple columns and rows. Use it for "top N per group" queries.

5. Anti-patterns to avoid

❌ Using DISTINCT to work around bad JOINs

sql

-- BAD: DISTINCT hides duplicate rows from a bad JOIN
SELECT DISTINCT u.* FROM users u
JOIN orders o ON o.user_email = u.phone;

✅ Fix: Use EXISTS instead, or fix the JOIN condition.

❌ Correlated subquery in SELECT on large tables

sql

-- BAD: Runs the subquery for every row
SELECT name,
    (SELECT MAX(amount) FROM payments p WHERE p.user_id = u.id)
FROM users u;

✅ Fix: Use a window function or LATERAL.

❌ Too many JOINs in one query

sql

-- BAD: 8 JOINs — hard to read, debug, and optimize
SELECT * FROM orders o
JOIN users u ... JOIN payments p ... JOIN addresses a ...
JOIN coupons c ... JOIN products pr ... JOIN reviews r ...
JOIN shipments s ...

✅ Fix: Break into multiple queries or use CTEs.

Practice: Production query challenge

You have a database with these tables:

users(id, name, email, created_at)
orders(id, user_id, total_cents, status, created_at)
order_items(id, order_id, product_name, quantity, unit_price_cents)
refunds(id, order_id, amount_cents, reason, created_at)

Write queries for:

Users who have placed more than 5 orders but never requested a refund
The top 3 most recent orders for each user (use LATERAL)
Products that have been ordered by at least 10 different users
The average time between consecutive orders per user
Users who haven't ordered anything in the last 90 days but have ordered before

What you learned

✅ Self-joins for hierarchical data ✅ Subqueries in SELECT, FROM, and WHERE ✅ EXISTS / NOT EXISTS for efficient filtering ✅ LATERAL joins for top-N-per-group ✅ Recursive CTEs for tree structures

Next: CTEs and window functions — two of PostgreSQL's most powerful features.

Advanced SQL — Complex Joins & Subqueries