Skip to main content

Subqueries

Learning Focus

Use this lesson to write subqueries — queries nested inside other queries — for dynamic filtering and derived tables.

Types of Subqueries

TypeLocationReturns
ScalarSELECT clauseSingle value
Derived tableFROM clauseA table
CorrelatedWHERE clause, references outer queryPer-row result
EXISTSWHERE clauseBoolean

Scalar Subquery

-- Find employees earning above average salary
SELECT name, salary,
(SELECT AVG(salary) FROM employees) AS company_avg
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

Derived Table (FROM Subquery)

-- Subquery in FROM acts as a virtual table
SELECT dept_name, avg_salary
FROM (
SELECT department AS dept_name, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
) dept_averages
WHERE avg_salary > 75000;

IN / NOT IN with Subquery

-- Employees in departments where average salary > 80000
SELECT name, department
FROM employees
WHERE department IN (
SELECT department
FROM employees
GROUP BY department
HAVING AVG(salary) > 80000
);

EXISTS / NOT EXISTS

-- Employees who have at least one project assignment
SELECT name FROM employees e
WHERE EXISTS (
SELECT 1 FROM project_assignments pa WHERE pa.employee_id = e.id
);

-- Employees with NO project assignments
SELECT name FROM employees e
WHERE NOT EXISTS (
SELECT 1 FROM project_assignments pa WHERE pa.employee_id = e.id
);

Correlated Subquery

-- Find employees earning more than their department's average
SELECT name, department, salary
FROM employees e
WHERE salary > (
SELECT AVG(salary)
FROM employees
WHERE department = e.department -- references outer query
);

Subqueries vs CTEs

AspectSubqueryCTE
ReadabilityLower (nested)Higher (named)
ReuseMust repeatReference by name
RecursionNot supportedSupported
DuckDB performanceEquivalentEquivalent for most cases

Common Pitfalls

PitfallConsequencePrevention
Scalar subquery returns multiple rowsErrorAdd LIMIT 1 or use MIN/MAX
NOT IN with NULL in resultsReturns empty setUse NOT EXISTS instead of NOT IN when NULLs possible
Correlated subquery on large tablesN×M executionReplace with JOIN or window function

Quick Reference

-- Scalar
SELECT (SELECT MAX(salary) FROM employees);

-- Derived table
SELECT * FROM (SELECT col FROM t WHERE cond) sub;

-- IN subquery
WHERE col IN (SELECT col FROM t WHERE cond);

-- EXISTS
WHERE EXISTS (SELECT 1 FROM t WHERE t.id = outer.id);

What's Next