Window Functions
Learning Focus
Use this lesson to master Window Functions — one of DuckDB's most powerful features for analytical SQL.
What are Window Functions?
Window functions compute a value for each row based on a set of related rows (the "window"), without collapsing them into a single group like GROUP BY does.
function_name() OVER (
[PARTITION BY col] -- divide into groups
[ORDER BY col] -- define row order within group
[ROWS/RANGE frame] -- define the window frame
)
Ranking Functions
-- ROW_NUMBER: unique sequential rank
SELECT name, salary,
row_number() OVER (ORDER BY salary DESC) AS row_num
FROM employees;
-- RANK: same rank for ties, gaps after ties
SELECT name, salary,
rank() OVER (ORDER BY salary DESC) AS rank
FROM employees;
-- DENSE_RANK: same rank for ties, no gaps
SELECT name, salary,
dense_rank() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;
-- NTILE: divide into N buckets
SELECT name, salary,
ntile(4) OVER (ORDER BY salary) AS quartile
FROM employees;
Running Totals and Moving Averages
-- Running total of salary (ordered by hire date)
SELECT name, hire_date, salary,
SUM(salary) OVER (ORDER BY hire_date) AS running_payroll
FROM employees
ORDER BY hire_date;
-- 3-month moving average
SELECT order_date, daily_revenue,
AVG(daily_revenue) OVER (
ORDER BY order_date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS moving_avg_3d
FROM daily_sales;
PARTITION BY (Per-Group Windows)
-- Rank employees within each department
SELECT name, department, salary,
rank() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank
FROM employees;
-- Department's running total (reset per dept)
SELECT name, department, salary,
SUM(salary) OVER (PARTITION BY department ORDER BY hire_date) AS dept_running_total
FROM employees;
LAG and LEAD
-- Compare each row to the previous/next row
SELECT order_date, revenue,
lag(revenue, 1) OVER (ORDER BY order_date) AS prev_day_revenue,
lead(revenue, 1) OVER (ORDER BY order_date) AS next_day_revenue,
revenue - lag(revenue, 1) OVER (ORDER BY order_date) AS day_over_day_change
FROM daily_sales;
FIRST_VALUE and LAST_VALUE
-- Department's top earner name alongside each employee
SELECT name, department, salary,
first_value(name) OVER (
PARTITION BY department ORDER BY salary DESC
) AS dept_top_earner
FROM employees;
Practical Use Case: Top N per Group
WITH ranked AS (
SELECT name, department, salary,
row_number() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
FROM employees
)
SELECT name, department, salary
FROM ranked
WHERE rn <= 2; -- top 2 earners per department
Common Pitfalls
| Pitfall | Consequence | Prevention |
|---|---|---|
| Forgetting ORDER BY in ranking functions | Non-deterministic ranks | Always include ORDER BY in ranking windows |
| LAST_VALUE without frame clause | Returns current row, not last | Add ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING |
| Using window functions in WHERE | Error — not allowed | Filter via CTE or subquery wrapping the window function |
Quick Reference
row_number() OVER (ORDER BY col)
rank() OVER (ORDER BY col)
dense_rank() OVER (ORDER BY col)
ntile(4) OVER (ORDER BY col)
SUM(col) OVER (ORDER BY col)
AVG(col) OVER (PARTITION BY dept ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
lag(col, 1) OVER (ORDER BY col)
lead(col, 1) OVER (ORDER BY col)
What's Next
- Next: Subqueries - Nest queries inside queries for flexible data retrieval.
- Module Overview - Return to this module index.