Skip to main content

GROUPING SETS, ROLLUP, CUBE

Learning Focus

Use this lesson to compute multi-dimensional aggregations in a single pass using DuckDB's advanced grouping extensions.

GROUPING SETS

GROUPING SETS lets you define multiple grouping combinations in one query — equivalent to running several GROUP BY queries and unioning the results.

SELECT department, active, COUNT(*) AS cnt
FROM employees
GROUP BY GROUPING SETS (
(department),
(active),
(department, active),
() -- grand total
);

ROLLUP

ROLLUP generates hierarchical subtotals from left to right — ideal for reports with sub-totals and grand totals.

-- Department > Hire Year > Count (with subtotals at each level)
SELECT
department,
extract('year' FROM hire_date) AS hire_year,
COUNT(*) AS headcount
FROM employees
GROUP BY ROLLUP (department, hire_year)
ORDER BY department NULLS LAST, hire_year NULLS LAST;

Result structure:

  • Rows with both department and hire_year → per-group detail
  • Rows where hire_year IS NULL → department subtotal
  • Row where both are NULL → grand total

CUBE

CUBE generates all possible combinations of the specified dimensions — like running ROLLUP in all directions simultaneously.

SELECT
department,
active,
COUNT(*) AS cnt,
SUM(salary) AS total_salary
FROM employees
GROUP BY CUBE (department, active)
ORDER BY department NULLS LAST, active NULLS LAST;

Identifying Aggregation Level with GROUPING()

SELECT
department,
active,
COUNT(*) AS cnt,
GROUPING(department) AS is_dept_subtotal,
GROUPING(active) AS is_active_subtotal
FROM employees
GROUP BY CUBE (department, active);

GROUPING() returns 1 when the column was omitted for that subtotal row, 0 otherwise.

Practical Use Case: Executive Report

SELECT
COALESCE(department, 'ALL DEPARTMENTS') AS department,
COALESCE(CAST(extract('year' FROM hire_date) AS VARCHAR), 'ALL YEARS') AS year,
COUNT(*) AS headcount,
ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY ROLLUP (department, extract('year' FROM hire_date))
ORDER BY 1, 2;

Common Pitfalls

PitfallConsequencePrevention
Confusing NULL subtotals with real NULLsMisread reportsUse GROUPING() to flag subtotal rows
CUBE on many columnsExponential result sizeCUBE on N dims = 2^N groupings — limit to 3-4 dims
Not using COALESCE on NULL labels'NULL' in report outputWrap with COALESCE(col, 'Total')

Quick Reference

-- ROLLUP
GROUP BY ROLLUP (col1, col2)

-- CUBE
GROUP BY CUBE (col1, col2)

-- GROUPING SETS
GROUP BY GROUPING SETS ((col1), (col2), ())

-- Detect subtotal rows
GROUPING(col) -- returns 1 if this row is a subtotal

What's Next