GROUP BY

Learning Focus

Use this lesson to master GROUP BY for summarizing data by category in DuckDB.

Concept Overview

GROUP BY collapses multiple rows sharing the same value(s) into a single summary row, enabling aggregate functions to compute per-group statistics.

Basic Syntax

SELECT column, aggregate_function(other_col)
FROM table
GROUP BY column;

Examples

Example 1: Headcount per Department

SELECT department, COUNT(*) AS headcount
FROM employees
GROUP BY department;

Example 2: Multi-Column Grouping

SELECT department, active, COUNT(*) AS count, AVG(salary) AS avg_salary
FROM employees
GROUP BY department, active
ORDER BY department, active;

Example 3: Group by Expression

-- Group by year of hire
SELECT
    extract('year' FROM hire_date) AS hire_year,
    COUNT(*) AS hires
FROM employees
GROUP BY hire_year
ORDER BY hire_year;

Example 4: Group on File Data

-- Aggregate directly from a CSV
SELECT category, SUM(revenue) AS total_revenue
FROM read_csv_auto('sales.csv')
GROUP BY category
ORDER BY total_revenue DESC;

Example 5: Group by with FILTER

SELECT
    department,
    COUNT(*) FILTER (WHERE active = true)  AS active,
    COUNT(*) FILTER (WHERE active = false) AS inactive
FROM employees
GROUP BY department;

Common Pitfalls

Pitfall	Consequence	Prevention
Non-aggregated column missing from GROUP BY	Error	Every SELECT column must be in GROUP BY or wrapped in an aggregate
Grouping by high-cardinality columns	Many small groups	Check cardinality with `COUNT(DISTINCT col)` first
Using WHERE instead of HAVING for group filters	Error or wrong results	Filter group results with `HAVING`

Quick Reference

SELECT col, COUNT(*), AVG(col2) FROM t GROUP BY col;
SELECT col1, col2, SUM(col3) FROM t GROUP BY col1, col2;
SELECT extract('year' FROM date_col), COUNT(*) FROM t GROUP BY 1;

What's Next

Next: HAVING - Filter groups after aggregation.
Module Overview - Return to this module index.

Concept Overview​

Basic Syntax​

Examples​

Example 1: Headcount per Department​

Example 2: Multi-Column Grouping​

Example 3: Group by Expression​

Example 4: Group on File Data​

Example 5: Group by with FILTER​

Common Pitfalls​

Quick Reference​

What's Next​