Query Profiling
Learning Focus
Use this lesson to profile and debug slow queries in DuckDB using built-in explain and timing tools.
EXPLAIN — View the Query Plan
-- View the logical query plan
EXPLAIN SELECT department, AVG(salary) FROM employees GROUP BY department;
Output shows:
- Physical operators (HashAggregate, SeqScan, Filter, etc.)
- Row estimates
- Column projections
EXPLAIN ANALYZE — Run and Profile
-- Execute the query AND show timing for each operator
EXPLAIN ANALYZE
SELECT department, AVG(salary)
FROM employees
GROUP BY department;
Output includes:
- Actual execution time per operator
- Rows processed vs. estimated
- Memory usage per operator
.timer on (CLI Timing)
-- Enable query timing in the CLI
.timer on
SELECT COUNT(*) FROM 'large_dataset.parquet';
-- Run Time (s): real 0.142 user 0.312 sys 0.004
PRAGMA statements
-- Show number of threads used
PRAGMA threads;
-- Set number of worker threads
PRAGMA threads = 8;
-- Show memory limit
PRAGMA memory_limit;
-- Set memory limit
PRAGMA memory_limit = '4GB';
-- Show all pragmas
PRAGMA all_pragmas;
SUMMARIZE — Quick Data Profiling
-- Get statistics for every column in a table
SUMMARIZE employees;
-- Summarize a file
SUMMARIZE SELECT * FROM read_parquet('large_data.parquet');
Output includes: count, nulls, min, max, avg, stddev, unique count — per column.
Interpreting EXPLAIN Output
-- Example plan:
HASH_GROUP_BY
│ Groups: #0, #1
└── PROJECTION
│ Projections: department, salary
└── SEQ_SCAN employees
│ Projections: department, salary
└── (Rows: 5)
Key operators:
| Operator | Description |
|---|---|
SEQ_SCAN | Full table/file scan |
FILTER | WHERE clause applied |
HASH_JOIN | Hash-based join |
HASH_GROUP_BY | GROUP BY aggregation |
ORDER_BY | Sorting |
PROJECTION | Column selection |
Common Pitfalls
| Pitfall | Consequence | Prevention |
|---|---|---|
| Not running EXPLAIN before optimizing | Optimizing wrong part | Always EXPLAIN first to find the bottleneck |
| Ignoring row estimate vs actual mismatch | Statistics staleness | ANALYZE tables after large inserts |
| High memory usage on large sorts | OOM error | Add LIMIT or increase PRAGMA memory_limit |
Quick Reference
EXPLAIN query;
EXPLAIN ANALYZE query;
SUMMARIZE tablename;
PRAGMA threads = 8;
PRAGMA memory_limit = '8GB';
.timer on
What's Next
- Next: Indexes in DuckDB - When and how indexes help in DuckDB.
- Module Overview - Return to this module index.