DuckDB vs SQLite vs PostgreSQL
Learning Focus
Use this lesson to understand when to choose DuckDB over SQLite or PostgreSQL based on workload characteristics.
Overview
All three databases support standard SQL, but they are designed for very different use cases.
| Feature | DuckDB | SQLite | PostgreSQL |
|---|---|---|---|
| Storage Model | Columnar | Row-based | Row-based |
| Primary Use | Analytics (OLAP) | Embedded apps (OLTP) | General-purpose (OLTP) |
| Server Required | No | No | Yes |
| Concurrency | Single writer, many readers | Single writer | Full multi-user |
| File Format | .db (columnar) | .db (B-tree) | Data directory |
| Parquet Support | Native | Via extension | Via Foreign Data Wrapper |
| Best For | Analytical queries | Small apps, mobile | Web apps, production OLTP |
DuckDB Architecture
```text
Your App / Python / CLI
           │
           ▼
DuckDB Engine (in-process)
           │
      ┌────┴────────────────┐
      │ Columnar Store      │ ← .db file on disk
      │ Vectorized Engine   │ ← processes 1024 rows/batch
      │ Parallel Executor   │ ← uses all CPU cores
      └─────────────────────┘
```
When to Use DuckDB
Choose DuckDB when:
- You are running analytical queries (`GROUP BY`, window functions, large `JOIN`s)
- Your data lives in Parquet, CSV, or JSON files
- You want zero infrastructure overhead (no Docker, no server)
- You need fast ad-hoc analysis inside a Python/R/Node.js script
- You are building a data pipeline that queries S3 or HTTPFS sources
Avoid DuckDB when:
- You need high-concurrency multi-user writes (use PostgreSQL)
- You are building a mobile app with small transactional data (use SQLite)
- You need row-level security and role-based access control (use PostgreSQL)
Side-by-Side Query Comparison
Reading a 10M-row CSV file:
```sql
-- DuckDB (native, no import needed)
SELECT department, AVG(salary)
FROM read_csv_auto('employees.csv')
GROUP BY department;
```

```sql
-- SQLite (must import first)
.mode csv
.import employees.csv employees
SELECT department, AVG(salary) FROM employees GROUP BY department;
```

```sql
-- PostgreSQL (must use COPY or a loader)
COPY employees FROM '/path/employees.csv' CSV HEADER;
SELECT department, AVG(salary) FROM employees GROUP BY department;
```
DuckDB executes the file query directly without an import step.
Performance Benchmarks (Illustrative)
| Query Type | DuckDB | SQLite | PostgreSQL |
|---|---|---|---|
| Count 10M rows | ~0.1s | ~3s | ~1s |
| AVG on 10M rows | ~0.2s | ~5s | ~1.5s |
| GROUP BY 10M rows | ~0.3s | ~8s | ~2s |
| Single row lookup | ~5ms | ~1ms | ~2ms |
DuckDB excels at scans; SQLite/PostgreSQL are faster for single-row point lookups.
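The point-lookup advantage of row stores is visible in the query plan. A sketch using Python's standard-library `sqlite3` module with an illustrative table: a primary-key lookup is a B-tree search, while an aggregate must scan every row.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, salary INTEGER)")
con.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(i, 1000 + i) for i in range(1000)],
)

# Point lookup: resolved via a B-tree SEARCH on the primary key.
point = con.execute(
    "EXPLAIN QUERY PLAN SELECT salary FROM emp WHERE id = 500"
).fetchall()

# Aggregate: requires a full-table SCAN of all rows.
scan = con.execute("EXPLAIN QUERY PLAN SELECT AVG(salary) FROM emp").fetchall()

print(point[0][3])  # mentions SEARCH (B-tree lookup)
print(scan[0][3])   # mentions SCAN (full table)
```

DuckDB makes the opposite trade: its columnar layout turns that full scan into a fast vectorized pass, at the cost of slower single-row access.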
Learning Check
Questions:
- Which database would you use for a mobile iOS app? Why?
- You need to query a 500GB Parquet file on S3. Which database fits best?
- What makes DuckDB's vectorized engine faster than row-based engines for analytics?
Common Pitfalls
| Pitfall | Consequence | Prevention |
|---|---|---|
| Using DuckDB as a web app backend | Poor concurrent write performance | Use PostgreSQL for the web layer; DuckDB for analytics |
| Expecting SQLite-style tiny footprint | DuckDB ships a larger binary | Accept the trade-off; both footprints are fine for most local workloads |
| Mixing OLTP and OLAP in one system | Performance degradation | Separate concerns: OLTP → Postgres, OLAP → DuckDB |
Quick Reference
```sql
-- DuckDB: Query a Parquet file
SELECT * FROM 'data.parquet' LIMIT 10;

-- DuckDB: Query a CSV without importing
SELECT COUNT(*) FROM read_csv_auto('data.csv');

-- DuckDB: Show database info
PRAGMA database_list;
```
What's Next
- Next: When to Use DuckDB - Real-world scenarios mapped to DuckDB's strengths.
- Module Overview - Return to this module index.