# COPY FROM and Export
## Learning Focus

This lesson covers importing data into DuckDB tables with `COPY ... FROM` and exporting tables or query results to CSV, Parquet, and JSON with `COPY ... TO`.
## COPY FROM (Import)

```sql
-- Import CSV into an existing table
COPY employees FROM 'employees.csv' (FORMAT CSV, HEADER true, DELIMITER ',');

-- Import Parquet
COPY employees FROM 'employees.parquet' (FORMAT PARQUET);

-- Import newline-delimited JSON (JSON Lines)
COPY events FROM 'events.ndjson' (FORMAT JSON);
```
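`COPY ... FROM` appends into an existing table and requires a matching schema, so a first import is usually paired with `CREATE TABLE`. A minimal sketch; the column names and types here are illustrative assumptions, not taken from a real dataset:

```sql
-- COPY FROM needs the target table to exist already; create it first.
-- These columns/types are assumptions for illustration.
CREATE TABLE employees (
    id INTEGER,
    name VARCHAR,
    department VARCHAR,
    salary DOUBLE,
    hire_date DATE
);

-- The import then appends rows into the matching schema
COPY employees FROM 'employees.csv' (FORMAT CSV, HEADER true);
```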
## COPY TO (Export)

### Export to CSV

```sql
-- Export an entire table
COPY employees TO 'output/employees.csv' (FORMAT CSV, HEADER true);

-- Export a query result
COPY (
    SELECT name, department, salary
    FROM employees
    WHERE active = true
    ORDER BY salary DESC
) TO 'active_employees.csv' (FORMAT CSV, HEADER true, DELIMITER ',');
```
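The CSV writer also accepts a `COMPRESSION` option, so a gzipped export is a one-liner. A sketch, reusing the `employees` table from above:

```sql
-- Write a gzip-compressed CSV; most downstream tools read .csv.gz directly
COPY employees TO 'employees.csv.gz' (FORMAT CSV, HEADER true, COMPRESSION GZIP);
```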
### Export to Parquet

```sql
-- Export with Snappy compression
COPY employees TO 'employees.parquet' (FORMAT PARQUET, COMPRESSION SNAPPY);

-- Export partitioned Parquet (Hive layout)
COPY (SELECT * FROM sales)
TO 'output/' (FORMAT PARQUET, PARTITION_BY (year, month));
```
### Export to JSON

```sql
-- Exports newline-delimited JSON (one object per line) by default
COPY employees TO 'employees.json' (FORMAT JSON);
```
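If a consumer expects a single JSON array rather than one object per line, the JSON writer's `ARRAY` option switches the layout. A sketch:

```sql
-- Write one JSON array instead of newline-delimited objects
COPY employees TO 'employees.json' (FORMAT JSON, ARRAY true);
```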
## Practical ETL Patterns

### CSV → Parquet Pipeline

```sql
-- Read, clean, and write in one step
COPY (
    SELECT
        id,
        UPPER(name) AS name,
        department,
        ROUND(salary, 2) AS salary,
        hire_date
    FROM read_csv_auto('raw/employees.csv')
    WHERE salary > 0 AND name IS NOT NULL
) TO 'clean/employees.parquet' (FORMAT PARQUET, COMPRESSION ZSTD);
```
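After the pipeline runs, the output can be spot-checked by querying the Parquet file directly by its path, with no table needed. A sketch:

```sql
-- Query the Parquet file in place to verify the cleaning step
SELECT COUNT(*) AS rows_written,
       MIN(salary) AS min_salary  -- should be > 0 after the filter above
FROM 'clean/employees.parquet';
```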
### Split Export into Multiple Files

```sql
-- Export one file per department using PARTITION_BY
COPY (SELECT * FROM employees)
TO 'output/by_dept/' (FORMAT PARQUET, PARTITION_BY (department));
-- Creates: output/by_dept/department=Engineering/data_0.parquet, etc.
```
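The partitioned layout can be read back as one logical table: with `hive_partitioning` enabled, `read_parquet` reconstructs the `department` column from the directory names. A sketch:

```sql
-- Read all partitions back; department comes from the directory paths
SELECT department, COUNT(*) AS n
FROM read_parquet('output/by_dept/*/*.parquet', hive_partitioning = true)
GROUP BY department;
```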
## COPY Options Reference

| Option | Values | Description |
|---|---|---|
| FORMAT | CSV, PARQUET, JSON | Output format |
| HEADER | true/false | Include a header row (CSV only) |
| DELIMITER | any single character | Field delimiter (CSV only) |
| COMPRESSION | SNAPPY, ZSTD, GZIP, UNCOMPRESSED | Parquet compression codec |
| PARTITION_BY | column list | Hive-partition the output by the listed columns |
## Common Pitfalls

| Pitfall | Consequence | Prevention |
|---|---|---|
| COPY to a path whose directory does not exist | Error: cannot open file | Create the directory before running COPY |
| Relying on the default Parquet compression for large exports | Larger files than necessary | Use COMPRESSION ZSTD when file size matters |
| CSV export without HEADER | Column names are lost | Set HEADER true for CSV exports |
## Quick Reference

```sql
-- Import
COPY t FROM 'file.csv' (FORMAT CSV, HEADER true);
COPY t FROM 'file.parquet' (FORMAT PARQUET);

-- Export
COPY t TO 'out.csv' (FORMAT CSV, HEADER true);
COPY t TO 'out.parquet' (FORMAT PARQUET, COMPRESSION ZSTD);
COPY (SELECT ...) TO 'out.parquet' (FORMAT PARQUET, PARTITION_BY (col));
```
## What's Next
- Next: Performance and Extensions - Optimize and extend DuckDB.
- Module Overview - Return to this module index.