COPY FROM and Export

Learning Focus

Use this lesson to import data into DuckDB tables and export query results to various file formats.

COPY FROM (Import)

-- Import CSV into an existing table
COPY employees FROM 'employees.csv' (FORMAT CSV, HEADER true, DELIMITER ',');

-- Import Parquet
COPY employees FROM 'employees.parquet' (FORMAT PARQUET);

-- Import JSON Lines
COPY events FROM 'events.ndjson' (FORMAT JSON);

COPY TO (Export)

Export to CSV

-- Export entire table
COPY employees TO 'output/employees.csv' (FORMAT CSV, HEADER true);

-- Export query result
COPY (
    SELECT name, department, salary
    FROM employees
    WHERE active = true
    ORDER BY salary DESC
) TO 'active_employees.csv' (FORMAT CSV, HEADER true, DELIMITER ',');

Export to Parquet

-- Export with Parquet compression
COPY employees TO 'employees.parquet' (FORMAT PARQUET, COMPRESSION SNAPPY);

-- Export partitioned Parquet (Hive layout)
COPY (SELECT * FROM sales)
TO 'output/' (FORMAT PARQUET, PARTITION_BY (year, month));

Export to JSON

COPY employees TO 'employees.json' (FORMAT JSON);

Practical ETL Patterns

CSV → Parquet Pipeline

-- Read, clean, and write in one step
COPY (
    SELECT
        id,
        UPPER(name)     AS name,
        department,
        ROUND(salary,2) AS salary,
        hire_date
    FROM read_csv_auto('raw/employees.csv')
    WHERE salary > 0 AND name IS NOT NULL
) TO 'clean/employees.parquet' (FORMAT PARQUET, COMPRESSION ZSTD);

Split Export into Multiple Files

-- Export one file per department using PARTITION_BY
COPY (SELECT * FROM employees)
TO 'output/by_dept/' (FORMAT PARQUET, PARTITION_BY (department));
-- Creates: output/by_dept/department=Engineering/data.parquet etc.

COPY Options Reference

Option	Values	Description
`FORMAT`	`CSV`, `PARQUET`, `JSON`	Output format
`HEADER`	`true/false`	Include CSV header
`DELIMITER`	Any char	Field delimiter for CSV
`COMPRESSION`	`SNAPPY`, `ZSTD`, `GZIP`	Parquet compression
`PARTITION_BY`	Column list	Hive-partition the output

Common Pitfalls

Pitfall	Consequence	Prevention
COPY to a path without directory existing	Error: directory not found	Create directory before COPY
Exporting large result without compression	Large output files	Always use `COMPRESSION ZSTD` for Parquet
CSV export without HEADER	Missing column names	Always set `HEADER true` for CSV

Quick Reference

-- Import
COPY t FROM 'file.csv' (FORMAT CSV, HEADER true);
COPY t FROM 'file.parquet' (FORMAT PARQUET);

-- Export
COPY t TO 'out.csv'     (FORMAT CSV,     HEADER true);
COPY t TO 'out.parquet' (FORMAT PARQUET, COMPRESSION ZSTD);
COPY (SELECT ...) TO 'out.parquet' (FORMAT PARQUET, PARTITION_BY (col));

What's Next

Next: Performance and Extensions - Optimize and extend DuckDB.
Module Overview - Return to this module index.

COPY FROM (Import)​

COPY TO (Export)​

Export to CSV​

Export to Parquet​

Export to JSON​

Practical ETL Patterns​

CSV → Parquet Pipeline​

Split Export into Multiple Files​

COPY Options Reference​

Common Pitfalls​

Quick Reference​

What's Next​

COPY FROM (Import)

COPY TO (Export)

Export to CSV

Export to Parquet

Export to JSON

Practical ETL Patterns

CSV → Parquet Pipeline

Split Export into Multiple Files

COPY Options Reference

Common Pitfalls

Quick Reference

What's Next