Sales and Revenue Analytics
Learning Focus
Use this lesson to build production-ready sales analytics with DuckDB — covering revenue trends, product performance, and CLV.
Complete Sales Analytics Script
-- ═══════════════════════════════════════════════════════
-- SALES ANALYTICS DASHBOARD — DuckDB
-- Source: sales.parquet (or read_csv_auto('sales.csv'))
-- ═══════════════════════════════════════════════════════
-- 1. Revenue by Region and Month
SELECT
region,
date_trunc('month', sale_date) AS month,
COUNT(*) AS transactions,
SUM(revenue) AS total_revenue,
ROUND(AVG(revenue), 2) AS avg_order_value,
SUM(units_sold) AS total_units
FROM 'sales.parquet'
GROUP BY region, month
ORDER BY region, month;
-- 2. Top 20 Products by Revenue (YTD)
SELECT
product_name,
SUM(revenue) AS ytd_revenue,
SUM(units_sold) AS ytd_units,
ROUND(AVG(revenue / NULLIF(units_sold, 0)), 2) AS avg_unit_price
FROM 'sales.parquet'
WHERE extract('year' FROM sale_date) = 2024
GROUP BY product_name
ORDER BY ytd_revenue DESC
LIMIT 20;
-- 3. Month-over-Month Revenue Growth
WITH monthly AS (
SELECT date_trunc('month', sale_date) AS month, SUM(revenue) AS revenue
FROM 'sales.parquet'
GROUP BY month
)
SELECT
month,
revenue,
ROUND(
(revenue - lag(revenue) OVER (ORDER BY month)) /
NULLIF(lag(revenue) OVER (ORDER BY month), 0) * 100, 1
) AS mom_growth_pct
FROM monthly
ORDER BY month;
-- 4. Customer Lifetime Value (CLV)
SELECT
customer_id,
COUNT(DISTINCT order_id) AS total_orders,
SUM(revenue) AS lifetime_revenue,
ROUND(AVG(revenue), 2) AS avg_order_value,
MIN(sale_date) AS first_purchase,
MAX(sale_date) AS last_purchase,
DATEDIFF('day', MIN(sale_date), MAX(sale_date)) AS customer_lifespan_days
FROM 'sales.parquet'
GROUP BY customer_id
ORDER BY lifetime_revenue DESC
LIMIT 100;
-- 5. Product Revenue Rank within Each Region (Window Function)
SELECT
region,
product_name,
SUM(revenue) AS regional_revenue,
rank() OVER (PARTITION BY region ORDER BY SUM(revenue) DESC) AS regional_rank
FROM 'sales.parquet'
GROUP BY region, product_name
QUALIFY regional_rank <= 5 -- DuckDB QUALIFY = filter on window function result
ORDER BY region, regional_rank;
-- 6. Daily Revenue with 7-Day MA
SELECT
sale_date,
SUM(revenue) AS daily_revenue,
ROUND(AVG(SUM(revenue)) OVER (
ORDER BY sale_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
), 2) AS rolling_7d_avg
FROM 'sales.parquet'
GROUP BY sale_date
ORDER BY sale_date;
QUALIFY Clause (DuckDB-specific)
QUALIFY filters rows based on window function results — without needing a wrapper subquery:
-- Top 3 products per region (without CTE wrapper)
SELECT region, product_name, SUM(revenue) AS revenue,
rank() OVER (PARTITION BY region ORDER BY SUM(revenue) DESC) AS rnk
FROM 'sales.parquet'
GROUP BY region, product_name
QUALIFY rnk <= 3;
Export Dashboard Data
-- Export all dashboard datasets to separate Parquet files
COPY (SELECT region, date_trunc('month', sale_date) AS month, SUM(revenue) AS revenue
FROM 'sales.parquet' GROUP BY region, month)
TO 'dash/regional_monthly.parquet' (FORMAT PARQUET);
COPY (SELECT product_name, SUM(revenue) AS revenue FROM 'sales.parquet'
GROUP BY product_name ORDER BY revenue DESC LIMIT 50)
TO 'dash/top_products.parquet' (FORMAT PARQUET);
Common Pitfalls
| Pitfall | Consequence | Prevention |
|---|---|---|
| Division by zero in unit price | NULL or error | Use NULLIF(units_sold, 0) |
| CLV without cohort filter | Mixing partial periods | Filter by full completed months |
| QUALIFY on non-window expression | Syntax error | QUALIFY only works with windowed expressions |
Quick Reference
-- QUALIFY (filter on window result)
SELECT *, rank() OVER (PARTITION BY col ORDER BY val DESC) AS rnk
FROM t GROUP BY col, val
QUALIFY rnk <= 3;
-- CLV
SUM(revenue) AS ltv, COUNT(DISTINCT order_id) AS orders
-- MoM growth
(revenue - lag(revenue) OVER (ORDER BY month)) / lag(revenue) OVER (...) * 100
What's Next
- Module Overview - Return to the Run Analytics module index.
- Back to DuckDB Index - Review the full DuckDB curriculum.