Skip to main content

Sales and Revenue Analytics

Learning Focus

Use this lesson to build production-ready sales analytics with DuckDB — covering revenue trends, product performance, and CLV.

Complete Sales Analytics Script

-- ═══════════════════════════════════════════════════════
-- SALES ANALYTICS DASHBOARD — DuckDB
-- Source: sales.parquet (or read_csv_auto('sales.csv'))
-- ═══════════════════════════════════════════════════════

-- 1. Revenue by Region and Month
SELECT
region,
date_trunc('month', sale_date) AS month,
COUNT(*) AS transactions,
SUM(revenue) AS total_revenue,
ROUND(AVG(revenue), 2) AS avg_order_value,
SUM(units_sold) AS total_units
FROM 'sales.parquet'
GROUP BY region, month
ORDER BY region, month;

-- 2. Top 20 Products by Revenue (YTD)
SELECT
product_name,
SUM(revenue) AS ytd_revenue,
SUM(units_sold) AS ytd_units,
ROUND(AVG(revenue / NULLIF(units_sold, 0)), 2) AS avg_unit_price
FROM 'sales.parquet'
WHERE extract('year' FROM sale_date) = 2024
GROUP BY product_name
ORDER BY ytd_revenue DESC
LIMIT 20;

-- 3. Month-over-Month Revenue Growth
WITH monthly AS (
SELECT date_trunc('month', sale_date) AS month, SUM(revenue) AS revenue
FROM 'sales.parquet'
GROUP BY month
)
SELECT
month,
revenue,
ROUND(
(revenue - lag(revenue) OVER (ORDER BY month)) /
NULLIF(lag(revenue) OVER (ORDER BY month), 0) * 100, 1
) AS mom_growth_pct
FROM monthly
ORDER BY month;

-- 4. Customer Lifetime Value (CLV)
SELECT
customer_id,
COUNT(DISTINCT order_id) AS total_orders,
SUM(revenue) AS lifetime_revenue,
ROUND(AVG(revenue), 2) AS avg_order_value,
MIN(sale_date) AS first_purchase,
MAX(sale_date) AS last_purchase,
DATEDIFF('day', MIN(sale_date), MAX(sale_date)) AS customer_lifespan_days
FROM 'sales.parquet'
GROUP BY customer_id
ORDER BY lifetime_revenue DESC
LIMIT 100;

-- 5. Product Revenue Rank within Each Region (Window Function)
SELECT
region,
product_name,
SUM(revenue) AS regional_revenue,
rank() OVER (PARTITION BY region ORDER BY SUM(revenue) DESC) AS regional_rank
FROM 'sales.parquet'
GROUP BY region, product_name
QUALIFY regional_rank <= 5 -- DuckDB QUALIFY = filter on window function result
ORDER BY region, regional_rank;

-- 6. Daily Revenue with 7-Day MA
SELECT
sale_date,
SUM(revenue) AS daily_revenue,
ROUND(AVG(SUM(revenue)) OVER (
ORDER BY sale_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
), 2) AS rolling_7d_avg
FROM 'sales.parquet'
GROUP BY sale_date
ORDER BY sale_date;

QUALIFY Clause (DuckDB-specific)

QUALIFY filters rows based on window function results — without needing a wrapper subquery:

-- Top 3 products per region (without CTE wrapper)
SELECT region, product_name, SUM(revenue) AS revenue,
rank() OVER (PARTITION BY region ORDER BY SUM(revenue) DESC) AS rnk
FROM 'sales.parquet'
GROUP BY region, product_name
QUALIFY rnk <= 3;

Export Dashboard Data

-- Export all dashboard datasets to separate Parquet files
COPY (SELECT region, date_trunc('month', sale_date) AS month, SUM(revenue) AS revenue
FROM 'sales.parquet' GROUP BY region, month)
TO 'dash/regional_monthly.parquet' (FORMAT PARQUET);

COPY (SELECT product_name, SUM(revenue) AS revenue FROM 'sales.parquet'
GROUP BY product_name ORDER BY revenue DESC LIMIT 50)
TO 'dash/top_products.parquet' (FORMAT PARQUET);

Common Pitfalls

PitfallConsequencePrevention
Division by zero in unit priceNULL or errorUse NULLIF(units_sold, 0)
CLV without cohort filterMixing partial periodsFilter by full completed months
QUALIFY on non-window expressionSyntax errorQUALIFY only works with windowed expressions

Quick Reference

-- QUALIFY (filter on window result)
SELECT *, rank() OVER (PARTITION BY col ORDER BY val DESC) AS rnk
FROM t GROUP BY col, val
QUALIFY rnk <= 3;

-- CLV
SUM(revenue) AS ltv, COUNT(DISTINCT order_id) AS orders

-- MoM growth
(revenue - lag(revenue) OVER (ORDER BY month)) / lag(revenue) OVER (...) * 100

What's Next