Glenn Yonemitsu - software engineer. yonemitsu@gmail.com

SQL Window Functions

SQL window functions allow you to perform calculations across a set of rows that are related to the current row. Unlike aggregate functions (like SUM(), AVG(), COUNT()) which group rows and return a single value for each group, window functions return a value for each row, based on a "window" of rows.

This allows for more complex analytical queries without the need for self-joins or subqueries, often leading to more efficient and readable SQL.

What is a Window?

A "window" is a set of rows defined by the OVER() clause. This clause can include:

Let's use a simple example with sales data to illustrate.

sales table

sale_id region product amount sale_date
1 East A 100.00 2025-01-01
2 East B 150.00 2025-01-05
3 West A 200.00 2025-01-02
4 East A 120.00 2025-01-10
5 West B 180.00 2025-01-07

Ranking Functions: ROW_NUMBER(), RANK(), DENSE_RANK()

These functions assign a rank to each row within its partition based on the ORDER BY clause.

The following query ranks the sales within each region by amount:

SELECT    sale_id,
          region,
          product,
          amount,
          RANK() OVER (PARTITION BY region ORDER BY amount DESC) as rnk,
FROM      sales

Aggregate Window Functions: SUM(), AVG(), COUNT(), MIN(), MAX()

These aggregate functions can be used as window functions as well. When used with OVER(), they perform the aggregation over the defined window for each row, rather than collapsing rows into groups.

Let's calculate a running total of sales amount within each region, ordered by sale date:

SELECT    sale_id,
          region,
          product,
          amount,
          sale_date,
          SUM(amount) OVER (PARTITION BY region ORDER BY sale_date) as running_total_region
FROM      sales

3. Value Functions: LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE()

These functions allow you to access values from preceding or succeeding rows within the window.

Let's find the previous sale amount for each sale within its region:

SELECT    sale_id,
          region,
          product,
          amount,
          sale_date,
          LAG(amount, 1, 0.00) OVER (PARTITION BY region ORDER BY sale_date) as previous_sale_amount
FROM      sales

References