Window functions in BigQuery
Window functions are one of the most important SQL features in BigQuery. They allow you to perform calculations across related rows without collapsing rows (unlike GROUP BY).
Basic Syntax (Core Pattern)
window_function(expression)
OVER (
PARTITION BY column1, column2
ORDER BY column3
ROWS | RANGE frame_definition
)
What Each Part Does
- window_function → Functions such as SUM, AVG, COUNT, ROW_NUMBER, LAG, LEAD, etc.
- PARTITION BY → Optional. Splits the dataset into independent groups. Similar to GROUP BY, but rows are not collapsed.
If PARTITION BY is not used:
- No grouping occurs
- All rows belong to one single partition
- The window function is calculated over the full dataset
- ORDER BY → Defines the row sequence inside each partition
- ROWS / RANGE → Defines how many rows are included in the window frame
Example: Without Window Function
a) Using GROUP BY (rows collapse)
SQL Query
SELECT department, AVG(salary) AS avg_salary FROM employees GROUP BY department;
Result One row per department
b) Same Logic Using Window Function
SQL Query
SELECT employee_id, department, salary, AVG(salary) OVER (PARTITION BY department) AS avg_dept_salary FROM employees;
Key Difference
- Every employee row remains
- Department average is repeated for each row
Use Case: Comparisons, percentages, rankings, running totals
Important Notes on Window Frames
- If neither ORDER BY nor a window frame is present, the window includes all rows in the partition.
- If ORDER BY is present but no frame is defined, BigQuery uses the default frame:
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
Equivalent Queries Example
SQL Query
SELECT book,
LAST_VALUE(book) OVER (ORDER BY year)
FROM Library; is equivalent to
SQL Query
SELECT book,
LAST_VALUE(book)
OVER (
ORDER BY year
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)
FROM Library; Example: Running Total Using Window Function
SQL Query
SELECT
order_date,
sales_amount,
SUM(sales_amount) OVER (
ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_sales
FROM sales; Frame Explained
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
- Starts from the first row
- Continuously adds values until the current row