UNION operator in BigQuery

The UNION operator in BigQuery is used to combine the results of two or more SELECT queries into a single result set.

Each SELECT query involved in a UNION operation must satisfy the following conditions:

We can control whether the final result set includes duplicate rows:

Key Characteristics of UNION in BigQuery

1. Automatic Duplicate Removal

The UNION operator performs a DISTINCT operation across all columns, ensuring that duplicate rows are eliminated.

2. Column Names Derived from the First Query

The column names in the final result set are taken only from the first SELECT statement. Any aliases defined in subsequent SELECT statements are ignored.

SQL Syntax

SELECT column1, column2, ...
FROM project_name.dataset_name.table_name1
[WHERE condition]

UNION DISTINCT   -- DISTINCT is optional (default)

SELECT column1, column2, ...
FROM project_name.dataset_name.table_name2
[WHERE condition];    

Example: Combine sales data from two regions.

SQL

SELECT order_id, customer_id, order_amount
FROM `project.dataset.sales_india`

UNION

SELECT order_id, customer_id, order_amount
FROM `project.dataset.sales_usa`;    

UNION with ORDER BY (Important Rule)

When using UNION, the ORDER BY clause is allowed only once, and it must appear at the end of the query.

Sorting is applied to the entire UNION result, not to individual SELECT statements.

SQL

SELECT order_id, order_date
FROM orders_2024

UNION ALL

SELECT order_id, order_date
FROM orders_2025

ORDER BY order_date DESC;    

Invalid Query: ORDER BY cannot be used before the UNION operation.

SQL (Invalid)

SELECT * FROM table1 ORDER BY col
UNION ALL
SELECT * FROM table2;