UNION operator in BigQuery
The UNION operator in BigQuery is used to combine the results of two or more SELECT queries into a single result set.
Each SELECT query involved in a UNION operation must satisfy the following conditions:
- The same number of columns
- Compatible data types in corresponding columns
- Columns in the same order
We can control whether the final result set includes duplicate rows:
- UNION ALL – Includes duplicate rows
- UNION – Removes duplicate rows
Key Characteristics of UNION in BigQuery
1. Automatic Duplicate Removal
The UNION operator performs a DISTINCT operation across all columns, ensuring that duplicate rows are eliminated.
2. Column Names Derived from the First Query
The column names in the final result set are taken only from the first SELECT statement. Any aliases defined in subsequent SELECT statements are ignored.
SQL Syntax
SELECT column1, column2, ... FROM project_name.dataset_name.table_name1 [WHERE condition] UNION DISTINCT -- DISTINCT is optional (default) SELECT column1, column2, ... FROM project_name.dataset_name.table_name2 [WHERE condition];
Example: Combine sales data from two regions.
SQL
SELECT order_id, customer_id, order_amount FROM `project.dataset.sales_india` UNION SELECT order_id, customer_id, order_amount FROM `project.dataset.sales_usa`;
UNION with ORDER BY (Important Rule)
When using UNION, the ORDER BY clause is allowed only once, and it must appear at the end of the query.
Sorting is applied to the entire UNION result, not to individual SELECT statements.
SQL
SELECT order_id, order_date FROM orders_2024 UNION ALL SELECT order_id, order_date FROM orders_2025 ORDER BY order_date DESC;
Invalid Query: ORDER BY cannot be used before the UNION operation.
SQL (Invalid)
SELECT * FROM table1 ORDER BY col UNION ALL SELECT * FROM table2;