Query BigQuery Tables Using Python

BigQuery is a fully-managed, serverless data warehouse that enables scalable analysis over petabytes of data. In this tutorial, we will learn how to execute SQL queries on BigQuery tables using the google-cloud-bigquery Python library.

Prerequisites

Before running the code, you will need:

- Python 3 installed
- A Google Cloud project with the BigQuery API enabled
- A service account key file (JSON) with permission to run BigQuery jobs
- The google-cloud-bigquery and python-dotenv packages installed (for example, with pip)
- A .env file defining GOOGLE_APPLICATION_CREDENTIALS (the path to the key file) and project_id (your project ID)

Python Code to Query BigQuery

The following example demonstrates how to initialize a BigQuery client, define a SQL query, and iterate through the results.

Python

# Import the packages
from dotenv import load_dotenv
import os
from google.oauth2 import service_account
from google.cloud import bigquery

def main():
    # Load environment variables from a .env file
    load_dotenv()

    # Use environment variables from .env file
    credentials_path = os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
    project_name = os.getenv("project_id")

    # Get the service account credentials
    credentials = service_account.Credentials.from_service_account_file(credentials_path)

    # Create the BigQuery client object
    client = bigquery.Client(credentials=credentials, project=project_name)

    # Perform a query
    QUERY = """
        SELECT
            name,
            SUM(number) AS total_count
        FROM
            `bigquery-public-data.usa_names.usa_1910_2013`
        WHERE
            state = 'TX'
        GROUP BY
            name
        ORDER BY
            total_count DESC
        LIMIT 5
    """

    print("Executing query...")

    # Execute the query
    query_job = client.query(QUERY)

    print("The results are:")

    # Wait for the query to finish and retrieve the rows
    rows = query_job.result()

    for row in rows:
        # Access column values by name
        print(f"Name: {row['name']}, Total: {row['total_count']}")

if __name__ == "__main__":
    main()
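The SQL above filters to Texas, sums the counts per name, and returns the top five. As a rough illustration of what BigQuery computes, the same WHERE / GROUP BY / ORDER BY / LIMIT logic can be sketched in plain Python; the sample rows below are invented for illustration only:

```python
from collections import Counter

# Hypothetical (name, state, number) rows standing in for records
# from the usa_1910_2013 table; the values are made up.
rows = [
    ("James", "TX", 120), ("Mary", "TX", 150), ("James", "TX", 80),
    ("Mary", "CA", 300), ("John", "TX", 90),
]

# WHERE state = 'TX' ... GROUP BY name, summing `number` per name
totals = Counter()
for name, state, number in rows:
    if state == "TX":
        totals[name] += number

# ORDER BY total_count DESC LIMIT 5
top5 = totals.most_common(5)
print(top5)  # → [('James', 200), ('Mary', 150), ('John', 90)]
```

BigQuery performs this aggregation server-side, so only the five result rows travel back to the client rather than the full table.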

Key Steps Explained

1. Load configuration: load_dotenv() reads the .env file, and os.getenv() retrieves the credentials path and project ID.
2. Authenticate: service_account.Credentials.from_service_account_file() builds credentials from the key file.
3. Create the client: bigquery.Client() is initialized with the credentials and project.
4. Run the query: client.query() submits the SQL and returns a query job.
5. Fetch results: query_job.result() blocks until the job finishes, then yields rows whose columns can be accessed by name.

Note: When querying, wrap table names in backticks (`) if they contain special characters or are fully qualified (for example, project.dataset.table).
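As a small sketch of the note above, a fully qualified table ID can be interpolated into the SQL string with backticks around it, so BigQuery does not misparse the dots as something other than project/dataset/table separators:

```python
# Fully qualified table ID from the example above (project.dataset.table)
table_id = "bigquery-public-data.usa_names.usa_1910_2013"

# Wrap the ID in backticks inside the SQL string
query = f"SELECT name FROM `{table_id}` LIMIT 5"
print(query)  # → SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` LIMIT 5
```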