List BigQuery Datasets Using Python

In this tutorial, you will learn how to list all datasets in a Google Cloud project using Python. This is useful for project auditing and resource discovery.

Python Code to List Datasets

We will use the list_datasets() method of the BigQuery client from the google-cloud-bigquery library (installable with pip install google-cloud-bigquery). The python-dotenv package is also used to load configuration from a .env file.

Python

# Import the packages
from dotenv import load_dotenv
import os
from google.oauth2 import service_account
from google.cloud import bigquery

def main():
    # Loads environment variables from a .env file
    load_dotenv()

    # Use environment variables from .env file
    credentials_path = os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
    project_name = os.getenv("project_id")

    # Get the service account credentials
    credentials = service_account.Credentials.from_service_account_file(credentials_path)

    # Create the BigQuery client
    client = bigquery.Client(credentials=credentials, project=project_name)

    # List all datasets in the project
    datasets = client.list_datasets()

    for dataset in datasets:
        print(f"Dataset ID: {dataset.dataset_id}")

        # List tables inside each dataset; list_tables() accepts the
        # DatasetListItem yielded by list_datasets() directly
        tables = client.list_tables(dataset)
        for table in tables:
            print(f"  Table ID: {table.table_id}")

if __name__ == "__main__":
    main()
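The script reads its configuration from a .env file in the working directory. A minimal example might look like the following (both values are placeholders; substitute your own key path and project ID):

```shell
# .env — example values only, replace with your own
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
project_id=my-gcp-project-id
```

Note that the variable names must match what the script reads with os.getenv(): GOOGLE_APPLICATION_CREDENTIALS and project_id.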

Key Steps Explained

1. load_dotenv() loads environment variables from the .env file.
2. Credentials are built from the service account key file pointed to by GOOGLE_APPLICATION_CREDENTIALS.
3. A bigquery.Client is created with those credentials and the target project.
4. client.list_datasets() returns an iterator over the project's datasets.
5. client.list_tables() is called for each dataset to list its tables.

Note: By default, datasets are listed from the project linked with the client. You can specify a different project explicitly using list_datasets(project="project-id").
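As a small sketch of that note, the helper below lists dataset IDs from either the client's default project or an explicitly named one. The FakeClient class is purely illustrative (a stand-in for a real bigquery.Client, which accepts the same project keyword); the project name "other-project" is a made-up example:

```python
from types import SimpleNamespace

def dataset_ids(client, project=None):
    # client.list_datasets(project=...) lists datasets from the named
    # project instead of the client's default; we rely only on that
    # keyword, which a real bigquery.Client also accepts.
    return sorted(d.dataset_id for d in client.list_datasets(project=project))

# Minimal stand-in for bigquery.Client, for illustration only.
class FakeClient:
    def list_datasets(self, project=None):
        data = {None: ["sales", "hr"], "other-project": ["logs"]}
        return [SimpleNamespace(dataset_id=i) for i in data[project]]

ids_default = dataset_ids(FakeClient())                   # client's own project
ids_other = dataset_ids(FakeClient(), "other-project")    # explicit project
```

With a real client, the call would be dataset_ids(client, project="other-project-id"), provided the service account has BigQuery read access on that project.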