List BigQuery Datasets Using Python
In this tutorial, we will learn how to retrieve a list of all datasets available in your Google Cloud project using Python. This is useful for project auditing and resource discovery.
Python Code to List Datasets
We will use the list_datasets() method from the google-cloud-bigquery library.
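Before running the script below, the two libraries it imports need to be installed, and a `.env` file must sit next to the script. A one-time setup sketch follows; the package names are the real PyPI names, but both `.env` values are placeholders for your own key path and project ID:

```shell
# Install the libraries the script imports
# (google-auth, which provides google.oauth2, is pulled in as a dependency)
pip install google-cloud-bigquery python-dotenv

# Example .env file in the same directory as the script
# (both values are placeholders):
#   GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
#   project_id=my-gcp-project
```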
Python
# Import the packages
from dotenv import load_dotenv
import os

from google.oauth2 import service_account
from google.cloud import bigquery


def main():
    # Load environment variables from a .env file
    load_dotenv()

    # Read the credentials path and project ID from the environment
    credentials_path = os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
    project_name = os.getenv("project_id")

    # Build the service account credentials from the key file
    credentials = service_account.Credentials.from_service_account_file(credentials_path)

    # Create the BigQuery client
    client = bigquery.Client(credentials=credentials, project=project_name)

    # List all datasets in the project
    datasets = client.list_datasets()
    for dataset in datasets:
        print(f"Dataset ID: {dataset.dataset_id}")

        # List tables inside each dataset
        tables = client.list_tables(dataset.dataset_id)
        for table in tables:
            print(f" Table ID: {table.table_id}")


if __name__ == "__main__":
    main()

Key Steps Explained
- client.list_datasets() returns an iterator of DatasetListItem objects, one per dataset in the client's project.
- dataset.dataset_id retrieves the unique identifier of each dataset.
- client.list_tables() lists the tables inside a dataset; it accepts a dataset ID string, a DatasetReference, or the listed dataset object itself.
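Putting the three calls together, the nested loop above can be sketched as a helper that returns a plain dictionary mapping each dataset ID to its table IDs. The `catalog` name is illustrative, not part of the library; `client` is the bigquery.Client created earlier:

```python
def catalog(client):
    """Map each dataset ID in the client's project to its table IDs.

    Relies only on client.list_datasets() and client.list_tables(),
    exactly as in the script above, but collects the IDs instead of
    printing them.
    """
    return {
        dataset.dataset_id: [
            table.table_id for table in client.list_tables(dataset.dataset_id)
        ]
        for dataset in client.list_datasets()
    }
```

A dictionary like this is convenient for auditing, since it can be compared between runs or dumped to JSON.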
Note: By default, datasets are listed from the project linked with the client. You can specify a project explicitly using list_datasets(project="project-id").
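The note above can be wrapped in a small helper. Both the `list_dataset_ids` name and the "other-project-id" value are illustrative placeholders, and the credentials must have read access (for example the BigQuery Metadata Viewer role) on whichever project is passed:

```python
def list_dataset_ids(client, project=None):
    """Return the dataset IDs visible to `client`.

    With project=None (the default), the client's own project is used;
    pass another project ID to list that project's datasets instead.
    """
    return [dataset.dataset_id for dataset in client.list_datasets(project=project)]


# Hypothetical usage:
# list_dataset_ids(client)                              # client's own project
# list_dataset_ids(client, project="other-project-id")  # another project
```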