Create a Dataset in BigQuery using Python

In this tutorial, we will learn how to create a new dataset in BigQuery using Python. A dataset is a container for tables, views, and other objects in BigQuery.

Install the BigQuery Python Client

Install the BigQuery client library, along with python-dotenv (the script below uses it to load a .env file), before running the Python code:

pip install google-cloud-bigquery python-dotenv

Python Code to Create Dataset

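We will use the create_dataset method of bigquery.Client from the google-cloud-bigquery library. Passing exists_ok=True makes the call succeed even if the dataset already exists.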

Python

# Import the packages
from dotenv import load_dotenv
import os
from google.oauth2 import service_account
from google.cloud import bigquery

def main():
    # Loads environment variables from a .env file
    load_dotenv()

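    # Read environment variables and stop early if either one is missing
    credentials_path = os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
    project_name = os.getenv("project_id")
    if not credentials_path or not project_name:
        raise SystemExit("Set GOOGLE_APPLICATION_CREDENTIALS and project_id in the .env file.")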

    # Create credentials using the service account file
    credentials = service_account.Credentials.from_service_account_file(credentials_path)

    # Create the BigQuery client
    client = bigquery.Client(credentials=credentials, project=project_name)

    # Define the fully qualified dataset ID in the form "project.dataset"
    # (replace "ashishcoder" with your own project ID)
    dataset_id = "ashishcoder.new_dataset"

    # Create the dataset object
    dataset = bigquery.Dataset(dataset_id)

    # Specify the location for the dataset
    dataset.location = "US"

    # Create the dataset in BigQuery
    dataset = client.create_dataset(dataset, exists_ok=True)

    # Print a status message (with exists_ok=True the dataset may already have existed)
    print(f"Dataset {dataset_id} is ready.")

if __name__ == "__main__":
    main()
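
The dataset ID in the script above is a string of the form project.dataset. As a minimal sketch, the helper below assembles that string from its two parts and checks the dataset name before any API call is made. Note that build_dataset_id is a hypothetical name, not part of google-cloud-bigquery, and the pattern is an assumption that covers only ASCII letters, digits, and underscores, up to 1,024 characters.

```python
import re

# Assumption: ASCII letters, digits, and underscores, 1 to 1,024 characters.
_DATASET_NAME = re.compile(r"[A-Za-z0-9_]{1,1024}")


def build_dataset_id(project_id: str, dataset_name: str) -> str:
    """Return "project.dataset" (hypothetical helper, not a library call)."""
    if not project_id:
        raise ValueError("project_id must not be empty")
    if not _DATASET_NAME.fullmatch(dataset_name):
        raise ValueError(f"invalid dataset name: {dataset_name!r}")
    return f"{project_id}.{dataset_name}"


print(build_dataset_id("ashishcoder", "new_dataset"))  # prints ashishcoder.new_dataset
```

The returned string could be passed to bigquery.Dataset in place of the hard-coded value, so a name containing a hyphen or a space is rejected locally instead of by the API.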

Key Steps Explained