Create a Dataset in BigQuery using Python
In this tutorial, we will learn how to create a new dataset in BigQuery using Python. A dataset is a container for tables, views, and other objects in BigQuery.
Install the BigQuery Python Client
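Install the required BigQuery client library before running the Python code. The example below also reads its settings from a .env file, so it needs the python-dotenv package as well:
pip install google-cloud-bigquery python-dotenv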
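To confirm that the install worked, a quick check along the following lines can help. This is only a sketch: the helper name installed_versions is made up for illustration, and python-dotenv is listed because the example in the next section imports dotenv.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return a dict mapping each package name to its version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # These are the distribution names used with pip, not the import names.
    required = ["google-cloud-bigquery", "python-dotenv"]
    for name, ver in installed_versions(required).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

If either package is reported as not installed, rerun the pip command in the same Python environment that will run the script.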
Python Code to Create Dataset
We will use the create_dataset method of the bigquery.Client class from the google-cloud-bigquery library.
# Import the packages
from dotenv import load_dotenv
import os
from google.oauth2 import service_account
from google.cloud import bigquery

def main():
    # Loads environment variables from a .env file
    load_dotenv()

    # Read environment variables
    credentials_path = os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
    project_name = os.getenv("project_id")

    # Create credentials using the service account file
    credentials = service_account.Credentials.from_service_account_file(credentials_path)

    # Create the BigQuery client
    client = bigquery.Client(credentials=credentials, project=project_name)

    # Define the dataset ID
    dataset_id = "ashishcoder.new_dataset"

    # Create the dataset object
    dataset = bigquery.Dataset(dataset_id)

    # Specify the location for the dataset
    dataset.location = "US"

    # Create the dataset in BigQuery
    dataset = client.create_dataset(dataset, exists_ok=True)

    # Print a success message
    print(f"Created dataset {dataset_id}.")

if __name__ == "__main__":
    main()
Key Steps Explained
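- bigquery.Dataset(dataset_id): Initializes a new dataset object from the fully qualified dataset ID, written as project.dataset.
- dataset.location = "US": Sets the geographic location where the dataset's data will be stored. The location cannot be changed after the dataset is created.
- client.create_dataset(dataset, exists_ok=True): Sends the request to BigQuery to create the dataset. With exists_ok=True, the call does not raise an error if the dataset already exists.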
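One thing the script does not guard against is a missing environment variable: os.getenv returns None in that case, and the failure only surfaces later inside the credentials or client call. A minimal sketch of a pre-flight check follows. The helper names require_env and build_dataset_id are hypothetical, and the naming rule in the regex (letters, digits, and underscores, up to 1,024 characters) is our reading of BigQuery's dataset naming rules, so verify it against the current documentation.

```python
import os
import re

# Assumption: dataset names allow letters, digits, and underscores, max 1,024 chars.
_DATASET_NAME = re.compile(r"[A-Za-z0-9_]{1,1024}")


def require_env(name, environ=None):
    """Return the value of an environment variable, or raise a clear error."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value


def build_dataset_id(project, dataset_name):
    """Build the 'project.dataset' string that bigquery.Dataset accepts."""
    if not _DATASET_NAME.fullmatch(dataset_name):
        raise ValueError(f"Invalid dataset name: {dataset_name!r}")
    return f"{project}.{dataset_name}"
```

In main(), project_name = require_env("project_id") followed by dataset_id = build_dataset_id(project_name, "new_dataset") could stand in for the hard-coded "ashishcoder.new_dataset", which keeps the dataset in the same project the client is configured for.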