Create BigQuery Tables from Google Cloud Storage (Using SQL)
Creating BigQuery tables directly from files stored in Google Cloud Storage (GCS) is a common data engineering workflow. This method avoids manual downloads and enables scalable analytics pipelines.
Below is a step-by-step guide using BigQuery SQL.
Prerequisites
Before creating the table:
- A Google Cloud Storage bucket with files
- Supported file formats: CSV, JSON, Parquet, ORC, Avro
- BigQuery dataset created (a creation sketch follows this list)
- Required permissions:
  - BigQuery Data Editor (roles/bigquery.dataEditor)
  - Storage Object Viewer (roles/storage.objectViewer)
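If the dataset does not already exist, it can be created with SQL as well. A minimal sketch, reusing this article's `ashishcoder.Coding_Dataset` path; the US location is an assumption, so adjust it to your project's region:
SQL
-- Create the dataset (schema) if it does not already exist.
-- The location below is an assumption; match it to your region.
CREATE SCHEMA IF NOT EXISTS `ashishcoder.Coding_Dataset`
OPTIONS (location = 'US');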
🎯 Goal
Load a CSV file stored in:
gs://new-bucket-via-python-sdk/Sample Data.csv
into a BigQuery table:
ashishcoder.Coding_Dataset.gcstable
Step 1: Create BigQuery Table
SQL
CREATE OR REPLACE TABLE `ashishcoder.Coding_Dataset.gcstable` (
  EmployeeID INT64,
  Name STRING,
  Age INT64,
  Salary INT64,
  Department STRING,
  Gender STRING
);
Explanation
- EmployeeID (INT64) → Unique employee ID
- Name (STRING) → Employee name
- Age (INT64) → Age
- Salary (INT64) → Salary
- Department (STRING) → Department
- Gender (STRING) → Gender
- CREATE OR REPLACE overwrites the table if it already exists
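To confirm the table was created with the intended schema, you can query the dataset's INFORMATION_SCHEMA.COLUMNS view (standard BigQuery metadata; the path follows this article's example):
SQL
-- List the columns and types of the new table.
SELECT column_name, data_type
FROM `ashishcoder.Coding_Dataset`.INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'gcstable'
ORDER BY ordinal_position;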
Step 2: Load Data from GCS into BigQuery
The LOAD DATA statement loads one or more files from Cloud Storage into a BigQuery table.
SQL
LOAD DATA OVERWRITE `ashishcoder.Coding_Dataset.gcstable`
FROM FILES (
  format = 'CSV',
  uris = ['gs://new-bucket-via-python-sdk/Sample Data.csv'],
  skip_leading_rows = 1
);
Parameter Breakdown
LOAD DATA OVERWRITE
- Loads the files into the target table
- Replaces any existing rows; use LOAD DATA INTO to append instead (see the sketch below)
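For comparison, a minimal sketch of the append variant, reusing the same file from this example:
SQL
-- LOAD DATA INTO appends rows instead of replacing existing data.
LOAD DATA INTO `ashishcoder.Coding_Dataset.gcstable`
FROM FILES (
  format = 'CSV',
  uris = ['gs://new-bucket-via-python-sdk/Sample Data.csv'],
  skip_leading_rows = 1
);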
a) format → Specifies the file type (see the Parquet sketch after this list). Supported formats:
- CSV
- NEWLINE_DELIMITED_JSON
- PARQUET
- AVRO
- ORC
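As an illustration, switching formats only changes the FROM FILES options. A sketch for Parquet; the file path is hypothetical, and skip_leading_rows does not apply since Parquet files carry their own schema:
SQL
-- Hypothetical Parquet source file; adjust the URI to a real object.
LOAD DATA OVERWRITE `ashishcoder.Coding_Dataset.gcstable`
FROM FILES (
  format = 'PARQUET',
  uris = ['gs://new-bucket-via-python-sdk/sample_data.parquet']
);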
b) uris → Specifies the location of files.
Examples:
# Single file
uris = ['gs://bucket/file.csv']
# Multiple files
uris = ['gs://bucket/*.csv']
c) skip_leading_rows → Number of header rows to skip at the start of each file; set it to 1 to skip a single header row.
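Once the load completes, a quick query confirms the rows arrived:
SQL
-- Spot-check the first few loaded rows.
SELECT *
FROM `ashishcoder.Coding_Dataset.gcstable`
LIMIT 10;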