Rename a Column in PySpark

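In this exercise, we will learn how to rename a column in PySpark. The df.withColumnRenamed() method renames an existing column and returns a new DataFrame; the original DataFrame is left unchanged. If the DataFrame has no column with the given old name, the call is a no-op and no error is raised.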

Syntax: df.withColumnRenamed("OldColumnName", "newColumnName")

Example: First, create a SparkSession.

Python

# Import the SparkSession class
from pyspark.sql import SparkSession

# Initialize a Spark session
spark = SparkSession.builder.appName("App Name").getOrCreate()

Next, read the data from the CSV file and display it.

Python

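# Read the CSV file: header=True treats the first row as column names,
# and inferSchema=True lets Spark infer each column's data type
df = spark.read.csv("data.csv", header=True, inferSchema=True)

# Show the data in the DataFrame
df.show()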

The output of the above code is shown below:

[Image: output of df.show()]

Let’s rename the “Salary” column to “Employee Salary”.

Python

# Rename the column
newdf = df.withColumnRenamed("Salary", "Employee Salary")

# Display the DataFrame with the renamed column
newdf.show() 

The output of the above code is shown below:

[Image: output of newdf.show(), with the "Salary" column renamed to "Employee Salary"]