Drop a Column from a DataFrame in PySpark
The DataFrame.drop() method removes one or more columns from a DataFrame and returns a new DataFrame; the original DataFrame is not modified.
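Syntax a) df = dataframe.drop("column_name")
b) df = dataframe.drop("column_name1", "column_name2")
Note that drop() takes column names as separate arguments, not as a list. To drop the names held in a list, unpack it: dataframe.drop(*column_names).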
Example: First create the SparkSession.
Python
# Import the SparkSession class
from pyspark.sql import SparkSession

# Initialize a Spark session
spark = SparkSession.builder.appName("App Name").getOrCreate()
Read the data from the CSV file and show the data after reading.
Python
# Import the data
df = spark.read.csv("data.csv", header=True, inferSchema=True)

# Show the data in the DataFrame
df.show()
df.show() prints the first 20 rows of the DataFrame as a text table.

Let’s drop the “Company” column from the DataFrame.
Python
newdf = df.drop("Company")
newdf.show()
The output shows the same rows without the “Company” column. df itself is unchanged, because drop() returns a new DataFrame.
