Return DataFrame Columns in PySpark
The dataframe.columns attribute returns the column names of the DataFrame as a Python list of strings.
Example: First create the SparkSession.
Python
# Import the SparkSession class
from pyspark.sql import SparkSession

# Initialize a Spark session
spark = SparkSession.builder.appName("App Name").getOrCreate()
Read the data from the CSV file and show the data after reading.
Python
# Import the data
df = spark.read.csv("data.csv", header=True, inferSchema=True)

# Show the data in the DataFrame
df.show()
The output of the above code is shown below:

Let’s get all the column names from the dataframe as a list.
Python
df.columns
The output of the above code is shown below:

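Because df.columns is an ordinary Python list, it can be used with regular list operations. The sketch below is one possible way to check whether a DataFrame contains the columns you expect. The helper name missing_columns, the demo function, and the sample data (an "id" and a "name" column) are assumptions made for illustration and do not come from data.csv.

```python
def missing_columns(columns, required):
    """Return the names in `required` that are absent from `columns`, in order."""
    present = set(columns)
    return [name for name in required if name not in present]


def demo():
    # Imported here so the helper above can be used without starting Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("ColumnsDemo").getOrCreate()

    # Hypothetical sample data standing in for data.csv.
    df = spark.createDataFrame([(1, "Alice"), (2, "Bob")], ["id", "name"])

    # df.columns is a plain list of column-name strings.
    print(df.columns)

    # Report which expected columns the DataFrame does not have.
    print(missing_columns(df.columns, ["id", "age"]))

    spark.stop()
```

Calling demo() starts a local Spark session, prints the column list of the sample DataFrame, and then prints the expected columns that are missing from it.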