Sort a DataFrame by Column in PySpark
The DataFrame.sort() function sorts the rows of a DataFrame by the values in one or more columns.
Syntax: DataFrame.sort(*cols, ascending=True)
Here, cols specifies the column or columns to sort by, given as column names (strings), Column objects, or a list of either. The ascending keyword argument sets the sort order: True for ascending order and False for descending order. It accepts a single boolean or a list of booleans with one entry per column, and it must be passed by keyword rather than by position. By default, its value is True.
Example: First create the SparkSession and read the data from the CSV file.
Python
# Import the SparkSession class
from pyspark.sql import SparkSession

# Initialize a Spark session
spark = SparkSession.builder.appName("App Name").getOrCreate()

# Read the data from the CSV file
df = spark.read.csv("data.csv", header=True, inferSchema=True)

# Show the data in the DataFrame
df.show()
The output of the above code is shown below:

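Let’s sort the DataFrame in ascending order. The code below first totals the Salary column for each Company, then sorts the result by that total.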
Python
from pyspark.sql.functions import sum  # Spark's aggregate, not the built-in
groupdf = df.groupBy(df["Company"]).agg(sum("Salary").alias("Total Salary"))
groupdf.sort("Total Salary").show()
The output of the above code is shown below:

Let’s sort the dataframe in descending order.
Python
The output of the above code is shown below:
