Count Rows in a DataFrame in PySpark
The DataFrame.count() method returns the number of rows in a DataFrame as an integer.
Syntax: dataframe.count()
In this exercise, we use the data source data.csv. You can download it and use it for the transformations below.
Example: First, create the SparkSession and read the data from the CSV file.
Python
# Import the SparkSession module
from pyspark.sql import SparkSession

# Initialize a Spark session
spark = SparkSession.builder.appName("App Name").getOrCreate()

# Read the data from the CSV file, using the first row as the header
# and letting Spark infer the column types
df = spark.read.csv("data.csv", header=True, inferSchema=True)

# Show the data in the DataFrame
df.show()
df.show() prints the first 20 rows of the DataFrame in tabular form.
Let’s get the count of rows in the DataFrame.
Python
df.count()
This returns the total number of rows in the DataFrame as a Python integer.