The sum() method in Pandas

In this exercise, we use the data source employees.csv. You can download it and follow along with the examples.

The sum() method returns the sum of the values over the requested axis.

Syntax:
a) pandas.DataFrame.sum()
With no arguments, all the columns are included in the sum operation.
b) pandas.DataFrame.sum(numeric_only=True)
The numeric_only parameter specifies whether to include only the numeric and boolean columns. By default, numeric_only is False, so all the columns are included in the sum operation.
c) pandas.DataFrame.sum(axis=1)
By default, the sum is computed column-wise (axis=0), giving one total per column. To add the values across the columns and get one total per row, set the axis parameter to 1.
d) pandas.DataFrame.sum(skipna=True)
The skipna parameter is True by default, so the sum function excludes NA/null values from the sum operation.
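The axis and skipna options above can be tried on a small in-memory DataFrame before loading any file. The following is a minimal sketch; the `scores` frame and its column names are hypothetical and are not taken from the data sources used in this exercise.

```python
import numpy as np
import pandas as pd

# Hypothetical data, not taken from employees.csv or expenditure.csv
scores = pd.DataFrame({
    "math": [10.0, 20.0, np.nan],
    "physics": [1.0, 2.0, 3.0],
})

column_totals = scores.sum()               # axis=0 (default): one total per column
row_totals = scores.sum(axis=1)            # one total per row
strict_totals = scores.sum(skipna=False)   # missing values are not skipped

print(column_totals)
print(row_totals)
print(strict_totals)
```

With skipna=False, the total of a column that contains a missing value is itself NaN, while the default simply ignores the missing entry.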

Example: Load the DataFrame into the variable mydata.

Python

import pandas as pd
mydata = pd.read_csv("employees.csv")
mydata

The output of the above code is shown below:


Let’s get the sum of every column in the DataFrame.

Python

mydata.sum()  

The output of the above code is shown below:


If we want to include only the numeric and boolean columns, we set the numeric_only parameter to True.

Python

mydata.sum(numeric_only=True)   

The output of the above code is shown below:

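To see the effect of numeric_only without the CSV file, here is a small sketch on a hand-built frame. The `staff` frame and its columns are a hypothetical stand-in for employees.csv, whose actual columns may differ.

```python
import pandas as pd

# Hypothetical stand-in for employees.csv; the column names are illustrative only
staff = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy"],
    "salary": [1000, 2000, 3000],
    "is_manager": [True, False, True],
})

all_columns = staff.sum()                     # the strings in "name" are concatenated
numeric_only = staff.sum(numeric_only=True)   # "name" is left out of the result

print(all_columns)
print(numeric_only)
```

In the first result the text column is joined into a single string, which is rarely useful; the second result keeps only the salary total and the count of True values.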

Now we are going to use the data source expenditure.csv. You can download it and follow along with the examples.

Example: Load the DataFrame into the variable df.

Python

import pandas as pd

# Read the data and create a DataFrame
df = pd.read_csv("expenditure.csv")
df

The output of the above code is shown below:


Let’s create a new column in the DataFrame named "Total Expenditure", which is the sum of the "Expenditure on Clothes", "Expenditure on Food", and "Expenditure on Travelling" columns.

Python

# Add the three expenditure columns together for each row by setting axis=1
df["Total Expenditure"] = df[["Expenditure on Clothes", "Expenditure on Food", "Expenditure on Travelling"]].sum(axis=1)
df

The output of the above code is shown below:


We can also add the columns directly with the + operator, which gives the same result as above as long as the columns contain no missing values:

Python

# Add the three expenditure columns element-wise with the + operator
df["Total Expenditure"] = df["Expenditure on Clothes"] + df["Expenditure on Food"] + df["Expenditure on Travelling"]
df
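The two approaches can differ when a value is missing: sum(axis=1) skips NA values by default, while the + operator propagates them. A minimal sketch follows, assuming a hypothetical `sample` frame that reuses two of the column names above with made-up numbers.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; the values are made up and not read from expenditure.csv
sample = pd.DataFrame({
    "Expenditure on Clothes": [100.0, np.nan],
    "Expenditure on Food": [50.0, 30.0],
})

# sum(axis=1) skips the missing value in the second row
via_sum = sample.sum(axis=1)

# The + operator yields NaN for any row that has a missing value
via_plus = sample["Expenditure on Clothes"] + sample["Expenditure on Food"]

print(via_sum.tolist())
print(via_plus.tolist())
```

Which behaviour is preferable depends on the data: skipping treats a missing expenditure as zero, whereas propagating NaN flags the row total as incomplete.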