The sum() method in Pandas
In this exercise, we use the data source employees.csv. You can download it and use it to follow along with the transformations.
The sum() method returns the sum of the values over the requested axis.
Syntax
a) pandas.DataFrame.sum()
It includes all the columns for the sum operation.
b) pandas.DataFrame.sum(numeric_only=True)
The numeric_only parameter specifies whether to include only the numeric and boolean columns. By default, numeric_only is False, so all columns are included in the sum operation.
c) pandas.DataFrame.sum(axis=1)
By default (axis=0), the sum is calculated column-wise, giving one total per column. To sum the values across the columns of each row instead, set the axis parameter to 1.
d) pandas.DataFrame.sum(skipna=True)
The skipna parameter is True by default, so the sum function excludes NA/null values from the sum operation.
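The effect of the axis and skipna parameters can be seen on a small table. The following is a minimal sketch that assumes a made-up DataFrame: the column names units and price and all the values are invented for illustration and do not come from employees.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical data, invented for illustration only
demo = pd.DataFrame({
    "units": [1.0, 2.0, np.nan],
    "price": [10.0, 20.0, 30.0],
})

# a) Default call: one total per column (axis=0), missing values skipped
col_totals = demo.sum()

# c) axis=1: one total per row, adding across the columns
row_totals = demo.sum(axis=1)

# d) skipna=False: a missing value makes that column's total NaN
strict_totals = demo.sum(skipna=False)

print(col_totals)
print(row_totals)
print(strict_totals)
```

Here the missing entry in units is ignored by the first two calls, while the last call reports NaN for that column.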
Example: Load the DataFrame into the variable mydata.
Python
import pandas as pd
mydata = pd.read_csv("employees.csv")
mydata
The output of the above code is shown below:
Let’s get the sum of the DataFrame.
Python
mydata.sum()
The output of the above code is shown below:
If we want to include only the numeric and boolean columns, we set the numeric_only parameter to True.
Python
mydata.sum(numeric_only=True)
The output of the above code is shown below:
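To see how the two calls differ when a table contains text, here is a minimal sketch on an invented two-row DataFrame. The columns name and salary are hypothetical and are not taken from employees.csv, and the text column is assumed to be stored with the object dtype.

```python
import pandas as pd

# Hypothetical stand-in for a table with text and numeric columns
people = pd.DataFrame({
    "name": pd.Series(["Ann", "Bob"], dtype=object),  # text stored as object dtype
    "salary": [1000, 2000],
})

# Default (numeric_only=False): strings are "summed" by concatenation
all_cols = people.sum()

# numeric_only=True: the text column is left out of the result
numeric_cols = people.sum(numeric_only=True)

print(all_cols)
print(numeric_cols)
```

With the default setting the name entry is the joined string, which is rarely useful, so numeric_only=True is usually the better choice for mixed tables.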
Next, we use the data source expenditure.csv. You can download it and use it to follow along with the transformations.
Example: Load the DataFrame into the variable df.
Python
import pandas as pd
# Read the data, and create a dataframe
df = pd.read_csv("expenditure.csv")
df
The output of the above code is shown below:
Let’s create a new column in the DataFrame named "Total Expenditure", which is the sum of the "Expenditure on Clothes", "Expenditure on Food", and "Expenditure on Travelling" columns.
Python
# Calculate the sum over the rows in the dataframe by specifying the axis parameter
df["Total Expenditure"] = df[["Expenditure on Clothes", "Expenditure on Food", "Expenditure on Travelling"]].sum(axis=1)
df
The output of the above code is shown below:
We can also add the columns directly with the + operator, which gives the same result as above when none of the columns contain missing values:
Python
# Calculate the same total by adding the three columns element-wise with the + operator
df["Total Expenditure"] = df["Expenditure on Clothes"] + df["Expenditure on Food"] + df["Expenditure on Travelling"]
df
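The two approaches differ when a value is missing: sum(axis=1) skips NaN by default, while the + operator propagates it. The sketch below illustrates this on a made-up two-row table that reuses the column names from the example. The numbers are invented and are not taken from expenditure.csv.

```python
import numpy as np
import pandas as pd

cols = ["Expenditure on Clothes", "Expenditure on Food", "Expenditure on Travelling"]

# Invented values; the second row has a missing food entry
sample = pd.DataFrame(
    [[100.0, 200.0, 50.0],
     [80.0, np.nan, 20.0]],
    columns=cols,
)

# sum(axis=1) skips the missing value and adds the rest
via_sum = sample[cols].sum(axis=1)

# The + operator propagates NaN into the row total
via_plus = sample[cols[0]] + sample[cols[1]] + sample[cols[2]]

print(via_sum)
print(via_plus)
```

If a missing value should make the row total missing as well, passing skipna=False to sum(axis=1) gives the same behaviour as the + operator.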