The drop_duplicates method in Pandas

The pandas.DataFrame.drop_duplicates() method returns a new DataFrame with the duplicate rows removed; the original DataFrame is left unchanged. By default, it compares all columns to decide whether a row is a duplicate and keeps the first occurrence of each row. Optionally, we can pass a list of columns so that only those columns are compared.

Syntax:
a) pandas.DataFrame.drop_duplicates()
b) pandas.DataFrame.drop_duplicates(subset=["column_name1", "column_name2"])

Example: Create a dataframe.

Python

import pandas as pd

# Sample data: the row at index 3 repeats the row at index 0 exactly
mydata = {
    'Name': ['Ashish', 'Katrina', 'Alia', 'Ashish', 'Alia'],
    'Age': [25, 30, 35, 25, 40],
    'City': ['New York', 'Los Angeles', 'Mumbai', 'New York', 'Mumbai']
}
df = pd.DataFrame(mydata)
print(df)

Use the following call to drop the duplicate rows.

Python

newdf = df.drop_duplicates()
print(newdf)

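In the output, the row at index 3 ('Ashish', 25, 'New York') is gone because it repeats the row at index 0 in every column. The rows at index 0, 1, 2 and 4 remain; the two 'Alia' rows are both kept because their Age values differ.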
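
The subset form from the syntax section can be tried on the same data. The snippet below is a minimal sketch: it rebuilds the DataFrame from the example and compares only the Name and City columns. The choice of these two columns and the variable name subset_df are assumptions made for illustration; they are not part of the original example.

```python
import pandas as pd

# Same sample data as in the example above
mydata = {
    'Name': ['Ashish', 'Katrina', 'Alia', 'Ashish', 'Alia'],
    'Age': [25, 30, 35, 25, 40],
    'City': ['New York', 'Los Angeles', 'Mumbai', 'New York', 'Mumbai']
}
df = pd.DataFrame(mydata)

# Compare only Name and City; Age is ignored when looking for duplicates.
# subset_df is an illustrative name, not taken from the original example.
subset_df = df.drop_duplicates(subset=['Name', 'City'])
print(subset_df)
```

With only these two columns compared, the second 'Alia' row (Age 40) also counts as a duplicate of the first, so only the rows at index 0, 1 and 2 remain. The default call keeps that row because it checks Age as well.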
