Label Encoder in ML

In this lesson, we will learn how to use the Label Encoder in machine learning (ML).

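LabelEncoder is a preprocessing class in Scikit-Learn (a Python library) that converts categorical labels into integer values. This matters because many machine learning models require numeric data. Note that the Scikit-Learn documentation describes LabelEncoder as intended for the target variable (y); for input features, OrdinalEncoder or OneHotEncoder is usually the recommended choice, although LabelEncoder is often applied to individual feature columns in tutorials, as it is here.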

Why Label Encoding?

Many machine learning algorithms do not work with text or string data. Therefore, categorical variables (like "red," "blue," "green") need to be converted into numerical values.

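How Label Encoding Works

Label Encoding assigns each unique value in a categorical column an integer value, starting from 0. LabelEncoder sorts the unique values first, so for strings the integers follow alphabetical order rather than the order in which the values appear. The resulting integers carry no real ranking, but some models may still treat a larger number as "greater," which is worth keeping in mind when encoding unordered categories.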
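As a quick illustration of the sorted assignment described above, here is a minimal sketch on a small list of colors. The list `colors` and the variable names are made up for this example and are not part of the employees data used later.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample values, chosen only for illustration
colors = ["red", "blue", "green", "blue"]

encoder = LabelEncoder()
codes = encoder.fit_transform(colors)

# classes_ holds the unique values in sorted order;
# each value's position in classes_ is its integer code
print(encoder.classes_.tolist())  # ['blue', 'green', 'red']
print(codes.tolist())             # [2, 0, 1, 0]
```

Here "blue" receives 0 because it comes first alphabetically, even though "red" appears first in the input.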

Example: Import the required module and load the data from the CSV file.

Python

# Import pandas (NumPy is imported as well, although it is not used below)
import pandas as pd
import numpy as np

# Load the data into a pandas DataFrame
df = pd.read_csv("employees.csv")
df

The above code displays the loaded DataFrame. Next, encode the Gender column.

Python

# Import the LabelEncoder class
from sklearn.preprocessing import LabelEncoder

# Create the encoder object
label_encoder = LabelEncoder()

# Replace the 'Gender' column with its integer codes
df['Gender'] = label_encoder.fit_transform(df['Gender'])
df

The above code displays the DataFrame with the Gender column replaced by integer codes.
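After encoding, it is often useful to check which integer stands for which category and to convert the codes back. The sketch below shows one way to do that with the encoder's `classes_` attribute and `inverse_transform`. Since the contents of employees.csv are not shown here, it uses a small hypothetical DataFrame (`df_demo`) with made-up values.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the employees data (values are assumptions)
df_demo = pd.DataFrame({"Gender": ["Male", "Female", "Female", "Male"]})

gender_encoder = LabelEncoder()
df_demo["Gender"] = gender_encoder.fit_transform(df_demo["Gender"])

# Build a category -> code lookup from the fitted encoder
mapping = {label: code for code, label in enumerate(gender_encoder.classes_.tolist())}
print(mapping)  # {'Female': 0, 'Male': 1}

# Convert the integer codes back to the original labels
original = gender_encoder.inverse_transform(df_demo["Gender"])
print(original.tolist())  # ['Male', 'Female', 'Female', 'Male']
```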

Python

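# Replace the 'Company' column with its integer codes.
# Note: calling fit_transform again refits label_encoder, so it no longer
# remembers the categories it learned for 'Gender'.
df['Company'] = label_encoder.fit_transform(df['Company'])
df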

The above code displays the DataFrame with both the Gender and Company columns replaced by integer codes.
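Because a LabelEncoder only remembers the column it was most recently fitted on, one common approach is to keep a separate encoder per column so that every column can still be decoded later. The following is a minimal sketch of that idea, not the only way to do it. The DataFrame `df_demo`, its values, and the `encoders` dictionary are hypothetical and stand in for the employees data.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the employees data (values are assumptions)
df_demo = pd.DataFrame({
    "Gender": ["Male", "Female", "Female"],
    "Company": ["Acme", "Globex", "Acme"],
})

# Keep one fitted encoder per column so each mapping is preserved
encoders = {}
for column in ["Gender", "Company"]:
    encoders[column] = LabelEncoder()
    df_demo[column] = encoders[column].fit_transform(df_demo[column])

print(df_demo["Gender"].tolist())   # [1, 0, 0]
print(df_demo["Company"].tolist())  # [0, 1, 0]

# Each column can still be decoded with its own encoder
companies = encoders["Company"].inverse_transform(df_demo["Company"])
print(companies.tolist())  # ['Acme', 'Globex', 'Acme']
```

For encoding several input feature columns in one step, Scikit-Learn's OrdinalEncoder is an alternative worth considering.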