An Introduction to Artificial Intelligence

In this exercise, we will learn about artificial intelligence and its related terms.

Artificial intelligence (AI) is a broad field that encompasses the development of intelligent systems capable of performing tasks that typically require human intelligence, such as perception, reasoning, learning, problem-solving, and decision-making. AI serves as an umbrella term for various techniques and approaches, including machine learning, deep learning, and generative AI, among others.

Machine learning (ML) is a subfield of AI concerned with understanding and building methods that make it possible for machines to learn. These methods use data to improve computer performance on a set of tasks.

Deep learning (DL) uses the concept of neurons and synapses, similar to how our brain is wired. An example of a deep learning application is Amazon Rekognition, which can analyze millions of images and streaming and stored videos within seconds.

Generative AI is a subset of deep learning because it can adapt models built using deep learning, but without retraining or fine-tuning.

Generative AI systems are capable of generating new data based on the patterns and structures learned from training data.


KNeighborsRegressor Algorithm

In the context of K-Nearest Neighbors (KNN), neighbors refer to the data points in the training dataset that are closest to a given data point based on a distance metric (e.g., Euclidean distance).

Python Syntax

sklearn.neighbors.KNeighborsRegressor(n_neighbors=5)

The significance of neighbors lies in their influence on predicting the class (in classification) or value (in regression) of the new or test data point. KNN assumes that similar data points exist close to each other in feature space (i.e., they share similar properties).

How It Works: For a given test data point, the algorithm:

1. Computes the distance from the test point to every point in the training set.
2. Selects the k nearest training points (the neighbors), where k is n_neighbors.
3. Predicts the value by averaging the neighbors' target values (regression) or by taking a majority vote of their classes (classification).
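A minimal, hedged sketch of this for regression is shown below; the toy data and the choice of n_neighbors=2 are invented for illustration.

Python

import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Toy training data: one feature and a numeric target (values are made up)
X_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y_train = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# Use the 2 nearest neighbors for prediction
knn = KNeighborsRegressor(n_neighbors=2)
knn.fit(X_train, y_train)

# The prediction is the mean of the 2 nearest neighbors' targets:
# the neighbors of 2.4 are 2.0 and 3.0, so the output is (20 + 30) / 2 = 25.0
print(knn.predict([[2.4]]))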

Machine Learning using ColumnTransformer

Python Syntax

from sklearn.compose import ColumnTransformer

# Create a ColumnTransformer
column_transformer = ColumnTransformer(
    transformers=[
        ('transformer_name1', transformer1, column_list1),
        ('transformer_name2', transformer2, column_list2),
        ...
    ],
    remainder='drop'  # Optional: Specifies what to do with unselected columns
)  
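As an illustration, a ColumnTransformer can also be used on its own with fit_transform. The DataFrame and column names below are made up for the example.

Python

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder

# Hypothetical data: two numerical columns and one categorical column
df = pd.DataFrame({
    'age': [25, 32, 47],
    'income': [40000, 52000, 61000],
    'city': ['Delhi', 'Mumbai', 'Delhi']
})

column_transformer = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), ['age', 'income']),   # scale the numerical columns
        ('cat', OneHotEncoder(), ['city'])              # one-hot encode the categorical column
    ],
    remainder='drop'
)

# Result: 2 scaled columns + 2 one-hot columns
transformed = column_transformer.fit_transform(df)
print(transformed.shape)  # (3, 4)

Note that selecting columns by name, as above, requires the input to be a pandas DataFrame.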

Create a Pipeline in ML

Python Syntax

from sklearn.pipeline import Pipeline

# Create a pipeline
pipeline = Pipeline(steps=[
    ('step_name1', transformer_or_model1),
    ('step_name2', transformer_or_model2),
    ...
])  

Explanation:

Each step is a (name, object) tuple. The name is any string you choose to identify the step, and the object is a transformer or a model. All steps except the last must be transformers (they implement fit and transform); the last step can be any estimator, such as a classifier or regressor. When the pipeline is fitted, the data flows through the steps in order.
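As a small, hedged sketch (the toy data is invented), a pipeline can chain a scaler with the KNeighborsRegressor introduced earlier:

Python

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsRegressor

# Toy data (made up for demonstration)
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# Step 1 scales the feature; step 2 fits the regressor on the scaled data
simple_pipeline = Pipeline(steps=[
    ('scaler', StandardScaler()),
    ('regressor', KNeighborsRegressor(n_neighbors=2))
])

# Fitting and predicting run every step in order with a single call
simple_pipeline.fit(X, y)
print(simple_pipeline.predict([[2.4]]))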

Pipeline with ColumnTransformer

Python

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline

# Define the preprocessors
preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), ['age', 'income']),        # Scale numerical columns
        ('cat', OneHotEncoder(), ['gender', 'city'])         # Encode categorical columns
    ])

# Create a pipeline
pipeline = Pipeline(steps=[
    ('preprocessor', preprocessor),                         # Step 1: Preprocess the data
    ('classifier', RandomForestClassifier())                # Step 2: Train the classifier
]) 

Key Methods in Pipelines:

1. fit(X, y) Fits all steps (e.g., preprocessing and model) to the training data.

Python Syntax

pipeline.fit(X_train, y_train)

2. predict(X) Applies the transformations and makes predictions using the model.

Python Syntax

y_pred = pipeline.predict(X_test)

3. fit_predict(X, y) Fits the pipeline and directly predicts results. This method is available only when the final estimator itself implements fit_predict (for example, clustering models such as KMeans).

Python Syntax

y_pred = pipeline.fit_predict(X_train, y_train)

4. score(X, y) Evaluates the model's performance (e.g., accuracy for classifiers).

Python Syntax

pipeline.score(X_test, y_test)
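To put the key methods together, here is a hedged end-to-end sketch that fits and scores the preprocessor-plus-classifier pipeline defined above. The small DataFrame and target values are fabricated for illustration, and the column names match those expected by the preprocessor ('age', 'income', 'gender', 'city').

Python

import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data matching the columns used by the preprocessor above
data = pd.DataFrame({
    'age':    [22, 35, 58, 41, 29, 63, 48, 33],
    'income': [28000, 52000, 76000, 61000, 39000, 82000, 67000, 45000],
    'gender': ['F', 'M', 'M', 'F', 'F', 'M', 'F', 'M'],
    'city':   ['Delhi', 'Mumbai', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai']
})
target = [0, 1, 1, 1, 0, 1, 1, 0]

X_train, X_test, y_train, y_test = train_test_split(
    data, target, test_size=0.25, random_state=42
)

pipeline.fit(X_train, y_train)           # preprocess and train in one call
y_pred = pipeline.predict(X_test)        # transform X_test, then predict
print(pipeline.score(X_test, y_test))    # mean accuracy on the test split

Because the ColumnTransformer selects columns by name, X_train and X_test must be pandas DataFrames containing those columns.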