An Introduction to Artificial Intelligence
In this exercise, we will learn about artificial intelligence and its related terms.
Artificial intelligence (AI): AI is a broad field that encompasses the development of intelligent systems capable of performing tasks that typically require human intelligence, such as perception, reasoning, learning, problem-solving, and decision-making. AI serves as an umbrella term for various techniques and approaches, including machine learning, deep learning, and generative AI, among others.
Machine learning (ML): ML is a type of AI concerned with understanding and building methods that make it possible for machines to learn. These methods use data to improve computer performance on a set of tasks.
Deep learning (DL): Deep learning is a subset of machine learning that uses artificial neural networks, layers of connected "neurons" and "synapses" loosely modeled on how our brain is wired. An example of a deep learning application is Amazon Rekognition, which can analyze millions of images and streaming and stored videos within seconds.
Generative AI: Generative AI is a subset of deep learning: it builds on models trained with deep learning and can often adapt them to new tasks without retraining or fine-tuning.
Generative AI systems are capable of generating new data based on the patterns and structures learned from training data.

KNeighborsRegressor Algorithm
In the context of K-Nearest Neighbors (KNN), neighbors are the data points in the training dataset that are closest to a given data point according to a distance metric (e.g., Euclidean distance).
Syntax
sklearn.neighbors.KNeighborsRegressor(n_neighbors=5)
The significance of neighbors lies in their influence on predicting the class (in classification) or value (in regression) of the new or test data point. KNN assumes that similar data points exist close to each other in feature space (i.e., they share similar properties).
How It Works: For a given test data point, the algorithm:
- Measures its distance to all the points in the training dataset.
- Selects the k nearest neighbors.
- Makes a prediction based on these neighbors:
  - Classification: Assigns the most common class among the neighbors.
  - Regression: Takes the average (or weighted average) of the values of the neighbors.
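The three steps above can be sketched by hand for the regression case and compared with KNeighborsRegressor. This is a minimal illustration, not the library's internal implementation; the toy data, the value k = 3, and all variable names are assumptions made up for the example.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Toy training data (made up for illustration): one feature, one target.
X_train = np.array([[1.0], [2.0], [3.0], [6.0], [7.0]])
y_train = np.array([10.0, 20.0, 30.0, 60.0, 70.0])
x_new = np.array([2.5])  # the test point we want a prediction for
k = 3

# Step 1: measure the Euclidean distance to every training point.
distances = np.linalg.norm(X_train - x_new, axis=1)

# Step 2: select the indices of the k nearest neighbors.
nearest = np.argsort(distances)[:k]

# Step 3 (regression): average the target values of those neighbors.
manual_prediction = y_train[nearest].mean()

# The same prediction from scikit-learn (uniform weights by default).
model = KNeighborsRegressor(n_neighbors=k)
model.fit(X_train, y_train)
sklearn_prediction = model.predict(x_new.reshape(1, -1))[0]

print(manual_prediction, sklearn_prediction)
```

Here the three nearest neighbors of 2.5 are the points at 1, 2, and 3, so both approaches average the targets 10, 20, and 30. Passing weights="distance" to KNeighborsRegressor would give the weighted average mentioned above instead.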
Machine Learning using ColumnTransformer
Python Syntax
from sklearn.compose import ColumnTransformer

# Create a ColumnTransformer
column_transformer = ColumnTransformer(
    transformers=[
        ('transformer_name1', transformer1, column_list1),
        ('transformer_name2', transformer2, column_list2),
        ...
    ],
    remainder='drop'  # Optional: Specifies what to do with unselected columns
)
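As a concrete instance of the template above, the following sketch applies a ColumnTransformer to a small DataFrame. The data and the column names (age, city, customer_id) are hypothetical and chosen only to show how each transformer is routed to its own columns.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "age": [25, 35, 45, 55],
    "city": ["Paris", "Lima", "Paris", "Oslo"],
    "customer_id": [101, 102, 103, 104],
})

column_transformer = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), ["age"]),   # scale the numerical column
        ("cat", OneHotEncoder(), ["city"]),   # one-hot encode the categorical column
    ],
    remainder="drop",  # customer_id is not listed above, so it is dropped
)

transformed = column_transformer.fit_transform(df)

# 4 rows; 1 scaled column + 3 one-hot columns (Lima, Oslo, Paris)
print(transformed.shape)
```

Setting remainder="passthrough" instead would keep customer_id unchanged as an extra output column.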
Create a Pipeline in ML
Python Syntax
from sklearn.pipeline import Pipeline

# Create a pipeline
pipeline = Pipeline(steps=[
    ('step_name1', transformer_or_model1),
    ('step_name2', transformer_or_model2),
    ...
])
Explanation:
- Pipeline: The scikit-learn class used to create the pipeline.
- steps: A list of tuples specifying the sequence of steps.
- step_name: A string that names the step (e.g., "preprocessor", "scaler").
- transformer_or_model: A transformer (e.g., scaler, encoder) or model (e.g., RandomForestClassifier) to be applied in this step.
- Order: Steps are executed sequentially in the order they appear in the list.
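To make the explanation above concrete, here is a small sketch of a two-step pipeline. The step names ("scaler", "regressor"), the choice of KNeighborsRegressor as the final model, and the toy arrays are assumptions for illustration only.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data (made up): two features on very different scales.
X = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0], [4.0, 400.0]])
y = np.array([1.0, 2.0, 3.0, 4.0])

pipeline = Pipeline(steps=[
    ("scaler", StandardScaler()),                       # step 1: a transformer
    ("regressor", KNeighborsRegressor(n_neighbors=2)),  # step 2: the final model
])

# fit() runs the steps in list order: scale first, then train the model.
pipeline.fit(X, y)

# Steps keep their order and can be looked up by name after fitting.
step_names = [name for name, _ in pipeline.steps]
fitted_scaler = pipeline.named_steps["scaler"]

print(step_names)
print(fitted_scaler.mean_)  # per-feature means learned during fit
```

Every step except the last must be a transformer (it needs a transform method); only the final step can be a model.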
Pipeline with ColumnTransformer
Python
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline

# Define the preprocessors
preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), ['age', 'income']),  # Scale numerical columns
        ('cat', OneHotEncoder(), ['gender', 'city'])   # Encode categorical columns
    ])

# Create a pipeline
pipeline = Pipeline(steps=[
    ('preprocessor', preprocessor),           # Step 1: Preprocess the data
    ('classifier', RandomForestClassifier())  # Step 2: Train the classifier
])
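Key Methods in Pipelines:
1. fit(X, y): Fits all steps (e.g., preprocessing and model) to the training data.
Python Syntax
pipeline.fit(X, y)
2. predict(X): Applies the fitted transformations to X and makes predictions using the model.
Python Syntax
pipeline.predict(X)
3. fit_predict(X, y): Fits the pipeline and directly predicts results for the same data. It is available only when the final step implements fit_predict (e.g., a clustering estimator such as KMeans), so it does not apply to the RandomForestClassifier pipeline above.
Python Syntax
pipeline.fit_predict(X, y)
4. score(X, y): Evaluates the model's performance (e.g., mean accuracy for classifiers).
Python Syntax
pipeline.score(X, y)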
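The four methods can be tried end to end on a small dataset. The sketch below reuses the column names from the pipeline example (age, income, gender, city); the data values, the labels, the random_state settings, and the use of KMeans to demonstrate fit_predict are assumptions for illustration. A clustering estimator is used for fit_predict because that method exists on a pipeline only when its final step provides it.

```python
import pandas as pd
from sklearn.base import clone
from sklearn.cluster import KMeans
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; values and labels are made up.
X = pd.DataFrame({
    "age": [22, 25, 47, 52, 46, 56, 23, 51],
    "income": [20000, 24000, 80000, 90000, 75000, 95000, 21000, 88000],
    "gender": ["F", "M", "F", "M", "M", "F", "M", "F"],
    "city": ["Lima", "Oslo", "Lima", "Oslo", "Lima", "Oslo", "Lima", "Oslo"],
})
y = [0, 0, 1, 1, 1, 1, 0, 1]

preprocessor = ColumnTransformer(transformers=[
    ("num", StandardScaler(), ["age", "income"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["gender", "city"]),
])

pipeline = Pipeline(steps=[
    ("preprocessor", preprocessor),
    ("classifier", RandomForestClassifier(n_estimators=50, random_state=0)),
])

pipeline.fit(X, y)                     # 1. fit every step on the training data
predictions = pipeline.predict(X)      # 2. transform X, then predict
train_accuracy = pipeline.score(X, y)  # 4. mean accuracy for a classifier

# 3. fit_predict needs a final step that implements it, such as KMeans.
#    clone() gives this pipeline its own unfitted copy of the preprocessor.
cluster_pipeline = Pipeline(steps=[
    ("preprocessor", clone(preprocessor)),
    ("clusterer", KMeans(n_clusters=2, n_init=10, random_state=0)),
])
cluster_labels = cluster_pipeline.fit_predict(X)

print(predictions)
print(train_accuracy)
print(cluster_labels)
```

The accuracy here is measured on the training data itself, so it says little about how the model generalizes; in practice the data would be split (for example with train_test_split) and score would be called on the held-out part.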