Introduction to Machine Learning
What is Machine Learning?
Machine learning
⚫ Machine learning is a method of data analysis
that automates analytical model building.
⚫ It is a branch of artificial intelligence
based on the idea that systems can learn from
data, identify patterns, and make decisions with
minimal human intervention.
Why do we need Machine Learning?
● Healthcare
● Government systems
● Marketing and sales
● E-commerce and social media
● Transportation, etc.
Application
How machine learning works
Types of Machine learning
Supervised learning is the type of machine learning that learns a
function mapping an input to an output from example input-output
pairs. It infers this function from labeled training data consisting of
a set of training examples.
Example algorithms:
● Linear Regression
● Logistic Regression
● Decision Trees
● Support Vector Machines (SVM)
Unsupervised learning is a type of machine learning used to draw
inferences from datasets consisting of input data without labeled
responses. The most common unsupervised learning method is
cluster analysis, which is used in exploratory data analysis to find
hidden patterns or groupings in data.
Example techniques:
● K-Means Clustering
● Hierarchical Clustering
● Principal Component Analysis (PCA)
Steps involved in building a machine learning model
1. Import required packages
2. Load the dataset
3. Identify and group the feature and target attributes
4. Visualize the data to help choose a suitable algorithm
5. Split the dataset into a training set and a testing set
6. Instantiate the model
7. Train the model on the training data
8. Predict on the test data
9. Score the model
10. Determine the error and accuracy
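The steps above can be sketched roughly as follows. This is a minimal illustration, assuming scikit-learn is installed; the Iris dataset, the choice of LogisticRegression, the 80/20 split and the random seed are all assumptions made for the example, not choices stated on the slides.

```python
# 1. Import required packages (assumes scikit-learn is available)
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# 2. Load dataset (Iris is an assumed example bundled with scikit-learn)
data = load_iris()

# 3. Separate the features (X) from the target attribute (y)
X, y = data.data, data.target

# 4. Visualization is skipped here to keep the sketch short.

# 5. Split into training and testing sets (80% / 20%, fixed seed)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# 6. Create the model (an assumed choice of algorithm)
model = LogisticRegression(max_iter=1000)

# 7. Train the model on the training data
model.fit(X_train, y_train)

# 8. Predict on the test data
y_pred = model.predict(X_test)

# 9-10. Score the model and derive the error from the accuracy
accuracy = accuracy_score(y_test, y_pred)
error = 1.0 - accuracy
print(f"accuracy={accuracy:.3f}, error={error:.3f}")
```

Any other estimator with `fit` and `predict` methods could be swapped in at step 6 without changing the rest of the flow.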
Evaluation Metrics for Regression
Evaluation Metrics for Classification
Accuracy
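Measures the percentage of correctly predicted instances.
Formula:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
where TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and false negatives.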
Precision
Measures the correctness of positive
predictions.
Formula:
Precision = TP / (TP + FP)
Recall (Sensitivity)
Measures the model's ability to capture actual
positive cases.
Formula:
Recall = TP / (TP + FN)
F1-Score
Harmonic mean of precision and recall; useful for imbalanced datasets.
Formula:
F1-Score = 2 * (Precision * Recall) / (Precision + Recall)
Confusion Matrix
Summarizes model predictions compared to
actual values.
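The four formulas above can be computed directly from confusion-matrix counts. The sketch below is illustrative only: the function names and the sample labels are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a binary problem."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def classification_metrics(tp, tn, fp, fn):
    """Apply the accuracy, precision, recall and F1 formulas."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Assumed convention: report 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * (precision * recall) / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = positive class, 0 = negative class
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(*counts)
print(counts, metrics)
```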
Supervised Machine Learning
Regression
1. Regression analysis involves a set of independent variables and one
dependent variable.
2. Regression analysis is applied when we have to model the relationship
between the dependent and independent variables.
3. Regression is used whenever the dependent variable is continuous or numerical.
Linear Regression
1. The dependent variable is continuous in nature.
2. Applied when there is a linear relationship between the independent and
dependent variables.
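As a small illustration of fitting a linear relationship, the sketch below uses the ordinary least squares formulas for a single independent variable. The function name `fit_line` and the data points are hypothetical, chosen so that the points lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying on the line y = 2x + 1
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)

# Predict the continuous dependent variable for a new input
prediction = slope * 6 + intercept
print(slope, intercept, prediction)
```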
Logistic Regression
Logistic regression is similar to linear regression but is used when the
dependent variable is categorical rather than numeric (for example, a
Yes/No response).
It is called regression but performs classification: based on the
regression output, it assigns the dependent variable to one of the classes.
Logistic regression is used to predict a binary output, for example when a
credit card company builds a model to decide whether or not to issue a
credit card to a customer.
First, a linear combination of the input variables is computed, as in
linear regression, and passed through the sigmoid function to give a
probability. The classification threshold on that probability is assumed
to be 0.5.
[Figure: linear regression line compared with the logistic (sigmoid) function]
The logistic (sigmoid) function is applied to the regression output to get
the probability of belonging to either class.
The regression output itself is the log odds: the log of the ratio of the
probability of the event occurring to the probability of it not occurring.
Log odds = log(p / (1 - p))
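The relationship between the sigmoid function, the log odds and the 0.5 threshold can be checked with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, and `z` stands for the linear regression output.

```python
import math


def sigmoid(z):
    """Map a linear output z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(p):
    """Log of the ratio between p and (1 - p); inverse of sigmoid."""
    return math.log(p / (1.0 - p))


def predict_class(z, threshold=0.5):
    """Assign class 1 when the probability reaches the threshold."""
    return 1 if sigmoid(z) >= threshold else 0


z = 2.0                      # hypothetical linear output
p = sigmoid(z)               # probability of the positive class
print(p, log_odds(p), predict_class(z))
```

Because the log odds of `sigmoid(z)` is `z` again, a threshold of 0.5 on the probability is the same as a threshold of 0 on the linear output.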
K-Nearest Neighbours (K-NN)
K-NN is one of the simplest classification algorithms. Given data points
that are already separated into several classes, it predicts the
classification of a new sample point.
K-NN is a non-parametric, lazy learning algorithm.
It classifies new cases based on a similarity
measure (e.g. distance functions).
K-NN does not learn a model. It is lazy: it just stores the training data
and defers all computation to prediction time.
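A minimal K-NN classifier can be written with the standard library alone. The sketch below assumes Euclidean distance as the similarity measure and a simple majority vote; the function name and the sample points are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbours."""
    # No training step: distances are computed only at prediction time.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical 2-D points in two well-separated classes
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(points, labels, (1.5, 1.5)))
print(knn_predict(points, labels, (6.5, 6.5)))
```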
Decision Tree Classification
A decision tree builds classification or regression models in the
form of a tree structure.
It breaks a dataset down into smaller and smaller subsets while
an associated decision tree is incrementally developed.
The final result is a tree with decision nodes and leaf nodes.
A decision tree uses entropy and information gain to construct the tree.
Information Gain
1. It measures the reduction in entropy obtained by splitting on an
independent attribute.
2. Constructing a decision tree is all about finding the attribute that
returns the highest information gain (i.e., the most
homogeneous branches).
3. The information of the target attribute is calculated first; the gain of
an attribute is that value minus the attribute's entropy.
Entropy
1. Entropy is the degree of uncertainty or randomness in a set of
elements; in other words, it is a measure of impurity.
2. The entropy of an attribute is the sum, over its values, of the
probability of each value multiplied by the information of the subset with that value.
3. Out of all the feature attributes, the one with the highest gain becomes the root node.
age   competition   type   profit
old   yes           s/w    down
old   no            s/w    down
old   no            h/w    down
mid   yes           s/w    down
mid   yes           h/w    down
mid   no            h/w    up
mid   no            s/w    up
new   yes           s/w    up
new   no            h/w    up
new   no            s/w    up
First, find the entropy of Age:
       Down   Up
Old    3      0
Mid    2      2
New    0      3
P(old) = 3/10
P(mid) = 4/10
P(new) = 3/10
E(Age) = P(old)*I(old) + P(mid)*I(mid) + P(new)*I(new)
E(Age) = 3/10*0 + 4/10*1 + 3/10*0
E(Age) = 0.4
Gain(Age) = 1 - 0.4
Gain(Age) = 0.6

I.G. (target) = 1
Gain(Age) = 0.6
Gain(Competition) = 0.124
Gain(Type) = 0
Resulting tree:
Age
  old -> down
  mid -> Competition
           yes -> down
           no  -> up
  new -> up
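The gain values in this worked example can be reproduced with a short script. It is a sketch: the rows are copied from the table above, while the helper names `entropy` and `information_gain` and the column-index mapping are hypothetical.

```python
import math
from collections import Counter

# Rows from the table above: (age, competition, type, profit)
rows = [
    ("old", "yes", "s/w", "down"),
    ("old", "no", "s/w", "down"),
    ("old", "no", "h/w", "down"),
    ("mid", "yes", "s/w", "down"),
    ("mid", "yes", "h/w", "down"),
    ("mid", "no", "h/w", "up"),
    ("mid", "no", "s/w", "up"),
    ("new", "yes", "s/w", "up"),
    ("new", "no", "h/w", "up"),
    ("new", "no", "s/w", "up"),
]
COLUMNS = {"age": 0, "competition": 1, "type": 2}


def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, column):
    """Target entropy minus the weighted entropy after splitting."""
    target = [row[-1] for row in rows]
    weighted = 0.0
    for value in {row[column] for row in rows}:
        subset = [row[-1] for row in rows if row[column] == value]
        weighted += len(subset) / len(rows) * entropy(subset)
    return entropy(target) - weighted


gains = {name: information_gain(rows, idx) for name, idx in COLUMNS.items()}
root = max(gains, key=gains.get)  # attribute with the highest gain
print(gains, root)
```

Age has the highest gain, so it becomes the root node, which matches the tree shown above.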
Why Python?
Integrated Development Environment (IDE)
Installation
● https://docs.anaconda.com/anaconda/install/windows/
Data types
⚫Data types are the classification or
categorization of data items.
Variables
⚫Variables are containers for storing data values.
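A few of Python's built-in data types, each stored in a variable, are sketched below. The variable names and values are hypothetical and only serve as an illustration.

```python
# Each variable is a container for a value; the value determines the type.
count = 10                              # int
price = 19.99                           # float
name = "Ada"                            # str
is_active = True                        # bool
scores = [90, 85, 77]                   # list (mutable sequence)
point = (3, 4)                          # tuple (immutable sequence)
person = {"name": "Ada", "age": 36}     # dict (key-value mapping)

values = (count, price, name, is_active, scores, point, person)
type_names = [type(value).__name__ for value in values]
print(type_names)

# A variable can be rebound to a value of a different type.
count = "ten"
print(type(count).__name__)
```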
Operators
⚫Arithmetic operators
⚫Assignment operators
⚫Comparison operators
⚫Logical operators
⚫Identity operators
⚫Membership operators
⚫Bitwise operators
Arithmetic operators
Assignment operators
Comparison operators
Bitwise operators
Identity operators
Membership operators
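One short example per operator family is sketched below. The operand values are hypothetical, picked so that the results are easy to check by hand.

```python
a, b = 7, 3

# Arithmetic: add, subtract, multiply, divide, floor-divide, modulo, power
arithmetic = (a + b, a - b, a * b, a / b, a // b, a % b, a ** b)

# Assignment: += updates the variable in place
total = a
total += b

# Comparison: each expression evaluates to True or False
comparison = (a == b, a != b, a > b, a <= b)

# Logical: and, or, not
logical = (a > 5 and b > 5, a > 5 or b > 5, not a > 5)

# Bitwise: 7 is 0b111 and 3 is 0b011
bitwise = (a & b, a | b, a ^ b, a << 1, a >> 1)

# Identity: `is` compares object identity, `==` compares values
first = [1, 2]
second = [1, 2]
alias = first
identity = (first is alias, first is second, first == second)

# Membership: in / not in
membership = (2 in first, 5 not in first)

print(arithmetic, total, comparison, logical, bitwise, identity, membership)
```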
Decision Making
⚫ Decision-making is the anticipation of conditions occurring
during the execution of a program, with specified actions taken
according to those conditions.
⚫ if statements: An if statement consists of a Boolean expression
followed by one or more statements.
⚫ if...else statements: An if statement can be followed by an
optional else statement, which executes when the Boolean
expression is False.
⚫ nested if statements: You can use one if or elif statement
inside another if or elif statement.
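The three forms above can be combined in one small function. This is an illustrative sketch; the function name `grade` and the score thresholds are hypothetical.

```python
def grade(score):
    """Return a letter grade using if, elif, else and a nested if."""
    if score >= 90:
        return "A"
    elif score >= 75:
        return "B"
    else:
        # Nested if: only evaluated when both conditions above are False
        if score >= 50:
            return "C"
        return "F"


for sample in (95, 80, 60, 10):
    print(sample, grade(sample))
```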
Loops
⚫ A loop statement allows us to execute a statement or group of
statements multiple times.
⚫ while loop: Repeats a statement or group of statements while a given
condition is True. It tests the condition before executing the loop body.
⚫ for loop: Executes a sequence of statements multiple times and
abbreviates the code that manages the loop variable.
⚫ nested loops: You can use one or more loops inside another while
or for loop.
⚫ The infinite loop: A loop becomes an infinite loop if its condition never
becomes False.
⚫ The range() function
The built-in function range() is the right function to iterate over a sequence of numbers.
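The loop forms listed above are sketched below with small, hypothetical values.

```python
# while loop: the condition is tested before each pass through the body
countdown = []
n = 3
while n > 0:
    countdown.append(n)
    n -= 1  # without this update the loop would never end

# for loop over range(): the loop variable is managed automatically
squares = []
for i in range(5):
    squares.append(i * i)

# nested loops: the inner loop runs fully for each outer iteration
pairs = []
for row in range(2):
    for col in range(3):
        pairs.append((row, col))

# range(start, stop, step): stop is excluded
evens = list(range(0, 10, 2))

print(countdown, squares, pairs, evens)
```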
Loop Control Statements
⚫ break statement: Terminates the loop statement
and transfers execution to the statement
immediately following the loop.
⚫ continue statement: Causes the loop to skip the
remainder of its body and immediately retest its
condition prior to reiterating.
⚫ pass statement: The pass statement in Python is
used when a statement is required syntactically
but you do not want any command or code to execute.
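Each control statement is shown in a small function below. The function names and inputs are hypothetical, chosen only to illustrate the behaviour.

```python
def first_negative(numbers):
    """Return the first negative number, or None if there is none."""
    found = None
    for n in numbers:
        if n < 0:
            found = n
            break  # leave the loop as soon as a match is found
    return found


def odd_values(numbers):
    """Collect only the odd numbers."""
    odds = []
    for n in numbers:
        if n % 2 == 0:
            continue  # skip the rest of the body for even numbers
        odds.append(n)
    return odds


def not_implemented_yet():
    pass  # placeholder: a statement is required, but nothing should run


print(first_negative([3, -1, -5]), odd_values([1, 2, 3, 4, 5]))
```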
Functions
⚫ A function is a block of organized, reusable code
that is used to perform a single, related action.
⚫ Functions provide better modularity for your
application and a high degree of code reuse.
⚫ Defining a Function
⚫ Calling a Function
Function Arguments
⚫Required arguments
⚫ Keyword arguments
⚫ Default arguments
⚫ Variable-length arguments
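The four kinds of arguments can be seen in a single function definition. The sketch below uses a hypothetical function `describe`; its parameter names and return format are illustrative assumptions.

```python
def describe(name, greeting="Hello", *args, **kwargs):
    """name is required, greeting has a default, args/kwargs are variable-length."""
    extras = ", ".join(str(arg) for arg in args)
    options = ", ".join(f"{key}={value}" for key, value in sorted(kwargs.items()))
    return f"{greeting}, {name}! extras=[{extras}] options=[{options}]"


# Required argument only; the default greeting is used
print(describe("Ada"))

# Keyword arguments: order does not matter when names are given
print(describe(greeting="Hi", name="Ada"))

# Variable-length positional and keyword arguments
print(describe("Ada", "Hi", 1, 2, lang="en"))
```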
File handling
Limitations of Traditional Machine Learning
● Feature Engineering is Manual
● Scalability Issues
● Poor Performance on Unstructured Data
● Inability to Handle High-Dimensional Data
● Lack of Hierarchical Representations
What are Deep Neural Networks (DNNs)?
● A type of Artificial Neural Network (ANN) with multiple
hidden layers.
● Designed to automatically learn features and
representations from data.
● Used in fields like computer vision, speech
recognition, and natural language processing (NLP).