Unit II
Machine Learning Algorithm
ARVIND SARDAR
REGRESSION ANALYSIS IN MACHINE LEARNING
Regression Analysis in Machine Learning
Regression is a supervised learning technique and a statistical method for determining the relationship between one dependent (target, or output) variable and one or more independent (predictor) variables.
Regression analysis helps us understand the relationship between the dependent and independent variables.
It is mainly used for predicting continuous values such as temperature, age, salary, and price, and for forecasting, time series modeling, and determining cause-and-effect relationships between variables.
In regression, we fit a line or curve that best fits the given data points; using this fit, the machine learning model can make predictions about the data.
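As a minimal sketch of fitting a best-fit line and predicting from it, the snippet below uses ordinary least squares on a single predictor. The data (years of experience versus salary in thousands) and the function names `fit_line` and `predict` are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict a continuous value from the fitted line."""
    return intercept + slope * x


# Hypothetical training data: years of experience -> salary (thousands).
years = [1, 2, 3, 4, 5]
salary = [30, 35, 40, 45, 50]

slope, intercept = fit_line(years, salary)
print(slope, intercept)               # 5.0 25.0
print(predict(6, slope, intercept))   # 55.0
```

The prediction is a continuous number, which is what distinguishes regression from the classification algorithms discussed next.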
Logistic Regression
What is the Classification Algorithm?
A classification algorithm is a supervised learning technique.
It is used to identify the category of new observations on the basis of training data.
In classification, a program learns from the given dataset or observations and then classifies each new observation into one of a number of classes or groups, such as Yes or No, 0 or 1, Spam or Not Spam, cat or dog. Classes are also called targets, labels, or categories.
Unlike regression, whose output is a continuous value, the output variable of classification is a category, such as "Green or Blue" or "fruit or animal".
In a classification algorithm, the input variable (x) is mapped to a discrete output (y):
y = f(x), where y = categorical output
A classic example of an ML classification algorithm is an email spam detector.
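To make the mapping y = f(x) concrete, here is a small sketch in which f is a one-nearest-neighbour rule: a new observation receives the label of the closest training example. The nearest-neighbour rule, the single feature (a hypothetical count of suspicious words per email), and the training values are all assumptions made for illustration, not part of the slides.

```python
def nearest_neighbour_label(training_data, x):
    """Return the label of the training example whose feature is closest to x."""
    closest = min(training_data, key=lambda pair: abs(pair[0] - x))
    return closest[1]


# Hypothetical training data: (suspicious-word count, label).
training = [
    (1.0, "not spam"),
    (2.0, "not spam"),
    (8.0, "spam"),
    (9.0, "spam"),
]

print(nearest_neighbour_label(training, 7.5))  # spam
print(nearest_neighbour_label(training, 1.2))  # not spam
```

The output is always one of the known categories, never an intermediate number.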
The main goal of a classification algorithm is to identify the category of a given observation; these algorithms are mainly used to predict categorical outputs.
An algorithm that implements classification on a dataset is known as a classifier.
There are two types of classifiers:
1. Binary Classifier:
If the classification problem has only two possible outcomes, it is called a binary classifier.
Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG, etc.
2. Multi-class Classifier:
If the classification problem has more than two outcomes, it is called a multi-class classifier.
Examples: classification of types of crops, classification of types of music.
ML Classification Algorithms
Types of ML Classification Algorithms:
Classification algorithms can be divided into two main categories:
1. Linear Models
A. Logistic Regression
B. Support Vector Machines
2. Non-linear Models
A. K-Nearest Neighbours
B. Kernel SVM
C. Naïve Bayes
D. Decision Tree Classification
E. Random Forest Classification
Logistic Regression
• Linear regression predicts numerical responses, but it is not suitable for predicting categorical data.
• Logistic regression is a supervised learning model.
• It is used to predict a categorical dependent variable from a given set of independent variables. It is most suitable for binary classification problems.
• The outcome must be a categorical or discrete value.
• The outcome can be Yes or No, 0 or 1, True or False, etc., but instead of giving an exact value of 0 or 1, the model gives a probability that lies between 0 and 1.
• Linear regression is used for solving regression problems, whereas logistic regression is used for solving classification problems.
• In logistic regression, instead of fitting a regression line, we fit an "S"-shaped logistic function whose output is bounded by the two limiting values 0 and 1.
• Logistic regression can provide probabilities and classify new data using both continuous and discrete data.
Applications
• Binary classification, such as Spam / Not Spam.
• Predicting, from entrance exam marks, whether a student will be admitted to a course.
• E-commerce companies can identify buyers who are likely to purchase a certain product.
• Companies can predict whether they will gain or lose money in the next month, quarter, or year.
• Estimating the probability of a heart attack.
Advantages:
• Logistic regression performs well when the data is linearly separable.
• It does not require many computational resources, and it is highly interpretable.
• It generally works without scaling the input features and needs little tuning.
• The size of each coefficient gives a measure of how relevant a predictor is, and its sign (+ve or -ve) gives the direction of association.
Logistic Regression Equation and Assumptions
Logistic regression uses a logistic function, called the sigmoid function, to map predictions to probabilities.
The output of logistic regression must lie between 0 and 1 and cannot go beyond these limits, so it forms an "S"-shaped curve. This S-shaped curve is called the sigmoid function or the logistic function; it converts any real value to a value between 0 and 1.
To classify an instance, we require a threshold value. If the output of the sigmoid function (the estimated probability) is greater than a predefined threshold, the model predicts that the instance belongs to the class. If the estimated probability is less than the threshold, the model predicts that the instance does not belong to the class.
For example, if the output of the sigmoid function is above 0.5, the output is classified as 1.
On the other hand, if the output is less than 0.5, the output is classified as 0.
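The threshold rule can be sketched as a one-line function. The name `classify` and the sample probabilities are hypothetical, and the treatment of a probability exactly equal to the threshold (assigned to class 0 here) is an assumption, since the text only covers the "greater than" and "less than" cases.

```python
def classify(probability, threshold=0.5):
    """Return 1 if the estimated probability exceeds the threshold, else 0.

    Assumption: a probability exactly equal to the threshold maps to 0.
    """
    return 1 if probability > threshold else 0


# Hypothetical estimated probabilities from a sigmoid output.
probabilities = [0.10, 0.49, 0.51, 0.88]

labels = [classify(p) for p in probabilities]
print(labels)  # [0, 0, 1, 1]

# A stricter threshold turns every example above into class 0.
strict_labels = [classify(p, threshold=0.9) for p in probabilities]
print(strict_labels)  # [0, 0, 0, 0]
```

Raising or lowering the threshold trades one kind of error for the other, which is why it is treated as a predefined choice rather than part of the fitted model.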
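• The sigmoid is the activation function for logistic regression and is defined as:
$$S(\text{value}) = \frac{1}{1 + e^{-\text{value}}}$$
where
• e = base of natural logarithms
• value = numerical value one wishes to transform
For logistic regression the transformed value is z, so the predicted probability is
$$Y = S(z) = \frac{1}{1 + e^{-z}}$$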
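A short sketch of the sigmoid function, S(value) = 1 / (1 + e^(-value)), in Python. The two-branch form below is an implementation choice made here so that very negative inputs do not overflow `math.exp`; it computes the same function.

```python
import math


def sigmoid(value):
    """Map any real number to the open interval (0, 1)."""
    if value >= 0:
        return 1.0 / (1.0 + math.exp(-value))
    # For negative inputs use the equivalent form e^v / (1 + e^v),
    # which avoids computing exp of a large positive number.
    exp_v = math.exp(value)
    return exp_v / (1.0 + exp_v)


print(sigmoid(0))              # 0.5
print(round(sigmoid(2), 4))    # 0.8808
print(round(sigmoid(-2), 4))   # 0.1192
```

The curve is symmetric about 0.5: sigmoid(-z) equals 1 - sigmoid(z), which the last two values illustrate.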
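Example
Calculate the probability of passing for a student who studied 33 hours, given $\beta_0 = -64$ (intercept) and $\beta_1 = 2$ (coefficient of X).

Hours of study   Pass (1) / Fail (0)
29               0
15               0
33               1
28               1
39               1

Y is the probability of passing:
$$Y = \frac{1}{1 + e^{-z}}, \qquad z = \beta_0 + \beta_1 X$$
X is 33 hours.
z = -64 + 2 * 33
z = -64 + 66
z = 2
$$Y = \frac{1}{1 + e^{-2}}$$
$e^{-2} \approx 0.13$ (calculator: https://www.omnicalculator.com/math/e-power-x)
$$Y = \frac{1}{1 + 0.13} \approx 0.88$$
If the student studies 33 hours, there is an 88% chance that the student will pass the exam.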
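The calculation above can be checked with a few lines of Python. The constant and function names (`INTERCEPT`, `COEFFICIENT`, `pass_probability`) are hypothetical; the values -64 and 2 are the ones given in the example.

```python
import math

INTERCEPT = -64.0    # constant term given in the example
COEFFICIENT = 2.0    # coefficient of hours studied given in the example


def pass_probability(hours):
    """Estimated probability of passing after studying `hours` hours."""
    z = INTERCEPT + COEFFICIENT * hours
    return 1.0 / (1.0 + math.exp(-z))


probability = pass_probability(33)
print(round(probability, 2))  # 0.88
```

Working with the unrounded value of e^(-2) gives 0.8808, which agrees with the hand calculation to two decimal places.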
Example
2. At least how many hours should a student study to pass the course with a probability of more than 95%?
$$0.95 = \frac{1}{1 + e^{-z}}$$
$$0.95 \, (1 + e^{-z}) = 1$$
$$0.95 + 0.95 \, e^{-z} = 1$$
$$0.95 \, e^{-z} = 1 - 0.95$$
$$e^{-z} = \frac{0.05}{0.95} = 0.0526$$
Taking the natural logarithm of both sides (calculator: https://www.omnicalculator.com/math/natural-log):
$$-z = \ln(0.0526) = -2.94$$
z = 2.94
z = -64 + 2 * hours
2.94 = -64 + 2 * hours
2.94 + 64 = 2 * hours
66.94 / 2 = hours
hours = 33.47
A student should study at least 33.47 hours to pass the course with a probability of more than 95%.
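The same steps can be sketched in code by inverting the sigmoid: z = ln(p / (1 - p)), then solving z = -64 + 2 * hours for hours. The function name `hours_for_probability` is hypothetical; the constants are the ones used in the example.

```python
import math

INTERCEPT = -64.0    # constant term given in the example
COEFFICIENT = 2.0    # coefficient of hours studied given in the example


def hours_for_probability(p):
    """Hours of study at which the estimated pass probability equals p (0 < p < 1)."""
    z = math.log(p / (1.0 - p))  # inverse of the sigmoid (the log-odds)
    return (z - INTERCEPT) / COEFFICIENT


hours = hours_for_probability(0.95)
print(round(hours, 2))  # 33.47
```

Because the sigmoid is increasing in z and the coefficient of hours is positive, any study time above this value gives a probability above 95%.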
Bayes Theorem
• Bayes’ theorem is named after Thomas Bayes, an English statistician, philosopher, and Presbyterian minister of the 18th century.
• Bayes’ ideas are widely used in decision theory and in probability, an important area of mathematics.
• In machine learning, Bayes’ theorem is used to predict the most probable class for an observation.
• Bayes’ theorem, through Bayesian methods, is used to calculate conditional probabilities in machine learning applications that include classification tasks.
• A simplified application of Bayes’ theorem, Naïve Bayes classification, is used to reduce computation time and the average cost of projects.
• Bayes’ theorem is also known as Bayes’ rule or Bayes’ law.
• Bayes’ theorem helps to determine the probability of an event under uncertain knowledge.
• It is used to calculate the probability of one event occurring given that another event has already occurred.
• It relates conditional probability to marginal probability.
• Bayes’ theorem can help produce more accurate results.
• Bayes’ theorem is used to estimate the precision of values and provides a method for calculating conditional probability.
• What is Bayes’ Theorem?
  • Bayes’ theorem provides a way to update probabilities based on new evidence or information.
  • It allows us to update our belief in a hypothesis based on new data.
  • Mathematically, Bayes’ theorem can be expressed as:
$$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$$
  • P(A|B) is the posterior probability of event A given event B.
  • P(B|A) is the likelihood of event B given event A.
  • P(A) is the prior probability of event A.
  • P(B) is the total probability of event B.
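As a worked illustration of the formula, suppose A is "a patient has a condition" and B is "a test is positive". The numbers are hypothetical and chosen only for the arithmetic: P(A) = 0.01, P(B|A) = 0.90, and P(B|not A) = 0.05.

```latex
\begin{align*}
P(B) &= P(B \mid A)\,P(A) + P(B \mid \lnot A)\,P(\lnot A) \\
     &= 0.90 \times 0.01 + 0.05 \times 0.99 = 0.0585 \\
P(A \mid B) &= \frac{P(B \mid A)\,P(A)}{P(B)} = \frac{0.009}{0.0585} \approx 0.154
\end{align*}
```

Even with a fairly reliable test, the posterior stays low here because the prior P(A) is small, which shows how the prior and the evidence combine.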
Key Terms Related to Bayes’ Theorem:
1. Likelihood (P(B|A)): Represents the probability of observing the given evidence (features) given that the
class is true. In the Naive Bayes algorithm, a key assumption is that features are conditionally independent
given the class label.
2. Prior Probability (P(A)): Represents the probability of a particular class before considering any features.
It is estimated from the training data.
3. Evidence Probability (P(B)): This is the probability of observing the given evidence (features). It serves as
a normalization factor and is often calculated as the sum of the joint probabilities over all possible classes.
4. Posterior Probability (P(A|B)): This is the updated probability of the class given the observed features. It is
what we are trying to predict or infer in a classification task.
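The four terms fit together as in the sketch below: each class's prior is multiplied by its likelihood to give a joint probability, the evidence is the sum of those joints over all classes, and dividing by the evidence gives the posteriors. The function name `posteriors` and the spam-filter numbers are hypothetical.

```python
def posteriors(priors, likelihoods):
    """Return (posterior per class, evidence) from per-class priors and likelihoods."""
    # Joint probability P(B|A) * P(A) for each class A.
    joint = {label: priors[label] * likelihoods[label] for label in priors}
    # Evidence P(B): sum of the joint probabilities over all classes.
    evidence = sum(joint.values())
    # Posterior P(A|B): normalize each joint probability by the evidence.
    return {label: joint[label] / evidence for label in joint}, evidence


# Hypothetical numbers: P(class) and P(word "free" appears | class).
priors = {"spam": 0.2, "not spam": 0.8}
likelihoods = {"spam": 0.6, "not spam": 0.05}

result, evidence = posteriors(priors, likelihoods)
print(round(evidence, 2))            # 0.16
print(round(result["spam"], 2))      # 0.75
print(round(result["not spam"], 2))  # 0.25
```

Because the evidence is the same for every class, a classifier that only needs the most probable class can skip the division and compare the joint probabilities directly.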
Applications of Bayes’ Theorem in Machine Learning:
1. Naive Bayes Classifier: The Naive Bayes classifier is a simple probabilistic classifier based on applying
Bayes’ theorem with a strong (naive) independence assumption between the features.
It is widely used for text classification, spam filtering, and other tasks involving high-dimensional data.
Despite its simplicity, the Naive Bayes classifier often performs well in practice and is computationally
efficient.
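A minimal sketch of a Naive Bayes text classifier follows, under two stated assumptions: each document is a list of words treated as conditionally independent given the class, and word likelihoods use add-one (Laplace) smoothing so that unseen words do not produce a zero probability. The smoothing choice, the function names `train` and `predict`, and the tiny training set are all hypothetical and are not taken from the slides.

```python
import math
from collections import Counter, defaultdict


def train(documents):
    """documents: list of (list_of_words, label) pairs."""
    class_counts = Counter(label for _, label in documents)
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocabulary = set()
    for words, label in documents:
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def predict(words, class_counts, word_counts, vocabulary):
    """Return the label with the highest log prior plus summed log likelihoods."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total_docs)  # log prior P(class)
        total_words = sum(word_counts[label].values())
        for word in words:
            # Add-one smoothing (an assumption of this sketch).
            p = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(p)  # independence: log likelihoods add up
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training set.
docs = [
    (["win", "money", "now"], "spam"),
    (["free", "money"], "spam"),
    (["meeting", "tomorrow"], "not spam"),
    (["project", "meeting", "notes"], "not spam"),
]
model = train(docs)

print(predict(["free", "money"], *model))     # spam
print(predict(["meeting", "notes"], *model))  # not spam
```

Scores are compared in log space, so the evidence term is never computed; it is the same for every class and does not change which class wins.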
