EXPECTATION-MAXIMIZATION ALGORITHM
Machine Learning
SIMPLER WAY TO UNDERSTAND
Imagine you have a puzzle where some pieces are missing.
The EM algorithm helps you complete the puzzle by
guessing what those missing pieces look like.
STEPS YOU WOULD FOLLOW
GUESS AND IMPROVE (EXPECTATION STEP)
First, you make a guess about what the missing puzzle
pieces might look like. This is like saying, "Hmm, I think the
missing pieces could be this color and shape." Your guess
doesn't have to be perfect; it's just a starting point.
MAKE IT BETTER (MAXIMIZATION STEP)
Then, you look at the pieces you have and the ones you
guessed. You figure out how to adjust your guess to make
it match the pieces you have as closely as possible. This
step is like tweaking your guess to fit the puzzle better.
REPEAT UNTIL DONE
You keep doing these two steps over and over, making
your guess better and better each time. It's like refining
your guess until the puzzle is complete.
The EM algorithm is like a smart helper that makes
educated guesses and keeps improving them until the
puzzle is solved. It's great for figuring out things when you
don't have all the information you need.
IN ACTUAL TERMS
The Expectation-Maximization (EM) algorithm is an
iterative statistical technique used for estimating
parameters of probabilistic models when some of the data
is missing or unobserved. EM is particularly useful in
situations where you have incomplete or partially observed
data, and you want to estimate the underlying hidden
variables or parameters of a statistical model.
Equivalently, EM can be viewed as an iterative optimization method for finding
maximum likelihood or maximum a posteriori (MAP) estimates of the parameters of
statistical models that involve unobserved latent variables.
The EM algorithm is commonly used for latent variable models and can handle
missing data. It consists of an expectation step (E-step) and a maximization
step (M-step), which form an iterative process for improving the model fit.
In the E step, the algorithm uses the current parameter estimates to compute the
posterior distribution of the latent variables and, from it, the expected
log-likelihood of the complete data.
In the M step, the algorithm finds the parameters that maximize the expected
log-likelihood obtained in the E step, and the model parameters are updated
accordingly.
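In symbols (this notation is ours, not from the source), with observed data X,
latent variables Z, and parameters \theta, the two steps at iteration t can be
written as:
Q(\theta \mid \theta^{(t)}) = \mathbb{E}_{Z \mid X,\, \theta^{(t)}}\big[\log p(X, Z \mid \theta)\big] \quad \text{(E-step)}
\theta^{(t+1)} = \arg\max_{\theta} \, Q(\theta \mid \theta^{(t)}) \quad \text{(M-step)}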
By iteratively repeating these steps, the EM algorithm seeks
to maximize the likelihood of the observed data. It is
commonly used for unsupervised learning tasks, such as
clustering, where latent variables are inferred, and has
applications in various fields, including machine learning,
computer vision, and natural language processing.
Source: GeeksforGeeks
PSEUDOCODE
function ExpectationMaximization(data, initial_parameters, convergence_threshold, max_iterations):
    parameters = initial_parameters
    previous_parameters = initial_parameters  # needed before the first convergence check
    iteration = 0
    converged = false
    while (iteration < max_iterations and not converged):
        # E-Step: Calculate expected values of hidden data
        expected_values = EStep(data, parameters)
        # M-Step: Update parameter estimates based on expected values
        parameters = MStep(data, expected_values)
        # Check for convergence based on parameter change
        converged = CheckConvergence(parameters, previous_parameters, convergence_threshold)
        previous_parameters = parameters  # Save parameters for the next iteration
        iteration = iteration + 1
    return parameters  # Final estimated parameters

function EStep(data, parameters):
    # Calculate expected values (responsibilities) of the hidden data
    # based on the current parameter estimates and the observed data.
    # Return the expected values.

function MStep(data, expected_values):
    # Update parameter estimates to maximize the expected log-likelihood
    # of the complete data (observed and hidden).
    # Return the updated parameter estimates.

function CheckConvergence(parameters, previous_parameters, threshold):
    # Calculate a measure of how much the parameters have changed from the
    # previous iteration (e.g., Euclidean distance or change in log-likelihood).
    # Check if the change is smaller than the convergence threshold.
    # Return true if converged, false otherwise.

# Example Usage
data = ...                    # Your observed data
initial_parameters = ...      # Initial parameter values
convergence_threshold = ...   # Convergence threshold for parameter change
max_iterations = ...          # Maximum number of iterations
estimated_parameters = ExpectationMaximization(data, initial_parameters, convergence_threshold, max_iterations)
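To make the pseudocode concrete, here is a minimal runnable sketch in Python/NumPy
for a two-component 1D Gaussian mixture. The synthetic data, the choice of two
components, and all names (em_gmm_1d and its variables) are illustrative
assumptions, not part of the source.

    # Minimal EM sketch for a two-component 1D Gaussian mixture (assumed example).
    import numpy as np

    def em_gmm_1d(x, n_iter=100, tol=1e-6, seed=0):
        rng = np.random.default_rng(seed)
        # Initial guesses: equal weights, two random data points as means, global variance
        w = np.array([0.5, 0.5])
        mu = rng.choice(x, size=2, replace=False).astype(float)
        var = np.array([x.var(), x.var()])
        prev_ll = -np.inf
        for _ in range(n_iter):
            # E-step: responsibilities r[i, k] = P(component k | x_i, current parameters)
            pdf = (1.0 / np.sqrt(2 * np.pi * var)) * np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
            weighted = w * pdf
            r = weighted / weighted.sum(axis=1, keepdims=True)
            # M-step: re-estimate weights, means, and variances from the responsibilities
            Nk = r.sum(axis=0)
            w = Nk / len(x)
            mu = (r * x[:, None]).sum(axis=0) / Nk
            var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
            # Convergence check on the observed-data log-likelihood
            ll = np.log(weighted.sum(axis=1)).sum()
            if abs(ll - prev_ll) < tol:
                break
            prev_ll = ll
        return w, mu, var

    # Example usage with synthetic data drawn from two known Gaussians
    rng = np.random.default_rng(42)
    x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])
    print(em_gmm_1d(x))

Note that this sketch checks convergence via the change in the observed-data
log-likelihood rather than a parameter distance, which is a common practical choice.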
PROBLEM
Imagine you have a bag of colorful candies, but you don't
know how many of each color are in the bag. You want to
figure this out by using the EM algorithm.
STEP-1 (E STEP)
1. Close your eyes and take out one candy from the bag without looking.
2. Now, you ask your friend to guess the color of the candy.
3. Your friend makes a guess based on their knowledge of candies, but they're not
   entirely sure because they can't see the candy either. So, they give you their
   best guess along with how confident they are in their guess.
STEP-2 (M STEP)
1. You collect all the guesses and confidence levels from your friend for the
   candies you've taken out so far.
2. You count how many times each color was guessed and use the confidence levels
   to estimate the number of candies of each color in the bag.
3. You adjust your guess of how many candies of each color are in the bag based
   on this new information.
STEP-3 (REPEAT)
Keep repeating these two steps. Each time you do it, your
guess about the candies' colors and amounts gets better
and better. After doing this many times, you'll have a very
good idea of how many candies of each color are in the
bag.
LET’S MAKE IT MATHEMATICAL
Suppose you have a bag with red (R), green (G), and blue (B) candies. You take
out one candy at a time and record your friend's guesses. After several candies,
you have these guesses:
For the first candy: 80% chance it's Red, 10% Green, 10% Blue
For the second candy: 30% Red, 60% Green, 10% Blue
For the third candy: 20% Red, 10% Green, 70% Blue
Now, in the M-step, you average the guesses for each color and update your estimates:
Red: (0.80 + 0.30 + 0.20) / 3 ≈ 0.43
Green: (0.10 + 0.60 + 0.10) / 3 ≈ 0.27
Blue: (0.10 + 0.10 + 0.70) / 3 = 0.30
So, based on these new estimates, you think there are approximately 43% Red
candies, 27% Green candies, and 30% Blue candies in the bag.
You repeat this process many times until your estimates
become very accurate, and you have a good idea of the
candy distribution in the bag. That's how the EM algorithm
works to solve problems like this one!
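A few lines of Python reproduce the averaging above (the three responsibility
vectors are taken from the example; the variable names are ours):

    # M-step for the candy example: average the responsibilities per color.
    guesses = [
        [0.80, 0.10, 0.10],  # candy 1: P(Red), P(Green), P(Blue)
        [0.30, 0.60, 0.10],  # candy 2
        [0.20, 0.10, 0.70],  # candy 3
    ]
    n = len(guesses)
    red   = sum(g[0] for g in guesses) / n  # 0.433...
    green = sum(g[1] for g in guesses) / n  # 0.266...
    blue  = sum(g[2] for g in guesses) / n  # 0.30
    print(round(red, 2), round(green, 2), round(blue, 2))  # 0.43 0.27 0.3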
ADVANTAGES
1. Handles data with missing values effectively.
2. Useful for unsupervised learning tasks like clustering.
3. Robust to noisy data.
4. Adaptable to various probabilistic models.
5. Can be applied to large datasets.
6. Estimates model parameters in mixture distributions.
7. Guarantees convergence to a local maximum.
8. Well-founded in statistical theory.
9. Not very sensitive to initial parameter values.
10. Versatile for various machine learning applications.
DISADVANTAGES
1. Sensitive to initial parameter guesses.
2. Slow convergence for high-dimensional data.
3. Limited scalability for very large datasets.
4. Assumes data is generated from a specific model.
5. Convergence is not guaranteed for all cases.
6. Can be computationally intensive for some problems.
THANK YOU
