GMMs
GMMs work on the idea that the data is generated from a certain number of Gaussian distributions, and each of these distributions forms a cluster. Suppose there are 3 Gaussian distributions, each with its own mean and variance. The probability of every data point fitting each of these distributions is then calculated. GMMs are probabilistic models and use a soft-clustering technique to assign data points to clusters; this is one fundamental difference between hard-clustering methods such as k-means and GMMs. Hard-clustering techniques are definite about assigning a data point to a single cluster, whereas in soft clustering a data point may be shared among different clusters, to an extent decided by the probability of the data point belonging to each cluster.
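The contrast between hard and soft assignments can be seen in a small sketch, here using scikit-learn (an assumption for illustration; the original post does not name a library):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two overlapping 2D blobs of 100 points each
X = np.vstack([
    rng.normal(loc=[0, 0], scale=1.0, size=(100, 2)),
    rng.normal(loc=[3, 3], scale=1.0, size=(100, 2)),
])

# Hard clustering: each point gets exactly one label
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Soft clustering: each point gets a probability for every cluster
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
probs = gmm.predict_proba(X)  # shape (200, 2); each row sums to 1

print(kmeans_labels[:3])   # definite assignments, e.g. all one label
print(probs[:3].round(3))  # fractional memberships across both clusters
```

Points near the boundary of the two blobs get responsibilities close to 0.5 for each component, which a hard assignment cannot express.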

Let’s take a step back and understand what Gaussian distributions are. A Gaussian distribution is a bell curve spread symmetrically around its mean value.
The spread of the curve depends on the variance of the Gaussian distribution: the larger the variance, the more spread out the curve.
For a single-variable distribution (a curve drawn in 2D space) with mean µ and variance σ², the probability density function is given by

f(x) = (1 / (σ√(2π))) exp(−(x − µ)² / (2σ²))
For a 3D distribution, that is, a surface over 2 variables, the probability density is given by

f(x) = (1 / (2π |∑|^(1/2))) exp(−(1/2) (x − µ)ᵀ ∑⁻¹ (x − µ))
Where,
x = input vector
µ = 2D mean vector
∑ = 2 x 2 covariance matrix
This is the equation for a 2-variable Gaussian distribution. It can further be generalized to an n-dimensional distribution, where x and µ become n-length vectors, ∑ becomes an n x n covariance matrix, and the normalizing constant becomes (2π)^(n/2) |∑|^(1/2).
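The density formula above can be checked numerically by writing it out term by term and comparing against SciPy's implementation (a sketch assuming NumPy/SciPy; the mean, covariance, and input values are made up for illustration):

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, 2.0])            # 2D mean vector
sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])       # 2 x 2 covariance matrix
x = np.array([0.5, 1.5])             # input vector

# Density written out term by term from the formula
d = len(mu)
diff = x - mu
norm_const = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma))
manual_pdf = norm_const * np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff)

# SciPy's implementation of the same density
scipy_pdf = multivariate_normal(mean=mu, cov=sigma).pdf(x)
print(np.isclose(manual_pdf, scipy_pdf))  # True
```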
Expectation Maximisation
Expectation Maximisation (EM) is a technique for determining model parameters and is best used when the data has missing values. Setting the right model parameters is a tough task when some variables are unobserved, so EM first estimates optimum values for the missing variables from the available data and then finds the right model parameters. It’s a 2-step algorithm -
- Estimation step (E-step) - Estimate the missing values using the available data.
- Maximisation step (M-step) - Update the model parameters using the completed data.
Say we want K clusters, that is, K Gaussian distributions. The parameters to estimate are the mean, the covariance, and an additional mixing-weight parameter (π) giving the density of each distribution. Initially, we assign these values either randomly or using K-means or hierarchical clustering, and then iterate, starting with the E-step -
- E-step - For every point x, calculate the probability (responsibility) of it belonging to each cluster k:

γ_k(x) = π_k N(x | µ_k, ∑_k) / Σ_j π_j N(x | µ_j, ∑_j)
- M-step - The density (mixing weight) of cluster k is updated as

π_k = (1/N) Σ_n γ_k(x_n)

where N is the total number of data points.
The mean and covariance matrix are calculated and updated as per the following formulas -

µ_k = Σ_n γ_k(x_n) x_n / Σ_n γ_k(x_n)

∑_k = Σ_n γ_k(x_n) (x_n − µ_k)(x_n − µ_k)ᵀ / Σ_n γ_k(x_n)
It’s an iterative process: the E-step and M-step are repeated until the log-likelihood function stops improving.
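The E-step and M-step above can be sketched from scratch for a simple 1D, two-component mixture (NumPy only; the variable names and data are our own, not from the post):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data drawn from two 1D Gaussians with means -2 and 3
X = np.concatenate([rng.normal(-2, 1, 150), rng.normal(3, 1, 150)])

def gauss_pdf(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

# Initial guesses for the mean, variance, and mixing weight pi
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])

for _ in range(50):
    # E-step: responsibility of each component for each point
    weighted = pi * gauss_pdf(X[:, None], mu, var)       # shape (300, 2)
    resp = weighted / weighted.sum(axis=1, keepdims=True)

    # M-step: update pi, mean, and variance from the responsibilities
    Nk = resp.sum(axis=0)
    pi = Nk / len(X)
    mu = (resp * X[:, None]).sum(axis=0) / Nk
    var = (resp * (X[:, None] - mu) ** 2).sum(axis=0) / Nk

print(np.sort(mu))  # recovered means, close to the true -2 and 3
```

Each iteration increases the log-likelihood of the data, and the estimated means converge to the means of the two generating Gaussians.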
Applications of GMMs
GMMs have a variety of real-world applications. Some of them are listed below.
- Used for signal processing
- Used for customer churn analysis
- Used for language identification
- Used in the video game industry
- Genre classification of songs
FAQs
- How do GMMs differ from K-means clustering?
GMMs and K-means are both clustering algorithms used for unsupervised learning tasks. However, the basic difference between the two is that k-means is a distance-based clustering method while GMMs are a distribution-based clustering method.
- Briefly explain the EM algorithm.
It is a statistical algorithm used to find the right parameters when there are missing values in the data. The missing variables are called latent variables. Latent variables are first estimated using the available values, and then the parameters are updated with the completed data. It’s an iterative 2-step process: an E-step (estimation of latent variables) followed by an M-step (updating the model parameters).
Key Takeaways
The blog briefly explains a very powerful clustering algorithm- Gaussian Mixture Models and highlights its salient features by contrasting it with other clustering algorithms. However, we recommend trying it out yourself to better understand the nitty-gritty of this clustering method. You may check out our industry-oriented courses on machine learning to give yourself that promising start to your machine learning journey.
Happy Learning!!