Gaussian Mixture Models (GMMs)
GMMs work on the idea that the data is generated by a certain number of Gaussian distributions, each of which forms a cluster. Suppose there are 3 Gaussian distributions, each with its own mean and variance. The probability of every data point belonging to each distribution is then calculated. GMMs are probabilistic models and use a soft clustering technique to assign data points to clusters. This is one fundamental difference between hard K-means clustering and GMMs. Hard clustering techniques are definite about assigning a data point to a single cluster, whereas in soft clustering a data point may be shared among different clusters, to an extent decided by the probability of that point belonging to each cluster.
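The hard-vs-soft distinction can be seen directly with scikit-learn: K-means returns a single label per point, while a GMM returns a probability per cluster. A minimal sketch (the two-blob toy data here is illustrative, not from the text):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

# Illustrative toy data: two overlapping 2D blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])

# Hard clustering: K-means assigns each point to exactly one cluster
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Soft clustering: a GMM returns a membership probability per cluster
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
probs = gmm.predict_proba(X)   # shape (200, 2); each row sums to 1
print(probs[0])
```

Points deep inside one blob get a probability near 1 for that component, while points in the overlap region get split membership.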
Let's take a step back and understand what Gaussian distributions are. A Gaussian distribution is a bell-shaped curve spread symmetrically around its mean value.
The spread of the curve depends on the variance of the Gaussian distribution: the larger the variance, the more spread out the curve is.
For the above distribution, depicted in a 2D space (one variable plotted against its density), the probability density function is given by

$$ f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) $$
For a 3D distribution, that is, a surface constituting 2 variables, the probability density is given by

$$ f(\mathbf{x} \mid \boldsymbol{\mu}, \Sigma) = \frac{1}{2\pi\,|\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\top}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right) $$
where
- x = input vector
- µ = 2D mean vector
- Σ = 2 × 2 covariance matrix

This is the equation for a 2-variable Gaussian distribution. It can further be generalized to an n-dimensional distribution, where x and µ become n-length vectors and Σ an n × n covariance matrix.
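As a sketch, the general n-dimensional density above can be written directly in NumPy (the function name `gaussian_pdf` is our own, not a library API):

```python
import numpy as np

def gaussian_pdf(x, mu, cov):
    """Density of an n-dimensional Gaussian at point x (the formula above)."""
    n = mu.shape[0]
    diff = x - mu
    norm = np.sqrt((2 * np.pi) ** n * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff) / norm

mu = np.zeros(2)
cov = np.eye(2)
# Density at the mean of a standard 2D Gaussian is 1/(2*pi)
density = gaussian_pdf(np.zeros(2), mu, cov)
print(density)
```

For production use, `scipy.stats.multivariate_normal` implements the same density more robustly.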
Expectation Maximisation
Expectation Maximisation (EM) is a technique for determining model parameters and is best used when there are missing data values. It is a tough task to set the right model parameters when variables are missing. Hence, EM first estimates optimum values for the missing variables using the available data and then finds the right model parameters. It is a 2-step algorithm:
- Estimation step (E-step): estimate the missing values using the available data.
- Maximisation step (M-step): decide the model parameters using the estimated values.
Say we want to assign K clusters, that is, K Gaussian distributions. The appropriate parameters are the mean, the covariance, and an additional density-of-distribution parameter (π). Initially, we assign these values randomly or initialize them using K-means or hierarchical clustering, followed by the E-step.

E-step: For every point, calculate the probability (responsibility) of it belonging to each cluster:

$$ \gamma(z_{nk}) = \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)} $$
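A minimal NumPy/SciPy sketch of this E-step, assuming made-up current parameters for K = 2 components in 2D:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical current parameters (pi, mu, Sigma) for K = 2 components
pis  = np.array([0.5, 0.5])
mus  = [np.zeros(2), np.array([3.0, 3.0])]
covs = [np.eye(2), np.eye(2)]
X = np.array([[0.1, -0.2], [2.9, 3.1]])   # two illustrative points

# E-step: weighted density of each component, then normalize per point
dens = np.column_stack([p * multivariate_normal(m, c).pdf(X)
                        for p, m, c in zip(pis, mus, covs)])
resp = dens / dens.sum(axis=1, keepdims=True)
print(resp.round(3))
```

Each row of `resp` holds the responsibilities γ for one point and sums to 1; the point near (0, 0) is claimed almost entirely by the first component.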

M-step: The density parameter for each cluster is given by

$$ \pi_k = \frac{N_k}{N}, \quad \text{where } N_k = \sum_{n=1}^{N} \gamma(z_{nk}) $$
The mean and covariance matrix are calculated and updated as per the following formulae:

$$ \mu_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) \, x_n $$

$$ \Sigma_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) \, (x_n - \mu_k)(x_n - \mu_k)^{\top} $$
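The M-step updates can be sketched as follows, assuming hypothetical responsibilities carried over from a previous E-step:

```python
import numpy as np

# Illustrative data: N = 4 points, and hypothetical responsibilities
# for K = 2 components (as if produced by an earlier E-step)
X = np.array([[0.0, 0.0], [0.2, -0.1], [3.0, 3.0], [2.8, 3.1]])
resp = np.array([[0.99, 0.01], [0.98, 0.02], [0.02, 0.98], [0.01, 0.99]])

Nk = resp.sum(axis=0)              # effective number of points per cluster
pis = Nk / X.shape[0]              # updated density parameters (pi_k)
mus = (resp.T @ X) / Nk[:, None]   # updated means (mu_k)
covs = []
for k in range(2):                 # updated covariance matrices (Sigma_k)
    diff = X - mus[k]
    covs.append((resp[:, k, None] * diff).T @ diff / Nk[k])
print(pis, mus.round(3))
```

Each update is a responsibility-weighted average, so points a cluster barely "owns" contribute little to its parameters.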
It is an iterative process: the E-step and M-step are repeated to maximise the log-likelihood function until it converges.
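In practice, scikit-learn's `GaussianMixture` runs this EM loop internally; its `lower_bound_` attribute exposes the final average log-likelihood. A sketch on synthetic data:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic, well-separated data: two Gaussian blobs in 2D
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (150, 2)), rng.normal(4, 1, (150, 2))])

# fit() alternates E- and M-steps until the log-likelihood converges
gmm = GaussianMixture(n_components=2, max_iter=200, random_state=0).fit(X)
print(gmm.converged_, round(gmm.lower_bound_, 3))
```

After fitting, `gmm.means_`, `gmm.covariances_`, and `gmm.weights_` hold the learned µ, Σ, and π for each component.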
Applications of GMMs
GMMs have a variety of real-world applications. Some of them are listed below.
- Signal processing
- Customer churn analysis
- Language identification
- The video game industry
- Genre classification of songs
FAQs

How do GMMs differ from K-means clustering?
GMMs and K-means are both clustering algorithms used for unsupervised learning tasks. However, the basic difference between the two is that K-means is a distance-based clustering method, while GMM is a distribution-based clustering method.

Briefly explain the EM algorithm.
It is a statistical algorithm used to find the right model parameters when there are missing values in the data. The missing variables are called latent variables. The latent variables are first estimated using the available values, and then the parameters are updated using the completed data. It is an iterative 2-step process: the E-step (estimation of latent variables) followed by the M-step (updating the model parameters).
Key Takeaways
The blog briefly explains a very powerful clustering algorithm, Gaussian Mixture Models, and highlights its salient features by contrasting it with other clustering algorithms. However, we recommend trying it out yourself to better understand the nitty-gritty of this clustering method. You may check out our industry-oriented courses on machine learning to give yourself a promising start to your machine learning journey.
Happy Learning!!