Table of contents
Hidden Markov Model
Assumptions of HMM
HMM as a Finite State Machine
Explanation of HMM with the help of an example
Applications of HMM
Hidden Markov Models in Natural Language Processing
Frequently Asked Questions
Key Takeaways
Last Updated: Mar 27, 2024

Hidden Markov Model

Author Mayank Goyal


The Hidden Markov Model (HMM) is a statistical model used in machine learning. It describes the evolution of observable events that depend on internal factors which are not directly observable. HMMs are probabilistic graphical models that let us infer a sequence of unknown (hidden) variables from a sequence of observed variables. In this article, we'll go over Hidden Markov Models in depth, including the situations in which they can be employed and their various applications.

Hidden Markov Model

The Hidden Markov Model is a probabilistic model used to explain or infer the probabilistic characteristics of a random process. It states that an observed event is attributed to a set of probability distributions rather than to a directly visible step-by-step state. The system being modeled is assumed to be a Markov chain whose states are hidden. The observations form a second process that depends on this underlying Markov chain.

The primary purpose of an HMM is to learn about a Markov chain's hidden states by observing the outputs they produce. For a hidden Markov process X and an observable process Y, the HMM requires that the probability distribution of Y at each time step depends only on the state of X at that time step, and not on X's earlier history.


Assumptions of HMM

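An HMM is based on several assumptions, the most important of which are as follows:

Markov assumption: The current hidden state depends only on the previous hidden state, not on any earlier states.

Output independence assumption: Given the current hidden state, the output observation is conditionally independent of all other hidden states and observations.

Emission Probability Matrix: The probability that the hidden state Sj generates the output Vi at a given time. Under the output independence assumption, this matrix fully describes how observations are produced.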
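In symbols, these ideas can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the article: q_t is the hidden state and o_t the observation at time t, with T time steps in total. The first line is the first-order Markov assumption that underlies any HMM, the second is output independence, and the third defines one entry of the emission probability matrix.

```latex
\begin{align}
P(q_t \mid q_1, \dots, q_{t-1}) &= P(q_t \mid q_{t-1})
  && \text{(Markov assumption)} \\
P(o_t \mid q_1, \dots, q_T,\; o_1, \dots, o_{t-1}, o_{t+1}, \dots, o_T) &= P(o_t \mid q_t)
  && \text{(output independence)} \\
b_j(i) &= P(o_t = V_i \mid q_t = S_j)
  && \text{(emission probability)}
\end{align}
```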

HMM as a Finite State Machine

Consider the example below, which shows how a person feels in different weather conditions.


Set of observable states (S) = {Happy, Grumpy}

Set of hidden states (Q) = {Sunny, Rainy}

State series over time = z ∈ S^T

Observed states for four days = {z1 = Happy, z2 = Grumpy, z3 = Grumpy, z4 = Happy}

Since you are observing the person, the feelings they express are referred to as observations.

The weather that affects the person's mood is a hidden state because you do not observe it directly.

Emission Probabilities

In the case above, only the feelings (Happy or Grumpy) can be observed. When the weather on the day of observation is Sunny, a person has an 80% chance of being happy. Similarly, when it is Rainy, a person has a 60% chance of being grumpy. Because these 80% and 60% values link a hidden state to an observation, they are called emission probabilities.

Transition Probabilities

When we consider the weather (hidden states) that influences the observations, there are correlations between consecutive days. A sunny day is followed by another sunny day 80 percent of the time, while a rainy day is followed by another rainy day 60 percent of the time. The probabilities that describe moving from one hidden state to the next are called transition probabilities.
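As a rough sketch, the mood example can be written down in Python and used to score the four-day observation sequence with the forward algorithm, which sums over every possible hidden weather sequence. The complementary values (for example, a 20% chance of being grumpy on a sunny day) follow from each row summing to 1. Two things here are assumptions rather than part of the example: the 80% and 60% transition figures are read as the chance that the weather stays the same from one day to the next, and the start distribution is taken to be uniform because the example does not give one.

```python
# Hidden states from the mood example.
states = ["Sunny", "Rainy"]

# ASSUMPTION: the example gives no start distribution, so use a uniform one.
start = {"Sunny": 0.5, "Rainy": 0.5}

# ASSUMPTION: 80% / 60% are read as the chance the weather stays the same.
transition = {
    "Sunny": {"Sunny": 0.8, "Rainy": 0.2},
    "Rainy": {"Sunny": 0.4, "Rainy": 0.6},
}

# Emission probabilities from the example; complements make each row sum to 1.
emission = {
    "Sunny": {"Happy": 0.8, "Grumpy": 0.2},
    "Rainy": {"Happy": 0.4, "Grumpy": 0.6},
}


def forward(observations):
    """Return P(observations), summed over all hidden weather sequences."""
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: start[s] * emission[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emission[s][obs] * sum(alpha[p] * transition[p][s] for p in states)
            for s in states
        }
    return sum(alpha.values())


moods = ["Happy", "Grumpy", "Grumpy", "Happy"]
likelihood = forward(moods)
print(f"P({moods}) = {likelihood:.4f}")
```

The forward pass does work proportional to the sequence length times the square of the number of states, instead of enumerating all 2^4 weather sequences one by one.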

Explanation of HMM with the help of an example

We can use the example of two friends, Rahul and Ashok, to explain it further. Rahul chooses his daily activity according to the weather. His three main activities are going jogging, going to the office, and cleaning his house. What Rahul does on a given day is determined by the weather, and he tells Ashok what he did. While Ashok has no direct information about the weather, he can infer it from Rahul's activities.

Ashok believes that the weather behaves like a discrete Markov chain with only two states: rainy or sunny. Ashok cannot observe the weather directly, so these states are hidden from him. Each day, there is a certain probability that Rahul will do one of the following activities: "jog," "work," or "clean," depending on the weather. These are the observations, since Rahul informs Ashok of his actions. The whole system is a hidden Markov model (HMM).

We may say that Ashok knows the parameters of the HMM because he knows the general weather trends and what Rahul likes to do on average.

Take, for example, a day when Rahul called Ashok to say that he had cleaned his house. Since Rahul cleans far more often on rainy days, Ashok will believe that it was more likely a rainy day. Ashok's belief about the weather on the first day, before he hears anything from Rahul, is the start probability of the HMM.

The following are the states and observations:

States(S): ('Rainy', 'Sunny')

Observations: ('jog', 'work', 'clean')

And here's the starting probability:

start_probability = {'Rainy': 0.6, 'Sunny': 0.4}

Since this start probability favors rainy days, Ashok assumes that rain is more likely in general. The odds for the next day's weather, given today's weather, are as follows:

transition_probability = {
   'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
   'Sunny': {'Rainy': 0.4, 'Sunny': 0.6},
}

From the above, we can see that the probabilities of the weather changing from one day to the next are the transition probabilities. The probabilities of the activity Rahul performs on a given day, given that day's weather, are the emission probabilities:

emission_probability = {
   'Rainy': {'jog': 0.1, 'work': 0.4, 'clean': 0.5},
   'Sunny': {'jog': 0.6, 'work': 0.3, 'clean': 0.1},
}

These values are the emission probabilities. By combining them with the start probability, Ashok can infer the weather from what Rahul tells him, and by adding the transition probabilities he can also predict Rahul's likely activity for the next day.
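To make this concrete, here is a minimal Python sketch of the two steps for a single phone call: Ashok updates his belief about today's weather after hearing "clean" (Bayes' rule), then pushes that belief through the transition probabilities to predict tomorrow's activity. The parameter values are the ones from the example; the function names `update_belief` and `predict_next_activity` are made up for this illustration.

```python
states = ("Rainy", "Sunny")
activities = ("jog", "work", "clean")

start_probability = {"Rainy": 0.6, "Sunny": 0.4}
transition_probability = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}
emission_probability = {
    "Rainy": {"jog": 0.1, "work": 0.4, "clean": 0.5},
    "Sunny": {"jog": 0.6, "work": 0.3, "clean": 0.1},
}


def update_belief(prior, activity):
    """Bayes' rule: P(weather | activity) is proportional to prior * emission."""
    unnormalised = {s: prior[s] * emission_probability[s][activity] for s in states}
    total = sum(unnormalised.values())
    return {s: p / total for s, p in unnormalised.items()}


def predict_next_activity(belief):
    """Distribution over tomorrow's activity, given today's weather belief."""
    # Step 1: tomorrow's weather, via the transition probabilities.
    next_weather = {
        s: sum(belief[p] * transition_probability[p][s] for p in states)
        for s in states
    }
    # Step 2: tomorrow's activity, via the emission probabilities.
    return {
        a: sum(next_weather[s] * emission_probability[s][a] for s in states)
        for a in activities
    }


belief = update_belief(start_probability, "clean")
print(belief)  # Rainy is about 0.88, Sunny about 0.12
print(predict_next_activity(belief))
```

Hearing "clean" raises the probability of rain from the prior 0.6 to roughly 0.88, which matches Ashok's intuition that a cleaning day is probably a rainy one.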

The HMM method for calculating probabilities is depicted in the graphic below.


So, based on the preceding intuition and example, we can see how this probabilistic model can be used to create a forecast. Now let's look at some of the applications that it can be used in.

Applications of HMM

Using the Hidden Markov Model

An application that uses an HMM aims to recover a data sequence that cannot be observed directly, where each item in the sequence depends on the earlier ones. Based on this intuition, the HMM can be used in the following applications:

  • Computational finance
  • Speech recognition
  • Speech synthesis
  • Part-of-speech tagging
  • Document separation in scanning solutions
  • Machine translation
  • Handwriting recognition
  • Time series analysis
  • Activity recognition
  • Sequence classification
  • Transportation forecasting

Hidden Markov Models in Natural Language Processing

We can see from the examples above that the HMM may be employed in applications with sequential data such as time-series data, audio and video data, and text (NLP) data. Our primary focus in this article is on NLP applications where the HMM can improve a model's performance, and the list above shows that one such application is Part-of-Speech tagging. There, the part-of-speech tags are the hidden states and the words of the sentence are the observations.
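For Part-of-Speech tagging, the transition and emission probabilities are commonly estimated by counting in a tagged corpus. The sketch below shows that counting step in Python. The three-sentence corpus, the tag names, and the `<START>` marker are all hypothetical, invented only to keep the example small; a real tagger would use a large annotated corpus and some form of smoothing for unseen words.

```python
from collections import Counter, defaultdict

# HYPOTHETICAL toy corpus of (word, tag) pairs, invented for illustration only.
tagged_sentences = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]

transition_counts = defaultdict(Counter)  # previous tag -> next tag counts
emission_counts = defaultdict(Counter)    # tag -> word counts

for sentence in tagged_sentences:
    previous_tag = "<START>"  # marker for the beginning of a sentence
    for word, tag in sentence:
        transition_counts[previous_tag][tag] += 1
        emission_counts[tag][word] += 1
        previous_tag = tag


def normalise(counts):
    """Turn nested counts into conditional probabilities (rows sum to 1)."""
    return {
        given: {k: n / sum(row.values()) for k, n in row.items()}
        for given, row in counts.items()
    }


transition_probability = normalise(transition_counts)
emission_probability = normalise(emission_counts)

print(transition_probability["DET"])  # which tags follow a determiner
print(emission_probability["NOUN"])   # which words a NOUN emits
```

With these two tables in hand, tagging a new sentence becomes a decoding problem: find the tag sequence that best explains the observed words.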

Frequently Asked Questions

1. What is the Hidden Markov model mainly used for?
It is mainly used for speech recognition, although it is also used for classification tasks. Working with an HMM involves three problems: evaluation (computing the likelihood of an observation sequence), decoding (finding the most likely hidden state sequence), and learning (estimating the model's parameters).

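2. Which NLP problem can the Hidden Markov model solve?
In NLP, the hidden Markov model is used for sequence labeling problems such as Part-of-Speech tagging. More generally, it is a tool for probabilistic temporal reasoning, and it relies on both a transition model and a sensor (emission) model.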

3. What is the significance of hidden Markov chains?
An HMM is a form of Markov chain. Its state cannot be directly observed, but the series of observation vectors can be used to infer it. HMMs have been used for speech recognition, character recognition, and mobile communication techniques since the 1980s.

4. What is Markov chain theory, and how does it work?
Markov chain theory predicts that, starting from an arbitrary initial state, a chain that is run for long enough will have its state distribution converge to an equilibrium (stationary) distribution. This holds when the chain is irreducible and aperiodic.

5. What's the difference between a hidden Markov model and a Markov model?
A Markov model is a state machine whose state transitions are probabilistic and whose states are directly visible. In a hidden Markov model, you cannot see the states, but you can see the outcomes (observations) they produce.

6. What is the Viterbi algorithm and how does it work?
The Viterbi algorithm is a dynamic programming approach for determining the most likely sequence of hidden states, called the Viterbi path, that results in a sequence of observed events, especially in the context of Markov information sources and hidden Markov models (HMM).
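A minimal Python sketch of the Viterbi algorithm, applied to the Rahul and Ashok weather example from earlier in the article. The probabilities are those given in the example; the three-day activity sequence ('jog', 'work', 'clean') is an assumed input chosen for illustration.

```python
states = ("Rainy", "Sunny")
start_probability = {"Rainy": 0.6, "Sunny": 0.4}
transition_probability = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}
emission_probability = {
    "Rainy": {"jog": 0.1, "work": 0.4, "clean": 0.5},
    "Sunny": {"jog": 0.6, "work": 0.3, "clean": 0.1},
}


def viterbi(observations):
    """Return (most likely hidden state sequence, its joint probability)."""
    # best[t][s] = probability of the best path that ends in state s at time t
    best = [{
        s: start_probability[s] * emission_probability[s][observations[0]]
        for s in states
    }]
    # back[t][s] = state at time t-1 on that best path
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Pick the predecessor that maximises the path probability into s.
            prev = max(states, key=lambda p: best[-1][p] * transition_probability[p][s])
            scores[s] = (best[-1][prev] * transition_probability[prev][s]
                         * emission_probability[s][obs])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


path, probability = viterbi(("jog", "work", "clean"))
print(path, probability)  # ['Sunny', 'Rainy', 'Rainy'], about 0.01344
```

For this input the most likely explanation is one sunny day followed by two rainy days: jogging points to sun, while cleaning points strongly to rain.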

Key Takeaways

Let us briefly recap the article.

First, we saw the definition of the hidden Markov model and its basic assumptions. Then we saw how the hidden Markov model can be viewed as a finite state machine, along with some of its terminology. Finally, we saw how the HMM works through an example and looked at some of its applications. That's all for this article. I hope you liked it.

Happy Learning Ninjas!
