Evaluation problem of Hidden Markov Model
One of the three fundamental problems to be solved under an HMM is the likelihood (evaluation) problem.
Evaluation problem (Likelihood):
Given an HMM λ = (A, B, π) and an observation sequence O = o1, o2, …, oT, how do we compute the probability that the observed sequence was produced by the model? In other words, it is about determining the likelihood of the observation sequence under the model.
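Formally, the quantity to compute is the observation likelihood marginalized over every possible hidden state sequence Q:
P(O | λ) = Σ over all Q of P(O, Q | λ) = Σ over all Q of P(O | Q, λ) * P(Q | λ)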
Explain the evaluation problem of HMM with an example
It is about how to calculate the probability of the observation sequence given the model. The calculation depends on the data given to us. For instance, if you are given the observation sequence O = o1, o2, …, oT and
- one particular hidden state sequence Q = q1, q2, …, qT, you compute the likelihood of the observation along with that given tag sequence (the joint probability of the words and the tags, since the product includes the transition probabilities), as in the example below. [For more, refer to Hidden Markov Model for POS tagging solved exercises]
P(O, Q) = P(drink water, VB NN)
= P(VB|start) * P(drink|VB) * P(NN|VB) * P(water|NN)
- no particular state sequence, then you need to compute the same quantity as above for every possible hidden state sequence and sum the results. For example, to get P(drink water) without any particular hidden tag sequence, we need to compute the joint likelihood for each possibility and sum all the values as given below. This is because both the words “drink” and “water” are listed under the POS categories Noun and Verb.
P(drink water) = P(drink water, VB NN) + P(drink water, VB VB)
+ P(drink water, NN VB) + P(drink water, NN NN)
This way, we can compute the joint probability for every hidden state sequence and sum them to find the probability of the given observation sequence (see the Python sketch below, which covers both cases on a toy model).
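To make this concrete, here is a minimal Python sketch. The transition and emission probabilities below are made-up toy numbers chosen purely for illustration (not estimated from any corpus); joint_prob handles the single-sequence case and total_likelihood the brute-force sum over all tag sequences.

```python
from itertools import product

# Toy model: all probabilities below are illustrative assumptions,
# not values estimated from a real corpus.
trans = {("start", "VB"): 0.4, ("start", "NN"): 0.6,  # transition probabilities (A)
         ("VB", "VB"): 0.1, ("VB", "NN"): 0.5,
         ("NN", "VB"): 0.4, ("NN", "NN"): 0.2}
emit = {("VB", "drink"): 0.2, ("VB", "water"): 0.1,   # emission probabilities (B)
        ("NN", "drink"): 0.05, ("NN", "water"): 0.3}

def joint_prob(words, tags):
    """P(O, Q): product of transition and emission probabilities along one path."""
    prob, prev = 1.0, "start"
    for word, tag in zip(words, tags):
        # unseen transitions/emissions contribute probability 0
        prob *= trans.get((prev, tag), 0.0) * emit.get((tag, word), 0.0)
        prev = tag
    return prob

def total_likelihood(words, tagset):
    """P(O): sum of P(O, Q) over every possible tag sequence Q.
    product() enumerates all N**T sequences, hence the exponential cost."""
    return sum(joint_prob(words, tags)
               for tags in product(tagset, repeat=len(words)))

print(joint_prob(["drink", "water"], ["VB", "NN"]))        # 0.4*0.2*0.5*0.3 = 0.012
print(total_likelihood(["drink", "water"], ["VB", "NN"]))  # sum over 4 sequences ≈ 0.0158
```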
What is the drawback in the Evaluation (likelihood) problem of HMM?
How about in general, for an HMM with N hidden states and a sequence of T observations? The above calculations have to be done for N^T different possibilities. Let us do some simple calculations to understand the complexity:
- If the number of tags N = 2 and the number of words observed T = 2, then 2^2 = 4 possible likelihood estimates.
- If N = 6 and T = 4, then 6^4 = 1296 possible likelihood estimates.
- For longer sentences, the number of possibilities grows exponentially, which makes the brute-force computation intractable.
What is the solution to handle the Evaluation problem of HMM?
- The Forward-Backward procedure; its forward pass (the forward algorithm) alone computes the likelihood in O(N^2 * T) time, as sketched below.
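Here is a minimal sketch of the forward algorithm, reusing the toy trans/emit tables from the earlier example. It is an illustration under those toy assumptions rather than a full implementation; practical implementations additionally use scaling or log-space arithmetic to avoid floating-point underflow on long sequences.

```python
def forward(words, tagset):
    """Forward algorithm: computes P(O | lambda) in O(N^2 * T) time by caching
    partial path sums (alpha values) instead of enumerating all N^T sequences."""
    # Initialisation: alpha[t] = P(t | start) * P(word_1 | t)
    alpha = {t: trans.get(("start", t), 0.0) * emit.get((t, words[0]), 0.0)
             for t in tagset}
    # Induction: extend every partial path by one observation at a time
    for word in words[1:]:
        alpha = {t: emit.get((t, word), 0.0)
                    * sum(alpha[s] * trans.get((s, t), 0.0) for s in tagset)
                 for t in tagset}
    # Termination: P(O) is the sum of alpha over all final states
    return sum(alpha.values())

print(forward(["drink", "water"], ["VB", "NN"]))  # ≈ 0.0158, matches the brute force
```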