How to calculate transition probabilities in an HMM using MLE? How to calculate emission probabilities in an HMM using MLE from a corpus? How to count and estimate these probabilities from a tagged corpus?
Question:
Given the following tagged corpus as the training corpus, answer the following questions using Maximum Likelihood Estimation (MLE):
Training corpus:
But/CC then/RB the/DT bear/NN thought/VBD that/IN the/DT fish/NN was/VBD too/RB small/JJ to/TO fill/VB the/DT stomach/NN of/IN bear/NN. He/PRP decided/VBD to/TO catch/VB a/DT bigger/JJR fish/NN. He/PRP let/VBD off/RP the/DT small/JJ fish/NN and/CC waited/VBD for/IN some/DT time/NN. Again/RB a/DT small/JJ fish/NN came/VBD and/CC he/PRP let/VBP it/PRP go/VB thinking/VBG that/IN the/DT small/JJ fish/NN would/MD not/RB fill/VB his/PRP$ belly/NN. This/DT way/NN he/PRP caught/VBD many/JJ small/JJ fish/NN, but/CC let/VB all/DT of/IN them/PRP go/VB off/RP. By/IN sunset/NN, the/DT bear/NN had/VBD not/RP caught/VBN any/DT big/JJ fish/NN.
Tags used in this corpus:
CC – Conjunction
DT – Determiner
IN – Preposition
JJ – Adjective
JJR – Adjective, comparative
MD – Modal
NN – Noun
PRP – Personal pronoun
PRP$ – Possessive pronoun
RB – Adverb
RP – Particle
TO – To
VB – Verb
VBD – Verb, past tense
VBG – Verb, gerund/present participle
VBN – Verb, past participle
VBP – Verb, non-3rd person singular present
(a) Find the tag transition probabilities using MLE for the following:
(i) P(JJ|DT) (ii) P(VB|TO) (iii) P(NN|DT, JJ)
(b) Find the emission probabilities for the following:
(i) P(go|VB) (ii) P(fish|NN)
Answer:
(a) We can compute the maximum likelihood estimates of the bigram and trigram transition probabilities as follows:

P(ti | ti-1) = C(ti-1, ti) / C(ti-1) ... Equation (1)

In Equation (1),
- P(ti | ti-1) – Probability of a tag ti given the previous tag ti-1.
- C(ti-1, ti) – Count of the tag sequence "ti-1 ti" in the corpus. That is, how many times the tag ti follows the tag ti-1 in the corpus.
- C(ti-1) – Count of occurrences of the tag ti-1 in the corpus. That is, the frequency of the tag ti-1 in the corpus.

P(ti | ti-2, ti-1) = C(ti-2, ti-1, ti) / C(ti-2, ti-1) ... Equation (2)

In Equation (2),
- P(ti | ti-2, ti-1) – Probability of a tag ti given the previous two tags ti-2 and ti-1.
- C(ti-2, ti-1, ti) – Count of the tag sequence "ti-2 ti-1 ti" in the corpus. That is, how many times the tag ti follows the pair of tags ti-2 and ti-1 in the corpus.
- C(ti-2, ti-1) – Count of occurrences of the tag sequence "ti-2 ti-1" in the corpus.
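These counts are easy to check mechanically. Below is a minimal Python sketch (not part of the original post; names such as CORPUS, tokenize, bigram_prob and trigram_prob are illustrative) that splits the word/TAG corpus into pairs and evaluates Equations (1) and (2):

```python
from collections import Counter

# The training corpus exactly as given above (word/TAG tokens).
CORPUS = (
    "But/CC then/RB the/DT bear/NN thought/VBD that/IN the/DT fish/NN was/VBD "
    "too/RB small/JJ to/TO fill/VB the/DT stomach/NN of/IN bear/NN. He/PRP "
    "decided/VBD to/TO catch/VB a/DT bigger/JJR fish/NN. He/PRP let/VBD off/RP "
    "the/DT small/JJ fish/NN and/CC waited/VBD for/IN some/DT time/NN. Again/RB "
    "a/DT small/JJ fish/NN came/VBD and/CC he/PRP let/VBP it/PRP go/VB "
    "thinking/VBG that/IN the/DT small/JJ fish/NN would/MD not/RB fill/VB "
    "his/PRP$ belly/NN. This/DT way/NN he/PRP caught/VBD many/JJ small/JJ "
    "fish/NN, but/CC let/VB all/DT of/IN them/PRP go/VB off/RP. By/IN "
    "sunset/NN, the/DT bear/NN had/VBD not/RP caught/VBN any/DT big/JJ fish/NN."
)

def tokenize(corpus):
    """Split 'word/TAG' tokens into (word, tag) pairs, stripping the
    punctuation glued to some tags (e.g. 'bear/NN.' -> ('bear', 'NN'))."""
    pairs = []
    for token in corpus.split():
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag.strip(".,")))
    return pairs

pairs = tokenize(CORPUS)
tags = [tag for _, tag in pairs]

unigrams = Counter(tags)                              # C(ti)
bigrams = Counter(zip(tags, tags[1:]))                # C(ti-1, ti)
trigrams = Counter(zip(tags, tags[1:], tags[2:]))     # C(ti-2, ti-1, ti)

def bigram_prob(prev, cur):
    """Equation (1): P(cur | prev) = C(prev, cur) / C(prev)."""
    return bigrams[(prev, cur)] / unigrams[prev]

def trigram_prob(prev2, prev1, cur):
    """Equation (2): P(cur | prev2, prev1) = C(prev2, prev1, cur) / C(prev2, prev1)."""
    return trigrams[(prev2, prev1, cur)] / bigrams[(prev2, prev1)]

print(bigram_prob("DT", "JJ"))           # 4/12, about 0.33
print(bigram_prob("TO", "VB"))           # 2/2 = 1.0
print(trigram_prob("DT", "JJ", "NN"))    # 4/4 = 1.0
```

Running the sketch reproduces the three answers worked out by hand below.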
Solution to exercise a(i): Find the probability of tag JJ given the previous tag DT using MLE.
To find P(JJ | DT), we can apply Equation (1) to find the bigram probability using MLE. In the corpus, the tag DT occurs 12 times, out of which it is followed by the tag JJ 4 times. Therefore,
P(JJ | DT) = C(DT, JJ) / C(DT) = 4/12 ≈ 0.33
Solution to exercise a(ii): Find the probability of tag VB given the previous tag TO using MLE.
To find P(VB | TO), we can apply Equation (1) to find the bigram probability using MLE. In the corpus, the tag TO occurs 2 times, and both times it is followed by the tag VB. Therefore,
P(VB | TO) = C(TO, VB) / C(TO) = 2/2 = 1
Solution to exercise a(iii): Find the probability of tag NN given the previous two tags DT and JJ using MLE.
To find P(NN | DT, JJ), we can apply Equation (2) to find the trigram probability using MLE. In the corpus, the tag sequence "DT JJ" occurs 4 times, and all 4 times it is followed by the tag NN. Therefore,
P(NN | DT, JJ) = C(DT, JJ, NN) / C(DT, JJ) = 4/4 = 1
(b) We can compute the maximum likelihood estimate of the emission probability as follows:

P(wi | ti) = C(ti, wi) / C(ti) ... Equation (3)

In Equation (3),
- P(wi | ti) – Probability of a word wi given the tag ti associated with it.
- C(ti, wi) – Count of occurrences of the word wi with the associated tag ti in the corpus.
- C(ti) – Count of occurrences of the tag ti in the corpus.
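Equation (3) can be checked the same way. The snippet below continues the sketch above (it assumes Counter, pairs and unigrams from the transition example are already in scope; emission_prob is an illustrative name):

```python
# C(ti, wi): how often each (tag, word) pairing occurs in the corpus.
emissions = Counter((tag, word) for word, tag in pairs)

def emission_prob(word, tag):
    """Equation (3): P(word | tag) = C(tag, word) / C(tag)."""
    return emissions[(tag, word)] / unigrams[tag]

print(emission_prob("go", "VB"))      # 2/6, about 0.33
print(emission_prob("fish", "NN"))    # 7/15, about 0.47
```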
Solution to exercise b(i): Find the maximum likelihood estimate of the emission probability P(go|VB).
To find the MLE of the emission probability P(go | VB), we can apply Equation (3) as follows. In the corpus, the tag VB occurs 6 times, out of which it is associated with the word "go" 2 times. Therefore,
P(go | VB) = C(VB, go) / C(VB) = 2/6 ≈ 0.33
[How to read P(go | VB)? If we are going to generate the tag VB, how likely is it to be associated with the word "go"?]
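One way to make this reading concrete is to sample words from the estimated distribution P(w | VB) and observe that roughly a third of the draws come out as "go". A small continuation of the sketches above (it assumes Counter and emissions are already in scope; sample_word is an illustrative name):

```python
import random

def sample_word(tag, k=10_000):
    """Draw k words from the MLE distribution P(w | tag)."""
    words = [w for (t, w) in emissions if t == tag]
    weights = [emissions[(tag, w)] for w in words]
    return Counter(random.choices(words, weights=weights, k=k))

# About 2/6 of the draws for VB should be "go".
print(sample_word("VB"))
```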
Solution to exercise b(ii): Find the maximum likelihood estimate of the emission probability P(fish|NN).
To find the MLE of the emission probability P(fish | NN), we can apply Equation (3) as follows. In the corpus, the tag NN occurs 15 times, out of which it is associated with the word "fish" 7 times. Therefore,
P(fish | NN) = C(NN, fish) / C(NN) = 7/15 ≈ 0.47
**********
Go to Hidden Markov Model Formal Definition page