Tuesday, March 8, 2022

Solved MCQ in Natural Language Processing - Maximum Likelihood Estimate in Language Models

Natural Language Processing MCQ - Bigram probability calculation using MLE

1. To compute the bigram probability P(w_n | w_n-1) using the Maximum Likelihood Estimate (MLE), we count the occurrences of the bigram (w_n-1 w_n) in a corpus and normalize by the total count of all bigrams that start with w_n-1. This normalization ensures that the estimate lies between 0 and 1.

P(w_n | w_n-1) = Count(w_n-1 w_n) / Sum_w Count(w_n-1 w)

Here, w is any word that follows w_n-1.
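The formula above can be sketched in Python as follows. This is a minimal illustration, not a full language model: the function name bigram_mle_by_bigram_sum and the toy token list are assumptions made up for this example, and tokenization is plain whitespace splitting.

```python
from collections import Counter


def bigram_mle_by_bigram_sum(tokens, prev_word, word):
    """MLE of P(word | prev_word), normalized by the sum of the counts
    of all bigrams that start with prev_word."""
    # Count every adjacent pair (w_n-1, w_n) in the token sequence.
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    # Denominator: total count of bigrams whose first word is prev_word.
    denominator = sum(
        count
        for (first, _), count in bigram_counts.items()
        if first == prev_word
    )
    if denominator == 0:
        raise ValueError(f"no bigram starts with {prev_word!r}")
    # Counter returns 0 for an unseen bigram, so the estimate is 0.0 then.
    return bigram_counts[(prev_word, word)] / denominator


# Hypothetical toy token list, for illustration only.
toy_tokens = "the cat sat on the mat".split()
print(bigram_mle_by_bigram_sum(toy_tokens, "the", "cat"))  # prints 0.5
```

Here "the" starts two bigrams ("the cat" and "the mat"), so P(cat | the) = 1/2.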

This equation can be simplified by replacing the sum of bigram counts in the denominator with the unigram count of w_n-1. Why can we do that?

a) Bigram count can only be normalized by unigram count

b) Sum of all bigram counts that start with the word w_n-1 is equal to the unigram count of the same word

c) Normalization using bigram count would make the estimate greater than 1 in some cases.

d) None of the above.

Answer: (b) Sum of all bigram counts that start with the word w_n-1 is equal to the unigram count of the same word
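Written out, the identity behind answer (b) and the simplified estimate it leads to look like this. C denotes a count. The identity assumes that every occurrence of w_{n-1} is followed by some token; in practice an end-of-sentence marker is commonly appended to each sentence so that this also holds for the last word.

```latex
\sum_{w} C(w_{n-1}\,w) = C(w_{n-1})
\quad\Longrightarrow\quad
P(w_n \mid w_{n-1})
  = \frac{C(w_{n-1}\,w_n)}{\sum_{w} C(w_{n-1}\,w)}
  = \frac{C(w_{n-1}\,w_n)}{C(w_{n-1})}
```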

Let us calculate the bigram probability P(increase | to) using both normalizations: by the sum of bigram counts and by the unigram count. (Note: hereafter, ‘C’ stands for ‘Count’.)

Normalizing by sum of all bigram counts

For this case, we need to normalize using the total count of bigrams that start with the word “to”.

P(increase | to) = C(“to increase”)/[C(“to increase”)+C(“to be”)+C(“to fill”)] = 2/[2+1+1] = 2/4 = 0.5

Normalizing by unigram count

For this case, we need to normalize using the unigram count of the same word “to”.

P(increase|to) = C(“to increase”)/C(“to”) = 2/4 = 0.5

The word “to” occurs only 4 times in the corpus, and each occurrence is followed by exactly one word. Hence, the counts of all bigrams that start with “to” sum to exactly 4, the unigram count of “to”. For this reason, we can simplify the equation by normalizing with the unigram count instead of the sum of all bigram counts.
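The two calculations can be checked with a short Python sketch. The token sequence below is a hypothetical stand-in, not the original corpus: it was constructed only so that C(“to increase”) = 2, C(“to be”) = 1, C(“to fill”) = 1 and C(“to”) = 4, matching the counts used above. The helper names p_by_bigram_sum and p_by_unigram are likewise made up for this example.

```python
from collections import Counter

# Hypothetical toy corpus (NOT the original one); it only reproduces
# the counts used in the worked example for the word "to".
tokens = (
    "we need to increase sales and to increase profit "
    "to be ready to fill orders"
).split()

unigram_counts = Counter(tokens)
bigram_counts = Counter(zip(tokens, tokens[1:]))


def p_by_bigram_sum(prev_word, word):
    """Normalize by the sum of counts of bigrams starting with prev_word."""
    total = sum(
        count
        for (first, _), count in bigram_counts.items()
        if first == prev_word
    )
    return bigram_counts[(prev_word, word)] / total


def p_by_unigram(prev_word, word):
    """Normalize by the unigram count of prev_word."""
    return bigram_counts[(prev_word, word)] / unigram_counts[prev_word]


print(p_by_bigram_sum("to", "increase"))  # prints 0.5
print(p_by_unigram("to", "increase"))     # prints 0.5
```

Both functions divide the same numerator, 2, by a denominator of 4, so they agree for “to”.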
