Multiple choice questions in NLP, Natural Language Processing solved MCQ, bigram model, how to calculate a bigram probability from corpus statistics, maximum likelihood estimate of the bigram probability
Natural Language Processing MCQ - Bigram probability calculation
1. Find the probability P(Alice | is) as per the bi-gram model. Use the corpus given below:
<s> My name is Alice </s>
<s> Alice my name is </s>
<s> A girl said that her name is Alice </s>
<s> My daughter’s name is Alice </s>
a) 0
b) 0.75
c) 0.25
d) 0.5
Answer: (b) 0.75

The required bi-gram probability is calculated from the frequency of the bi-gram 'is Alice' and the frequency of the uni-gram 'is'. Both can be counted directly from the given corpus, and the probability follows from the Maximum Likelihood Estimate (MLE):

P(Alice | is) = Count('is Alice') / Count('is') = 3/4 = 0.75

The uni-gram 'is' occurs 4 times in the given corpus, once in each sentence. The bi-gram 'is Alice' occurs 3 times, in the first, third and fourth sentences. In the second sentence 'is' is followed by </s> instead, so that occurrence counts toward 'is' but not toward 'is Alice'.
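The same count can be checked programmatically. The following is a minimal Python sketch, not part of the original question: the names corpus, unigram_counts, bigram_counts and bigram_mle are illustrative, and it assumes whitespace tokenization that is case-sensitive (so 'My' and 'my' are different tokens, which does not affect this answer).

```python
from collections import Counter

# The four sentences from the question, with <s> and </s> boundary markers.
corpus = [
    "<s> My name is Alice </s>",
    "<s> Alice my name is </s>",
    "<s> A girl said that her name is Alice </s>",
    "<s> My daughter’s name is Alice </s>",
]

unigram_counts = Counter()
bigram_counts = Counter()
for sentence in corpus:
    tokens = sentence.split()  # assumption: plain whitespace tokenization
    unigram_counts.update(tokens)
    # Pair each token with the token that follows it.
    bigram_counts.update(zip(tokens, tokens[1:]))


def bigram_mle(word, previous):
    """MLE estimate of P(word | previous) = Count(previous word) / Count(previous)."""
    if unigram_counts[previous] == 0:
        return 0.0  # unseen context; a smoothed model would handle this differently
    return bigram_counts[(previous, word)] / unigram_counts[previous]


print(bigram_counts[("is", "Alice")], unigram_counts["is"])  # 3 4
print(bigram_mle("Alice", "is"))  # 0.75
```

With the same counts, bigram_mle("</s>", "is") gives 0.25, so the probabilities of the two words observed after 'is' sum to 1.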