Bigram probability estimate of a word sequence, Probability estimation for a sentence using Bigram language model

Bigram Model - Probability Calculation - Example Problem


Solved Example:

Let us solve a small example to better understand the bigram model. For this we need a training corpus and some test data. Assume the following small corpus:

Training corpus:

<s> I am from Vellore </s>

<s> I am a teacher </s>

<s> students are good and are from various cities </s>

<s> students from Vellore do engineering </s>

Test data:

<s> students are from Vellore </s>

Let us find the bigram probability of the given test sentence. The solution is worked out in two ways for the sake of understanding; the second method is the formal way of calculating the bigram probability of a sequence of words.

Method 1
Under the bigram model, the probability of the test sentence expands into a product of conditional probabilities, one for each pair of adjacent words:
P(<s> students are from Vellore </s>)
                        = P(students | <s>) * P(are | students) * P(from | are)
                             * P(Vellore | from) * P(</s> | Vellore)
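
The expansion above can be produced mechanically by pairing each token with the one before it. The following is a minimal Python sketch; the variable names are illustrative, and it assumes the sentence is tokenized by splitting on whitespace.

```python
# Split the test sentence into tokens, including the boundary markers.
test_sentence = "<s> students are from Vellore </s>"
tokens = test_sentence.split()

# Pair every token with its successor: (w_n-1, w_n).
bigrams = list(zip(tokens, tokens[1:]))

# Write out one conditional probability per bigram.
expansion = " * ".join(f"P({word} | {prev})" for prev, word in bigrams)
print(expansion)
```

A sentence with six tokens (counting <s> and </s>) always yields five bigram factors.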

To estimate each bigram probability, we divide the count of the bigram by the count of its first word:

P(w_n | w_n-1) = count(w_n-1 w_n) / count(w_n-1)

P(students | <s>) = 2/4 = 1/2
[Hint – count of sentence start (<s>) = 4, count of string <s> students = 2]

P(are | students) = 1/2
[Hint – count of word students = 2, count of string students are = 1]

P(from | are) = 1/2
[Hint – count of word are = 2, count of string are from = 1]

P(Vellore | from) = 2/3
[Hint – count of word from = 3, count of string from Vellore = 2]

P(</s> | Vellore) = 1/2
[Hint – count of word Vellore = 2, count of string Vellore </s> = 1]
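
The counts given in the hints can be checked with a few lines of Python. This is a sketch only: it assumes whitespace tokenization and case-sensitive matching, and names such as unigram_counts and bigram_counts are chosen here for illustration.

```python
from collections import Counter

# Training corpus from the example, one sentence per string.
corpus = [
    "<s> I am from Vellore </s>",
    "<s> I am a teacher </s>",
    "<s> students are good and are from various cities </s>",
    "<s> students from Vellore do engineering </s>",
]

unigram_counts = Counter()
bigram_counts = Counter()
for sentence in corpus:
    tokens = sentence.split()
    unigram_counts.update(tokens)                   # single-word counts
    bigram_counts.update(zip(tokens, tokens[1:]))   # adjacent-pair counts

# Print the two counts needed for each factor of the test sentence.
test_tokens = "<s> students are from Vellore </s>".split()
for prev, word in zip(test_tokens, test_tokens[1:]):
    print(f"count({prev}) = {unigram_counts[prev]}, "
          f"count({prev} {word}) = {bigram_counts[(prev, word)]}")
```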

P(<s> students are from Vellore </s>)

                        = P(students | <s>) * P(are | students) * P(from | are)
                             * P(Vellore | from) * P(</s> | Vellore)

                        = 1/2 * 1/2 * 1/2 * 2/3 * 1/2 = 0.04167
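
The same product can be computed in code. The sketch below uses exact fractions so the result can be compared with the hand calculation; the function names are hypothetical, and no smoothing is applied, so a bigram that never occurs in the corpus gives probability 0 and a context word that never occurs raises a division error.

```python
from collections import Counter
from fractions import Fraction

corpus = [
    "<s> I am from Vellore </s>",
    "<s> I am a teacher </s>",
    "<s> students are good and are from various cities </s>",
    "<s> students from Vellore do engineering </s>",
]

unigram_counts = Counter()
bigram_counts = Counter()
for sentence in corpus:
    tokens = sentence.split()
    unigram_counts.update(tokens)
    bigram_counts.update(zip(tokens, tokens[1:]))

def bigram_probability(prev, word):
    # Unsmoothed estimate: count(prev word) / count(prev).
    return Fraction(bigram_counts[(prev, word)], unigram_counts[prev])

def sentence_probability(sentence):
    # Multiply the bigram probabilities of all adjacent token pairs.
    tokens = sentence.split()
    probability = Fraction(1)
    for prev, word in zip(tokens, tokens[1:]):
        probability *= bigram_probability(prev, word)
    return probability

p = sentence_probability("<s> students are from Vellore </s>")
print(p, round(float(p), 5))
```

The exact value is 1/24, which rounds to 0.04167.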

Method 2

Formal way of estimating the bigram probability of a word sequence:
The bigram probabilities of the test sentence can be calculated by constructing the unigram count matrix, the bigram count matrix, and the bigram probability matrix as follows:

Unigram count matrix

<s>   students   are   from   Vellore
4     2          2     3      2

Bigram count matrix (rows: w_n-1, columns: w_n)

w_n-1 \ w_n   students   are   from   Vellore   </s>
<s>           2          0     0      0         0
students      0          1     1      0         0
are           0          0     1      0         0
from          0          0     0      2         0
Vellore       0          0     0      0         1
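
A count matrix restricted to the words of the test sentence can be built as sketched below. The row and column orderings are assumptions chosen to match the layout used in this example. Note that the sketch gives 2 for the "<s> students" cell, in agreement with the hint in Method 1, since two training sentences begin with "students".

```python
from collections import Counter

corpus = [
    "<s> I am from Vellore </s>",
    "<s> I am a teacher </s>",
    "<s> students are good and are from various cities </s>",
    "<s> students from Vellore do engineering </s>",
]

bigram_counts = Counter()
for sentence in corpus:
    tokens = sentence.split()
    bigram_counts.update(zip(tokens, tokens[1:]))

rows = ["<s>", "students", "are", "from", "Vellore"]      # w_n-1
cols = ["students", "are", "from", "Vellore", "</s>"]     # w_n

# A Counter returns 0 for pairs that never occur in the corpus.
count_matrix = [[bigram_counts[(prev, word)] for word in cols] for prev in rows]

print(" " * 10 + "".join(word.ljust(10) for word in cols))
for prev, row in zip(rows, count_matrix):
    print(prev.ljust(10) + "".join(str(count).ljust(10) for count in row))
```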

Bigram probability matrix (each bigram count divided by the unigram count of w_n-1)

w_n-1 \ w_n   students   are   from   Vellore   </s>
<s>           2/4        0/4   0/4    0/4       0/4
students      0/2        1/2   1/2    0/2       0/2
are           0/2        0/2   1/2    0/2       0/2
from          0/3        0/3   0/3    2/3       0/3
Vellore       0/2        0/2   0/2    0/2       1/2
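
The normalization step can be sketched in the same way: each bigram count is divided by the unigram count of the preceding word. The dictionary layout and names below are illustrative only. Because the columns cover just the words of the test sentence, a row need not sum to 1; for example, the remaining 2/4 of the <s> row belongs to the word "I".

```python
from collections import Counter

corpus = [
    "<s> I am from Vellore </s>",
    "<s> I am a teacher </s>",
    "<s> students are good and are from various cities </s>",
    "<s> students from Vellore do engineering </s>",
]

unigram_counts = Counter()
bigram_counts = Counter()
for sentence in corpus:
    tokens = sentence.split()
    unigram_counts.update(tokens)
    bigram_counts.update(zip(tokens, tokens[1:]))

rows = ["<s>", "students", "are", "from", "Vellore"]      # w_n-1
cols = ["students", "are", "from", "Vellore", "</s>"]     # w_n

# prob_matrix[prev][word] = count(prev word) / count(prev)
prob_matrix = {
    prev: {word: bigram_counts[(prev, word)] / unigram_counts[prev] for word in cols}
    for prev in rows
}

for prev in rows:
    cells = [f"{bigram_counts[(prev, word)]}/{unigram_counts[prev]}" for word in cols]
    print(prev.ljust(10) + "".join(cell.ljust(10) for cell in cells))
```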

P(<s> students are from Vellore </s>)

                          = P(students | <s>) * P(are | students) * P(from | are)
                              * P(Vellore | from) * P(</s> | Vellore)

                          = 1/2 * 1/2 * 1/2 * 2/3 * 1/2 = 0.04167

The probability of the test sentence as per the bigram model is 0.04167.

Monday, April 6, 2020
