Probabilistic Context Free Grammar Formal Definition and Examples
Probabilistic Context Free Grammar (PCFG)
Probabilistic
Context Free Grammar (PCFG) is an extension of Context Free Grammar (CFG) with
a probability for each production rule. Ambiguity is the reason why we are
using probabilistic version of CFG. For instance, some sentences may have more
than one underlying derivation. That is, the sentence can be parsed in more
than one ways. In this case, the parse of the sentence become ambiguous. To
eliminate this ambiguity, we can use PCFG to find the probability of each parse
of the given sentence.
A PCFG is made up
of a CFG and a set of probabilities for each production rule of CFG. A PCFG can be
formally defined as follows;
A probabilistic context free grammar G is a
quintuple G = (N, T, S, R, P) where
- (N, T, S, R) is a context free grammar where N is set of non-terminal (variable) symbols, T is set of terminal symbols, S is the start symbol and R is the set of production rules where each rule of the form A → s [Refer for more here – Context Free Grammar Formal Definition].
- A probability P(A → s) for each rule in R. The properties governing the probability are as follows;
- P(A → s) is a conditional probability of choosing a rule A → s in a left-most derivation, given that A is the non-terminal that is expanded.
- The value for each probability lies between 0 and 1.
- The sum of all probabilities of rules with A as the left hand side non-terminal should be equal to 1.
Example PCFG:
Probabilistic Context Free Grammar G
= (N, T, S, R, P)
- N = {S, NP, VP, PP, Det, Noun, Verb, Pre}
- T = {‘a’, ‘ate’, ‘cake’, ‘child’, ‘fork’, ‘the’, ‘with’}
- S = S
- R = { S → NP VP
NP
→ Det Noun | NP PP
PP
→ Pre NP
VP
→ Verb NP
Det
→ ‘a’ | ‘the’
Noun
→ ‘cake’ | ‘child’ | ‘fork’
Pre
→ ‘with’
Verb
→ ‘ate’ }
- P = R with
associated probability as in the table below;
Rule
|
Probability
|
Rule
|
Probability
|
S → NP VP
|
1.0
|
Det → ‘a’
Det → ‘the’
|
0.5
0.5
|
NP → NP PP
NP → Det Noun
|
0.6
0.4
|
Noun → ‘cake’
Noun → ‘child’
Noun → ‘fork’
|
0.4
0.3
0.3
|
PP → Pre NP
|
1.0
|
Pre → ‘with’
|
1.0
|
VP → Verb NP
|
1.0
|
Verb
→ ‘ate’
|
1.0
|
In the next page,
let us discuss the question “How to use PCFG to calculate the probability of a
parse tree?”
********************
Related links:
- Go to Natural Language Processing (NLP) home
- Go to NLP Solved Exercise page
- Go to Context Free Grammar (CFG) Formal Definition page