Table of contents
1. Introduction
2. What is Probabilistic Context-Free Grammar?
3. Parsing with PCFGs
4. Frequently Asked Questions
   4.1. What is a probabilistic context-free grammar?
   4.2. What is the significance of the parameters in PCFG?
   4.3. Can we apply the CKY algorithm to the Probabilistic Context-Free Grammar?
   4.4. What is probabilistic parsing?
5. Conclusion

Probabilistic Context-Free Grammar


Introduction

Probabilistic Context-Free Grammar (PCFG) is an extension of Context-Free Grammar (CFG) that incorporates probabilities into its production rules. While CFG defines the structural rules of a language, PCFG adds a probabilistic dimension, allowing it to manage ambiguity by selecting the most likely parse for a given input. This makes PCFG particularly useful in fields like natural language processing, where multiple interpretations of a sentence may exist, as it can prioritize the most probable one based on statistical patterns learned from data.


What is Probabilistic Context-Free Grammar?

[Figure: an example probabilistic context-free grammar, with a probability attached to each rule]

In the figure above, we can see a probabilistic context-free grammar. It looks the same as a context-free grammar, but now a probability is assigned to each rule. Here we have a set of context-free rules, 'S' is the start symbol, and each rule carries a probability. These probabilities have one critical property: take any nonterminal in the grammar, say VP. There are several possible ways of expanding that nonterminal; VP can be rewritten in three different ways, as Vi, as Vt NP, or as VP PP. Notice that the three probabilities associated with these three options sum to one (0.4 + 0.4 + 0.2 = 1). This is the key constraint on the probabilities in a probabilistic context-free grammar: for any nonterminal, the probabilities of its possible expansions must sum to 1. Here is another example: NP can be rewritten in two different ways, as a determiner followed by a noun or as a noun phrase followed by a prepositional phrase, and these two probabilities also sum to 1. So these probabilities have a clear interpretation, which is the following.

They are conditional probabilities, conditioned on a particular nonterminal: given that we are expanding some nonterminal, there are multiple ways of rewriting it, and the probability mass is distributed over those different rewrites.
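To make this concrete, here is a minimal sketch of such a grammar in Python using NLTK's PCFG class. The nonterminal rules and their probabilities follow the description above; the lexical rules (the specific words and their probabilities) are illustrative assumptions, since the original figure is not reproduced here.

```python
import nltk

# A small PCFG: the probabilities of all rules sharing a left-hand side
# sum to 1 (e.g. VP: 0.4 + 0.4 + 0.2 = 1.0). The lexical rules below are
# illustrative assumptions, not taken from the original figure.
grammar = nltk.PCFG.fromstring("""
    S  -> NP VP        [1.0]
    VP -> Vi           [0.4]
    VP -> Vt NP        [0.4]
    VP -> VP PP        [0.2]
    NP -> DT NN        [0.3]
    NP -> NP PP        [0.7]
    PP -> IN NP        [1.0]
    Vi -> 'sleeps'     [1.0]
    Vt -> 'saw'        [1.0]
    DT -> 'the'        [1.0]
    IN -> 'with'       [1.0]
    NN -> 'man'        [0.7]
    NN -> 'telescope'  [0.3]
""")
print(grammar)  # NLTK verifies that each nonterminal's probabilities sum to 1
```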

Let's take a particular parse tree under this grammar. 

[Figure: an example parse tree under this grammar]

Notice that every rule used here appears in the grammar, so this is a valid parse tree under the underlying context-free grammar. We can now assign a probability to this parse tree, which is simply the product of the probabilities of the rules it uses. The first rule, S → NP VP, has probability 1.0; NP → DT NN has probability 0.3; DT → 'the' has probability 1.0; and so on. To calculate the probability of the entire tree, we multiply together the probabilities of all the rules used.

The final expression looks like 1.0 × 0.3 × 1.0 × 0.7 × 0.4 × 1.0 = 0.084.

Given a parse tree t ∈ TG containing rules α1 → β1, α2 → β2, . . . , αn → βn, the probability of t under the PCFG is

p(t) = q(α1 → β1) × q(α2 → β2) × . . . × q(αn → βn)

where q(α → β) is the probability for the rule α → β.
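As a quick sanity check, here is a minimal sketch of this product for the example tree, assuming the rule probabilities read off above:

```python
import math

# q-values of the rules used in the example tree, in derivation order:
# S -> NP VP, NP -> DT NN, DT -> 'the', NN -> (word), VP -> Vi, Vi -> (word).
# The mapping of 0.7 and 0.4 to specific rules is read off the example.
rule_probs = [1.0, 0.3, 1.0, 0.7, 0.4, 1.0]

p_tree = math.prod(rule_probs)
print(p_tree)  # ~0.084
```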

Probabilistic context-free grammars (PCFGs) are defined as follows:

A PCFG consists of: 

1. A context-free grammar G = (N, Σ, S, R).

2. A parameter q(α → β) for each rule α → β ∈ R. The parameter q(α → β) can be interpreted as the conditional probability of choosing rule α → β in a left-most derivation, given that the nonterminal being expanded is α. For any X ∈ N, we have the constraint

∑ q(α → β) = 1, where the sum is taken over all rules α → β ∈ R such that α = X.

In addition, we have q(α → β) ≥ 0 for any α → β ∈ R.
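These two constraints are easy to check mechanically. Here is a minimal sketch, using a hypothetical rule table holding the VP and NP rules from the earlier example:

```python
from collections import defaultdict

# Hypothetical rule table: (lhs, rhs) -> q(lhs -> rhs).
q = {
    ("VP", ("Vi",)):      0.4,
    ("VP", ("Vt", "NP")): 0.4,
    ("VP", ("VP", "PP")): 0.2,
    ("NP", ("DT", "NN")): 0.3,
    ("NP", ("NP", "PP")): 0.7,
}

totals = defaultdict(float)
for (lhs, _rhs), prob in q.items():
    assert prob >= 0.0     # q(alpha -> beta) >= 0
    totals[lhs] += prob

for lhs, total in totals.items():
    # The probabilities of all rules expanding a nonterminal must sum to 1.
    assert abs(total - 1.0) < 1e-9, (lhs, total)
```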

Parsing with PCFGs

For a sentence s, how do we find the highest-scoring parse tree for s? More explicitly, how do we find

t*(s) = arg max p(t), where the max is taken over all t ∈ TG(s), the set of possible parse trees for s under the grammar?

We can use the CKY algorithm for this problem. The Cocke-Younger-Kasami (CKY) algorithm applies to a restricted type of PCFG: one in Chomsky normal form (CNF). While the restriction to grammars in CNF might initially seem limiting, it is in fact a reasonable assumption, because any PCFG can be converted into an equivalent grammar in CNF.
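For instance, NLTK ships a Viterbi-style parser that returns the highest-probability parse under a PCFG. A minimal sketch, reusing the grammar object defined earlier (the example sentence is an illustrative assumption):

```python
import nltk

# ViterbiParser finds the most probable parse tree under the PCFG.
parser = nltk.ViterbiParser(grammar)  # `grammar` from the earlier sketch

sentence = "the man saw the man with the telescope".split()
for tree in parser.parse(sentence):
    print(tree.prob())   # probability of the best parse
    tree.pretty_print()  # draw the tree as ASCII art
```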

A context-free grammar G = (N, Σ, R, S) is in Chomsky normal form (CNF) if each rule α → β ∈ R takes one of the following two forms:

• X → Y1 Y2, where X ∈ N, Y1 ∈ N, Y2 ∈ N

• X → Y, where X ∈ N, Y ∈ Σ

That is, each rule in the grammar either rewrites a nonterminal X as exactly two nonterminal symbols Y1 Y2, or rewrites a nonterminal X as exactly one terminal symbol Y. A sketch of probabilistic CKY for such a grammar follows.
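To illustrate how CKY finds the highest-scoring tree by dynamic programming, here is a minimal sketch for a PCFG in CNF. The rule-table format (lexical and binary) is an assumption made for this sketch, not a standard API:

```python
from collections import defaultdict

def pcky(words, lexical, binary, start="S"):
    """Viterbi CKY for a PCFG in Chomsky normal form.

    lexical: dict mapping a word to a list of (X, q) pairs for rules X -> word
    binary:  list of (X, Y, Z, q) tuples for rules X -> Y Z
    Returns the probability of the best start-rooted parse (0.0 if none)
    and a backpointer table from which the tree can be recovered.
    """
    n = len(words)
    pi = defaultdict(float)  # pi[(i, j, X)] = max prob of X spanning words[i..j]
    bp = {}                  # backpointers: (i, j, X) -> (split, Y, Z)

    # Base case: spans of length one are filled from the lexical rules.
    for i, word in enumerate(words):
        for X, q in lexical.get(word, []):
            pi[(i, i, X)] = q

    # Recursive case: build longer spans from pairs of shorter ones.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for X, Y, Z, q in binary:
                for s in range(i, j):  # split point between Y and Z
                    p = q * pi[(i, s, Y)] * pi[(s + 1, j, Z)]
                    if p > pi[(i, j, X)]:
                        pi[(i, j, X)] = p
                        bp[(i, j, X)] = (s, Y, Z)

    return pi[(0, n - 1, start)], bp
```

The table pi has O(n² · |N|) entries and the fill runs in O(n³ · |R|) time, which is what makes exact search over the exponentially many candidate parse trees tractable.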

Frequently Asked Questions

What is a probabilistic context-free grammar?

A probabilistic context-free grammar is the same as a context-free grammar, except that a probability is assigned to each rule in the grammar.

What is the significance of the parameters in PCFG?

The parameter q(α → β) can be interpreted as the conditional probability of choosing rule α → β in a left-most derivation, given that the nonterminal being expanded is α.

Can we apply the CKY algorithm to the Probabilistic Context-Free Grammar?

Yes. The Cocke-Younger-Kasami (CKY) algorithm applies to a restricted type of PCFG, one in Chomsky normal form (CNF), and any PCFG can be converted into an equivalent grammar in CNF.

What is probabilistic parsing?

Probabilistic parsing uses dynamic programming (for example, the CKY algorithm) to compute the most likely parse of a given sentence.

Conclusion

Probabilistic Context-Free Grammar (PCFG) enhances traditional context-free grammars by assigning probabilities to production rules, allowing more flexible and accurate modeling of natural language. This probabilistic approach helps in handling ambiguity in language parsing, improving tasks like syntax analysis, speech recognition, and machine translation.

If you find it exciting and want to learn more about NLP and Machine Learning, visit here. 

Further readings-

CYK Algorithm in Parsing

Hyperparameter Tuning and Predicting scores

Restricted Boltzmann Machine on MNIST Dataset
