Last Updated: Mar 27, 2024

# Probabilistic Context-Free Grammar

Leveraging ChatGPT - GenAI as a Microsoft Data Expert
Speaker
Prerita Agarwal
Data Specialist @
23 Jul, 2024 @ 01:30 PM

## Introduction

Probabilistic Context-Free Grammar

source

In the above we can see the Probabilistic Context-Free Grammar. It looks the same as context-free grammar, but now we have probabilities assigned to each rule in the grammar. Here, I have a set of context-free rules and will assume that 'S' is the start symbol under the context-free grammar, and we have probabilities for each rule. These probabilities have one critical property: if I take any nonterminal in the grammar, let's take VP, for example. There are many possible ways of expanding that nonterminal. So VP can be extended in three different ways. We can write it as VI or VT NP or V PPP. We can Notice that the three different probabilities associated with the three options sum to one (0.4+0.4+0.2=1.) This is the key constraint on the probabilities in a probabilistic context-free grammar. If we look at any nonterminal, we must have these probabilities summing to 1. Here is another example: NP; we can rewrite it in two different ways: a determiner followed by a noun or a noun followed by a prepositional phrase. These two probabilities sum to 1. So these probabilities have a clear interpretation which is the following

They are the conditional probabilities conditioned on a particular nonterminal. We have multiple different ways of writing nonterminals. We now distribute those other options over those different ways of redirecting that nonterminal.

Let's take a particular parse tree under this grammar.

source

Notice that every one of the rules I am using here is in this grammar. Hence, this is a valid parse tree under the underlying context-free grammar. We will now assign probabilities to this parse tree, which is simply a product of possibilities for the different rules. If we look at the first rule, S goes to  NPVP with a probability of 1.0. NP goes to DTNN with a chance of 0.3; DT goes to 'the' with a probability of 1.0. To calculate the possibility of the entire tree, we need to multiply together the chances of the different rules.

The final expression looks like- 1.0 x 0.3 x 1.0 x 0.7 x 0.4 x 1.0.

Given a parse-tree t âˆˆ TG containing rules Î±1 â†’ Î²1, Î±2 â†’ Î²2, . . . , Î±n â†’ Î²n, the probability of t under the PCFG is

Where q(Î± â†’ Î²) is the probability for the rule

Probabilistic context-free grammars (PCFGs) are defined as follows:

A PCFG consists of:

1. A context-free grammar G = (N, Î£, S, R).

2. A parameter q(Î± â†’ Î²

for each rule Î± â†’ Î², âˆˆ R. The parameter q(Î± â†’ Î²) can be interpreted as the conditional probability of choosing rule Î± â†’ Î² in a left-most derivation, given that the nonterminal being expanded is Î±.For any X âˆˆ N, we have the constraint

In addition we have q(Î± â†’ Î²) â‰¥ 0 for any Î± â†’ Î² âˆˆ R

## Parsing with PCFGs

For sentence s, how do we find the highest-scoring parse tree for s, or more explicitly, how do we find

We can take the help of the CKY algorithm for this problem. The Cocke-Younger-Kasami algorithm applies to a restricted type of PCFG, which is in Chomsky standard form (CNF). While the restriction to grammar in CNF might initially seem restrictive, it is not a reasonable assumption. We can convert any PCFG into an equivalent grammar in CNF.

CNF (Chomsky Normal Form) is a context-free grammar G = (N, Î£, R, S) if each rule Î± â†’ Î² âˆˆ R represents one of two following forms.

â€¢ X â†’ Y1 Y2, X âˆˆ N, Y1 âˆˆ N, Y2 âˆˆ N.

â€¢ X â†’ Y, X âˆˆ N, Y âˆˆ Î£

There is a nonterminal X rewriting for each rule in the grammar as two nonterminal symbols, Y1 Y2. Or, it can consist of a nonterminal X rewriting as exactly one terminal symbol Y.

Get the tech career you deserve, faster!
Connect with our expert counsellors to understand how to hack your way to success
User rating 4.7/5
1:1 doubt support
95% placement record
Akash Pal
Senior Software Engineer
326% Hike After Job Bootcamp
Himanshu Gusain
Programmer Analyst
32 LPA After Job Bootcamp
After Job
Bootcamp

## FAQs

1. Define Probabilistic context-free grammar?

A probabilistic context-free grammar is the same as context-free grammar, but here we have probabilities assigned to each rule in the grammar.

2. What is the significance of the parameters in PCFG?

The parameter q(Î± â†’ Î²) can be interpreted as the conditional probability of choosing rule Î± â†’ Î² in a left-most derivation, given that the nonterminal being expanded is Î±.

3. Can we apply the CKY algorithm to the Probabilistic Context-Free Grammar?

Yes, the Cocke-Younger-Kasami algorithm applies to a restricted type of PCFG, which is in Chomsky standard form (CNF).

4. What is probabilistic parsing?

Probabilistic parsing uses dynamic programming to compute the most likely parse of a given sentence.

5. Why do we use the CYK algorithm?

We use the CYK algorithm to find whether the given string belongs to a language of grammar or not.

## Key Takeaways

In this blog,  we discussed

• The basics of PCFG
• The parsing method
• The CNF form

If you find it exciting and want to learn more about NLP and Machine Learning, visit here.

CYK Algorithm in Parsing

Hyperparameter Tuning and Predicting scores

Restricted Boltzmann Machine on MNIST Dataset

Topics covered
1.
Introduction
2.
Parsing with PCFGs
3.
FAQs
4.
Key Takeaways