Code360 powered by Coding Ninjas X Naukri.com. Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1.
Type of Predictive Parsing
1.1.
1. Top-down Parsing
1.2.
2. Bottom up Parsing
2.
Components of Predictive Parsing
2.1.
1. Input Buffer
2.2.
2. Stack 
2.3.
3. Predictive Parsing Table 
3.
Drawback of Predictive Parsing
4.
Algorithm to Construct Predictive Parsing Table
5.
Frequently Asked Questions
5.1.
Q. What is a parser?
5.2.
Q. What is a recursive descent parser?
5.3.
Q. What is the main purpose of parser?
5.4.
Q. What is the difference between predictive and recursive parsing?
6.
Conclusion
Last Updated: Jun 24, 2024
Medium

What is Predictive Parsing?

Author Apoorv
1 upvote
Master Python: Predicting weather forecasts
Speaker
Ashwin Goyal
Product Manager @

Predictive parsing is a form of recursive descent parsing, in which no backtracking is needed, so it can predict which products to use as the replacement of the input string. There's a version called LL(1) parsing that uses a table to help make decisions, which makes it very efficient and straightforward.

Predictive parsing is a parsing technique used in compiler construction and syntax analysis. It helps compilers analyze and understand the structure of code by predicting the part that comes next depending on what's already there. This simplifies the process of translating code into machine-readable instructions.

predictive parsing

We'll go through the basics of Predictive Parser and concentrate on the role of Predictive Parser. We will discuss everything about Predictive Parsing. Also will cover the algorithm for implementing the Predictive parser or Predictive Parsing, as well as an example involving the construction of the algorithm for precedence parsing but before that let's have a look at the basic hierarchy of parsing and their different types.

Predictive parsing is a parsing technique used in compiler design to analyze and validate the syntactic structure of a given input string based on a grammar. It predicts the production rules to apply without backtracking, making it efficient and deterministic.

Also, read - Recursive Descent Parser

Type of Predictive Parsing

1. Top-down Parsing

In top-down parsing, the parser starts with the top-level non-terminal of the grammar and attempts to construct a parse tree by repeatedly expanding non-terminals until the input string is derived. It predicts the production rules to apply from left to right, hence the name "top-down."

Eg. Consider the grammar:
S -> A B
A -> a
B -> b

Let's parse the input string “ab” using top-down predictive parsing. 

1. Construct the predictive parsing table:

 a | b
S -> A B
AA -> a
BB-> b

2. Initialising stack and input buffer: 

Stack: '$ S' 

Input Buffer: 'ab $'

3. Apply top-down predictive parsing algorithm. 

Stack Input BufferAction
$ Sa b $Match terminal ‘a’
$ S A b $Match terminal ‘b’
$ S A B$Accept

2. Bottom up Parsing

In bottom-up parsing, the parser begins with the input string and attempts to reduce it to the start symbol of the grammar by applying production rules in a right-to-left manner. It builds the parse tree from the bottom-up, hence the name "bottom-up."

Eg. Consider the grammar:
S -> A B
A -> a
B -> b

Let's parse the input string “ab” using bottom-up predictive parsing. 

1. Initialize the stack and input buffer.

Stack: $

Input Buffer: ab $
 

2. Applying bottom-up parsing algorithm: 

Stack Input BufferAction
$a b $Shift ‘a’ onto stack
$ ab $Shift ‘b’ onto stack
$ a bReduce using production B -> b
$ a B$Reduce using production A -> a
$ A$Reduce using production S -> A B
$ S$Accept

Components of Predictive Parsing

1. Input Buffer

The input buffer is a temporary storage area that holds the input symbols yet to be processed by the parser. It provides a lookahead mechanism, allowing the parser to examine the next input symbol without consuming it.

input buffer

In this diagram, the input buffer is shown as a series of unprocessed tokens (T) and non-terminals (N). The end-of-input marker is denoted by the "$" symbol. Based on the top of the parsing stack and the current input symbol, the parsing table is a data structure that directs the parser's activities. The parser's next token to process is indicated by the arrow "^" in the input buffer. The arrow moves to the right as the parser reads tokens from the input buffer, processing each token until the end-of-input marker is reached.

2. Stack 

The stack, also known as the parsing stack or the pushdown stack, is a data structure used by the parser to keep track of the current state and to guide the parsing process. It stores grammar symbols, both terminals and non-terminals, during the parsing process.

parsing stack

The stack in predictive parsing is a vertical sequence of symbols representing terminals (T) and non-terminals (N) from the language's grammar. The top of the stack, indicated by the "^" symbol, represents the symbol currently being processed by the parser. By comparing the input symbol with the top of the stack, the parser determines the next action to take. The parser can shift new symbols onto the stack or reduce symbols using grammar productions as it processes symbols from the input stream. The stack is crucial for maintaining the parser's state and enabling parsing actions. It allows the parser to track encountered symbols, make decisions based on the grammar, and interact with the parsing table. In practical implementations, the stack may contain additional information, such as attributes or semantic actions associated with the symbols.

3. Predictive Parsing Table 

The predictive parsing table is a data structure used in predictive parsing. It maps the combination of a non-terminal and a lookahead symbol to a production rule. It guides the parser's decisions, determining which production to apply based on the current non-terminal and the next input symbol.

predictive parsing table

This diagram represents the parsing table as a grid with rows and columns. The rows correspond to the non-terminals in the grammar, and the columns represent the input symbols (terminals and the end-of-input marker $).

Each cell in the table contains information about the action or the production to be applied when a specific non-terminal and input symbol combination is encountered. Here's what each entry in the table represents:

  • S, A, B: Non-terminals in the grammar.
  • S->, A->, B->: Indicates a production rule to be applied.
  • a, b: Terminal symbols from the input alphabet.
  • ε: Represents an empty or epsilon production.
  • Empty cells: Indicate that no action or production is defined for that combination of non-terminal and input symbols.
     

For example, if the parser is in state S (row) and the next input symbol is "a" (column), the corresponding cell entry "S->aAB" instructs the parser to apply the production rule "S -> aAB".

The parsing table is a critical component of predictive parsing as it systematically guides the parser's decisions and actions. By consulting the table, the parser can determine the appropriate production or action based on the current state and the lookahead input symbol, enabling it to construct the parse tree or detect syntax errors during the parsing process.

Get the tech career you deserve, faster!
Connect with our expert counsellors to understand how to hack your way to success
User rating 4.7/5
1:1 doubt support
95% placement record
Akash Pal
Senior Software Engineer
326% Hike After Job Bootcamp
Himanshu Gusain
Programmer Analyst
32 LPA After Job Bootcamp
After Job
Bootcamp

Drawback of Predictive Parsing

These are some drawbacks of Predictive Parsing:

  • Left Recursion: Predictive parsing cannot handle left-recursive grammars directly. Left recursion occurs when a non-terminal directly or indirectly references itself as the first symbol in its production rule.
  • Backtracking: Predictive parsing does not support backtracking, meaning it cannot change the decision it has made based on the input. This limitation restricts the class of grammars that can be handled by predictive parsing.
  • Difficulty handling ambiguous grammars: Predictive parsing has difficulties when dealing with ambiguous grammars, where a single input can have numerous correct parse trees. The LL(1) parsing approach cannot directly handle ambiguity. To resolve ambiguities, grammar adjustments or additional strategies like disambiguation rules or semantic actions may be needed, which might make the parser implementation more complex.

Algorithm to Construct Predictive Parsing Table


The algorithm to construct a Predictive Parsing table involves several steps:

1. Determine the First Sets for all non-terminals in the grammar. The First Set of a non-terminal represents the set of terminals that can appear as the first symbol of any derivation for that non-terminal.

2. Determine the Follow Sets for all non-terminals in the grammar. The Follow Set of a non-terminal represents the set of terminals that can appear immediately after the non-terminal in any derivation.

3. For each production rule A -> α in the grammar: a. For each terminal 'a' in First(α), add A -> α to the table entry [A, 'a']. b. If ε (empty string) is in First(α), for each terminal 'b' in Follow(A), add A -> α to the table entry [A, 'b'].

4. For each production rule A -> ε in the grammar, add A -> ε to the table entry [A, 'b'] for each terminal 'b' in Follow(A).


Examples of Predictive Parsing

Consider the grammar:
S -> A B
A -> a
B -> b | ε

1. Determine the First Sets:
First(S) = {a}
First(A) = {a}
First(B) = {b, ε}

2. Determine the Follow Sets:
Follow(S) = {$}
Follow(A) = {b}
Follow(B) = {$}

3. Construct the Predictive Parsing Table:

 ab$
SAB  
Aa  
B bε

4. Table entries based on the algorithm:

  • From the production S -> A B:
    For each terminal 'a' in First(A), add S -> A B to the entry [S, 'a'].
     
  • From the production A -> a:
    Add A -> a to the entry [A, 'a'].
     
  • From the production B -> b:
    Add B -> b to the entry [B, 'b'].
     
  • From the production B -> ε:
    For each terminal 'b' in Follow(B), add B -> ε to the entry [B, 'b'].
    Add B -> ε to the entry [B, '$'].

Frequently Asked Questions

Q. What is a parser?

A parser is a component of a compiler or interpreter that divides data into smaller chunks for easier translation into another language. A parser breaks down input in the form of a sequence of tokens, interactive commands, or computer instructions into portions that can be used by other programming components.

Q. What is a recursive descent parser?

It can be defined as a Parser that processes the input string using numerous recursive procedures with no backtracking. Recursive Descent Parser employs Top-Down Parsing without the need for retracing.

Q. What is the main purpose of parser?

Parsers are employed to abstractly represent input data, such as source code, as a structured format. This facilitates syntax validation by verifying adherence to language rules. Parsing is utilized in various coding languages and technologies for this purpose.

Q. What is the difference between predictive and recursive parsing?

Predictive parsing uses lookahead tokens to make parsing decisions without backtracking, suitable for LL grammars. Recursive parsing involves functions calling themselves to process input, adaptable but potentially less efficient due to backtracking in some cases.

Conclusion

In this article, we have extensively discussed Predictive Parsing along with some examples.

Recommended Reading:


Do check out The Interview guide for Product Based Companies as well as some of the Popular Interview Problems from Top companies like Amazon, Adobe, Google, Uber, Microsoft, etc. on Coding Ninjas Studio.

Also check out some of the Guided Paths on topics such as Data Structure and Algorithms, Competitive Programming, Operating Systems, Computer Networks, DBMS, System Design, etc. as well as some Contests, Test Series, Interview Bundles, and some Interview Experiences curated by top Industry Experts only on Coding Ninjas Studio.

Cheers!

Previous article
Left Factoring in Compiler Design
Next article
LR (0) Parser
Live masterclass