Table of contents
1.
What is Bottom up parsing?
1.1.
Example
2.
Classification of Bottom-up parsing
2.1.
Shift Reduce Parsing
2.1.1.
Example
2.1.2.
Parsing table 
2.2.
Operator Precedence Parsing
2.2.1.
Precedence table
2.2.2.
Parsing Action
2.3.
Table-driven LR Parsing
2.3.1.
LR algorithm
2.3.2.
LR (1) Parsing
2.3.3.
Augment Grammar
3.
LL vs. LR
4.
Frequently Asked Questions
4.1.
How does bottom-up parsing work?
4.2.
Write a drawback of the LR parser?
4.3.
What are the algorithms for constructing an LR parser?
4.4.
What is the reduce step in shift-reduce parsing?
4.5.
What are the features of a simple LR Parser?
5.
Conclusion
Last Updated: Jul 3, 2024
Easy

Bottom-up Parsing

Author Muskan Gupta
2 upvotes
Career growth poll
Do you think IIT Guwahati certified course can help you in your career?

Bottom-up parsing begins at the terminal symbols (leaf nodes) of a parse tree and ascends towards the root node, combining smaller units into larger ones until the entire structure is formed. This method is commonly used in syntax analysis to construct parse trees for input strings.

This blog covers the concept of bottom-up parsing, Shift Reduce Parsing, Operator Precedence Parsing, and table-driven L-R Parsing.

Compiler Design

What is Bottom up parsing?

This parser compresses the non-terminals where it starts and moves backwards to the starting symbol of the parse tree using grammar productions to draw the input string's parse tree. 

  • In this technique, parsing starts from the leaf node of the parse tree to the start symbol of the parse tree in a bottom-up manner.
  • Bottom-up parsing attempts to construct a parse tree by reducing input string and using Right Most Derivation.
  • Bottom-up parsing starts with the input symbol and constructs the parse tree up to the start symbol using a production rule to reduce the string to get the starting symbol.

Example

  1. E → T  
  2. T → T * F  
  3. T → id  
  4. F → T  
  5. F → id  

Parse Tree representation of input string "id * id" is as follows:

     

     

 

   

   

 

Also See, Specifications of Tokens in Compiler Design

Classification of Bottom-up parsing

Bottom-up parsing has been classified into various parsing. These are as follows:

  1. Shift-Reduce Parsing
  2. Operator Precedence Parsing
  3. Table Driven LR Parsing
     
Classification of Bottom-up parsing

Classification of Bottom-up parsing

Shift Reduce Parsing

Shift reduce parsing is a process to reduce a string to its grammar start symbol. It uses a stack to hold the grammar and an input tape to hold the string. Shift reduce parsing performs the two actions: shift and reduce. This is why it is known as shift-reduce parsing. The current symbol in the input string is pushed to a stack at the shift action. At each reduction, the symbols will be replaced by the non-terminals. The non-terminal is the left side of the output, and the symbol is the right side of the production.

Example

Grammar:

  1. S → S+S    
  2. S → S-S    
  3. S → (S)  
  4. S → a  

Input string: a1-(a2+a3) 

Parsing table 

Stack contents Input String Actions
$ a1-(a2+a3)$ Shift a1
$a1 -(a2+a3)$ Reduce by s->a
$S -(a2+a3)$ Shift -
$S- (a2+a3)$ Shift (
$S-( a2+a3)$ Shift a2
$S-(a2 +a3)$ Reduce by S->a
$S-(S +a3)$ Shift +
$S-(S+ a3)$ Shift a3
$S-(S+a3 )$ Reduce by S->a
$S-(S+S )$ shift)
$S-(S+S) $ Reduce by S->S+S
$S-(S) $ Reduce by S->(S)
$S-S $ Reduce by S->S-S
$S $ Accept

There are two main categories in shift-reduce parsing as follows:

  1. Operator-Precedence Parsing
  2. LR-Parser

Operator Precedence Parsing

Operator precedence grammar is a category in the shift-reduce method of parsing. It is applied to a class of grammars operators. Operator precedence grammar must have the following two properties:

  • No RHS of any product has a∈.
  • Two non-terminals must not be adjacent.

Operator precedence can only be fixed between the terminals of the grammar. It generally ignores the non-terminal.

There are the three operator precedence relations:

a ⋗ b means that terminal "a" has higher precedence than terminal "b."

a ⋖ b means that terminal "a" has lower priority than terminal "b."

a ≐ b means that the terminal "a" and "b" both have the same precedence.

Precedence table

Following is a precedence table according to which grammar of the operator precedence parsing works.

Precedence table

Parsing Action

The sequence in which parsing action is performed in operator precedence parsing is as:

  • At first, the $ symbol is added to both ends of the string.
  • Now we scan the input string from left to right until the character ⋗ is encountered.
  • Scanning is done towards leftover all the equal precedence until the first leftmost ⋖ is encountered.
  • Everything between leftmost ⋖ and rightmost ⋗ is a handle.
  •  If it is $ on $, then it means parsing is successful.

Table-driven LR Parsing

LR parsing is also a category of Bottom-up parsing. It is generally used to parse the class of grammars whose size is large. In the LR parsing, "L" stands for scanning of the input left-to-right, and "R" stands for constructing a rightmost derivation in a reverse way.

"K" stands for the count of input symbols of the look-ahead that are used to make many parsing decisions.

Types of LR Parsers:

  1. LR( 1 )
  2. SLR( 1 )
  3. CLR ( 1 )
  4. LALR( 1 )

The L stands for scan operations of Bottom-up parsing left-to-right, and R stands for scan right-to-left in Bottom-up parsing. 

LR bottom-up parsing parsers have several advantages of Bottom-up parsing like:

  • Used in many programming languages except Perl and C.
  • Efficient implementation.
  • In L-operations, they detect quickly any syntactic errors present in a bottom-up parsing example.

LR parsing is divided into four categories of Parsing:

  • LR (0) parsing
  • SLR parsing
  • CLR parsing 
  • LALR parsing
     
Types of LR Parser

Types of LR parser

LR algorithm

The LR algorithm requires input, output, stack, and parsing tables. In all types of LR parsing, input, output, and stack are the same, but the parsing table is different. The input buffer indicates the end of input data, and it has the string to be parsed. The symbol of "$ follows that string."A stack is used to contain grammar symbols' sequence with the symbol $ at the stack's Bottom.

A parsing table can be defined as an array of two-dimension. It usually contains two parts: the action part and the go-to part.

LR (1) Parsing

The various steps involved in the LR (1) Parsing are as follows:

  • At first, context-free grammar is written for the given input.
  • The ambiguity of the grammar is checked.
  • Augment production is added in the given grammar.
  • A canonical collection of LR (0) items are created.
  • A data flow diagram is drawn.
  • An LR (1) parsing table is created.

Augment Grammar

Augmented grammar will be generated if we add one more product in the given grammar G. It helps the parser identify when to stop the parsing and announce the acceptance of the input. 

LL vs. LR

LL (Left-to-Right, Leftmost Derivation) and LR (Left-to-Right, Rightmost Derivation) are two types of parsing techniques used in compiler construction and formal language theory. They belong to the broader category of shift-reduce parsing algorithms. Here's a comparison between LL and LR parsing:

Feature LL Parsing LR Parsing
Type Top-down parsing Bottom-up parsing
Derivation Leftmost Rightmost
Strategy Predictive Shift-reduce
Complexity Simpler, easier to understand and implement More complex, requires parsing tables
Power Limited power, handles simpler grammars More powerful, handles wider range of grammars
Lookahead Often requires more lookahead symbols Often requires fewer lookahead symbols
Error Handling Less efficient at error recovery More efficient at error recovery
Example Recursive descent parsing LR(0), SLR(1), LALR(1), CLR(1) parsing

Frequently Asked Questions

How does bottom-up parsing work?

Bottom-up parsers work by "shifting" symbols onto a stack till the stack top contains a right-hand side of a production, and then the stack is reduced by replacing the right-hand side of production with the left-hand side.

Write a drawback of the LR parser?

A major drawback of an LR parser is that it is too much work to construct an LR parser by hand.

What are the algorithms for constructing an LR parser?

There are mainly three such algorithms:
SLR(1) - Simple LR Parser
LR(1) - LR Parser
LALR(1) - Look-Ahead LR Parser

What is the reduce step in shift-reduce parsing?

When the parser finds a complete grammar rule in RHS and replaces it with LHS, it is a reduced step.

What are the features of a simple LR Parser?

It works on the smallest class of grammar, and they have simple and fast calculations.

Conclusion

In this article, we have discussed the concepts of bottom-up parsing, shift-reduce Parsing, operator precedence parsing, and table-driven LR parsing and frequently asked questions. 

Recommended Reading:

Do check out The Interview guide for Product Based Companies as well as some of the Popular Interview Problems from Top companies like Amazon, Adobe, Google, Uber, Microsoft, etc. on Coding Ninjas Studio.

Also check out some of the Guided Paths on topics such as Data Structure and Algorithms, Competitive Programming, Operating Systems, Computer Networks, DBMS, System Design, etc. as well as some Contests, Test Series, Interview Bundles, and some Interview Experiences curated by top Industry Experts only on Coding Ninjas Studio.

Live masterclass