Code360 powered by Coding Ninjas X Naukri.com. Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1.
Introduction
2.
What is the LR parser?
3.
LR Algorithm
3.1.
Block Diagram
3.2.
Parsing Table
3.2.1.
ACTION Table
3.2.2.
GOTO Table
3.3.
LR Parser Steps 
3.4.
Example
4.
Augment Grammar
4.1.
Steps to Augment a Grammar
4.2.
Example
4.3.
Parsing with the Augmented Grammar
5.
Frequently Asked Questions
5.1.
Define parser.
5.2.
What is LR parsing?
5.3.
What are the types of LR parsing?
5.4.
What is the difference between LR(0) and SLR?
5.5.
What is LR(0) parsing technique?
5.6.
What is a CLR parser?
6.
Conclusion
Last Updated: May 20, 2024
Easy

LR Parsing

Author Pakhi Garg
1 upvote
Master Python: Predicting weather forecasts
Speaker
Ashwin Goyal
Product Manager @

Introduction

The parser is a compiler phase that accepts a token string as input and turns it into the matching intermediate representation using existing grammar. A parser is also known as a syntax analyzer. There are two types of parsing techniques in compiler design: Top-down parsing and Bottom-up parsing.

  • Top-Down Parsing- It is a parsing strategy that starts at the top of the parse tree and works its way down by applying grammatical rules.
  • Bottom-up Parsing- It is a parsing strategy that begins with the lowest level of the parse tree and works its way up using grammatical rules.

This article will learn a type of bottom-up parsing, i.e., LR parsing.

So let’s start.

Compiler Design

Also See, Specifications of Tokens in Compiler Design

What is the LR parser?

LR Parser is a type of bottom-up parsing. It is used to parse context-free grammar, primarily used by compilers. The LR parser reads the input from left to right and produces the rightmost derivation. 

The LR parsing can be further divided into four types-

LR Parser

The LR parser is denoted as LR(k), where the letter “L” refers to the left to right scanning of the input, the letter “R” refers to the rightmost derivation in reverse, and “k” refers to the number of unconsumed “look ahead” input symbols that are used in making parser decisions. The letter k is usually one and is frequently omitted. A context-free grammar is called LR(k) if the LR(k) parser exists.

Get the tech career you deserve, faster!
Connect with our expert counsellors to understand how to hack your way to success
User rating 4.7/5
1:1 doubt support
95% placement record
Akash Pal
Senior Software Engineer
326% Hike After Job Bootcamp
Himanshu Gusain
Programmer Analyst
32 LPA After Job Bootcamp
After Job
Bootcamp

LR Algorithm

The LR (Left-to-Right) algorithm, commonly referred to in the context of parsing, is a technique used to analyze a sequence of input symbols to determine its grammatical structure with respect to a given formal grammar. It is widely used in the construction of compilers and interpreters.

  1. Initialization: Start with an empty stack and place a special end-of-input marker ($) at the bottom. Place the start symbol of the grammar on top of the stack.
  2. Read Input: Read the input string from left to right, symbol by symbol.
  3. Shift and Reduce Operations: Shift, move the current input symbol to the stack and advance to the next input symbol. Reduce, if the stack matches the right-hand side of a production rule, replace it with the left-hand side non-terminal.
  4. Parser Actions: Use a parsing table to decide whether to shift, reduce, accept, or error. Accept, if the stack contains the start symbol and the input is fully consumed, the string is successfully parsed. Error, if no valid action exists for the current stack and input symbol, report a syntax error.
  5. Termination: The algorithm terminates when the input is successfully parsed (accept state) or a syntax error is encountered.

Next, we will discuss the block diagram of LR parsing.

Block Diagram

Below described is the block diagram of the LR parser.

Block Diagram

The input buffer takes one symbol at a time from left to right a1, a2,..., am,$. 

The stack contains the combination of the state symbol and the current input symbol.

The parsing table is a two-dimensional array containing the Action and GoTo tables.

Next, we will discuss the Parsing Table.

Also Read - Symbol Table Operations

Parsing Table

The parsing table is a two-dimensional array consisting of two tables- the ACTION table and the GOTO table.

Let’s see them one by one.

ACTION Table

The Action table specifies the grammar rules for implementing the input stream’s current state and terminal. There are four cases used in the Action table. These are-

  1. Shift- This operation involves moving the current symbol and the current state from the input buffer onto the stack. So, it becomes the new present state.
  2. Reduce- 
    1. The number is written to the output stream.
    2. The state is deleted from the stack, according to the symbol mentioned on the left-hand side of rule m.
    3. The symbol mentioned on the left-hand side of rule indicates that a new state is looked up in the goto table and pushed onto the stack as the new current state.
  3. Accept- After repeating the shift and reduce operations, if the stack contains the starting symbol of the input string and the input buffer is empty, i.e., includes the $ symbol, the input string is said to be accepted.
  4. Error- If the parser cannot perform the shift or the reduce operation, also the string is not accepted, then it is said to be in the error state.

This function is denoted by (Sm, ai), where Sm represents the four cases defined above and ai the current input symbol.

GOTO Table

The GOTO Table in the parsing table tells the next state where the parsing should proceed.

This function is denoted by (Sm, ai), where Sm represents the four cases defined above and ai the current input symbol.

Next, we will discuss the steps of LR parsing.

LR Parser Steps 

Initially put 0 in the stack and the input string with the $ symbol in the input buffer. Then follow the following steps to process the LR parsing-

  1. If the parsing table consists of the ACTION[Sm, ai] function equal to shift s, execute the shift operation. Push the current input symbol ai and the next state S = GOTO[Sm, ai] to the stack. The symbol ai+1 will become the new current input symbol.
  2. If the parsing table consists of the ACTION[Sm, ai] function equals to reduce A → B, then execute the reduce operation, entering the configuration (S0S1……Sm-rS, aiai+1…..an$) where r is the length of the right side of the production and S = GOTO[Sm-r, A]. 
    The parser will pop ‘2r’ symbols from the stack to expose Sm-r, then push both ‘A’ (left side of production) and ‘S’ (entry for ACTION[Sm-r, A]) onto the stack.
    Note: The current input symbol is not changed in the reduce operation.
  3. If the parsing table consists of the ACTION[Sm, ai] function equal to accept, then the parsing is said to be completed, and the input string is accepted.
  4. If the parsing table consists of the ACTION[Sm, ai] function equals to error means the parsing has encountered an error. So, the error routine will be called.

Now, we will discuss an example to understand the LR parsing.

Example

Question: Consider the following grammar

E → E + T

E → T

T → T x F

T → F

F → (E)

F → id

The ACTION and GOTO functions of the LR parsing table is given as follows:

 

STATE

ACTION

GOTO

id

+

x

(

)

$

E

T

F

0

s5

  

s4

  

1

2

3

1

 

s6

   

A

   

2

 

r2

s7

 

r2

r2

   

3

 

r4

r4

 

r4

r4

   

4

s5

  

s4

  

8

2

3

5

 

r6

r6

 

r6

r6

   

6

s5

  

s4

   

9

3

7

s5

  

s4

    

10

8

 

s6

  

s11

    

9

 

r1

s7

 

r1

r1

   

10

 

r3

r3

 

r3

r3

   

11

 

r5

r5

 

r5

r5

   

Using the ACTION and GOTO function table, parse the input string “idxid+id”.

Solution: First, we will label the given grammar.

(1) E → E + T

(2) E → T

(3) T → T x F

(4) T → F

(5) F → (E)

(6) F → id

Second, we will take a stack and input buffer and put 0 (initial state) and the input string in them respectively. 

 StackSymbolInput BufferParsing Action
(1)0 idxid+id$ 

 

At the initial state 0, we have s5 for id in the ACTION function of the parsing table. So, we will push id to the symbol column and 5 to the stack.

 StackSymbolInput BufferParsing Action
(1)0 idxid+id$Shift
(2)0 5idxid+id$ 

 

At state 5, we have r6 operation. So, we will reduce id by F → id and pop 5 out of the stack. We will also push 3 to the stack since for state 0 (now the symbol at the top of the stack), the GOTO function says to go to state 3 for F.

 StackSymbolInput BufferParsing Action
(1)0 idxid+id$Shift
(2)0 5idxid+id$Reduce F → id
(3)0 3Fxid+id$ 

 

At state 3, we have the r4 operation for the symbol. So, we will reduce F by T → F and pop 3 out of the stack.  We will also push 2 to the stack since for state 0 (now the symbol at the top of the stack), the GOTO function says to go to state 2 for T.

 StackSymbolInput BufferParsing Action
(1)0 idxid+id$Shift
(2)0 5idxid+id$Reduce F → id
(3)0 3Fxid+id$Reduce T → F
(4)0 2Txid+id$ 

 

At state 2, we have the s7 operation for symbol x. So, we will push x to the symbol column and 7 to the stack.

 StackSymbolInput BufferParsing Action
(1)0 idxid+id$Shift
(2)0 5idxid+id$Reduce F → id
(3)0 3Fxid+id$Reduce T → F
(4)0 2Txid+id$Shift
(5)0 2 7Txid+id$ 

 

At state 7, we have s5 operation for symbol id and s4 operation for symbol (. Since the current input symbol is id, we will go with the s5 operation. So, we will push id to the symbol column and 5 to the stack.

 StackSymbolInput BufferParsing Action
(1)0 idxid+id$Shift
(2)0 5idxid+id$Reduce F → id
(3)0 3Fxid+id$Reduce T → F
(4)0 2Txid+id$Shift
(5)0 2 7T xid+id$Shift
(6)0 2 7 5T x id+id$ 

 

At state 5, we have an r6 operation for the symbol. So, we will reduce id by F → id and pop 5 out of the stack. We will also push 10 to the stack since for state 7 (now the symbol at the top of the stack), the GOTO function says to go to state 10 for F.

 StackSymbolInput BufferParsing Action
(1)0 idxid+id$Shift
(2)0 5idxid+id$Reduce F → id
(3)0 3Fxid+id$Reduce T → F
(4)0 2Txid+id$Shift
(5)0 2 7T xid+id$Shift
(6)0 2 7 5T x id+id$Reduce F → id
(7)0 2 7 10T x F+id$ 

 

At state 10, we have r3 operation. So, we will reduce T x F by T → T x F and pop all the states till T i.e. till 2 from the stack. We will also push 2 to the stack since for state 0 (now the symbol at the top of the stack), the GOTO function says to go to state 3 for F.

 StackSymbolInput BufferParsing Action
(1)0 idxid+id$Shift
(2)0 5idxid+id$Reduce F → id
(3)0 3Fxid+id$Reduce T → F
(4)0 2Txid+id$Shift
(5)0 2 7T xid+id$Shift
(6)0 2 7 5T x id+id$Reduce F → id
(7)0 2 7 10T x F+id$Reduce T → T x F
(8)0 2T+id$ 

 

At state 2, we have s7 operation for the x symbol and r2 operation for +, ) and $ symbol. Since the current input symbol is +, we will go with r2. So, we will reduce T by E → T and pop 2 from the stack. We will also push 1 to the stack since for state 0 (now the symbol at the top of the stack), the GOTO function says to go to state 1 for E.

 StackSymbolInput BufferParsing Action
(1)0 idxid+id$Shift
(2)0 5idxid+id$Reduce F → id
(3)0 3Fxid+id$Reduce T → F
(4)0 2Txid+id$Shift
(5)0 2 7T xid+id$Shift
(6)0 2 7 5T x id+id$Reduce F → id
(7)0 2 7 10T x F+id$Reduce T → T x F
(8)0 2T+id$Reduce E → T
(9)0 1E+id$ 

 

At state 1, we have an s6 operation for the + symbol. So, we will push + to the symbol column and 6 to the stack.

 StackSymbolInput BufferParsing Action
(1)0 idxid+id$Shift
(2)0 5idxid+id$Reduce F → id
(3)0 3Fxid+id$Reduce T → F
(4)0 2Txid+id$Shift
(5)0 2 7T xid+id$Shift
(6)0 2 7 5T x id+id$Reduce F → id
(7)0 2 7 10T x F+id$Reduce T → T x F
(8)0 2T+id$Reduce E → T
(9)0 1E+id$Shift
(10)0 1 6E +id$ 

 

At state 6, we have an s5 operation for the id symbol and an s4 operation for ( symbol. Since the current input symbol is id, we will go with the s5 operation. So, we will push id to the symbol column and 5 to the stack.

 StackSymbolInput BufferParsing Action
(1)0 idxid+id$Shift
(2)0 5idxid+id$Reduce F → id
(3)0 3Fxid+id$Reduce T → F
(4)0 2Txid+id$Shift
(5)0 2 7T xid+id$Shift
(6)0 2 7 5T x id+id$Reduce F → id
(7)0 2 7 10T x F+id$Reduce T → T x F
(8)0 2T+id$Reduce E → T
(9)0 1E+id$Shift
(10)0 1 6E +id$Shift
(11)0 1 6 5E + id$ 

 

At state 5, we have r6 operation. So, we will reduce id by F → id and pop 5 from the stack. We will also push 3 to the stack since for state 6 (now the symbol at the top of the stack), the GOTO function says to go to state 3 for F.

 StackSymbolInput BufferParsing Action
(1)0 idxid+id$Shift
(2)0 5idxid+id$Reduce F → id
(3)0 3Fxid+id$Reduce T → F
(4)0 2Txid+id$Shift
(5)0 2 7T xid+id$Shift
(6)0 2 7 5T x id+id$Reduce F → id
(7)0 2 7 10T x F+id$Reduce T → T x F
(8)0 2T+id$Reduce E → T
(9)0 1E+id$Shift
(10)0 1 6E +id$Shift
(11)0 1 6 5E + id$Reduce F → id
(12)0 1 6 3E + F$ 

 

At state 3, we have an r4 operation. So, we will reduce F by T → F and pop 3 from the stack. We will also push 9 to the stack since for state 6 (now the symbol at the top of the stack), the GOTO function says to go to state 9 for T.

 StackSymbolInput BufferParsing Action
(1)0 idxid+id$Shift
(2)0 5idxid+id$Reduce F → id
(3)0 3Fxid+id$Reduce T → F
(4)0 2Txid+id$Shift
(5)0 2 7T xid+id$Shift
(6)0 2 7 5T x id+id$Reduce F → id
(7)0 2 7 10T x F+id$Reduce T → T x F
(8)0 2T+id$Reduce E → T
(9)0 1E+id$Shift
(10)0 1 6E +id$Shift
(11)0 1 6 5E + id$Reduce F → id
(12)0 1 6 3E + F$Reduce T → F
(13)0 1 6 9E + T$ 

 

At state 9, we have s7 operation for x and r1 operation for +, ) and $. Since $ is the current input symbol, we will go with r1. So, we will reduce E + T by E → E + T and pop all the states till E, i.e., till 6 from the stack. And we will push 1 into the stack.

 StackSymbolInput BufferParsing Action
(1)0 idxid+id$Shift
(2)0 5idxid+id$Reduce F → id
(3)0 3Fxid+id$Reduce T → F
(4)0 2Txid+id$Shift
(5)0 2 7T xid+id$Shift
(6)0 2 7 5T x id+id$Reduce F → id
(7)0 2 7 10T x F+id$Reduce T → T x F
(8)0 2T+id$Reduce E → T
(9)0 1E+id$Shift
(10)0 1 6E +id$Shift
(11)0 1 6 5E + id$Reduce F → id
(12)0 1 6 3E + F$Reduce T → F
(13)0 1 6 9E + T$Reduce E → E + T
(14)0 1E$ 

 

At state 1, we have s6 operation for + and A i.e. accept for $. Since the current input symbol is $, the input string will be accepted, and the parsing will be completed. 

 StackSymbolInput BufferParsing Action
(1)0 idxid+id$Shift
(2)0 5idxid+id$Reduce F → id
(3)0 3Fxid+id$Reduce T → F
(4)0 2Txid+id$Shift
(5)0 2 7T xid+id$Shift
(6)0 2 7 5T x id+id$Reduce F → id
(7)0 2 7 10T x F+id$Reduce T → T x F
(8)0 2T+id$Reduce E → T
(9)0 1E+id$Shift
(10)0 1 6E +id$Shift
(11)0 1 6 5E + id$Reduce F → id
(12)0 1 6 3E + F$Reduce T → F
(13)0 1 6 9E + T$Reduce E → E + T
(14)0 1E$Accept

Augment Grammar

Augmented grammar involves modifying a given grammar to facilitate certain parsing techniques, especially in the context of LR parsing. By augmenting a grammar, we add an extra production rule and a new start symbol, which helps in managing the parsing process more efficiently.

The primary goal of augmenting a grammar is to introduce a unique start state and handle the end-of-input marker explicitly. This modification ensures that the parser can recognize the completion of the input string and correctly handle the acceptance of the input.

Steps to Augment a Grammar

  • Introduce a New Start Symbol: Create a new start symbol (S') that is not part of the original grammar.
  • Add a New Production Rule: Add a production rule from the new start symbol to the original start symbol of the grammar (S' -> S).
  • Update the Grammar: Ensure that the new production rule and start symbol are incorporated into the existing set of rules.

Example

Given a simple grammar:

S -> aSb | ε

1. Introduce a New Start Symbol: Let S' be the new start symbol.

2. Add a New Production Rule: Add the production S' -> S.

3. Update the Grammar: The augmented grammar becomes:

S' -> S
S -> aSb | ε

Parsing with the Augmented Grammar

Using the augmented grammar, an LR parser can effectively determine the end of parsing when the input is completely reduced to the new start symbol S', ensuring accurate and efficient parsing of the input string.

Frequently Asked Questions

Define parser.

The parser is a compiler phase that accepts a token string as input and turns it into the matching intermediate representation using existing grammar. A parser is also known as a syntax analyzer.

What is LR parsing?

LR Parsing is a type of bottom-up parsing. It is used to parse context-free grammar, mostly used by compilers. The LR parser reads the input from left to right and produces the rightmost derivation. 

What are the types of LR parsing?

There are four types of LR parsing - LR(0), SLR(1), LALR(1) and CLR(1).

What is the difference between LR(0) and SLR?

LR(0) parsing uses only the state and the current input symbol, while SLR (Simple LR) parsing considers the follow sets of the non-terminals for making reduction decisions, thus handling more grammars and reducing conflicts.

What is LR(0) parsing technique?

LR(0) parsing technique involves constructing a DFA from LR(0) items, then parsing the input string by shifting symbols and reducing based on the constructed automaton, without considering lookahead symbols.

What is a CLR parser?

A CLR (Canonical LR) parser uses LR(1) items, which include lookahead symbols, providing more context and reducing conflicts compared to LR(0) and SLR parsers, making it more powerful and capable of handling a wider range of grammars.

Conclusion

In this article, we have studied the LR parser. LR parsers, including LR(0), SLR, and CLR variants, are powerful tools in compiler design, providing efficient and accurate syntactic analysis for a wide range of programming languages.

Recommended Reading:

Do check out The Interview guide for Product Based Companies as well as some of the Popular Interview Problems from Top companies like Amazon, Adobe, Google, Uber, Microsoft, etc. on Coding Ninjas Studio.

Also check out some of the Guided Paths on topics such as Data Structure and Algorithms, Competitive Programming, Operating Systems, Computer Networks, DBMS, System Design, etc. as well as some Contests, Test Series, Interview Bundles, and some Interview Experiences curated by top Industry Experts only on Coding Ninjas Studio.

Happy Learning!

Previous article
Operator Precedence Parsing
Next article
Leading and Trailing in Compiler Design
Live masterclass