Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1. Introduction
2. What is a compiler?
3. Phases of Compiler
4. 1. Lexical Analysis
5. 2. Syntax Analysis
6. 3. Semantic Analysis
7. 4. Intermediate Code Generation
8. 5. Code Optimization
9. 6. Target Code Generator
10. What is a Symbol Table?
11. Error Handling Routine
  11.1. Error Detection
  11.2. Error Reporting
12. Frequently Asked Questions
  12.1. What are the stages in compilation process?
  12.2. What are the two reasons as to why phases of compiler should be grouped?
  12.3. How phases are grouped in compiler?
  12.4. What are the three types of compilers?
  12.5. What are the 5 steps of the compilation process?
13. Conclusion
Last Updated: Aug 27, 2024

Phases of a Compiler

Author Juhi Sinha

Introduction

A compiler is software that translates code written in a high-level language into machine code. The translation does not happen in one step: the code passes through several intermediate stages on the way from one language to the other.

Compilation is divided into two broad parts:

  1.  Analysis (Machine Independent/Language Dependent)
  2.  Synthesis (Machine Dependent/Language-Independent)
Figure: Phases of a compiler


In this article, we will learn about the Phases of a Compiler. So without any further ado, let's get started!

What is a compiler?

A compiler is a software tool that translates code written in a high-level programming language into machine code that a computer's hardware can understand. It performs this translation in several stages, including lexical analysis, parsing, semantic analysis, optimization, and code generation. The primary goal of a compiler is to convert the source code into an executable program, optimizing the code for performance and efficiency along the way. This allows programmers to write in human-readable languages while still producing fast, efficient machine-level code.

Phases of Compiler

The phases of a compiler are as follows:

  1. Lexical Analysis
  2. Syntax Analysis
  3. Semantic Analysis
  4. Intermediate Code Generation
  5. Code Optimization
  6. Target Code Generator

Figure: Phases of a compiler (flowchart)

1. Lexical Analysis

The lexical analyser reads the character stream of the source programme and groups the characters into meaningful sequences called lexemes, producing a token for each one. This phase is also called scanning. The lexical analyser enters identifiers into the symbol table and passes the tokens to the next phase, the syntax analyser.
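
The grouping of characters into tokens can be sketched with regular expressions. This is a minimal illustration, not how any particular compiler does it: the token kinds (NUMBER, IDENT, OP, SEMI) and the function name tokenize are assumptions made for this example.

```python
import re

# Hypothetical token kinds for a tiny C-like language (assumed for illustration).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("SEMI",   r";"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Group the character stream into (kind, lexeme) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            # No token kind matches here: this is a lexical error.
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":   # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling tokenize("x = y+z;") yields six tokens: three identifiers, two operators, and the semicolon. A real scanner would also record line and column numbers for error messages.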


2. Syntax Analysis

This phase is also known as parsing. It uses the tokens from the previous phase to build an intermediate tree-like data structure known as the syntax tree. Each interior node represents an operator, and the node's children represent that operator's operands. The main purpose of this phase is to check that the statements follow the grammar rules of the language. If they do not, the parser reports a syntax error.

For example:

x = y+z;   (correct)

y+z = x;   (produces an error, because an expression cannot appear on the left of an assignment)
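
A hand-written parser for this one statement form might look like the sketch below. It assumes a made-up grammar (an identifier, "=", operands joined by + or -, then ";") and assumes tokens arrive as (kind, lexeme) tuples with the kinds IDENT, NUMBER, OP and SEMI; none of these names come from a real compiler.

```python
def parse_assignment(tokens):
    """Parse `IDENT = operand ((+|-) operand)* ;` into a nested-tuple syntax tree.

    tokens is a list of (kind, lexeme) tuples, an assumed format for this sketch.
    """
    pos = 0

    def expect(kind, lexeme=None):
        nonlocal pos
        wanted = lexeme or kind
        if pos >= len(tokens):
            raise SyntaxError(f"expected {wanted}, found end of input")
        found_kind, found_lexeme = tokens[pos]
        if found_kind != kind or (lexeme is not None and found_lexeme != lexeme):
            raise SyntaxError(f"expected {wanted}, found {found_lexeme!r}")
        pos += 1
        return found_lexeme

    def operand():
        nonlocal pos
        if pos < len(tokens) and tokens[pos][0] in ("IDENT", "NUMBER"):
            pos += 1
            return tokens[pos - 1][1]
        raise SyntaxError("expected an identifier or a number")

    target = expect("IDENT")            # left side must be a single name
    expect("OP", "=")
    tree = operand()
    while pos < len(tokens) and tokens[pos] in (("OP", "+"), ("OP", "-")):
        operator = tokens[pos][1]
        pos += 1
        tree = (operator, tree, operand())   # operator node with two operand children
    expect("SEMI")
    return ("=", target, tree)
```

With this grammar, the tokens of x = y+z; produce the tree ("=", "x", ("+", "y", "z")), while the tokens of y+z = x; are rejected because "+" appears where "=" is required.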

3. Semantic Analysis

The semantic analyser checks if the source code is semantically consistent, that is, if it conveys the intended meaning, using the syntax tree from the previous phase and the symbol table.

Type-checking is one of the most important functions of the semantic analyser. If the language permits implicit type conversions, known as type coercions, the semantic analyser applies them as well. It reports a semantic error if there is a type mismatch and no coercion rule allows the requested operation.

For example, the semantic analyser reports an error when an integer and a string are added in a language that defines no coercion between the two types.
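
The type check described above can be sketched as a walk over the syntax tree. The sketch assumes a tree of nested tuples (operator, left, right) with names as leaves, a plain dictionary standing in for the symbol table, and a single made-up coercion rule (int to float); the type names are illustrative only.

```python
# Hypothetical rule set: the only implicit conversion (coercion) is int -> float.
COERCIONS = {("int", "float"): "float", ("float", "int"): "float"}

def type_of(node, symbols):
    """Return the type of an expression tree, or raise TypeError on a mismatch."""
    if isinstance(node, tuple):                 # interior node: (operator, left, right)
        operator, left, right = node
        left_type = type_of(left, symbols)
        right_type = type_of(right, symbols)
        if left_type == right_type:             # simplification: any operator accepts equal types
            return left_type
        if (left_type, right_type) in COERCIONS:
            return COERCIONS[(left_type, right_type)]
        raise TypeError(f"cannot apply {operator!r} to {left_type} and {right_type}")
    if node not in symbols:                     # leaf: a name looked up in the symbol table
        raise NameError(f"{node!r} is not declared")
    return symbols[node]
```

Given symbols = {"count": "int", "ratio": "float", "name": "string"}, the expression ("+", "count", "ratio") has type float through coercion, while ("+", "count", "name") raises a TypeError because no rule covers int and string.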

4. Intermediate Code Generation

After semantic analysis, the compiler generates an intermediate representation of the source code. This intermediate code can be thought of as a programme for an abstract machine: it sits between the high-level language and the machine language. It should be easy to produce and easy to translate into the target machine code.
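
One common form of intermediate code is three-address code, in which each instruction has at most one operator. The sketch below is one possible way to produce it, assuming the syntax tree is a nested tuple ("=", target, expression) and using invented temporary names t1, t2, and so on.

```python
def to_three_address(tree):
    """Flatten ("=", target, expression) into a list of three-address instructions."""
    code = []
    counter = 0

    def emit(node):
        nonlocal counter
        if not isinstance(node, tuple):          # leaf: a name or literal needs no code
            return node
        operator, left, right = node
        left_place = emit(left)                  # generate code for the operands first
        right_place = emit(right)
        counter += 1
        temp = f"t{counter}"                     # fresh temporary holds this node's value
        code.append(f"{temp} = {left_place} {operator} {right_place}")
        return temp

    _, target, expression = tree
    result = emit(expression)
    code.append(f"{target} = {result}")
    return code
```

For the tree ("=", "x", ("+", "y", ("*", "z", "2"))) this returns the three instructions t1 = z * 2, t2 = y + t1 and x = t2.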

5. Code Optimization

The next phase optimises the intermediate code. Optimisation removes unnecessary instructions and rearranges statement sequences so that the programme runs faster or uses fewer resources, without changing what it computes.
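
Two simple optimisations, constant folding and dead-code elimination, can be sketched for a straight-line block of code. The instruction format (dest, left, op, right) and the live_out parameter are assumptions made for this example; real optimisers work on richer representations and handle branches and loops.

```python
import operator

FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def optimise(instructions, live_out):
    """Optimise a straight-line block.

    instructions: list of (dest, left, op, right); op and right are None for copies.
    live_out: names whose values are still needed after the block.
    """
    constants = {}
    folded = []
    for dest, left, op, right in instructions:
        left = constants.get(left, left)          # constant propagation
        right = constants.get(right, right)
        if op in FOLDABLE and isinstance(left, int) and isinstance(right, int):
            left, op, right = FOLDABLE[op](left, right), None, None   # constant folding
        if op is None and isinstance(left, int):
            constants[dest] = left
        else:
            constants.pop(dest, None)             # dest no longer holds a known constant
        folded.append((dest, left, op, right))

    # Dead-code elimination: walk backwards, keeping only instructions whose result is used.
    needed = set(live_out)
    kept = []
    for dest, left, op, right in reversed(folded):
        if dest in needed:
            needed.discard(dest)
            needed.update(x for x in (left, right) if isinstance(x, str))
            kept.append((dest, left, op, right))
    return kept[::-1]
```

For example, a block that computes t1 = 2 * 3, t2 = y + t1, unused = y - 1 and x = t2, with only x needed afterwards, shrinks to t2 = y + 6 followed by x = t2.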

6. Target Code Generator

In this phase, the code generator maps the optimised intermediate code to the target machine language. It converts the intermediate code into a relocatable sequence of machine instructions that performs the same task as the intermediate code.
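
The mapping from intermediate code to target instructions can be sketched for an imaginary single-register machine. The mnemonics LOAD, ADD, SUB, MUL and STORE and the register name R0 are invented for this example and do not correspond to a real instruction set; the input format (dest, left, op, right) is likewise an assumption.

```python
# Hypothetical single-register target with made-up mnemonics; not a real instruction set.
MNEMONICS = {"+": "ADD", "-": "SUB", "*": "MUL"}

def generate(instructions):
    """Translate (dest, left, op, right) three-address tuples into assembly-like text."""
    assembly = []
    for dest, left, op, right in instructions:
        assembly.append(f"LOAD R0, {left}")                   # first operand into the register
        if op is not None:
            assembly.append(f"{MNEMONICS[op]} R0, {right}")   # R0 = R0 <op> right
        assembly.append(f"STORE {dest}, R0")                  # write the result back to memory
    return assembly
```

The instruction t2 = y + 6 becomes LOAD R0, y then ADD R0, 6 then STORE t2, R0. A real code generator would also choose among many registers and avoid reloading values it already holds.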

What is a Symbol Table?

• A symbol table contains a record for each name in the source programme (variables, procedures, and so on), with fields for the name's attributes.

• The data structure should be created so that the compiler can quickly locate the record for each name and store or retrieve data from that record.

• These attributes may provide information about the name's storage allocation, type, scope, and, in the case of procedure names, the number and types of the procedure's arguments, the method of passing each argument, and the type returned.

Operation        Function
allocate         to allocate a new empty table
free             to remove all entries and free storage of the symbol table
lookup           to search for a name and return a pointer to its entry
insert           to insert a name in a symbol table and return a pointer to its entry
set_attribute    to associate an attribute with a given entry
get_attribute    to get an attribute associated with a given entry
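
The operations in the table can be sketched as a small class backed by a hash table, which gives the fast lookup the compiler needs. This is one possible layout, not a prescribed one: the class name SymbolTable is an assumption, the constructor plays the role of allocate, and Python object references stand in for the pointers mentioned in the table.

```python
class SymbolTable:
    """Hash-based symbol table; each entry is a dictionary of attributes."""

    def __init__(self):                 # allocate: create a new empty table
        self._entries = {}

    def free(self):                     # free: remove all entries
        self._entries.clear()

    def lookup(self, name):             # lookup: return the entry for name, or None
        return self._entries.get(name)

    def insert(self, name):             # insert: add name if absent and return its entry
        return self._entries.setdefault(name, {"name": name})

    def set_attribute(self, entry, key, value):
        entry[key] = value              # associate an attribute with the entry

    def get_attribute(self, entry, key):
        return entry.get(key)           # None if the attribute was never set
```

A typical use is to insert a name when its declaration is seen, record its type with set_attribute, and call lookup followed by get_attribute whenever the name is used later.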

Error Handling Routine

An Error Handling Routine is a set of instructions implemented within a compiler to detect and handle errors during compilation. Errors may occur in any phase of the compiler. Whenever a phase discovers an error, it must report it to the error handler, which issues an appropriate diagnostic message. The main components of the Error Handling Routine are Error Detection and Error Reporting.

Error Detection

This component identifies errors during the compilation process. It checks the source code at each stage: lexical analysis detects lexical errors such as invalid characters, parsing detects syntax errors, and semantic analysis verifies that the code obeys the language's semantic rules.

Error Reporting

The Error Handling Routine generates an error report to convey information about the error to the user. This error report begins by specifying the error and includes a concise description. It usually includes the location of the error in the source code and some suggestions for resolving it.
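
An error report of this shape can be sketched as a small formatting function. The function name format_diagnostic, its parameters and the exact message layout are assumptions made for illustration; every compiler has its own diagnostic format.

```python
def format_diagnostic(source, offset, message, hint=None):
    """Build a report with the error's line, column, a caret marker, and an optional hint."""
    line_number = source.count("\n", 0, offset) + 1
    line_start = source.rfind("\n", 0, offset) + 1       # rfind returns -1 on the first line
    column = offset - line_start + 1
    line_end = source.find("\n", offset)
    line_text = source[line_start:] if line_end == -1 else source[line_start:line_end]
    report = [
        f"error: {message} (line {line_number}, column {column})",
        f"    {line_text}",
        "    " + " " * (column - 1) + "^",                # caret under the offending character
    ]
    if hint:
        report.append(f"hint: {hint}")
    return "\n".join(report)
```

The function takes the character offset at which a phase detected the problem and turns it into the line and column that the user sees, followed by the offending source line and an optional suggestion.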


Frequently Asked Questions

What are the stages in compilation process?

The compilation process typically consists of several stages: lexical analysis (scanning), syntax analysis (parsing), semantic analysis, intermediate code generation, optimization, and code generation. Lexical analysis breaks the source code into tokens, syntax analysis verifies the syntax structure, semantic analysis checks for semantic correctness, intermediate code generation produces a machine-independent representation, optimization improves code efficiency, and code generation creates machine code.

What are the two reasons as to why phases of compiler should be grouped?

The phases of a compiler are grouped for two main reasons: modularity and efficiency. Modularity allows for easier development, testing, and maintenance of individual phases, while efficiency ensures that the compilation process can be optimized by combining related phases and minimizing redundant processing.

How phases are grouped in compiler?

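Phases are grouped together to reduce the number of passes the compiler makes over the programme. They are usually grouped into a front end and a back end. The front end combines lexical analysis, syntax analysis, semantic analysis, and intermediate code generation, which together read the input files and depend on the source language. The back end combines code optimization and target code generation, which depend on the target machine.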

What are the three types of compilers?

There are three major types of compilers. These are single-pass compilers, two-pass compilers, and multi-pass compilers.

What are the 5 steps of the compilation process? 

The compilation process involves five steps: lexical analysis converts source code into tokens, syntax analysis forms a syntax tree, semantic analysis checks for errors, optimization enhances performance, and code generation translates the optimized code into machine code.

Conclusion

In conclusion, the phases of a compiler (lexical analysis, syntax analysis, semantic analysis, intermediate code generation, code optimization, and target code generation) work together to transform high-level source code into efficient machine code. Each phase plays a vital role in ensuring the accuracy, efficiency, and functionality of the final executable program, making the compilation process essential for software development.


Happy Coding!
