Introduction
The Basic Block is a straight-line code sequence with no in and out branches except at the beginning and end. It is a set of statements that always run one after the other sequentially without any halt.
A compiler first converts the source code of any programming language into an intermediate code. It is then converted into basic blocks. After partitioning an intermediate code into basic blocks, the flow of control among basic blocks is represented by a flow graph.
Intermediate code can be language-independent (three-address code) or language-specific (e.g., Byte Code for Java).
Also see, Phases of Compiler
What is the Basic Block in Compiler Design?
In compiler design, a basic block is a straight-line piece of code that has only one entry point and one exit point. Basic block construction is the process of dividing a program's control flow graph into basic blocks.
The task is to partition a sequence of three-address codes into the basic block. The new basic block always begins with the first instruction and continues to add instructions until a jump or a label is reached. If no jumps or labels are identified, control will flow sequentially from one instruction to another.
Task: Partition a sequence of three-address codes into basic blocks.
Input: Sequence of three address statements.
Output: A sequence of basic blocks.
Also See, Top Down Parsing
Algorithm of Basic Block in Compiler Design
- First, find the set of leaders from intermediate code, the first statements of basic blocks. The following are the steps for finding leaders:
- The first instruction of the three-address code is a leader.
- Instructions that are the target of conditional/unconditional goto are leaders.
- Instructions that immediately follow any conditional/unconditional goto/jump statements are leaders.
- For each leader found, its basic block contains itself and all instructions up to the next leader.
Hence following the above algorithm, you can partition a sequence of three-address code into basic blocks.
Example of Basic Block in Compiler Design
Consider the source code for converting a 10 x 10 matrix to an identity matrix.
for r from 1 to 10 do
for c from 1 to 10 do
a [ r, c ] = 0.0;
for r from 1 to 10 do
a [ r, c ] = 1.0;
The following are the three address codes for the above source code:
1) r = 1
2) c = 1
3) t1 = 10 * r
4) t2 = t1 + c
5) t3 = 8 * t2
6) t4 = t3 - 88
7) a[t4] = 0.0
8) c = c + 1
9) if c <= 10 goto (3)
10) r = r + 1
11) if r <= 10 goto (2)
12) r = 1
13) t5 = c - 1
14) t6 = 88 * t5
15) a[t6] = 1.0
16) r = r + 1
17) if r <= 10 goto (13)
There are six basic blocks for the above-given code, which are:
- B1 for statement 1
- B2 for statement 2
- B3 for statements 3-9
- B4 for statements 10-11
- B5 for statement 12
- B6 for statements 13-17.
Explanation
According to the definition of leaders given in the above algorithm,
- Instruction 1 is a leader as the first instruction of a three-address code is always a leader.
- Instruction 2 is a leader because it is followed by a goto statement at Instruction 11.
- Instruction 3 and Instruction 13 are also leaders because they are followed by a goto statement at Instruction 9 and 17, respectively.
- Instruction 10 and Instruction 12 are also leaders because they are followed by a conditional goto statement at Instruction 9 and 17, respectively.
Also read About, Specifications of Tokens in Compiler Design