Introduction
Before reading this article, it is recommended to have a decent understanding of parsing and three-address code in compiler design. I will cover them briefly in this article still it is recommended for better understanding. The assignment statement is used to deal with expressions in syntax-directed translation. The type of the expression can be real, integer, array, or record.
A multi-operator expression, such as a + b* c, will be translated into instructions with only one operator per instruction. There can only be one operator on the right side of instruction in a three-address code; no built-up arithmetic expressions are allowed. As a result, a source-language statement like x+y*z might be translated into a three-address instruction sequence.
t1 = y*z
t2= x+t1
Where t1 and t2 are temporary names produced by the compiler.
Also See, Top Down Parsing
Operations Within Expressions
Using attribute code for S and attributes addr and code for an expression E, the syntax-directed definition in the below figure constructs the three-address code for an assignment statement S. S.code and E.code are attributes that represent the three-address code for S and E, respectively.
This is the translation schema.
Consider the last production, E ->• id, in the syntax-directed definition in the figure above.
- top denote the current symbol table.
- Function top.get retrieves the entry
- Attribute E.addr denotes the address that will hold the value of E.
- new Temp() creates t1, t2 …
-
gen(x’=’ y ‘+’ z) represents the 3-address instruction x=y+z. Expression appearing in place of variables like x, y, and z are evaluated when passed to gen
Advantage: No code generated for variable access
Disadvantage: Not good with side effects. Eg. x-(x=3)
In Fig. 6.19, the operators + and unary - reflect the operators in a typical language. The semantic rules for E —> E1 + E2 create code to compute E's value from E1 and E2's values. Values are computed and stored in temporary names that are formed on the fly. E1 + E2 translates to t = E1. addr + E2. addr, where t is a new temporary name, if E1 is computed as Ei.addr and E2 is computed as E2. addr. The value of E.addr is set to t. By repeatedly performing n e w Tempi), a succession of separate temporary names ti,t2,... is formed.
To denote the three-address instruction x = y + z, we use the notation gen(x '=' y '+' z). When provided to gen, expressions that appear in place of variables like x, y, and z are evaluated, and quoted strings are taken literally. 5 Other three-address instructions will be constructed in the same way. 5 In syntax-directed definitions, gen creates and returns an instruction.
By applying gen to a combination of expressions and strings, gen constructs an instruction and incrementally emits it by placing it into the stream.
When we translate E -> Ei+E2, the semantic rules in Fig. 6.19 combine Ei.code, E2.code, and an instruction that adds the values of E1 and E2 to create E.code. The instruction saves the result of the addition as E.addr, a new temporary name for E.
E -» - E1 has a similar translation. The rules provide instruction to do the unary minus operation and build a new temporary for E.
Finally, the production is completed. S id = E; creates instructions that assign the expression E's value to the identifier id. As with the rules for E —v id, the semantic rule for this production employs function top.get to find the address of the identifier represented by id. S.code contains instructions to compute E's value into an address specified by E.addr, followed by an assignment to top.get(id. lexeme) for this instance of id.
Example 6.11: In figure above 6.19, the syntax-directed definition converts the assignment phrase a = b + - c; into a three-address code sequence.
a=b+-c
=> t2 = minus c
t1= b+t1
a= t2