Do you think IIT Guwahati certified course can help you in your career?
No
Introduction
Before getting into DAG representation, let’s understand three address code.
The three-address code is a sort of intermediate code that is simple to create and convert to machine code. To represent an expression, it uses no more than three addresses and one operator, and the value computed at each instruction is saved in a compiler-generated temporary variable. The compiler uses three address codes to determine the order of operations.
Below is the general representation of a three-address code-
a = b op c
where
a, b and c are operands, and
op is an operator.
Now, we may proceed to DAG representation.
DAG representation or Directed Acyclic Graph representation is used to represent the structure of basic blocks. A basic block is a set of statements that execute one after another in sequence. It displays the flow of values between basic blocks and provides basic block optimization algorithms. It is an efficient way of identifying common subexpressions.
A DAG is a three address code formed due to an intermediate code generation to apply an optimization technique to a basic block.
What is DAG in compiler design?
In the compilation process, the high level code must be transformed into low level code. To perform this transformation, the object code generated must retain the exact meaning of the source code. Hence DAG is used to depict the structure of the basic blocks, and helps to see the flow of the values among the blocks and offers some degree of optimization too.
A DAG is used in compiler design to optimize the basic block. It is constructed using Three Address Code. Then after construction, multiple transformations are applied such as dead code elimination, and common subexpression elimination. DAG's are useful in compilers because topological ordering can be defined in the case of DAGs, which is crucial for construction of object level code. Transitive reduction as well as closure are uniquely defined for DAGs. This is how a DAG looks like:
Now, we will discuss some characteristics of DAG.
Directed Acyclic Graph Characteristics
The following are some characteristics of DAG.
DAG is a type of Data Structure used to represent the structure of basic blocks.
Its main aim is to perform the transformation on basic blocks.
The leaf nodes of the directed acyclic graph represent a unique identifier that can be a variable or a constant.
The non-leaf nodes represent an operator symbol.
Moreover, the nodes are also given a string of identifiers to use as labels for the computed value.
Transitive closure and transitive reduction are defined differently in DAG.
DAG has defined topological ordering.
Next, we will discuss the algorithm to draw a DAG.
Algorithm for construction of DAG
For constructing a DAG, the input and output are as follows.
Input- The input will contain a basic block.
Output- The output will contain the following information-
Each node of the graph represents a label.
If the node is a leaf node, the label represents an identifier.
If the node is a non-leaf node, the label represents an operator.
Each node contains a list of attached identifiers to hold the computed value.
There are three possible scenarios on which we can construct a DAG.
Case 1: x = y op z where x, y, and z are operands and op is an operator.
Case 2: x = op y where x and y are operands and op is an operator.
Case 3: x = y where x and y are operands.
Now, we will discuss the steps to draw a DAG handling the above three cases.
Steps
To draw a DAG, follow these three steps.
Step 1: According to step 1,
If, in any of the three cases, the y operand is not defined, then create a node(y).
If, in case 1, the z operand is not defined, then create a node(z).
Step 2: According to step 2,
For case 1, create a node(op) with node(y) as its left child and node(z) as its right child. Let the name of this node be n.
For case 2, check whether there is a node(op) with one child node as node(y). If there is no such node, then create a node.
For case 3, node n will be node(y).
Step 3: For a node(x), delete x from the list of identifiers. Add x to the list of attached identifiers list found in step 2. At last, set node(x) to n.
Now, we will consider an example of drawing a DAG.
Examples
Example 1
Consider the following statements with their three address code-
a = b * c
d = b
e = d * c
b = e
f = b + c
g = d + f
We will construct a DAG for these six statements.
Solution- Since we have six different statements, we will start drawing a graph from the first one.
Step 1: The first statement is a = b * c. This statement lies in the first case from the three cases defined above. So, we will draw a node(*) with its left child node(b) and right child node(c).
Step 2: The second statement is d = b. This statement lies in the third case from the three cases defined above. Since we already have a node defining operand b, i.e., node(b), we will append d to node(b).
Step 3: The third statement is e = d * c. This statement lies in the first case from the three cases defined above. Since we already have a node defining the * (multiplication) operation, i.e., node(*), and nodes defining operand d and c, i.e., node(d) and node(c), respectively. Thus, we will append the result of d * c, i.e., e to a.
Step 4: The fourth statement is b = e. This statement lies in the third case from the three cases defined above. Since we have both b and e defined in our graph, we will simply append b to e.
Step 5: The fifth statement is f = b + c. This statement lies in the first case from the three cases defined above. Since we already have a node defining operands b and c, i.e., node(b) and node(c), respectively but no node representing + (addition) operation. We will draw a node(+) with its left child node(b) and right child node(c).
Step 6: The sixth statement is g = d + f. This statement lies in the first case from the three cases defined above. Since we already have a node defining operands d and f. We will draw a node(+) with its left child node(d) and right child node(f).
The above graph is the final DAG representation for the given basic block.
Example 2
Consider the following expression: (a + b) * (a + b +c)
We will construct a DAG for this expression.
Solution- First, we will find the statements with the three address code for the given expression.
The above expression can be divided into three statements.
t1 = a + b
t2 = t1 + c
t3 = t1 * t2
We will start from the first statement.
Step 1: The first statement is t1 = a + b. This statement lies in the first case from the three cases defined above. So, we will draw a node(+) with its left child node(a) and right child node(b).
Step 2: The second statement is t2 = t1 + c. This statement lies in the first case from the three cases defined above. So, we will draw a node(+) with its left child node(t1) and right child node(c).
Step 3: The third statement is t3 = t1 * t2. This statement lies in the first case from the three cases defined above. So, we will draw a node(*) with its left child node(t1) and right child node(t2).
The above graph is the final DAG representation for the given expression.
Now, we will discuss the applications of DAG.
Application of Dag in Compiler Design
The applications of Directed Acyclic Graph (DAG) in Compiler Design are:
Expression Optimization: DAGs are used to optimize expressions by identifying common subexpressions and representing them efficiently, reducing redundant computations.
Code Generation: DAGs assist in generating efficient code by representing intermediate code and optimizing it before translating it into machine code.
Register Allocation: DAGs aid in register allocation by identifying variables that can share the same register, minimizing the number of memory accesses.
Control Flow Analysis: DAGs help in analyzing control flow structures and optimizing the flow of control within a program.
Advantages of Dag in Compiler Design
The advantages of DAG in Compiler Design are:
Space Efficiency: DAGs minimize memory usage by representing common subexpressions and eliminating redundancy in intermediate representations.
Time Efficiency: DAG-based optimizations reduce the time required for expression evaluation and code generation, leading to faster compilation times.
Improved Code Quality: By eliminating redundant computations and optimizing expressions, DAGs produce more efficient and optimized code, enhancing overall code quality and performance.
Simplifies Optimization Passes: DAG-based optimizations simplify the implementation of optimization passes in compilers by providing a structured representation of intermediate code.
Facilitates Register Allocation: DAGs aid in register allocation by providing insights into the usage of variables and their lifetimes, enabling efficient allocation of hardware resources.
Frequently Asked Questions
What is DAG in compiler design?
DAG in compiler design represents expressions and control flow without cycles. It optimizes code by identifying and eliminating redundant computations.
Why are DAGs used?
DAGs are used for space and time-efficient representation of common subexpressions, aiding in optimizing code and reducing memory usage.
Why DAG is used in spark?
DAGs in Spark represent transformations and actions as a directed acyclic graph, optimizing distributed data processing workflows.
What is DAG and syntax tree?
DAGs represent computation flow, while syntax trees represent program structure. DAGs optimize code, while syntax trees parse and analyze code syntax.
What are the disadvantages of DAG?
Disadvantages of DAGs include complexity in handling cyclic dependencies, difficulty in maintaining consistency, and potential inefficiency in dynamic environments.
Conclusion
In this article, we have studied DAG representation, i.e., Directed Acyclic Graph representation in Compiler Design. We went through the concept thoroughly, discussing the characteristics, algorithm, examples, and applications of DAG.