COMPILER DESIGN Run-Time
Environments
Dr R Jegadeesan Prof-CSE
Jyothishmathi Institute of Technology and Science,
Karimnagar
1
2
Syllabus:
Run-Time Environments: Storage organization, Stack
Allocation of Space, Access to Nonlocal Data on the
Stack, Heap Management, Introduction to Garbage
Collection, Introduction to Trace-Based Collection. Code
Generation: Issues in the Design of a Code Generator,
The Target Language, Addresses in the Target Code,
Basic Blocks and Flow Graphs, Optimization of Basic
Blocks, A Simple Code Generator, Peephole
Optimization, Register Allocation and Assignment,
Dynamic Programming Code-Generation
Storage Organization
3
UNIT 4 : Run-Time Environments
Static vs. Dynamic Allocation
4
• Static: Compile time, Dynamic: Runtime allocation
• Many compilers use some combination of following
– Stack storage: for local variables, parameters and so on
– Heap storage: Data that may outlive the call to the procedure that created it
• Stack allocation is a valid allocation for procedures since procedure calls are nested
Storage Organization
Activation records
5
• Procedure calls and returns are usually managed by a run-time stack called
the control stack.
• Each live activation has an activation record (sometimes called a frame)
• The root of activation tree is at the bottom of the stack
• The current execution path specifies the content of the stack with the last
activation has record in the top of the stack.
Storage Organization
6
UNIT 4 :Stack Allocation of Space
A General ActivationRecord
Activation Record
7
• Temporary values
• Local data
• A saved machine status
• An “access link”
• A control link
• Space for the return value of the called function
• The actual parameters used by the calling procedure
UNIT 4 :Stack Allocation of Space
Downward-growing stack of activation records
8
UNIT 4 :Stack Allocation of Space
Designing Calling Sequences
9
UNIT 4 : Access to Nonlocal Data
on the Stack
• Values communicated between caller and callee are generally placed at the
beginning of callee’s activation record
• Fixed-length items: are generally placed at the middle
• Items whose size may not be known early enough: are placed at the end of
activation record
• We must locate the top-of-stack pointer judiciously: a common approach is to have
it point to the end of fixed length fields.
Division of tasks between caller and callee
10
UNIT 4 : Access to Nonlocal Data
on the Stack
calling sequence
11
• The caller evaluates the actual parameters
• The caller stores a return address and the old value of top-sp into the
callee's activation record.
• The callee saves the register values and other status information.
• The callee initializes its local data and begins execution.
UNIT 4 : Access to Nonlocal Data
on the Stack
corresponding return sequence
12
• The callee places the return value next to the parameters
• Using information in the machine-status field, the callee restores top-sp
and other registers, and then branches to the return address that the
caller placed in the status field.
• Although top-sp has been decremented, the caller knows where the
return value is, relative to the current value of top-sp; the caller therefore
may use that value.
UNIT 4 : Access to Nonlocal Data
on the Stack
Access to dynamically allocated arrays
13
UNIT 4 : Access to Nonlocal Data
on the Stack
UNIT 4: HEAP MANGEMENT
HEAP ALLOCATION
14
•Variables local to a procedure are allocated and de-allocated only at runtime.
• Heap allocation is used to dynamically allocate memory to the variables and claim
it back when the variables are no more required.
•Except statically allocated memory area, both stack and heap memory can grow and
shrink dynamically and unexpectedly. Therefore, they cannot be provided with a
fixed amount of memory in the system.
HEAP ALLOCATION
15
UNIT 4: HEAP MANGEMENT
16
UNIT 4: HEAP MANGEMENT
HEAP ALLOCATION
•As shown in the image above, the text part of the code is allocated a fixed amount of
memory.
•Stack and heap memory are arranged at the extremes of total memory allocated to
the program.
•Both shrink and grow against each other.
•Parameter Passing
•The communication medium among procedures is known as parameter passing. The
values of the variables from a calling procedure are transferred to the called procedure
by some mechanism.
17
•Before moving ahead, first go through some basic terminologies pertaining to
the values in a program.
•r-value
•The value of an expression is called its r-value. The value contained in a single
variable also becomes an r-value if it appears on the right-hand side of the
assignment operator. r-values can always be assigned to some other variable.
•l-value
•The location of memory (address) where an expression is stored is known as
the l-value of that expression. It always appears at the left hand side of an
assignment operator.
UNIT 4: HEAP MANGEMENT
HEAP ALLOCATION
18
•For example:
•day = 1; week = day * 7; month = 1; year = month * 12;
•From this example, we understand that constant values like 1, 7, 12, and
•variables like day, week, month and year, all have r-values.
• Only variables have l-values as they also represent the memory location assigned to them.
•For example:
•7 = x + y;
•is an l-value error, as the constant 7 does not represent any memory location.
UNIT 4: HEAP MANGEMENT
HEAP ALLOCATION
UNIT 4 : Garbage collection
Garbage collection Definition
19
•Garbage collection (GC) is a dynamic approach to automatic memory
management and heap allocation that processes and identifies dead
memory blocks and reallocates storage for reuse.
•The primary purpose of garbage collection is to reduce memory
leaks.
•GC implementation requires three primary approaches, as follows:
•Mark-and-sweep - In process when memory runs out, the GC locates
all accessible memory and then reclaims available memory.
20
UNIT 4 : Garbage collection
Garbage collection Definition
•Reference counting - Allocated objects contain a reference count of the referencing number.
•When the memory count is zero, the object is garbage and is then destroyed. The freed
memory returns to the memory heap.
•Copy collection - There are two memory partitions.
•If the first partition is full, the GC locates all accessible data structures and
copies them to the second partition, compacting memory after GC process
and allowing continuous free memory
.
UNIT 4: Introduction to Trace-Based
Collection
Trace-Based Collection Sub Heading
21
Instead of collecting garbage as it is created, trace-based collectors run
periodically to find unreachable objects and reclaim their space. Typically, we run
the
22
.
•Trace-based collector whenever the free space is exhausted or its amount drops
below some threshold.
•We begin this section by introducing the simplest "mark-and-sweep" gar-bage
collection algorithm.
•We then describe the variety of trace-based algorithms in terms of four states that
chunks of memory can be put in.
•This section also contains a number of improvements on the basic algorithm,
including those in which object relocation is a part of the garbage-collection function
UNIT 4: Introduction to Trace-Based
Collection
Trace-Based Collection Sub Heading
UNIT 4: CODE GENERATIONNG
Introduction
23
• The final phase of a compiler is code generator
• It receives an intermediate representation (IR) with supplementary information in
symbol table
• Produces a semantically equivalent target program
• Code generator main tasks:
– Instruction selection
– Register allocation and assignment
– Instruction ordering
Front end Code optimizer
Code
Generator
Issues in the Design of Code Generator
24
UNIT 4: CODE GENERATIONNG
• The most important criterion is that it produces correct code
• Input to the code generator
– IR + Symbol table
– We assume front end produces low-level IR, i.e. values of names in it can be
directly manipulated by the machine instructions.
– Syntactic and semantic errors have been already detected
• The target program
– Common target architectures are: RISC, CISC and Stack based machines
– In this chapter we use a very simple RISC-like computer with addition of some
CISC-like addressing modes
complexity of mapping
25
UNIT 4: CODE GENERATIONNG
• the level of the IR
• the nature of the instruction-set architecture
• the desired quality of the generated code.
x=y+z
LD R0, y
ADD R0, R0, z
ST x, R0
a=b+c
d=a+e
LD R0, b
ADD R0, R0, c
ST a, R0
LD R0, a
ADD R0, R0, e
ST d, R0
Introduction
26
UNIT 4: Register allocation
• Two sub problems
– Register allocation: selecting the set of variables that will reside in registers at
each point in the program
– Resister assignment: selecting specific register that a variable reside in
• Complications imposed by the hardware architecture
– Example: register pairs for multiplication and division
t=a+b
t=t*c
T=t/d
t=a+b
t=t+c
T=t/d
L R1, a
A R1, b
M R0, c
D R0, d
ST R1, t
L R0, a
A R0, b
M R0, c
SRDA R0, 32
D R0, d
ST R1, t
Addressing Modes
27
UNIT 4: Addressing in target code
• variable name: x
• indexed address: a(r) like LD R1, a(R2) means R1=contents ( a+ contents(R2))
• integer indexed by a register : like LD R1, 100(R2)
• Indirect addressing mode: *r and *100(r)
• immediate constant addressing mode: like LD R1, #100
conditional-jump three-address instruction
28
UNIT 4: conditional-jump three-address
instruction
If x<y goto L
LD R1, x // R1 = x
LD R2, y // R2 = y
SUB R1, R1, R2 // R1 = R1 - R2
BLTZ R1, M // i f R1 < 0 jump t o M
Basic blocks and flow graphs
29
UNIT 4: Basic blocks and flow
graphs
• Partition the intermediate code into basic blocks
– The flow of control can only enter the basic block through the first instruction
in the block. That is, there are no jumps into the middle of the block.
– Control will leave the block without halting or branching, except possibly at
the last instruction in the block.
• The basic blocks become the nodes of a flow graph
rules for finding leaders
30
UNIT 4: CODE GENERATIONNG
• The first three-address instruction in the intermediate code is a leader.
• Any instruction that is the target of a conditional or unconditional jump is a leader
• Any instruction that immediately follows a conditional or unconditional jump is a
leader.
Intermediate code to set a 10*10 matrix to an
identity matrix
31
UNIT 4: CODE GENERATIONNG
Flow graph based on Basic Blocks
32
UNIT 4: CODE GENERATIONNG
liveness and next-use information
33
UNIT 4: CODE GENERATIONNG
• We wish to determine for each three address statement x=y+z what the next uses of x, y and z are.
• Algorithm:
– Attach to statement i the information currently found in the symbol table regarding the next
use and liveness of x, y, and z.
– In the symbol table, set x to "not live" and "no next use.“
– In the symbol table, set y and z to "live" and the next uses of y and z to i.
DAG representation of basic blocks
34
UNIT 4: CODE GENERATIONNG
• There is a node in the DAG for each of the initial values of the variables appearing in the basic
block.
• There is a node N associated with each statement s within the block. The children of N are
those nodes corresponding to statements that are the last definitions, prior to s, of the
operands used by s.
• Node N is labeled by the operator applied at s, and also attached to N is the list of variables for
which it is the last definition within the block.
• Certain nodes are designated output nodes. These are the nodes whose variables are live on
exit from the block.
Code improving transformations
35
UNIT 4: CODE GENERATIONNG
• We can eliminate local common subexpressions, that is, instructions that compute
a value that has already been computed.
• We can eliminate dead code, that is, instructions that compute a value that is
never used.
• We can reorder statements that do not depend on one another; such reordering
may reduce the time a temporary value needs to be preserved in a register.
• We can apply algebraic laws to reorder operands of three-address instructions,
and sometimes t hereby simplify t he computation.
DAG for basic block
36
UNIT 4: CODE GENERATIONNG
DAG for basic block
37
UNIT 4: CODE GENERATIONNG
array accesses in a DAG
38
UNIT 4: CODE GENERATIONNG
• An assignment from an array, like x = a [i], is represented by creating a
node with operator =[] and two children representing the initial value of
the array, a0 in this case, and the index i. Variable x becomes a label of this
new node.
• An assignment to an array, like a [j] = y, is represented by a new node with
operator []= and three children representing a0, j and y. There is no
variable labeling this node. What is different is that the creation of this
node kills all currently constructed nodes whose value depends on a0. A
node that has been killed cannot receive any more labels; that is, it cannot
become a common subexpression.
DAG for a sequence of array assignments
39
UNIT 4: CODE GENERATIONNG
Rules for reconstructing the basic block from a DAG
40
UNIT 4: CODE GENERATIONNG
• The order of instructions must respect the order of nodes in the DAG. That is, we
cannot compute a node's value until we have computed a value for each of its
children.
• Assignments to an array must follow all previous assignments to, or evaluations
from, the same array, according to the order of these instructions in the original
basic block.
• Evaluations of array elements must follow any previous (according to the original
block) assignments to the same array. The only permutation allowed is that two
evaluations from the same array may be done in either order, as long as neither
crosses over an assignment to that array.
• Any use of a variable must follow all previous (according to the original block)
procedure calls or indirect assignments through a pointer.
• Any procedure call or indirect assignment through a pointer must follow all
previous (according to the original block) evaluations of any variable.
principal uses of registers
41
UNIT 4: CODE GENERATIONNG
• In most machine architectures, some or all of the operands of an
operation must be in registers in order to perform the operation.
• Registers make good temporaries - places to hold the result of a
subexpression while a larger expression is being evaluated, or more
generally, a place to hold a variable that is used only within a single
basic block.
• Registers are often used to help with run-time storage management,
for example, to manage the run-time stack, including the
maintenance of stack pointers and possibly the top elements of the
stack itself.
Descriptors for data structure
42
UNIT 4: CODE GENERATIONNG
• For each available register, a register descriptor keeps track of the
variable names whose current value is in that register. Since we
shall use only those registers that are available for local use within a
basic block, we assume that initially, all register descriptors are
empty. As the code generation progresses, each register will hold
the value of zero or more names.
• For each program variable, an address descriptor keeps track of the
location or locations where the current value of that variable can be
found. The location might be a register, a memory address, a stack
location, or some set of more than one of these. The information
can be stored in the symbol-table entry for that variable name.
Machine Instructions for Operations
43
UNIT 4: CODE GENERATIONNG
• Use getReg(x = y + z) to select registers for x, y, and z. Call these Rx, Ry and Rz.
• If y is not in Ry (according to the register descriptor for Ry), then issue an
instruction LD Ry, y', where y' is one of the memory locations for y (according to
the address descriptor for y).
• Similarly, if z is not in Rz, issue and instruction LD Rz, z', where z' is a location for x .
• Issue the instruction ADD Rx , Ry, Rz.
Rules for updating the register and address
descriptors
44
UNIT 4: CODE GENERATIONNG
• For the instruction LD R, x
– Change the register descriptor for register R so it holds only x.
– Change the address descriptor for x by adding register R as an
additional location.
• For the instruction ST x, R, change the address descriptor for x to
include its own memory location.
• For an operation such as ADD Rx, Ry, Rz implementing a three-address
instruction x = y + x
– Change the register descriptor for Rx so that it holds only x.
– Change the address descriptor for x so that its only location is Rx. Note
that the memory location for x is not now in the address descriptor for
x.
– Remove Rx from the address descriptor of any variable other than x.
• When we process a copy statement x = y, after generating the load for
y into register Ry, if needed, and after managing descriptors as for all
load statements (per rule I):
– Add x to the register descriptor for Ry.
– Change the address descriptor for x so that its only location is Ry .
Instructions generated and the changes in
the register and address descriptors
45
UNIT 4: CODE GENERATIONNG
Rules for picking register Ry for y
46
UNIT 4: CODE GENERATIONNG
• If y is currently in a register, pick a register already containing y as Ry. Do not issue a
machine instruction to load this register, as none is needed.
• If y is not in a register, but there is a register that is currently empty, pick one such register
as Ry.
• The difficult case occurs when y is not in a register, and there is no register that is
currently empty. We need to pick one of the allowable registers anyway, and we need to
make it safe to reuse.
Possibilities for value of R
47
UNIT 4: CODE GENERATIONNG
• If the address descriptor for v says that v is somewhere besides R, then we are OK.
• If v is x, the value being computed by instruction I, and x is not also one of the
other operands of instruction I (z in this example), then we are OK. The reason is
that in this case, we know this value of x is never again going to be used, so we are
free to ignore it.
• Otherwise, if v is not used later (that is, after the instruction I, there are no further
uses of v, and if v is live on exit from the block, then v is recomputed within the
block), then we are OK.
• If we are not OK by one of the first two cases, then we need to generate the store
instruction ST v, R to place a copy of v in its own memory location. This operation
is called a spill.
Selection of the register Rx
48
UNIT 4: CODE GENERATIONNG
• Since a new value of x is being computed, a register that holds only x is always an
acceptable choice for Rx.
• If y is not used after instruction I, and Ry holds only y after being loaded, Ry can
also be used as Rx. A similar option holds regarding z and Rx.
49
UNIT 4: PEEPHOLE OPTIMIZATIONS
Characteristic of peephole optimizations
• Redundant-instruction elimination
• Flow-of-control optimizations
• Algebraic simplifications
• Use of machine idioms
50
UNIT 4: CODE GENERATIONNG
Redundant-instruction elimination
• LD a, R0
ST R0, a
• if debug == 1 goto L1
goto L2
L I : print debugging information
L2:
Flow-of-control optimizations
51
UNIT 4: CODE GENERATIONNG
goto L1
...
Ll: goto L2
Can be replaced
by:
goto L2
...
Ll: goto L2
if a<b goto L1
...
Ll: goto L2
Can be replaced by:
if a<b goto L2
...
Ll: goto L2
Algebraic simplifications
52
UNIT 4: CODE GENERATIONNG
• x=x+0
• x=x*1
Register Allocation and Assignment
53
UNIT 4: CODE GENERATIONNG
• Global Register Allocation
• Usage Counts
• Register Assignment for Outer Loops
• Register Allocation by Graph Coloring
Global register allocation
54
UNIT 4: Global register allocation
• Previously explained algorithm does local (block based) register allocation
• This resulted that all live variables be stored at the end of block
• To save some of these stores and their corresponding loads, we might arrange to
assign registers to frequently used variables and keep these registers consistent
across block boundaries (globally)
• Some options are:
– Keep values of variables used in loops inside registers
– Use graph coloring approach for more globally allocation
Usage counts
55
UNIT 4: CODE GENERATIONNG
• For the loops we can approximate the saving by register allocation as:
– Sum over all blocks (B) in a loop (L)
– For each uses of x before any definition in the block we add one unit of saving
– If x is live on exit from B and is assigned a value in B, then we ass 2 units of
saving
Flow graph of an inner loop
56
UNIT 4: CODE GENERATIONNG
57
UNIT 4: CODE GENERATIONNG
Code sequence using global register
assignment
Register allocation by Graph coloring
58
UNIT 4: Register allocation
• Two passes are used
– Target-machine instructions are selected as though there are an infinite
number of symbolic registers
– Assign physical registers to symbolic ones
• Create a register-interference graph
• Nodes are symbolic registers and edges connects two nodes if one is live
at a point where the other is defined.
• For example in the previous example an edge connects a and d in the
graph
• Use a graph coloring algorithm to assign registers.
Intermediate-code tree for a[i]=b+1
59
UNIT 4: CODE GENERATIONNG
Tree-rewriting rules
60
UNIT 4: CODE GENERATIONNG
Dynamic Programming Algorithm
61
UNIT 4: CODE GENERATIONNG
• Compute bottom-up for each node n of the expression tree T an
array C of costs, in which the ith component C[i] is the optimal cost
of computing the subtree S rooted at n into a register, assuming i
registers are available for the computation, for
• Traverse T, using the cost vectors to determine which subtrees of T
must be computed into memory.
• Traverse each tree using the cost vectors and associated
instructions to generate the final target code. The code for the
subtrees computed into memory locations is generated first.

r
i 

1
Syntax tree for(a-b)+c*(d/e) with cost
vector at each node
62
UNIT 4: CODE GENERATIONNG
Minimum cost of evaluating the root
registers available
63
UNIT 4: CODE GENERATIONNG
• Compute the left subtree with two registers available into
register R0, compute the right subtree with one register
available into register R1, and use the instruction ADD R0, R0,
R1 to compute the root. This sequence has cost 2+5+1=8.
• Compute the right subtree with two registers available into R l
, compute the left subtree with one register available into R0,
and use the instruction ADD R0, R0, R1. This sequence has
cost 4+2+1=7.
• Compute the right subtree into memory location M, compute
the left subtree with two registers available into register RO,
and use the instruction ADD R0, R0, M. This sequence has cost
5+2+1=8.
Thank you
64

COMPILER DESIGN Run-Time Environments

  • 1.
    COMPILER DESIGN Run-Time Environments DrR Jegadeesan Prof-CSE Jyothishmathi Institute of Technology and Science, Karimnagar 1
  • 2.
    2 Syllabus: Run-Time Environments: Storageorganization, Stack Allocation of Space, Access to Nonlocal Data on the Stack, Heap Management, Introduction to Garbage Collection, Introduction to Trace-Based Collection. Code Generation: Issues in the Design of a Code Generator, The Target Language, Addresses in the Target Code, Basic Blocks and Flow Graphs, Optimization of Basic Blocks, A Simple Code Generator, Peephole Optimization, Register Allocation and Assignment, Dynamic Programming Code-Generation
  • 3.
    Storage Organization 3 UNIT 4: Run-Time Environments
  • 4.
    Static vs. DynamicAllocation 4 • Static: Compile time, Dynamic: Runtime allocation • Many compilers use some combination of following – Stack storage: for local variables, parameters and so on – Heap storage: Data that may outlive the call to the procedure that created it • Stack allocation is a valid allocation for procedures since procedure calls are nested Storage Organization
  • 5.
    Activation records 5 • Procedurecalls and returns are usually managed by a run-time stack called the control stack. • Each live activation has an activation record (sometimes called a frame) • The root of activation tree is at the bottom of the stack • The current execution path specifies the content of the stack with the last activation has record in the top of the stack. Storage Organization
  • 6.
    6 UNIT 4 :StackAllocation of Space A General ActivationRecord
  • 7.
    Activation Record 7 • Temporaryvalues • Local data • A saved machine status • An “access link” • A control link • Space for the return value of the called function • The actual parameters used by the calling procedure UNIT 4 :Stack Allocation of Space
  • 8.
    Downward-growing stack ofactivation records 8 UNIT 4 :Stack Allocation of Space
  • 9.
    Designing Calling Sequences 9 UNIT4 : Access to Nonlocal Data on the Stack • Values communicated between caller and callee are generally placed at the beginning of callee’s activation record • Fixed-length items: are generally placed at the middle • Items whose size may not be known early enough: are placed at the end of activation record • We must locate the top-of-stack pointer judiciously: a common approach is to have it point to the end of fixed length fields.
  • 10.
    Division of tasksbetween caller and callee 10 UNIT 4 : Access to Nonlocal Data on the Stack
  • 11.
    calling sequence 11 • Thecaller evaluates the actual parameters • The caller stores a return address and the old value of top-sp into the callee's activation record. • The callee saves the register values and other status information. • The callee initializes its local data and begins execution. UNIT 4 : Access to Nonlocal Data on the Stack
  • 12.
    corresponding return sequence 12 •The callee places the return value next to the parameters • Using information in the machine-status field, the callee restores top-sp and other registers, and then branches to the return address that the caller placed in the status field. • Although top-sp has been decremented, the caller knows where the return value is, relative to the current value of top-sp; the caller therefore may use that value. UNIT 4 : Access to Nonlocal Data on the Stack
  • 13.
    Access to dynamicallyallocated arrays 13 UNIT 4 : Access to Nonlocal Data on the Stack
  • 14.
    UNIT 4: HEAPMANGEMENT HEAP ALLOCATION 14 •Variables local to a procedure are allocated and de-allocated only at runtime. • Heap allocation is used to dynamically allocate memory to the variables and claim it back when the variables are no more required. •Except statically allocated memory area, both stack and heap memory can grow and shrink dynamically and unexpectedly. Therefore, they cannot be provided with a fixed amount of memory in the system.
  • 15.
  • 16.
    16 UNIT 4: HEAPMANGEMENT HEAP ALLOCATION •As shown in the image above, the text part of the code is allocated a fixed amount of memory. •Stack and heap memory are arranged at the extremes of total memory allocated to the program. •Both shrink and grow against each other. •Parameter Passing •The communication medium among procedures is known as parameter passing. The values of the variables from a calling procedure are transferred to the called procedure by some mechanism.
  • 17.
    17 •Before moving ahead,first go through some basic terminologies pertaining to the values in a program. •r-value •The value of an expression is called its r-value. The value contained in a single variable also becomes an r-value if it appears on the right-hand side of the assignment operator. r-values can always be assigned to some other variable. •l-value •The location of memory (address) where an expression is stored is known as the l-value of that expression. It always appears at the left hand side of an assignment operator. UNIT 4: HEAP MANGEMENT HEAP ALLOCATION
  • 18.
    18 •For example: •day =1; week = day * 7; month = 1; year = month * 12; •From this example, we understand that constant values like 1, 7, 12, and •variables like day, week, month and year, all have r-values. • Only variables have l-values as they also represent the memory location assigned to them. •For example: •7 = x + y; •is an l-value error, as the constant 7 does not represent any memory location. UNIT 4: HEAP MANGEMENT HEAP ALLOCATION
  • 19.
    UNIT 4 :Garbage collection Garbage collection Definition 19 •Garbage collection (GC) is a dynamic approach to automatic memory management and heap allocation that processes and identifies dead memory blocks and reallocates storage for reuse. •The primary purpose of garbage collection is to reduce memory leaks. •GC implementation requires three primary approaches, as follows: •Mark-and-sweep - In process when memory runs out, the GC locates all accessible memory and then reclaims available memory.
  • 20.
    20 UNIT 4 :Garbage collection Garbage collection Definition •Reference counting - Allocated objects contain a reference count of the referencing number. •When the memory count is zero, the object is garbage and is then destroyed. The freed memory returns to the memory heap. •Copy collection - There are two memory partitions. •If the first partition is full, the GC locates all accessible data structures and copies them to the second partition, compacting memory after GC process and allowing continuous free memory .
  • 21.
    UNIT 4: Introductionto Trace-Based Collection Trace-Based Collection Sub Heading 21 Instead of collecting garbage as it is created, trace-based collectors run periodically to find unreachable objects and reclaim their space. Typically, we run the
  • 22.
    22 . •Trace-based collector wheneverthe free space is exhausted or its amount drops below some threshold. •We begin this section by introducing the simplest "mark-and-sweep" gar-bage collection algorithm. •We then describe the variety of trace-based algorithms in terms of four states that chunks of memory can be put in. •This section also contains a number of improvements on the basic algorithm, including those in which object relocation is a part of the garbage-collection function UNIT 4: Introduction to Trace-Based Collection Trace-Based Collection Sub Heading
  • 23.
    UNIT 4: CODEGENERATIONNG Introduction 23 • The final phase of a compiler is code generator • It receives an intermediate representation (IR) with supplementary information in symbol table • Produces a semantically equivalent target program • Code generator main tasks: – Instruction selection – Register allocation and assignment – Instruction ordering Front end Code optimizer Code Generator
  • 24.
    Issues in theDesign of Code Generator 24 UNIT 4: CODE GENERATIONNG • The most important criterion is that it produces correct code • Input to the code generator – IR + Symbol table – We assume front end produces low-level IR, i.e. values of names in it can be directly manipulated by the machine instructions. – Syntactic and semantic errors have been already detected • The target program – Common target architectures are: RISC, CISC and Stack based machines – In this chapter we use a very simple RISC-like computer with addition of some CISC-like addressing modes
  • 25.
    complexity of mapping 25 UNIT4: CODE GENERATIONNG • the level of the IR • the nature of the instruction-set architecture • the desired quality of the generated code. x=y+z LD R0, y ADD R0, R0, z ST x, R0 a=b+c d=a+e LD R0, b ADD R0, R0, c ST a, R0 LD R0, a ADD R0, R0, e ST d, R0
  • 26.
    Introduction 26 UNIT 4: Registerallocation • Two sub problems – Register allocation: selecting the set of variables that will reside in registers at each point in the program – Resister assignment: selecting specific register that a variable reside in • Complications imposed by the hardware architecture – Example: register pairs for multiplication and division t=a+b t=t*c T=t/d t=a+b t=t+c T=t/d L R1, a A R1, b M R0, c D R0, d ST R1, t L R0, a A R0, b M R0, c SRDA R0, 32 D R0, d ST R1, t
  • 27.
    Addressing Modes 27 UNIT 4:Addressing in target code • variable name: x • indexed address: a(r) like LD R1, a(R2) means R1=contents ( a+ contents(R2)) • integer indexed by a register : like LD R1, 100(R2) • Indirect addressing mode: *r and *100(r) • immediate constant addressing mode: like LD R1, #100
  • 28.
    conditional-jump three-address instruction 28 UNIT4: conditional-jump three-address instruction If x<y goto L LD R1, x // R1 = x LD R2, y // R2 = y SUB R1, R1, R2 // R1 = R1 - R2 BLTZ R1, M // i f R1 < 0 jump t o M
  • 29.
    Basic blocks andflow graphs 29 UNIT 4: Basic blocks and flow graphs • Partition the intermediate code into basic blocks – The flow of control can only enter the basic block through the first instruction in the block. That is, there are no jumps into the middle of the block. – Control will leave the block without halting or branching, except possibly at the last instruction in the block. • The basic blocks become the nodes of a flow graph
  • 30.
    rules for findingleaders 30 UNIT 4: CODE GENERATIONNG • The first three-address instruction in the intermediate code is a leader. • Any instruction that is the target of a conditional or unconditional jump is a leader • Any instruction that immediately follows a conditional or unconditional jump is a leader.
  • 31.
    Intermediate code toset a 10*10 matrix to an identity matrix 31 UNIT 4: CODE GENERATIONNG
  • 32.
    Flow graph basedon Basic Blocks 32 UNIT 4: CODE GENERATIONNG
  • 33.
    liveness and next-useinformation 33 UNIT 4: CODE GENERATIONNG • We wish to determine for each three address statement x=y+z what the next uses of x, y and z are. • Algorithm: – Attach to statement i the information currently found in the symbol table regarding the next use and liveness of x, y, and z. – In the symbol table, set x to "not live" and "no next use.“ – In the symbol table, set y and z to "live" and the next uses of y and z to i.
  • 34.
    DAG representation ofbasic blocks 34 UNIT 4: CODE GENERATIONNG • There is a node in the DAG for each of the initial values of the variables appearing in the basic block. • There is a node N associated with each statement s within the block. The children of N are those nodes corresponding to statements that are the last definitions, prior to s, of the operands used by s. • Node N is labeled by the operator applied at s, and also attached to N is the list of variables for which it is the last definition within the block. • Certain nodes are designated output nodes. These are the nodes whose variables are live on exit from the block.
  • 35.
    Code improving transformations 35 UNIT4: CODE GENERATIONNG • We can eliminate local common subexpressions, that is, instructions that compute a value that has already been computed. • We can eliminate dead code, that is, instructions that compute a value that is never used. • We can reorder statements that do not depend on one another; such reordering may reduce the time a temporary value needs to be preserved in a register. • We can apply algebraic laws to reorder operands of three-address instructions, and sometimes t hereby simplify t he computation.
  • 36.
    DAG for basicblock 36 UNIT 4: CODE GENERATIONNG
  • 37.
    DAG for basicblock 37 UNIT 4: CODE GENERATIONNG
  • 38.
    array accesses ina DAG 38 UNIT 4: CODE GENERATIONNG • An assignment from an array, like x = a [i], is represented by creating a node with operator =[] and two children representing the initial value of the array, a0 in this case, and the index i. Variable x becomes a label of this new node. • An assignment to an array, like a [j] = y, is represented by a new node with operator []= and three children representing a0, j and y. There is no variable labeling this node. What is different is that the creation of this node kills all currently constructed nodes whose value depends on a0. A node that has been killed cannot receive any more labels; that is, it cannot become a common subexpression.
  • 39.
    DAG for asequence of array assignments 39 UNIT 4: CODE GENERATIONNG
  • 40.
    Rules for reconstructingthe basic block from a DAG 40 UNIT 4: CODE GENERATIONNG • The order of instructions must respect the order of nodes in the DAG. That is, we cannot compute a node's value until we have computed a value for each of its children. • Assignments to an array must follow all previous assignments to, or evaluations from, the same array, according to the order of these instructions in the original basic block. • Evaluations of array elements must follow any previous (according to the original block) assignments to the same array. The only permutation allowed is that two evaluations from the same array may be done in either order, as long as neither crosses over an assignment to that array. • Any use of a variable must follow all previous (according to the original block) procedure calls or indirect assignments through a pointer. • Any procedure call or indirect assignment through a pointer must follow all previous (according to the original block) evaluations of any variable.
  • 41.
    principal uses ofregisters 41 UNIT 4: CODE GENERATIONNG • In most machine architectures, some or all of the operands of an operation must be in registers in order to perform the operation. • Registers make good temporaries - places to hold the result of a subexpression while a larger expression is being evaluated, or more generally, a place to hold a variable that is used only within a single basic block. • Registers are often used to help with run-time storage management, for example, to manage the run-time stack, including the maintenance of stack pointers and possibly the top elements of the stack itself.
  • 42.
    Descriptors for datastructure 42 UNIT 4: CODE GENERATIONNG • For each available register, a register descriptor keeps track of the variable names whose current value is in that register. Since we shall use only those registers that are available for local use within a basic block, we assume that initially, all register descriptors are empty. As the code generation progresses, each register will hold the value of zero or more names. • For each program variable, an address descriptor keeps track of the location or locations where the current value of that variable can be found. The location might be a register, a memory address, a stack location, or some set of more than one of these. The information can be stored in the symbol-table entry for that variable name.
  • 43.
    Machine Instructions forOperations 43 UNIT 4: CODE GENERATIONNG • Use getReg(x = y + z) to select registers for x, y, and z. Call these Rx, Ry and Rz. • If y is not in Ry (according to the register descriptor for Ry), then issue an instruction LD Ry, y', where y' is one of the memory locations for y (according to the address descriptor for y). • Similarly, if z is not in Rz, issue and instruction LD Rz, z', where z' is a location for x . • Issue the instruction ADD Rx , Ry, Rz.
  • 44.
    Rules for updatingthe register and address descriptors 44 UNIT 4: CODE GENERATIONNG • For the instruction LD R, x – Change the register descriptor for register R so it holds only x. – Change the address descriptor for x by adding register R as an additional location. • For the instruction ST x, R, change the address descriptor for x to include its own memory location. • For an operation such as ADD Rx, Ry, Rz implementing a three-address instruction x = y + x – Change the register descriptor for Rx so that it holds only x. – Change the address descriptor for x so that its only location is Rx. Note that the memory location for x is not now in the address descriptor for x. – Remove Rx from the address descriptor of any variable other than x. • When we process a copy statement x = y, after generating the load for y into register Ry, if needed, and after managing descriptors as for all load statements (per rule I): – Add x to the register descriptor for Ry. – Change the address descriptor for x so that its only location is Ry .
  • 45.
    Instructions generated andthe changes in the register and address descriptors 45 UNIT 4: CODE GENERATIONNG
  • 46.
    Rules for pickingregister Ry for y 46 UNIT 4: CODE GENERATIONNG • If y is currently in a register, pick a register already containing y as Ry. Do not issue a machine instruction to load this register, as none is needed. • If y is not in a register, but there is a register that is currently empty, pick one such register as Ry. • The difficult case occurs when y is not in a register, and there is no register that is currently empty. We need to pick one of the allowable registers anyway, and we need to make it safe to reuse.
  • 47.
    Possibilities for valueof R 47 UNIT 4: CODE GENERATIONNG • If the address descriptor for v says that v is somewhere besides R, then we are OK. • If v is x, the value being computed by instruction I, and x is not also one of the other operands of instruction I (z in this example), then we are OK. The reason is that in this case, we know this value of x is never again going to be used, so we are free to ignore it. • Otherwise, if v is not used later (that is, after the instruction I, there are no further uses of v, and if v is live on exit from the block, then v is recomputed within the block), then we are OK. • If we are not OK by one of the first two cases, then we need to generate the store instruction ST v, R to place a copy of v in its own memory location. This operation is called a spill.
  • 48.
    Selection of theregister Rx 48 UNIT 4: CODE GENERATIONNG • Since a new value of x is being computed, a register that holds only x is always an acceptable choice for Rx. • If y is not used after instruction I, and Ry holds only y after being loaded, Ry can also be used as Rx. A similar option holds regarding z and Rx.
  • 49.
    49 UNIT 4: PEEPHOLEOPTIMIZATIONS Characteristic of peephole optimizations • Redundant-instruction elimination • Flow-of-control optimizations • Algebraic simplifications • Use of machine idioms
  • 50.
    50 UNIT 4: CODEGENERATIONNG Redundant-instruction elimination • LD a, R0 ST R0, a • if debug == 1 goto L1 goto L2 L I : print debugging information L2:
  • 51.
    Flow-of-control optimizations 51 UNIT 4:CODE GENERATIONNG goto L1 ... Ll: goto L2 Can be replaced by: goto L2 ... Ll: goto L2 if a<b goto L1 ... Ll: goto L2 Can be replaced by: if a<b goto L2 ... Ll: goto L2
  • 52.
    Algebraic simplifications 52 UNIT 4:CODE GENERATIONNG • x=x+0 • x=x*1
  • 53.
    Register Allocation andAssignment 53 UNIT 4: CODE GENERATIONNG • Global Register Allocation • Usage Counts • Register Assignment for Outer Loops • Register Allocation by Graph Coloring
  • 54.
    Global register allocation 54 UNIT4: Global register allocation • Previously explained algorithm does local (block based) register allocation • This resulted that all live variables be stored at the end of block • To save some of these stores and their corresponding loads, we might arrange to assign registers to frequently used variables and keep these registers consistent across block boundaries (globally) • Some options are: – Keep values of variables used in loops inside registers – Use graph coloring approach for more globally allocation
  • 55.
    Usage counts 55 UNIT 4:CODE GENERATIONNG • For the loops we can approximate the saving by register allocation as: – Sum over all blocks (B) in a loop (L) – For each uses of x before any definition in the block we add one unit of saving – If x is live on exit from B and is assigned a value in B, then we ass 2 units of saving
  • 56.
    Flow graph ofan inner loop 56 UNIT 4: CODE GENERATIONNG
  • 57.
    57 UNIT 4: CODEGENERATIONNG Code sequence using global register assignment
  • 58.
    Register allocation byGraph coloring 58 UNIT 4: Register allocation • Two passes are used – Target-machine instructions are selected as though there are an infinite number of symbolic registers – Assign physical registers to symbolic ones • Create a register-interference graph • Nodes are symbolic registers and edges connects two nodes if one is live at a point where the other is defined. • For example in the previous example an edge connects a and d in the graph • Use a graph coloring algorithm to assign registers.
  • 59.
    Intermediate-code tree fora[i]=b+1 59 UNIT 4: CODE GENERATIONNG
  • 60.
  • 61.
    Dynamic Programming Algorithm 61 UNIT4: CODE GENERATIONNG • Compute bottom-up for each node n of the expression tree T an array C of costs, in which the ith component C[i] is the optimal cost of computing the subtree S rooted at n into a register, assuming i registers are available for the computation, for • Traverse T, using the cost vectors to determine which subtrees of T must be computed into memory. • Traverse each tree using the cost vectors and associated instructions to generate the final target code. The code for the subtrees computed into memory locations is generated first.  r i   1
  • 62.
    Syntax tree for(a-b)+c*(d/e)with cost vector at each node 62 UNIT 4: CODE GENERATIONNG
  • 63.
    Minimum cost ofevaluating the root registers available 63 UNIT 4: CODE GENERATIONNG • Compute the left subtree with two registers available into register R0, compute the right subtree with one register available into register R1, and use the instruction ADD R0, R0, R1 to compute the root. This sequence has cost 2+5+1=8. • Compute the right subtree with two registers available into R l , compute the left subtree with one register available into R0, and use the instruction ADD R0, R0, R1. This sequence has cost 4+2+1=7. • Compute the right subtree into memory location M, compute the left subtree with two registers available into register RO, and use the instruction ADD R0, R0, M. This sequence has cost 5+2+1=8.
  • 64.