CODE GENERATION
UNIT-5
Learning Objectives
 To understand the issues in generating code
Keyword : code generation
Introduction
Definition:
In computing, code generation is the process by which a
compiler's code generator converts some intermediate representation
of source code into a form (e.g., machine code) that can be readily
executed by a machine. Sophisticated compilers typically perform
multiple passes over various intermediate forms.
Cont.….
 So far we have covered the front end completely:
 the lexical, syntax, and semantic analysis phases
 After that came intermediate code generation, where we
converted the source code into three-address code
 The front end, the code optimizer, and the code generator all access the
symbol table
Cont.…
 Code produced by the compiler must be correct
 The source-to-target program transformation must be semantics-preserving
 Code produced by the compiler should be of high quality
 It should make effective use of target machine resources
 The generated code should be as close to optimal as practical
Issues in the design of Code Generator
 Input to the code generator
 Target Program
 Memory Management
 Instruction selection
 Register allocation
Input to the code generator:
 Intermediate representation of the source program
 Linear – Postfix
 Tables – Quadruples, Triples, Indirect triples
 Non-linear – AST, DAG
 In addition, symbol table information must be passed to the
code generator
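The table forms listed above can be sketched in a few lines. This is only an illustration, assuming Python tuples as the record layout; the helper name `to_triples` is hypothetical and not part of the slides. A quadruple names its result explicitly, while a triple is referred to by its position.

```python
# Three-address code used as the running example:
#   t1 := a + b ; t2 := c + d ; t3 := e * t2 ; t4 := t1 - t3
# Quadruple layout (assumed here): (op, arg1, arg2, result)
quadruples = [
    ("+", "a", "b", "t1"),
    ("+", "c", "d", "t2"),
    ("*", "e", "t2", "t3"),
    ("-", "t1", "t3", "t4"),
]


def to_triples(quads):
    """Convert quadruples to triples (op, arg1, arg2).

    A temporary is replaced by the integer index of the triple that
    computes it, so no result field is needed.
    """
    position = {}  # temporary name -> index of the triple defining it
    triples = []
    for op, arg1, arg2, result in quads:
        triples.append((op, position.get(arg1, arg1), position.get(arg2, arg2)))
        position[result] = len(triples) - 1
    return triples


triples = to_triples(quadruples)
# Indirect triples: a separate list of indices gives the execution order,
# so statements can be reordered without renumbering the triples.
indirect = list(range(len(triples)))
```

Here string operands are program variables and integer operands are references to earlier triples.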
Target Program:
 The back-end code generator of a compiler may produce different forms of
code, depending on the requirements.
 Absolute machine code – Executable
 Relocatable machine code – Object files
 Assembly language
 Bytecode for virtual machines such as the JVM
 Implementing code generation requires a thorough understanding of
the target machine architecture and its instruction set
 Our machine:
 Byte-addressable (word = 4 bytes)
 Has n general-purpose registers R0, R1, ..., Rn-1
 Two-address instructions of the form
 OP SOURCE, DESTINATION
Target machine:
Op-codes & Addressing modes
 Op-codes, for example
 MOV (move content of source to destination)
 ADD (add content of source to destination)
 SUB (subtract content of source from destination)
 It has 6 addressing modes
Instruction selection : Cost
 The machine is a simple, non-superscalar processor with fixed
instruction costs
 Realistic machines have deep pipelines, I-cache, D-cache, etc.
 Define the cost of an instruction as
 1 + cost(source mode) + cost(destination mode)
Instruction selection :
Addressing modes
 Suppose we translate "a:=b+c" into
MOV b,R0
ADD c,R0
MOV R0,a
 Assuming the addresses of a, b and c are stored in R0, R1 and R2
MOV *R1,*R0
ADD *R2,*R0
 Assuming R1 and R2 contain the values of b and c
ADD R2,R1
MOV R1,a
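The three translations above can be compared with the cost formula. The sketch below is a rough illustration, not the machine's definition: the per-mode costs are an assumption (register and register-indirect operands add 0, a memory name adds 1 because its address occupies an extra word), and the function names are hypothetical.

```python
def operand_cost(operand: str) -> int:
    """Added cost of one operand (ASSUMED values, not taken from the slides)."""
    name = operand.lstrip("*")  # *R0 is register-indirect
    if name.startswith("R") and name[1:].isdigit():
        return 0  # register or register-indirect: no extra word
    return 1      # memory name: address stored in the instruction


def instruction_cost(instruction: str) -> int:
    """cost = 1 + cost(source mode) + cost(destination mode)."""
    _, operands = instruction.split(maxsplit=1)
    source, destination = (part.strip() for part in operands.split(","))
    return 1 + operand_cost(source) + operand_cost(destination)


def sequence_cost(instructions) -> int:
    """Total cost of a straight-line instruction sequence."""
    return sum(instruction_cost(i) for i in instructions)


via_memory = ["MOV b,R0", "ADD c,R0", "MOV R0,a"]
via_pointers = ["MOV *R1,*R0", "ADD *R2,*R0"]
via_registers = ["ADD R2,R1", "MOV R1,a"]

for label, code in [("memory operands", via_memory),
                    ("addresses in registers", via_pointers),
                    ("values in registers", via_registers)]:
    print(label, sequence_cost(code))
```

Under these assumed costs the sequences cost 6, 2 and 3 respectively, which shows why instruction selection depends on where the operands already are.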
Need for Global code optimization:
 Suppose we translate the three-address code "x:=y+z" to:
MOV y,R0
ADD z,R0
MOV R0,x
This works for a single statement taken in isolation.
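A statement-by-statement translator of this kind might look like the sketch below. It is a minimal illustration under assumptions: statements have the exact shape `x:=y op z`, a single fixed register is used, and the names `OPCODES` and `translate` are hypothetical.

```python
# Operator symbol -> op-code of the two-address target machine.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def translate(statement: str, register: str = "R0"):
    """Translate 'x:=y op z' into load / operate / store.

    Each statement is handled in isolation: nothing is remembered
    about what the register already holds.
    """
    target, expression = (s.strip() for s in statement.split(":="))
    for symbol, opcode in OPCODES.items():
        if symbol in expression:
            left, right = (s.strip() for s in expression.split(symbol))
            return [
                f"MOV {left},{register}",      # load first operand
                f"{opcode} {right},{register}",  # combine with second operand
                f"MOV {register},{target}",    # store the result
            ]
    raise ValueError(f"unsupported statement: {statement}")


print(translate("x:=y+z"))
```

Because the translator has no memory between calls, concatenating the output for several statements can reload a value that is still in the register.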
Cont…..
 For two consecutive statements:
EX: a:=b+c
d:=a+e
to:
MOV b,R0
ADD c,R0
MOV R0,a
MOV a,R0 (redundant: R0 already holds a)
ADD e,R0
MOV R0,d
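One way to remove such a redundant move is a small peephole pass over adjacent instructions. The sketch below is an assumption-laden illustration: it only looks at neighbouring `MOV` pairs, it assumes the code is a single basic block (no jump lands on the second instruction), and the function names are hypothetical.

```python
def parse(instruction: str):
    """Split 'OP source,destination' into its three parts."""
    opcode, operands = instruction.split(maxsplit=1)
    source, destination = (p.strip() for p in operands.split(","))
    return opcode, source, destination


def drop_redundant_moves(code):
    """Remove a MOV that merely copies back what the previous MOV copied.

    After 'MOV R0,a' both R0 and a hold the same value, so an
    immediately following 'MOV a,R0' changes nothing.
    """
    result = []
    for instruction in code:
        if result:
            prev_op, prev_src, prev_dst = parse(result[-1])
            op, src, dst = parse(instruction)
            if prev_op == op == "MOV" and prev_src == dst and prev_dst == src:
                continue  # skip the redundant move
        result.append(instruction)
    return result


naive = ["MOV b,R0", "ADD c,R0", "MOV R0,a",
         "MOV a,R0", "ADD e,R0", "MOV R0,d"]
print(drop_redundant_moves(naive))
```

The pass shortens the six-instruction sequence to five; a global optimizer could go further by tracking register contents across more than two instructions.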
Register Allocation and Assignment:
 Efficient utilization of the limited set of registers is important for
generating good code
 Registers are assigned in two steps:
 Register allocation selects the set of variables that will reside in
registers at a point in the code
 Register assignment picks the specific register that each variable will
reside in at that point
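The difference between the two steps can be shown with a toy example. This is only a sketch: ranking variables by how often they are used is an assumed heuristic chosen for brevity (practical allocators rely on liveness information), and the names `allocate` and `assign` are hypothetical.

```python
from collections import Counter


def allocate(uses, num_registers):
    """Allocation: decide WHICH variables get to live in registers.

    ASSUMED heuristic: keep the most frequently used variables,
    breaking ties alphabetically.
    """
    counts = Counter(uses)
    ranked = sorted(counts, key=lambda var: (-counts[var], var))
    return ranked[:num_registers]


def assign(allocated):
    """Assignment: decide WHICH register each chosen variable occupies."""
    return {var: f"R{index}" for index, var in enumerate(allocated)}


uses = ["a", "b", "a", "c", "a", "b", "d"]  # variable references in a code region
chosen = allocate(uses, num_registers=2)
mapping = assign(chosen)
print(chosen, mapping)
```

With two registers, `a` and `b` are allocated and then assigned to `R0` and `R1`; `c` and `d` stay in memory.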
Choice of Evaluation Order:
Expression: a+b-(c+d)*e
Three-address code:
t1:=a+b
t2:=c+d
t3:=e*t2
t4:=t1-t3
Generated code:
MOV a,R0
ADD b,R0
MOV R0,t1
MOV c,R1
ADD d,R1
MOV e,R0
MUL R1,R0
MOV t1,R1
SUB R0,R1
MOV R1,t4
Reordered Instructions and code:
Three-address code:
t2:=c+d
t3:=e*t2
t1:=a+b
t4:=t1-t3
Generated code:
MOV c,R0
ADD d,R0
MOV e,R1
MUL R0,R1
MOV a,R0
ADD b,R0
SUB R1,R0
MOV R0,t4
Cont.….
 When instructions are independent, their evaluation order can be
changed to use registers better and save instructions
 The code generator itself can perform this reordering
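That the two orderings are equivalent can be checked by executing both. The sketch below is a tiny interpreter written for illustration only, assuming just the MOV, ADD, SUB and MUL op-codes with register and memory-name operands; the function `run` and the sample values are hypothetical.

```python
def run(code, memory):
    """Execute 'OP source,destination' instructions and return final memory."""
    memory = dict(memory)  # work on a copy
    registers = {}

    def is_register(name):
        return name[0] == "R" and name[1:].isdigit()

    def read(name):
        return registers[name] if is_register(name) else memory[name]

    def write(name, value):
        (registers if is_register(name) else memory)[name] = value

    operations = {
        "ADD": lambda dst, src: dst + src,
        "SUB": lambda dst, src: dst - src,  # subtract source from destination
        "MUL": lambda dst, src: dst * src,
    }
    for instruction in code:
        opcode, operands = instruction.split(maxsplit=1)
        source, destination = (p.strip() for p in operands.split(","))
        if opcode == "MOV":
            write(destination, read(source))
        else:
            result = operations[opcode](read(destination), read(source))
            write(destination, result)
    return memory


original = ["MOV a,R0", "ADD b,R0", "MOV R0,t1", "MOV c,R1", "ADD d,R1",
            "MOV e,R0", "MUL R1,R0", "MOV t1,R1", "SUB R0,R1", "MOV R1,t4"]
reordered = ["MOV c,R0", "ADD d,R0", "MOV e,R1", "MUL R0,R1",
             "MOV a,R0", "ADD b,R0", "SUB R1,R0", "MOV R0,t4"]

values = {"a": 20, "b": 5, "c": 2, "d": 1, "e": 4}
print(run(original, values)["t4"], len(original))
print(run(reordered, values)["t4"], len(reordered))
```

Both sequences leave the same value in `t4`, but the reordered one needs 8 instructions instead of 10 because `t1` never has to be stored to memory and reloaded.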
ANY QUERIES?
DARSHAN SAI REDDY.U
15121A15B2
