The source code of the compiler is distributed across the following directories and files:
- ir/
  - canon.sml: Contains the code for Tree IR canonization
  - frame.sml: Represents the stack frame used while translating the AST to Tree IR
  - info.sml: Contains the info required while translating the AST to Tree IR
  - prettyTree.sml: Contains the code for pretty printing the Tree IR code
  - translate.sml: Contains the code for translating the AST to Tree IR
  - tree.sig: Contains the signature of the Tree IR
  - tree.sml: Contains the structure of the Tree IR
- target/
  - codeGen.sml: Contains the code for generating the MIPS code from the Canonicalized Tree IR
  - convToMIPS.sml: Contains helper functions to create MIPS assembly statements
  - mips.sig and mips.sml: Signature and structure of the MIPS AST
  - mipsInst.sml: Contains the structure for Mips-Inst-Basic-Blocks
  - prettyMips.sml: Contains helper functions for the pretty printing of the MIPS assembly code
- tiger/
  - ast.sig and ast.sml: Signature and structure of the Tiger AST
  - convToTiger.sml: Contains helper functions to create Tiger expressions
  - prettyTigerAST.sml: Contains helper functions for the pretty printing of the Tiger expressions
  - tiger.grm: Grammar of the Tiger language
  - tiger.lex: Lexical analysis file
- utils/
  - basicBlocks.sml: Contains the code for the basic blocks assignment
  - graph.sml: Contains the code for the graphs assignment
  - temp.sml: Structure to create new temporary values and labels
  - utils.sml: General purpose helper functions
- tc.mlb: The ML-Basis file for compilation
- tc.sml: The main code where the execution begins. This uses the lexer-parser, translator and pretty printers to generate MIPS assembly from the Tiger source code
We have made the following design choices:
- Everything is kept on the stack. Even variables declared at the top level live on the stack, since the top-level code is itself treated as the body of a function declaration.
- Currently, we don't have a register allocation algorithm. Most computations therefore load their operands from the stack into temporary registers, perform the operation, and store the result back onto the stack. This makes the compiled program somewhat slow.
- There are two built-in functions for printing expression values, print and println, which print the value without and with a trailing newline, respectively.
- The error checking mechanism is quite weak as of now.
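
The load, compute, store pattern described above can be sketched as follows. This is only an illustration, not the code in target/codeGen.sml: the helper names (`offset`, `load`, `store`, `addOnStack`), the choice of `$fp` as the base register, and the concrete offsets are all assumptions made for the example.

```sml
(* Hypothetical sketch: emit MIPS for `dst := a + b` when a, b and dst
   all live in stack slots addressed relative to $fp (an assumption). *)

(* Int.toString prints negatives with "~", so format offsets by hand. *)
fun offset n = if n < 0 then "-" ^ Int.toString (~n) else Int.toString n

fun load (reg, off) = "lw " ^ reg ^ ", " ^ offset off ^ "($fp)"
fun store (reg, off) = "sw " ^ reg ^ ", " ^ offset off ^ "($fp)"

(* Without register allocation, every operation reloads its operands
   from the stack and writes its result straight back. *)
fun addOnStack {dst, a, b} =
  [ load ("$t0", a)
  , load ("$t1", b)
  , "add $t2, $t0, $t1"
  , store ("$t2", dst) ]

val example = addOnStack {dst = ~12, a = ~4, b = ~8}
```

Here `example` is a list of four instructions: two loads, one `add`, and one store. The extra memory traffic on every operation is the cost of skipping register allocation.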
The following restrictions on the generated Tree IR reduce the work left for canonization and code generation.
- BINOP can take only temporaries as input
- All expressions should be of the form Tree.ESEQ (stmt, resultTemp)
  - This way we can always extract the statement and the resultTemp from the ESEQ, so no ESEQ nodes remain in the final Tree IR code
- CALL is used for an actual function call only by the print and println functions. These are the only functions that take an argument list, consisting of one register. For all other functions, arguments are pushed onto the stack.
- CALL is always the child of an EXP
- Return values can be passed back in the return value register
- MOVE can happen between MEM and TEMP, TEMP and MEM, TEMP and BINOP applied on TEMPs, TEMP and CONST
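
The restrictions above can be made concrete with a small sketch. The datatype below is a simplified, hypothetical stand-in for the real definitions in ir/tree.sig and ir/tree.sml: temporaries are plain integers, function names are strings, and the checker functions (`isTemp`, `okBinop`, `okMove`, `okExp`, `split`) are introduced only for illustration. It also assumes MOVE takes its arguments in (destination, source) order.

```sml
(* Simplified, hypothetical Tree IR; not the project's actual tree.sml. *)
datatype binop = PLUS | MINUS | MUL | DIV

datatype exp = CONST of int
             | TEMP of int
             | MEM of exp
             | BINOP of binop * exp * exp
             | CALL of string * exp list
             | ESEQ of stm * exp
and stm = MOVE of exp * exp          (* assumed order: (dst, src) *)
        | EXP of exp
        | SEQ of stm * stm

fun isTemp (TEMP _) = true
  | isTemp _ = false

(* BINOP may take only temporaries as operands. *)
fun okBinop (BINOP (_, l, r)) = isTemp l andalso isTemp r
  | okBinop _ = false

(* MOVE is allowed only for the four listed shapes. *)
fun okMove (MOVE (MEM _, TEMP _)) = true
  | okMove (MOVE (TEMP _, MEM _)) = true
  | okMove (MOVE (TEMP _, CONST _)) = true
  | okMove (MOVE (TEMP _, e as BINOP _)) = okBinop e
  | okMove _ = false

(* Every translated expression has the form ESEQ (stmt, resultTemp). *)
fun okExp (ESEQ (_, TEMP _)) = true
  | okExp _ = false

(* Splitting the ESEQ yields a statement and a temporary, so no ESEQ
   needs to survive in the final code. *)
fun split (ESEQ (s, t)) = (s, t)
  | split _ = raise Fail "expression is not of the form ESEQ (stmt, temp)"

(* 2 + 3 under these restrictions: operands go into temporaries first. *)
val sum =
  ESEQ (SEQ (MOVE (TEMP 1, CONST 2),
        SEQ (MOVE (TEMP 2, CONST 3),
             MOVE (TEMP 3, BINOP (PLUS, TEMP 1, TEMP 2)))),
        TEMP 3)
```

With this shape, `split sum` returns the statement sequence together with `TEMP 3`, which holds the result. A node such as `MOVE (TEMP 3, BINOP (PLUS, CONST 2, TEMP 2))` would be rejected by `okMove`, because one operand of the BINOP is not a temporary.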