A compilation project for third-year students of IMT Atlantique (formerly Telecom Bretagne).
The specification of the project can be found here (Authorization Required)
- How to clone the project
- The project structure
- How to build the compiler
- How to execute the compiler
- How to test the Compiler
- How to contribute to the Project
- To do list
- Problems
- Contributors
Using git clone
git clone https://redmine-df.telecom-bretagne.eu/git/f2b304_compiler_cn
Then enter your username and password at the prompt.
The project is divided into two parts:
- phase1: the first deliverable, which contains the lexical and syntax analyzers for the minijava language
- phase2: the second deliverable, which contains type checking, compilation and execution for the minijava language, built on the lexical and syntax analyzers provided by the professors
To build or execute the compiler, first enter one of the two deliverables, either phase1 or phase2:
cd ./phase1
or
cd ./phase2
Then build the compiler with the shell script build:
./build
or with ocamlbuild:
ocamlbuild Main.byte
Notes for contributors
The main file is Main/Main.ml; it should not be modified. It opens the given file, creates a lexing buffer, initializes the location and calls the compile function of the module Main/compile.ml. It is this function that you should modify to call your parser.
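As a rough illustration of the steps described in this note, the sketch below opens a file, creates a lexing buffer, records the file name in the buffer's position and hands the buffer to a compile function. It is not the project's Main/Main.ml: the names run and compile, and the string result returned by compile, are assumptions made for the example.

```ocaml
(* Stand-in for the compile function of Main/compile.ml (hypothetical
   signature). Here it only returns the file name stored in the buffer. *)
let compile (file : string) (lexbuf : Lexing.lexbuf) : string =
  ignore file;
  lexbuf.Lexing.lex_curr_p.Lexing.pos_fname

(* Open the file, create a lexing buffer, initialize the location,
   then call compile. The channel is closed in every case. *)
let run (file : string) : string =
  let ic = open_in file in
  try
    let lexbuf = Lexing.from_channel ic in
    lexbuf.Lexing.lex_curr_p <-
      { lexbuf.Lexing.lex_curr_p with Lexing.pos_fname = file };
    let result = compile file lexbuf in
    close_in ic;
    result
  with e ->
    close_in_noerr ic;
    raise e
```

Keeping the file handling in one place means that only the body of compile has to change when a parser is plugged in.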
Execute the compiler with the shell script minijavac:
./minijavac <filename>
or use the following command to build and then execute the compiler on the file named <filename>:
ocamlbuild Main.byte -- <filename>
By default, the program looks for a file with the extension .java and appends that extension to the given filename if the filename does not already end with it.
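The extension rule can be sketched in a few lines. The helper name with_java_extension is hypothetical and is not taken from the project sources; it only shows one way to express the behaviour described above with the standard Filename module.

```ocaml
(* Hypothetical helper: append ".java" unless the name already ends with it. *)
let with_java_extension (filename : string) : string =
  if Filename.check_suffix filename ".java" then filename
  else filename ^ ".java"

let () =
  (* Both calls resolve to the same file name. *)
  print_endline (with_java_extension "Foo");
  print_endline (with_java_extension "Foo.java")
```

Under this rule, ./minijavac Foo and ./minijavac Foo.java would refer to the same file.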
Test the compiler with the shell script test:
./test
It executes Main.byte on all files in the directory Evaluator.
If you are a team member of the project, please review the Guidelines for Contributing to this repository in order to make appropriate contributions
Deadline 15/01/2018
- Line Terminators
- Input Elements and Tokens
- White Space
- Comments
- Identifiers
- Keywords
- for
- while
- else
- if
- Literals
- Int
- String
- Separators
- brace
- parenthesis
- dot
- comma
- semicolon
- Operators
- = (Simple Assignment Operator)
- + - * / % (Arithmetic Operators)
- + - ++ -- ! (Unary Operators)
- == != > < <= >= (Equality and Relational Operators)
- && || (Conditional Operators)
- Keywords
- class
- static
- extends
- return
- new
- Classes
- Class Declaration
- simple class declaration
- simple class declaration with extends
- Field Declarations
- static Fields
- non-static Fields
- Method Declarations
- static Methods
- non-static Methods
Deadline 25/02/2018
Note for the reviewer: [ ] denotes an item that still needs to be done, while [x] denotes an item that has been done.
- The construction of the class definition environment. This environment contains the type of methods for each class. This phase ignores the attributes (which are not visible outside the class) and the method bodies.
  - create a class definition environment type called class_env; it contains 4 fields as follows
    - methods: a Hashtbl that maps from method name to method return type and argument types
    - constructors: a Hashtbl that maps from constructor name to class reference type and argument types
    - attributes: a Hashtbl that maps from attribute name to attribute type (declared type)
    - parent: a class reference type that refers to its parent class
  - create a Hashtbl that maps from class
- The second phase verifies that the inside of each class is correct (mainly the bodies of methods). It also checks that the top-level expressions are correct.
  - create 3 verification methods that verify the following aspects of the program
    - verify_methods, which checks the type of methods
      - create a local definition environment type called current_env that contains 4 fields as follows
        - returntype: the declared return type of the method
        - variables: a Hashtbl that maps from local variable name to local variable declared type
        - this_class: the id of the class
        - env_type: a string that identifies the kind of local definition environment; it can be constructor, method or attribute, and in this case the env_type is method
      - write a verification method (verify_declared_args) that checks the declared types of the variables in the method argument list
        - check whether a duplicate local variable exists
      - write a verification method (verify_statement) that checks the body of the method
        - check variable declaration statement
        - check block of statement
        - check expression
        - check return statement when it's none, ex: return;
        - check return statement when it's not none, ex: return x;
        - check throw statement
          - it does not check whether the exception type, or a supertype of that exception type, is mentioned in a throws clause in the declaration of the method; that should be checked during compilation
        - check while statement
        - check if without else statement
        - check if with else statement
        - check for statement
        - check try statement
    - verify_constructors, which checks the type of constructors
      - same as verify_methods, except for the following minor differences
        - returntype in the local definition environment current_env is a reference to the class it belongs to
        - env_type in the local definition environment current_env is constructor
        - the return statement check in verify_statement is slightly different, since constructors can have return; but not something like return x;
    - verify_attributes, which checks the type of attributes
      - create a local definition environment type called current_env that contains 4 fields as follows
        - returntype: since attributes have no return value, it is set to Type.Void
        - variables: a Hashtbl that maps from local variable name to local variable declared type
        - this_class: the id of the current class
        - env_type: which is attribute here
      - write a verification method (verify_expression) that checks the declared type of an expression. In verify_expression:
        - check New expression type, which instantiates a class
        - check NewArray expression type, which declares an array like: new int[5]
        - check Call expression type, which calls a method. The this keyword is not checked when calling a method. For the moment, only the case where the class name already exists in the class_env hashtable is supported.
        - check Attr expression type, which accesses an attribute
        - check If expression type
        - check Val expression type, which is a primitive value such as an int or a string in an expression
        - check Name expression type, which represents a variable
        - check ArrayInit expression type, which initializes an array like {1,2,3}
        - check Array expression type (TODO). This part has not been done for the moment
        - check AssignExp expression type, which is an assignment operation
        - check Post expression type, which is a postfix operation like a++, b--
        - check Pre expression type, which is a prefix operation like !a, ~b
        - check Op expression type, which is an operation like ||, &&, +, -
        - check CondOp expression type, which is a conditional operation like a ? b : c
        - check Cast expression type
        - check Type expression type
        - check ClassOf expression type
        - check Instanceof expression type
        - check VoidClass expression
      - write a verification method (verify_assignop_type) that checks that the declared type of an attribute matches the type of the expression. It has three inputs:
        - t1: the type of an attribute
        - t2: the type of the corresponding expression of an attribute
        - op: the type of operation, here Type.Assign
  - add support for the this keyword within a class in order to type-check statements like this.a = 5;
  - add the location to exception messages in order to locate errors
- add support for overloaded methods and constructors
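The two environments described in this list could be declared roughly as follows. This is only a sketch under stated assumptions: the typ variant is a placeholder for the project's own type representation, the exact value types stored in each Hashtbl are guesses based on the field descriptions, and the signature of verify_declared_args is hypothetical. The exception name ArgumentAlreadyExists is the one this document lists for duplicated method arguments.

```ocaml
(* Placeholder for the project's type representation (an assumption). *)
type typ = Void | Int | Boolean | Ref of string

exception ArgumentAlreadyExists of string

(* Class definition environment with the four fields described above. *)
type class_env = {
  methods : (string, typ * typ list) Hashtbl.t;      (* name -> return type, argument types *)
  constructors : (string, typ * typ list) Hashtbl.t; (* name -> class reference type, argument types *)
  attributes : (string, typ) Hashtbl.t;              (* name -> declared type *)
  parent : typ;                                      (* class reference type *)
}

(* Local definition environment used while checking one body. *)
type current_env = {
  returntype : typ;
  variables : (string, typ) Hashtbl.t; (* local name -> declared type *)
  this_class : string;
  env_type : string;                   (* "constructor", "method" or "attribute" *)
}

(* Register each declared argument as a local variable and reject
   a name that has already been registered. *)
let verify_declared_args (env : current_env) (args : (typ * string) list) : unit =
  List.iter
    (fun (t, id) ->
      if Hashtbl.mem env.variables id then raise (ArgumentAlreadyExists id)
      else Hashtbl.add env.variables id t)
    args
```

A fresh current_env would be created for every method, constructor or attribute, so that local names from one body never leak into another.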
- ArgumentAlreadyExists
- when a duplicated argument is found in a constructor argument list -> ArgumentAlreadyExists("[pident of argument]")
- when a duplicated argument is found in a method argument list -> ArgumentAlreadyExists("[pident of argument]")
- ArgumentTypeNotExiste
- when the arguments of a called function do not exist in any declared method
- ArgumentTypeNotMatch
- when the arguments of a called function do not match a declared method -> ArgumentTypeNotMatch("Arguments' type in "^meth_name^" not match")
- AttributeAlreadyExists
- when a duplicated attribute is found in the class definition environment -> AttributeAlreadyExists("[aname of attribute]")
- ClassAlreadyExists
- ConstructorAlreadyExists
- when a duplicated constructor is found in the class definition environment -> ConstructorAlreadyExists("[cname of constructor]")
- DuplicateLocalVariable
- when a duplicated variable is found in a variable declaration statement (VarDecl) -> DuplicateLocalVariable("[declared type] [variable id]")
- when a duplicated variable is found in the init part of a for loop statement (For(fil,eo,el,s)) -> DuplicateLocalVariable("[declared type] [variable id]")
- this exception is also raised when a duplicated variable is found in a variable declaration statement in the body of a block, if, if else, for or while statement -> DuplicateLocalVariable("[declared type] [variable id]")
- IncompatibleTypes
- when a constructor tries to return a value -> IncompatibleTypes("unexpected return value")
- when a method return statement does not contain a value -> IncompatibleTypes("missing return value")
- when the returned type of a method does not correspond to the declared one -> IncompatibleTypes("missing return value")
- when the condition in an if statement is not boolean -> IncompatibleTypes("[actual type] cannot be converted to boolean")
- when the condition in an if else statement is not boolean -> IncompatibleTypes("[actual type] cannot be converted to boolean")
- when the loop condition in a for statement is not boolean -> IncompatibleTypes("[actual type] cannot be converted to boolean")
- when the loop condition in a while statement is not boolean -> IncompatibleTypes("[actual type] cannot be converted to boolean")
- InvalidMethodDeclaration
- when a method declaration does not have a return type -> InvalidMethodDeclaration("return type required")
- MethodAlreadyExists
- when a duplicated method is found in the class definition environment -> MethodAlreadyExists("[mname of method]")
- UnknownActualType
- when the actual type of a variable cannot be determined in a variable declaration statement (VarDecl) -> UnknownActualType("[edesc] don't have type information")
- when the actual type of a variable cannot be determined in the init part of a for loop statement (For(fil,eo,el,s)) -> UnknownActualType("[edesc] don't have type information")
- when the actual type of a variable cannot be determined in the condition part of a while loop statement (While(e,s)) -> UnknownActualType("[edesc]: unknow type in while condition")
- when the actual type of a variable cannot be determined in the condition part of an if statement (If(e,s,None)) -> UnknownActualType("[edesc]: unknow type in if condition")
- when the actual type of a variable cannot be determined in the condition part of an if else statement (If(e,s,Some s2)) -> UnknownActualType("[edesc]: unknow type in if else condition")
- UnknownVariable
- when the variable does not exist in the current environment or the global environment -> UnknownVariable("[variable_name]")
- UnknownClass
- UnknownMethod
- WrongTypePrefixOperation
- when the types in a prefix operation do not match -> WrongTypePrefixOperation("[operation, expr]")
- WrongTypePostfixOperation
- when the types in a postfix operation do not match -> WrongTypePostfixOperation("[operation, expr]")
- WrongInvokedArgumentsLength
- when actual and formal argument lists differ in length -> WrongInvokedArgumentsLength()
- WrongTypesAssignOperation
- when the types in an assignment operation do not match -> WrongTypesAssignOperation("[expr1_type, op, expr2_type]")
- WrongTypesOperation
- when the types in an operation do not match -> WrongTypesOperation("[expr1_type, op, expr2_type]")
- errors related to overloading
- errors related to generic types
- errors related to the this keyword
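To make the IncompatibleTypes cases above concrete, the sketch below declares that exception and raises it with the messages quoted in this list. The typ variant and the function names check_condition and check_return are assumptions for illustration. The comparison between a returned type and the declared one is left out on purpose.

```ocaml
(* Placeholder type representation (an assumption). *)
type typ = Void | Int | Boolean

exception IncompatibleTypes of string

let string_of_typ = function
  | Void -> "void"
  | Int -> "int"
  | Boolean -> "boolean"

(* Conditions of if, if else, for and while statements must be boolean. *)
let check_condition (actual : typ) : unit =
  match actual with
  | Boolean -> ()
  | t ->
    raise (IncompatibleTypes (string_of_typ t ^ " cannot be converted to boolean"))

(* A constructor may contain "return;" but not "return x;".
   A method with a non-void declared type must return a value. *)
let check_return (env_type : string) (declared : typ) (returned : typ option) : unit =
  match env_type, returned with
  | "constructor", Some _ -> raise (IncompatibleTypes "unexpected return value")
  | "constructor", None -> ()
  | _, None ->
    if declared <> Void then raise (IncompatibleTypes "missing return value")
  | _, Some _ -> () (* comparing with the declared type is omitted here *)
```

Carrying the message as the exception argument keeps the error text next to the check that produced it.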
Evaluation and execution
Construction of the class descriptor table and the method table:
- class descriptor table: named classTable, of type (string, globalClassDescriptor) Hashtbl.t
- method table: named methodTable, of type (string, astmethod) Hashtbl.t
The bodies of the functions of all classes are saved in methodTable. All other contents of the classes are saved in classTable. The function name and the parameter types recorded in the class descriptor are used to find the body of the function in methodTable.
functions:
- class descriptor of a class: func_name_typeOfpara1,typeOfpara2 -> classname_func_name_typeOfpara1,typeOfpara2
- method table: classname_func_name_typeOfpara1,typeOfpara2 -> astmethod
In this way, functions that have the same name but different parameter types are permitted in the compilation.
constructors:
- name: typeOfpara1,typeOfpara2
- content: astconst
In this way, constructors with different parameter types are permitted.
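The key scheme for functions can be sketched as below. The helper names signature, descriptor_key and method_key are hypothetical, and a plain string stands in for astmethod so that the example stays self-contained. The point is only that two functions with the same name but different parameter types end up under different keys.

```ocaml
(* "typeOfpara1,typeOfpara2" *)
let signature (param_types : string list) : string =
  String.concat "," param_types

(* Key kept in the class descriptor: func_name_typeOfpara1,typeOfpara2 *)
let descriptor_key (func_name : string) (param_types : string list) : string =
  func_name ^ "_" ^ signature param_types

(* Key used in the method table: classname_func_name_typeOfpara1,typeOfpara2 *)
let method_key (class_name : string) (func_name : string)
    (param_types : string list) : string =
  class_name ^ "_" ^ descriptor_key func_name param_types

(* A string replaces astmethod in this sketch. *)
let method_table : (string, string) Hashtbl.t = Hashtbl.create 16

let () =
  (* Two overloads of f in class A are stored under distinct keys. *)
  Hashtbl.replace method_table (method_key "A" "f" [ "int" ]) "body of f(int)";
  Hashtbl.replace method_table
    (method_key "A" "f" [ "int"; "boolean" ])
    "body of f(int, boolean)"
```

A lookup then rebuilds the key from the class name, the function name and the argument types of the call.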
Please note that overriding is not supported by the type checker. To test overriding, remove the type-checking function first.
- ParentClassNotDefined: raised when the parent class is not defined in the file
- SameFunctionAlreadyDefined: raised when functions of a class have the same name and the same parameter types
- SameFunctionConstructorsDefined: raised when constructors of a class have the same name and the same parameter types
- no support for method overloading
- no support for generic types
- no support for typing related to the this keyword
- First part: Lexical and syntactic analyzers
- Expression: Shuwei ZHANG & Jinhai ZHOU
- Classes: Xiaofeng ZHOU & Keyu PU
- Second part: The Type-checking and the Execution
- Type-checking: Shuwei ZHANG & Jinhai ZHOU
- Execution: Xiaofeng ZHOU & Keyu PU
This work is licensed under a Creative Commons Attribution 4.0 International License.