CSE 304/504: Compiler Design

Spring 2016

Homework #4, Type Checking & Name Resolution

Due: April 4


Contents


Read the following description carefully before starting on the assignment. Also, go through the brief overview of the project path that describes the steps involved. A list of what materials are available and what you are expected to submit are also given.

Homework Description

In this assignment, you will build a type checker for Decaf. Recall the description of Decaf's syntax is given in its manual, and the AST you built in HW #3. The type checker you will build in this assignment will:

As in the past, you will produce a compiler front end that will take a single command-line argument, which is a file name; if that file contains a syntactically valid Decaf program, then it will construct an AST for that program. Upon successful construction of the AST, it check the type constraints as outlined in this handout (see section on type checking), and resolve any names that could not be resolved when the AST was constructed (see section on name resolution). Any errors found during type checking and name resolution should be reported with sufficient detail (including the nature of the error and the line numbers in the source program corresponding to the error). If type checking and name resolution proceeded without any error, it will then print the AST decorated with types to standard output (see section on printing).

  1. Changes to the AST Structure
  2. Type Checking Constraints
  3. Name Resolution
  4. Additional Error Checks
  5. Printing the AST
  6. Initialization
  7. CSE 504 vs. CSE 304

Changes to AST Structure:

The AST structure remains mostly unchanged from HW #3. The changes are as follows. (AST structures not mentioned below are unchanged).

Type Record:

From HW #3:
Each type is either an array type are an elementary type. Elementary types, in turn, may be one of built-in types: int, float, boolean, or string; or a user-defined type, which is a class name.

An array type is parameterized by its "base type", which is another type.

Subtype Relation:

The type checker will rely on a subtype relation defined over types. This relation follows the discussion in class, with one addition: for the purposes of type checking, we will consider int to be a subtype of float. In summary:

Changes to Expressions:

Three expressions--- field access, method invocation, and new object creation--- will all have a new piece of information in the AST denoting the actual field/method/constructor accessed in this expression. This information is determined by name resolution. For instance, consider a field access expression e.x. With name resolution, (in the absence of any errors), we will be able to determine that the expression accesses field with id j (note field id's are unique). Then the field access expression e.x will contain information that the expression refers to field j. Similarly, with name resolution, we will determine the unique method or constructor used in a method invocation or new object creation expressions (respectively).

Type Checking Constraints

Statement Record:

Each statement record will now have an additional field indicating whether or not the statement is type correct (i.e. has no type errors).
  1. If-stmt: This statement is type correct if if the condition part is of boolean type; and the "Then" and "Else" parts are type correct.
  2. While-stmt: This statement is type correct if if the condition part is of boolean type; and the loop body is type correct.
  3. For-stmt: This statement is type correct if if the condition part is of boolean type; and the initializer expression, update expression and the loop body are all type correct.
  4. Return-stmt: This statement is type correct if the type of the expression matches the declared return type of the current method.
  5. Expr-stmt: This statement is type correct if the expression is type correct.
  6. Block-stmt: This statement is type correct if all the statements in the sequence are type correct.

Expression Record:

Each expression record will now have an additional field indicating its type.
  1. Constant-expressions:
  2. Var-expression: has, as its type, the declared type of the corresponding variable.
  3. Unary-expressions:
  4. Binary-expressions:
  5. Assign-expression: Consider e1 = e2. The type of this expression is e2's type, provided:
  6. Auto-expression: has the same type as its argument expression e if e's type is int or float; has type error otherwise.
  7. Field-access-expression: Let p.x be the field access expression, and let z be the field obtained by name resolution. Then this expression's type is same as z's declared type, provided: This expressions type is error if name resolution had errors or the above conditions do not hold.
  8. Method-call-expression: Let p.f(e1, ..., en) be the method access expression, and let h be the method obtained by name resolution. Then this expression's type is same as h's declared return type, provided: This expressions type is error if name resolution had errors or the above conditions do not hold.
  9. New-object-expression: Let A(e1,...,en) be the constructor and arguments in this expression. The type of this expression is user(A) if name resolution succeeds.
  10. This-expression: has type user(A) where A is the current class.
  11. Super-expression: has type user(B) if B is the immediate superclass of the current class; and error if the current class has no superclass.
  12. Class-reference-expression: has type class-literal(A) where A is the literal class name in this expression.
  13. Array-access-expression:504 only has type T if the base expression's type is array(T) and the index expression's type is int; has type error otherwise.
  14. New-array-expression:504 only Let T be the base type and e1,...,en be the sequence of array dimensions. Then, this expression has type array(array(...repeated n times (T)...)), provided each expression ei is of type int. This expression has type error if any of the dimension expressions is not of int type.

Name Resolution:

Decaf resolves overloaded names statically, at compile-time, using type information.

Note that all local variables are resolved while constructing the AST. The only names that are not resolved at that time are the fields, methods and constructors.

Fields:

Consider a field access expression of the form e.x for some base expression e.

Methods:Simpler for 304

Consider a field access expression of the form e.f(e1,...,en) for some base expression e.

Let the types of e1,...,en be T1,...,Tn, and all the argument expressions are type correct.

For CSE 304 students: note that you do not have to deal with method overloading. Hence, you need to find the applicable method, and if successful, whether it is accessible.

Constructors:Simpler for 304

Consider a new object creation expression of the form A(e1,...,en) for some class A.

Let the types of e1,...,en be T1,...,Tn, and all the argument expressions are type correct.

The constructor resolved for this expression is the constructor with id j declared in class A such that:

For CSE 304 students: note that you do not have to deal with constructor overloading. Hence, you need to find the applicable constructor, and if successful, whether it is accessible.

Additional error checks:504 only

While performing type checking, when an error is encountered, you are expected to print out information about the error (e.g. "Type error at line... int expression found when boolean is expected."). In addition, you should detect the following errors related to overloaded methods and constructors.

Note that there may be multiple methods (and, of course, constructors) with the same name defined in a single class. For overloaded methods/constructors, we will now impose an additional constraint. For every pair of methods with the same name (or constructors) defined in a single class:

  1. Let T1 → S1 and T2 → S2 be their signatures. Then either T1 is a strict subclass of T2 or T2 is a strict subclass of T1.

    Here, we say that T is a strict subclass of T' if T is a subclass of T', and T is not identical to T'.

Your type checker is expected to check the above constraint and report any errors.

Printing the AST

AST printing will follow the same form as in HW #3 with the following changes:

Initialization

Same as in HW #3.

CSE 504 vs CSE 304

Project Path

  1. Start with a working AST builder.
  2. Assume no method invocations, constructors or field accesses; Write a type checker that works on this restricted subset of programs. You may want to work with even smaller subsets than these.
  3. Proceed by adding field accesses first, and finally method invocations and constructors.

Deliverables

Your front-end for a Decaf compiler should be organized into the following files:
  1. Lexer:    decaflexer.py, PLY/lex scanner specification file. (unchanged from HW2)
  2. Parser:    decafparser.py, PLY/yacc parser specification file. (unchanged from HW3)
  3. AST:    ast.py, table and class definitions for Decaf's AST (modified from HW3 to add type-related fields and definitions).
  4. Type Checker:    typecheck.py, definitions for evaluating the type constraints and for name resolution.

    Note: This may be folded into ast.py as long as readability is maintained.

  5. Main Top-Level:    decafch.py, containing the main python function that ties the modules together, takes command line argument of input file name, and processes the Decaf program in that file.
  6. Documentation:    README.txt which documents the contents of all the other files.
We will be looking for a submission with the five/six files specified above, with the specified names. Deviating from these specifications may entail loss of points.

Grading

This maximum score on this assignment will be 40, divided as follows: 2 points for correct type inference for simple expressions (those that do not have other expressions as parts), 4 points for other simple expressions (binary/unary/auto) 8 points for correct resolution of field names and typechecking of field expressions; 8 points for correct resolution of method names and typechecking of method invocations; 4 points for correct resolution of constructors and typechecking of new object creation expressions; 8 points for correct typechecking of other expressions and statements; 4 points for error reporting; and 2 points for documentation/program readability.

As usual, complicated programs may not get the full grade unless accompanied by comments documenting it.

Submission

Submit your homework using Git, as done for HW1, before midnight of Mon., April 4.

We will assume that HW4 will be submitted by the same teams that submitted HW3. If there is any change in the team structure, you must inform the instructor the Friday before the submission deadline (Fri. April 1) and get the submission repositories for the new teams set up.

You may submit the homework multiple times; the last submission will be graded.

Change Log


C. R. Ramakrishnan
Last modified: Tue Mar 29 20:10:00 EDT 2016