Annie Liu's CSE675: Program Transformation and Program Analysis (Fall `00)

CSE675 (Fall `00)
Program Transformation and Program Analysis

[ General Information | Course Outline | Lectures | Handouts | Readings | Systems ]
[What's new]


General Information

Course description: This course is for students interested in advanced program manipulation techniques and supporting tools. The topics focus on a central problem in computer science: the construction and analysis of correct and efficient computer programs. We will concentrate on systematic methods for problem solving and program construction. The course will present methods and techniques for program transformation and supporting program analysis, as well as program manipulation systems.

This will include a survey of current systems in this area, some of which apply directly to the year 2000 problem and beyond. Students will implement program analyses and transformations and experiment with advanced programming environments and program transformation systems.

Prerequisites: a programming language course, a compiler course, or instructor's permission, and a working knowledge of C, or a functional language like Lisp, Scheme, or ML. | Credits: 3

Instructor: Annie Liu | Email: liu@cs.sunysb.edu | Office: CS Building 1433, 632-8463

Hours: Wed. 12-2:30PM, in CS 2212 | Office Hours: after class or by appointment


Course Outline

We will go right into program transformations. This starts with basic transformation rules, such as unfold and fold, as well as simplifications based on algebraic properties. Then, we discuss composition / fusion / deforestation, partial evaluation / mixed computation, staging transformation, and generalized specialization. This smoothly continues with incrementalization, memoization and tabulation, and promotion and accumulation. Continuing this, we discuss three extremely powerful transformations involving sets and relations: finite differencing of set operations, dominated convergence of fixed-point expressions, and data structure selection by real-time simulation using based representation. We will also discuss other related transformations for translation and optimization.

We will describe supporting program analysis as they are needed. This includes forward and backward analyses based on control and data dependencies. We will discuss methods based on abstract interpretation, domain projections, set constraints, and types, for analyzing static data, dead code, etc. and for analyses involving recursive data, arrays, pointers, etc.

We will also talk about program manipulation systems and implementation techniques as we use them. Such systems include powerful program analysis and transformation systems as well as advanced programming environments and environment generators.

There are weekly assignments, some of which require programming. There will be no exams, but there will be a relatively large project to be completed by the end of the semester.


Lectures

Lecture 1 (09/05/00): Overview. Program construction and problem solving; program transformation and program analysis; tools and applications; two examples: longest common subsequence and image blurring.
Lecture 2 (09/07/00): Problems in programming. Clarity and simplicity vs efficiency and language-specific details; good functional and object-oriented programming; a simple list processing example. Lectures 1-2 notes
Exercises: Handout E1. Reading: Chapter 5 of the textbook Introduction to Algorithms by Cormen, Leiserson, and Rivest.

Lecture 3 (09/13/00): High-level problem description. Sets and maps; graph reachability problem; specifications using loops, fixed-point expressions, and inference rules.
Lecture 4 (09/13/00): Systematically deriving efficient solutions. Finite differencing: incrementally maintaining results of expensive computations; real-time simulation: data structure selection; reachability example.
Exercises: Handout E2. Reading: Everything you can find for interesting problems.

Lecture 5 (09/27/00): Basic rules and strategies for transforming recursive functions. Definition, unfold/fold, instantiation/abstraction, algebraic laws, classical examples.
Lecture 6 (09/27/00): Introducing vs eliminating intermediate values. Tupling (memoization, tabulation, promotion and accumulation, ...), call graphs and cuts; deforestation (composition, fusion, stream processing, ...), treeless forms and termination.
Exercises: Handout E3. Reading: Handout Burstall and Darlington papers.

Lecture 7 (10/04/00): Partial evaluation and program specialization. Concepts and history, examples, issues and techniques, more applications.
Lecture 8 (10/04/00): Binding-time analysis. Program analysis overview; abstract interpretation: abstract domain, analysis equations, fixed-point iteration.
Exercises: Handout E4. Reading: Online Jones paper.

Lecture 9 (10/18/00): Incrementalization. Incremental programs; using previous results, caching intermediate results, discovering auxiliary information; examples.
Lecture 10 (10/18/00): Dead-code analysis. Grammar constraint based analysis; liveness patterns, constructing constraints, simplifying constraints.
Exercises: Handout E5. Reading: Online Liu paper.

Lecture 11 (10/25/00): Program optimization by incrementalization. Relation to partial evaluation, tupling, and fusion; optimizing recursive functions and identifying minimum increment operations.
Lecture 12 (10/25/00): Dynamic programming examples. Identify a minimum increment operation, cache-inc-prune, form an optimized program; longest common subsequence, hanoi tower, etc.
Exercises: Handout E6. Reading: Online slides for an ESOP'99 paper.

Lecture 13 (11/01/00): More on optimizing recursive functions. More on auxiliary information for dynamic programming; examples: 0-1 knapsack, car painting.
Lecture 14 (11/01/00). Transforming recursion to iteration. Basic method by incrementalization; multiple base cases, recursive cases, and recursive calls; stack and pointer reversal; examples.
Exercises: Handout E7. Reading: Online slides for a PEPM'00 paper.

Lecture 15 (11/08/00): Optimizing loops. Strength reduction of expensive primitives in loops, incrementalizing the loop body.
Lecture 16 (11/08/00): Associated optimizations of loops. Folding initialization, replacing termination test, minimizing loop body.
Reading: Online slides for an IFIP'97 paper.

Lecture 17 (11/15/00): Optimizing aggregate array computations. Identify aggregate array computations and update operations, remove from decrement set and add from increment set, form optimized loop.
Lecture 18 (11/15/00): Discussions and more examples. Aggregate area of arbitrary shapes, parallelism, data locality, usage based caching vs cache-and-prune, making computation incremental.
Exercises: Handout E8. Reading: Online slides for an ICCL'98 paper.

Lecture 19 (11/22/00): Programming with sets and maps. SETL, set and map operations, general set formers, other set-related expressions and statements.
Lecture 20 (11/22/00): Examples. Tree center, topological sort, graph reachability; clear, succinct, but inefficient programs.
Exercises: Handout E9. Reading: The SETL2 Programming Language.

Lecture 21 (11/29/00): Finite differencing of set exrepssions. Basic ideas; derivative: pre and post; derivative w.r.t. code block; sequential composition, chain rule.
Lecture 22 (11/29/00): Discussion and auxiliary maps. Pre vs post derivatives; strict operations; discontinuity removal / iterator inversion using auxiliary expressions.
Exercises: Install SETL2. Reading: Handout Paige and Koenig paper (pages 402-423, Appendix B).

Lecture 23 (12/6/00): Finite differencing examples. Canonical forms, expensive computations and auxiliary expressions, use of chain rule and derivatives.
Lecture 23 (12/6/00): Loop optimization using finite differencing. Initialization, vertical and horizontal fusion; eliminating dead code.
Exercises: Handout E10. Reading: Handout Paige and Koenig paper (pages 429-442, Appendix A).

Lecture 25 (12/13/00): Data structure selection. Real-time simulation of set operations on a RAM; problems with hashing and hidden copy; classes of operations; based representations.
Lecture 26 (12/13/00) Real-time simulation by type inference. Types and constraints; associating representations with types; inferring types based on operations; inferring values. APTS, KIDS, Cachet.
Exercises: Handout E11. Reading: Online Paige papers, notes by Paige.


Handouts

Handout Q: Questionnaire

Handout E1: Exercise 1: Writing, Reading, and Designing Programs
Handout E2: Exercise 2: Testing and Judging Programs; Getting Interesting Problems
Handout E3: Exercise 3: Getting Interesting Problems; Refining Problem Descriptions
Handout E4: Exercise 4: Partial Evaluation and Program Specialization
Handout E5: Exercise 5: Incremental Programs and Incrementalization
Handout E6: Exercise 6: Caching, Memoization, Tabulation, Dynamic Programming
Handout E7: Exercise 7: More on Dynamic Programming, Recursion to Iteration
Handout E8: Exercise 8: Optimizing Aggregate Array Computations in Loops
Handout E9: Exercise 9: Programming with Sets and Maps
Handout E10: Exercise 10: Finite Differencing of Set Expressions
Handout E11: Exercise 11: Data Structure Selection, APTS, and Project Report

Handout S4: Solution 4
Handout S7: Solution 7
Handout S8: Solution 8
Handout S9: Solution 9
Handout S10: Solution 10


Readings

Basic transformations

A Transformation System for Developing Recursive Programs, by R. M. Burstall and John Darlington
Journal of the ACM, 21(8):613-641, 1978.

An Introduction to Partial Evaluation, by Neil D. Jones
ACM Computing Surveys, 28(3):480-503, 1996. (pdf)

Efficiency by Incrementalization: An Introduction, by Y. Annie Liu
Higher-Order and Symbolic Computation, 13(4):289-313, 2000. (ps.gz | ps)

Optimizing recursive functions and recursive data structures

Dynamic Programming via Static Incrementalization, by Y. Annie Liu and Scott Stoller
Proceedings of the 8th European Symposium on Programming (ESOP), pages 288-305, March 1999. (slides.ps.gz | slides.ps)

From recursion to iteration: what are the optimizations? by Y. Annie Liu and Scott Stoller
Proceedings of the ACM SIGPLAN 2000 Workshop on Partial Evaluation and Semantics-Based Program Manipulation (PEPM), pages 73-82, January 2000. (slides.ps.gz | slides.ps)

Optimizing loops and arrays

Principled Strength Reduction, by Y. Annie Liu
Algorithmic Languages and Calculi, Chapter 14, pages 357-381, Chapman & Hall, 1997. (slides.ps.gz | slides.ps)

Loop optimization for aggregate array computations, by Y. Annie Liu and Scott Stoller
Proceedings of the 1998 IEEE International Conference on Computer Languages (ICCL), pages 262-271, May 1998. (slides.ps.gz | slides.ps)

Optimizing sets and relations

The SETL2 Programming Language, by W. Kirk Snyder (with no OO features here)
Technical Report 490, Courant Institute, New York University, September 1990. (ps, just use the second part as manual)

Finite differencing of computable expressions, by Robert Paige and Shaye Koenig
ACM Transactions on Programming Languages and Systems, 4(3):402-454, 1982.

Real-time simulation of a set machine on a RAM, by Robert Paige
Computing and Information, Vol. II, pages 69-73, 1989. (ps | notes by Paige)

Viewing a program transformation system at work, by Robert Paige
Proceedings of the Joint 6th Intl. Conf. on Programming Language Implementation and Logic Programming (PLILP) and 4th Intl. Conf. on Algebraic and Logic Programming (ALP), pages 5-24, 1994. (ps)


Systems

Similix: A self-applicable partial evaluator for Scheme

Tempo Specializer: A partial evaluator for C

The Synthesizer Generator: A language-based editing-environment generator ("demo")

Cachet: A program transformation and analysis system for incrementalization (yet to be installed)

SETL2: A set-based programming language (guide to installation)

New APTS: An automatic program transformation system

Old APTS (guide to local installation, usage, examples)


liu@cs.sunysb.edu