Program Transformation and Program Analysis

[ General Information
| Course Outline
| Lectures
| Handouts
| Readings
| Systems
]

[What's new]

**Course description:** This course is for students interested
in advanced program manipulation techniques and supporting tools. The
topics focus on a central problem in computer science: the
construction and analysis of correct and efficient computer programs.
We will concentrate on systematic methods for problem solving and
program construction. The course will present methods and techniques
for program transformation and supporting program analysis, as well as
program manipulation systems.

- Program transformation: source-to-source transformation, partial evaluation, incrementalization, finite differencing, data representation selection, other translation and optimization related techniques.
- Program analysis: control and data dependencies, abstract interpretation, constraint-based analysis, efficient analysis algorithms.
- Program manipulation systems and techniques: interactive programming environments, declarative frameworks, tree rewrite methods, other utilities.

**Prerequisites:** a programming language course, a compiler
course, or instructor's permission, and a working knowledge of C, or a
functional language like Lisp, Scheme, or ML. | **Credits:** 3

**Instructor:** Annie Liu |
**Email:** liu@cs.sunysb.edu |
**Office:** CS Building 1433, 632-8463

**Hours:** Wed. 12-2:30PM, in CS 2212 | **Office Hours:**
after class or by appointment

We will go right into program transformations. This starts with basic transformation rules, such as unfold and fold, as well as simplifications based on algebraic properties. Then, we discuss composition / fusion / deforestation, partial evaluation / mixed computation, staging transformation, and generalized specialization. This smoothly continues with incrementalization, memoization and tabulation, and promotion and accumulation. Continuing this, we discuss three extremely powerful transformations involving sets and relations: finite differencing of set operations, dominated convergence of fixed-point expressions, and data structure selection by real-time simulation using based representation. We will also discuss other related transformations for translation and optimization.

We will describe supporting program analysis as they are needed. This includes forward and backward analyses based on control and data dependencies. We will discuss methods based on abstract interpretation, domain projections, set constraints, and types, for analyzing static data, dead code, etc. and for analyses involving recursive data, arrays, pointers, etc.

We will also talk about program manipulation systems and implementation techniques as we use them. Such systems include powerful program analysis and transformation systems as well as advanced programming environments and environment generators.

There are weekly assignments, some of which require programming. There will be no exams, but there will be a relatively large project to be completed by the end of the semester.

**Lecture 3** (09/13/00): High-level problem description. Sets and
maps; graph reachability problem; specifications using loops,
fixed-point expressions, and inference rules.

**Lecture 4** (09/13/00): Systematically deriving efficient
solutions. Finite differencing: incrementally maintaining results of
expensive computations; real-time simulation: data structure selection;
reachability example.

**Exercises**: Handout E2. **Reading**: Everything you can find
for interesting problems.

**Lecture 5** (09/27/00): Basic rules and strategies for
transforming recursive functions. Definition, unfold/fold,
instantiation/abstraction, algebraic laws, classical examples.

**Lecture 6** (09/27/00): Introducing vs eliminating intermediate
values. Tupling (memoization, tabulation, promotion and accumulation,
...), call graphs and cuts; deforestation (composition, fusion, stream
processing, ...), treeless forms and termination.

**Exercises**: Handout E3.
**Reading**: Handout Burstall and Darlington papers.

**Lecture 7** (10/04/00): Partial evaluation and program
specialization. Concepts and history, examples, issues and
techniques, more applications.

**Lecture 8** (10/04/00): Binding-time analysis. Program analysis
overview; abstract interpretation: abstract domain, analysis equations,
fixed-point iteration.

**Exercises**: Handout E4.
**Reading**: Online Jones paper.

**Lecture 9** (10/18/00): Incrementalization. Incremental
programs; using previous results, caching intermediate results,
discovering auxiliary information; examples.

**Lecture 10** (10/18/00): Dead-code analysis. Grammar constraint
based analysis; liveness patterns, constructing constraints,
simplifying constraints.

**Exercises**: Handout E5.
**Reading**: Online Liu paper.

**Lecture 11** (10/25/00): Program optimization by incrementalization.
Relation to partial evaluation, tupling, and fusion; optimizing
recursive functions and identifying minimum increment operations.

**Lecture 12** (10/25/00): Dynamic programming examples. Identify
a minimum increment operation, cache-inc-prune, form an optimized program;
longest common subsequence, hanoi tower, etc.

**Exercises**: Handout E6. **Reading**: Online slides for an ESOP'99 paper.

**Lecture 13** (11/01/00): More on optimizing recursive
functions. More on auxiliary information for dynamic programming;
examples:

**Lecture 14** (11/01/00). Transforming recursion to iteration. Basic
method by incrementalization; multiple base cases, recursive cases,
and recursive calls; stack and pointer reversal; examples.

**Exercises**: Handout E7. **Reading**: Online slides for a PEPM'00 paper.

**Lecture 15** (11/08/00): Optimizing loops. Strength reduction of
expensive primitives in loops, incrementalizing the loop body.

**Lecture 16** (11/08/00): Associated optimizations of loops. Folding
initialization, replacing termination test, minimizing loop body.

**Reading**: Online slides for an IFIP'97 paper.

**Lecture 17** (11/15/00): Optimizing aggregate array computations.
Identify aggregate array computations and update operations, remove
from decrement set and add from increment set, form optimized loop.

**Lecture 18** (11/15/00): Discussions and more examples. Aggregate area
of arbitrary shapes, parallelism, data locality, usage based caching
vs cache-and-prune, making computation incremental.

**Exercises**: Handout E8. **Reading**: Online slides for an ICCL'98 paper.

**Lecture 19** (11/22/00): Programming with sets and maps.
SETL, set and map operations, general set formers, other set-related
expressions and statements.

**Lecture 20** (11/22/00): Examples. Tree center, topological sort,
graph reachability; clear, succinct, but inefficient programs.

**Exercises**: Handout E9. **Reading**: The SETL2 Programming Language.

**Lecture 21** (11/29/00): Finite differencing of set exrepssions.
Basic ideas; derivative: pre and post; derivative w.r.t. code block;
sequential composition, chain rule.

**Lecture 22** (11/29/00): Discussion and auxiliary maps. Pre vs
post derivatives; strict operations; discontinuity removal / iterator
inversion using auxiliary expressions.

**Exercises**: Install SETL2.
**Reading**: Handout Paige and Koenig paper (pages 402-423,
Appendix B).

**Lecture 23** (12/6/00): Finite differencing examples.
Canonical forms, expensive computations and auxiliary expressions, use
of chain rule and derivatives.

**Lecture 23** (12/6/00): Loop optimization using finite
differencing. Initialization, vertical and horizontal fusion;
eliminating dead code.

**Exercises**: Handout E10.
**Reading**: Handout Paige and Koenig paper (pages 429-442, Appendix A).

**Lecture 25** (12/13/00): Data structure selection. Real-time
simulation of set operations on a RAM; problems with hashing and
hidden copy; classes of operations; based representations.

**Lecture 26** (12/13/00) Real-time simulation by type inference.
Types and constraints; associating representations with types;
inferring types based on operations; inferring values. APTS, KIDS, Cachet.

**Exercises**: Handout E11.
**Reading**: Online Paige papers, notes by Paige.

Handout Q: Questionnaire

Handout E1: Exercise 1: Writing, Reading, and Designing Programs

Handout E2: Exercise 2: Testing and Judging Programs; Getting Interesting Problems

Handout E3: Exercise 3: Getting Interesting Problems;
Refining Problem Descriptions

Handout E4: Exercise 4: Partial Evaluation and Program Specialization

Handout E5: Exercise 5: Incremental Programs and Incrementalization

Handout E6: Exercise 6: Caching, Memoization, Tabulation, Dynamic Programming

Handout E7: Exercise 7: More on Dynamic Programming, Recursion to Iteration

Handout E8: Exercise 8: Optimizing Aggregate Array Computations in Loops

Handout E9: Exercise 9: Programming with Sets and Maps

Handout E10: Exercise 10: Finite Differencing of Set Expressions

Handout E11: Exercise 11: Data Structure Selection, APTS, and Project Report

Handout S4: Solution 4

Handout S7: Solution 7

Handout S8: Solution 8

Handout S9: Solution 9

Handout S10: Solution 10

**Basic transformations**

A Transformation System for Developing Recursive Programs,
by R. M. Burstall and John Darlington

Journal of the ACM, 21(8):613-641, 1978.

An Introduction to Partial Evaluation,
by Neil D. Jones

ACM Computing Surveys, 28(3):480-503, 1996.
(pdf)

Efficiency by Incrementalization: An Introduction,
by Y. Annie Liu

Higher-Order and Symbolic Computation, 13(4):289-313,
2000.
(ps.gz
| ps)

**Optimizing recursive functions and recursive data structures**

Dynamic Programming via Static Incrementalization,
by Y. Annie Liu and Scott Stoller

Proceedings of the 8th European
Symposium on Programming (ESOP), pages 288-305, March 1999.
(slides.ps.gz
| slides.ps)

From recursion to iteration: what are the optimizations?
by Y. Annie Liu and Scott Stoller

Proceedings of the ACM SIGPLAN
2000 Workshop on Partial Evaluation and Semantics-Based Program
Manipulation (PEPM), pages 73-82, January 2000.
(slides.ps.gz
| slides.ps)

**Optimizing loops and arrays**

Principled Strength Reduction, by Y. Annie Liu

Algorithmic Languages and Calculi, Chapter 14, pages 357-381, Chapman &
Hall, 1997.
(slides.ps.gz
| slides.ps)

Loop optimization for aggregate array computations, by Y. Annie Liu and Scott Stoller

Proceedings of the 1998 IEEE International Conference on Computer Languages (ICCL), pages 262-271, May 1998.
(slides.ps.gz
| slides.ps)

**Optimizing sets and relations**

The SETL2 Programming Language, by W. Kirk Snyder (with no OO features here)

Technical Report 490, Courant Institute, New York University, September 1990.
(ps,
just use the second part as manual)

Finite differencing of computable expressions, by Robert Paige and Shaye Koenig

ACM Transactions on Programming Languages and Systems, 4(3):402-454, 1982.

Real-time simulation of a set machine on a RAM, by Robert Paige

Computing and Information, Vol. II, pages 69-73, 1989.
(ps
| notes by Paige)

Viewing a program transformation system at work, by Robert Paige

Proceedings of the Joint 6th Intl. Conf. on Programming Language
Implementation and Logic Programming (PLILP) and 4th Intl. Conf. on Algebraic
and Logic Programming (ALP), pages 5-24, 1994.
(ps)

Similix:
A self-applicable partial evaluator for Scheme

Tempo Specializer: A
partial evaluator for C

The Synthesizer Generator: A language-based editing-environment generator
("demo")

Cachet: A program transformation and analysis system for incrementalization (yet to be installed)

SETL2: A set-based programming language
(guide to installation)

New APTS: An automatic program transformation system

Old APTS (guide to local installation, usage, examples)

liu@cs.sunysb.edu