# Incrementalization-

## Objectives

• We are engaged in an ambitious effort to derive incremental programs automatically (or semi-automatically) from non-incremental programs written in standard programming languages. This approach contrasts with earlier approaches that aimed to incrementally evaluate non-incremental programs.

• In essence, every program computes by fixed-point iteration, expressed as recursive functions or loops. This is why loop optimizations are so important. A loop body can be regarded as a program f parameterized by an induction variable x that is incremented on each iteration by a change operation +. Efficient iterative computation relies on effective use of state, i.e., computing the result of each iteration using stored results of previous iterations. This is why strength reduction and related techniques are crucial for performance.

• Given a program f and an input change operation +, a program f' that computes f(x+y) efficiently by using the result of the previous computation of f(x) is called an incremental version of f under +. Sometimes, information other than the result of f(x) needs to be maintained and used for efficient incremental computation of f(x+y). We call a function that computes such information an extended version of f. Thus, the goal of computing loops efficiently corresponds to constructing an extended version of a program f and deriving an incremental version of the extended version under an input change operation +.

• In general, incremental computation aims to solve a problem on a sequence of inputs that differ only slightly from one another, making use of the previously computed output in computing a new output, instead of computing the new output from scratch. Incremental computation is a fundamental issue relevant throughout computer software, e.g., optimizing compilers, transformational program development, and interactive systems.

## Results

• Thus far, we have partitioned the problem of deriving incremental programs into three subproblems:

• P1. Exploiting the result of f(x), i.e., the return value of f(x).

• P2. Caching, exploiting, and maintaining intermediate results of f(x), i.e., values computed in the middle of computing f(x).

• P3. Discovering, computing, exploiting, and maintaining auxiliary information of f(x), i.e., information not computed at all f(x), that can be inexpensively maintained.

We summarize here the essence of our methods:

• P1. In "Systematic Derivation of Incremental Programs", we gave a general systematic transformational approach for deriving an incremental version f' of a program f under an input change +. The basic idea is to identify in the computation of f(x+y) those subcomputations that are also performed in the computation of f(x) and whose values can be retrieved from the cached result r of f(x). The computation of f(x+y) is symbolically transformed to avoid re-performing these subcomputations by replacing them with corresponding retrievals. This efficient way of computing f(x+y) is captured in the definition of f'(x,y,r).

• P2. In "Caching Intermediate Results for Program Improvement", we gave a method, called cache-and-prune, for statically transforming programs to cache all intermediate results useful for incremental computation. The basic idea is to

• I. extend the program f to a program f-bar that returns all intermediate results,

• II. incrementalize the program f-bar under + to obtain an incremental version f-bar' of f-bar using our method for P1,

• III. analyze the dependencies in f-bar', then prune the extended program f-bar to a program f-hat that returns only the useful intermediate results, and prune the program f-bar' to obtain a program f-hat' that incrementally maintains only the useful intermediate results.

• P3. In "Discovering Auxiliary Information for Incremental Computation", we proposed a approach for finding auxiliary information. Auxiliary information is, by definition, useful information about x that is not computed by f(x). Where, then, can one find it? The key insight of this approach is:

• A. Consider, as candidate auxiliary information for f, all intermediate computations of an incremental version of f that depend only on x; such an incremental version can be obtained using some techniques we developed for solving P1 and P2. (We use techniques developed for solving P1 and P2, instead of just P1, so that the candidate auxiliary information includes auxiliary information useful for efficiently maintaining the intermediate results.)

How can one discover which pieces of candidate auxiliary information are useful and how they can be used? We proposed:

• B. Extend f with all candidate auxiliary information, then apply some techniques used in our methods for P1 and P2 to obtain an extended version and an incremental extended version that together compute, exploit, and maintain only useful intermediate results and useful auxiliary information.

• Thus, on the one hand, one can regard the method for P2 as an extension to method for P1; on the other hand, one can regard method for P1 as aids for solving P2. Similarly, on the one hand, one can regard the method for P3 as an extension to methods for P1 and P2; on the other hand, one can regard methods for P1 and P2 as aids for solving P3. The modular components complement one another to form a comprehensive principled approach for incremental computation and therefore also for efficient iterative computation generally. Although the entire approach seems complex, each module or step is simple.

• In "CACHET: An Interactive, Incremental-Attribution-Based Program Transformation System For Deriving Incremental Programs" we describe our prototype implementation of these ideas.

Y. Annie Liu yanhong@cs.cornell.edu Last updated 6/29/96