The Cache-Oblivious Gaussian Elimination Paradigm: Theoretical Framework and Experimental Evaluation

Rezaul Alam Chowdhury and Vijaya Ramachandran

Technical Report TR-06-04
The University of Texas at Austin, Department of Computer Sciences, Mar. 2006, 36 pages

The cache-oblivious Gaussian Elimination Paradigm (GEP) was introduced by the authors in an earlier paper to obtain efficient cache-oblivious algorithms for several important problems that have algorithms with triply-nested loops similar to those that occur in Gaussian elimination. These include Gaussian elimination and LU-decomposition without pivoting, all-pairs shortest paths and matrix multiplication. In this paper, we prove several important properties of the cache-oblivious framework for GEP given earlier, which we denote by I-GEP. We build on these results to obtain C-GEP, a completely general cache-oblivious implementation of GEP that applies to any code in GEP form, and which has the same time and I/O bounds as the earlier algorithm, while using a modest amount of additional space. We present an experimental evaluation of the caching performance of I-GEP and C-GEP in relation to the traditional Gaussian elimination algorithm. Our experimental results indicate that I-GEP and C-GEP outperform GEP on inputs of reasonable size, with dramatic improvement in running time over GEP when the data is out of core. `Tiling', an important loop transformation technique employed by optimizing compilers in order to improve temporal locality in nested loops, is a cache-aware method that does not adapt to all levels in a multi-level memory hierarchy. The cache-oblivious GEP framework (either I-GEP or C-GEP) produces system-independent I/O-efficient code for triply nested loops of the form that appears in Gaussian elimination without pivoting, and is potentially applicable to being used by optimizing compilers for loop transformation.

Download: PSPDF