The Cache-oblivious Gaussian Elimination Paradigm: Theoretical Framework, Parallelization and Experimental Evaluation

Rezaul Alam Chowdhury and Vijaya Ramachandran

Proceedings of the 19th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2007), San Diego, California, pp. 71-80, 2007


The Gaussian Elimination Paradigm (GEP) was introduced by the authors in [Chowdhury-Ramachandran, SODA'06] to represent the triply-nested loop computation that occurs in several important algorithms including Gaussian elimination without pivoting and Floyd-Warshall's all-pairs shortest paths algorithm. An efficient cache-oblivious algorithm for these instances of GEP was presented in [Chowdhury-Ramachandran, SODA'06]. In this paper we establish several important properties of this cache-oblivious framework, and extend the framework to solve GEP in its full generality within the same time and I/O bounds. We then analyze a parallel implementation of the framework and its caching performance for both shared and distributed caches. We present extensive experimental results for both in-core and out-of-core performance of our algorithms. We consider both sequential and parallel implementations of our algorithms, and compare them with finely-tuned cache-aware BLAS code for matrix multiplication and Gaussian elimination without pivoting.

Our results indicate that cache-oblivious GEP offers an attractive trade-off between efficiency and portability.

Download (copyright restrictions may apply): PSPDF