Experimental Evaluation of a Cache-Oblivious LCS Algorithm

Rezaul Alam Chowdhury

Technical Report TR-05-43
The University of Texas at Austin, Department of Computer Sciences, Oct. 2005, 12 pages


We present the results of an extensive computational study of an I/O-optimal cache-oblivious LCS (longest common subsequence) algorithm developed by Chowdhury and Ramachandran. Three variants of the algorithm were implemented (CO denoting the fastest variant) along with the widely used linear-space LCS algorithm by Dan Hirschberg (denoted Hi). Both algorithms were tested on both random and real-world (CFTR DNA) sequences consisting upto 2 million symbols each, and timing and caching data were obtained on three state-of-the-art architectures (Intel Xeon, AMD Opteron and SUN UltraSparc-III+). In our experiments:

  • CO ran a factor of 2 to 6 times faster than Hi.
  • Hi incurred upto 4,000 times more L1 cache misses and upto 30,000 times more L2 cache misses than CO when run on pairs of sequences consisting of 1 million symbols each.
  • CO executed 40%-50% fewer machines instructions than Hi.
  • Unlike Hi, CO was able to conceal the effects of caches on its running time; its actual running time could be predicted quite accurately from its theoretical time complexity.
  • CO was less sensitive to alphabet size than Hi.

These results suggest that CO and the algorithmic technique employed by CO can be of practical use in many fields including sequence alignment in computational biology.

Download: PSPDF