We used gprof, the open-source GNU profiler, to profile the Shorty
assembler and find bottlenecks in the code. The bottlenecks identified
from the gmon output are shown below.
Click here to view the performance results output.
We optimized the performance of the above methods by inlining the functions they call. One such example is the function that returns the base corresponding to a letter of the alphabet. The program also lags when printing the contigs to standard output and when writing additional information produced during the assembly, such as the bambus-contig file. We added debug statements to the code so that these values are printed only when the debug option is turned on. This option can be set in the config file.
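The two changes described above can be illustrated with a minimal sketch. It is not the actual Shorty code: the names `base_of`, `Config`, and `print_contig`, and the 2-bit encoding of the alphabet, are assumptions made only for this example. It shows a small inlined base lookup and contig output that is skipped unless a debug flag is set.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical 2-bit encoding of the DNA alphabet: 0=A, 1=C, 2=G, 3=T.
// Marking the lookup inline lets the compiler avoid a function call
// in hot loops that decode many bases.
inline char base_of(unsigned code) {
    static const char kBases[] = {'A', 'C', 'G', 'T'};
    assert(code < 4);  // codes outside the alphabet are a programming error
    return kBases[code];
}

// Hypothetical settings holder; in practice the debug value would be
// read from the config file.
struct Config {
    bool debug = false;
};

// Decodes a contig and writes it to `out` only when debugging is enabled.
// Returns true if the contig was written.
inline bool print_contig(const std::vector<unsigned>& codes,
                         const Config& cfg,
                         std::ostream& out) {
    if (!cfg.debug) {
        return false;  // skip decoding and I/O entirely in normal runs
    }
    std::string contig;
    contig.reserve(codes.size());
    for (unsigned code : codes) {
        contig.push_back(base_of(code));
    }
    out << contig << '\n';
    return true;
}
```

With `cfg.debug` left at `false`, a call such as `print_contig(codes, cfg, std::cout)` returns immediately without decoding or writing anything, which is the effect the debug option is meant to have on the output-heavy steps.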
In addition to the above optimizations, there is scope for further improvement. We observed that most of the time is spent in the following areas:
1. Writing the contig output and the geography information of the contig.
2. Seeding: the code currently takes only one seed; it could be improved further by providing more seeds to the assembler.