A NOVEL METHOD FOR REDUCING COMPUTATIONAL COMPLEXITY OF WHOLE GENOME SEQUENCE ALIGNMENT
Genomic sequence alignment is a powerful tool for finding subsequence patterns shared by the input sequences and for identifying evolutionary relationships between species. However, the running time and memory requirements of genome alignment are often very large. In this research, we propose a novel algorithm, Coarse-Grained AlignmenT (CGAT), that reduces the computational complexity of cross-species whole genome sequence alignment. CGAT first divides the input sequences into “blocks” of a fixed length and aligns these blocks to each other. The resulting block-level alignment is then refined at the nucleotide level. This two-step procedure can drastically reduce the overall computation time and space required for an alignment. In this paper, we show the effectiveness of the proposed algorithm by applying it to the whole genome sequences of several bacteria.
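For illustration only, the two-step procedure described above might be sketched as follows. This is not the CGAT implementation. The abstract does not state how blocks are compared or scored, so the k-mer Jaccard similarity, the Needleman-Wunsch style dynamic programming at both levels, the scoring constants, and all function and parameter names (`split_blocks`, `kmer_set`, `jaccard`, `global_align`, `coarse_grained_align`, `block_len`, `min_sim`) are assumptions made for this sketch.

```python
from typing import Callable, List, Set, Tuple


def split_blocks(seq: str, block_len: int) -> List[str]:
    """Cut seq into consecutive fixed-length blocks (the last may be shorter)."""
    return [seq[i:i + block_len] for i in range(0, len(seq), block_len)]


def kmer_set(block: str, k: int = 3) -> Set[str]:
    """Set of all k-mers in a block (assumed block summary, not from the paper)."""
    return {block[i:i + k] for i in range(len(block) - k + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two k-mer sets; 0.0 if either set is empty."""
    return len(a & b) / len(a | b) if a and b else 0.0


def global_align(n: int, m: int,
                 score: Callable[[int, int], float],
                 gap: float) -> List[Tuple[int, int]]:
    """Needleman-Wunsch over two index ranges; returns the matched (i, j) pairs."""
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(dp[i - 1][j - 1] + score(i - 1, j - 1),
                           dp[i - 1][j] + gap,
                           dp[i][j - 1] + gap)
    # Trace back from the bottom-right corner, collecting diagonal moves.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if dp[i][j] == dp[i - 1][j - 1] + score(i - 1, j - 1):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif dp[i][j] == dp[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]


def coarse_grained_align(x: str, y: str, block_len: int = 8,
                         min_sim: float = 0.3) -> List[Tuple[int, int]]:
    """Two-step sketch: align blocks first, then refine matched blocks per nucleotide.

    Returns (position in x, position in y) pairs of identical aligned nucleotides.
    """
    bx, by = split_blocks(x, block_len), split_blocks(y, block_len)
    kx, ky = [kmer_set(b) for b in bx], [kmer_set(b) for b in by]

    # Step 1: block-level alignment. Subtracting min_sim makes dissimilar
    # block pairs score negatively (assumed scoring scheme).
    block_pairs = global_align(
        len(bx), len(by),
        lambda i, j: jaccard(kx[i], ky[j]) - min_sim,
        gap=-0.1)

    # Step 2: nucleotide-level refinement inside each sufficiently similar pair.
    result = []
    for i, j in block_pairs:
        if jaccard(kx[i], ky[j]) < min_sim:
            continue
        a, b = bx[i], by[j]
        nt_pairs = global_align(
            len(a), len(b),
            lambda p, q: 1.0 if a[p] == b[q] else -1.0,
            gap=-1.0)
        result.extend((i * block_len + p, j * block_len + q)
                      for p, q in nt_pairs if a[p] == b[q])
    return result
```

In this sketch, the block-level table has one cell per pair of blocks rather than per pair of nucleotides, and each refinement table is at most `block_len` by `block_len`, so no full nucleotide-by-nucleotide table is ever built. A call such as `coarse_grained_align(seq_a, seq_b, block_len=8)` returns the matched nucleotide positions. Because refinement here is confined to matched block pairs, alignments that cross block boundaries are missed, which a real implementation would presumably need to handle.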