APPROXIMATE MATCHING OF STRUCTURED MOTIFS IN DNA SEQUENCES

Département d'informatique et de recherche opérationnelle, Université de Montréal, CP 6128 Succursale Centre-ville, Montréal, Québec H3C 3J7, Canada

Corresponding author.

MATHIEU RAFFINOT

CNRS — Equipe Génome et Informatique, Evry, and ENS, 46 rue d'Ulm, Paris, France

JEAN-EUDES DUCHESNE

Département d'informatique et de recherche opérationnelle, Université de Montréal, Canada

MATHIEU LAJOIE

Département d'informatique et de recherche opérationnelle, Université de Montréal, Canada

NICOLAS LUC

GEREQ Inc., 1450 City Councillors, Bureau 790, Montréal, Québec H3A 2E6, Canada

https://doi.org/10.1142/S0219720005001065

Abstract

Several methods have been developed for identifying RNA structures of varying complexity in a genome. All of these methods search for conserved primary and secondary sub-structures. In this paper, we present a simple formal representation of a helix, which combines sequence and folding constraints, as a constrained regular expression. This representation allows us to develop a well-founded algorithm that finds all approximate matches of a helix in a genome. The algorithm is based on an alignment graph built from several copies of a pushdown automaton, stacked one on top of another. This is a first attempt to exploit pushdown automata in the context of approximate matching. The worst-case time complexity is O(krpn), where k is the error threshold, n is the size of the genome, p is the size of the secondary expression, and r is its number of union symbols. We then extend the algorithm to search for pseudo-knots and for secondary structures containing an arbitrary number of helices.
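
To make the notion of an approximate helix match concrete, the following is a minimal brute-force sketch in Python. It is not the pushdown-automaton algorithm of the paper and does not achieve the O(krpn) bound. It assumes a DNA alphabet, Watson-Crick pairing only, and errors restricted to mismatched pairs (no insertions or deletions). The function name `find_helices` and its parameters `stem_len`, `min_loop` and `max_loop` are hypothetical and chosen for illustration only.

```python
# Illustrative assumption: Watson-Crick pairs on a DNA alphabet only.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_helices(genome, stem_len, min_loop, max_loop, k):
    """Return (start, loop_len, mismatches) for every approximate helix.

    A helix here is a left arm of stem_len bases, a loop of
    min_loop..max_loop bases, and a right arm that is the reverse
    complement of the left arm up to k mismatched pairs.
    """
    hits = []
    n = len(genome)
    # Every start position that leaves room for the shortest helix.
    for start in range(n - 2 * stem_len - min_loop + 1):
        for loop_len in range(min_loop, max_loop + 1):
            end = start + 2 * stem_len + loop_len  # one past the right arm
            if end > n:
                break
            mismatches = 0
            # Pair base i of the left arm with base i (from the end)
            # of the right arm, which is the folding constraint.
            for i in range(stem_len):
                left = genome[start + i]
                right = genome[end - 1 - i]
                if COMPLEMENT.get(left) != right:
                    mismatches += 1
                    if mismatches > k:
                        break  # error threshold exceeded
            if mismatches <= k:
                hits.append((start, loop_len, mismatches))
    return hits
```

For instance, `find_helices("GGGAAAACCC", 3, 4, 4, 0)` reports a single exact helix starting at position 0 with a loop of 4 bases. This sketch examines every start position and loop length independently, so its cost grows with the product of genome length, loop range and stem length, which is the kind of redundancy an automaton-based alignment graph is meant to avoid.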
