A SYNTACTIC PATTERN RECOGNITION SYSTEM FOR DNA SEQUENCES
We review both theoretical and practical results of a linguistic approach to studying the structure of features of DNA sequences. Using generative grammars, complex assemblages can be described and analyzed not only abstractly but also concretely, such that features can be searched for by a general-purpose parser. Our parser, called GENLANG, uses an extended logic grammar formalism and has found features as complex as tRNA genes, group I introns, and protein-encoding genes within input sequences on a genomic scale.