COMPUTER ANALYSIS AND RECOGNITION OF FUNCTIONAL SITES VIA OLIGONUCLEOTIDE PATTERN DISTRIBUTIONS
This article considers the problems of analyzing and recognizing functional sites. We describe methods that select the context features important for site function and recognition through interactive and automatic analysis of the nucleotide sequences of functional sites. The first method, based on utility theory, generates and evaluates hypotheses about context features of functional sites that offer high recognition ability and specificity. The second method reveals nonrandom patterns in the distribution of short oligonucleotides within functional sites.
The construction of recognition methods from the revealed context features is also presented. Two approaches are described: the “Site-Video” system for functional site analysis and recognition, and the method of large-scale oligonucleotide distributions. Using these approaches, we constructed recognition programs for eukaryotic splice sites and promoters. The mean recognition error for the functional sites studied ranges from 10 to 15%.
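To make the idea of a nonrandom oligonucleotide distribution concrete, the following minimal sketch counts short oligonucleotides at each position of a set of aligned site sequences and compares each positional count with the count expected if that oligonucleotide were spread evenly over all positions. It is an illustration under stated assumptions, not the procedure used in the article: the function names `positional_kmer_counts` and `positional_enrichment`, the even-spread null model, and the four toy sequences are all hypothetical.

```python
from collections import Counter


def positional_kmer_counts(sites, k=2):
    """Count each k-mer at each start position across equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    counts = [Counter() for _ in range(length - k + 1)]
    for seq in sites:
        seq = seq.upper()
        for pos in range(length - k + 1):
            counts[pos][seq[pos:pos + k]] += 1
    return counts


def positional_enrichment(sites, k=2):
    """Observed count at a position divided by the count expected under an
    even spread of that k-mer over all positions (an assumed null model)."""
    counts = positional_kmer_counts(sites, k)
    total = Counter()
    for c in counts:
        total.update(c)
    n_positions = len(counts)
    enrichment = []
    for c in counts:
        row = {}
        for kmer, observed in c.items():
            expected = total[kmer] / n_positions
            row[kmer] = observed / expected
        enrichment.append(row)
    return enrichment


# Toy, made-up sequences aligned so that the dinucleotide GT starts at index 3.
toy_sites = ["AAGGTAAGT", "CAGGTGAGT", "AAGGTAAGA", "TTGGTCAGT"]
counts = positional_kmer_counts(toy_sites, k=2)
enrichment = positional_enrichment(toy_sites, k=2)
```

A ratio well above 1 at a given position marks an oligonucleotide that is concentrated there rather than spread evenly. Such positions are candidate context features; a real analysis would also need a statistical test and a background model estimated from non-site sequences.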