World Scientific

A LINGUISTIC INTEGRATION OF A BIOLOGICAL DATABASE

    https://doi.org/10.1142/9789814503655_0040
    Abstract:

    One of the major theoretical concerns associated with the Human Genome Project is the methodology for deciphering “raw” sequences of DNA. This work is concerned with a subsequent problem: how the huge amounts of already deciphered information that will emerge in the near future can be integrated in order to enhance our biological understanding. The formal foundations for a linguistic theory of the regulation of gene expression are discussed. The linguistic analysis presented here is restricted to sequences with known biological function, since: i) there is no way to obtain, from DNA sequences alone, a regulatory representation of transcription units, and ii) the elements of substitution (methodologically equivalent to phonemes) are complete sequences of the binding sites of proteins.

    We have recently collected and analyzed the regulatory regions of a large number of E. coli promoters. The number of sigma 70 promoters studied may well represent the largest homogeneous body of knowledge of gene regulation at present. This collection serves as the data set for the construction of a grammar of the sigma 70 system of transcription and regulation. This grammatical model generates all the arrays of the collection, as well as novel combinations predicted to be consistent with the principles of the data set. The grammar is testable, and it can be expanded if the analysis of emerging data requires it. The elaboration of a linguistic methodology capable of integrating prokaryotic data constitutes a preliminary step towards the analysis and integration of the more complex eukaryotic systems of regulation.