World Scientific

Chapter 8: Optimised Peptide Pattern Discovery

      https://doi.org/10.1142/9789811240126_0008
      Cited by: 0 (Source: Crossref)
      Abstract:

      When analysing the patterns of protease cleavage sites or posttranslational modification sites based on peptide data, a key question is whether it is possible to discover interpretable, explainable, and visible rules that make clear how peptides are classified. A linear model offers a more direct interpretation of the relationship between experimental data, such as peptides, and peptide labels. However, the relationship between the peptides used in protease cleavage or posttranslational modification pattern discovery and their labels may not always be simple. Moreover, peptides are non-numerical data. On the other hand, most nonlinear models, such as neural networks, do not offer sufficient insight into the data. Decision-tree and random forest algorithms can provide a more interpretable model. However, discovering the optimal models requires an expensive exhaustive enumeration. This is why evolutionary computation approaches have been widely employed in many areas to generate optimal or near-optimal models that are easier to interpret. This chapter introduces a different type of machine learning approach for this kind of biological pattern discovery: the genetic programming algorithm, which is one type of evolutionary computation approach. It shows how the genetic programming algorithm can be used to discover interpretable rules for a peptide data set, and how the rules developed by genetic programming models can interpret the residue interplay within peptides.
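
      To make the idea of an interpretable rule evolved by genetic programming more concrete, the following is a minimal sketch and not the chapter's own method. It assumes fixed-length peptides, a rule language built from residue tests such as "position 2 is K" combined with AND, OR and NOT, and accuracy as the fitness measure. The data set, the function names and all parameter values are hypothetical and invented for illustration only.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
LENGTH = 4      # assumed fixed peptide length (toy setting)
MAX_DEPTH = 4   # assumed cap on tree depth to limit bloat

# Hypothetical toy data: (peptide, label); sequences are invented.
DATA = [("AKLF", 1), ("GKLV", 1), ("AKLV", 1), ("AKGF", 0),
        ("ARLF", 0), ("GGLV", 0), ("AKAV", 0), ("LKAF", 0)]

def random_rule(rng, depth=2):
    """Build a random rule tree as nested tuples."""
    if depth == 0 or rng.random() < 0.3:
        return ("is", rng.randrange(LENGTH), rng.choice(ALPHABET))
    op = rng.choice(["and", "or", "not"])
    if op == "not":
        return ("not", random_rule(rng, depth - 1))
    return (op, random_rule(rng, depth - 1), random_rule(rng, depth - 1))

def evaluate(rule, peptide):
    """Return True if the rule fires on the peptide."""
    op = rule[0]
    if op == "is":
        return peptide[rule[1]] == rule[2]
    if op == "not":
        return not evaluate(rule[1], peptide)
    left, right = evaluate(rule[1], peptide), evaluate(rule[2], peptide)
    return (left and right) if op == "and" else (left or right)

def fitness(rule, data):
    """Fraction of peptides whose label the rule predicts correctly."""
    return sum(evaluate(rule, p) == bool(y) for p, y in data) / len(data)

def depth(rule):
    return 0 if rule[0] == "is" else 1 + max(depth(c) for c in rule[1:])

def subtrees(rule, path=()):
    """Yield (path, subtree) for every node in the tree."""
    yield path, rule
    if rule[0] != "is":
        for i, child in enumerate(rule[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace(rule, path, new):
    """Return a copy of rule with the node at path swapped for new."""
    if not path:
        return new
    i = path[0]
    return rule[:i] + (replace(rule[i], path[1:], new),) + rule[i + 1:]

def crossover(a, b, rng):
    """Subtree crossover: graft a random subtree of b into a."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace(a, path, donor)

def mutate(rule, rng):
    """Subtree mutation: replace a random node with a fresh small rule."""
    path, _ = rng.choice(list(subtrees(rule)))
    return replace(rule, path, random_rule(rng, depth=1))

def evolve(data, seed=0, pop_size=60, generations=40):
    rng = random.Random(seed)
    pop = [random_rule(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda r: fitness(r, data), reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection, keeps the best
        children = []
        while len(parents) + len(children) < pop_size:
            child = crossover(rng.choice(parents), rng.choice(parents), rng)
            if rng.random() < 0.3:
                child = mutate(child, rng)
            if depth(child) <= MAX_DEPTH:
                children.append(child)
        pop = parents + children
    return max(pop, key=lambda r: fitness(r, data))

def to_text(rule):
    """Render a rule in a human-readable form (positions are 1-based)."""
    op = rule[0]
    if op == "is":
        return f"P{rule[1] + 1}={rule[2]}"
    if op == "not":
        return f"NOT({to_text(rule[1])})"
    return f"({to_text(rule[1])} {op.upper()} {to_text(rule[2])})"

best = evolve(DATA)
print(to_text(best), fitness(best, DATA))
```

      In this toy setting the labels follow the rule "(P2=K AND P3=L)", so a conjunction of two residue tests is the kind of expression that would express residue interplay. Whether a given run recovers exactly that rule depends on the seed and the assumed parameters.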