World Scientific

Chapter 3: Protease Cleavage Pattern Discovery

      https://doi.org/10.1142/9789811240126_0003
      Cited by: 0 (Source: Crossref)
      Abstract:

      A protein functions by interacting with other molecules or chemicals. Among the many forms of interaction, protease cleavage has been widely researched for several decades. This research aims to build predictive models from collected laboratory data in order to discover novel interactions. Such a model is commonly built from laboratory-verified protease cleavage data, in which the association between protease cleavage structure and protease cleavage function can be examined. A protease cleavage structure commonly means a primary sequence (or a sub-sequence, or peptide) that is believed to contain a specific amino acid composition pattern or trend related to the protease cleavage function. In other words, a protease cleavage pattern must not show a random amino acid composition. Instead, the amino acid composition in a data set of cleaved peptides should demonstrate a trend that a specific protease recognises for the interaction. For a protease cleavage pattern discovery model to work efficiently, two types of peptides are collected and pooled to construct the model: cleaved peptides and non-cleaved peptides. Non-cleaved peptides should show no trend in amino acid composition; instead, their amino acids should be randomly distributed. By contrasting a data set with a trend against a data set without one, a pattern that discriminates between the two types of data can be discovered and formulated as knowledge, or intelligence. The purpose of discovering such intelligence is prediction. Protease cleavage pattern discovery is therefore a classification problem in machine learning. This chapter introduces several classification (discriminant analysis) algorithms and discusses how they can be used for protease cleavage pattern discovery.
Importantly, encoding peptides so that the encoded data are biologically sound is the key to a successful protease cleavage pattern discovery task. This chapter therefore introduces the bio-basis function, a new approach for encoding peptides. Based on the bio-basis function, several advanced algorithms are then introduced for protease cleavage pattern discovery.
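
The abstract does not give the form of the bio-basis function, so the following is only a minimal sketch under stated assumptions. It uses the form commonly given in the literature, K(x, b) = exp(gamma * (s(x, b) - s(b, b)) / s(b, b)), where s sums per-residue substitution scores between a peptide x and a basis peptide b. The toy scoring rule (`residue_score`), the function names, the value of `gamma`, and the example peptides are all hypothetical and chosen for illustration; a real application would typically use a substitution matrix such as Dayhoff/PAM or BLOSUM.

```python
import math

def residue_score(a: str, b: str) -> float:
    # ASSUMPTION: toy scoring rule, +2 for identical residues, -1 otherwise.
    # This stands in for a real substitution matrix (e.g. Dayhoff or BLOSUM).
    return 2.0 if a == b else -1.0

def similarity(x: str, y: str) -> float:
    # Position-wise similarity between two peptides of equal length.
    if len(x) != len(y):
        raise ValueError("peptides must have equal length")
    return sum(residue_score(a, b) for a, b in zip(x, y))

def bio_basis(x: str, basis: str, gamma: float = 1.0) -> float:
    # Assumed bio-basis form: exp(gamma * (s(x, b) - s(b, b)) / s(b, b)).
    # Equals 1.0 when x is identical to the basis peptide and decreases
    # as x becomes less similar to it.
    self_sim = similarity(basis, basis)
    return math.exp(gamma * (similarity(x, basis) - self_sim) / self_sim)

def encode(peptide: str, bases: list[str], gamma: float = 1.0) -> list[float]:
    # Encode a peptide as a numeric vector: one similarity value per basis peptide.
    return [bio_basis(peptide, b, gamma) for b in bases]

# Hypothetical 8-residue basis peptides, not taken from any real data set.
bases = ["AAGKLVFF", "GGSTPLMN", "KLVFAEDV"]
features = encode("AAGKLVFA", bases)
```

Each peptide becomes a vector with one entry per basis peptide, which can then be passed to any standard classifier to separate cleaved from non-cleaved peptides. Choosing the basis peptides and `gamma` is a modelling decision that this sketch does not address.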