AN ITERATIVE ALGORITHM TO QUANTIFY FACTORS INFLUENCING PEPTIDE FRAGMENTATION DURING TANDEM MASS SPECTROMETRY
Abstract
In protein identification by tandem mass spectrometry, it is critical to accurately predict the theoretical spectrum for a peptide sequence. To date, the widely used database searching methods have adopted simple statistical models for this prediction. For some peptides, these models yield a theoretical spectrum that deviates significantly from the experimental one. In this paper, to derive an improved prediction model, we use a non-linear programming model to quantify the factors that influence peptide fragmentation, and we propose an iterative algorithm to solve the resulting optimization problem. On a training set of 1803 spectra, the results agree well with known principles of peptide fragmentation, such as the tendency to cleave near the middle of the peptide and the preference of Pro for N-terminal cleavage. Moreover, on a testing set of 941 spectra, comparison of the predicted spectra against the experimental ones shows that the method generates reasonable predictions. These results can benefit both database searching and de novo methods.