USE OF THE BACKPROPAGATION NEURAL NETWORK ALGORITHM FOR PREDICTION OF PROTEIN FOLDING PATTERNS
We have used a large-scale backpropagation neural network simulator, BigNet, running on Cray 2 and X-MP machines, to learn, recall, and predict protein structures from sequence. In the present study, we extend previous work with a revised training/testing set (20 training, 4 testing) and a more detailed analysis of BigNet's operation. We describe an enhanced training development environment and new methods, including training data preprocessing and sequence shifting, that maximize generalization and minimize artifacts in the distance matrices produced by the neural network. The results demonstrate improved learning and generalization performance relative to previous reports. The trained network produced good predictions of distance matrices when presented with novel sequence data from proteins homologous to those in the training set.
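For readers unfamiliar with the target representation, a distance matrix records the pairwise distance between every two residues of a protein. The following is a minimal illustrative sketch, not BigNet's code: it assumes each residue is represented by a single 3D coordinate (for example, its alpha carbon), and the function name `distance_matrix` and the toy coordinates are hypothetical.

```python
import math


def distance_matrix(coords):
    """Return the symmetric matrix of Euclidean distances between points.

    `coords` is a list of (x, y, z) tuples, one per residue (assumed here
    to be a single representative atom per residue).
    """
    n = len(coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            matrix[i][j] = d  # fill both triangles; the matrix is symmetric
            matrix[j][i] = d
    return matrix


# Hypothetical coordinates (in angstroms) for three residues; not real protein data.
toy_coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
toy_matrix = distance_matrix(toy_coords)
```

Because the matrix depends only on inter-residue distances, it is unchanged by rotating or translating the structure, which is one reason such a representation is convenient as a prediction target.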