TY - JOUR
T1 - NetTurnP - Neural Network Prediction of Beta-turns by Use of Evolutionary Information and Predicted Protein Sequence Features
AU - Petersen, Bent
AU - Lundegaard, Claus
AU - Petersen, Thomas Nordahl
PY - 2010
Y1 - 2010
N2 - β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against the commonly used BT426 dataset, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore, NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not-β-turn and the individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not-β-turn is a superset comprising all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves MCC = 0.50, Qtotal = 82.1%, sensitivity = 75.6%, PPV = 68.8% and AUC = 0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17 to 0.47. For the type-specific β-turn predictions, only types I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. Conclusion: The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences.
AB - β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against the commonly used BT426 dataset, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore, NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not-β-turn and the individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not-β-turn is a superset comprising all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves MCC = 0.50, Qtotal = 82.1%, sensitivity = 75.6%, PPV = 68.8% and AUC = 0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17 to 0.47. For the type-specific β-turn predictions, only types I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. Conclusion: The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences.
KW - Algorithms
KW - Amino Acid Sequence
KW - Computational Biology/methods
KW - Evolution, Molecular
KW - Internet
KW - Molecular Sequence Data
KW - Neural Networks (Computer)
KW - Protein Structure, Secondary
KW - Proteins/chemistry
KW - Reproducibility of Results
U2 - 10.1371/journal.pone.0015079
DO - 10.1371/journal.pone.0015079
M3 - Journal article
C2 - 21152409
SN - 1932-6203
VL - 5
JO - PLoS ONE
JF - PLoS ONE
IS - 11
M1 - e15079
ER -