Abstract
β-turns are the most common type of non-repetitive structure and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP for two-class prediction of β-turns and for prediction of the individual β-turn types, using evolutionary information and predicted protein sequence features. It has been evaluated against the commonly used dataset BT426 and achieves a Matthews correlation coefficient (MCC) of 0.50, which is the highest reported performance for two-class prediction of β-turn versus not-β-turn. Furthermore, NetTurnP shows improved performance on some of the specific β-turn types.

In the present work, neural network methods have been trained to predict, from the primary amino acid sequence, both whether a residue is in a β-turn and which individual β-turn type it belongs to. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class β-turn category is a superset comprising all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves MCC = 0.50, Qtotal = 82.1%, sensitivity = 75.6%, PPV = 68.8% and AUC = 0.864. We have compared our performance to eleven other prediction methods, which obtain Matthews correlation coefficients in the range 0.17 to 0.47. For the type-specific β-turn predictions, only types I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain values of 0.36 and 0.31, respectively.

Conclusion: The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences.
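The two-class performance measures quoted in the abstract (MCC, Qtotal, sensitivity and PPV) can all be derived from a per-residue confusion matrix. The following is a minimal sketch assuming the standard definitions of these measures, with Qtotal taken to be overall accuracy; the function name `two_class_metrics` and the example counts are hypothetical and are not taken from the paper or from the NetTurnP implementation.

```python
import math


def two_class_metrics(tp, fp, tn, fn):
    """Compute two-class measures from confusion-matrix counts.

    tp/fn: β-turn residues predicted as turn / not-turn.
    tn/fp: non-turn residues predicted as not-turn / turn.
    Standard textbook definitions are assumed here.
    """
    total = tp + fp + tn + fn
    # Denominator of the Matthews correlation coefficient.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        # MCC is set to 0.0 when any marginal is empty (undefined case).
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
        # Qtotal: fraction of all residues predicted correctly.
        "q_total": (tp + tn) / total if total else 0.0,
        # Sensitivity: fraction of true β-turn residues recovered.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # PPV: fraction of predicted β-turn residues that are correct.
        "ppv": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Illustrative counts only (not the BT426 results).
metrics = two_class_metrics(tp=50, fp=10, tn=30, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

AUC is not included because it requires the continuous network output scores rather than a single thresholded confusion matrix.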
Original language | English |
---|---|
Article number | e15079 |
Journal | PLoS ONE |
Volume | 5 |
Issue number | 11 |
Number of pages | 9 |
ISSN | 1932-6203 |
Publication status | Published - 2010 |
Externally published | Yes |
Keywords
- Algorithms
- Amino Acid Sequence
- Computational Biology/methods
- Evolution, Molecular
- Internet
- Molecular Sequence Data
- Neural Networks (Computer)
- Protein Structure, Secondary
- Proteins/chemistry
- Reproducibility of Results