TY - JOUR
T1 - CCHMM_PROF: a HMM-based coiled-coil predictor with evolutionary information
AU - Bartoli, Lisa
AU - Fariselli, Piero
AU - Krogh, Anders
AU - Casadio, Rita
N1 - Keywords: Computational Biology; Databases, Protein; Protein Conformation; Protein Interaction Mapping; Proteins; Software; Structure-Activity Relationship
PY - 2009
Y1 - 2009
N2 - MOTIVATION: The widespread coiled-coil structural motif in proteins is known to mediate a variety of biological interactions. Recognizing a coiled-coil containing sequence and locating its coiled-coil domains are key steps towards the determination of the protein structure and function. Different tools are available for predicting coiled-coil domains in protein sequences, including those based on position-specific score matrices and machine learning methods. RESULTS: In this article, we introduce a hidden Markov model (CCHMM_PROF) that exploits the information contained in multiple sequence alignments (profiles) to predict coiled-coil regions. The new method discriminates coiled-coil sequences with an accuracy of 97% and achieves a true positive rate of 79% with only 1% of false positives. Furthermore, when predicting the location of coiled-coil segments in protein sequences, the method reaches an accuracy of 80% at the residue level and a best per-segment and per-protein efficiency of 81% and 80%, respectively. The results indicate that CCHMM_PROF outperforms all the existing tools and can be adopted for large-scale genome annotation. AVAILABILITY: The dataset is available at http://www.biocomp.unibo.it/~lisa/coiled-coils. The predictor is freely available at http://gpcr.biocomp.unibo.it/cgi/predictors/cchmmprof/pred_cchmmprof.cgi. CONTACT: [email protected].
AB - MOTIVATION: The widespread coiled-coil structural motif in proteins is known to mediate a variety of biological interactions. Recognizing a coiled-coil containing sequence and locating its coiled-coil domains are key steps towards the determination of the protein structure and function. Different tools are available for predicting coiled-coil domains in protein sequences, including those based on position-specific score matrices and machine learning methods. RESULTS: In this article, we introduce a hidden Markov model (CCHMM_PROF) that exploits the information contained in multiple sequence alignments (profiles) to predict coiled-coil regions. The new method discriminates coiled-coil sequences with an accuracy of 97% and achieves a true positive rate of 79% with only 1% of false positives. Furthermore, when predicting the location of coiled-coil segments in protein sequences, the method reaches an accuracy of 80% at the residue level and a best per-segment and per-protein efficiency of 81% and 80%, respectively. The results indicate that CCHMM_PROF outperforms all the existing tools and can be adopted for large-scale genome annotation. AVAILABILITY: The dataset is available at http://www.biocomp.unibo.it/~lisa/coiled-coils. The predictor is freely available at http://gpcr.biocomp.unibo.it/cgi/predictors/cchmmprof/pred_cchmmprof.cgi. CONTACT: [email protected].
U2 - 10.1093/bioinformatics/btp539
DO - 10.1093/bioinformatics/btp539
M3 - Journal article
C2 - 19744995
SN - 1367-4803
VL - 25
SP - 2757
EP - 2763
JO - Bioinformatics
JF - Bioinformatics
IS - 21
ER -