Abstract
Many biological responses to intra- and extracellular stimuli are regulated through complex networks of transient protein interactions in which a globular domain in one protein recognizes a linear peptide from another, creating a relatively small contact interface. These peptide stretches are often found in unstructured regions of proteins, and contain a consensus motif complementary to the interaction surface displayed by their binding partners. While most current methods for the de novo discovery of such motifs exploit their tendency to occur in disordered regions, our work here focuses on another observation: upon binding to their partner domain, motifs adopt a well-defined structure. Indeed, through the analysis of all peptide-mediated interactions of known high-resolution three-dimensional (3D) structure, we found that the structure of the peptide may be as characteristic as the consensus motif, and can help identify target peptides even when they do not match the established patterns. Our analyses of the structural features of known motifs reveal that they tend to have a particular stretched and elongated structure, unlike most other peptides of the same length. Accordingly, we have implemented a strategy based on a Support Vector Machine that uses these features, along with other structure-encoded information about binding interfaces, to search the set of protein interactions of known 3D structure and to identify unnoticed peptide-mediated interactions among them. We have also derived consensus patterns for these interactions, whenever enough information was available, and compared our results with established linear motif patterns and their binding domains. Finally, to cross-validate our identification strategy, we scanned interactome networks from four model organisms with our newly derived patterns to see if any of them occurred more often than expected. Indeed, we found significant over-representations for 64 domain-motif interactions, 46 of which had not been described before, involving over 6,000 interactions in total for which we could suggest the molecular details determining the binding.
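
The final validation step, checking whether a derived pattern occurs among a domain's interaction partners more often than expected, can be illustrated with a one-sided hypergeometric test. The sketch below is only an illustration under stated assumptions, not the procedure used in the paper: the function names, the toy sequences, the toy network and the `P..P` pattern are all hypothetical, and the abstract does not specify which statistical test the authors applied.

```python
import re
from math import comb


def hypergeom_tail(k, N, K, n):
    """P(X >= k) when drawing n items from a population of N with K successes."""
    lo, hi = max(k, 0), min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(lo, hi + 1))
    return hits / comb(N, n)


def motif_enrichment(interactions, domain_proteins, sequences, pattern):
    """Test whether partners of domain-carrying proteins match `pattern`
    more often than expected by chance.

    Each undirected interaction is expanded into two directed pairs (a, b);
    a pair counts as a hit when a carries the domain and b matches the motif.
    """
    regex = re.compile(pattern)
    directed = list(interactions) + [(b, a) for a, b in interactions]

    def has_motif(protein):
        return regex.search(sequences[protein]) is not None

    N = len(directed)                                    # all directed pairs
    K = sum(a in domain_proteins for a, _ in directed)   # domain on first partner
    n = sum(has_motif(b) for _, b in directed)           # motif on second partner
    k = sum(a in domain_proteins and has_motif(b) for a, b in directed)
    return k, hypergeom_tail(k, N, K, n)


# Hypothetical toy data: four proteins, one of which carries the domain.
sequences = {
    "A": "MKTAYIAK",
    "B": "GSPLLPAR",
    "C": "APTMPPKQ",
    "D": "MNNQRKKT",
}
interactions = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
domain_proteins = {"A"}

k, p_value = motif_enrichment(interactions, domain_proteins, sequences, r"P..P")
print(k, p_value)
```

On real interactome data such a test would be repeated for every domain-pattern pair, so the resulting p-values would need a multiple-testing correction before any pair is called over-represented.
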
Original language | English |
---|---|
Journal | PLOS Computational Biology |
Volume | 6 |
Issue number | 5 |
Pages (from-to) | e1000789 |
ISSN | 1553-7358 |
Publication status | Published - 20 May 2010 |
Externally published | Yes |
Keywords
- Amino Acid Sequence
- Animals
- Artificial Intelligence
- Computational Biology/methods
- Humans
- Models, Molecular
- Molecular Sequence Data
- Peptides/chemistry
- Protein Binding
- Protein Conformation
- Protein Interaction Domains and Motifs
- Protein Interaction Mapping/methods
- Proteins/chemistry
- Reproducibility of Results
- Sequence Alignment