TY - JOUR
T1 - Potentials of mean force for protein structure prediction vindicated, formalized and generalized
AU - Hamelryck, Thomas
AU - Borg, Mikael
AU - Paluszewski, Martin
AU - Paulsen, Jonas
AU - Frellsen, Jes
AU - Andreetta, Christian
AU - Boomsma, Wouter Krogh
AU - Bottaro, Sandro
AU - Ferkinghoff-Borg, Jesper
PY - 2010
Y1 - 2010
AB - Understanding protein structure is of crucial importance in science, medicine and biotechnology. For about two decades, knowledge-based potentials based on pairwise distances--so-called "potentials of mean force" (PMFs)--have been center stage in the prediction and design of protein structure and the simulation of protein folding. However, the validity, scope and limitations of these potentials are still vigorously debated and disputed, and the optimal choice of the reference state--a necessary component of these potentials--is an unsolved problem. PMFs are loosely justified by analogy to the reversible work theorem in statistical physics, or by a statistical argument based on a likelihood function. Both justifications are insightful but leave many questions unanswered. Here, we show for the first time that PMFs can be seen as approximations to quantities that do have a rigorous probabilistic justification: they naturally arise when probability distributions over different features of proteins need to be combined. We call these quantities "reference ratio distributions" deriving from the application of the "reference ratio method." This new view is not only of theoretical relevance but leads to many insights that are of direct practical use: the reference state is uniquely defined and does not require external physical insights; the approach can be generalized beyond pairwise distances to arbitrary features of protein structure; and it becomes clear for which purposes the use of these quantities is justified. We illustrate these insights with two applications, involving the radius of gyration and hydrogen bonding. In the latter case, we also show how the reference ratio method can be iteratively applied to sculpt an energy funnel. Our results considerably increase the understanding and scope of energy functions derived from known biomolecular structures.
KW - Algorithms
KW - Computational Biology
KW - Hydrogen Bonding
KW - Models, Molecular
KW - Protein Conformation
KW - Protein Folding
KW - Reproducibility of Results
KW - Thermodynamics
U2 - 10.1371/journal.pone.0013714
DO - 10.1371/journal.pone.0013714
M3 - Journal article
C2 - 21103041
SN - 1932-6203
VL - 5
JO - PLoS ONE
JF - PLoS ONE
IS - 11
M1 - e13714
ER -