A simple probabilistic model of multibody interactions in proteins

Kristoffer Enøe Johansson; Thomas Hamelryck

doi:10.1002/prot.24277

5 Citations (Scopus)

Abstract

Protein structure prediction methods typically use statistical potentials, which rely on statistics derived from a database of known protein structures. In the vast majority of cases, these potentials involve pairwise distances or contacts between amino acids or atoms. Although some potentials beyond pairwise interactions have been described, the formulation of a general multibody potential is seen as intractable due to the perceived limited amount of data. In this article, we show that it is possible to formulate a probabilistic model of higher order interactions in proteins, without arbitrarily limiting the number of contacts. The success of this approach is based on replacing a naive table-based approach with a simple hierarchical model involving suitable probability distributions and conditional independence assumptions. The model captures the joint probability distribution of an amino acid and its neighbors, local structure and solvent exposure. We show that this model can be used to approximate the conditional probability distribution of an amino acid sequence given a structure using a pseudo-likelihood approach. We verify the model by decoy recognition and site-specific amino acid predictions. Our coarse-grained model is compared to state-of-the-art methods that use full atomic detail. This article illustrates how the use of simple probabilistic models can lead to new opportunities in the treatment of nonlocal interactions in knowledge-based protein structure prediction and design.

Original language	English
Journal	Proteins: Structure, Function, and Bioinformatics
Volume	81
Issue number	8
Pages (from-to)	1340-1350
Number of pages	11
ISSN	0887-3585
DOI	https://doi.org/10.1002/prot.24277
Status	Published - Aug. 2013

Access to document

10.1002/prot.24277

Citation formats

@article{8aa1d3682c4347c8a2442ddd128da3b4,

title = "A simple probabilistic model of multibody interactions in proteins",

abstract = "Protein structure prediction methods typically use statistical potentials, which rely on statistics derived from a database of known protein structures. In the vast majority of cases, these potentials involve pairwise distances or contacts between amino acids or atoms. Although some potentials beyond pairwise interactions have been described, the formulation of a general multibody potential is seen as intractable due to the perceived limited amount of data. In this article, we show that it is possible to formulate a probabilistic model of higher order interactions in proteins, without arbitrarily limiting the number of contacts. The success of this approach is based on replacing a naive table-based approach with a simple hierarchical model involving suitable probability distributions and conditional independence assumptions. The model captures the joint probability distribution of an amino acid and its neighbors, local structure and solvent exposure. We show that this model can be used to approximate the conditional probability distribution of an amino acid sequence given a structure using a pseudo-likelihood approach. We verify the model by decoy recognition and site-specific amino acid predictions. Our coarse-grained model is compared to state-of-the-art methods that use full atomic detail. This article illustrates how the use of simple probabilistic models can lead to new opportunities in the treatment of nonlocal interactions in knowledge-based protein structure prediction and design.",

author = "Johansson, {Kristoffer En{\o}e} and Hamelryck, Thomas",

note = "Copyright {\textcopyright} 2013 Wiley Periodicals, Inc., a Wiley company.",

year = "2013",

month = aug,

doi = "10.1002/prot.24277",

language = "English",

volume = "81",

pages = "1340--1350",

journal = "Proteins: Structure, Function, and Bioinformatics",

issn = "0887-3585",

publisher = "John Wiley {\&} Sons, Inc.",

number = "8",

}

TY - JOUR

T1 - A simple probabilistic model of multibody interactions in proteins

AU - Johansson, Kristoffer Enøe

AU - Hamelryck, Thomas

PY - 2013/8

Y1 - 2013/8

AB - Protein structure prediction methods typically use statistical potentials, which rely on statistics derived from a database of known protein structures. In the vast majority of cases, these potentials involve pairwise distances or contacts between amino acids or atoms. Although some potentials beyond pairwise interactions have been described, the formulation of a general multibody potential is seen as intractable due to the perceived limited amount of data. In this article, we show that it is possible to formulate a probabilistic model of higher order interactions in proteins, without arbitrarily limiting the number of contacts. The success of this approach is based on replacing a naive table-based approach with a simple hierarchical model involving suitable probability distributions and conditional independence assumptions. The model captures the joint probability distribution of an amino acid and its neighbors, local structure and solvent exposure. We show that this model can be used to approximate the conditional probability distribution of an amino acid sequence given a structure using a pseudo-likelihood approach. We verify the model by decoy recognition and site-specific amino acid predictions. Our coarse-grained model is compared to state-of-the-art methods that use full atomic detail. This article illustrates how the use of simple probabilistic models can lead to new opportunities in the treatment of nonlocal interactions in knowledge-based protein structure prediction and design.

U2 - 10.1002/prot.24277

DO - 10.1002/prot.24277

M3 - Journal article

C2 - 23468247

SN - 0887-3585

VL - 81

SP - 1340

EP - 1350

JO - Proteins: Structure, Function, and Bioinformatics

JF - Proteins: Structure, Function, and Bioinformatics

IS - 8

ER -
