Reliable prediction of T-cell epitopes using neural networks with novel sequence representations

Morten Nielsen; Claus Lundegaard; Peder Worning; Sanne Lise Lauemøller; Kasper Lamberth; Søren Buus; Søren Brunak; Ole Lund

Department of Immunology and Microbiology

648 Citations (Scopus)

Abstract

In this paper we describe an improved neural network method to predict T-cell class I epitopes. A novel input representation has been developed consisting of a combination of sparse encoding, Blosum encoding, and input derived from hidden Markov models. We demonstrate that the combination of several neural networks derived using different sequence-encoding schemes has a performance superior to neural networks derived using a single sequence-encoding scheme. The new method is shown to have a performance that is substantially higher than that of other methods. By use of mutual information calculations we show that peptides that bind to the HLA A*0204 complex display signal of higher order sequence correlations. Neural networks are ideally suited to integrate such higher order correlations when predicting the binding affinity. It is this feature combined with the use of several neural networks derived from different and novel sequence-encoding schemes and the ability of the neural network to be trained on data consisting of continuous binding affinities that gives the new method an improved performance. The difference in predictive performance between the neural network methods and that of the matrix-driven methods is found to be most significant for peptides that bind strongly to the HLA molecule, confirming that the signal of higher order sequence correlation is most strongly present in high-binding peptides. Finally, we use the method to predict T-cell epitopes for the genome of hepatitis C virus and discuss possible applications of the prediction method to guide the process of rational vaccine design.

Original language	English
Journal	Protein Science
Volume	12
Issue number	5
Pages (from-to)	1007-1017
Number of pages	11
ISSN	0961-8368
Publication status	Published - 2003

Cite this

@article{5aa38fe0ebcb11ddbf70000ea68e967b,

title = "Reliable prediction of T-cell epitopes using neural networks with novel sequence representations",

author = "Morten Nielsen and Claus Lundegaard and Peder Worning and Lauem{\o}ller, {Sanne Lise} and Kasper Lamberth and S{\o}ren Buus and S{\o}ren Brunak and Ole Lund",

note = "Keywords: Amino Acid Sequence; Epitopes, T-Lymphocyte; Genome, Viral; HLA-A2 Antigen; Hepacivirus; Histocompatibility Antigens Class I; Humans; Markov Chains; Models, Molecular; Neural Networks (Computer); Peptides; Protein Binding",

year = "2003",

language = "English",

volume = "12",

pages = "1007--1017",

journal = "Protein Science",

issn = "0961-8368",

publisher = "Wiley-Blackwell",

number = "5",

}

TY - JOUR

T1 - Reliable prediction of T-cell epitopes using neural networks with novel sequence representations

AU - Nielsen, Morten

AU - Lundegaard, Claus

AU - Worning, Peder

AU - Lauemøller, Sanne Lise

AU - Lamberth, Kasper

AU - Buus, Søren

AU - Brunak, Søren

AU - Lund, Ole

N1 - Keywords: Amino Acid Sequence; Epitopes, T-Lymphocyte; Genome, Viral; HLA-A2 Antigen; Hepacivirus; Histocompatibility Antigens Class I; Humans; Markov Chains; Models, Molecular; Neural Networks (Computer); Peptides; Protein Binding

PY - 2003

Y1 - 2003

M3 - Journal article

C2 - 12717023

SN - 0961-8368

VL - 12

SP - 1007

EP - 1017

JO - Protein Science

JF - Protein Science

IS - 5

ER -
