Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction

Mette V Larsen; Claus Lundegaard; Kasper Lamberth; Søren Buus; Ole Lund; Morten Nielsen

doi:10.1186/1471-2105-8-424

Department of Immunology and Microbiology

354 Citations (Scopus)

Abstract

BACKGROUND: Reliable predictions of cytotoxic T lymphocyte (CTL) epitopes are essential for rational vaccine design. Most importantly, they can minimize the experimental effort needed to identify epitopes. NetCTL is a web-based tool designed for predicting human CTL epitopes in any given protein. It does so by integrating predictions of proteasomal cleavage, TAP transport efficiency, and MHC class I affinity. At least four other methods have been developed recently that likewise attempt to predict CTL epitopes: EpiJen, MAPPP, MHC-pathway, and WAPP. To compare the performance of prediction methods, objective benchmarks and standardized performance measures are needed. Here, we develop such a large-scale benchmark and corresponding performance measures, and report the performance of an updated version 1.2 of NetCTL in comparison with the four other methods.

RESULTS: We define a number of performance measures that can handle the different types of output data from the five methods. We use two evaluation datasets consisting of known HIV CTL epitopes and their source proteins. The source proteins are split into all possible 9-mers; all 9-mers other than the annotated epitopes are considered non-epitopes. In the RANK measure, we compare two methods at a time and count how often each method ranks the epitope highest. In another measure, we find the specificity of the methods at three predefined sensitivity values. Lastly, for each method, we calculate the percentage of known epitopes that rank among the 5% of peptides with the highest predicted scores.

CONCLUSION: NetCTL-1.2 is demonstrated to have a higher predictive performance than EpiJen, MAPPP, MHC-pathway, and WAPP on all performance measures. The higher performance of NetCTL-1.2 as compared to EpiJen and MHC-pathway is, however, not statistically significant on all measures. In the large-scale benchmark calculation consisting of 216 known HIV epitopes covering all 12 recognized HLA supertypes, the NetCTL-1.2 method was shown to have a sensitivity above 0.72 among the 5% top-scoring peptides. On this dataset, the best of the other methods achieved a sensitivity of 0.64. The NetCTL-1.2 method is available at http://www.cbs.dtu.dk/services/NetCTL. All datasets used are available at http://www.cbs.dtu.dk/suppl/immunology/CTL-1.2.php.
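
The benchmark construction and the top-5% measure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the rounding of the 5% cutoff (rounded up, with at least one peptide kept), and the `toy_score` function are all assumptions made for this example. A real evaluation would replace `toy_score` with the scores produced by a predictor such as NetCTL.

```python
import math
from typing import Callable, Sequence


def split_into_kmers(protein: str, k: int = 9) -> list[str]:
    """Return every overlapping k-mer of a protein sequence."""
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]


def epitope_in_top_fraction(
    protein: str,
    epitope: str,
    score: Callable[[str], float],
    fraction: float = 0.05,
) -> bool:
    """Check whether the epitope ranks within the top-scoring fraction
    of all peptides of the same length in its source protein."""
    peptides = split_into_kmers(protein, len(epitope))
    ranked = sorted(peptides, key=score, reverse=True)
    # Assumption: round the cutoff up and always keep at least one peptide.
    cutoff = max(1, math.ceil(fraction * len(ranked)))
    return epitope in ranked[:cutoff]


def sensitivity_at_top_fraction(
    pairs: Sequence[tuple[str, str]],
    score: Callable[[str], float],
    fraction: float = 0.05,
) -> float:
    """Fraction of (protein, epitope) pairs whose epitope is in the top fraction."""
    hits = sum(epitope_in_top_fraction(p, e, score, fraction) for p, e in pairs)
    return hits / len(pairs)


def toy_score(peptide: str) -> float:
    """Hypothetical stand-in for a real predictor: counts leucines."""
    return float(peptide.count("L"))


if __name__ == "__main__":
    protein = "A" * 20 + "L" * 9 + "A" * 20  # made-up sequence, 49 residues
    pairs = [(protein, "L" * 9), (protein, "A" * 9)]
    print(len(split_into_kmers(protein)))
    print(sensitivity_at_top_fraction(pairs, toy_score))
```

Ranking each epitope only against peptides from its own source protein follows the abstract's setup, in which every 9-mer other than the annotated epitopes is treated as a non-epitope.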

Original language	English
Journal	BMC Bioinformatics
Volume	8
Pages (from-to)	424
DOIs	https://doi.org/10.1186/1471-2105-8-424
Publication status	Published - 2007

Cite this

@article{01f4fed0ebc911ddbf70000ea68e967b,

title = "Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction",

abstract = "BACKGROUND: Reliable predictions of cytotoxic T lymphocyte (CTL) epitopes are essential for rational vaccine design. Most importantly, they can minimize the experimental effort needed to identify epitopes. NetCTL is a web-based tool designed for predicting human CTL epitopes in any given protein. It does so by integrating predictions of proteasomal cleavage, TAP transport efficiency, and MHC class I affinity. At least four other methods have been developed recently that likewise attempt to predict CTL epitopes: EpiJen, MAPPP, MHC-pathway, and WAPP. To compare the performance of prediction methods, objective benchmarks and standardized performance measures are needed. Here, we develop such a large-scale benchmark and corresponding performance measures, and report the performance of an updated version 1.2 of NetCTL in comparison with the four other methods. RESULTS: We define a number of performance measures that can handle the different types of output data from the five methods. We use two evaluation datasets consisting of known HIV CTL epitopes and their source proteins. The source proteins are split into all possible 9-mers; all 9-mers other than the annotated epitopes are considered non-epitopes. In the RANK measure, we compare two methods at a time and count how often each method ranks the epitope highest. In another measure, we find the specificity of the methods at three predefined sensitivity values. Lastly, for each method, we calculate the percentage of known epitopes that rank among the 5% of peptides with the highest predicted scores. CONCLUSION: NetCTL-1.2 is demonstrated to have a higher predictive performance than EpiJen, MAPPP, MHC-pathway, and WAPP on all performance measures. The higher performance of NetCTL-1.2 as compared to EpiJen and MHC-pathway is, however, not statistically significant on all measures. In the large-scale benchmark calculation consisting of 216 known HIV epitopes covering all 12 recognized HLA supertypes, the NetCTL-1.2 method was shown to have a sensitivity above 0.72 among the 5% top-scoring peptides. On this dataset, the best of the other methods achieved a sensitivity of 0.64. The NetCTL-1.2 method is available at http://www.cbs.dtu.dk/services/NetCTL. All datasets used are available at http://www.cbs.dtu.dk/suppl/immunology/CTL-1.2.php.",

author = "Larsen, {Mette V} and Claus Lundegaard and Kasper Lamberth and S{\o}ren Buus and Ole Lund and Morten Nielsen",

note = "Keywords: Algorithms; Binding Sites; Epitope Mapping; Epitopes, T-Lymphocyte; Protein Binding; Sequence Analysis, Protein; T-Lymphocytes, Cytotoxic",

year = "2007",

doi = "10.1186/1471-2105-8-424",

language = "English",

volume = "8",

pages = "424",

journal = "BMC Bioinformatics",

issn = "1471-2105",

publisher = "BioMed Central",

}

TY - JOUR

T1 - Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction

AU - Larsen, Mette V

AU - Lundegaard, Claus

AU - Lamberth, Kasper

AU - Buus, Søren

AU - Lund, Ole

AU - Nielsen, Morten

N1 - Keywords: Algorithms; Binding Sites; Epitope Mapping; Epitopes, T-Lymphocyte; Protein Binding; Sequence Analysis, Protein; T-Lymphocytes, Cytotoxic

PY - 2007

Y1 - 2007

AB - BACKGROUND: Reliable predictions of cytotoxic T lymphocyte (CTL) epitopes are essential for rational vaccine design. Most importantly, they can minimize the experimental effort needed to identify epitopes. NetCTL is a web-based tool designed for predicting human CTL epitopes in any given protein. It does so by integrating predictions of proteasomal cleavage, TAP transport efficiency, and MHC class I affinity. At least four other methods have been developed recently that likewise attempt to predict CTL epitopes: EpiJen, MAPPP, MHC-pathway, and WAPP. To compare the performance of prediction methods, objective benchmarks and standardized performance measures are needed. Here, we develop such a large-scale benchmark and corresponding performance measures, and report the performance of an updated version 1.2 of NetCTL in comparison with the four other methods. RESULTS: We define a number of performance measures that can handle the different types of output data from the five methods. We use two evaluation datasets consisting of known HIV CTL epitopes and their source proteins. The source proteins are split into all possible 9-mers; all 9-mers other than the annotated epitopes are considered non-epitopes. In the RANK measure, we compare two methods at a time and count how often each method ranks the epitope highest. In another measure, we find the specificity of the methods at three predefined sensitivity values. Lastly, for each method, we calculate the percentage of known epitopes that rank among the 5% of peptides with the highest predicted scores. CONCLUSION: NetCTL-1.2 is demonstrated to have a higher predictive performance than EpiJen, MAPPP, MHC-pathway, and WAPP on all performance measures. The higher performance of NetCTL-1.2 as compared to EpiJen and MHC-pathway is, however, not statistically significant on all measures. In the large-scale benchmark calculation consisting of 216 known HIV epitopes covering all 12 recognized HLA supertypes, the NetCTL-1.2 method was shown to have a sensitivity above 0.72 among the 5% top-scoring peptides. On this dataset, the best of the other methods achieved a sensitivity of 0.64. The NetCTL-1.2 method is available at http://www.cbs.dtu.dk/services/NetCTL. All datasets used are available at http://www.cbs.dtu.dk/suppl/immunology/CTL-1.2.php.

U2 - 10.1186/1471-2105-8-424

DO - 10.1186/1471-2105-8-424

M3 - Journal article

C2 - 17973982

SN - 1471-2105

VL - 8

SP - 424

JO - BMC Bioinformatics

JF - BMC Bioinformatics

ER -
