Systematically benchmarking peptide-MHC binding predictors: From synthetic to naturally processed epitopes

A number of machine learning-based predictors have been developed for identifying immunogenic T-cell epitopes based on major histocompatibility complex (MHC) class I and II binding affinities. Rationally selecting the most appropriate tool has been complicated by the evolving training data and machi...

Detailed description

Bibliographic details
Published in: PLoS computational biology 2018-11, Vol.14 (11), p.e1006457-e1006457
Main authors: Zhao, Weilong; Sher, Xinwei
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page e1006457
container_issue 11
container_start_page e1006457
container_title PLoS computational biology
container_volume 14
creator Zhao, Weilong
Sher, Xinwei
description A number of machine learning-based predictors have been developed for identifying immunogenic T-cell epitopes based on major histocompatibility complex (MHC) class I and II binding affinities. Rationally selecting the most appropriate tool has been complicated by the evolving training data and machine learning methods. Despite the recent advances made in generating high-quality MHC-eluted, naturally processed ligandomes, the reliability of new predictors on these epitopes has yet to be evaluated. This study reports the latest benchmarking of an extensive set of MHC-binding predictors using newly available, untested data on both synthetic and naturally processed epitopes. Thirty-two human leukocyte antigen (HLA) class I and 24 HLA class II alleles are included in the blind test set. Artificial neural network (ANN)-based approaches demonstrated better performance than regression-based machine learning and structural modeling. Among the 18 predictors benchmarked, the ANN-based mhcflurry and nn_align perform best for MHC class I 9-mer and class II 15-mer predictions, respectively, on binding/non-binding classification (area under the curve = 0.911). NetMHCpan4 also demonstrated comparable predictive power. Our customization of mhcflurry into a pan-HLA predictor achieved accuracy similar to that of NetMHCpan. The overall accuracy of these methods is comparable between 9-mer and 10-mer testing data. However, the top methods deliver low correlations between the predicted and the experimental affinities for strong MHC binders. When used on naturally processed MHC ligands, tools that have been trained on elution data (NetMHCpan4 and MixMHCpred) show better accuracy than pure binding affinity predictors. The variability of the false prediction rate is considerable among HLA types and datasets. Finally, the structure-based predictor Rosetta FlexPepDock is less optimal than the machine learning approaches.
With our benchmarking of MHC-binding and MHC-elution predictors using comprehensive metrics, an unbiased view for establishing best practice in T-cell epitope prediction is presented, facilitating future development of methods in immunogenomics.
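The binding/non-binding AUC quoted in the description can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the toy affinity values, and the 500 nM cutoff used to label measured binders are all assumptions made for illustration (the record does not state which threshold the study used). The sketch labels peptides from measured IC50 values and scores a predictor by the fraction of binder/non-binder pairs it ranks correctly, which is equivalent to the area under the ROC curve.

```python
from typing import List, Sequence


def label_binders(ic50_nm: Sequence[float], threshold_nm: float = 500.0) -> List[int]:
    """Label a peptide as a binder (1) if its measured IC50 is at or below the cutoff.

    The 500 nM default is a commonly used convention and an assumption here,
    not a value taken from the catalog record.
    """
    return [1 if value <= threshold_nm else 0 for value in ic50_nm]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve computed by pairwise comparison.

    labels: 1 for binder, 0 for non-binder.
    scores: predictor output where a higher value means "more likely a binder".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    # Count binder/non-binder pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): measured and predicted IC50 values in nM.
measured_ic50 = [12.0, 85.0, 430.0, 2500.0, 18000.0, 40000.0]
predicted_ic50 = [30.0, 1200.0, 150.0, 900.0, 5000.0, 20000.0]

labels = label_binders(measured_ic50)
# A lower predicted IC50 means stronger binding, so negate it to get a score.
scores = [-value for value in predicted_ic50]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # 8 of the 9 binder/non-binder pairs are ordered correctly
```

The pairwise form is quadratic in the number of peptides, which is fine for a toy example; for datasets of benchmark size a rank-based implementation from an established library would be the more practical choice.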
doi_str_mv 10.1371/journal.pcbi.1006457
format Article
fullrecord <record><control><sourceid>gale_plos_</sourceid><recordid>TN_cdi_plos_journals_2250635261</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A564080822</galeid><doaj_id>oai_doaj_org_article_1e407d1f72334307852d2172b98d0a7c</doaj_id><sourcerecordid>A564080822</sourcerecordid><originalsourceid>FETCH-LOGICAL-c633t-7c41dc8772947beb1766745ffecc719f561745dad58dd4e2f6800927d48748563</originalsourceid><addsrcrecordid>eNqVkktv1DAUhSMEoqXwDxBEYgOLGfx2wgKpGlE6UgGJwtpy7JuphyQOtoOYf4_n0aqD2KAs4tx85_jeq1MUzzGaYyrx27WfwqC7-WgaN8cICcblg-IUc05nkvLq4b3zSfEkxjVC-ViLx8UJRQxViOHTYn29iQl6nZzRXbcpGxjMTa_DDzesyhHG5CzMPl0uysYNdlcLYJ1JPsR35UXwfRk3Q7qBrC-TLwedprAzGoM3ECPYEkaX_AjxafGo1V2EZ4f3WfH94sO3xeXs6svH5eL8amYEpWkmDcPWVFKSmskGGiyFkIy3LRgjcd1ygfOn1ZZX1jIgragQqom0rJKs4oKeFS_3vmPnozqsKSpCOBKUE4EzsdwT1uu1GoPLA2-U107tCj6slA55og4UBoakxa0klDKKZMWJJViSpq4s0tJkr_eH26amB2tgSHkBR6bHfwZ3o1b-lxKEMERlNnh9MAj-5wQxqd5FA12nB_BT7hvT3HrFWZ3RV3-h_55uvqdWOg_ghtbne01-LPTO-AFal-vnXGwzUBGSBW-OBJlJ8Dut9BSjWl5__Q_28zHL9qwJPsYA7d1WMFLbDN-2r7YZVocMZ9mL-xu9E92Glv4BA-ntvw</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2250635261</pqid></control><display><type>article</type><title>Systematically benchmarking peptide-MHC binding predictors: From synthetic to naturally processed epitopes</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Public Library of Science (PLoS) Journals Open Access</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><creator>Zhao, Weilong ; Sher, Xinwei</creator><contributor>Peters, Bjoern</contributor><creatorcontrib>Zhao, Weilong ; Sher, Xinwei ; Peters, Bjoern</creatorcontrib><description>A number of machine learning-based predictors have been developed for identifying immunogenic T-cell epitopes based on major histocompatibility complex (MHC) class I and II binding affinities. 
Rationally selecting the most appropriate tool has been complicated by the evolving training data and machine learning methods. Despite the recent advances made in generating high-quality MHC-eluted, naturally processed ligandome, the reliability of new predictors on these epitopes has yet to be evaluated. This study reports the latest benchmarking on an extensive set of MHC-binding predictors by using newly available, untested data of both synthetic and naturally processed epitopes. 32 human leukocyte antigen (HLA) class I and 24 HLA class II alleles are included in the blind test set. Artificial neural network (ANN)-based approaches demonstrated better performance than regression-based machine learning and structural modeling. Among the 18 predictors benchmarked, ANN-based mhcflurry and nn_align perform the best for MHC class I 9-mer and class II 15-mer predictions, respectively, on binding/non-binding classification (Area Under Curves = 0.911). NetMHCpan4 also demonstrated comparable predictive power. Our customization of mhcflurry to a pan-HLA predictor has achieved similar accuracy to NetMHCpan. The overall accuracy of these methods are comparable between 9-mer and 10-mer testing data. However, the top methods deliver low correlations between the predicted versus the experimental affinities for strong MHC binders. When used on naturally processed MHC-ligands, tools that have been trained on elution data (NetMHCpan4 and MixMHCpred) shows better accuracy than pure binding affinity predictor. The variability of false prediction rate is considerable among HLA types and datasets. Finally, structure-based predictor of Rosetta FlexPepDock is less optimal compared to the machine learning approaches. 
With our benchmarking of MHC-binding and MHC-elution predictors using a comprehensive metrics, a unbiased view for establishing best practice of T-cell epitope predictions is presented, facilitating future development of methods in immunogenomics.</description><identifier>ISSN: 1553-7358</identifier><identifier>ISSN: 1553-734X</identifier><identifier>EISSN: 1553-7358</identifier><identifier>DOI: 10.1371/journal.pcbi.1006457</identifier><identifier>PMID: 30408041</identifier><language>eng</language><publisher>United States: Public Library of Science</publisher><subject>Accuracy ; Affinity ; Algorithms ; Alleles ; Antigenic determinants ; Antigens ; Artificial intelligence ; Artificial neural networks ; Benchmarking ; Benchmarks ; Best practice ; Binders ; Binding ; Bioinformatics ; Biology and Life Sciences ; Cancer ; Cancer Vaccines - immunology ; Computer and Information Sciences ; Datasets ; Datasets as Topic ; Elution ; Epitopes ; Epitopes, T-Lymphocyte - chemistry ; Epitopes, T-Lymphocyte - immunology ; Epitopes, T-Lymphocyte - metabolism ; Histocompatibility antigen HLA ; Histocompatibility Antigens Class I - immunology ; Histocompatibility Antigens Class I - metabolism ; Histocompatibility Antigens Class II - immunology ; Histocompatibility Antigens Class II - metabolism ; HLA antigens ; Humans ; Immune response ; Immunogenicity ; Immunogenicity, Vaccine ; Immunotherapy ; Learning algorithms ; Learning theory ; Leukocytes ; Ligands ; Lymphocytes ; Lymphocytes T ; Machine Learning ; Major histocompatibility complex ; Major Histocompatibility Complex - immunology ; Mass spectrometry ; Medicine and Health Sciences ; Neural networks ; Observations ; Peptides ; Peptides - chemistry ; Peptides - immunology ; Peptides - metabolism ; Physical Sciences ; Physiological aspects ; Predictions ; Protein Binding ; Reliability analysis ; Reproducibility of Results ; Research and Analysis Methods ; Scientific imaging ; T cell receptors ; T cells ; T-Lymphocytes - immunology 
; Vaccines</subject><ispartof>PLoS computational biology, 2018-11, Vol.14 (11), p.e1006457-e1006457</ispartof><rights>COPYRIGHT 2018 Public Library of Science</rights><rights>2018 Zhao, Sher. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>2018 Zhao, Sher 2018 Zhao, Sher</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c633t-7c41dc8772947beb1766745ffecc719f561745dad58dd4e2f6800927d48748563</citedby><cites>FETCH-LOGICAL-c633t-7c41dc8772947beb1766745ffecc719f561745dad58dd4e2f6800927d48748563</cites><orcidid>0000-0001-8909-5322 ; 0000-0003-3977-3695</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6224037/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6224037/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,864,885,2102,2928,23866,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/30408041$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Peters, Bjoern</contributor><creatorcontrib>Zhao, Weilong</creatorcontrib><creatorcontrib>Sher, Xinwei</creatorcontrib><title>Systematically benchmarking peptide-MHC binding predictors: From synthetic to naturally processed epitopes</title><title>PLoS computational biology</title><addtitle>PLoS 
Comput Biol</addtitle><description>A number of machine learning-based predictors have been developed for identifying immunogenic T-cell epitopes based on major histocompatibility complex (MHC) class I and II binding affinities. Rationally selecting the most appropriate tool has been complicated by the evolving training data and machine learning methods. Despite the recent advances made in generating high-quality MHC-eluted, naturally processed ligandome, the reliability of new predictors on these epitopes has yet to be evaluated. This study reports the latest benchmarking on an extensive set of MHC-binding predictors by using newly available, untested data of both synthetic and naturally processed epitopes. 32 human leukocyte antigen (HLA) class I and 24 HLA class II alleles are included in the blind test set. Artificial neural network (ANN)-based approaches demonstrated better performance than regression-based machine learning and structural modeling. Among the 18 predictors benchmarked, ANN-based mhcflurry and nn_align perform the best for MHC class I 9-mer and class II 15-mer predictions, respectively, on binding/non-binding classification (Area Under Curves = 0.911). NetMHCpan4 also demonstrated comparable predictive power. Our customization of mhcflurry to a pan-HLA predictor has achieved similar accuracy to NetMHCpan. The overall accuracy of these methods are comparable between 9-mer and 10-mer testing data. However, the top methods deliver low correlations between the predicted versus the experimental affinities for strong MHC binders. When used on naturally processed MHC-ligands, tools that have been trained on elution data (NetMHCpan4 and MixMHCpred) shows better accuracy than pure binding affinity predictor. The variability of false prediction rate is considerable among HLA types and datasets. Finally, structure-based predictor of Rosetta FlexPepDock is less optimal compared to the machine learning approaches. 
With our benchmarking of MHC-binding and MHC-elution predictors using a comprehensive metrics, a unbiased view for establishing best practice of T-cell epitope predictions is presented, facilitating future development of methods in immunogenomics.</description><subject>Accuracy</subject><subject>Affinity</subject><subject>Algorithms</subject><subject>Alleles</subject><subject>Antigenic determinants</subject><subject>Antigens</subject><subject>Artificial intelligence</subject><subject>Artificial neural networks</subject><subject>Benchmarking</subject><subject>Benchmarks</subject><subject>Best practice</subject><subject>Binders</subject><subject>Binding</subject><subject>Bioinformatics</subject><subject>Biology and Life Sciences</subject><subject>Cancer</subject><subject>Cancer Vaccines - immunology</subject><subject>Computer and Information Sciences</subject><subject>Datasets</subject><subject>Datasets as Topic</subject><subject>Elution</subject><subject>Epitopes</subject><subject>Epitopes, T-Lymphocyte - chemistry</subject><subject>Epitopes, T-Lymphocyte - immunology</subject><subject>Epitopes, T-Lymphocyte - metabolism</subject><subject>Histocompatibility antigen HLA</subject><subject>Histocompatibility Antigens Class I - immunology</subject><subject>Histocompatibility Antigens Class I - metabolism</subject><subject>Histocompatibility Antigens Class II - immunology</subject><subject>Histocompatibility Antigens Class II - metabolism</subject><subject>HLA antigens</subject><subject>Humans</subject><subject>Immune response</subject><subject>Immunogenicity</subject><subject>Immunogenicity, Vaccine</subject><subject>Immunotherapy</subject><subject>Learning algorithms</subject><subject>Learning theory</subject><subject>Leukocytes</subject><subject>Ligands</subject><subject>Lymphocytes</subject><subject>Lymphocytes T</subject><subject>Machine Learning</subject><subject>Major histocompatibility complex</subject><subject>Major Histocompatibility Complex - 
immunology</subject><subject>Mass spectrometry</subject><subject>Medicine and Health Sciences</subject><subject>Neural networks</subject><subject>Observations</subject><subject>Peptides</subject><subject>Peptides - chemistry</subject><subject>Peptides - immunology</subject><subject>Peptides - metabolism</subject><subject>Physical Sciences</subject><subject>Physiological aspects</subject><subject>Predictions</subject><subject>Protein Binding</subject><subject>Reliability analysis</subject><subject>Reproducibility of Results</subject><subject>Research and Analysis Methods</subject><subject>Scientific imaging</subject><subject>T cell receptors</subject><subject>T cells</subject><subject>T-Lymphocytes - immunology</subject><subject>Vaccines</subject><issn>1553-7358</issn><issn>1553-734X</issn><issn>1553-7358</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>DOA</sourceid><recordid>eNqVkktv1DAUhSMEoqXwDxBEYgOLGfx2wgKpGlE6UgGJwtpy7JuphyQOtoOYf4_n0aqD2KAs4tx85_jeq1MUzzGaYyrx27WfwqC7-WgaN8cICcblg-IUc05nkvLq4b3zSfEkxjVC-ViLx8UJRQxViOHTYn29iQl6nZzRXbcpGxjMTa_DDzesyhHG5CzMPl0uysYNdlcLYJ1JPsR35UXwfRk3Q7qBrC-TLwedprAzGoM3ECPYEkaX_AjxafGo1V2EZ4f3WfH94sO3xeXs6svH5eL8amYEpWkmDcPWVFKSmskGGiyFkIy3LRgjcd1ygfOn1ZZX1jIgragQqom0rJKs4oKeFS_3vmPnozqsKSpCOBKUE4EzsdwT1uu1GoPLA2-U107tCj6slA55og4UBoakxa0klDKKZMWJJViSpq4s0tJkr_eH26amB2tgSHkBR6bHfwZ3o1b-lxKEMERlNnh9MAj-5wQxqd5FA12nB_BT7hvT3HrFWZ3RV3-h_55uvqdWOg_ghtbne01-LPTO-AFal-vnXGwzUBGSBW-OBJlJ8Dut9BSjWl5__Q_28zHL9qwJPsYA7d1WMFLbDN-2r7YZVocMZ9mL-xu9E92Glv4BA-ntvw</recordid><startdate>20181101</startdate><enddate>20181101</enddate><creator>Zhao, Weilong</creator><creator>Sher, Xinwei</creator><general>Public Library of Science</general><general>Public 
Library of Science (PLoS)</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISN</scope><scope>ISR</scope><scope>3V.</scope><scope>7QO</scope><scope>7QP</scope><scope>7TK</scope><scope>7TM</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AL</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>LK8</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0001-8909-5322</orcidid><orcidid>https://orcid.org/0000-0003-3977-3695</orcidid></search><sort><creationdate>20181101</creationdate><title>Systematically benchmarking peptide-MHC binding predictors: From synthetic to naturally processed epitopes</title><author>Zhao, Weilong ; Sher, Xinwei</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c633t-7c41dc8772947beb1766745ffecc719f561745dad58dd4e2f6800927d48748563</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Accuracy</topic><topic>Affinity</topic><topic>Algorithms</topic><topic>Alleles</topic><topic>Antigenic determinants</topic><topic>Antigens</topic><topic>Artificial 
intelligence</topic><topic>Artificial neural networks</topic><topic>Benchmarking</topic><topic>Benchmarks</topic><topic>Best practice</topic><topic>Binders</topic><topic>Binding</topic><topic>Bioinformatics</topic><topic>Biology and Life Sciences</topic><topic>Cancer</topic><topic>Cancer Vaccines - immunology</topic><topic>Computer and Information Sciences</topic><topic>Datasets</topic><topic>Datasets as Topic</topic><topic>Elution</topic><topic>Epitopes</topic><topic>Epitopes, T-Lymphocyte - chemistry</topic><topic>Epitopes, T-Lymphocyte - immunology</topic><topic>Epitopes, T-Lymphocyte - metabolism</topic><topic>Histocompatibility antigen HLA</topic><topic>Histocompatibility Antigens Class I - immunology</topic><topic>Histocompatibility Antigens Class I - metabolism</topic><topic>Histocompatibility Antigens Class II - immunology</topic><topic>Histocompatibility Antigens Class II - metabolism</topic><topic>HLA antigens</topic><topic>Humans</topic><topic>Immune response</topic><topic>Immunogenicity</topic><topic>Immunogenicity, Vaccine</topic><topic>Immunotherapy</topic><topic>Learning algorithms</topic><topic>Learning theory</topic><topic>Leukocytes</topic><topic>Ligands</topic><topic>Lymphocytes</topic><topic>Lymphocytes T</topic><topic>Machine Learning</topic><topic>Major histocompatibility complex</topic><topic>Major Histocompatibility Complex - immunology</topic><topic>Mass spectrometry</topic><topic>Medicine and Health Sciences</topic><topic>Neural networks</topic><topic>Observations</topic><topic>Peptides</topic><topic>Peptides - chemistry</topic><topic>Peptides - immunology</topic><topic>Peptides - metabolism</topic><topic>Physical Sciences</topic><topic>Physiological aspects</topic><topic>Predictions</topic><topic>Protein Binding</topic><topic>Reliability analysis</topic><topic>Reproducibility of Results</topic><topic>Research and Analysis Methods</topic><topic>Scientific imaging</topic><topic>T cell receptors</topic><topic>T 
cells</topic><topic>T-Lymphocytes - immunology</topic><topic>Vaccines</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhao, Weilong</creatorcontrib><creatorcontrib>Sher, Xinwei</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Canada</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Calcium &amp; Calcified Tissue Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest 
Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Computing Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Biological Science Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>PLoS computational biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhao, Weilong</au><au>Sher, Xinwei</au><au>Peters, Bjoern</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Systematically benchmarking peptide-MHC binding predictors: From synthetic to naturally processed epitopes</atitle><jtitle>PLoS computational 
biology</jtitle><addtitle>PLoS Comput Biol</addtitle><date>2018-11-01</date><risdate>2018</risdate><volume>14</volume><issue>11</issue><spage>e1006457</spage><epage>e1006457</epage><pages>e1006457-e1006457</pages><issn>1553-7358</issn><issn>1553-734X</issn><eissn>1553-7358</eissn><abstract>A number of machine learning-based predictors have been developed for identifying immunogenic T-cell epitopes based on major histocompatibility complex (MHC) class I and II binding affinities. Rationally selecting the most appropriate tool has been complicated by the evolving training data and machine learning methods. Despite the recent advances made in generating high-quality MHC-eluted, naturally processed ligandome, the reliability of new predictors on these epitopes has yet to be evaluated. This study reports the latest benchmarking on an extensive set of MHC-binding predictors by using newly available, untested data of both synthetic and naturally processed epitopes. 32 human leukocyte antigen (HLA) class I and 24 HLA class II alleles are included in the blind test set. Artificial neural network (ANN)-based approaches demonstrated better performance than regression-based machine learning and structural modeling. Among the 18 predictors benchmarked, ANN-based mhcflurry and nn_align perform the best for MHC class I 9-mer and class II 15-mer predictions, respectively, on binding/non-binding classification (Area Under Curves = 0.911). NetMHCpan4 also demonstrated comparable predictive power. Our customization of mhcflurry to a pan-HLA predictor has achieved similar accuracy to NetMHCpan. The overall accuracy of these methods are comparable between 9-mer and 10-mer testing data. However, the top methods deliver low correlations between the predicted versus the experimental affinities for strong MHC binders. 
When used on naturally processed MHC-ligands, tools that have been trained on elution data (NetMHCpan4 and MixMHCpred) shows better accuracy than pure binding affinity predictor. The variability of false prediction rate is considerable among HLA types and datasets. Finally, structure-based predictor of Rosetta FlexPepDock is less optimal compared to the machine learning approaches. With our benchmarking of MHC-binding and MHC-elution predictors using a comprehensive metrics, a unbiased view for establishing best practice of T-cell epitope predictions is presented, facilitating future development of methods in immunogenomics.</abstract><cop>United States</cop><pub>Public Library of Science</pub><pmid>30408041</pmid><doi>10.1371/journal.pcbi.1006457</doi><orcidid>https://orcid.org/0000-0001-8909-5322</orcidid><orcidid>https://orcid.org/0000-0003-3977-3695</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1553-7358
ispartof PLoS computational biology, 2018-11, Vol.14 (11), p.e1006457-e1006457
issn 1553-7358
1553-734X
1553-7358
language eng
recordid cdi_plos_journals_2250635261
source MEDLINE; DOAJ Directory of Open Access Journals; Public Library of Science (PLoS) Journals Open Access; EZB-FREE-00999 freely available EZB journals; PubMed Central
subjects Accuracy
Affinity
Algorithms
Alleles
Antigenic determinants
Antigens
Artificial intelligence
Artificial neural networks
Benchmarking
Benchmarks
Best practice
Binders
Binding
Bioinformatics
Biology and Life Sciences
Cancer
Cancer Vaccines - immunology
Computer and Information Sciences
Datasets
Datasets as Topic
Elution
Epitopes
Epitopes, T-Lymphocyte - chemistry
Epitopes, T-Lymphocyte - immunology
Epitopes, T-Lymphocyte - metabolism
Histocompatibility antigen HLA
Histocompatibility Antigens Class I - immunology
Histocompatibility Antigens Class I - metabolism
Histocompatibility Antigens Class II - immunology
Histocompatibility Antigens Class II - metabolism
HLA antigens
Humans
Immune response
Immunogenicity
Immunogenicity, Vaccine
Immunotherapy
Learning algorithms
Learning theory
Leukocytes
Ligands
Lymphocytes
Lymphocytes T
Machine Learning
Major histocompatibility complex
Major Histocompatibility Complex - immunology
Mass spectrometry
Medicine and Health Sciences
Neural networks
Observations
Peptides
Peptides - chemistry
Peptides - immunology
Peptides - metabolism
Physical Sciences
Physiological aspects
Predictions
Protein Binding
Reliability analysis
Reproducibility of Results
Research and Analysis Methods
Scientific imaging
T cell receptors
T cells
T-Lymphocytes - immunology
Vaccines
title Systematically benchmarking peptide-MHC binding predictors: From synthetic to naturally processed epitopes
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-23T23%3A17%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Systematically%20benchmarking%20peptide-MHC%20binding%20predictors:%20From%20synthetic%20to%20naturally%20processed%20epitopes&rft.jtitle=PLoS%20computational%20biology&rft.au=Zhao,%20Weilong&rft.date=2018-11-01&rft.volume=14&rft.issue=11&rft.spage=e1006457&rft.epage=e1006457&rft.pages=e1006457-e1006457&rft.issn=1553-7358&rft.eissn=1553-7358&rft_id=info:doi/10.1371/journal.pcbi.1006457&rft_dat=%3Cgale_plos_%3EA564080822%3C/gale_plos_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2250635261&rft_id=info:pmid/30408041&rft_galeid=A564080822&rft_doaj_id=oai_doaj_org_article_1e407d1f72334307852d2172b98d0a7c&rfr_iscdi=true