MHCAttnNet: predicting MHC-peptide bindings for MHC alleles classes I and II using an attention-based deep neural model

Abstract. Motivation: Accurate prediction of binding between a major histocompatibility complex (MHC) allele and a peptide plays a major role in the synthesis of personalized cancer vaccines. The immune system struggles to distinguish between a cancerous and a healthy cell. In a patient suffering from...

Bibliographic details
Published in: Bioinformatics (Oxford, England), 2020-07, Vol.36 (Supplement_1), p.i399-i406
Main authors: Venkatesh, Gopalakrishnan, Grover, Aayush, Srinivasaraghavan, G, Rao, Shrisha
Format: Article
Language: eng
Online access: Full text
container_end_page i406
container_issue Supplement_1
container_start_page i399
container_title Bioinformatics (Oxford, England)
container_volume 36
creator Venkatesh, Gopalakrishnan
Grover, Aayush
Srinivasaraghavan, G
Rao, Shrisha
description Abstract. Motivation: Accurate prediction of binding between a major histocompatibility complex (MHC) allele and a peptide plays a major role in the synthesis of personalized cancer vaccines. The immune system struggles to distinguish between a cancerous and a healthy cell. In a cancer patient who has a particular MHC allele, only those peptides that bind to that allele with high affinity help the immune system recognize the cancerous cells. Results: MHCAttnNet is a deep neural model that uses an attention mechanism to capture the relevant subsequences of the amino acid sequences of peptides and MHC alleles, and then uses them to predict MHC-peptide binding. MHCAttnNet achieves an AUC-PRC score of 94.18% with 161 class I MHC alleles, which outperforms the state-of-the-art models for this task. MHCAttnNet also achieves a better F1-score than the state-of-the-art models while covering a larger number of class II MHC alleles. The attention mechanism used by MHCAttnNet provides a heatmap over the amino acids, indicating the important subsequences present in the amino acid sequence. This approach also allows us to focus on a much smaller number of relevant trigrams corresponding to the amino acid sequence of an MHC allele, from 9251 possible trigrams to about 258. This significantly reduces the number of amino acid subsequences that need to be clinically tested. Availability and implementation: The data and source code are available at https://github.com/gopuvenkat/MHCAttnNet.
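The idea in the abstract of weighting amino acid trigrams by attention and keeping only the most relevant ones can be illustrated with a small sketch. This is not the MHCAttnNet implementation: the overlapping-window split, the function names (`trigrams`, `softmax`, `top_trigrams`), the sample peptide and the scores are all assumptions made for illustration, and a real model would learn the scores rather than take them as input.

```python
import math


def trigrams(sequence):
    """Return the overlapping 3-residue windows of an amino acid sequence.

    Overlapping windows are an assumption; the abstract does not say how
    trigrams are formed.
    """
    return [sequence[i:i + 3] for i in range(len(sequence) - 2)]


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_trigrams(sequence, scores, k):
    """Rank trigrams by attention weight and keep the k highest."""
    grams = trigrams(sequence)
    if len(grams) != len(scores):
        raise ValueError("need exactly one score per trigram")
    weights = softmax(scores)
    ranked = sorted(zip(grams, weights), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    peptide = "SIINFEKL"  # sample input only, not taken from the record
    toy_scores = [0.1, 2.0, 0.3, 1.5, 0.2, 0.4]  # made-up, not model output
    for gram, weight in top_trigrams(peptide, toy_scores, k=2):
        print(gram, round(weight, 3))
```

The weights returned by `softmax` play the role of the heatmap mentioned in the abstract: plotting them against trigram position shows which subsequences the (hypothetical) scores emphasize, and `top_trigrams` mirrors the reduction to a short list of candidates.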
doi_str_mv 10.1093/bioinformatics/btaa479
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_7355292</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bioinformatics/btaa479</oup_id><sourcerecordid>2423533533</sourcerecordid><originalsourceid>FETCH-LOGICAL-c522t-69ff0869e053cb5584a2ee64d2869a263b9e52332bb2340fbe3bf29b395049e13</originalsourceid><addsrcrecordid>eNqNUU1rGzEUFKUhdt38haBjL5topZVs9VAIJh8GJ7mkZyGt3qYqWmm70ibk31fGrmluAcET82bmDQxC5zW5qIlkl8ZFF7o49jq7Nl2arHWzlJ_QvGZiWTWruv58_BM2Q19S-k0I4YSLUzRjVPAlW4k5er2_W1_lHB4gf8fDCNa12YVnXOBqgCE7C9i4YAuWcLm3W2DtPXhIuPU6pTI3WAeLNxs8pZ1WB6xzhpBdDJXRCSy2AAMOMI3a4z5a8F_RSad9grPDXKCfN9dP67tq-3i7WV9tq5ZTmishu46shATCWWs4XzWaAojG0gJqKpiRwClj1BjKGtIZYKaj0jDJSSOhZgv0Y-87TKYH25ZUJYMaRtfr8U1F7dT7TXC_1HN8UUvGOZW0GHw7GIzxzwQpq96lFrzXAeKUFG0o42z3ClXsqe0YUxqhO56pidq1pt63pg6tFeH5_yGPsn81FUK9J8Rp-KjpX9-_rA8</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2423533533</pqid></control><display><type>article</type><title>MHCAttnNet: predicting MHC-peptide bindings for MHC alleles classes I and II using an attention-based deep neural model</title><source>MEDLINE</source><source>Oxford Journals Open Access Collection</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><creator>Venkatesh, Gopalakrishnan ; Grover, Aayush ; Srinivasaraghavan, G ; Rao, Shrisha</creator><creatorcontrib>Venkatesh, Gopalakrishnan ; Grover, Aayush ; Srinivasaraghavan, G ; Rao, Shrisha</creatorcontrib><description>Abstract Motivation Accurate prediction of binding between a major histocompatibility complex (MHC) allele and a peptide plays a major role in the synthesis of personalized cancer vaccines. The immune system struggles to distinguish between a cancerous and a healthy cell. 
In a patient suffering from cancer who has a particular MHC allele, only those peptides that bind with the MHC allele with high affinity, help the immune system recognize the cancerous cells. Results MHCAttnNet is a deep neural model that uses an attention mechanism to capture the relevant subsequences of the amino acid sequences of peptides and MHC alleles. It then uses this to accurately predict the MHC-peptide binding. MHCAttnNet achieves an AUC-PRC score of 94.18% with 161 class I MHC alleles, which outperforms the state-of-the-art models for this task. MHCAttnNet also achieves a better F1-score in comparison to the state-of-the-art models while covering a larger number of class II MHC alleles. The attention mechanism used by MHCAttnNet provides a heatmap over the amino acids thus indicating the important subsequences present in the amino acid sequence. This approach also allows us to focus on a much smaller number of relevant trigrams corresponding to the amino acid sequence of an MHC allele, from 9251 possible trigrams to about 258. This significantly reduces the number of amino acid subsequences that need to be clinically tested. Availability and implementation The data and source code are available at https://github.com/gopuvenkat/MHCAttnNet.</description><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/btaa479</identifier><identifier>PMID: 32657386</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Alleles ; Histocompatibility Antigens Class I - metabolism ; HLA Antigens ; Humans ; Peptides - metabolism ; Protein Binding ; Studies of Phenotypes and Clinical Applications</subject><ispartof>Bioinformatics (Oxford, England), 2020-07, Vol.36 (Supplement_1), p.i399-i406</ispartof><rights>The Author(s) 2020. Published by Oxford University Press. 2020</rights><rights>The Author(s) 2020. 
Published by Oxford University Press.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c522t-69ff0869e053cb5584a2ee64d2869a263b9e52332bb2340fbe3bf29b395049e13</citedby><cites>FETCH-LOGICAL-c522t-69ff0869e053cb5584a2ee64d2869a263b9e52332bb2340fbe3bf29b395049e13</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355292/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355292/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/32657386$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Venkatesh, Gopalakrishnan</creatorcontrib><creatorcontrib>Grover, Aayush</creatorcontrib><creatorcontrib>Srinivasaraghavan, G</creatorcontrib><creatorcontrib>Rao, Shrisha</creatorcontrib><title>MHCAttnNet: predicting MHC-peptide bindings for MHC alleles classes I and II using an attention-based deep neural model</title><title>Bioinformatics (Oxford, England)</title><addtitle>Bioinformatics</addtitle><description>Abstract Motivation Accurate prediction of binding between a major histocompatibility complex (MHC) allele and a peptide plays a major role in the synthesis of personalized cancer vaccines. The immune system struggles to distinguish between a cancerous and a healthy cell. In a patient suffering from cancer who has a particular MHC allele, only those peptides that bind with the MHC allele with high affinity, help the immune system recognize the cancerous cells. 
Results MHCAttnNet is a deep neural model that uses an attention mechanism to capture the relevant subsequences of the amino acid sequences of peptides and MHC alleles. It then uses this to accurately predict the MHC-peptide binding. MHCAttnNet achieves an AUC-PRC score of 94.18% with 161 class I MHC alleles, which outperforms the state-of-the-art models for this task. MHCAttnNet also achieves a better F1-score in comparison to the state-of-the-art models while covering a larger number of class II MHC alleles. The attention mechanism used by MHCAttnNet provides a heatmap over the amino acids thus indicating the important subsequences present in the amino acid sequence. This approach also allows us to focus on a much smaller number of relevant trigrams corresponding to the amino acid sequence of an MHC allele, from 9251 possible trigrams to about 258. This significantly reduces the number of amino acid subsequences that need to be clinically tested. Availability and implementation The data and source code are available at https://github.com/gopuvenkat/MHCAttnNet.</description><subject>Alleles</subject><subject>Histocompatibility Antigens Class I - metabolism</subject><subject>HLA Antigens</subject><subject>Humans</subject><subject>Peptides - metabolism</subject><subject>Protein Binding</subject><subject>Studies of Phenotypes and Clinical 
Applications</subject><issn>1367-4803</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><sourceid>EIF</sourceid><recordid>eNqNUU1rGzEUFKUhdt38haBjL5topZVs9VAIJh8GJ7mkZyGt3qYqWmm70ibk31fGrmluAcET82bmDQxC5zW5qIlkl8ZFF7o49jq7Nl2arHWzlJ_QvGZiWTWruv58_BM2Q19S-k0I4YSLUzRjVPAlW4k5er2_W1_lHB4gf8fDCNa12YVnXOBqgCE7C9i4YAuWcLm3W2DtPXhIuPU6pTI3WAeLNxs8pZ1WB6xzhpBdDJXRCSy2AAMOMI3a4z5a8F_RSad9grPDXKCfN9dP67tq-3i7WV9tq5ZTmishu46shATCWWs4XzWaAojG0gJqKpiRwClj1BjKGtIZYKaj0jDJSSOhZgv0Y-87TKYH25ZUJYMaRtfr8U1F7dT7TXC_1HN8UUvGOZW0GHw7GIzxzwQpq96lFrzXAeKUFG0o42z3ClXsqe0YUxqhO56pidq1pt63pg6tFeH5_yGPsn81FUK9J8Rp-KjpX9-_rA8</recordid><startdate>20200701</startdate><enddate>20200701</enddate><creator>Venkatesh, Gopalakrishnan</creator><creator>Grover, Aayush</creator><creator>Srinivasaraghavan, G</creator><creator>Rao, Shrisha</creator><general>Oxford University Press</general><scope>TOX</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20200701</creationdate><title>MHCAttnNet: predicting MHC-peptide bindings for MHC alleles classes I and II using an attention-based deep neural model</title><author>Venkatesh, Gopalakrishnan ; Grover, Aayush ; Srinivasaraghavan, G ; Rao, Shrisha</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c522t-69ff0869e053cb5584a2ee64d2869a263b9e52332bb2340fbe3bf29b395049e13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Alleles</topic><topic>Histocompatibility Antigens Class I - metabolism</topic><topic>HLA Antigens</topic><topic>Humans</topic><topic>Peptides - metabolism</topic><topic>Protein Binding</topic><topic>Studies of Phenotypes and Clinical 
Applications</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Venkatesh, Gopalakrishnan</creatorcontrib><creatorcontrib>Grover, Aayush</creatorcontrib><creatorcontrib>Srinivasaraghavan, G</creatorcontrib><creatorcontrib>Rao, Shrisha</creatorcontrib><collection>Oxford Journals Open Access Collection</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Bioinformatics (Oxford, England)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Venkatesh, Gopalakrishnan</au><au>Grover, Aayush</au><au>Srinivasaraghavan, G</au><au>Rao, Shrisha</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>MHCAttnNet: predicting MHC-peptide bindings for MHC alleles classes I and II using an attention-based deep neural model</atitle><jtitle>Bioinformatics (Oxford, England)</jtitle><addtitle>Bioinformatics</addtitle><date>2020-07-01</date><risdate>2020</risdate><volume>36</volume><issue>Supplement_1</issue><spage>i399</spage><epage>i406</epage><pages>i399-i406</pages><issn>1367-4803</issn><eissn>1367-4811</eissn><abstract>Abstract Motivation Accurate prediction of binding between a major histocompatibility complex (MHC) allele and a peptide plays a major role in the synthesis of personalized cancer vaccines. The immune system struggles to distinguish between a cancerous and a healthy cell. In a patient suffering from cancer who has a particular MHC allele, only those peptides that bind with the MHC allele with high affinity, help the immune system recognize the cancerous cells. 
Results MHCAttnNet is a deep neural model that uses an attention mechanism to capture the relevant subsequences of the amino acid sequences of peptides and MHC alleles. It then uses this to accurately predict the MHC-peptide binding. MHCAttnNet achieves an AUC-PRC score of 94.18% with 161 class I MHC alleles, which outperforms the state-of-the-art models for this task. MHCAttnNet also achieves a better F1-score in comparison to the state-of-the-art models while covering a larger number of class II MHC alleles. The attention mechanism used by MHCAttnNet provides a heatmap over the amino acids thus indicating the important subsequences present in the amino acid sequence. This approach also allows us to focus on a much smaller number of relevant trigrams corresponding to the amino acid sequence of an MHC allele, from 9251 possible trigrams to about 258. This significantly reduces the number of amino acid subsequences that need to be clinically tested. Availability and implementation The data and source code are available at https://github.com/gopuvenkat/MHCAttnNet.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>32657386</pmid><doi>10.1093/bioinformatics/btaa479</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1367-4803
ispartof Bioinformatics (Oxford, England), 2020-07, Vol.36 (Supplement_1), p.i399-i406
issn 1367-4803
1367-4811
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_7355292
source MEDLINE; Oxford Journals Open Access Collection; EZB-FREE-00999 freely available EZB journals; PubMed Central; Alma/SFX Local Collection
subjects Alleles
Histocompatibility Antigens Class I - metabolism
HLA Antigens
Humans
Peptides - metabolism
Protein Binding
Studies of Phenotypes and Clinical Applications
title MHCAttnNet: predicting MHC-peptide bindings for MHC alleles classes I and II using an attention-based deep neural model
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T01%3A41%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=MHCAttnNet:%20predicting%20MHC-peptide%20bindings%20for%20MHC%20alleles%20classes%20I%20and%20II%20using%20an%20attention-based%20deep%20neural%20model&rft.jtitle=Bioinformatics%20(Oxford,%20England)&rft.au=Venkatesh,%20Gopalakrishnan&rft.date=2020-07-01&rft.volume=36&rft.issue=Supplement_1&rft.spage=i399&rft.epage=i406&rft.pages=i399-i406&rft.issn=1367-4803&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/btaa479&rft_dat=%3Cproquest_pubme%3E2423533533%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2423533533&rft_id=info:pmid/32657386&rft_oup_id=10.1093/bioinformatics/btaa479&rfr_iscdi=true