Kermut: Composite kernel regression for protein variant effects

Reliable prediction of protein variant effects is crucial for both protein optimization and for advancing biological understanding. For practical use in protein engineering, it is important that we can also provide reliable uncertainty estimates for our predictions, and while prediction accuracy has...

Detailed description

Bibliographic details
Main authors: Groth, Peter Mørch; Kerrn, Mads Herbert; Olsen, Lars; Salomon, Jesper; Boomsma, Wouter
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Groth, Peter Mørch
Kerrn, Mads Herbert
Olsen, Lars
Salomon, Jesper
Boomsma, Wouter
description Reliable prediction of protein variant effects is crucial for both protein optimization and for advancing biological understanding. For practical use in protein engineering, it is important that we can also provide reliable uncertainty estimates for our predictions, and while prediction accuracy has seen much progress in recent years, uncertainty metrics are rarely reported. We here provide a Gaussian process regression model, Kermut, with a novel composite kernel for modeling mutation similarity, which obtains state-of-the-art performance for supervised protein variant effect prediction while also offering estimates of uncertainty through its posterior. An analysis of the quality of the uncertainty estimates demonstrates that our model provides meaningful levels of overall calibration, but that instance-specific uncertainty calibration remains more challenging.
doi_str_mv 10.48550/arxiv.2407.00002
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2407_00002</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2407_00002</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2407_000023</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjEw1zMAAiNOBnvv1KLc0hIrBef83IL84sySVIXs1KK81ByFotT0otTi4sz8PIW0_CKFgqL8ktTMPIWyxKLMxLwShdS0tNTkkmIeBta0xJziVF4ozc0g7-Ya4uyhC7YqvqAoMzexqDIeZGU82EpjwioA7eM2xA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Kermut: Composite kernel regression for protein variant effects</title><source>arXiv.org</source><creator>Groth, Peter Mørch ; Kerrn, Mads Herbert ; Olsen, Lars ; Salomon, Jesper ; Boomsma, Wouter</creator><creatorcontrib>Groth, Peter Mørch ; Kerrn, Mads Herbert ; Olsen, Lars ; Salomon, Jesper ; Boomsma, Wouter</creatorcontrib><description>Reliable prediction of protein variant effects is crucial for both protein optimization and for advancing biological understanding. For practical use in protein engineering, it is important that we can also provide reliable uncertainty estimates for our predictions, and while prediction accuracy has seen much progress in recent years, uncertainty metrics are rarely reported. We here provide a Gaussian process regression model, Kermut, with a novel composite kernel for modeling mutation similarity, which obtains state-of-the-art performance for supervised protein variant effect prediction while also offering estimates of uncertainty through its posterior. 
An analysis of the quality of the uncertainty estimates demonstrates that our model provides meaningful levels of overall calibration, but that instance-specific uncertainty calibration remains more challenging.</description><identifier>DOI: 10.48550/arxiv.2407.00002</identifier><language>eng</language><subject>Computer Science - Learning ; Quantitative Biology - Biomolecules</subject><creationdate>2024-04</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2407.00002$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2407.00002$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Groth, Peter Mørch</creatorcontrib><creatorcontrib>Kerrn, Mads Herbert</creatorcontrib><creatorcontrib>Olsen, Lars</creatorcontrib><creatorcontrib>Salomon, Jesper</creatorcontrib><creatorcontrib>Boomsma, Wouter</creatorcontrib><title>Kermut: Composite kernel regression for protein variant effects</title><description>Reliable prediction of protein variant effects is crucial for both protein optimization and for advancing biological understanding. For practical use in protein engineering, it is important that we can also provide reliable uncertainty estimates for our predictions, and while prediction accuracy has seen much progress in recent years, uncertainty metrics are rarely reported. 
We here provide a Gaussian process regression model, Kermut, with a novel composite kernel for modeling mutation similarity, which obtains state-of-the-art performance for supervised protein variant effect prediction while also offering estimates of uncertainty through its posterior. An analysis of the quality of the uncertainty estimates demonstrates that our model provides meaningful levels of overall calibration, but that instance-specific uncertainty calibration remains more challenging.</description><subject>Computer Science - Learning</subject><subject>Quantitative Biology - Biomolecules</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjEw1zMAAiNOBnvv1KLc0hIrBef83IL84sySVIXs1KK81ByFotT0otTi4sz8PIW0_CKFgqL8ktTMPIWyxKLMxLwShdS0tNTkkmIeBta0xJziVF4ozc0g7-Ya4uyhC7YqvqAoMzexqDIeZGU82EpjwioA7eM2xA</recordid><startdate>20240409</startdate><enddate>20240409</enddate><creator>Groth, Peter Mørch</creator><creator>Kerrn, Mads Herbert</creator><creator>Olsen, Lars</creator><creator>Salomon, Jesper</creator><creator>Boomsma, Wouter</creator><scope>AKY</scope><scope>ALC</scope><scope>GOX</scope></search><sort><creationdate>20240409</creationdate><title>Kermut: Composite kernel regression for protein variant effects</title><author>Groth, Peter Mørch ; Kerrn, Mads Herbert ; Olsen, Lars ; Salomon, Jesper ; Boomsma, Wouter</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2407_000023</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Learning</topic><topic>Quantitative Biology - Biomolecules</topic><toplevel>online_resources</toplevel><creatorcontrib>Groth, Peter Mørch</creatorcontrib><creatorcontrib>Kerrn, Mads Herbert</creatorcontrib><creatorcontrib>Olsen, Lars</creatorcontrib><creatorcontrib>Salomon, 
Jesper</creatorcontrib><creatorcontrib>Boomsma, Wouter</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv Quantitative Biology</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Groth, Peter Mørch</au><au>Kerrn, Mads Herbert</au><au>Olsen, Lars</au><au>Salomon, Jesper</au><au>Boomsma, Wouter</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Kermut: Composite kernel regression for protein variant effects</atitle><date>2024-04-09</date><risdate>2024</risdate><abstract>Reliable prediction of protein variant effects is crucial for both protein optimization and for advancing biological understanding. For practical use in protein engineering, it is important that we can also provide reliable uncertainty estimates for our predictions, and while prediction accuracy has seen much progress in recent years, uncertainty metrics are rarely reported. We here provide a Gaussian process regression model, Kermut, with a novel composite kernel for modeling mutation similarity, which obtains state-of-the-art performance for supervised protein variant effect prediction while also offering estimates of uncertainty through its posterior. An analysis of the quality of the uncertainty estimates demonstrates that our model provides meaningful levels of overall calibration, but that instance-specific uncertainty calibration remains more challenging.</abstract><doi>10.48550/arxiv.2407.00002</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2407.00002
ispartof
issn
language eng
recordid cdi_arxiv_primary_2407_00002
source arXiv.org
subjects Computer Science - Learning
Quantitative Biology - Biomolecules
title Kermut: Composite kernel regression for protein variant effects
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T22%3A45%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Kermut:%20Composite%20kernel%20regression%20for%20protein%20variant%20effects&rft.au=Groth,%20Peter%20M%C3%B8rch&rft.date=2024-04-09&rft_id=info:doi/10.48550/arxiv.2407.00002&rft_dat=%3Carxiv_GOX%3E2407_00002%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
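
The description field above refers to a Gaussian process regression model with a composite kernel whose posterior supplies uncertainty estimates. The sketch below illustrates only that general idea and is not the Kermut kernel: the weighted sum of an RBF and a linear kernel, the function names, the noise level, and the random toy data are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0):
    # Squared Euclidean distances between all rows of a and b.
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)


def linear_kernel(a, b):
    return a @ b.T


def composite_kernel(a, b, weight=0.5):
    # Hypothetical composite: a weighted sum of two valid kernels is a valid kernel.
    return weight * rbf_kernel(a, b) + (1.0 - weight) * linear_kernel(a, b)


def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    # Standard GP regression posterior, computed via a Cholesky factorization.
    k_tt = composite_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_st = composite_kernel(x_test, x_train)
    prior_var = np.diag(composite_kernel(x_test, x_test))
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    # Posterior variance of the latent function; clipped to guard against round-off.
    var = np.maximum(prior_var - np.sum(v**2, axis=0), 0.0)
    return mean, var, prior_var


# Toy data standing in for variant features and measured effects (assumed).
rng = np.random.default_rng(0)
x_train = rng.normal(size=(20, 3))
y_train = np.sin(x_train[:, 0]) + 0.1 * rng.normal(size=20)
x_test = rng.normal(size=(5, 3))

mean, var, prior_var = gp_posterior(x_train, y_train, x_test)
for m, s in zip(mean, np.sqrt(var)):
    print(f"prediction {m:+.3f} +/- {s:.3f}")
```

Each prediction is paired with a posterior standard deviation, which is the kind of per-variant uncertainty estimate the abstract says should be reported alongside accuracy.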