Kermut: Composite kernel regression for protein variant effects
Reliable prediction of protein variant effects is crucial for both protein optimization and for advancing biological understanding. For practical use in protein engineering, it is important that we can also provide reliable uncertainty estimates for our predictions, and while prediction accuracy has...
Main authors: | Groth, Peter Mørch; Kerrn, Mads Herbert; Olsen, Lars; Salomon, Jesper; Boomsma, Wouter |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Groth, Peter Mørch ; Kerrn, Mads Herbert ; Olsen, Lars ; Salomon, Jesper ; Boomsma, Wouter |
description | Reliable prediction of protein variant effects is crucial for both protein
optimization and for advancing biological understanding. For practical use in
protein engineering, it is important that we can also provide reliable
uncertainty estimates for our predictions, and while prediction accuracy has
seen much progress in recent years, uncertainty metrics are rarely reported. We
here provide a Gaussian process regression model, Kermut, with a novel
composite kernel for modeling mutation similarity, which obtains
state-of-the-art performance for supervised protein variant effect prediction
while also offering estimates of uncertainty through its posterior. An analysis
of the quality of the uncertainty estimates demonstrates that our model
provides meaningful levels of overall calibration, but that instance-specific
uncertainty calibration remains more challenging. |
doi_str_mv | 10.48550/arxiv.2407.00002 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2407_00002</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2407_00002</sourcerecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Kermut: Composite kernel regression for protein variant effects</title><source>arXiv.org</source><creator>Groth, Peter Mørch ; Kerrn, Mads Herbert ; Olsen, Lars ; Salomon, Jesper ; Boomsma, Wouter</creator><identifier>DOI: 10.48550/arxiv.2407.00002</identifier><language>eng</language><subject>Computer Science - Learning ; Quantitative Biology - Biomolecules</subject><creationdate>2024-04</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa></display><links><linktorsrc>$$Uhttps://arxiv.org/abs/2407.00002$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2407.00002$$DView paper in arXiv$$Hfree_for_read</backlink></links><addata><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Kermut: Composite kernel regression for protein variant effects</atitle><date>2024-04-09</date><risdate>2024</risdate><doi>10.48550/arxiv.2407.00002</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2407.00002 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2407_00002 |
source | arXiv.org |
subjects | Computer Science - Learning ; Quantitative Biology - Biomolecules |
title | Kermut: Composite kernel regression for protein variant effects |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T22%3A45%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Kermut:%20Composite%20kernel%20regression%20for%20protein%20variant%20effects&rft.au=Groth,%20Peter%20M%C3%B8rch&rft.date=2024-04-09&rft_id=info:doi/10.48550/arxiv.2407.00002&rft_dat=%3Carxiv_GOX%3E2407_00002%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |