Applying interpretable machine learning in computational biology—pitfalls, recommendations and opportunities for new developments
Recent advances in machine learning have enabled the development of next-generation predictive models for complex computational biology problems, thereby spurring the use of interpretable machine learning (IML) to unveil biological insights. However, guidelines for using IML in computational biology...
Published in: | Nature methods 2024-08, Vol.21 (8), p.1454-1461 |
---|---|
Main authors: | Chen, Valerie; Yang, Muyu; Cui, Wenbo; Kim, Joon Sik; Talwalkar, Ameet; Ma, Jian |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1461 |
---|---|
container_issue | 8 |
container_start_page | 1454 |
container_title | Nature methods |
container_volume | 21 |
creator | Chen, Valerie; Yang, Muyu; Cui, Wenbo; Kim, Joon Sik; Talwalkar, Ameet; Ma, Jian |
description | Recent advances in machine learning have enabled the development of next-generation predictive models for complex computational biology problems, thereby spurring the use of interpretable machine learning (IML) to unveil biological insights. However, guidelines for using IML in computational biology are generally underdeveloped. We provide an overview of IML methods and evaluation techniques and discuss common pitfalls encountered when applying IML methods to computational biology problems. We also highlight open questions, especially in the era of large language models, and call for collaboration between IML and computational biology researchers.
This Perspective discusses the methodologies, application and evaluation of interpretable machine learning (IML) approaches in computational biology, with particular focus on common pitfalls when using IML and how to avoid them. |
doi_str_mv | 10.1038/s41592-024-02359-7 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11348280</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3091283395</sourcerecordid><originalsourceid>FETCH-LOGICAL-c356t-dd089f21038b58cc0b22f63f05e7d4fae91030c9e0932607508b29378738043b3</originalsourceid><addsrcrecordid>eNp9kctu1TAQhiNERUvhBVggS2xYkOJLfGKvUFVxkyp1U9aW40xOXTm2sZ1WZ4fEK_CEPAlu05bLgoVlS_83_3jmb5oXBB8RzMTb3BEuaYtpVw_jsu0fNQeEd6LtCeaP799Ykv3mac6XGDPWUf6k2WeSUCo7ctB8P47R7azfIusLpJig6MEBmrW5sB6QA538KiMT5rgUXWzw2qHBBhe2u5_ffkRbJu1cfoMSVGYGP95CGWk_ohBjSGXxtljIaAoJebhGI1yBC7GyJT9r9mp9hud392Hz5cP785NP7enZx88nx6etYXxT2nHEQk70ZvSBC2PwQOm0YRPm0I_dpEFWCRsJWDK6wT3HYqCS9aJnAndsYIfNu9U3LsMMo6m9k3YqJjvrtFNBW_W34u2F2oYrRQjrBBW4Ory-c0jh6wK5qNlmA85pD2HJitVdU8GY5BV99Q96GZZUF7dSmAgmWaXoSpkUck4wPfyGYHUzqVpDVjVkdRuy6mvRyz_neCi5T7UCbAVylfwW0u_e_7H9BenYtsc</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3091018393</pqid></control><display><type>article</type><title>Applying interpretable machine learning in computational biology—pitfalls, recommendations and opportunities for new developments</title><source>MEDLINE</source><source>Springer Nature - Complete Springer Journals</source><source>Nature Journals Online</source><creator>Chen, Valerie ; Yang, Muyu ; Cui, Wenbo ; Kim, Joon Sik ; Talwalkar, Ameet ; Ma, Jian</creator><creatorcontrib>Chen, Valerie ; Yang, Muyu ; Cui, Wenbo ; Kim, Joon Sik ; Talwalkar, Ameet ; Ma, Jian</creatorcontrib><description>Recent advances in machine learning have enabled the development of next-generation predictive models for complex computational biology problems, thereby spurring the use of interpretable machine learning (IML) to unveil biological insights. However, guidelines for using IML in computational biology are generally underdeveloped. 
We provide an overview of IML methods and evaluation techniques and discuss common pitfalls encountered when applying IML methods to computational biology problems. We also highlight open questions, especially in the era of large language models, and call for collaboration between IML and computational biology researchers.
This Perspective discusses the methodologies, application and evaluation of interpretable machine learning (IML) approaches in computational biology, with particular focus on common pitfalls when using IML and how to avoid them.</description><identifier>ISSN: 1548-7091</identifier><identifier>ISSN: 1548-7105</identifier><identifier>EISSN: 1548-7105</identifier><identifier>DOI: 10.1038/s41592-024-02359-7</identifier><identifier>PMID: 39122941</identifier><language>eng</language><publisher>New York: Nature Publishing Group US</publisher><subject>631/114/1305 ; 631/114/2397 ; 631/1647/794 ; 631/208/212 ; Algorithms ; Bioinformatics ; Biological Microscopy ; Biological Techniques ; Biology ; Biomedical and Life Sciences ; Biomedical Engineering/Biotechnology ; Collaboration ; Computational Biology - methods ; Computer applications ; Computer science ; Design techniques ; Gene expression ; Humans ; Large language models ; Learning algorithms ; Life Sciences ; Machine Learning ; Neural networks ; Perspective ; Prediction models ; Proteins ; Proteomics</subject><ispartof>Nature methods, 2024-08, Vol.21 (8), p.1454-1461</ispartof><rights>Springer Nature America, Inc. 2024. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><rights>2024. 
Springer Nature America, Inc.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c356t-dd089f21038b58cc0b22f63f05e7d4fae91030c9e0932607508b29378738043b3</cites><orcidid>0000-0002-0142-0328 ; 0009-0006-9866-4439 ; 0000-0001-6650-1893 ; 0000-0002-4202-5834 ; 0009-0007-2783-0265 ; 0009-0006-0057-4735</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1038/s41592-024-02359-7$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1038/s41592-024-02359-7$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>230,314,776,780,881,27901,27902,41464,42533,51294</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/39122941$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Chen, Valerie</creatorcontrib><creatorcontrib>Yang, Muyu</creatorcontrib><creatorcontrib>Cui, Wenbo</creatorcontrib><creatorcontrib>Kim, Joon Sik</creatorcontrib><creatorcontrib>Talwalkar, Ameet</creatorcontrib><creatorcontrib>Ma, Jian</creatorcontrib><title>Applying interpretable machine learning in computational biology—pitfalls, recommendations and opportunities for new developments</title><title>Nature methods</title><addtitle>Nat Methods</addtitle><addtitle>Nat Methods</addtitle><description>Recent advances in machine learning have enabled the development of next-generation predictive models for complex computational biology problems, thereby spurring the use of interpretable machine learning (IML) to unveil biological insights. However, guidelines for using IML in computational biology are generally underdeveloped. 
We provide an overview of IML methods and evaluation techniques and discuss common pitfalls encountered when applying IML methods to computational biology problems. We also highlight open questions, especially in the era of large language models, and call for collaboration between IML and computational biology researchers.
This Perspective discusses the methodologies, application and evaluation of interpretable machine learning (IML) approaches in computational biology, with particular focus on common pitfalls when using IML and how to avoid them.</description><subject>631/114/1305</subject><subject>631/114/2397</subject><subject>631/1647/794</subject><subject>631/208/212</subject><subject>Algorithms</subject><subject>Bioinformatics</subject><subject>Biological Microscopy</subject><subject>Biological Techniques</subject><subject>Biology</subject><subject>Biomedical and Life Sciences</subject><subject>Biomedical Engineering/Biotechnology</subject><subject>Collaboration</subject><subject>Computational Biology - methods</subject><subject>Computer applications</subject><subject>Computer science</subject><subject>Design techniques</subject><subject>Gene expression</subject><subject>Humans</subject><subject>Large language models</subject><subject>Learning algorithms</subject><subject>Life Sciences</subject><subject>Machine Learning</subject><subject>Neural networks</subject><subject>Perspective</subject><subject>Prediction 
models</subject><subject>Proteins</subject><subject>Proteomics</subject><issn>1548-7091</issn><issn>1548-7105</issn><issn>1548-7105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kctu1TAQhiNERUvhBVggS2xYkOJLfGKvUFVxkyp1U9aW40xOXTm2sZ1WZ4fEK_CEPAlu05bLgoVlS_83_3jmb5oXBB8RzMTb3BEuaYtpVw_jsu0fNQeEd6LtCeaP799Ykv3mac6XGDPWUf6k2WeSUCo7ctB8P47R7azfIusLpJig6MEBmrW5sB6QA538KiMT5rgUXWzw2qHBBhe2u5_ffkRbJu1cfoMSVGYGP95CGWk_ohBjSGXxtljIaAoJebhGI1yBC7GyJT9r9mp9hud392Hz5cP785NP7enZx88nx6etYXxT2nHEQk70ZvSBC2PwQOm0YRPm0I_dpEFWCRsJWDK6wT3HYqCS9aJnAndsYIfNu9U3LsMMo6m9k3YqJjvrtFNBW_W34u2F2oYrRQjrBBW4Ory-c0jh6wK5qNlmA85pD2HJitVdU8GY5BV99Q96GZZUF7dSmAgmWaXoSpkUck4wPfyGYHUzqVpDVjVkdRuy6mvRyz_neCi5T7UCbAVylfwW0u_e_7H9BenYtsc</recordid><startdate>20240801</startdate><enddate>20240801</enddate><creator>Chen, Valerie</creator><creator>Yang, Muyu</creator><creator>Cui, Wenbo</creator><creator>Kim, Joon Sik</creator><creator>Talwalkar, Ameet</creator><creator>Ma, Jian</creator><general>Nature Publishing Group US</general><general>Nature Publishing Group</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QL</scope><scope>7QO</scope><scope>7SS</scope><scope>7TK</scope><scope>7U9</scope><scope>8FD</scope><scope>C1K</scope><scope>FR3</scope><scope>H94</scope><scope>K9.</scope><scope>M7N</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-0142-0328</orcidid><orcidid>https://orcid.org/0009-0006-9866-4439</orcidid><orcidid>https://orcid.org/0000-0001-6650-1893</orcidid><orcidid>https://orcid.org/0000-0002-4202-5834</orcidid><orcidid>https://orcid.org/0009-0007-2783-0265</orcidid><orcidid>https://orcid.org/0009-0006-0057-4735</orcidid></search><sort><creationdate>20240801</creationdate><title>Applying 
interpretable machine learning in computational biology—pitfalls, recommendations and opportunities for new developments</title><author>Chen, Valerie ; Yang, Muyu ; Cui, Wenbo ; Kim, Joon Sik ; Talwalkar, Ameet ; Ma, Jian</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c356t-dd089f21038b58cc0b22f63f05e7d4fae91030c9e0932607508b29378738043b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>631/114/1305</topic><topic>631/114/2397</topic><topic>631/1647/794</topic><topic>631/208/212</topic><topic>Algorithms</topic><topic>Bioinformatics</topic><topic>Biological Microscopy</topic><topic>Biological Techniques</topic><topic>Biology</topic><topic>Biomedical and Life Sciences</topic><topic>Biomedical Engineering/Biotechnology</topic><topic>Collaboration</topic><topic>Computational Biology - methods</topic><topic>Computer applications</topic><topic>Computer science</topic><topic>Design techniques</topic><topic>Gene expression</topic><topic>Humans</topic><topic>Large language models</topic><topic>Learning algorithms</topic><topic>Life Sciences</topic><topic>Machine Learning</topic><topic>Neural networks</topic><topic>Perspective</topic><topic>Prediction models</topic><topic>Proteins</topic><topic>Proteomics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Chen, Valerie</creatorcontrib><creatorcontrib>Yang, Muyu</creatorcontrib><creatorcontrib>Cui, Wenbo</creatorcontrib><creatorcontrib>Kim, Joon Sik</creatorcontrib><creatorcontrib>Talwalkar, Ameet</creatorcontrib><creatorcontrib>Ma, Jian</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Biotechnology 
Research Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Neurosciences Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>Engineering Research Database</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Nature methods</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Chen, Valerie</au><au>Yang, Muyu</au><au>Cui, Wenbo</au><au>Kim, Joon Sik</au><au>Talwalkar, Ameet</au><au>Ma, Jian</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Applying interpretable machine learning in computational biology—pitfalls, recommendations and opportunities for new developments</atitle><jtitle>Nature methods</jtitle><stitle>Nat Methods</stitle><addtitle>Nat Methods</addtitle><date>2024-08-01</date><risdate>2024</risdate><volume>21</volume><issue>8</issue><spage>1454</spage><epage>1461</epage><pages>1454-1461</pages><issn>1548-7091</issn><issn>1548-7105</issn><eissn>1548-7105</eissn><abstract>Recent advances in machine learning have enabled the development of next-generation predictive models for complex computational biology problems, thereby spurring the use of interpretable machine learning (IML) to unveil biological insights. However, guidelines for using IML in computational biology are generally underdeveloped. 
We provide an overview of IML methods and evaluation techniques and discuss common pitfalls encountered when applying IML methods to computational biology problems. We also highlight open questions, especially in the era of large language models, and call for collaboration between IML and computational biology researchers.
This Perspective discusses the methodologies, application and evaluation of interpretable machine learning (IML) approaches in computational biology, with particular focus on common pitfalls when using IML and how to avoid them.</abstract><cop>New York</cop><pub>Nature Publishing Group US</pub><pmid>39122941</pmid><doi>10.1038/s41592-024-02359-7</doi><tpages>8</tpages><orcidid>https://orcid.org/0000-0002-0142-0328</orcidid><orcidid>https://orcid.org/0009-0006-9866-4439</orcidid><orcidid>https://orcid.org/0000-0001-6650-1893</orcidid><orcidid>https://orcid.org/0000-0002-4202-5834</orcidid><orcidid>https://orcid.org/0009-0007-2783-0265</orcidid><orcidid>https://orcid.org/0009-0006-0057-4735</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1548-7091 |
ispartof | Nature methods, 2024-08, Vol.21 (8), p.1454-1461 |
issn | 1548-7091; 1548-7105 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11348280 |
source | MEDLINE; Springer Nature - Complete Springer Journals; Nature Journals Online |
subjects | 631/114/1305; 631/114/2397; 631/1647/794; 631/208/212; Algorithms; Bioinformatics; Biological Microscopy; Biological Techniques; Biology; Biomedical and Life Sciences; Biomedical Engineering/Biotechnology; Collaboration; Computational Biology - methods; Computer applications; Computer science; Design techniques; Gene expression; Humans; Large language models; Learning algorithms; Life Sciences; Machine Learning; Neural networks; Perspective; Prediction models; Proteins; Proteomics |
title | Applying interpretable machine learning in computational biology—pitfalls, recommendations and opportunities for new developments |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T00%3A20%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Applying%20interpretable%20machine%20learning%20in%20computational%20biology%E2%80%94pitfalls,%20recommendations%20and%20opportunities%20for%20new%20developments&rft.jtitle=Nature%20methods&rft.au=Chen,%20Valerie&rft.date=2024-08-01&rft.volume=21&rft.issue=8&rft.spage=1454&rft.epage=1461&rft.pages=1454-1461&rft.issn=1548-7091&rft.eissn=1548-7105&rft_id=info:doi/10.1038/s41592-024-02359-7&rft_dat=%3Cproquest_pubme%3E3091283395%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3091018393&rft_id=info:pmid/39122941&rfr_iscdi=true |