Prioritizing Virtual Screening with Interpretable Interaction Fingerprints

Machine learning-based drug discovery success depends on molecular representation. Yet traditional molecular fingerprints omit both the protein and pointers back to structural information that would enable better model interpretability. Therefore, we propose LUNA, a Python 3 toolkit that calculates...

Full description

Bibliographic details
Published in: Journal of chemical information and modeling 2022-09, Vol.62 (18), p.4300-4318
Main authors: Fassio, Alexandre V., Shub, Laura, Ponzoni, Luca, McKinley, Jessica, O’Meara, Matthew J., Ferreira, Rafaela S., Keiser, Michael J., de Melo Minardi, Raquel C.
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 4318
container_issue 18
container_start_page 4300
container_title Journal of chemical information and modeling
container_volume 62
creator Fassio, Alexandre V.
Shub, Laura
Ponzoni, Luca
McKinley, Jessica
O’Meara, Matthew J.
Ferreira, Rafaela S.
Keiser, Michael J.
de Melo Minardi, Raquel C.
description Machine learning-based drug discovery success depends on molecular representation. Yet traditional molecular fingerprints omit both the protein and pointers back to structural information that would enable better model interpretability. Therefore, we propose LUNA, a Python 3 toolkit that calculates and encodes protein–ligand interactions into new hashed fingerprints inspired by Extended Connectivity FingerPrint (ECFP): EIFP (Extended Interaction FingerPrint), FIFP (Functional Interaction FingerPrint), and Hybrid Interaction FingerPrint (HIFP). LUNA also provides visual strategies to make the fingerprints interpretable. We performed three major experiments exploring the fingerprints’ use. First, we trained machine learning models to reproduce DOCK3.7 scores using 1 million docked Dopamine D4 complexes. We found that EIFP-4,096 (R² = 0.61) outperformed related molecular and interaction fingerprints. Second, we used LUNA to support interpretable machine learning models. Finally, we demonstrate that interaction fingerprints can accurately identify similarities across molecular complexes that other fingerprints overlook. Hence, we envision LUNA and its interface fingerprints as promising methods for machine learning-based virtual screening campaigns. LUNA is freely available at https://github.com/keiserlab/LUNA.
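The description above mentions hashed fingerprints that keep pointers back to structural information. The following is only a generic sketch of that idea, not LUNA's API or the authors' algorithm: it assumes interaction features are plain strings, folds them into 4,096 bit positions with SHA-256, and keeps a bit-to-feature map so that a set bit can be traced back to what produced it. The feature strings and the function names `fold_features` and `tanimoto` are hypothetical.

```python
import hashlib


def fold_features(features, n_bits=4096):
    """Hash feature strings into bit positions in [0, n_bits).

    Returns a dict mapping each on-bit to the list of features that set it,
    which is the "pointer back" that makes a bit interpretable.
    """
    bit_to_features = {}
    for feature in features:
        digest = hashlib.sha256(feature.encode("utf-8")).digest()
        bit = int.from_bytes(digest[:8], "big") % n_bits
        bit_to_features.setdefault(bit, []).append(feature)
    return bit_to_features


def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    bits_a, bits_b = set(fp_a), set(fp_b)
    if not bits_a and not bits_b:
        return 0.0
    return len(bits_a & bits_b) / len(bits_a | bits_b)


# Hypothetical interaction features for two protein-ligand complexes.
complex_a = [
    "hbond:ASP115:ligand_N1",
    "aromatic_stacking:PHE410:ligand_ring1",
    "hydrophobic:VAL116:ligand_C7",
]
complex_b = [
    "hbond:ASP115:ligand_N1",
    "hydrophobic:VAL116:ligand_C7",
    "cation_pi:PHE411:ligand_N4",
]

fp_a = fold_features(complex_a)
fp_b = fold_features(complex_b)
similarity = tanimoto(fp_a, fp_b)

# Trace every shared bit back to the features that produced it.
shared = {bit: fp_a[bit] for bit in set(fp_a) & set(fp_b)}
```

Because distinct features can collide on the same bit after folding, the map stores a list per bit rather than a single feature; a longer fingerprint reduces such collisions at the cost of a larger input vector.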
doi_str_mv 10.1021/acs.jcim.2c00695
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2714390374</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2718383777</sourcerecordid><originalsourceid>FETCH-LOGICAL-a383t-9738fb67511791215d8e97795b8ee5084f14479f1bb0f0410f99fd497f0551143</originalsourceid><addsrcrecordid>eNp1kM9LwzAUx4MoOKd3jwUvHux8aZqmOcpwOhko-ANvIc0SzejamaSI_vWmdrsInt6vz_fx3hehUwwTDBm-lMpPVsquJ5kCKDjdQyNMc57yAl73dznlxSE68n4FQAgvshG6e3C2dTbYb9u8JS_WhU7WyaNyWjd959OG92TeBO02TgdZ1XqopAq2bZJZZPqRbYI_RgdG1l6fbOMYPc-un6a36eL-Zj69WqSSlCSknJHSVAWjGDOOM0yXpeaMcVqVWlMoc4PznHGDqwoM5BgM52aZc2aARk1Oxuh82Ltx7UenfRBr65Wua9notvMiYxHiQFiPnv1BV23nmnhdT5XxHsZYpGCglGu9d9qI-NBaui-BQfTmimiu6M0VW3Oj5GKQ_E52O__FfwBannzQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2718383777</pqid></control><display><type>article</type><title>Prioritizing Virtual Screening with Interpretable Interaction Fingerprints</title><source>ACS Publications</source><creator>Fassio, Alexandre V. ; Shub, Laura ; Ponzoni, Luca ; McKinley, Jessica ; O’Meara, Matthew J. ; Ferreira, Rafaela S. ; Keiser, Michael J. ; de Melo Minardi, Raquel C.</creator><creatorcontrib>Fassio, Alexandre V. ; Shub, Laura ; Ponzoni, Luca ; McKinley, Jessica ; O’Meara, Matthew J. ; Ferreira, Rafaela S. ; Keiser, Michael J. ; de Melo Minardi, Raquel C.</creatorcontrib><description>Machine learning-based drug discovery success depends on molecular representation. Yet traditional molecular fingerprints omit both the protein and pointers back to structural information that would enable better model interpretability. 
Therefore, we propose LUNA, a Python 3 toolkit that calculates and encodes protein–ligand interactions into new hashed fingerprints inspired by Extended Connectivity FingerPrint (ECFP): EIFP (Extended Interaction FingerPrint), FIFP (Functional Interaction FingerPrint), and Hybrid Interaction FingerPrint (HIFP). LUNA also provides visual strategies to make the fingerprints interpretable. We performed three major experiments exploring the fingerprints’ use. First, we trained machine learning models to reproduce DOCK3.7 scores using 1 million docked Dopamine D4 complexes. We found that EIFP-4,096 performed (R 2 = 0.61) superior to related molecular and interaction fingerprints. Second, we used LUNA to support interpretable machine learning models. Finally, we demonstrate that interaction fingerprints can accurately identify similarities across molecular complexes that other fingerprints overlook. Hence, we envision LUNA and its interface fingerprints as promising methods for machine learning-based virtual screening campaigns. 
LUNA is freely available at https://github.com/keiserlab/LUNA.</description><identifier>ISSN: 1549-9596</identifier><identifier>EISSN: 1549-960X</identifier><identifier>DOI: 10.1021/acs.jcim.2c00695</identifier><language>eng</language><publisher>Washington: American Chemical Society</publisher><subject>Chemical fingerprinting ; Dopamine ; Machine learning ; Machine Learning and Deep Learning ; Proteins ; Screening</subject><ispartof>Journal of chemical information and modeling, 2022-09, Vol.62 (18), p.4300-4318</ispartof><rights>2022 American Chemical Society</rights><rights>Copyright American Chemical Society Sep 26, 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a383t-9738fb67511791215d8e97795b8ee5084f14479f1bb0f0410f99fd497f0551143</citedby><cites>FETCH-LOGICAL-a383t-9738fb67511791215d8e97795b8ee5084f14479f1bb0f0410f99fd497f0551143</cites><orcidid>0000-0003-0211-0396 ; 0000-0001-5190-100X ; 0000-0002-2182-4709 ; 0000-0002-3128-5331 ; 0000-0002-1240-2192 ; 0000-0001-8125-582X ; 0000-0002-8786-915X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://pubs.acs.org/doi/pdf/10.1021/acs.jcim.2c00695$$EPDF$$P50$$Gacs$$H</linktopdf><linktohtml>$$Uhttps://pubs.acs.org/doi/10.1021/acs.jcim.2c00695$$EHTML$$P50$$Gacs$$H</linktohtml><link.rule.ids>314,776,780,2752,27053,27901,27902,56713,56763</link.rule.ids></links><search><creatorcontrib>Fassio, Alexandre V.</creatorcontrib><creatorcontrib>Shub, Laura</creatorcontrib><creatorcontrib>Ponzoni, Luca</creatorcontrib><creatorcontrib>McKinley, Jessica</creatorcontrib><creatorcontrib>O’Meara, Matthew J.</creatorcontrib><creatorcontrib>Ferreira, Rafaela S.</creatorcontrib><creatorcontrib>Keiser, Michael J.</creatorcontrib><creatorcontrib>de Melo Minardi, Raquel 
C.</creatorcontrib><title>Prioritizing Virtual Screening with Interpretable Interaction Fingerprints</title><title>Journal of chemical information and modeling</title><addtitle>J. Chem. Inf. Model</addtitle><description>Machine learning-based drug discovery success depends on molecular representation. Yet traditional molecular fingerprints omit both the protein and pointers back to structural information that would enable better model interpretability. Therefore, we propose LUNA, a Python 3 toolkit that calculates and encodes protein–ligand interactions into new hashed fingerprints inspired by Extended Connectivity FingerPrint (ECFP): EIFP (Extended Interaction FingerPrint), FIFP (Functional Interaction FingerPrint), and Hybrid Interaction FingerPrint (HIFP). LUNA also provides visual strategies to make the fingerprints interpretable. We performed three major experiments exploring the fingerprints’ use. First, we trained machine learning models to reproduce DOCK3.7 scores using 1 million docked Dopamine D4 complexes. We found that EIFP-4,096 performed (R 2 = 0.61) superior to related molecular and interaction fingerprints. Second, we used LUNA to support interpretable machine learning models. Finally, we demonstrate that interaction fingerprints can accurately identify similarities across molecular complexes that other fingerprints overlook. Hence, we envision LUNA and its interface fingerprints as promising methods for machine learning-based virtual screening campaigns. 
LUNA is freely available at https://github.com/keiserlab/LUNA.</description><subject>Chemical fingerprinting</subject><subject>Dopamine</subject><subject>Machine learning</subject><subject>Machine Learning and Deep Learning</subject><subject>Proteins</subject><subject>Screening</subject><issn>1549-9596</issn><issn>1549-960X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNp1kM9LwzAUx4MoOKd3jwUvHux8aZqmOcpwOhko-ANvIc0SzejamaSI_vWmdrsInt6vz_fx3hehUwwTDBm-lMpPVsquJ5kCKDjdQyNMc57yAl73dznlxSE68n4FQAgvshG6e3C2dTbYb9u8JS_WhU7WyaNyWjd959OG92TeBO02TgdZ1XqopAq2bZJZZPqRbYI_RgdG1l6fbOMYPc-un6a36eL-Zj69WqSSlCSknJHSVAWjGDOOM0yXpeaMcVqVWlMoc4PznHGDqwoM5BgM52aZc2aARk1Oxuh82Ltx7UenfRBr65Wua9notvMiYxHiQFiPnv1BV23nmnhdT5XxHsZYpGCglGu9d9qI-NBaui-BQfTmimiu6M0VW3Oj5GKQ_E52O__FfwBannzQ</recordid><startdate>20220926</startdate><enddate>20220926</enddate><creator>Fassio, Alexandre V.</creator><creator>Shub, Laura</creator><creator>Ponzoni, Luca</creator><creator>McKinley, Jessica</creator><creator>O’Meara, Matthew J.</creator><creator>Ferreira, Rafaela S.</creator><creator>Keiser, Michael J.</creator><creator>de Melo Minardi, Raquel C.</creator><general>American Chemical Society</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SR</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0003-0211-0396</orcidid><orcidid>https://orcid.org/0000-0001-5190-100X</orcidid><orcidid>https://orcid.org/0000-0002-2182-4709</orcidid><orcidid>https://orcid.org/0000-0002-3128-5331</orcidid><orcidid>https://orcid.org/0000-0002-1240-2192</orcidid><orcidid>https://orcid.org/0000-0001-8125-582X</orcidid><orcidid>https://orcid.org/0000-0002-8786-915X</orcidid></search><sort><creationdate>20220926</creationdate><title>Prioritizing Virtual Screening with 
Interpretable Interaction Fingerprints</title><author>Fassio, Alexandre V. ; Shub, Laura ; Ponzoni, Luca ; McKinley, Jessica ; O’Meara, Matthew J. ; Ferreira, Rafaela S. ; Keiser, Michael J. ; de Melo Minardi, Raquel C.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a383t-9738fb67511791215d8e97795b8ee5084f14479f1bb0f0410f99fd497f0551143</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Chemical fingerprinting</topic><topic>Dopamine</topic><topic>Machine learning</topic><topic>Machine Learning and Deep Learning</topic><topic>Proteins</topic><topic>Screening</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fassio, Alexandre V.</creatorcontrib><creatorcontrib>Shub, Laura</creatorcontrib><creatorcontrib>Ponzoni, Luca</creatorcontrib><creatorcontrib>McKinley, Jessica</creatorcontrib><creatorcontrib>O’Meara, Matthew J.</creatorcontrib><creatorcontrib>Ferreira, Rafaela S.</creatorcontrib><creatorcontrib>Keiser, Michael J.</creatorcontrib><creatorcontrib>de Melo Minardi, Raquel C.</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>Journal of chemical information and modeling</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Fassio, Alexandre V.</au><au>Shub, Laura</au><au>Ponzoni, Luca</au><au>McKinley, Jessica</au><au>O’Meara, Matthew J.</au><au>Ferreira, Rafaela S.</au><au>Keiser, Michael J.</au><au>de Melo Minardi, Raquel C.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Prioritizing Virtual Screening with Interpretable Interaction Fingerprints</atitle><jtitle>Journal of chemical information and modeling</jtitle><addtitle>J. Chem. Inf. Model</addtitle><date>2022-09-26</date><risdate>2022</risdate><volume>62</volume><issue>18</issue><spage>4300</spage><epage>4318</epage><pages>4300-4318</pages><issn>1549-9596</issn><eissn>1549-960X</eissn><abstract>Machine learning-based drug discovery success depends on molecular representation. Yet traditional molecular fingerprints omit both the protein and pointers back to structural information that would enable better model interpretability. Therefore, we propose LUNA, a Python 3 toolkit that calculates and encodes protein–ligand interactions into new hashed fingerprints inspired by Extended Connectivity FingerPrint (ECFP): EIFP (Extended Interaction FingerPrint), FIFP (Functional Interaction FingerPrint), and Hybrid Interaction FingerPrint (HIFP). LUNA also provides visual strategies to make the fingerprints interpretable. We performed three major experiments exploring the fingerprints’ use. First, we trained machine learning models to reproduce DOCK3.7 scores using 1 million docked Dopamine D4 complexes. We found that EIFP-4,096 performed (R 2 = 0.61) superior to related molecular and interaction fingerprints. Second, we used LUNA to support interpretable machine learning models. Finally, we demonstrate that interaction fingerprints can accurately identify similarities across molecular complexes that other fingerprints overlook. 
Hence, we envision LUNA and its interface fingerprints as promising methods for machine learning-based virtual screening campaigns. LUNA is freely available at https://github.com/keiserlab/LUNA.</abstract><cop>Washington</cop><pub>American Chemical Society</pub><doi>10.1021/acs.jcim.2c00695</doi><tpages>19</tpages><orcidid>https://orcid.org/0000-0003-0211-0396</orcidid><orcidid>https://orcid.org/0000-0001-5190-100X</orcidid><orcidid>https://orcid.org/0000-0002-2182-4709</orcidid><orcidid>https://orcid.org/0000-0002-3128-5331</orcidid><orcidid>https://orcid.org/0000-0002-1240-2192</orcidid><orcidid>https://orcid.org/0000-0001-8125-582X</orcidid><orcidid>https://orcid.org/0000-0002-8786-915X</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1549-9596
ispartof Journal of chemical information and modeling, 2022-09, Vol.62 (18), p.4300-4318
issn 1549-9596
1549-960X
language eng
recordid cdi_proquest_miscellaneous_2714390374
source ACS Publications
subjects Chemical fingerprinting
Dopamine
Machine learning
Machine Learning and Deep Learning
Proteins
Screening
title Prioritizing Virtual Screening with Interpretable Interaction Fingerprints
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T17%3A59%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Prioritizing%20Virtual%20Screening%20with%20Interpretable%20Interaction%20Fingerprints&rft.jtitle=Journal%20of%20chemical%20information%20and%20modeling&rft.au=Fassio,%20Alexandre%20V.&rft.date=2022-09-26&rft.volume=62&rft.issue=18&rft.spage=4300&rft.epage=4318&rft.pages=4300-4318&rft.issn=1549-9596&rft.eissn=1549-960X&rft_id=info:doi/10.1021/acs.jcim.2c00695&rft_dat=%3Cproquest_cross%3E2718383777%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2718383777&rft_id=info:pmid/&rfr_iscdi=true