UV-adVISor: Attention-Based Recurrent Neural Networks to Predict UV–Vis Spectra

Ultraviolet–visible (UV–Vis) absorption spectra are routinely collected as part of high-performance liquid chromatography (HPLC) analysis systems and can be used to identify chemical reaction products by comparison to the reference spectra. Here, we present UV-adVISor as a new computational tool for...

Detailed description

Bibliographic details
Published in: Analytical chemistry (Washington) 2021-12, Vol.93 (48), p.16076-16085
Main authors: Urbina, Fabio, Batra, Kushal, Luebke, Kevin J, White, Jason D, Matsiev, Daniel, Olson, Lori L, Malerich, Jeremiah P, Hupcey, Maggie A. Z, Madrid, Peter B, Ekins, Sean
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 16085
container_issue 48
container_start_page 16076
container_title Analytical chemistry (Washington)
container_volume 93
creator Urbina, Fabio
Batra, Kushal
Luebke, Kevin J
White, Jason D
Matsiev, Daniel
Olson, Lori L
Malerich, Jeremiah P
Hupcey, Maggie A. Z
Madrid, Peter B
Ekins, Sean
description Ultraviolet–visible (UV–Vis) absorption spectra are routinely collected as part of high-performance liquid chromatography (HPLC) analysis systems and can be used to identify chemical reaction products by comparison to the reference spectra. Here, we present UV-adVISor as a new computational tool for predicting the UV–Vis spectra from a molecule’s structure alone. UV–Vis prediction was approached as a sequence-to-sequence problem. We utilized Long-Short Term Memory and attention-based neural networks with Extended Connectivity Fingerprint Diameter 6 or molecule SMILES to generate predictive models for the UV spectra. We have produced two spectrum datasets (dataset I, N = 949, and dataset II, N = 2222) using different compound collections and spectrum acquisition methods to train, validate, and test our models. We evaluated the prediction accuracy of the complete spectra by the correspondence of wavelengths of absorbance maxima and with a series of statistical measures (the best test set median model parameters are in parentheses for model II), including RMSE (0.064), R² (0.71), and dynamic time warping (DTW, 0.194) of the entire spectrum curve. Scrambling molecule structures with the experimental spectra during training resulted in a degraded R², confirming the utility of the approaches for prediction. UV-adVISor is able to provide fast and accurate predictions for libraries of compounds.
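The measures named in the description (wavelength of the absorbance maximum, RMSE, R², and dynamic time warping) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy spectra, the wavelength grid, and the choice of an unnormalized DTW with absolute-difference cost are all assumptions made for illustration, and the paper's exact normalization may differ.

```python
import math

def lambda_max(wavelengths, absorbances):
    """Wavelength at which the absorbance is largest (first maximum on ties)."""
    best = max(range(len(absorbances)), key=absorbances.__getitem__)
    return wavelengths[best]

def rmse(measured, predicted):
    """Root-mean-square error between two spectra on the same wavelength grid."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(measured, predicted):
    """Coefficient of determination; negative when worse than predicting the mean."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

def dtw_distance(a, b):
    """Classic dynamic time warping with absolute-difference cost.

    Assumption: no window constraint and no length normalization.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical toy data: a single peak, and the same peak shifted by one grid point.
wavelengths = [220, 230, 240, 250, 260]   # nm, illustrative grid
measured = [0.0, 0.5, 1.0, 0.5, 0.0]
predicted = [0.0, 0.0, 0.5, 1.0, 0.5]

print(lambda_max(wavelengths, measured), lambda_max(wavelengths, predicted))
print(rmse(measured, predicted), r_squared(measured, predicted))
print(dtw_distance(measured, predicted))
```

In this toy case the pointwise measures penalize the shifted peak heavily (R² is negative), while DTW tolerates the shift because it may align samples at different positions. That difference is one plausible reason to report a curve-shape measure alongside RMSE and R².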
doi_str_mv 10.1021/acs.analchem.1c03741
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9137254</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2616228833</sourcerecordid><originalsourceid>FETCH-LOGICAL-a543t-c6dc915b0089a75135fde692b0a30f8f6939cdad3109744ddce64e8131ce0303</originalsourceid><addsrcrecordid>eNp9kc9u1DAQxi0EosvCGyAUiQuXLDO2kzgckErFn0oV_9ru1fLaE5qSjRfbAXHjHXhDngSvdrsCDpxG8vy-b8bzMfYQYYHA8amxcWFGM9grWi_Qgmgk3mIzrDiUtVL8NpsBgCh5A3DE7sV4DYAIWN9lR0Iq5DXwGftwuSyNW56e-_CsOE6JxtT7sXxhIrniI9kphPxUvKUpmCGX9M2Hz7FIvngfyPU2FZfLXz9-LvtYnG_IpmDuszudGSI92Nc5u3j18uLkTXn27vXpyfFZaSopUmlrZ1usVgCqNU2Fouoc1S1fgRHQqa5uRWudcQKhbaR0zlItSaFASyBAzNnzne1mWq0pd8c8e9Cb0K9N-K696fXfnbG_0p_8V92iaHheYc6e7A2C_zJRTHrdR0vDYEbyU9T5PiiVkHI76_E_6LWfQr79lsKac6WEyJTcUTb4GAN1h2UQ9DYynSPTN5HpfWRZ9ujPjxxENxllAHbAVn4Y_F_P3xm5pjo</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2616228833</pqid></control><display><type>article</type><title>UV-adVISor: Attention-Based Recurrent Neural Networks to Predict UV–Vis Spectra</title><source>ACS Publications</source><source>MEDLINE</source><creator>Urbina, Fabio ; Batra, Kushal ; Luebke, Kevin J ; White, Jason D ; Matsiev, Daniel ; Olson, Lori L ; Malerich, Jeremiah P ; Hupcey, Maggie A. Z ; Madrid, Peter B ; Ekins, Sean</creator><creatorcontrib>Urbina, Fabio ; Batra, Kushal ; Luebke, Kevin J ; White, Jason D ; Matsiev, Daniel ; Olson, Lori L ; Malerich, Jeremiah P ; Hupcey, Maggie A. Z ; Madrid, Peter B ; Ekins, Sean</creatorcontrib><description>Ultraviolet–visible (UV–Vis) absorption spectra are routinely collected as part of high-performance liquid chromatography (HPLC) analysis systems and can be used to identify chemical reaction products by comparison to the reference spectra. 
Here, we present UV-adVISor as a new computational tool for predicting the UV–Vis spectra from a molecule’s structure alone. UV–Vis prediction was approached as a sequence-to-sequence problem. We utilized Long-Short Term Memory and attention-based neural networks with Extended Connectivity Fingerprint Diameter 6 or molecule SMILES to generate predictive models for the UV spectra. We have produced two spectrum datasets (dataset I, N = 949, and dataset II, N = 2222) using different compound collections and spectrum acquisition methods to train, validate, and test our models. We evaluated the prediction accuracy of the complete spectra by the correspondence of wavelengths of absorbance maxima and with a series of statistical measures (the best test set median model parameters are in parentheses for model II), including RMSE (0.064), R² (0.71), and dynamic time warping (DTW, 0.194) of the entire spectrum curve. Scrambling molecule structures with the experimental spectra during training resulted in a degraded R², confirming the utility of the approaches for prediction. 
UV-adVISor is able to provide fast and accurate predictions for libraries of compounds.</description><identifier>ISSN: 0003-2700</identifier><identifier>EISSN: 1520-6882</identifier><identifier>DOI: 10.1021/acs.analchem.1c03741</identifier><identifier>PMID: 34812602</identifier><language>eng</language><publisher>United States: American Chemical Society</publisher><subject>Absorption spectra ; Advisors ; Chemical reactions ; Chemistry ; Chromatography, High Pressure Liquid ; Computer applications ; Datasets ; Diameters ; High performance liquid chromatography ; Light ; Liquid chromatography ; Long short-term memory ; Model testing ; Molecular structure ; Neural networks ; Neural Networks, Computer ; Prediction models ; Reaction products ; Recurrent neural networks ; Software ; Ultraviolet radiation ; Ultraviolet spectra ; Wavelengths</subject><ispartof>Analytical chemistry (Washington), 2021-12, Vol.93 (48), p.16076-16085</ispartof><rights>2021 American Chemical Society</rights><rights>Copyright American Chemical Society Dec 7, 2021</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a543t-c6dc915b0089a75135fde692b0a30f8f6939cdad3109744ddce64e8131ce0303</citedby><cites>FETCH-LOGICAL-a543t-c6dc915b0089a75135fde692b0a30f8f6939cdad3109744ddce64e8131ce0303</cites><orcidid>0000-0002-5691-5790 ; 0000-0002-4895-5610</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://pubs.acs.org/doi/pdf/10.1021/acs.analchem.1c03741$$EPDF$$P50$$Gacs$$H</linktopdf><linktohtml>$$Uhttps://pubs.acs.org/doi/10.1021/acs.analchem.1c03741$$EHTML$$P50$$Gacs$$H</linktohtml><link.rule.ids>230,314,776,780,881,2752,27053,27901,27902,56713,56763</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/34812602$$D View this record in 
MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Urbina, Fabio</creatorcontrib><creatorcontrib>Batra, Kushal</creatorcontrib><creatorcontrib>Luebke, Kevin J</creatorcontrib><creatorcontrib>White, Jason D</creatorcontrib><creatorcontrib>Matsiev, Daniel</creatorcontrib><creatorcontrib>Olson, Lori L</creatorcontrib><creatorcontrib>Malerich, Jeremiah P</creatorcontrib><creatorcontrib>Hupcey, Maggie A. Z</creatorcontrib><creatorcontrib>Madrid, Peter B</creatorcontrib><creatorcontrib>Ekins, Sean</creatorcontrib><title>UV-adVISor: Attention-Based Recurrent Neural Networks to Predict UV–Vis Spectra</title><title>Analytical chemistry (Washington)</title><addtitle>Anal. Chem</addtitle><description>Ultraviolet–visible (UV–Vis) absorption spectra are routinely collected as part of high-performance liquid chromatography (HPLC) analysis systems and can be used to identify chemical reaction products by comparison to the reference spectra. Here, we present UV-adVISor as a new computational tool for predicting the UV–Vis spectra from a molecule’s structure alone. UV–Vis prediction was approached as a sequence-to-sequence problem. We utilized Long-Short Term Memory and attention-based neural networks with Extended Connectivity Fingerprint Diameter 6 or molecule SMILES to generate predictive models for the UV spectra. We have produced two spectrum datasets (dataset I, N = 949, and dataset II, N = 2222) using different compound collections and spectrum acquisition methods to train, validate, and test our models. We evaluated the prediction accuracy of the complete spectra by the correspondence of wavelengths of absorbance maxima and with a series of statistical measures (the best test set median model parameters are in parentheses for model II), including RMSE (0.064), R 2 (0.71), and dynamic time warping (DTW, 0.194) of the entire spectrum curve. 
Scrambling molecule structures with the experimental spectra during training resulted in a degraded R 2, confirming the utility of the approaches for prediction. UV-adVISor is able to provide fast and accurate predictions for libraries of compounds.</description><subject>Absorption spectra</subject><subject>Advisors</subject><subject>Chemical reactions</subject><subject>Chemistry</subject><subject>Chromatography, High Pressure Liquid</subject><subject>Computer applications</subject><subject>Datasets</subject><subject>Diameters</subject><subject>High performance liquid chromatography</subject><subject>Light</subject><subject>Liquid chromatography</subject><subject>Long short-term memory</subject><subject>Model testing</subject><subject>Molecular structure</subject><subject>Neural networks</subject><subject>Neural Networks, Computer</subject><subject>Prediction models</subject><subject>Reaction products</subject><subject>Recurrent neural networks</subject><subject>Software</subject><subject>Ultraviolet radiation</subject><subject>Ultraviolet spectra</subject><subject>Wavelengths</subject><issn>0003-2700</issn><issn>1520-6882</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kc9u1DAQxi0EosvCGyAUiQuXLDO2kzgckErFn0oV_9ru1fLaE5qSjRfbAXHjHXhDngSvdrsCDpxG8vy-b8bzMfYQYYHA8amxcWFGM9grWi_Qgmgk3mIzrDiUtVL8NpsBgCh5A3DE7sV4DYAIWN9lR0Iq5DXwGftwuSyNW56e-_CsOE6JxtT7sXxhIrniI9kphPxUvKUpmCGX9M2Hz7FIvngfyPU2FZfLXz9-LvtYnG_IpmDuszudGSI92Nc5u3j18uLkTXn27vXpyfFZaSopUmlrZ1usVgCqNU2Fouoc1S1fgRHQqa5uRWudcQKhbaR0zlItSaFASyBAzNnzne1mWq0pd8c8e9Cb0K9N-K696fXfnbG_0p_8V92iaHheYc6e7A2C_zJRTHrdR0vDYEbyU9T5PiiVkHI76_E_6LWfQr79lsKac6WEyJTcUTb4GAN1h2UQ9DYynSPTN5HpfWRZ9ujPjxxENxllAHbAVn4Y_F_P3xm5pjo</recordid><startdate>20211207</startdate><enddate>20211207</enddate><creator>Urbina, Fabio</creator><creator>Batra, Kushal</creator><creator>Luebke, Kevin J</creator><creator>White, Jason 
D</creator><creator>Matsiev, Daniel</creator><creator>Olson, Lori L</creator><creator>Malerich, Jeremiah P</creator><creator>Hupcey, Maggie A. Z</creator><creator>Madrid, Peter B</creator><creator>Ekins, Sean</creator><general>American Chemical Society</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QF</scope><scope>7QO</scope><scope>7QQ</scope><scope>7SC</scope><scope>7SE</scope><scope>7SP</scope><scope>7SR</scope><scope>7TA</scope><scope>7TB</scope><scope>7TM</scope><scope>7U5</scope><scope>7U7</scope><scope>7U9</scope><scope>8BQ</scope><scope>8FD</scope><scope>C1K</scope><scope>F28</scope><scope>FR3</scope><scope>H8D</scope><scope>H8G</scope><scope>H94</scope><scope>JG9</scope><scope>JQ2</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-5691-5790</orcidid><orcidid>https://orcid.org/0000-0002-4895-5610</orcidid></search><sort><creationdate>20211207</creationdate><title>UV-adVISor: Attention-Based Recurrent Neural Networks to Predict UV–Vis Spectra</title><author>Urbina, Fabio ; Batra, Kushal ; Luebke, Kevin J ; White, Jason D ; Matsiev, Daniel ; Olson, Lori L ; Malerich, Jeremiah P ; Hupcey, Maggie A. 
Z ; Madrid, Peter B ; Ekins, Sean</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a543t-c6dc915b0089a75135fde692b0a30f8f6939cdad3109744ddce64e8131ce0303</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Absorption spectra</topic><topic>Advisors</topic><topic>Chemical reactions</topic><topic>Chemistry</topic><topic>Chromatography, High Pressure Liquid</topic><topic>Computer applications</topic><topic>Datasets</topic><topic>Diameters</topic><topic>High performance liquid chromatography</topic><topic>Light</topic><topic>Liquid chromatography</topic><topic>Long short-term memory</topic><topic>Model testing</topic><topic>Molecular structure</topic><topic>Neural networks</topic><topic>Neural Networks, Computer</topic><topic>Prediction models</topic><topic>Reaction products</topic><topic>Recurrent neural networks</topic><topic>Software</topic><topic>Ultraviolet radiation</topic><topic>Ultraviolet spectra</topic><topic>Wavelengths</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Urbina, Fabio</creatorcontrib><creatorcontrib>Batra, Kushal</creatorcontrib><creatorcontrib>Luebke, Kevin J</creatorcontrib><creatorcontrib>White, Jason D</creatorcontrib><creatorcontrib>Matsiev, Daniel</creatorcontrib><creatorcontrib>Olson, Lori L</creatorcontrib><creatorcontrib>Malerich, Jeremiah P</creatorcontrib><creatorcontrib>Hupcey, Maggie A. 
Z</creatorcontrib><creatorcontrib>Madrid, Peter B</creatorcontrib><creatorcontrib>Ekins, Sean</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Aluminium Industry Abstracts</collection><collection>Biotechnology Research Abstracts</collection><collection>Ceramic Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Corrosion Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Materials Business File</collection><collection>Mechanical &amp; Transportation Engineering Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>Toxicology Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Copper Technical Reference Library</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - 
Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Analytical chemistry (Washington)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Urbina, Fabio</au><au>Batra, Kushal</au><au>Luebke, Kevin J</au><au>White, Jason D</au><au>Matsiev, Daniel</au><au>Olson, Lori L</au><au>Malerich, Jeremiah P</au><au>Hupcey, Maggie A. Z</au><au>Madrid, Peter B</au><au>Ekins, Sean</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>UV-adVISor: Attention-Based Recurrent Neural Networks to Predict UV–Vis Spectra</atitle><jtitle>Analytical chemistry (Washington)</jtitle><addtitle>Anal. Chem</addtitle><date>2021-12-07</date><risdate>2021</risdate><volume>93</volume><issue>48</issue><spage>16076</spage><epage>16085</epage><pages>16076-16085</pages><issn>0003-2700</issn><eissn>1520-6882</eissn><abstract>Ultraviolet–visible (UV–Vis) absorption spectra are routinely collected as part of high-performance liquid chromatography (HPLC) analysis systems and can be used to identify chemical reaction products by comparison to the reference spectra. Here, we present UV-adVISor as a new computational tool for predicting the UV–Vis spectra from a molecule’s structure alone. UV–Vis prediction was approached as a sequence-to-sequence problem. We utilized Long-Short Term Memory and attention-based neural networks with Extended Connectivity Fingerprint Diameter 6 or molecule SMILES to generate predictive models for the UV spectra. We have produced two spectrum datasets (dataset I, N = 949, and dataset II, N = 2222) using different compound collections and spectrum acquisition methods to train, validate, and test our models. 
We evaluated the prediction accuracy of the complete spectra by the correspondence of wavelengths of absorbance maxima and with a series of statistical measures (the best test set median model parameters are in parentheses for model II), including RMSE (0.064), R 2 (0.71), and dynamic time warping (DTW, 0.194) of the entire spectrum curve. Scrambling molecule structures with the experimental spectra during training resulted in a degraded R 2, confirming the utility of the approaches for prediction. UV-adVISor is able to provide fast and accurate predictions for libraries of compounds.</abstract><cop>United States</cop><pub>American Chemical Society</pub><pmid>34812602</pmid><doi>10.1021/acs.analchem.1c03741</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0002-5691-5790</orcidid><orcidid>https://orcid.org/0000-0002-4895-5610</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0003-2700
ispartof Analytical chemistry (Washington), 2021-12, Vol.93 (48), p.16076-16085
issn 0003-2700
1520-6882
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9137254
source ACS Publications; MEDLINE
subjects Absorption spectra
Advisors
Chemical reactions
Chemistry
Chromatography, High Pressure Liquid
Computer applications
Datasets
Diameters
High performance liquid chromatography
Light
Liquid chromatography
Long short-term memory
Model testing
Molecular structure
Neural networks
Neural Networks, Computer
Prediction models
Reaction products
Recurrent neural networks
Software
Ultraviolet radiation
Ultraviolet spectra
Wavelengths
title UV-adVISor: Attention-Based Recurrent Neural Networks to Predict UV–Vis Spectra
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T12%3A40%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=UV-adVISor:%20Attention-Based%20Recurrent%20Neural%20Networks%20to%20Predict%20UV%E2%80%93Vis%20Spectra&rft.jtitle=Analytical%20chemistry%20(Washington)&rft.au=Urbina,%20Fabio&rft.date=2021-12-07&rft.volume=93&rft.issue=48&rft.spage=16076&rft.epage=16085&rft.pages=16076-16085&rft.issn=0003-2700&rft.eissn=1520-6882&rft_id=info:doi/10.1021/acs.analchem.1c03741&rft_dat=%3Cproquest_pubme%3E2616228833%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2616228833&rft_id=info:pmid/34812602&rfr_iscdi=true