Data-centric machine learning in quantum information science
We propose a series of data-centric heuristics for improving the performance of machine learning systems when applied to problems in quantum information science. In particular, we consider how systematic engineering of training sets can significantly enhance the accuracy of pre-trained neural networks used for quantum state reconstruction without altering the underlying architecture. We find that it is not always optimal to engineer training sets to exactly match the expected distribution of a target scenario, and instead, performance can be further improved by biasing the training set to be slightly more mixed than the target. This is due to the heterogeneity in the number of free variables required to describe states of different purity, and as a result, overall accuracy of the network improves when training sets of a fixed size focus on states with the least constrained free variables. For further clarity, we also include a ‘toy model’ demonstration of how spurious correlations can inadvertently enter synthetic data sets used for training, how the performance of systems trained with these correlations can degrade dramatically, and how the inclusion of even relatively few counterexamples can effectively remedy such problems.
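The central heuristic described above, biasing a fixed-size training set toward somewhat more mixed states than the target ensemble, can be illustrated with a minimal sketch. It assumes the standard Ginibre-induced construction rho = G G† / Tr(G G†), where G is a d x k complex Gaussian matrix and a larger ancilla dimension k yields more mixed states on average; the function names and the specific choices of d and k below are illustrative assumptions, not code or parameters taken from the paper.

```python
# Hedged sketch (not the authors' code): sample synthetic density matrices with
# tunable average purity, so a training set can be made slightly "more mixed"
# than the ensemble expected at test time.
import numpy as np

def random_density_matrix(d: int, k: int, rng: np.random.Generator) -> np.ndarray:
    """Draw a d x d density matrix from the Ginibre-induced ensemble with ancilla dimension k."""
    g = rng.normal(size=(d, k)) + 1j * rng.normal(size=(d, k))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def purity(rho: np.ndarray) -> float:
    """Tr(rho^2): ranges from 1/d (maximally mixed) up to 1 (pure)."""
    return np.trace(rho @ rho).real

rng = np.random.default_rng(0)
d = 4  # e.g. two-qubit states

# Target ensemble vs. a training set deliberately biased toward lower purity
# (larger k makes the sampled states more mixed on average).
target_set = [random_density_matrix(d, k=4, rng=rng) for _ in range(1000)]
train_set = [random_density_matrix(d, k=6, rng=rng) for _ in range(1000)]

print("mean target purity:  ", np.mean([purity(r) for r in target_set]))
print("mean training purity:", np.mean([purity(r) for r in train_set]))
```

In such a setup the training ensemble's average purity sits a little below the target's, mirroring the bias the abstract describes; how far to shift it is a tuning choice, not something prescribed here.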
Published in: Machine learning: science and technology, 2022-12, Vol. 3 (4), p. 4
Main authors: Lohani, Sanjaya; Lukens, Joseph M; Glasser, Ryan T; Searles, Thomas A; Kirby, Brian T
Format: Article
Language: English
Subjects: Engineering training; Heterogeneity; Information science; Machine learning; Neural networks; quantum noise and quantum operations; Quantum phenomena; quantum tomography
Publisher: IOP Publishing, Bristol
DOI: 10.1088/2632-2153/ac9036
ISSN: 2632-2153
Rights: © 2022 The Author(s). Published by IOP Publishing Ltd (CC BY 4.0)
Online access: Full text