Recovery from Non-Decomposable Distance Oracles

A line of work has looked at the problem of recovering an input from distance queries. In this setting, there is an unknown sequence $s \in \{0,1\}^{\le n}$, and one chooses a set of queries $y \in \{0,1\}^{O(n)}$ and receives $d(s, y)$ for a distance function $d$. The goal is to make as few queries as possibl...

Detailed description

Bibliographic details
Published in: IEEE transactions on information theory, 2023-10, Vol. 69 (10), p. 1-1
Main authors: Hu, Zhuangfei; Li, Xinda; Woodruff, David P.; Zhang, Hongyang; Zhang, Shufan
Format: Article
Language: English
container_end_page 1
container_issue 10
container_start_page 1
container_title IEEE transactions on information theory
container_volume 69
creator Hu, Zhuangfei
Li, Xinda
Woodruff, David P.
Zhang, Hongyang
Zhang, Shufan
description A line of work has studied the problem of recovering an input from distance queries. In this setting, there is an unknown sequence $s \in \{0,1\}^{\le n}$, and one chooses a set of queries $y \in \{0,1\}^{O(n)}$ and receives $d(s, y)$ for a distance function $d$. The goal is to make as few queries as possible to recover $s$. Although this problem is well studied for decomposable distances, i.e., distances of the form $d(s, y) = \sum_{i=1}^{n} f(s_i, y_i)$ for some function $f$, which include the important cases of Hamming distance, $\ell_p$-norms, and $M$-estimators, to the best of our knowledge it has not been studied for non-decomposable distances, which include important instances such as edit distance, dynamic time warping (DTW), Fréchet distance, earth mover's distance, and others. We initiate the study and develop a general framework for such distances. Interestingly, for some distances such as DTW or Fréchet, exact recovery of the sequence $s$ is provably impossible, so we show that allowing the characters in $y$ to be drawn from a slightly larger alphabet makes recovery possible. In a number of cases we obtain optimal or near-optimal query complexity. One motivation for understanding non-adaptivity is that the query sequence can be fixed and provide a non-linear embedding of the input, which can be used in downstream applications involving, e.g., neural networks for natural language processing.
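The setting in the description can be illustrated with a small Python sketch. It is not taken from the article. It assumes the hidden sequence has a known length n (the abstract allows any length up to n), uses a simple n + 1 query scheme for Hamming distance that is not query-optimal, and uses a textbook dynamic program for the discrete Fréchet distance with cost |a - b|. All function names are hypothetical. The second half shows one simple obstacle to exact recovery under a Fréchet-type distance: repeating a character does not change any answer, so two different sequences can look identical to every binary query. This is offered only as an illustration, not as the article's argument.

```python
from itertools import product


def hamming(s, y):
    """Decomposable distance: sum over positions of f(s_i, y_i) = [s_i != y_i]."""
    assert len(s) == len(y)
    return sum(a != b for a, b in zip(s, y))


def recover_from_hamming(oracle, n):
    """Recover a hidden length-n bit list with n + 1 non-adaptive queries.

    Assumption: the length n is known. The scheme is illustrative only.
    """
    # All queries are fixed before any answer is seen (non-adaptive).
    queries = [[0] * n] + [[int(j == i) for j in range(n)] for i in range(n)]
    answers = [oracle(y) for y in queries]
    weight = answers[0]  # d(s, 0^n) equals the number of ones in s
    # Setting bit i of the zero query lowers the distance exactly when s_i = 1.
    return [1 if answers[i + 1] < weight else 0 for i in range(n)]


def discrete_frechet(s, y):
    """Discrete Fréchet distance with cost |a - b| (non-decomposable)."""
    prev = None
    for a in s:
        row = []
        for j, b in enumerate(y):
            cost = abs(a - b)
            if prev is None and j == 0:
                best = cost
            elif prev is None:
                best = max(row[j - 1], cost)
            elif j == 0:
                best = max(prev[0], cost)
            else:
                best = max(min(prev[j], prev[j - 1], row[j - 1]), cost)
            row.append(best)
        prev = row
    return prev[-1]


# Decomposable case: exact recovery succeeds.
hidden = [1, 0, 1, 1, 0]
recovered = recover_from_hamming(lambda y: hamming(hidden, y), len(hidden))

# Non-decomposable case: [0, 1] and [0, 0, 1] give the same answer
# on every binary query of length 1 to 6.
s1, s2 = [0, 1], [0, 0, 1]
same_answers = all(
    discrete_frechet(s1, list(y)) == discrete_frechet(s2, list(y))
    for m in range(1, 7)
    for y in product([0, 1], repeat=m)
)
```

Here `recovered` equals `hidden`, while `same_answers` is true, which is consistent with the abstract's remark that a larger query alphabet is needed for some distances.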
doi_str_mv 10.1109/TIT.2023.3289981
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TIT_2023_3289981</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10164646</ieee_id><sourcerecordid>2865090602</sourcerecordid><originalsourceid>FETCH-LOGICAL-c334t-108a060cd76ced82957f393356f4f01c9174ffbdd8ef3f101756f6fb5a8d43f03</originalsourceid><addsrcrecordid>eNpNkM1rAjEQxUNpodb23kMPCz2vzuRrk2PRfghSodhzyGYTWFFjEy343zeihzKHYWbeewM_Qh4RRoigx8vZckSBshGjSmuFV2SAQjS1loJfkwEAqlpzrm7JXc6rMnKBdEDGX97FX5-OVUhxU33GbT0tm80uZtuufTXt895una8Wybq1z_fkJth19g-XPiTfb6_LyUc9X7zPJi_z2jHG9zWCsiDBdY10vlNUiyYwzZiQgQdAp7HhIbRdp3xgAQGbcpGhFVZ1nAVgQ_J8zt2l-HPweW9W8ZC25aWhSgrQJZ0WFZxVLsWckw9ml_qNTUeDYE5YTMFiTljMBUuxPJ0tvff-nxwlL8X-AFw5XNc</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2865090602</pqid></control><display><type>article</type><title>Recovery from Non-Decomposable Distance Oracles</title><source>IEEE Electronic Library (IEL)</source><creator>Hu, Zhuangfei ; Li, Xinda ; Woodruff, David P. ; Zhang, Hongyang ; Zhang, Shufan</creator><creatorcontrib>Hu, Zhuangfei ; Li, Xinda ; Woodruff, David P. ; Zhang, Hongyang ; Zhang, Shufan</creatorcontrib><description>A line of work has looked at the problem of recovering an input from distance queries. In this setting, there is an unknown sequence s ∈ {0, 1} ≤ n , and one chooses a set of queries y ∈ {0, 1} O ( n ) and receives d ( s , y ) for a distance function d . The goal is to make as few queries as possible to recover s . 
Although this problem is well-studied for decomposable distances, i.e., distances of the form d ( s , y ) = Σ n i =1 f ( s i , y i ) for some function f , which includes the important cases of Hamming distance, ℓ p -norms, and M -estimators, to the best of our knowledge this problem has not been studied for non-decomposable distances, for which there are important instances including edit distance, dynamic time warping (DTW), Fréchet distance, earth mover's distance, and others. We initiate the study and develop a general framework for such distances. Interestingly, for some distances such as DTW or Fréchet, exact recovery of the sequence s is provably impossible, and so we show by allowing the characters in y to be drawn from a slightly larger alphabet this then becomes possible. In a number of cases we obtain optimal or near-optimal query complexity. One motivation for understanding non-adaptivity is that the query sequence can be fixed and provide a non-linear embedding of the input, which can be used in downstream applications involving, e.g., neural networks for natural language processing.</description><identifier>ISSN: 0018-9448</identifier><identifier>EISSN: 1557-9654</identifier><identifier>DOI: 10.1109/TIT.2023.3289981</identifier><identifier>CODEN: IETTAW</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Complexity theory ; Decomposition ; DTW Distance ; Edit Distance ; Encoding ; Fréchet Distance ; Natural language processing ; Neural networks ; Perturbation methods ; Queries ; Recovery ; Robustness ; Sequence Recovery ; Symbols ; Testing ; Upper bound</subject><ispartof>IEEE transactions on information theory, 2023-10, Vol.69 (10), p.1-1</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2023</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c334t-108a060cd76ced82957f393356f4f01c9174ffbdd8ef3f101756f6fb5a8d43f03</citedby><cites>FETCH-LOGICAL-c334t-108a060cd76ced82957f393356f4f01c9174ffbdd8ef3f101756f6fb5a8d43f03</cites><orcidid>0000-0002-0548-6068 ; 0000-0002-0983-2730 ; 0009-0000-7238-4005 ; 0009-0009-0077-2469</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10164646$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27903,27904,54737</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10164646$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Hu, Zhuangfei</creatorcontrib><creatorcontrib>Li, Xinda</creatorcontrib><creatorcontrib>Woodruff, David P.</creatorcontrib><creatorcontrib>Zhang, Hongyang</creatorcontrib><creatorcontrib>Zhang, Shufan</creatorcontrib><title>Recovery from Non-Decomposable Distance Oracles</title><title>IEEE transactions on information theory</title><addtitle>TIT</addtitle><description>A line of work has looked at the problem of recovering an input from distance queries. In this setting, there is an unknown sequence s ∈ {0, 1} ≤ n , and one chooses a set of queries y ∈ {0, 1} O ( n ) and receives d ( s , y ) for a distance function d . The goal is to make as few queries as possible to recover s . 
Although this problem is well-studied for decomposable distances, i.e., distances of the form d ( s , y ) = Σ n i =1 f ( s i , y i ) for some function f , which includes the important cases of Hamming distance, ℓ p -norms, and M -estimators, to the best of our knowledge this problem has not been studied for non-decomposable distances, for which there are important instances including edit distance, dynamic time warping (DTW), Fréchet distance, earth mover's distance, and others. We initiate the study and develop a general framework for such distances. Interestingly, for some distances such as DTW or Fréchet, exact recovery of the sequence s is provably impossible, and so we show by allowing the characters in y to be drawn from a slightly larger alphabet this then becomes possible. In a number of cases we obtain optimal or near-optimal query complexity. One motivation for understanding non-adaptivity is that the query sequence can be fixed and provide a non-linear embedding of the input, which can be used in downstream applications involving, e.g., neural networks for natural language processing.</description><subject>Complexity theory</subject><subject>Decomposition</subject><subject>DTW Distance</subject><subject>Edit Distance</subject><subject>Encoding</subject><subject>Fréchet Distance</subject><subject>Natural language processing</subject><subject>Neural networks</subject><subject>Perturbation methods</subject><subject>Queries</subject><subject>Recovery</subject><subject>Robustness</subject><subject>Sequence Recovery</subject><subject>Symbols</subject><subject>Testing</subject><subject>Upper 
bound</subject><issn>0018-9448</issn><issn>1557-9654</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkM1rAjEQxUNpodb23kMPCz2vzuRrk2PRfghSodhzyGYTWFFjEy343zeihzKHYWbeewM_Qh4RRoigx8vZckSBshGjSmuFV2SAQjS1loJfkwEAqlpzrm7JXc6rMnKBdEDGX97FX5-OVUhxU33GbT0tm80uZtuufTXt895una8Wybq1z_fkJth19g-XPiTfb6_LyUc9X7zPJi_z2jHG9zWCsiDBdY10vlNUiyYwzZiQgQdAp7HhIbRdp3xgAQGbcpGhFVZ1nAVgQ_J8zt2l-HPweW9W8ZC25aWhSgrQJZ0WFZxVLsWckw9ml_qNTUeDYE5YTMFiTljMBUuxPJ0tvff-nxwlL8X-AFw5XNc</recordid><startdate>20231001</startdate><enddate>20231001</enddate><creator>Hu, Zhuangfei</creator><creator>Li, Xinda</creator><creator>Woodruff, David P.</creator><creator>Zhang, Hongyang</creator><creator>Zhang, Shufan</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-0548-6068</orcidid><orcidid>https://orcid.org/0000-0002-0983-2730</orcidid><orcidid>https://orcid.org/0009-0000-7238-4005</orcidid><orcidid>https://orcid.org/0009-0009-0077-2469</orcidid></search><sort><creationdate>20231001</creationdate><title>Recovery from Non-Decomposable Distance Oracles</title><author>Hu, Zhuangfei ; Li, Xinda ; Woodruff, David P. 
; Zhang, Hongyang ; Zhang, Shufan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c334t-108a060cd76ced82957f393356f4f01c9174ffbdd8ef3f101756f6fb5a8d43f03</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Complexity theory</topic><topic>Decomposition</topic><topic>DTW Distance</topic><topic>Edit Distance</topic><topic>Encoding</topic><topic>Fréchet Distance</topic><topic>Natural language processing</topic><topic>Neural networks</topic><topic>Perturbation methods</topic><topic>Queries</topic><topic>Recovery</topic><topic>Robustness</topic><topic>Sequence Recovery</topic><topic>Symbols</topic><topic>Testing</topic><topic>Upper bound</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hu, Zhuangfei</creatorcontrib><creatorcontrib>Li, Xinda</creatorcontrib><creatorcontrib>Woodruff, David P.</creatorcontrib><creatorcontrib>Zhang, Hongyang</creatorcontrib><creatorcontrib>Zhang, Shufan</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on information theory</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Hu, 
Zhuangfei</au><au>Li, Xinda</au><au>Woodruff, David P.</au><au>Zhang, Hongyang</au><au>Zhang, Shufan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Recovery from Non-Decomposable Distance Oracles</atitle><jtitle>IEEE transactions on information theory</jtitle><stitle>TIT</stitle><date>2023-10-01</date><risdate>2023</risdate><volume>69</volume><issue>10</issue><spage>1</spage><epage>1</epage><pages>1-1</pages><issn>0018-9448</issn><eissn>1557-9654</eissn><coden>IETTAW</coden><abstract>A line of work has looked at the problem of recovering an input from distance queries. In this setting, there is an unknown sequence s ∈ {0, 1} ≤ n , and one chooses a set of queries y ∈ {0, 1} O ( n ) and receives d ( s , y ) for a distance function d . The goal is to make as few queries as possible to recover s . Although this problem is well-studied for decomposable distances, i.e., distances of the form d ( s , y ) = Σ n i =1 f ( s i , y i ) for some function f , which includes the important cases of Hamming distance, ℓ p -norms, and M -estimators, to the best of our knowledge this problem has not been studied for non-decomposable distances, for which there are important instances including edit distance, dynamic time warping (DTW), Fréchet distance, earth mover's distance, and others. We initiate the study and develop a general framework for such distances. Interestingly, for some distances such as DTW or Fréchet, exact recovery of the sequence s is provably impossible, and so we show by allowing the characters in y to be drawn from a slightly larger alphabet this then becomes possible. In a number of cases we obtain optimal or near-optimal query complexity. 
One motivation for understanding non-adaptivity is that the query sequence can be fixed and provide a non-linear embedding of the input, which can be used in downstream applications involving, e.g., neural networks for natural language processing.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TIT.2023.3289981</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0002-0548-6068</orcidid><orcidid>https://orcid.org/0000-0002-0983-2730</orcidid><orcidid>https://orcid.org/0009-0000-7238-4005</orcidid><orcidid>https://orcid.org/0009-0009-0077-2469</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0018-9448
ispartof IEEE transactions on information theory, 2023-10, Vol.69 (10), p.1-1
issn 0018-9448
1557-9654
language eng
recordid cdi_crossref_primary_10_1109_TIT_2023_3289981
source IEEE Electronic Library (IEL)
subjects Complexity theory
Decomposition
DTW Distance
Edit Distance
Encoding
Fréchet Distance
Natural language processing
Neural networks
Perturbation methods
Queries
Recovery
Robustness
Sequence Recovery
Symbols
Testing
Upper bound
title Recovery from Non-Decomposable Distance Oracles
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T19%3A33%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Recovery%20from%20Non-Decomposable%20Distance%20Oracles&rft.jtitle=IEEE%20transactions%20on%20information%20theory&rft.au=Hu,%20Zhuangfei&rft.date=2023-10-01&rft.volume=69&rft.issue=10&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=0018-9448&rft.eissn=1557-9654&rft.coden=IETTAW&rft_id=info:doi/10.1109/TIT.2023.3289981&rft_dat=%3Cproquest_RIE%3E2865090602%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2865090602&rft_id=info:pmid/&rft_ieee_id=10164646&rfr_iscdi=true