Recovery from Non-Decomposable Distance Oracles


Bibliographic Details
Published in: IEEE Transactions on Information Theory, 2023-10, Vol. 69 (10), p. 1-1
Main authors: Hu, Zhuangfei; Li, Xinda; Woodruff, David P.; Zhang, Hongyang; Zhang, Shufan
Format: Article
Language: English
Description
Abstract: A line of work has looked at the problem of recovering an input from distance queries. In this setting, there is an unknown sequence $s \in \{0,1\}^{\le n}$, and one chooses a set of queries $y \in \{0,1\}^{O(n)}$ and receives $d(s, y)$ for a distance function $d$. The goal is to make as few queries as possible to recover $s$. Although this problem is well-studied for decomposable distances, i.e., distances of the form $d(s, y) = \sum_{i=1}^{n} f(s_i, y_i)$ for some function $f$, which includes the important cases of Hamming distance, $\ell_p$-norms, and $M$-estimators, to the best of our knowledge this problem has not been studied for non-decomposable distances, for which there are important instances including edit distance, dynamic time warping (DTW), Fréchet distance, earth mover's distance, and others. We initiate the study and develop a general framework for such distances. Interestingly, for some distances such as DTW or Fréchet, exact recovery of the sequence $s$ is provably impossible, and so we show by allowing the characters in $y$ to be drawn from a slightly larger alphabet this then becomes possible. In a number of cases we obtain optimal or near-optimal query complexity. One motivation for understanding non-adaptivity is that the query sequence can be fixed and provide a non-linear embedding of the input, which can be used in downstream applications involving, e.g., neural networks for natural language processing.
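As a minimal illustration of the decomposable setting the abstract refers to (not the paper's method), the sketch below recovers a binary sequence from a fixed, non-adaptive set of Hamming-distance queries. It assumes the secret has length exactly $n$ and that $n$ is known; the function names `hamming`, `make_queries`, and `recover` are hypothetical, and the naive $n + 1$ query count is not claimed to be optimal.

```python
from typing import List


def hamming(s: List[int], y: List[int]) -> int:
    """Decomposable distance: sum over i of f(s_i, y_i) with f(a, b) = [a != b]."""
    assert len(s) == len(y)
    return sum(a != b for a, b in zip(s, y))


def make_queries(n: int) -> List[List[int]]:
    """Fixed (non-adaptive) query set: the all-zeros string plus the n unit vectors."""
    zero = [0] * n
    units = [[1 if j == i else 0 for j in range(n)] for i in range(n)]
    return [zero] + units


def recover(answers: List[int]) -> List[int]:
    """Decode the secret from the distances to the queries built by make_queries."""
    base = answers[0]  # distance to all-zeros = number of ones in the secret
    # Setting position i to 1 in the query lowers the distance by 1 if s_i = 1,
    # and raises it by 1 if s_i = 0.
    return [1 if a == base - 1 else 0 for a in answers[1:]]


# Illustrative secret (an assumption for the demo, not data from the paper).
secret = [1, 0, 1, 1, 0]
queries = make_queries(len(secret))
answers = [hamming(secret, y) for y in queries]  # the oracle's responses
recovered = recover(answers)
```

Because every query is fixed before any answer is seen, the vector of answers acts as a fixed embedding of the secret, which is the sense of non-adaptivity the abstract describes. This position-by-position decoding relies on the distance being a sum of per-position terms, so it does not carry over to non-decomposable distances such as edit distance or DTW.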
ISSN: 0018-9448, 1557-9654
DOI: 10.1109/TIT.2023.3289981