Recovering a Message from an Incomplete Set of Noisy Fragments

We consider the problem of communicating over a channel that breaks the message block into fragments of random lengths, shuffles them out of order, and deletes a random fraction of the fragments. Such a channel is motivated by applications in molecular data storage and forensics, and we refer to it...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Ravi, Aditya Narayan, Vahid, Alireza, Shomorony, Ilan
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:We consider the problem of communicating over a channel that breaks the message block into fragments of random lengths, shuffles them out of order, and deletes a random fraction of the fragments. Such a channel is motivated by applications in molecular data storage and forensics, and we refer to it as the torn-paper channel. We characterize the capacity of this channel under arbitrary fragment length distributions and deletion probabilities. Precisely, we show that the capacity is given by a closed-form expression that can be interpreted as F - A, where F is the coverage fraction ,i.e., the fraction of the input codeword that is covered by output fragments, and A is an alignment cost incurred due to the lack of ordering in the output fragments. We then consider a noisy version of the problem, where the fragments are corrupted by binary symmetric noise. We derive upper and lower bounds to the capacity, both of which can be seen as F - A expressions. These bounds match for specific choices of fragment length distributions, and they are approximately tight in cases where there are not too many short fragments.
DOI:10.48550/arxiv.2407.05544