On the Maximum Number of Non-Confusable Strings Evolving Under Short Tandem Duplications
Published in: | arXiv.org 2022-04 |
---|---|
Main author: | |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | The set of all \( q \)-ary strings that do not contain repeated substrings of length \( \leqslant\! 3 \) (i.e., that do not contain substrings of the form \( a a \), \( a b a b \), and \( a b c a b c \)) constitutes a code correcting an arbitrary number of tandem-duplication mutations of length \( \leqslant\! 3 \). In other words, any two such strings are non-confusable in the sense that they cannot produce the same string while evolving under tandem duplications of length \( \leqslant\! 3 \). We demonstrate that this code is asymptotically optimal in terms of rate, meaning that it represents the largest set of non-confusable strings up to subexponential factors. This result settles the zero-error capacity problem for the last remaining case of tandem-duplication channels satisfying the "root-uniqueness" property. |
---|---|
ISSN: | 2331-8422 |
DOI: | 10.48550/arxiv.1911.06561 |
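
The code described in the abstract can be illustrated with a short Python sketch. It is not taken from the article: the function names `has_short_tandem_repeat` and `deduplicate` are hypothetical, and the left-to-right, shortest-first order in which repeats are collapsed is an arbitrary choice made for illustration. The first function tests whether a string is a codeword (contains no substring `ww` with `w` of length at most 3); the second undoes tandem duplications until no such repeat remains, yielding what the abstract calls the root.

```python
def has_short_tandem_repeat(s: str, max_len: int = 3) -> bool:
    """Return True if s contains a substring ww with 1 <= len(w) <= max_len."""
    for k in range(1, max_len + 1):
        for i in range(len(s) - 2 * k + 1):
            if s[i:i + k] == s[i + k:i + 2 * k]:
                return True
    return False


def deduplicate(s: str, max_len: int = 3) -> str:
    """Collapse tandem repeats ww -> w (len(w) <= max_len) until none remain.

    Illustrative only: repeats are removed shortest-first, leftmost-first.
    """
    changed = True
    while changed:
        changed = False
        for k in range(1, max_len + 1):
            for i in range(len(s) - 2 * k + 1):
                if s[i:i + k] == s[i + k:i + 2 * k]:
                    # Drop the second copy of the repeated block.
                    s = s[:i + k] + s[i + 2 * k:]
                    changed = True
                    break
            if changed:
                break
    return s


if __name__ == "__main__":
    print(has_short_tandem_repeat("abcacb"))  # False: a valid codeword
    print(has_short_tandem_repeat("abcabc"))  # True: contains abcabc
    print(deduplicate("abcabcabc"))           # abc
    print(deduplicate("aabab"))               # ab
```

Under the root-uniqueness property mentioned in the abstract, the result of `deduplicate` should not depend on the order in which repeats are collapsed when `max_len` is at most 3; the sketch relies on that property rather than verifying it.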