Design of Capacity-Approaching Constrained Codes for DNA-Based Storage Systems

We consider coding techniques that limit the lengths of homopolymer runs in strands of nucleotides used in DNA-based mass data storage systems. We compute the maximum number of user bits that can be stored per nucleotide when a maximum homopolymer runlength constraint is imposed. We describe simple...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE communications letters 2018-02, Vol.22 (2), p.224-227
Hauptverfasser: Schouhamer Immink, Kees A., Kui Cai
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:We consider coding techniques that limit the lengths of homopolymer runs in strands of nucleotides used in DNA-based mass data storage systems. We compute the maximum number of user bits that can be stored per nucleotide when a maximum homopolymer runlength constraint is imposed. We describe simple and efficient implementations of coding techniques that avoid the occurrence of long homopolymers, and the rates of the constructed codes are close to the theoretical maximum. The proposed sequence replacement method for k-constrained q-ary data yields a significant improvement in coding redundancy than the prior art sequence replacement method for the k-constrained binary data. Using a simple transformation, standard binary maximum runlength limited sequences can be transformed into maximum runlength limited q-ary sequences which opens the door to applying the vast prior art binary code constructions to DNA-based storage.
ISSN:1089-7798
1558-2558
DOI:10.1109/LCOMM.2017.2775608