DBTRG: De Bruijn Trim rotation graph encoding for reliable DNA storage


Bibliographic details
Published in: Computational and structural biotechnology journal 2023-01, Vol.21, p.4469-4477
Main authors: Zhao, Yunzhu; Cao, Ben; Wang, Penghao; Wang, Kun; Wang, Bin
Format: Article
Language: English
Description
Summary: DNA is a high-density, long-term stable, and scalable storage medium that can meet the increased demands on storage media resulting from the exponential growth of data. Existing DNA storage encoding schemes tend to achieve high-density storage but do not fully consider the local and global stability of DNA sequences or the read and write accuracy of the stored information. To address these problems, this article presents a graph-based De Bruijn Trim Rotation Graph (DBTRG) encoding scheme. The original binary sequence is XORed with a proposed dynamic binary sequence, the result is partitioned into k-mers that are placed in the De Bruijn Trim graph, and the stored information is compressed according to the overlap between k-mers. The simulated experimental results show that DBTRG ensures base balance and diversity, reduces the likelihood of undesired motifs, and improves the stability of DNA storage and data recovery. Furthermore, DBTRG maintains an encoding rate of 1.92 while storing 510 KB images and introduces novel approaches and concepts for DNA storage encoding.

• The dynamic binary sequence is XORed with the original binary sequence, and the k-mers partitioned from the result construct the De Bruijn Trim graph.
• Deleting repeating base pairs in connected nodes can ensure base balance and reduce the number of undesired motifs.
• The rotating tree algorithm is modified to satisfy the homopolymer length of 2.
• Error-correcting capability is improved through the use of Reed-Solomon (RS) codes.
ISSN:2001-0370
DOI:10.1016/j.csbj.2023.09.004
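
The XOR, k-mer partition, and overlap steps named in the summary can be illustrated with a minimal Python sketch. It is not the authors' DBTRG implementation: the seeded pseudo-random mask standing in for the "dynamic binary sequence", the 2-bit base mapping (00→A, 01→C, 10→G, 11→T), and all function names are assumptions made for illustration. The sketch only shows that XOR masking is reversible, that consecutive k-mers overlapping by k-1 bases can be merged back into one strand, and how a homopolymer run length can be measured.

```python
import random

BASES = "ACGT"  # assumed 2-bit mapping: 00->A, 01->C, 10->G, 11->T


def dynamic_bits(n, seed=0):
    """Hypothetical stand-in for the 'dynamic binary sequence': seeded pseudo-random bits."""
    rng = random.Random(seed)
    return [rng.getrandbits(1) for _ in range(n)]


def xor_bits(a, b):
    """Bitwise XOR of two equal-length bit lists; applying it twice restores the input."""
    if len(a) != len(b):
        raise ValueError("bit lists must have equal length")
    return [x ^ y for x, y in zip(a, b)]


def bits_to_bases(bits):
    """Map each pair of bits to one base."""
    if len(bits) % 2:
        raise ValueError("need an even number of bits")
    return "".join(BASES[2 * bits[i] + bits[i + 1]] for i in range(0, len(bits), 2))


def bases_to_bits(seq):
    """Inverse of bits_to_bases."""
    bits = []
    for base in seq:
        value = BASES.index(base)
        bits.extend([value >> 1, value & 1])
    return bits


def split_kmers(seq, k):
    """All overlapping k-mers; neighbours share k-1 bases (a De Bruijn-style overlap)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def merge_overlapping(kmers):
    """Compress a chain of k-mers by storing each shared (k-1)-base overlap only once."""
    merged = kmers[0]
    for prev, cur in zip(kmers, kmers[1:]):
        if prev[1:] != cur[:-1]:
            raise ValueError("k-mers do not overlap by k-1 bases")
        merged += cur[-1]
    return merged


def max_homopolymer(seq):
    """Length of the longest run of identical consecutive bases."""
    longest = run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        longest = max(longest, run)
        prev = base
    return longest


if __name__ == "__main__":
    data = [0] * 16                      # a worst case for homopolymers if left unmasked
    mask = dynamic_bits(len(data), seed=42)
    strand = bits_to_bases(xor_bits(data, mask))
    print(strand, max_homopolymer(strand))
    print(xor_bits(bases_to_bits(strand), mask) == data)
```

A decoder that knows the same seed regenerates the mask and XORs it again to recover the original bits. The trimming and rotation steps of DBTRG, and the RS error correction, are outside the scope of this sketch.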