Prefix-free graphs and suffix array construction in sublinear space
Main authors: | , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | A recent paradigm shift in bioinformatics from a single reference genome to a
pangenome brought with it several graph structures. These graph structures must
implement operations, such as efficient construction from multiple genomes and
read mapping. Read mapping is a well-studied problem on sequential data, where
data structures such as the suffix array and the Burrows-Wheeler transform
allow it to be solved efficiently. Attempts to achieve comparatively
high performance on graphs bring many complications since the common data
structures on strings are not easily obtainable for graphs. In this work, we
introduce prefix-free graphs, a novel pangenomic data structure; we show how to
construct them and how to use them to obtain well-known data structures from
stringology in sublinear space, allowing for many efficient operations on
pangenomes. |
---|---|
DOI: | 10.48550/arxiv.2306.14689 |