Methods and systems for data analysis and compression

The present disclosure provides computer implemented methods and systems for analysing datasets, such as large data sets output from nucleic acid sequencing technologies. In particular, the present disclosure provides for data analysis comprising computing the BWT of a collection of strings in an in...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Cox, Anthony James, Janin, Lilian
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The present disclosure provides computer implemented methods and systems for analysing datasets, such as large data sets output from nucleic acid sequencing technologies. In particular, the present disclosure provides for data analysis comprising computing the BWT of a collection of strings in an incremental, character by character, manner. The present disclosure also provides compression boosting strategies resulting in a BWT of a reordered collection of data that is more compressible by second stage compression methods compared to non-reordered computational analysis.