A fast machine learning dataloader for epigenetic tracks from BigWig files
Published in: | Bioinformatics (Oxford, England), 2024-01, Vol.40 (1) |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | We created bigwig-loader, a data loader for epigenetic profiles from BigWig files that decompresses and processes information for multiple intervals from multiple BigWig files in parallel. This is the access pattern needed to create training batches for typical machine learning models on epigenetics data. Using a new codec, the decompression can be done on a graphics processing unit (GPU), making it fast enough to create the training batches during training and mitigating the need to save preprocessed training examples to disk.
The bigwig-loader installation instructions and source code can be accessed at https://github.com/pfizer-opensource/bigwig-loader. |
---|---|
ISSN: | 1367-4803, 1367-4811 |
DOI: | 10.1093/bioinformatics/btad767 |