A fast machine learning dataloader for epigenetic tracks from BigWig files

We created bigwig-loader, a data-loader for epigenetic profiles from BigWig files that decompresses and processes information for multiple intervals from multiple BigWig files in parallel. This is an access pattern needed to create training batches for typical machine learning models on epigenetics...

Bibliographic details
Published in: Bioinformatics (Oxford, England), 2024-01, Vol.40 (1)
Main authors: Retel, Joren Sebastian; Poehlmann, Andreas; Chiou, Josh; Steffen, Andreas; Clevert, Djork-Arné
Format: Article
Language: eng
Subjects: Applications Note; Epigenesis, Genetic; Epigenomics; Software
Online access: Full text
container_end_page
container_issue 1
container_start_page
container_title Bioinformatics (Oxford, England)
container_volume 40
creator Retel, Joren Sebastian
Poehlmann, Andreas
Chiou, Josh
Steffen, Andreas
Clevert, Djork-Arné
description We created bigwig-loader, a data-loader for epigenetic profiles from BigWig files that decompresses and processes information for multiple intervals from multiple BigWig files in parallel. This is an access pattern needed to create training batches for typical machine learning models on epigenetics data. Using a new codec, the decompression can be done on a graphical processing unit (GPU) making it fast enough to create the training batches during training, mitigating the need for saving preprocessed training examples to disk. The bigwig-loader installation instructions and source code can be accessed at https://github.com/pfizer-opensource/bigwig-loader.
doi_str_mv 10.1093/bioinformatics/btad767
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10782802</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2913449814</sourcerecordid><originalsourceid>FETCH-LOGICAL-c362t-dfb53a461d30d80f3f14abd214faf2dbfc3c5c136b3a819c3a699fd2580baea53</originalsourceid><addsrcrecordid>eNpVUclOwzAQtRCIlsIvVD5yCbXjLM4JlYpVlbiAOFoTL6khsYudIvH3BLVU9DQjzds0D6EpJVeUVGxWW2-d8aGD3so4q3tQZVEeoTFlRZlknNLjf_sIncX4TgjJSV6cohHjtMxLXozR0xwbiD3uQK6s07jVEJx1DVbQQ-tB6YAHG6zXttFOD2a4DyA_IjbBd_jGNm-2wca2Op6jEwNt1Be7OUGvd7cvi4dk-Xz_uJgvE8mKtE-UqXMGWUEVI4oTwwzNoFYpzQyYVNVGMpnLIXrNgNNKMiiqyqg056QGDTmboOut7npTd1pJ7YZErVgH20H4Fh6sOLw4uxKN_xKUlDzlJB0ULncKwX9udOxFZ6PUbQtO-00UaUVZllWcZgO02EJl8DEGbfY-lIjfJsRhE2LXxECc_k-5p_29nv0AhLiNKQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2913449814</pqid></control><display><type>article</type><title>A fast machine learning dataloader for epigenetic tracks from BigWig files</title><source>MEDLINE</source><source>Directory of Open Access Journals</source><source>Oxford University Press Open Access</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><source>EZB Electronic Journals Library</source><creator>Retel, Joren Sebastian ; Poehlmann, Andreas ; Chiou, Josh ; Steffen, Andreas ; Clevert, Djork-Arné</creator><contributor>Martelli, Pier Luigi</contributor><creatorcontrib>Retel, Joren Sebastian ; Poehlmann, Andreas ; Chiou, Josh ; Steffen, Andreas ; Clevert, Djork-Arné ; Martelli, Pier Luigi</creatorcontrib><description>We created bigwig-loader, a data-loader for epigenetic profiles from BigWig files that decompresses and processes information for multiple intervals from multiple BigWig files in parallel. This is an access pattern needed to create training batches for typical machine learning models on epigenetics data. 
Using a new codec, the decompression can be done on a graphical processing unit (GPU) making it fast enough to create the training batches during training, mitigating the need for saving preprocessed training examples to disk. The bigwig-loader installation instructions and source code can be accessed at https://github.com/pfizer-opensource/bigwig-loader.</description><identifier>ISSN: 1367-4811</identifier><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/btad767</identifier><identifier>PMID: 38175786</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Applications Note ; Epigenesis, Genetic ; Epigenomics ; Software</subject><ispartof>Bioinformatics (Oxford, England), 2024-01, Vol.40 (1)</ispartof><rights>The Author(s) 2024. Published by Oxford University Press.</rights><rights>The Author(s) 2024. Published by Oxford University Press. 2024</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c362t-dfb53a461d30d80f3f14abd214faf2dbfc3c5c136b3a819c3a699fd2580baea53</cites><orcidid>0000-0003-3316-5525 ; 0000-0002-4618-0647</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10782802/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10782802/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,315,728,781,785,865,886,27926,27927,53793,53795</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/38175786$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Martelli, Pier Luigi</contributor><creatorcontrib>Retel, Joren 
Sebastian</creatorcontrib><creatorcontrib>Poehlmann, Andreas</creatorcontrib><creatorcontrib>Chiou, Josh</creatorcontrib><creatorcontrib>Steffen, Andreas</creatorcontrib><creatorcontrib>Clevert, Djork-Arné</creatorcontrib><title>A fast machine learning dataloader for epigenetic tracks from BigWig files</title><title>Bioinformatics (Oxford, England)</title><addtitle>Bioinformatics</addtitle><description>We created bigwig-loader, a data-loader for epigenetic profiles from BigWig files that decompresses and processes information for multiple intervals from multiple BigWig files in parallel. This is an access pattern needed to create training batches for typical machine learning models on epigenetics data. Using a new codec, the decompression can be done on a graphical processing unit (GPU) making it fast enough to create the training batches during training, mitigating the need for saving preprocessed training examples to disk. The bigwig-loader installation instructions and source code can be accessed at https://github.com/pfizer-opensource/bigwig-loader.</description><subject>Applications Note</subject><subject>Epigenesis, Genetic</subject><subject>Epigenomics</subject><subject>Software</subject><issn>1367-4811</issn><issn>1367-4803</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNpVUclOwzAQtRCIlsIvVD5yCbXjLM4JlYpVlbiAOFoTL6khsYudIvH3BLVU9DQjzds0D6EpJVeUVGxWW2-d8aGD3so4q3tQZVEeoTFlRZlknNLjf_sIncX4TgjJSV6cohHjtMxLXozR0xwbiD3uQK6s07jVEJx1DVbQQ-tB6YAHG6zXttFOD2a4DyA_IjbBd_jGNm-2wca2Op6jEwNt1Be7OUGvd7cvi4dk-Xz_uJgvE8mKtE-UqXMGWUEVI4oTwwzNoFYpzQyYVNVGMpnLIXrNgNNKMiiqyqg056QGDTmboOut7npTd1pJ7YZErVgH20H4Fh6sOLw4uxKN_xKUlDzlJB0ULncKwX9udOxFZ6PUbQtO-00UaUVZllWcZgO02EJl8DEGbfY-lIjfJsRhE2LXxECc_k-5p_29nv0AhLiNKQ</recordid><startdate>20240102</startdate><enddate>20240102</enddate><creator>Retel, Joren Sebastian</creator><creator>Poehlmann, 
Andreas</creator><creator>Chiou, Josh</creator><creator>Steffen, Andreas</creator><creator>Clevert, Djork-Arné</creator><general>Oxford University Press</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0003-3316-5525</orcidid><orcidid>https://orcid.org/0000-0002-4618-0647</orcidid></search><sort><creationdate>20240102</creationdate><title>A fast machine learning dataloader for epigenetic tracks from BigWig files</title><author>Retel, Joren Sebastian ; Poehlmann, Andreas ; Chiou, Josh ; Steffen, Andreas ; Clevert, Djork-Arné</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c362t-dfb53a461d30d80f3f14abd214faf2dbfc3c5c136b3a819c3a699fd2580baea53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Applications Note</topic><topic>Epigenesis, Genetic</topic><topic>Epigenomics</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Retel, Joren Sebastian</creatorcontrib><creatorcontrib>Poehlmann, Andreas</creatorcontrib><creatorcontrib>Chiou, Josh</creatorcontrib><creatorcontrib>Steffen, Andreas</creatorcontrib><creatorcontrib>Clevert, Djork-Arné</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Bioinformatics (Oxford, England)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Retel, Joren Sebastian</au><au>Poehlmann, Andreas</au><au>Chiou, 
Josh</au><au>Steffen, Andreas</au><au>Clevert, Djork-Arné</au><au>Martelli, Pier Luigi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A fast machine learning dataloader for epigenetic tracks from BigWig files</atitle><jtitle>Bioinformatics (Oxford, England)</jtitle><addtitle>Bioinformatics</addtitle><date>2024-01-02</date><risdate>2024</risdate><volume>40</volume><issue>1</issue><issn>1367-4811</issn><issn>1367-4803</issn><eissn>1367-4811</eissn><abstract>We created bigwig-loader, a data-loader for epigenetic profiles from BigWig files that decompresses and processes information for multiple intervals from multiple BigWig files in parallel. This is an access pattern needed to create training batches for typical machine learning models on epigenetics data. Using a new codec, the decompression can be done on a graphical processing unit (GPU) making it fast enough to create the training batches during training, mitigating the need for saving preprocessed training examples to disk. The bigwig-loader installation instructions and source code can be accessed at https://github.com/pfizer-opensource/bigwig-loader.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>38175786</pmid><doi>10.1093/bioinformatics/btad767</doi><orcidid>https://orcid.org/0000-0003-3316-5525</orcidid><orcidid>https://orcid.org/0000-0002-4618-0647</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1367-4811
ispartof Bioinformatics (Oxford, England), 2024-01, Vol.40 (1)
issn 1367-4811
1367-4803
1367-4811
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10782802
source MEDLINE; Directory of Open Access Journals; Oxford University Press Open Access; PubMed Central; Alma/SFX Local Collection; EZB Electronic Journals Library
subjects Applications Note
Epigenesis, Genetic
Epigenomics
Software
title A fast machine learning dataloader for epigenetic tracks from BigWig files
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-17T16%3A30%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20fast%20machine%20learning%20dataloader%20for%20epigenetic%20tracks%20from%20BigWig%20files&rft.jtitle=Bioinformatics%20(Oxford,%20England)&rft.au=Retel,%20Joren%20Sebastian&rft.date=2024-01-02&rft.volume=40&rft.issue=1&rft.issn=1367-4811&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/btad767&rft_dat=%3Cproquest_pubme%3E2913449814%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2913449814&rft_id=info:pmid/38175786&rfr_iscdi=true