A fast machine learning dataloader for epigenetic tracks from BigWig files
We created bigwig-loader, a data-loader for epigenetic profiles from BigWig files that decompresses and processes information for multiple intervals from multiple BigWig files in parallel. This is an access pattern needed to create training batches for typical machine learning models on epigenetics...
Published in: | Bioinformatics (Oxford, England), 2024-01, Vol.40 (1) |
---|---|
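Main authors: | Retel, Joren Sebastian; Poehlmann, Andreas; Chiou, Josh; Steffen, Andreas; Clevert, Djork-Arné |
Format: | Article |
Language: | eng |
Subjects: | Applications Note; Epigenesis, Genetic; Epigenomics; Software |
Online access: | Full text |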
container_end_page | |
---|---|
container_issue | 1 |
container_start_page | |
container_title | Bioinformatics (Oxford, England) |
container_volume | 40 |
creator | Retel, Joren Sebastian; Poehlmann, Andreas; Chiou, Josh; Steffen, Andreas; Clevert, Djork-Arné |
description | We created bigwig-loader, a data-loader for epigenetic profiles from BigWig files that decompresses and processes information for multiple intervals from multiple BigWig files in parallel. This is an access pattern needed to create training batches for typical machine learning models on epigenetics data. Using a new codec, the decompression can be done on a graphical processing unit (GPU) making it fast enough to create the training batches during training, mitigating the need for saving preprocessed training examples to disk.
The bigwig-loader installation instructions and source code can be accessed at https://github.com/pfizer-opensource/bigwig-loader. |
doi_str_mv | 10.1093/bioinformatics/btad767 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10782802</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2913449814</sourcerecordid><originalsourceid>FETCH-LOGICAL-c362t-dfb53a461d30d80f3f14abd214faf2dbfc3c5c136b3a819c3a699fd2580baea53</originalsourceid><addsrcrecordid>eNpVUclOwzAQtRCIlsIvVD5yCbXjLM4JlYpVlbiAOFoTL6khsYudIvH3BLVU9DQjzds0D6EpJVeUVGxWW2-d8aGD3so4q3tQZVEeoTFlRZlknNLjf_sIncX4TgjJSV6cohHjtMxLXozR0xwbiD3uQK6s07jVEJx1DVbQQ-tB6YAHG6zXttFOD2a4DyA_IjbBd_jGNm-2wca2Op6jEwNt1Be7OUGvd7cvi4dk-Xz_uJgvE8mKtE-UqXMGWUEVI4oTwwzNoFYpzQyYVNVGMpnLIXrNgNNKMiiqyqg056QGDTmboOut7npTd1pJ7YZErVgH20H4Fh6sOLw4uxKN_xKUlDzlJB0ULncKwX9udOxFZ6PUbQtO-00UaUVZllWcZgO02EJl8DEGbfY-lIjfJsRhE2LXxECc_k-5p_29nv0AhLiNKQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2913449814</pqid></control><display><type>article</type><title>A fast machine learning dataloader for epigenetic tracks from BigWig files</title><source>MEDLINE</source><source>Directory of Open Access Journals</source><source>Oxford University Press Open Access</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><source>EZB Electronic Journals Library</source><creator>Retel, Joren Sebastian ; Poehlmann, Andreas ; Chiou, Josh ; Steffen, Andreas ; Clevert, Djork-Arné</creator><contributor>Martelli, Pier Luigi</contributor><creatorcontrib>Retel, Joren Sebastian ; Poehlmann, Andreas ; Chiou, Josh ; Steffen, Andreas ; Clevert, Djork-Arné ; Martelli, Pier Luigi</creatorcontrib><description>We created bigwig-loader, a data-loader for epigenetic profiles from BigWig files that decompresses and processes information for multiple intervals from multiple BigWig files in parallel. This is an access pattern needed to create training batches for typical machine learning models on epigenetics data. 
Using a new codec, the decompression can be done on a graphical processing unit (GPU) making it fast enough to create the training batches during training, mitigating the need for saving preprocessed training examples to disk.
The bigwig-loader installation instructions and source code can be accessed at https://github.com/pfizer-opensource/bigwig-loader.</description><identifier>ISSN: 1367-4811</identifier><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/btad767</identifier><identifier>PMID: 38175786</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Applications Note ; Epigenesis, Genetic ; Epigenomics ; Software</subject><ispartof>Bioinformatics (Oxford, England), 2024-01, Vol.40 (1)</ispartof><rights>The Author(s) 2024. Published by Oxford University Press.</rights><rights>The Author(s) 2024. Published by Oxford University Press. 2024</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c362t-dfb53a461d30d80f3f14abd214faf2dbfc3c5c136b3a819c3a699fd2580baea53</cites><orcidid>0000-0003-3316-5525 ; 0000-0002-4618-0647</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10782802/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10782802/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,315,728,781,785,865,886,27926,27927,53793,53795</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/38175786$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Martelli, Pier Luigi</contributor><creatorcontrib>Retel, Joren Sebastian</creatorcontrib><creatorcontrib>Poehlmann, Andreas</creatorcontrib><creatorcontrib>Chiou, Josh</creatorcontrib><creatorcontrib>Steffen, Andreas</creatorcontrib><creatorcontrib>Clevert, Djork-Arné</creatorcontrib><title>A fast machine learning dataloader 
for epigenetic tracks from BigWig files</title><title>Bioinformatics (Oxford, England)</title><addtitle>Bioinformatics</addtitle><description>We created bigwig-loader, a data-loader for epigenetic profiles from BigWig files that decompresses and processes information for multiple intervals from multiple BigWig files in parallel. This is an access pattern needed to create training batches for typical machine learning models on epigenetics data. Using a new codec, the decompression can be done on a graphical processing unit (GPU) making it fast enough to create the training batches during training, mitigating the need for saving preprocessed training examples to disk.
The bigwig-loader installation instructions and source code can be accessed at https://github.com/pfizer-opensource/bigwig-loader.</description><subject>Applications Note</subject><subject>Epigenesis, Genetic</subject><subject>Epigenomics</subject><subject>Software</subject><issn>1367-4811</issn><issn>1367-4803</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNpVUclOwzAQtRCIlsIvVD5yCbXjLM4JlYpVlbiAOFoTL6khsYudIvH3BLVU9DQjzds0D6EpJVeUVGxWW2-d8aGD3so4q3tQZVEeoTFlRZlknNLjf_sIncX4TgjJSV6cohHjtMxLXozR0xwbiD3uQK6s07jVEJx1DVbQQ-tB6YAHG6zXttFOD2a4DyA_IjbBd_jGNm-2wca2Op6jEwNt1Be7OUGvd7cvi4dk-Xz_uJgvE8mKtE-UqXMGWUEVI4oTwwzNoFYpzQyYVNVGMpnLIXrNgNNKMiiqyqg056QGDTmboOut7npTd1pJ7YZErVgH20H4Fh6sOLw4uxKN_xKUlDzlJB0ULncKwX9udOxFZ6PUbQtO-00UaUVZllWcZgO02EJl8DEGbfY-lIjfJsRhE2LXxECc_k-5p_29nv0AhLiNKQ</recordid><startdate>20240102</startdate><enddate>20240102</enddate><creator>Retel, Joren Sebastian</creator><creator>Poehlmann, Andreas</creator><creator>Chiou, Josh</creator><creator>Steffen, Andreas</creator><creator>Clevert, Djork-Arné</creator><general>Oxford University Press</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0003-3316-5525</orcidid><orcidid>https://orcid.org/0000-0002-4618-0647</orcidid></search><sort><creationdate>20240102</creationdate><title>A fast machine learning dataloader for epigenetic tracks from BigWig files</title><author>Retel, Joren Sebastian ; Poehlmann, Andreas ; Chiou, Josh ; Steffen, Andreas ; Clevert, 
Djork-Arné</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c362t-dfb53a461d30d80f3f14abd214faf2dbfc3c5c136b3a819c3a699fd2580baea53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Applications Note</topic><topic>Epigenesis, Genetic</topic><topic>Epigenomics</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Retel, Joren Sebastian</creatorcontrib><creatorcontrib>Poehlmann, Andreas</creatorcontrib><creatorcontrib>Chiou, Josh</creatorcontrib><creatorcontrib>Steffen, Andreas</creatorcontrib><creatorcontrib>Clevert, Djork-Arné</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Bioinformatics (Oxford, England)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Retel, Joren Sebastian</au><au>Poehlmann, Andreas</au><au>Chiou, Josh</au><au>Steffen, Andreas</au><au>Clevert, Djork-Arné</au><au>Martelli, Pier Luigi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A fast machine learning dataloader for epigenetic tracks from BigWig files</atitle><jtitle>Bioinformatics (Oxford, England)</jtitle><addtitle>Bioinformatics</addtitle><date>2024-01-02</date><risdate>2024</risdate><volume>40</volume><issue>1</issue><issn>1367-4811</issn><issn>1367-4803</issn><eissn>1367-4811</eissn><abstract>We created bigwig-loader, a data-loader for epigenetic profiles from BigWig files that decompresses and processes information for multiple intervals from multiple BigWig files in parallel. 
This is an access pattern needed to create training batches for typical machine learning models on epigenetics data. Using a new codec, the decompression can be done on a graphical processing unit (GPU) making it fast enough to create the training batches during training, mitigating the need for saving preprocessed training examples to disk.
The bigwig-loader installation instructions and source code can be accessed at https://github.com/pfizer-opensource/bigwig-loader.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>38175786</pmid><doi>10.1093/bioinformatics/btad767</doi><orcidid>https://orcid.org/0000-0003-3316-5525</orcidid><orcidid>https://orcid.org/0000-0002-4618-0647</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1367-4811 |
ispartof | Bioinformatics (Oxford, England), 2024-01, Vol.40 (1) |
issn | 1367-4811; 1367-4803 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10782802 |
source | MEDLINE; Directory of Open Access Journals; Oxford University Press Open Access; PubMed Central; Alma/SFX Local Collection; EZB Electronic Journals Library |
subjects | Applications Note; Epigenesis, Genetic; Epigenomics; Software |
title | A fast machine learning dataloader for epigenetic tracks from BigWig files |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-17T16%3A30%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20fast%20machine%20learning%20dataloader%20for%20epigenetic%20tracks%20from%20BigWig%20files&rft.jtitle=Bioinformatics%20(Oxford,%20England)&rft.au=Retel,%20Joren%20Sebastian&rft.date=2024-01-02&rft.volume=40&rft.issue=1&rft.issn=1367-4811&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/btad767&rft_dat=%3Cproquest_pubme%3E2913449814%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2913449814&rft_id=info:pmid/38175786&rfr_iscdi=true |