An Introduction to MPEG-G: The First Open ISO/IEC Standard for the Compression and Exchange of Genomic Sequencing Data

Bibliographic Details
Published in: Proceedings of the IEEE 2021-09, Vol.109 (9), p.1607-1622
Main authors: Voges, Jan; Hernaez, Mikel; Mattavelli, Marco; Ostermann, Jorn
Format: Article
Language: English
Online access: Full text
container_end_page 1622
container_issue 9
container_start_page 1607
container_title Proceedings of the IEEE
container_volume 109
creator Voges, Jan
Hernaez, Mikel
Mattavelli, Marco
Ostermann, Jorn
description The development and progress of high-throughput sequencing technologies have transformed the sequencing of DNA from a scientific research challenge to practice. With the release of the latest generation of sequencing machines, the cost of sequencing a whole human genome has dropped to less than $600. Such achievements open the door to personalized medicine, where it is expected that genomic information of patients will be analyzed as a standard practice. However, the associated costs, related to storing, transmitting, and processing the large volumes of data, are already comparable to the costs of sequencing. To support the design of new and interoperable solutions for the representation, compression, and management of genomic sequencing data, the Moving Picture Experts Group (MPEG) jointly with working group 5 of ISO/TC276 "Biotechnology" has started to produce the ISO/IEC 23092 series, known as MPEG-G. MPEG-G does not only offer higher levels of compression compared with the state of the art but it also provides new functionalities, such as built-in support for random access in the compressed domain, support for data protection mechanisms, flexible storage, and streaming capabilities. MPEG-G only specifies the decoding syntax of compressed bitstreams, as well as a file format and a transport format. This allows for the development of new encoding solutions with higher degrees of optimization while maintaining compatibility with any existing MPEG-G decoder.
doi_str_mv 10.1109/JPROC.2021.3082027
format Article
publisher New York: IEEE
rights Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021
eissn 1558-2256
coden IEEPAD
ieee_id 9455132
date 2021-09-01
tpages 16
oa free_for_read
toplevel peer_reviewed
online_resources
orcid https://orcid.org/0000-0002-7742-0332
https://orcid.org/0000-0003-0443-2305
https://orcid.org/0000-0002-6743-3324
https://orcid.org/0000-0002-6080-660X
fulltext fulltext
identifier ISSN: 0018-9219
ispartof Proceedings of the IEEE, 2021-09, Vol.109 (9), p.1607-1622
issn 0018-9219
1558-2256
language eng
recordid cdi_proquest_journals_2562950964
source IEEE Electronic Library (IEL)
subjects Bioinformatics
computational biology
Computational modeling
Cost analysis
data compression
Decoding
Deoxyribonucleic acid
DNA
Encoding
Format
Gene sequencing
Genomics
Metadata
MPEG encoders
Optimization
Random access
Sequential analysis
standardization
Storage
Transform coding
title An Introduction to MPEG-G: The First Open ISO/IEC Standard for the Compression and Exchange of Genomic Sequencing Data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T11%3A59%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20Introduction%20to%20MPEG-G:%20The%20First%20Open%20ISO/IEC%20Standard%20for%20the%20Compression%20and%20Exchange%20of%20Genomic%20Sequencing%20Data&rft.jtitle=Proceedings%20of%20the%20IEEE&rft.au=Voges,%20Jan&rft.date=2021-09-01&rft.volume=109&rft.issue=9&rft.spage=1607&rft.epage=1622&rft.pages=1607-1622&rft.issn=0018-9219&rft.eissn=1558-2256&rft.coden=IEEPAD&rft_id=info:doi/10.1109/JPROC.2021.3082027&rft_dat=%3Cproquest_cross%3E2562950964%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2562950964&rft_id=info:pmid/&rft_ieee_id=9455132&rfr_iscdi=true