Multi-strand Reconstruction from Substrings
The problem of string reconstruction based on its substrings spectrum has received significant attention recently due to its applicability to DNA data storage and sequencing. In contrast to previous works, we consider in this paper a setup of this problem where multiple strings are reconstructed tog...
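Published in: | arXiv.org 2021-08 |
---|---|
Main authors: | Yehezkeally, Yonatan ; Marcovich, Sagi ; Yaakobi, Eitan |
Format: | Article |
Language: | eng |
Subjects: | Asymptotic properties ; Computer Science - Information Theory ; Data storage ; Gene sequencing ; Lower bounds ; Mathematics - Information Theory ; Reconstruction ; Strings |
Online access: | Full text |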
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Yehezkeally, Yonatan ; Marcovich, Sagi ; Yaakobi, Eitan |
description | The problem of string reconstruction based on its substrings spectrum has received significant attention recently due to its applicability to DNA data storage and sequencing. In contrast to previous works, we consider in this paper a setup of this problem where multiple strings are reconstructed together. Given a multiset \(S\) of strings, all their substrings of some fixed length \(\ell\), defined as the \(\ell\)-profile of \(S\), are received and the goal is to reconstruct all strings in \(S\). A multi-strand \(\ell\)-reconstruction code is a set of multisets such that every element \(S\) can be reconstructed from its \(\ell\)-profile. Given the number of strings~\(k\) and their length~\(n\), we first find a lower bound on the value of \(\ell\) necessary for existence of multi-strand \(\ell\)-reconstruction codes with non-vanishing asymptotic rate. We then present two constructions of such codes and show that their rates approach~\(1\) for values of \(\ell\) that asymptotically behave like the lower bound. |
doi_str_mv | 10.48550/arxiv.2108.11725 |
format | Article |
fullrecord | <record><control><sourceid>proquest_arxiv</sourceid><recordid>TN_cdi_arxiv_primary_2108_11725</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2565273085</sourcerecordid><originalsourceid>FETCH-LOGICAL-a525-fad60953305479023063cbcab98b79076f7144d8cbde3199e65c5af5597476743</originalsourceid><addsrcrecordid>eNotj09Lw0AQxRdBsNR-AE8GPEri7M7O_jlKUStUBO09bDaJpLRJ3U1Ev71r62nevPcY5sfYFYdCGiK4c-G7-yoEB1NwrgWdsZlA5LmRQlywRYxbABAqJYQzdvsy7cYuj2NwfZ29NX7ok5782A191oZhn71PVXK6_iNesvPW7WKz-J9ztnl82CxX-fr16Xl5v84dCcpbVyuwhAgktQWBoNBX3lXWVGnXqtVcytr4qm6QW9so8uRaIqulVlrinF2fzh5JykPo9i78lH9E5ZEoNW5OjUMYPqcmjuV2mEKffioFKRIawRD-AhbgTKc</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2565273085</pqid></control><display><type>article</type><title>Multi-strand Reconstruction from Substrings</title><source>arXiv.org</source><source>Free E- Journals</source><creator>Yehezkeally, Yonatan ; Marcovich, Sagi ; Yaakobi, Eitan</creator><creatorcontrib>Yehezkeally, Yonatan ; Marcovich, Sagi ; Yaakobi, Eitan</creatorcontrib><description>The problem of string reconstruction based on its substrings spectrum has received significant attention recently due to its applicability to DNA data storage and sequencing. In contrast to previous works, we consider in this paper a setup of this problem where multiple strings are reconstructed together. Given a multiset \(S\) of strings, all their substrings of some fixed length \(\ell\), defined as the \(\ell\)-profile of \(S\), are received and the goal is to reconstruct all strings in \(S\). A multi-strand \(\ell\)-reconstruction code is a set of multisets such that every element \(S\) can be reconstructed from its \(\ell\)-profile. 
Given the number of strings~\(k\) and their length~\(n\), we first find a lower bound on the value of \(\ell\) necessary for existence of multi-strand \(\ell\)-reconstruction codes with non-vanishing asymptotic rate. We then present two constructions of such codes and show that their rates approach~\(1\) for values of \(\ell\) that asymptotically behave like the lower bound.</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.2108.11725</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Asymptotic properties ; Computer Science - Information Theory ; Data storage ; Gene sequencing ; Lower bounds ; Mathematics - Information Theory ; Reconstruction ; Strings</subject><ispartof>arXiv.org, 2021-08</ispartof><rights>2021. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,780,881,27904</link.rule.ids><backlink>$$Uhttps://doi.org/10.48550/arXiv.2108.11725$$DView paper in arXiv$$Hfree_for_read</backlink><backlink>$$Uhttps://doi.org/10.1109/ITW48936.2021.9611486$$DView published paper (Access to full text may be restricted)$$Hfree_for_read</backlink></links><search><creatorcontrib>Yehezkeally, Yonatan</creatorcontrib><creatorcontrib>Marcovich, Sagi</creatorcontrib><creatorcontrib>Yaakobi, Eitan</creatorcontrib><title>Multi-strand Reconstruction from Substrings</title><title>arXiv.org</title><description>The problem of string reconstruction based on its substrings spectrum has 
received significant attention recently due to its applicability to DNA data storage and sequencing. In contrast to previous works, we consider in this paper a setup of this problem where multiple strings are reconstructed together. Given a multiset \(S\) of strings, all their substrings of some fixed length \(\ell\), defined as the \(\ell\)-profile of \(S\), are received and the goal is to reconstruct all strings in \(S\). A multi-strand \(\ell\)-reconstruction code is a set of multisets such that every element \(S\) can be reconstructed from its \(\ell\)-profile. Given the number of strings~\(k\) and their length~\(n\), we first find a lower bound on the value of \(\ell\) necessary for existence of multi-strand \(\ell\)-reconstruction codes with non-vanishing asymptotic rate. We then present two constructions of such codes and show that their rates approach~\(1\) for values of \(\ell\) that asymptotically behave like the lower bound.</description><subject>Asymptotic properties</subject><subject>Computer Science - Information Theory</subject><subject>Data storage</subject><subject>Gene sequencing</subject><subject>Lower bounds</subject><subject>Mathematics - Information 
Theory</subject><subject>Reconstruction</subject><subject>Strings</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GOX</sourceid><recordid>eNotj09Lw0AQxRdBsNR-AE8GPEri7M7O_jlKUStUBO09bDaJpLRJ3U1Ev71r62nevPcY5sfYFYdCGiK4c-G7-yoEB1NwrgWdsZlA5LmRQlywRYxbABAqJYQzdvsy7cYuj2NwfZ29NX7ok5782A191oZhn71PVXK6_iNesvPW7WKz-J9ztnl82CxX-fr16Xl5v84dCcpbVyuwhAgktQWBoNBX3lXWVGnXqtVcytr4qm6QW9so8uRaIqulVlrinF2fzh5JykPo9i78lH9E5ZEoNW5OjUMYPqcmjuV2mEKffioFKRIawRD-AhbgTKc</recordid><startdate>20210826</startdate><enddate>20210826</enddate><creator>Yehezkeally, Yonatan</creator><creator>Marcovich, Sagi</creator><creator>Yaakobi, Eitan</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>AKY</scope><scope>AKZ</scope><scope>GOX</scope></search><sort><creationdate>20210826</creationdate><title>Multi-strand Reconstruction from Substrings</title><author>Yehezkeally, Yonatan ; Marcovich, Sagi ; Yaakobi, Eitan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a525-fad60953305479023063cbcab98b79076f7144d8cbde3199e65c5af5597476743</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Asymptotic properties</topic><topic>Computer Science - Information Theory</topic><topic>Data storage</topic><topic>Gene sequencing</topic><topic>Lower 
bounds</topic><topic>Mathematics - Information Theory</topic><topic>Reconstruction</topic><topic>Strings</topic><toplevel>online_resources</toplevel><creatorcontrib>Yehezkeally, Yonatan</creatorcontrib><creatorcontrib>Marcovich, Sagi</creatorcontrib><creatorcontrib>Yaakobi, Eitan</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><collection>arXiv Computer Science</collection><collection>arXiv Mathematics</collection><collection>arXiv.org</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Yehezkeally, Yonatan</au><au>Marcovich, Sagi</au><au>Yaakobi, Eitan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Multi-strand Reconstruction from Substrings</atitle><jtitle>arXiv.org</jtitle><date>2021-08-26</date><risdate>2021</risdate><eissn>2331-8422</eissn><abstract>The problem of string reconstruction based on its substrings spectrum has received significant attention recently 
due to its applicability to DNA data storage and sequencing. In contrast to previous works, we consider in this paper a setup of this problem where multiple strings are reconstructed together. Given a multiset \(S\) of strings, all their substrings of some fixed length \(\ell\), defined as the \(\ell\)-profile of \(S\), are received and the goal is to reconstruct all strings in \(S\). A multi-strand \(\ell\)-reconstruction code is a set of multisets such that every element \(S\) can be reconstructed from its \(\ell\)-profile. Given the number of strings~\(k\) and their length~\(n\), we first find a lower bound on the value of \(\ell\) necessary for existence of multi-strand \(\ell\)-reconstruction codes with non-vanishing asymptotic rate. We then present two constructions of such codes and show that their rates approach~\(1\) for values of \(\ell\) that asymptotically behave like the lower bound.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.2108.11725</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2021-08 |
issn | 2331-8422 |
language | eng |
recordid | cdi_arxiv_primary_2108_11725 |
source | arXiv.org; Free E- Journals |
subjects | Asymptotic properties ; Computer Science - Information Theory ; Data storage ; Gene sequencing ; Lower bounds ; Mathematics - Information Theory ; Reconstruction ; Strings |
title | Multi-strand Reconstruction from Substrings |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T19%3A00%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Multi-strand%20Reconstruction%20from%20Substrings&rft.jtitle=arXiv.org&rft.au=Yehezkeally,%20Yonatan&rft.date=2021-08-26&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2108.11725&rft_dat=%3Cproquest_arxiv%3E2565273085%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2565273085&rft_id=info:pmid/&rfr_iscdi=true |
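
The description in this record defines the \(\ell\)-profile of a multiset \(S\) as all length-\(\ell\) substrings of its strings. The following is a minimal Python sketch of that definition, not the authors' construction. The function name `l_profile`, the example strands, and the choice to count substrings with multiplicity (a `Counter`) are assumptions made for illustration only. It shows two different multisets that share a 2-profile, so they cannot be told apart at \(\ell = 2\), while their 3-profiles differ.

```python
from collections import Counter


def l_profile(strands, ell):
    """Return the multiset of all length-ell substrings of the given strands.

    Hypothetical helper: the profile is modelled as a Counter, i.e. each
    substring is kept together with its number of occurrences (an assumption).
    """
    profile = Counter()
    for s in strands:
        # Slide a window of width ell over each strand.
        for i in range(len(s) - ell + 1):
            profile[s[i:i + ell]] += 1
    return profile


# Two different multisets of k = 2 strands of length n = 3 (made-up examples).
S1 = ["AAB", "BAA"]
S2 = ["AAA", "ABA"]

# With ell = 2 both multisets yield the substrings AA, AA, AB, BA.
same_at_2 = l_profile(S1, 2) == l_profile(S2, 2)

# With ell = 3 each strand is its own substring, so the profiles differ.
same_at_3 = l_profile(S1, 3) == l_profile(S2, 3)

print(same_at_2, same_at_3)
```

In this toy case a code containing both `S1` and `S2` could not be a multi-strand 2-reconstruction code under the multiset assumption, which is the kind of ambiguity the lower bound on \(\ell\) in the abstract is about.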