Prefix-free graphs and suffix array construction in sublinear space

A recent paradigm shift in bioinformatics from a single reference genome to a pangenome brought with it several graph structures. These graph structures must implement operations, such as efficient construction from multiple genomes and read mapping. Read mapping is a well-studied problem in sequent...

Bibliographic details
Main authors: Baláž, Andrej; Petescia, Alessia
Format: Article
Language: eng
creator Baláž, Andrej
Petescia, Alessia
description A recent paradigm shift in bioinformatics from a single reference genome to a pangenome brought with it several graph structures. These graph structures must implement operations, such as efficient construction from multiple genomes and read mapping. Read mapping is a well-studied problem in sequential data, and, together with data structures such as suffix array and Burrows-Wheeler transform, allows for efficient computation. Attempts to achieve comparatively high performance on graphs bring many complications since the common data structures on strings are not easily obtainable for graphs. In this work, we introduce prefix-free graphs, a novel pangenomic data structure; we show how to construct them and how to use them to obtain well-known data structures from stringology in sublinear space, allowing for many efficient operations on pangenomes.
doi_str_mv 10.48550/arxiv.2306.14689
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2306_14689</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2306_14689</sourcerecordid><originalsourceid>FETCH-LOGICAL-a679-b1eeac277f80e67bbeeda57c8889f01fcef9e358299a9985337f70bb340c54b03</originalsourceid><addsrcrecordid>eNotz8FKAzEUBdBsXEjrB7gyPzBjZjKZJMsyqBUKdtH98JK-p4GaDi-ttH9vW11duBcuHCEeG1V3zhj1DHxKP3WrVV83Xe_8vRjWjJROFTGi_GSYvoqEvJXlSJdaAjOcZdzncuBjPKR9lilfxrBLGYFlmSDiXNwR7Ao-_OdMbF5fNsOyWn28vQ-LVQW99VVoECG21pJT2NsQELdgbHTOeVINRSSP2rjWe_DeGa0tWRWC7lQ0XVB6Jp7-bm-KceL0DXwer5rxptG_kUZF2w</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Prefix-free graphs and suffix array construction in sublinear space</title><source>arXiv.org</source><creator>Baláž, Andrej ; Petescia, Alessia</creator><creatorcontrib>Baláž, Andrej ; Petescia, Alessia</creatorcontrib><description>A recent paradigm shift in bioinformatics from a single reference genome to a pangenome brought with it several graph structures. These graph structures must implement operations, such as efficient construction from multiple genomes and read mapping. Read mapping is a well-studied problem in sequential data, and, together with data structures such as suffix array and Burrows-Wheeler transform, allows for efficient computation. Attempts to achieve comparatively high performance on graphs bring many complications since the common data structures on strings are not easily obtainable for graphs. 
In this work, we introduce prefix-free graphs, a novel pangenomic data structure; we show how to construct them and how to use them to obtain well-known data structures from stringology in sublinear space, allowing for many efficient operations on pangenomes.</description><identifier>DOI: 10.48550/arxiv.2306.14689</identifier><language>eng</language><subject>Computer Science - Data Structures and Algorithms</subject><creationdate>2023-06</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2306.14689$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2306.14689$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Baláž, Andrej</creatorcontrib><creatorcontrib>Petescia, Alessia</creatorcontrib><title>Prefix-free graphs and suffix array construction in sublinear space</title><description>A recent paradigm shift in bioinformatics from a single reference genome to a pangenome brought with it several graph structures. These graph structures must implement operations, such as efficient construction from multiple genomes and read mapping. Read mapping is a well-studied problem in sequential data, and, together with data structures such as suffix array and Burrows-Wheeler transform, allows for efficient computation. Attempts to achieve comparatively high performance on graphs bring many complications since the common data structures on strings are not easily obtainable for graphs. 
In this work, we introduce prefix-free graphs, a novel pangenomic data structure; we show how to construct them and how to use them to obtain well-known data structures from stringology in sublinear space, allowing for many efficient operations on pangenomes.</description><subject>Computer Science - Data Structures and Algorithms</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz8FKAzEUBdBsXEjrB7gyPzBjZjKZJMsyqBUKdtH98JK-p4GaDi-ttH9vW11duBcuHCEeG1V3zhj1DHxKP3WrVV83Xe_8vRjWjJROFTGi_GSYvoqEvJXlSJdaAjOcZdzncuBjPKR9lilfxrBLGYFlmSDiXNwR7Ao-_OdMbF5fNsOyWn28vQ-LVQW99VVoECG21pJT2NsQELdgbHTOeVINRSSP2rjWe_DeGa0tWRWC7lQ0XVB6Jp7-bm-KceL0DXwer5rxptG_kUZF2w</recordid><startdate>20230626</startdate><enddate>20230626</enddate><creator>Baláž, Andrej</creator><creator>Petescia, Alessia</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230626</creationdate><title>Prefix-free graphs and suffix array construction in sublinear space</title><author>Baláž, Andrej ; Petescia, Alessia</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a679-b1eeac277f80e67bbeeda57c8889f01fcef9e358299a9985337f70bb340c54b03</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Data Structures and Algorithms</topic><toplevel>online_resources</toplevel><creatorcontrib>Baláž, Andrej</creatorcontrib><creatorcontrib>Petescia, Alessia</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Baláž, Andrej</au><au>Petescia, Alessia</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Prefix-free graphs and suffix array construction in sublinear 
space</atitle><date>2023-06-26</date><risdate>2023</risdate><abstract>A recent paradigm shift in bioinformatics from a single reference genome to a pangenome brought with it several graph structures. These graph structures must implement operations, such as efficient construction from multiple genomes and read mapping. Read mapping is a well-studied problem in sequential data, and, together with data structures such as suffix array and Burrows-Wheeler transform, allows for efficient computation. Attempts to achieve comparatively high performance on graphs bring many complications since the common data structures on strings are not easily obtainable for graphs. In this work, we introduce prefix-free graphs, a novel pangenomic data structure; we show how to construct them and how to use them to obtain well-known data structures from stringology in sublinear space, allowing for many efficient operations on pangenomes.</abstract><doi>10.48550/arxiv.2306.14689</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2306.14689
language eng
recordid cdi_arxiv_primary_2306_14689
source arXiv.org
subjects Computer Science - Data Structures and Algorithms
title Prefix-free graphs and suffix array construction in sublinear space
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T07%3A14%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Prefix-free%20graphs%20and%20suffix%20array%20construction%20in%20sublinear%20space&rft.au=Bal%C3%A1%C5%BE,%20Andrej&rft.date=2023-06-26&rft_id=info:doi/10.48550/arxiv.2306.14689&rft_dat=%3Carxiv_GOX%3E2306_14689%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true