Internal Dictionary Matching
We introduce data structures answering queries concerning the occurrences of patterns from a given dictionary D in fragments of a given string T of length n. The dictionary is internal in the sense that each pattern in D is given as a fragment of T. This way, D takes space proportional to the numb...
Published in: | Algorithmica 2021-07, Vol.83 (7), p.2142-2169 |
---|---|
Main authors: | Charalampopoulos, Panagiotis; Kociumaka, Tomasz; Mohamed, Manal; Radoszewski, Jakub; Rytter, Wojciech; Waleń, Tomasz |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 2169 |
---|---|
container_issue | 7 |
container_start_page | 2142 |
container_title | Algorithmica |
container_volume | 83 |
creator | Charalampopoulos, Panagiotis ; Kociumaka, Tomasz ; Mohamed, Manal ; Radoszewski, Jakub ; Rytter, Wojciech ; Waleń, Tomasz |
description | We introduce data structures answering queries concerning the occurrences of patterns from a given dictionary $D$ in fragments of a given string $T$ of length $n$. The dictionary is internal in the sense that each pattern in $D$ is given as a fragment of $T$. This way, $D$ takes space proportional to the number of patterns $d = \lvert D \rvert$ rather than their total length, which could be $\Theta(n \cdot d)$. In particular, we consider the following types of queries: reporting and counting all occurrences of patterns from $D$ in a fragment $T[i..j]$ and reporting distinct patterns from $D$ that occur in $T[i..j]$. We show how to construct, in $O((n+d) \log^{O(1)} n)$ time, a data structure that answers each of these queries in time $O(\log^{O(1)} n + \lvert \mathrm{output} \rvert)$. The case of counting patterns is much more involved and needs a combination of a locally consistent parsing with orthogonal range searching. Reporting distinct patterns, on the other hand, uses the structure of maximal repetitions in strings. Finally, we provide tight (up to subpolynomial factors) upper and lower bounds for the case of a dynamic dictionary. |
doi_str_mv | 10.1007/s00453-021-00821-y |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2544228756</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2544228756</sourcerecordid><originalsourceid>FETCH-LOGICAL-c363t-45e528347bd463fab3e5785a7be3a7e06559496fb218489e90fb5d356adfc2713</originalsourceid><addsrcrecordid>eNp9kDtPwzAUhS0EEqHwBxBDJWbD9ePazojKq1IRC8yWk9glVUmKnQ7597gEiY3l3uV8R0cfIZcMbhiAvk0AEgUFziiAyXc8IgWTglNAyY5JAUwbKhXTp-QspQ0A47pUBbladoOPndvO79t6aPvOxXH-4ob6o-3W5-QkuG3yF79_Rt4fH94Wz3T1-rRc3K1oLZQYqESP3Aipq0YqEVwlPGqDTldeOO1BIZayVKHizEhT-hJChY1A5ZpQc83EjFxPvbvYf-19Guym3x9GJctRSs6NRpVTfErVsU8p-mB3sf3Mey0De7BgJws2W7A_FuyYITFBKYe7tY9_1f9Q35OkXdU</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2544228756</pqid></control><display><type>article</type><title>Internal Dictionary Matching</title><source>SpringerNature Journals</source><creator>Charalampopoulos, Panagiotis ; Kociumaka, Tomasz ; Mohamed, Manal ; Radoszewski, Jakub ; Rytter, Wojciech ; Waleń, Tomasz</creator><creatorcontrib>Charalampopoulos, Panagiotis ; Kociumaka, Tomasz ; Mohamed, Manal ; Radoszewski, Jakub ; Rytter, Wojciech ; Waleń, Tomasz</creatorcontrib><description>We introduce data structures answering queries concerning the occurrences of patterns from a given dictionary
$D$ in fragments of a given string $T$ of length $n$. The dictionary is internal in the sense that each pattern in $D$ is given as a fragment of $T$. This way, $D$ takes space proportional to the number of patterns $d = \lvert D \rvert$ rather than their total length, which could be $\Theta(n \cdot d)$. In particular, we consider the following types of queries: reporting and counting all occurrences of patterns from $D$ in a fragment $T[i..j]$ and reporting distinct patterns from $D$ that occur in $T[i..j]$. We show how to construct, in $O((n+d) \log^{O(1)} n)$ time, a data structure that answers each of these queries in time $O(\log^{O(1)} n + \lvert \mathrm{output} \rvert)$
. The case of counting patterns is much more involved and needs a combination of a locally consistent parsing with orthogonal range searching. Reporting distinct patterns, on the other hand, uses the structure of maximal repetitions in strings. Finally, we provide tight—up to subpolynomial factors—upper and lower bounds for the case of a dynamic dictionary.</description><identifier>ISSN: 0178-4617</identifier><identifier>EISSN: 1432-0541</identifier><identifier>DOI: 10.1007/s00453-021-00821-y</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Algorithm Analysis and Problem Complexity ; Algorithms ; Computer Science ; Computer Systems Organization and Communication Networks ; Data structures ; Data Structures and Information Theory ; Dictionaries ; Lower bounds ; Mathematics of Computing ; Queries ; Strings ; Theory of Computation</subject><ispartof>Algorithmica, 2021-07, Vol.83 (7), p.2142-2169</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021</rights><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c363t-45e528347bd463fab3e5785a7be3a7e06559496fb218489e90fb5d356adfc2713</citedby><cites>FETCH-LOGICAL-c363t-45e528347bd463fab3e5785a7be3a7e06559496fb218489e90fb5d356adfc2713</cites><orcidid>0000-0002-0067-6401 ; 0000-0002-2477-1702 ; 0000-0002-6024-1557 ; 0000-0002-7369-3309 ; 0000-0002-9162-6724 ; 
0000-0002-1435-5051</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s00453-021-00821-y$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s00453-021-00821-y$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,41488,42557,51319</link.rule.ids></links><search><creatorcontrib>Charalampopoulos, Panagiotis</creatorcontrib><creatorcontrib>Kociumaka, Tomasz</creatorcontrib><creatorcontrib>Mohamed, Manal</creatorcontrib><creatorcontrib>Radoszewski, Jakub</creatorcontrib><creatorcontrib>Rytter, Wojciech</creatorcontrib><creatorcontrib>Waleń, Tomasz</creatorcontrib><title>Internal Dictionary Matching</title><title>Algorithmica</title><addtitle>Algorithmica</addtitle><description>We introduce data structures answering queries concerning the occurrences of patterns from a given dictionary
$D$ in fragments of a given string $T$ of length $n$. The dictionary is internal in the sense that each pattern in $D$ is given as a fragment of $T$. This way, $D$ takes space proportional to the number of patterns $d = \lvert D \rvert$ rather than their total length, which could be $\Theta(n \cdot d)$. In particular, we consider the following types of queries: reporting and counting all occurrences of patterns from $D$ in a fragment $T[i..j]$ and reporting distinct patterns from $D$ that occur in $T[i..j]$. We show how to construct, in $O((n+d) \log^{O(1)} n)$ time, a data structure that answers each of these queries in time $O(\log^{O(1)} n + \lvert \mathrm{output} \rvert)$
. The case of counting patterns is much more involved and needs a combination of a locally consistent parsing with orthogonal range searching. Reporting distinct patterns, on the other hand, uses the structure of maximal repetitions in strings. Finally, we provide tight—up to subpolynomial factors—upper and lower bounds for the case of a dynamic dictionary.</description><subject>Algorithm Analysis and Problem Complexity</subject><subject>Algorithms</subject><subject>Computer Science</subject><subject>Computer Systems Organization and Communication Networks</subject><subject>Data structures</subject><subject>Data Structures and Information Theory</subject><subject>Dictionaries</subject><subject>Lower bounds</subject><subject>Mathematics of Computing</subject><subject>Queries</subject><subject>Strings</subject><subject>Theory of Computation</subject><issn>0178-4617</issn><issn>1432-0541</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9kDtPwzAUhS0EEqHwBxBDJWbD9ePazojKq1IRC8yWk9glVUmKnQ7597gEiY3l3uV8R0cfIZcMbhiAvk0AEgUFziiAyXc8IgWTglNAyY5JAUwbKhXTp-QspQ0A47pUBbladoOPndvO79t6aPvOxXH-4ob6o-3W5-QkuG3yF79_Rt4fH94Wz3T1-rRc3K1oLZQYqESP3Aipq0YqEVwlPGqDTldeOO1BIZayVKHizEhT-hJChY1A5ZpQc83EjFxPvbvYf-19Guym3x9GJctRSs6NRpVTfErVsU8p-mB3sf3Mey0De7BgJws2W7A_FuyYITFBKYe7tY9_1f9Q35OkXdU</recordid><startdate>20210701</startdate><enddate>20210701</enddate><creator>Charalampopoulos, Panagiotis</creator><creator>Kociumaka, Tomasz</creator><creator>Mohamed, Manal</creator><creator>Radoszewski, Jakub</creator><creator>Rytter, Wojciech</creator><creator>Waleń, Tomasz</creator><general>Springer US</general><general>Springer Nature 
B.V</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-0067-6401</orcidid><orcidid>https://orcid.org/0000-0002-2477-1702</orcidid><orcidid>https://orcid.org/0000-0002-6024-1557</orcidid><orcidid>https://orcid.org/0000-0002-7369-3309</orcidid><orcidid>https://orcid.org/0000-0002-9162-6724</orcidid><orcidid>https://orcid.org/0000-0002-1435-5051</orcidid></search><sort><creationdate>20210701</creationdate><title>Internal Dictionary Matching</title><author>Charalampopoulos, Panagiotis ; Kociumaka, Tomasz ; Mohamed, Manal ; Radoszewski, Jakub ; Rytter, Wojciech ; Waleń, Tomasz</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c363t-45e528347bd463fab3e5785a7be3a7e06559496fb218489e90fb5d356adfc2713</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithm Analysis and Problem Complexity</topic><topic>Algorithms</topic><topic>Computer Science</topic><topic>Computer Systems Organization and Communication Networks</topic><topic>Data structures</topic><topic>Data Structures and Information Theory</topic><topic>Dictionaries</topic><topic>Lower bounds</topic><topic>Mathematics of Computing</topic><topic>Queries</topic><topic>Strings</topic><topic>Theory of Computation</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Charalampopoulos, Panagiotis</creatorcontrib><creatorcontrib>Kociumaka, Tomasz</creatorcontrib><creatorcontrib>Mohamed, Manal</creatorcontrib><creatorcontrib>Radoszewski, Jakub</creatorcontrib><creatorcontrib>Rytter, Wojciech</creatorcontrib><creatorcontrib>Waleń, Tomasz</creatorcontrib><collection>CrossRef</collection><jtitle>Algorithmica</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Charalampopoulos, Panagiotis</au><au>Kociumaka, Tomasz</au><au>Mohamed, Manal</au><au>Radoszewski, 
Jakub</au><au>Rytter, Wojciech</au><au>Waleń, Tomasz</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Internal Dictionary Matching</atitle><jtitle>Algorithmica</jtitle><stitle>Algorithmica</stitle><date>2021-07-01</date><risdate>2021</risdate><volume>83</volume><issue>7</issue><spage>2142</spage><epage>2169</epage><pages>2142-2169</pages><issn>0178-4617</issn><eissn>1432-0541</eissn><abstract>We introduce data structures answering queries concerning the occurrences of patterns from a given dictionary
$D$ in fragments of a given string $T$ of length $n$. The dictionary is internal in the sense that each pattern in $D$ is given as a fragment of $T$. This way, $D$ takes space proportional to the number of patterns $d = \lvert D \rvert$ rather than their total length, which could be $\Theta(n \cdot d)$. In particular, we consider the following types of queries: reporting and counting all occurrences of patterns from $D$ in a fragment $T[i..j]$ and reporting distinct patterns from $D$ that occur in $T[i..j]$. We show how to construct, in $O((n+d) \log^{O(1)} n)$ time, a data structure that answers each of these queries in time $O(\log^{O(1)} n + \lvert \mathrm{output} \rvert)$
. The case of counting patterns is much more involved and needs a combination of a locally consistent parsing with orthogonal range searching. Reporting distinct patterns, on the other hand, uses the structure of maximal repetitions in strings. Finally, we provide tight—up to subpolynomial factors—upper and lower bounds for the case of a dynamic dictionary.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s00453-021-00821-y</doi><tpages>28</tpages><orcidid>https://orcid.org/0000-0002-0067-6401</orcidid><orcidid>https://orcid.org/0000-0002-2477-1702</orcidid><orcidid>https://orcid.org/0000-0002-6024-1557</orcidid><orcidid>https://orcid.org/0000-0002-7369-3309</orcidid><orcidid>https://orcid.org/0000-0002-9162-6724</orcidid><orcidid>https://orcid.org/0000-0002-1435-5051</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0178-4617 |
ispartof | Algorithmica, 2021-07, Vol.83 (7), p.2142-2169 |
issn | 0178-4617 ; 1432-0541 |
language | eng |
recordid | cdi_proquest_journals_2544228756 |
source | SpringerNature Journals |
subjects | Algorithm Analysis and Problem Complexity ; Algorithms ; Computer Science ; Computer Systems Organization and Communication Networks ; Data structures ; Data Structures and Information Theory ; Dictionaries ; Lower bounds ; Mathematics of Computing ; Queries ; Strings ; Theory of Computation |
title | Internal Dictionary Matching |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T10%3A02%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Internal%20Dictionary%20Matching&rft.jtitle=Algorithmica&rft.au=Charalampopoulos,%20Panagiotis&rft.date=2021-07-01&rft.volume=83&rft.issue=7&rft.spage=2142&rft.epage=2169&rft.pages=2142-2169&rft.issn=0178-4617&rft.eissn=1432-0541&rft_id=info:doi/10.1007/s00453-021-00821-y&rft_dat=%3Cproquest_cross%3E2544228756%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2544228756&rft_id=info:pmid/&rfr_iscdi=true |
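
The three query types named in the abstract can be made concrete with a naive baseline. The sketch below is not the data structure from the article: it scans the fragment directly and is far slower than the polylogarithmic bounds stated in the abstract. The function names, the 0-based inclusive indexing, and the representation of the internal dictionary as a list of `(start, end)` fragments of `T` are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: a pattern is a fragment of T, stored as
# 0-based inclusive (start, end) positions, so D needs O(1) space per pattern.
Fragment = Tuple[int, int]


def report_occurrences(T: str, D: List[Fragment], i: int, j: int) -> List[Tuple[int, int]]:
    """Return (pattern_index, start) for every occurrence of a pattern of D inside T[i..j]."""
    result = []
    for k, (a, b) in enumerate(D):
        pattern = T[a:b + 1]
        m = len(pattern)
        # An occurrence starting at s lies inside T[i..j] iff i <= s and s + m - 1 <= j.
        for s in range(i, j - m + 2):
            if T.startswith(pattern, s):
                result.append((k, s))
    return result


def count_occurrences(T: str, D: List[Fragment], i: int, j: int) -> int:
    """Count all occurrences of patterns of D inside T[i..j]."""
    return len(report_occurrences(T, D, i, j))


def report_distinct(T: str, D: List[Fragment], i: int, j: int) -> List[str]:
    """Return the distinct patterns of D (as strings, sorted) that occur inside T[i..j]."""
    hits = report_occurrences(T, D, i, j)
    return sorted({T[D[k][0]:D[k][1] + 1] for k, _ in hits})


if __name__ == "__main__":
    T = "abaab"
    D = [(0, 0), (3, 4), (0, 2)]  # the patterns "a", "ab", "aba"
    print(report_occurrences(T, D, 1, 4))  # [(0, 2), (0, 3), (1, 3)]
    print(count_occurrences(T, D, 1, 4))   # 3
    print(report_distinct(T, D, 1, 4))     # ['a', 'ab']
```

For the fragment `T[1..4] = "baab"`, the pattern `"aba"` is not reported because none of its occurrences fits entirely inside the fragment. This baseline is only a reference against which a faster structure could be checked.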