Similarity search of time-warped subsequences via a suffix tree

This paper proposes an indexing technique for fast retrieval of similar subsequences using the time-warping distance. The time-warping distance is a more suitable similarity measure than the Euclidean distance in many applications where sequences may be of different lengths and/or different sampling...
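
As a hedged illustration of the measure named in the abstract, the sketch below computes a time-warping distance in its common dynamic-programming form. The function name `time_warping_distance` and the use of the absolute difference as the per-element cost are assumptions made for this example; the record itself does not give the paper's exact definition, and the suffix-tree index is not shown.

```python
from math import inf


def time_warping_distance(s, q):
    """Time-warping distance between two numeric sequences.

    Assumption: per-element cost is |a - b| and each element may be
    repeated (stretched) any number of times, as in the textbook
    dynamic-programming formulation.
    """
    n, m = len(s), len(q)
    # d[i][j] holds the distance between the prefixes s[:i] and q[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - q[j - 1])
            # Extend the cheapest of the three possible alignments:
            # stretch q's element, stretch s's element, or advance both.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

Under these assumptions, sequences of different lengths can still match exactly: `time_warping_distance([1, 2, 3], [1, 2, 2, 3])` is 0, because the repeated 2 is aligned with the single 2 in the shorter sequence. The Euclidean distance is not defined for this pair.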

Full description

Bibliographic details
Published in:	Information systems (Oxford), 2003-10, Vol.28 (7), p.867-883
Main authors:	Park, Sanghyun; Chu, Wesley W.; Yoon, Jeehee; Won, Jungim
Format:	Article
Language:	English
Subjects:	Categorization; Computerized information storage and retrieval; Indexing; Searching; Sequence database; Similarity measures; Similarity search; Suffix tree; Time-warping distance; Tree structures
Online access:	Full text

container_end_page	883
container_issue	7
container_start_page	867
container_title	Information systems (Oxford)
container_volume	28
creator	Park, Sanghyun; Chu, Wesley W.; Yoon, Jeehee; Won, Jungim
description	This paper proposes an indexing technique for fast retrieval of similar subsequences using the time-warping distance. The time-warping distance is a more suitable similarity measure than the Euclidean distance in many applications where sequences may be of different lengths and/or different sampling rates. The proposed indexing technique employs a disk-based suffix tree as an index structure and uses lower-bound distance functions to filter out dissimilar subsequences without false dismissals. To make the index structure compact and hence accelerate the query processing, it converts sequences in the continuous domain into sequences in the discrete domain and stores only a subset of the suffixes whose first values are different from those of the immediately preceding suffixes. Extensive experiments with real and synthetic data sequences revealed that the proposed approach significantly outperforms the sequential scan and LB scan approaches and scales well in a large volume of sequence databases.
doi_str_mv	10.1016/S0306-4379(02)00102-3
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_57589343</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0306437902001023</els_id><sourcerecordid>57589343</sourcerecordid><originalsourceid>FETCH-LOGICAL-c338t-3fd3bd8b65a752690249696f1415c17daf4bd2b54d966f229d472cdda7ae1b3e3</originalsourceid><addsrcrecordid>eNqFkEtLAzEUhYMoWKs_QZiV6CKaxySZrIqILyi4qK5DJrnByEynJtNq_70zrbh1deFwzuGeD6FzSq4pofJmQTiRuORKXxJ2RQglDPMDNKGV4lgSJQ_R5M9yjE5y_iCEMKH1BM0WsY2NTbHfFhlscu9FF4o-toC_bFqBL_K6zvC5hqWDXGyiLewghRC_iz4BnKKjYJsMZ793it4e7l_vnvD85fH57naOHedVj3nwvPZVLYVVgklNWKmlloGWVDiqvA1l7VktSq-lDIxpXyrmvLfKAq058Cm62PeuUjc8k3vTxuygaewSunU2QolK85IPRrE3utTlnCCYVYqtTVtDiRlxmR0uM7IwhJkdLjPmZvscDCs2EZLJLo6jfUzgeuO7-E_DD2UKcbc</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>57589343</pqid></control><display><type>article</type><title>Similarity search of time-warped subsequences via a suffix tree</title><source>Access via ScienceDirect (Elsevier)</source><creator>Park, Sanghyun ; Chu, Wesley W. ; Yoon, Jeehee ; Won, Jungim</creator><creatorcontrib>Park, Sanghyun ; Chu, Wesley W. ; Yoon, Jeehee ; Won, Jungim</creatorcontrib><description>This paper proposes an indexing technique for fast retrieval of similar subsequences using the time-warping distance. The time-warping distance is a more suitable similarity measure than the Euclidean distance in many applications where sequences may be of different lengths and/or different sampling rates. The proposed indexing technique employs a disk-based suffix tree as an index structure and uses lower-bound distance functions to filter out dissimilar subsequences without false dismissals. 
To make the index structure compact and hence accelerate the query processing, it converts sequences in the continuous domain into sequences in the discrete domain and stores only a subset of the suffixes whose first values are different from those of the immediately preceding suffixes. Extensive experiments with real and synthetic data sequences revealed that the proposed approach significantly outperforms the sequential scan and LB scan approaches and scales well in a large volume of sequence databases.</description><identifier>ISSN: 0306-4379</identifier><identifier>EISSN: 1873-6076</identifier><identifier>DOI: 10.1016/S0306-4379(02)00102-3</identifier><language>eng</language><publisher>Elsevier Ltd</publisher><subject>Categorization ; Computerized information storage and retrieval ; Indexing ; Searching ; Sequence database ; Similarity measures ; Similarity search ; Suffix tree ; Time-warping distance ; Tree structures</subject><ispartof>Information systems (Oxford), 2003-10, Vol.28 (7), p.867-883</ispartof><rights>2002 Elsevier Science Ltd</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c338t-3fd3bd8b65a752690249696f1415c17daf4bd2b54d966f229d472cdda7ae1b3e3</citedby><cites>FETCH-LOGICAL-c338t-3fd3bd8b65a752690249696f1415c17daf4bd2b54d966f229d472cdda7ae1b3e3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/S0306-4379(02)00102-3$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3550,27924,27925,45995</link.rule.ids></links><search><creatorcontrib>Park, Sanghyun</creatorcontrib><creatorcontrib>Chu, Wesley W.</creatorcontrib><creatorcontrib>Yoon, Jeehee</creatorcontrib><creatorcontrib>Won, Jungim</creatorcontrib><title>Similarity search of time-warped subsequences via a suffix tree</title><title>Information systems 
(Oxford)</title><description>This paper proposes an indexing technique for fast retrieval of similar subsequences using the time-warping distance. The time-warping distance is a more suitable similarity measure than the Euclidean distance in many applications where sequences may be of different lengths and/or different sampling rates. The proposed indexing technique employs a disk-based suffix tree as an index structure and uses lower-bound distance functions to filter out dissimilar subsequences without false dismissals. To make the index structure compact and hence accelerate the query processing, it converts sequences in the continuous domain into sequences in the discrete domain and stores only a subset of the suffixes whose first values are different from those of the immediately preceding suffixes. Extensive experiments with real and synthetic data sequences revealed that the proposed approach significantly outperforms the sequential scan and LB scan approaches and scales well in a large volume of sequence databases.</description><subject>Categorization</subject><subject>Computerized information storage and retrieval</subject><subject>Indexing</subject><subject>Searching</subject><subject>Sequence database</subject><subject>Similarity measures</subject><subject>Similarity search</subject><subject>Suffix tree</subject><subject>Time-warping distance</subject><subject>Tree 
structures</subject><issn>0306-4379</issn><issn>1873-6076</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2003</creationdate><recordtype>article</recordtype><recordid>eNqFkEtLAzEUhYMoWKs_QZiV6CKaxySZrIqILyi4qK5DJrnByEynJtNq_70zrbh1deFwzuGeD6FzSq4pofJmQTiRuORKXxJ2RQglDPMDNKGV4lgSJQ_R5M9yjE5y_iCEMKH1BM0WsY2NTbHfFhlscu9FF4o-toC_bFqBL_K6zvC5hqWDXGyiLewghRC_iz4BnKKjYJsMZ793it4e7l_vnvD85fH57naOHedVj3nwvPZVLYVVgklNWKmlloGWVDiqvA1l7VktSq-lDIxpXyrmvLfKAq058Cm62PeuUjc8k3vTxuygaewSunU2QolK85IPRrE3utTlnCCYVYqtTVtDiRlxmR0uM7IwhJkdLjPmZvscDCs2EZLJLo6jfUzgeuO7-E_DD2UKcbc</recordid><startdate>20031001</startdate><enddate>20031001</enddate><creator>Park, Sanghyun</creator><creator>Chu, Wesley W.</creator><creator>Yoon, Jeehee</creator><creator>Won, Jungim</creator><general>Elsevier Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>E3H</scope><scope>F2A</scope></search><sort><creationdate>20031001</creationdate><title>Similarity search of time-warped subsequences via a suffix tree</title><author>Park, Sanghyun ; Chu, Wesley W. 
; Yoon, Jeehee ; Won, Jungim</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c338t-3fd3bd8b65a752690249696f1415c17daf4bd2b54d966f229d472cdda7ae1b3e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2003</creationdate><topic>Categorization</topic><topic>Computerized information storage and retrieval</topic><topic>Indexing</topic><topic>Searching</topic><topic>Sequence database</topic><topic>Similarity measures</topic><topic>Similarity search</topic><topic>Suffix tree</topic><topic>Time-warping distance</topic><topic>Tree structures</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Park, Sanghyun</creatorcontrib><creatorcontrib>Chu, Wesley W.</creatorcontrib><creatorcontrib>Yoon, Jeehee</creatorcontrib><creatorcontrib>Won, Jungim</creatorcontrib><collection>CrossRef</collection><collection>Library & Information Sciences Abstracts (LISA)</collection><collection>Library & Information Science Abstracts (LISA)</collection><jtitle>Information systems (Oxford)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Park, Sanghyun</au><au>Chu, Wesley W.</au><au>Yoon, Jeehee</au><au>Won, Jungim</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Similarity search of time-warped subsequences via a suffix tree</atitle><jtitle>Information systems (Oxford)</jtitle><date>2003-10-01</date><risdate>2003</risdate><volume>28</volume><issue>7</issue><spage>867</spage><epage>883</epage><pages>867-883</pages><issn>0306-4379</issn><eissn>1873-6076</eissn><abstract>This paper proposes an indexing technique for fast retrieval of similar subsequences using the time-warping distance. 
The time-warping distance is a more suitable similarity measure than the Euclidean distance in many applications where sequences may be of different lengths and/or different sampling rates. The proposed indexing technique employs a disk-based suffix tree as an index structure and uses lower-bound distance functions to filter out dissimilar subsequences without false dismissals. To make the index structure compact and hence accelerate the query processing, it converts sequences in the continuous domain into sequences in the discrete domain and stores only a subset of the suffixes whose first values are different from those of the immediately preceding suffixes. Extensive experiments with real and synthetic data sequences revealed that the proposed approach significantly outperforms the sequential scan and LB scan approaches and scales well in a large volume of sequence databases.</abstract><pub>Elsevier Ltd</pub><doi>10.1016/S0306-4379(02)00102-3</doi><tpages>17</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0306-4379
ispartof	Information systems (Oxford), 2003-10, Vol.28 (7), p.867-883
issn	0306-4379 1873-6076
language	eng
recordid	cdi_proquest_miscellaneous_57589343
source	Access via ScienceDirect (Elsevier)
subjects	Categorization; Computerized information storage and retrieval; Indexing; Searching; Sequence database; Similarity measures; Similarity search; Suffix tree; Time-warping distance; Tree structures
title	Similarity search of time-warped subsequences via a suffix tree
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T05%3A17%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Similarity%20search%20of%20time-warped%20subsequences%20via%20a%20suffix%20tree&rft.jtitle=Information%20systems%20(Oxford)&rft.au=Park,%20Sanghyun&rft.date=2003-10-01&rft.volume=28&rft.issue=7&rft.spage=867&rft.epage=883&rft.pages=867-883&rft.issn=0306-4379&rft.eissn=1873-6076&rft_id=info:doi/10.1016/S0306-4379(02)00102-3&rft_dat=%3Cproquest_cross%3E57589343%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=57589343&rft_id=info:pmid/&rft_els_id=S0306437902001023&rfr_iscdi=true