Multivariate Time Series Cleaning under Speed Constraints

Errors are common in time series due to unreliable sensor measurements. Existing methods focus on univariate data but do not utilize the correlation between dimensions. Cleaning each dimension separately may lead to a less accurate result, as some errors can only be identified in the multivariate ca...
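
To illustrate the claim that some errors can only be identified in the multivariate case, the following is a minimal Python sketch. It is not the MTCSC algorithm from the paper. The function names, the sample trajectory, and the threshold `S_MAX` are hypothetical, and the sketch assumes that speed is measured as Euclidean distance divided by elapsed time.

```python
import math

def joint_violations(times, points, s_max):
    """Indices i where the Euclidean speed from point i-1 to i exceeds s_max."""
    out = []
    for i in range(1, len(points)):
        dt = times[i] - times[i - 1]
        if math.dist(points[i - 1], points[i]) / dt > s_max:
            out.append(i)
    return out

def per_dimension_violations(times, points, s_max):
    """Indices i where at least one single dimension changes faster than s_max."""
    out = []
    for i in range(1, len(points)):
        dt = times[i] - times[i - 1]
        if any(abs(b - a) / dt > s_max for a, b in zip(points[i - 1], points[i])):
            out.append(i)
    return out

# Hypothetical 2-D trajectory sampled once per time unit (assumed data).
times = [0, 1, 2, 3]
points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (6.0, 6.0)]
S_MAX = 5.0  # assumed maximum speed

joint = joint_violations(times, points, S_MAX)
separate = per_dimension_violations(times, points, S_MAX)
print("joint check flags:", joint)
print("per-dimension check flags:", separate)
```

In this assumed data the step from (1, 1) to (5, 5) moves 4 units in each dimension, which stays within the limit when each dimension is checked alone, but about 5.66 units in Euclidean distance, which exceeds it. Only the joint check flags that step.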

Bibliographic details
Main authors: Zhang, Aoqian; Wu, Zexue; Gong, Yifeng; Yuan, Ye; Wang, Guoren
Format: Article
Language: eng
creator Zhang, Aoqian
Wu, Zexue
Gong, Yifeng
Yuan, Ye
Wang, Guoren
description Errors are common in time series due to unreliable sensor measurements. Existing methods focus on univariate data but do not utilize the correlation between dimensions. Cleaning each dimension separately may lead to a less accurate result, as some errors can only be identified in the multivariate case. We also point out that the widely used minimum change principle is not always the best choice. Instead, we try to change the smallest number of data to avoid a significant change in the data distribution. In this paper, we propose MTCSC, the constraint-based method for cleaning multivariate time series. We formalize the repair problem, propose a linear-time method to employ online computing, and improve it by exploiting data trends. We also support adaptive speed constraint capturing. We analyze the properties of our proposals and compare them with SOTA methods in terms of effectiveness, efficiency versus error rates, data sizes, and applications such as classification. Experiments on real datasets show that MTCSC can have higher repair accuracy with less time consumption. Interestingly, it can be effective even when there are only weak or no correlations between the dimensions.
doi_str_mv 10.48550/arxiv.2411.01214
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2411_01214</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2411_01214</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2411_012143</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE01DMwNDI04WSw9C3NKcksSyzKTCxJVQjJzE1VCE4tykwtVnDOSU3My8xLVyjNS0ktUgguSE1NUXDOzysuKUrMzCsp5mFgTUvMKU7lhdLcDPJuriHOHrpgS-ILijJzE4sq40GWxYMtMyasAgBSWzQW</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Multivariate Time Series Cleaning under Speed Constraints</title><source>arXiv.org</source><creator>Zhang, Aoqian ; Wu, Zexue ; Gong, Yifeng ; Yuan, Ye ; Wang, Guoren</creator><creatorcontrib>Zhang, Aoqian ; Wu, Zexue ; Gong, Yifeng ; Yuan, Ye ; Wang, Guoren</creatorcontrib><description>Errors are common in time series due to unreliable sensor measurements. Existing methods focus on univariate data but do not utilize the correlation between dimensions. Cleaning each dimension separately may lead to a less accurate result, as some errors can only be identified in the multivariate case. We also point out that the widely used minimum change principle is not always the best choice. Instead, we try to change the smallest number of data to avoid a significant change in the data distribution. In this paper, we propose MTCSC, the constraint-based method for cleaning multivariate time series. We formalize the repair problem, propose a linear-time method to employ online computing, and improve it by exploiting data trends. We also support adaptive speed constraint capturing. We analyze the properties of our proposals and compare them with SOTA methods in terms of effectiveness, efficiency versus error rates, data sizes, and applications such as classification. 
Experiments on real datasets show that MTCSC can have higher repair accuracy with less time consumption. Interestingly, it can be effective even when there are only weak or no correlations between the dimensions.</description><identifier>DOI: 10.48550/arxiv.2411.01214</identifier><language>eng</language><subject>Computer Science - Databases</subject><creationdate>2024-11</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2411.01214$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2411.01214$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Zhang, Aoqian</creatorcontrib><creatorcontrib>Wu, Zexue</creatorcontrib><creatorcontrib>Gong, Yifeng</creatorcontrib><creatorcontrib>Yuan, Ye</creatorcontrib><creatorcontrib>Wang, Guoren</creatorcontrib><title>Multivariate Time Series Cleaning under Speed Constraints</title><description>Errors are common in time series due to unreliable sensor measurements. Existing methods focus on univariate data but do not utilize the correlation between dimensions. Cleaning each dimension separately may lead to a less accurate result, as some errors can only be identified in the multivariate case. We also point out that the widely used minimum change principle is not always the best choice. Instead, we try to change the smallest number of data to avoid a significant change in the data distribution. In this paper, we propose MTCSC, the constraint-based method for cleaning multivariate time series. 
We formalize the repair problem, propose a linear-time method to employ online computing, and improve it by exploiting data trends. We also support adaptive speed constraint capturing. We analyze the properties of our proposals and compare them with SOTA methods in terms of effectiveness, efficiency versus error rates, data sizes, and applications such as classification. Experiments on real datasets show that MTCSC can have higher repair accuracy with less time consumption. Interestingly, it can be effective even when there are only weak or no correlations between the dimensions.</description><subject>Computer Science - Databases</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE01DMwNDI04WSw9C3NKcksSyzKTCxJVQjJzE1VCE4tykwtVnDOSU3My8xLVyjNS0ktUgguSE1NUXDOzysuKUrMzCsp5mFgTUvMKU7lhdLcDPJuriHOHrpgS-ILijJzE4sq40GWxYMtMyasAgBSWzQW</recordid><startdate>20241102</startdate><enddate>20241102</enddate><creator>Zhang, Aoqian</creator><creator>Wu, Zexue</creator><creator>Gong, Yifeng</creator><creator>Yuan, Ye</creator><creator>Wang, Guoren</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20241102</creationdate><title>Multivariate Time Series Cleaning under Speed Constraints</title><author>Zhang, Aoqian ; Wu, Zexue ; Gong, Yifeng ; Yuan, Ye ; Wang, Guoren</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2411_012143</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Databases</topic><toplevel>online_resources</toplevel><creatorcontrib>Zhang, Aoqian</creatorcontrib><creatorcontrib>Wu, Zexue</creatorcontrib><creatorcontrib>Gong, Yifeng</creatorcontrib><creatorcontrib>Yuan, Ye</creatorcontrib><creatorcontrib>Wang, Guoren</creatorcontrib><collection>arXiv Computer 
Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Zhang, Aoqian</au><au>Wu, Zexue</au><au>Gong, Yifeng</au><au>Yuan, Ye</au><au>Wang, Guoren</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Multivariate Time Series Cleaning under Speed Constraints</atitle><date>2024-11-02</date><risdate>2024</risdate><abstract>Errors are common in time series due to unreliable sensor measurements. Existing methods focus on univariate data but do not utilize the correlation between dimensions. Cleaning each dimension separately may lead to a less accurate result, as some errors can only be identified in the multivariate case. We also point out that the widely used minimum change principle is not always the best choice. Instead, we try to change the smallest number of data to avoid a significant change in the data distribution. In this paper, we propose MTCSC, the constraint-based method for cleaning multivariate time series. We formalize the repair problem, propose a linear-time method to employ online computing, and improve it by exploiting data trends. We also support adaptive speed constraint capturing. We analyze the properties of our proposals and compare them with SOTA methods in terms of effectiveness, efficiency versus error rates, data sizes, and applications such as classification. Experiments on real datasets show that MTCSC can have higher repair accuracy with less time consumption. Interestingly, it can be effective even when there are only weak or no correlations between the dimensions.</abstract><doi>10.48550/arxiv.2411.01214</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2411.01214
language eng
recordid cdi_arxiv_primary_2411_01214
source arXiv.org
subjects Computer Science - Databases
title Multivariate Time Series Cleaning under Speed Constraints
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T15%3A00%3A36IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Multivariate%20Time%20Series%20Cleaning%20under%20Speed%20Constraints&rft.au=Zhang,%20Aoqian&rft.date=2024-11-02&rft_id=info:doi/10.48550/arxiv.2411.01214&rft_dat=%3Carxiv_GOX%3E2411_01214%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true