MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications
Automated fault localization in large-scale cloud-based applications is challenging because it involves mining multivariate time series data from large volumes of operational monitoring metrics. To improve localization accuracy, automated fault localization methods incorporate feature reduction to r...
Published in: | IEEE access 2024-01, Vol.12, p.1-1 |
---|---|
Main authors: | Tsubouchi, Yuuki ; Tsuruta, Hirofumi |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE access |
container_volume | 12 |
creator | Tsubouchi, Yuuki ; Tsuruta, Hirofumi |
description | Automated fault localization in large-scale cloud-based applications is challenging because it involves mining multivariate time series data from large volumes of operational monitoring metrics. To improve localization accuracy, automated fault localization methods incorporate feature reduction to reduce the number of monitoring metrics unrelated to a failure. However, these methods have problems with inaccuracy, either from removing too many failure-related metrics or from retaining too many failure-unrelated metrics. In this paper, we present MetricSifter, a feature reduction framework designed to accurately identify anomalous metrics caused by faults. Our framework locates a failure time window with the highest density of fault-induced change point times across monitoring metrics with a focus on their temporal proximity. Experimental results indicate that MetricSifter achieves an accuracy of 0.981, which is significantly better than the selected baseline methods. Furthermore, experiments combining various reduction methods with various localization methods demonstrate that MetricSifter improves the recall and time efficiency over the baseline methods. |
doi_str_mv | 10.1109/ACCESS.2024.3374334 |
format | Article |
fullrecord | recordid: TN_cdi_crossref_primary_10_1109_ACCESS_2024_3374334; ieee_id: 10462133; doaj_id: oai_doaj_org_article_0fec4f42f1ab4d6da4c91f30aa8f71cf; pqid: 2956886554; sourcetype: Open Website; recordtype: article; publisher: Piscataway: IEEE; EISSN: 2169-3536; CODEN: IAECCG; rights: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024; peer_reviewed; oa: free_for_read; orcid: https://orcid.org/0009-0002-7719-028X ; https://orcid.org/0009-0008-5758-7910; linktohtml: https://ieeexplore.ieee.org/document/10462133; creationdate: 20240101; tpages: 1 |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2024-01, Vol.12, p.1-1 |
issn | 2169-3536 2169-3536 |
language | eng |
recordid | cdi_crossref_primary_10_1109_ACCESS_2024_3374334 |
source | IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals |
subjects | AIOps ; Automation ; Cloud computing ; Failure Management ; Failure times ; Fault detection ; Fault Localization ; Fault location ; Incident Response ; Localization ; Location awareness ; Measurement ; Monitoring ; Multivariate analysis ; Reduction ; Redundancy ; Site Reliability Engineering ; Task analysis ; Time series ; Time series analysis ; Visualization ; Windows (intervals) |
title | MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T04%3A35%3A18IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=MetricSifter:%20Feature%20Reduction%20of%20Multivariate%20Time%20Series%20Data%20for%20Efficient%20Fault%20Localization%20in%20Cloud%20Applications&rft.jtitle=IEEE%20access&rft.au=Tsubouchi,%20Yuuki&rft.date=2024-01-01&rft.volume=12&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2024.3374334&rft_dat=%3Cproquest_cross%3E2956886554%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2956886554&rft_id=info:pmid/&rft_ieee_id=10462133&rft_doaj_id=oai_doaj_org_article_0fec4f42f1ab4d6da4c91f30aa8f71cf&rfr_iscdi=true |
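
The description above says the framework locates the failure time window with the highest density of change point times across metrics. The following is only a minimal sketch of that idea, not the authors' implementation: it assumes change points have already been detected per metric, and it swaps in a plain fixed-width sliding-window count as a stand-in for whatever density estimate the paper uses. The names `densest_window` and `sift_metrics`, the `width` parameter, and the sample metric names are all hypothetical.

```python
from bisect import bisect_right


def densest_window(change_points, width):
    """Return (start, end) of the fixed-width window holding the most change points.

    change_points maps a metric name to a list of change point times.
    Assumption: a simple count per window stands in for a density estimate.
    """
    times = sorted(t for points in change_points.values() for t in points)
    if not times:
        return None
    best_start, best_count = times[0], 0
    for i, start in enumerate(times):
        # Count change points in the closed interval [start, start + width].
        count = bisect_right(times, start + width) - i
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_start + width


def sift_metrics(change_points, width):
    """Keep only metrics with at least one change point inside the densest window."""
    window = densest_window(change_points, width)
    if window is None:
        return []
    start, end = window
    return sorted(
        name
        for name, points in change_points.items()
        if any(start <= t <= end for t in points)
    )


# Hypothetical input: change point times (in seconds) per monitoring metric.
change_points = {
    "cpu_usage": [100, 410],
    "latency_p99": [405],
    "error_rate": [412],
    "disk_io": [50],
    "gc_pause": [],
}
selected = sift_metrics(change_points, width=20)
print(selected)
```

In this toy input, three metrics change within a few seconds of each other, so they are kept, while metrics whose only change points lie far from that cluster (or that have none) are dropped. The window width is a free parameter here; a real system would need to choose or estimate it.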