Semi-supervised Anomaly Detection with an Application to Water Analytics

© 2018 IEEE. Nowadays, all aspects of a production process are continuously monitored and visualized in a dashboard. Equipment is monitored using a variety of sensors, natural resource usage is tracked, and interventions are recorded. In this context, a common task is to identify anomalous behavior...

Bibliographic details
Main authors: Vercruyssen, Vincent; Meert, Wannes; Verbruggen, Gust; Maes, Koen; Baumer, Ruben; Davis, Jesse
Format: Conference proceedings
Language: eng
container_end_page 536
container_issue
container_start_page 527
container_title
container_volume 2018-November
creator Vercruyssen, Vincent
Meert, Wannes
Verbruggen, Gust
Maes, Koen
Baumer, Ruben
Davis, Jesse
description © 2018 IEEE. Nowadays, all aspects of a production process are continuously monitored and visualized in a dashboard. Equipment is monitored using a variety of sensors, natural resource usage is tracked, and interventions are recorded. In this context, a common task is to identify anomalous behavior from the time series data generated by sensors. As manually analyzing such data is laborious and expensive, automated approaches have the potential to be much more efficient as well as cost-effective. While anomaly detection could be posed as a supervised learning problem, typically this is not possible as few or no labeled examples of anomalous behavior are available and it is oftentimes infeasible or undesirable to collect them. Therefore, unsupervised approaches are commonly employed, which typically identify anomalies as deviations from normal (i.e., common or frequent) behavior. However, in many real-world settings several types of normal behavior exist that occur less frequently than some anomalous behaviors. In this paper, we propose a novel constrained-clustering-based approach for anomaly detection that works in both an unsupervised and a semi-supervised setting. Starting from an unlabeled data set, the approach is able to gradually incorporate expert-provided feedback to improve its performance. We evaluated our approach on real-world water monitoring time series data from supermarkets in collaboration with Colruyt Group, one of Belgium's largest retail companies. Empirically, we found that our approach outperforms the current detection system as well as several other baselines. Our system is currently deployed and used by the company to analyze water usage for 20 stores on a daily basis.
format Conference Proceeding
isbn 1538691582
9781538691588
publisher IEEE
creationdate 2018
oa free_for_read
linktorsrc https://lirias.kuleuven.be/handle/123456789/627231
fulltext fulltext_linktorsrc
identifier ISSN: 1550-4786
ispartof 2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, Vol.2018-November, p.527-536
issn 1550-4786
language eng
recordid cdi_kuleuven_dspace_123456789_627231
source Lirias (KU Leuven Association)
title Semi-supervised Anomaly Detection with an Application to Water Analytics
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T23%3A21%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-kuleuven_FZOIL&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Semi-supervised%20Anomaly%20Detection%20with%20an%20Application%20to%20Water%20Analytics&rft.btitle=2018%20IEEE%20INTERNATIONAL%20CONFERENCE%20ON%20DATA%20MINING%20(ICDM)&rft.au=Vercruyssen,%20Vincent&rft.date=2018-01-01&rft.volume=2018-November&rft.spage=527&rft.epage=536&rft.pages=527-536&rft.issn=1550-4786&rft.isbn=1538691582&rft.isbn_list=9781538691588&rft_id=info:doi/&rft_dat=%3Ckuleuven_FZOIL%3E123456789_627231%3C/kuleuven_FZOIL%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true