Functional Isolation Forest

For the purpose of monitoring the behavior of complex infrastructures (e.g. aircrafts, transport or energy networks), high-rate sensors are deployed to capture multivariate data, generally unlabeled, in quasi continuous-time to detect quickly the occurrence of anomalies that may jeopardize the smoot...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2019-10
Hauptverfasser: Staerman, Guillaume, Mozharovskyi, Pavlo, Clémençon, Stephan, d'Alché-Buc, Florence
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Staerman, Guillaume
Mozharovskyi, Pavlo
Clémençon, Stephan
d'Alché-Buc, Florence
description For the purpose of monitoring the behavior of complex infrastructures (e.g. aircrafts, transport or energy networks), high-rate sensors are deployed to capture multivariate data, generally unlabeled, in quasi continuous-time to detect quickly the occurrence of anomalies that may jeopardize the smooth operation of the system of interest. The statistical analysis of such massive data of functional nature raises many challenging methodological questions. The primary goal of this paper is to extend the popular Isolation Forest (IF) approach to Anomaly Detection, originally dedicated to finite dimensional observations, to functional data. The major difficulty lies in the wide variety of topological structures that may equip a space of functions and the great variety of patterns that may characterize abnormal curves. We address the issue of (randomly) splitting the functional space in a flexible manner in order to isolate progressively any trajectory from the others, a key ingredient to the efficiency of the algorithm. Beyond a detailed description of the algorithm, computational complexity and stability issues are investigated at length. From the scoring function measuring the degree of abnormality of an observation provided by the proposed variant of the IF algorithm, a Functional Statistical Depth function is defined and discussed as well as a multivariate functional extension. Numerical experiments provide strong empirical evidence of the accuracy of the extension proposed.
format Article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2206820340</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2206820340</sourcerecordid><originalsourceid>FETCH-proquest_journals_22068203403</originalsourceid><addsrcrecordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mSQdivNSy7JzM9LzFHwLM7PSQSxFdzyi1KLS3gYWNMSc4pTeaE0N4Oym2uIs4duQVF-YSlQQXxWfmkRUGdxvJGRgZmFkYGxiYExcaoACWorTg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2206820340</pqid></control><display><type>article</type><title>Functional Isolation Forest</title><source>Freely Accessible Journals</source><creator>Staerman, Guillaume ; Mozharovskyi, Pavlo ; Clémençon, Stephan ; d'Alché-Buc, Florence</creator><creatorcontrib>Staerman, Guillaume ; Mozharovskyi, Pavlo ; Clémençon, Stephan ; d'Alché-Buc, Florence</creatorcontrib><description>For the purpose of monitoring the behavior of complex infrastructures (e.g. aircrafts, transport or energy networks), high-rate sensors are deployed to capture multivariate data, generally unlabeled, in quasi continuous-time to detect quickly the occurrence of anomalies that may jeopardize the smooth operation of the system of interest. The statistical analysis of such massive data of functional nature raises many challenging methodological questions. The primary goal of this paper is to extend the popular Isolation Forest (IF) approach to Anomaly Detection, originally dedicated to finite dimensional observations, to functional data. The major difficulty lies in the wide variety of topological structures that may equip a space of functions and the great variety of patterns that may characterize abnormal curves. We address the issue of (randomly) splitting the functional space in a flexible manner in order to isolate progressively any trajectory from the others, a key ingredient to the efficiency of the algorithm. Beyond a detailed description of the algorithm, computational complexity and stability issues are investigated at length. From the scoring function measuring the degree of abnormality of an observation provided by the proposed variant of the IF algorithm, a Functional Statistical Depth function is defined and discussed as well as a multivariate functional extension. Numerical experiments provide strong empirical evidence of the accuracy of the extension proposed.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Algorithms ; Anomalies ; Complexity ; Empirical analysis ; Forests ; Multivariate analysis ; Statistical analysis</subject><ispartof>arXiv.org, 2019-10</ispartof><rights>2019. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>781,785</link.rule.ids></links><search><creatorcontrib>Staerman, Guillaume</creatorcontrib><creatorcontrib>Mozharovskyi, Pavlo</creatorcontrib><creatorcontrib>Clémençon, Stephan</creatorcontrib><creatorcontrib>d'Alché-Buc, Florence</creatorcontrib><title>Functional Isolation Forest</title><title>arXiv.org</title><description>For the purpose of monitoring the behavior of complex infrastructures (e.g. aircrafts, transport or energy networks), high-rate sensors are deployed to capture multivariate data, generally unlabeled, in quasi continuous-time to detect quickly the occurrence of anomalies that may jeopardize the smooth operation of the system of interest. The statistical analysis of such massive data of functional nature raises many challenging methodological questions. The primary goal of this paper is to extend the popular Isolation Forest (IF) approach to Anomaly Detection, originally dedicated to finite dimensional observations, to functional data. The major difficulty lies in the wide variety of topological structures that may equip a space of functions and the great variety of patterns that may characterize abnormal curves. We address the issue of (randomly) splitting the functional space in a flexible manner in order to isolate progressively any trajectory from the others, a key ingredient to the efficiency of the algorithm. Beyond a detailed description of the algorithm, computational complexity and stability issues are investigated at length. From the scoring function measuring the degree of abnormality of an observation provided by the proposed variant of the IF algorithm, a Functional Statistical Depth function is defined and discussed as well as a multivariate functional extension. Numerical experiments provide strong empirical evidence of the accuracy of the extension proposed.</description><subject>Algorithms</subject><subject>Anomalies</subject><subject>Complexity</subject><subject>Empirical analysis</subject><subject>Forests</subject><subject>Multivariate analysis</subject><subject>Statistical analysis</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mSQdivNSy7JzM9LzFHwLM7PSQSxFdzyi1KLS3gYWNMSc4pTeaE0N4Oym2uIs4duQVF-YSlQQXxWfmkRUGdxvJGRgZmFkYGxiYExcaoACWorTg</recordid><startdate>20191009</startdate><enddate>20191009</enddate><creator>Staerman, Guillaume</creator><creator>Mozharovskyi, Pavlo</creator><creator>Clémençon, Stephan</creator><creator>d'Alché-Buc, Florence</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20191009</creationdate><title>Functional Isolation Forest</title><author>Staerman, Guillaume ; Mozharovskyi, Pavlo ; Clémençon, Stephan ; d'Alché-Buc, Florence</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_22068203403</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Algorithms</topic><topic>Anomalies</topic><topic>Complexity</topic><topic>Empirical analysis</topic><topic>Forests</topic><topic>Multivariate analysis</topic><topic>Statistical analysis</topic><toplevel>online_resources</toplevel><creatorcontrib>Staerman, Guillaume</creatorcontrib><creatorcontrib>Mozharovskyi, Pavlo</creatorcontrib><creatorcontrib>Clémençon, Stephan</creatorcontrib><creatorcontrib>d'Alché-Buc, Florence</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Staerman, Guillaume</au><au>Mozharovskyi, Pavlo</au><au>Clémençon, Stephan</au><au>d'Alché-Buc, Florence</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Functional Isolation Forest</atitle><jtitle>arXiv.org</jtitle><date>2019-10-09</date><risdate>2019</risdate><eissn>2331-8422</eissn><abstract>For the purpose of monitoring the behavior of complex infrastructures (e.g. aircrafts, transport or energy networks), high-rate sensors are deployed to capture multivariate data, generally unlabeled, in quasi continuous-time to detect quickly the occurrence of anomalies that may jeopardize the smooth operation of the system of interest. The statistical analysis of such massive data of functional nature raises many challenging methodological questions. The primary goal of this paper is to extend the popular Isolation Forest (IF) approach to Anomaly Detection, originally dedicated to finite dimensional observations, to functional data. The major difficulty lies in the wide variety of topological structures that may equip a space of functions and the great variety of patterns that may characterize abnormal curves. We address the issue of (randomly) splitting the functional space in a flexible manner in order to isolate progressively any trajectory from the others, a key ingredient to the efficiency of the algorithm. Beyond a detailed description of the algorithm, computational complexity and stability issues are investigated at length. From the scoring function measuring the degree of abnormality of an observation provided by the proposed variant of the IF algorithm, a Functional Statistical Depth function is defined and discussed as well as a multivariate functional extension. Numerical experiments provide strong empirical evidence of the accuracy of the extension proposed.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2019-10
issn 2331-8422
language eng
recordid cdi_proquest_journals_2206820340
source Freely Accessible Journals
subjects Algorithms
Anomalies
Complexity
Empirical analysis
Forests
Multivariate analysis
Statistical analysis
title Functional Isolation Forest
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-15T12%3A14%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Functional%20Isolation%20Forest&rft.jtitle=arXiv.org&rft.au=Staerman,%20Guillaume&rft.date=2019-10-09&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2206820340%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2206820340&rft_id=info:pmid/&rfr_iscdi=true