Functional Isolation Forest
For the purpose of monitoring the behavior of complex infrastructures (e.g. aircraft, transport or energy networks), high-rate sensors are deployed to capture multivariate data, generally unlabeled, in quasi-continuous time to quickly detect the occurrence of anomalies that may jeopardize the smoot...
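Published in: | arXiv.org 2019-10 |
---|---|
Main authors: | Staerman, Guillaume; Mozharovskyi, Pavlo; Clémençon, Stephan; d'Alché-Buc, Florence |
Format: | Article |
Language: | eng |
Subjects: | Algorithms; Anomalies; Complexity; Empirical analysis; Forests; Multivariate analysis; Statistical analysis |
Online access: | Full text |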
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Staerman, Guillaume; Mozharovskyi, Pavlo; Clémençon, Stephan; d'Alché-Buc, Florence |
description | For the purpose of monitoring the behavior of complex infrastructures (e.g. aircraft, transport or energy networks), high-rate sensors are deployed to capture multivariate data, generally unlabeled, in quasi-continuous time to quickly detect the occurrence of anomalies that may jeopardize the smooth operation of the system of interest. The statistical analysis of such massive data of functional nature raises many challenging methodological questions. The primary goal of this paper is to extend the popular Isolation Forest (IF) approach to Anomaly Detection, originally dedicated to finite-dimensional observations, to functional data. The major difficulty lies in the wide variety of topological structures that may equip a space of functions and the great variety of patterns that may characterize abnormal curves. We address the issue of (randomly) splitting the functional space in a flexible manner in order to isolate progressively any trajectory from the others, a key ingredient to the efficiency of the algorithm. Beyond a detailed description of the algorithm, computational complexity and stability issues are investigated at length. From the scoring function measuring the degree of abnormality of an observation provided by the proposed variant of the IF algorithm, a Functional Statistical Depth function is defined and discussed, as well as a multivariate functional extension. Numerical experiments provide strong empirical evidence of the accuracy of the proposed extension. |
format | Article |
publisher | Ithaca: Cornell University Library, arXiv.org |
creationdate | 2019-10-09 |
rights | 2019. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
oa | free_for_read |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2019-10 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2206820340 |
source | Freely Accessible Journals |
subjects | Algorithms; Anomalies; Complexity; Empirical analysis; Forests; Multivariate analysis; Statistical analysis |
title | Functional Isolation Forest |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-15T12%3A14%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Functional%20Isolation%20Forest&rft.jtitle=arXiv.org&rft.au=Staerman,%20Guillaume&rft.date=2019-10-09&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2206820340%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2206820340&rft_id=info:pmid/&rfr_iscdi=true |