Novelty Detection and Online Learning for Chunk Data Streams

Data stream analysis aims at extracting discriminative information for classification from continuously incoming samples. It is extremely challenging to detect novel data while incrementally updating the model efficiently and stably, especially for high-dimensional and/or large-scale data streams. Th...

Detailed description

Saved in:
Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence 2021-07, Vol.43 (7), p.2400-2412
Main authors: Wang, Yi, Ding, Yi, He, Xiangjian, Fan, Xin, Lin, Chi, Li, Fengqi, Wang, Tianzhu, Luo, Zhongxuan, Luo, Jiebo
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page 2412
container_issue 7
container_start_page 2400
container_title IEEE transactions on pattern analysis and machine intelligence
container_volume 43
creator Wang, Yi
Ding, Yi
He, Xiangjian
Fan, Xin
Lin, Chi
Li, Fengqi
Wang, Tianzhu
Luo, Zhongxuan
Luo, Jiebo
description Data stream analysis aims to extract discriminative information for classification from continuously incoming samples. It is extremely challenging to detect novel data while incrementally updating the model efficiently and stably, especially for high-dimensional and/or large-scale data streams. This paper proposes an efficient framework for novelty detection and incremental learning for unlabeled chunk data streams. First, an accurate factorization-free kernel discriminative analysis (FKDA-X) is put forward through solving a linear system in the kernel space. FKDA-X produces a Reproducing Kernel Hilbert Space (RKHS), in which unlabeled chunk data can be detected and classified by multiple known classes in a single decision model with a deterministic classification boundary. Moreover, based on FKDA-X, two optimal methods, FKDA-CX and FKDA-C, are proposed. FKDA-CX uses the micro-cluster centers of the original data as its input to achieve excellent performance in novelty detection. FKDA-C and incremental FKDA-C (IFKDA-C), which use the class centers of the original data as their input, are extremely fast in online learning. Theoretical analysis and experimental validation on under-sampled and large-scale real-world datasets demonstrate that the proposed algorithms make it possible to learn unlabeled chunk data streams with significantly lower computational costs than, and accuracies comparable to, the state-of-the-art approaches.
doi_str_mv 10.1109/TPAMI.2020.2965531
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_8955936</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8955936</ieee_id><sourcerecordid>2539352006</sourcerecordid><originalsourceid>FETCH-LOGICAL-c351t-8528fe90632d94334a5a1c772b18c83a1ea3e2a5793e32c4bee1afaa77abd7683</originalsourceid><addsrcrecordid>eNqNkE1v1DAQhi0EosvCHwAJReoFCWWxPXFiS1yqlI9KC0WinK1JdgIpWbvYTlH_PV52aaWeOI0Pz_vO-GHsueArIbh5c_Hl5NPZSnLJV9LUSoF4wBbCgClBgXnIFlzUstRa6iP2JMZLzkWlODxmRyBMxZXkC_b2s7-mKd0Up5SoT6N3BbpNce6m0VGxJgxudN-LwYei_TG7n8UpJiy-pkC4jU_ZowGnSM8Oc8m-vX930X4s1-cfztqTddmDEqnUSuqBDK9BbkwFUKFC0TeN7ITuNaAgBJKoGgMEsq86IoEDYtNgt2lqDUv2at97FfyvmWKy2zH2NE3oyM_RSgDTaG1ywZId30Mv_Rxcvs7KLAXyp_MdSyb3VB98jIEGexXGLYYbK7jdubV_3dqdW3twm0MvD9Vzt6XNbeSfzAzoPfCbOj_EfiTX0y3G82IjVcVNfommHRPudLd-dilHX_9_NNMv9vRIdEdpo5SBGv4A03mcGw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2539352006</pqid></control><display><type>article</type><title>Novelty Detection and Online Learning for Chunk Data Streams</title><source>IEEE Electronic Library (IEL)</source><creator>Wang, Yi ; Ding, Yi ; He, Xiangjian ; Fan, Xin ; Lin, Chi ; Li, Fengqi ; Wang, Tianzhu ; Luo, Zhongxuan ; Luo, Jiebo</creator><creatorcontrib>Wang, Yi ; Ding, Yi ; He, Xiangjian ; Fan, Xin ; Lin, Chi ; Li, Fengqi ; Wang, Tianzhu ; Luo, Zhongxuan ; Luo, Jiebo</creatorcontrib><description>Datastream analysis aims at extracting discriminative information for classification from continuously incoming samples. It is extremely challenging to detect novel data while incrementally updating the model efficiently and stably, especially for high-dimensional and/or large-scale data streams. This paper proposes an efficient framework for novelty detection and incremental learning for unlabeled chunk data streams. 
First, an accurate factorization-free kernel discriminative analysis (FKDA-X) is put forward through solving a linear system in the kernel space. FKDA-X produces a Reproducing Kernel Hilbert Space (RKHS), in which unlabeled chunk data can be detected and classified by multiple known-classes in a single decision model with a deterministic classification boundary. Moreover, based on FKDA-X, two optimal methods FKDA-CX and FKDA-C are proposed. FKDA-CX uses the micro-cluster centers of original data as the input to achieve excellent performance in novelty detection. FKDA-C and incremental FKDA-C (IFKDA-C) using the class centers of original data as their input have extremely fast speed in online learning. Theoretical analysis and experimental validation on under-sampled and large-scale real-world datasets demonstrate that the proposed algorithms make it possible to learn unlabeled chunk data streams with significantly lower computational costs and comparable accuracies than the state-of-the-art approaches.</description><identifier>ISSN: 0162-8828</identifier><identifier>EISSN: 1939-3539</identifier><identifier>EISSN: 2160-9292</identifier><identifier>DOI: 10.1109/TPAMI.2020.2965531</identifier><identifier>PMID: 31940520</identifier><identifier>CODEN: ITPIDJ</identifier><language>eng</language><publisher>LOS ALAMITOS: IEEE</publisher><subject>Algorithms ; Classification ; Computer Science ; Computer Science, Artificial Intelligence ; Data models ; Data stream ; Data transmission ; Distance learning ; Engineering ; Engineering, Electrical &amp; Electronic ; Fans ; Feature extraction ; feature selection ; Hilbert space ; Kernel ; Kernels ; Linear systems ; Machine learning ; novelty detection ; online learning ; Science &amp; Technology ; Streaming media ; Technology</subject><ispartof>IEEE transactions on pattern analysis and machine intelligence, 2021-07, Vol.43 (7), p.2400-2412</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2021</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>true</woscitedreferencessubscribed><woscitedreferencescount>14</woscitedreferencescount><woscitedreferencesoriginalsourcerecordid>wos000692540900017</woscitedreferencesoriginalsourcerecordid><citedby>FETCH-LOGICAL-c351t-8528fe90632d94334a5a1c772b18c83a1ea3e2a5793e32c4bee1afaa77abd7683</citedby><cites>FETCH-LOGICAL-c351t-8528fe90632d94334a5a1c772b18c83a1ea3e2a5793e32c4bee1afaa77abd7683</cites><orcidid>0000-0002-8991-4188 ; 0000-0002-0302-5102 ; 0000-0003-4056-548X ; 0000-0001-8962-540X ; 0000-0002-4516-9729</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8955936$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>315,781,785,797,27929,27930,39263,54763</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/8955936$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/31940520$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Wang, Yi</creatorcontrib><creatorcontrib>Ding, Yi</creatorcontrib><creatorcontrib>He, Xiangjian</creatorcontrib><creatorcontrib>Fan, Xin</creatorcontrib><creatorcontrib>Lin, Chi</creatorcontrib><creatorcontrib>Li, Fengqi</creatorcontrib><creatorcontrib>Wang, Tianzhu</creatorcontrib><creatorcontrib>Luo, Zhongxuan</creatorcontrib><creatorcontrib>Luo, Jiebo</creatorcontrib><title>Novelty Detection and Online Learning for Chunk Data Streams</title><title>IEEE transactions on pattern analysis and machine intelligence</title><addtitle>TPAMI</addtitle><addtitle>IEEE T PATTERN ANAL</addtitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><description>Datastream analysis aims at extracting discriminative information for classification from continuously incoming samples. 
It is extremely challenging to detect novel data while incrementally updating the model efficiently and stably, especially for high-dimensional and/or large-scale data streams. This paper proposes an efficient framework for novelty detection and incremental learning for unlabeled chunk data streams. First, an accurate factorization-free kernel discriminative analysis (FKDA-X) is put forward through solving a linear system in the kernel space. FKDA-X produces a Reproducing Kernel Hilbert Space (RKHS), in which unlabeled chunk data can be detected and classified by multiple known-classes in a single decision model with a deterministic classification boundary. Moreover, based on FKDA-X, two optimal methods FKDA-CX and FKDA-C are proposed. FKDA-CX uses the micro-cluster centers of original data as the input to achieve excellent performance in novelty detection. FKDA-C and incremental FKDA-C (IFKDA-C) using the class centers of original data as their input have extremely fast speed in online learning. 
Theoretical analysis and experimental validation on under-sampled and large-scale real-world datasets demonstrate that the proposed algorithms make it possible to learn unlabeled chunk data streams with significantly lower computational costs and comparable accuracies than the state-of-the-art approaches.</description><subject>Algorithms</subject><subject>Classification</subject><subject>Computer Science</subject><subject>Computer Science, Artificial Intelligence</subject><subject>Data models</subject><subject>Data stream</subject><subject>Data transmission</subject><subject>Distance learning</subject><subject>Engineering</subject><subject>Engineering, Electrical &amp; Electronic</subject><subject>Fans</subject><subject>Feature extraction</subject><subject>feature selection</subject><subject>Hilbert space</subject><subject>Kernel</subject><subject>Kernels</subject><subject>Linear systems</subject><subject>Machine learning</subject><subject>novelty detection</subject><subject>online learning</subject><subject>Science &amp; Technology</subject><subject>Streaming media</subject><subject>Technology</subject><issn>0162-8828</issn><issn>1939-3539</issn><issn>2160-9292</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><sourceid>HGBXW</sourceid><recordid>eNqNkE1v1DAQhi0EosvCHwAJReoFCWWxPXFiS1yqlI9KC0WinK1JdgIpWbvYTlH_PV52aaWeOI0Pz_vO-GHsueArIbh5c_Hl5NPZSnLJV9LUSoF4wBbCgClBgXnIFlzUstRa6iP2JMZLzkWlODxmRyBMxZXkC_b2s7-mKd0Up5SoT6N3BbpNce6m0VGxJgxudN-LwYei_TG7n8UpJiy-pkC4jU_ZowGnSM8Oc8m-vX930X4s1-cfztqTddmDEqnUSuqBDK9BbkwFUKFC0TeN7ITuNaAgBJKoGgMEsq86IoEDYtNgt2lqDUv2at97FfyvmWKy2zH2NE3oyM_RSgDTaG1ywZId30Mv_Rxcvs7KLAXyp_MdSyb3VB98jIEGexXGLYYbK7jdubV_3dqdW3twm0MvD9Vzt6XNbeSfzAzoPfCbOj_EfiTX0y3G82IjVcVNfommHRPudLd-dilHX_9_NNMv9vRIdEdpo5SBGv4A03mcGw</recordid><startdate>20210701</startdate><enddate>20210701</enddate><creator>Wang, Yi</creator><creator>Ding, Yi</creator><creator>He, 
Xiangjian</creator><creator>Fan, Xin</creator><creator>Lin, Chi</creator><creator>Li, Fengqi</creator><creator>Wang, Tianzhu</creator><creator>Luo, Zhongxuan</creator><creator>Luo, Jiebo</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>BLEPL</scope><scope>DTL</scope><scope>HGBXW</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-8991-4188</orcidid><orcidid>https://orcid.org/0000-0002-0302-5102</orcidid><orcidid>https://orcid.org/0000-0003-4056-548X</orcidid><orcidid>https://orcid.org/0000-0001-8962-540X</orcidid><orcidid>https://orcid.org/0000-0002-4516-9729</orcidid></search><sort><creationdate>20210701</creationdate><title>Novelty Detection and Online Learning for Chunk Data Streams</title><author>Wang, Yi ; Ding, Yi ; He, Xiangjian ; Fan, Xin ; Lin, Chi ; Li, Fengqi ; Wang, Tianzhu ; Luo, Zhongxuan ; Luo, Jiebo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c351t-8528fe90632d94334a5a1c772b18c83a1ea3e2a5793e32c4bee1afaa77abd7683</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Classification</topic><topic>Computer Science</topic><topic>Computer Science, Artificial Intelligence</topic><topic>Data models</topic><topic>Data stream</topic><topic>Data transmission</topic><topic>Distance learning</topic><topic>Engineering</topic><topic>Engineering, Electrical &amp; Electronic</topic><topic>Fans</topic><topic>Feature extraction</topic><topic>feature selection</topic><topic>Hilbert space</topic><topic>Kernel</topic><topic>Kernels</topic><topic>Linear systems</topic><topic>Machine learning</topic><topic>novelty 
detection</topic><topic>online learning</topic><topic>Science &amp; Technology</topic><topic>Streaming media</topic><topic>Technology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Yi</creatorcontrib><creatorcontrib>Ding, Yi</creatorcontrib><creatorcontrib>He, Xiangjian</creatorcontrib><creatorcontrib>Fan, Xin</creatorcontrib><creatorcontrib>Lin, Chi</creatorcontrib><creatorcontrib>Li, Fengqi</creatorcontrib><creatorcontrib>Wang, Tianzhu</creatorcontrib><creatorcontrib>Luo, Zhongxuan</creatorcontrib><creatorcontrib>Luo, Jiebo</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Web of Science Core Collection</collection><collection>Science Citation Index Expanded</collection><collection>Web of Science - Science Citation Index Expanded - 2021</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Wang, Yi</au><au>Ding, Yi</au><au>He, Xiangjian</au><au>Fan, Xin</au><au>Lin, Chi</au><au>Li, Fengqi</au><au>Wang, Tianzhu</au><au>Luo, Zhongxuan</au><au>Luo, 
Jiebo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Novelty Detection and Online Learning for Chunk Data Streams</atitle><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle><stitle>TPAMI</stitle><stitle>IEEE T PATTERN ANAL</stitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><date>2021-07-01</date><risdate>2021</risdate><volume>43</volume><issue>7</issue><spage>2400</spage><epage>2412</epage><pages>2400-2412</pages><issn>0162-8828</issn><eissn>1939-3539</eissn><eissn>2160-9292</eissn><coden>ITPIDJ</coden><abstract>Datastream analysis aims at extracting discriminative information for classification from continuously incoming samples. It is extremely challenging to detect novel data while incrementally updating the model efficiently and stably, especially for high-dimensional and/or large-scale data streams. This paper proposes an efficient framework for novelty detection and incremental learning for unlabeled chunk data streams. First, an accurate factorization-free kernel discriminative analysis (FKDA-X) is put forward through solving a linear system in the kernel space. FKDA-X produces a Reproducing Kernel Hilbert Space (RKHS), in which unlabeled chunk data can be detected and classified by multiple known-classes in a single decision model with a deterministic classification boundary. Moreover, based on FKDA-X, two optimal methods FKDA-CX and FKDA-C are proposed. FKDA-CX uses the micro-cluster centers of original data as the input to achieve excellent performance in novelty detection. FKDA-C and incremental FKDA-C (IFKDA-C) using the class centers of original data as their input have extremely fast speed in online learning. 
Theoretical analysis and experimental validation on under-sampled and large-scale real-world datasets demonstrate that the proposed algorithms make it possible to learn unlabeled chunk data streams with significantly lower computational costs and comparable accuracies than the state-of-the-art approaches.</abstract><cop>LOS ALAMITOS</cop><pub>IEEE</pub><pmid>31940520</pmid><doi>10.1109/TPAMI.2020.2965531</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0002-8991-4188</orcidid><orcidid>https://orcid.org/0000-0002-0302-5102</orcidid><orcidid>https://orcid.org/0000-0003-4056-548X</orcidid><orcidid>https://orcid.org/0000-0001-8962-540X</orcidid><orcidid>https://orcid.org/0000-0002-4516-9729</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0162-8828
ispartof IEEE transactions on pattern analysis and machine intelligence, 2021-07, Vol.43 (7), p.2400-2412
issn 0162-8828
1939-3539
2160-9292
language eng
recordid cdi_ieee_primary_8955936
source IEEE Electronic Library (IEL)
subjects Algorithms
Classification
Computer Science
Computer Science, Artificial Intelligence
Data models
Data stream
Data transmission
Distance learning
Engineering
Engineering, Electrical & Electronic
Fans
Feature extraction
feature selection
Hilbert space
Kernel
Kernels
Linear systems
Machine learning
novelty detection
online learning
Science & Technology
Streaming media
Technology
title Novelty Detection and Online Learning for Chunk Data Streams
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-15T19%3A38%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Novelty%20Detection%20and%20Online%20Learning%20for%20Chunk%20Data%20Streams&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Wang,%20Yi&rft.date=2021-07-01&rft.volume=43&rft.issue=7&rft.spage=2400&rft.epage=2412&rft.pages=2400-2412&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2020.2965531&rft_dat=%3Cproquest_RIE%3E2539352006%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2539352006&rft_id=info:pmid/31940520&rft_ieee_id=8955936&rfr_iscdi=true