Novelty Detection and Online Learning for Chunk Data Streams

Datastream analysis aims at extracting discriminative information for classification from continuously incoming samples. It is extremely challenging to detect novel data while incrementally updating the model efficiently and stably, especially for high-dimensional and/or large-scale data streams. Th...

Bibliographic Details
Published in:	IEEE transactions on pattern analysis and machine intelligence 2021-07, Vol.43 (7), p.2400-2412
Main authors:	Wang, Yi ; Ding, Yi ; He, Xiangjian ; Fan, Xin ; Lin, Chi ; Li, Fengqi ; Wang, Tianzhu ; Luo, Zhongxuan ; Luo, Jiebo
Format:	Article
Language:	eng
Subjects:	Algorithms ; Classification ; Computer Science ; Computer Science, Artificial Intelligence ; Data models ; Data stream ; Data transmission ; Distance learning ; Engineering ; Engineering, Electrical & Electronic ; Fans ; Feature extraction ; feature selection ; Hilbert space ; Kernel ; Kernels ; Linear systems ; Machine learning ; novelty detection ; online learning ; Science & Technology ; Streaming media ; Technology

container_end_page	2412
container_issue	7
container_start_page	2400
container_title	IEEE transactions on pattern analysis and machine intelligence
container_volume	43
creator	Wang, Yi ; Ding, Yi ; He, Xiangjian ; Fan, Xin ; Lin, Chi ; Li, Fengqi ; Wang, Tianzhu ; Luo, Zhongxuan ; Luo, Jiebo
description	Datastream analysis aims at extracting discriminative information for classification from continuously incoming samples. It is extremely challenging to detect novel data while incrementally updating the model efficiently and stably, especially for high-dimensional and/or large-scale data streams. This paper proposes an efficient framework for novelty detection and incremental learning for unlabeled chunk data streams. First, an accurate factorization-free kernel discriminative analysis (FKDA-X) is put forward through solving a linear system in the kernel space. FKDA-X produces a Reproducing Kernel Hilbert Space (RKHS), in which unlabeled chunk data can be detected and classified by multiple known-classes in a single decision model with a deterministic classification boundary. Moreover, based on FKDA-X, two optimal methods FKDA-CX and FKDA-C are proposed. FKDA-CX uses the micro-cluster centers of original data as the input to achieve excellent performance in novelty detection. FKDA-C and incremental FKDA-C (IFKDA-C) using the class centers of original data as their input have extremely fast speed in online learning. Theoretical analysis and experimental validation on under-sampled and large-scale real-world datasets demonstrate that the proposed algorithms make it possible to learn unlabeled chunk data streams with significantly lower computational costs and comparable accuracies than the state-of-the-art approaches.
doi_str_mv	10.1109/TPAMI.2020.2965531
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_8955936</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8955936</ieee_id><sourcerecordid>2539352006</sourcerecordid><originalsourceid>FETCH-LOGICAL-c351t-8528fe90632d94334a5a1c772b18c83a1ea3e2a5793e32c4bee1afaa77abd7683</originalsourceid><addsrcrecordid>eNqNkE1v1DAQhi0EosvCHwAJReoFCWWxPXFiS1yqlI9KC0WinK1JdgIpWbvYTlH_PV52aaWeOI0Pz_vO-GHsueArIbh5c_Hl5NPZSnLJV9LUSoF4wBbCgClBgXnIFlzUstRa6iP2JMZLzkWlODxmRyBMxZXkC_b2s7-mKd0Up5SoT6N3BbpNce6m0VGxJgxudN-LwYei_TG7n8UpJiy-pkC4jU_ZowGnSM8Oc8m-vX930X4s1-cfztqTddmDEqnUSuqBDK9BbkwFUKFC0TeN7ITuNaAgBJKoGgMEsq86IoEDYtNgt2lqDUv2at97FfyvmWKy2zH2NE3oyM_RSgDTaG1ywZId30Mv_Rxcvs7KLAXyp_MdSyb3VB98jIEGexXGLYYbK7jdubV_3dqdW3twm0MvD9Vzt6XNbeSfzAzoPfCbOj_EfiTX0y3G82IjVcVNfommHRPudLd-dilHX_9_NNMv9vRIdEdpo5SBGv4A03mcGw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2539352006</pqid></control><display><type>article</type><title>Novelty Detection and Online Learning for Chunk Data Streams</title><source>IEEE Electronic Library (IEL)</source><creator>Wang, Yi ; Ding, Yi ; He, Xiangjian ; Fan, Xin ; Lin, Chi ; Li, Fengqi ; Wang, Tianzhu ; Luo, Zhongxuan ; Luo, Jiebo</creator><creatorcontrib>Wang, Yi ; Ding, Yi ; He, Xiangjian ; Fan, Xin ; Lin, Chi ; Li, Fengqi ; Wang, Tianzhu ; Luo, Zhongxuan ; Luo, Jiebo</creatorcontrib><description>Datastream analysis aims at extracting discriminative information for classification from continuously incoming samples. It is extremely challenging to detect novel data while incrementally updating the model efficiently and stably, especially for high-dimensional and/or large-scale data streams. This paper proposes an efficient framework for novelty detection and incremental learning for unlabeled chunk data streams. 
First, an accurate factorization-free kernel discriminative analysis (FKDA-X) is put forward through solving a linear system in the kernel space. FKDA-X produces a Reproducing Kernel Hilbert Space (RKHS), in which unlabeled chunk data can be detected and classified by multiple known-classes in a single decision model with a deterministic classification boundary. Moreover, based on FKDA-X, two optimal methods FKDA-CX and FKDA-C are proposed. FKDA-CX uses the micro-cluster centers of original data as the input to achieve excellent performance in novelty detection. FKDA-C and incremental FKDA-C (IFKDA-C) using the class centers of original data as their input have extremely fast speed in online learning. Theoretical analysis and experimental validation on under-sampled and large-scale real-world datasets demonstrate that the proposed algorithms make it possible to learn unlabeled chunk data streams with significantly lower computational costs and comparable accuracies than the state-of-the-art approaches.</description><identifier>ISSN: 0162-8828</identifier><identifier>EISSN: 1939-3539</identifier><identifier>EISSN: 2160-9292</identifier><identifier>DOI: 10.1109/TPAMI.2020.2965531</identifier><identifier>PMID: 31940520</identifier><identifier>CODEN: ITPIDJ</identifier><language>eng</language><publisher>LOS ALAMITOS: IEEE</publisher><subject>Algorithms ; Classification ; Computer Science ; Computer Science, Artificial Intelligence ; Data models ; Data stream ; Data transmission ; Distance learning ; Engineering ; Engineering, Electrical & Electronic ; Fans ; Feature extraction ; feature selection ; Hilbert space ; Kernel ; Kernels ; Linear systems ; Machine learning ; novelty detection ; online learning ; Science & Technology ; Streaming media ; Technology</subject><ispartof>IEEE transactions on pattern analysis and machine intelligence, 2021-07, Vol.43 (7), p.2400-2412</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2021</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>true</woscitedreferencessubscribed><woscitedreferencescount>14</woscitedreferencescount><woscitedreferencesoriginalsourcerecordid>wos000692540900017</woscitedreferencesoriginalsourcerecordid><citedby>FETCH-LOGICAL-c351t-8528fe90632d94334a5a1c772b18c83a1ea3e2a5793e32c4bee1afaa77abd7683</citedby><cites>FETCH-LOGICAL-c351t-8528fe90632d94334a5a1c772b18c83a1ea3e2a5793e32c4bee1afaa77abd7683</cites><orcidid>0000-0002-8991-4188 ; 0000-0002-0302-5102 ; 0000-0003-4056-548X ; 0000-0001-8962-540X ; 0000-0002-4516-9729</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8955936$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>315,781,785,797,27929,27930,39263,54763</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/8955936$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/31940520$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Wang, Yi</creatorcontrib><creatorcontrib>Ding, Yi</creatorcontrib><creatorcontrib>He, Xiangjian</creatorcontrib><creatorcontrib>Fan, Xin</creatorcontrib><creatorcontrib>Lin, Chi</creatorcontrib><creatorcontrib>Li, Fengqi</creatorcontrib><creatorcontrib>Wang, Tianzhu</creatorcontrib><creatorcontrib>Luo, Zhongxuan</creatorcontrib><creatorcontrib>Luo, Jiebo</creatorcontrib><title>Novelty Detection and Online Learning for Chunk Data Streams</title><title>IEEE transactions on pattern analysis and machine intelligence</title><addtitle>TPAMI</addtitle><addtitle>IEEE T PATTERN ANAL</addtitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><description>Datastream analysis aims at extracting discriminative information for classification from continuously incoming samples. 
It is extremely challenging to detect novel data while incrementally updating the model efficiently and stably, especially for high-dimensional and/or large-scale data streams. This paper proposes an efficient framework for novelty detection and incremental learning for unlabeled chunk data streams. First, an accurate factorization-free kernel discriminative analysis (FKDA-X) is put forward through solving a linear system in the kernel space. FKDA-X produces a Reproducing Kernel Hilbert Space (RKHS), in which unlabeled chunk data can be detected and classified by multiple known-classes in a single decision model with a deterministic classification boundary. Moreover, based on FKDA-X, two optimal methods FKDA-CX and FKDA-C are proposed. FKDA-CX uses the micro-cluster centers of original data as the input to achieve excellent performance in novelty detection. FKDA-C and incremental FKDA-C (IFKDA-C) using the class centers of original data as their input have extremely fast speed in online learning. 
Theoretical analysis and experimental validation on under-sampled and large-scale real-world datasets demonstrate that the proposed algorithms make it possible to learn unlabeled chunk data streams with significantly lower computational costs and comparable accuracies than the state-of-the-art approaches.</description><subject>Algorithms</subject><subject>Classification</subject><subject>Computer Science</subject><subject>Computer Science, Artificial Intelligence</subject><subject>Data models</subject><subject>Data stream</subject><subject>Data transmission</subject><subject>Distance learning</subject><subject>Engineering</subject><subject>Engineering, Electrical & Electronic</subject><subject>Fans</subject><subject>Feature extraction</subject><subject>feature selection</subject><subject>Hilbert space</subject><subject>Kernel</subject><subject>Kernels</subject><subject>Linear systems</subject><subject>Machine learning</subject><subject>novelty detection</subject><subject>online learning</subject><subject>Science & Technology</subject><subject>Streaming media</subject><subject>Technology</subject><issn>0162-8828</issn><issn>1939-3539</issn><issn>2160-9292</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><sourceid>HGBXW</sourceid><recordid>eNqNkE1v1DAQhi0EosvCHwAJReoFCWWxPXFiS1yqlI9KC0WinK1JdgIpWbvYTlH_PV52aaWeOI0Pz_vO-GHsueArIbh5c_Hl5NPZSnLJV9LUSoF4wBbCgClBgXnIFlzUstRa6iP2JMZLzkWlODxmRyBMxZXkC_b2s7-mKd0Up5SoT6N3BbpNce6m0VGxJgxudN-LwYei_TG7n8UpJiy-pkC4jU_ZowGnSM8Oc8m-vX930X4s1-cfztqTddmDEqnUSuqBDK9BbkwFUKFC0TeN7ITuNaAgBJKoGgMEsq86IoEDYtNgt2lqDUv2at97FfyvmWKy2zH2NE3oyM_RSgDTaG1ywZId30Mv_Rxcvs7KLAXyp_MdSyb3VB98jIEGexXGLYYbK7jdubV_3dqdW3twm0MvD9Vzt6XNbeSfzAzoPfCbOj_EfiTX0y3G82IjVcVNfommHRPudLd-dilHX_9_NNMv9vRIdEdpo5SBGv4A03mcGw</recordid><startdate>20210701</startdate><enddate>20210701</enddate><creator>Wang, Yi</creator><creator>Ding, Yi</creator><creator>He, 
Xiangjian</creator><creator>Fan, Xin</creator><creator>Lin, Chi</creator><creator>Li, Fengqi</creator><creator>Wang, Tianzhu</creator><creator>Luo, Zhongxuan</creator><creator>Luo, Jiebo</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>BLEPL</scope><scope>DTL</scope><scope>HGBXW</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-8991-4188</orcidid><orcidid>https://orcid.org/0000-0002-0302-5102</orcidid><orcidid>https://orcid.org/0000-0003-4056-548X</orcidid><orcidid>https://orcid.org/0000-0001-8962-540X</orcidid><orcidid>https://orcid.org/0000-0002-4516-9729</orcidid></search><sort><creationdate>20210701</creationdate><title>Novelty Detection and Online Learning for Chunk Data Streams</title><author>Wang, Yi ; Ding, Yi ; He, Xiangjian ; Fan, Xin ; Lin, Chi ; Li, Fengqi ; Wang, Tianzhu ; Luo, Zhongxuan ; Luo, Jiebo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c351t-8528fe90632d94334a5a1c772b18c83a1ea3e2a5793e32c4bee1afaa77abd7683</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Classification</topic><topic>Computer Science</topic><topic>Computer Science, Artificial Intelligence</topic><topic>Data models</topic><topic>Data stream</topic><topic>Data transmission</topic><topic>Distance learning</topic><topic>Engineering</topic><topic>Engineering, Electrical & Electronic</topic><topic>Fans</topic><topic>Feature extraction</topic><topic>feature selection</topic><topic>Hilbert space</topic><topic>Kernel</topic><topic>Kernels</topic><topic>Linear systems</topic><topic>Machine learning</topic><topic>novelty 
detection</topic><topic>online learning</topic><topic>Science & Technology</topic><topic>Streaming media</topic><topic>Technology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Yi</creatorcontrib><creatorcontrib>Ding, Yi</creatorcontrib><creatorcontrib>He, Xiangjian</creatorcontrib><creatorcontrib>Fan, Xin</creatorcontrib><creatorcontrib>Lin, Chi</creatorcontrib><creatorcontrib>Li, Fengqi</creatorcontrib><creatorcontrib>Wang, Tianzhu</creatorcontrib><creatorcontrib>Luo, Zhongxuan</creatorcontrib><creatorcontrib>Luo, Jiebo</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Web of Science Core Collection</collection><collection>Science Citation Index Expanded</collection><collection>Web of Science - Science Citation Index Expanded - 2021</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Wang, Yi</au><au>Ding, Yi</au><au>He, Xiangjian</au><au>Fan, Xin</au><au>Lin, Chi</au><au>Li, Fengqi</au><au>Wang, Tianzhu</au><au>Luo, Zhongxuan</au><au>Luo, 
Jiebo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Novelty Detection and Online Learning for Chunk Data Streams</atitle><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle><stitle>TPAMI</stitle><stitle>IEEE T PATTERN ANAL</stitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><date>2021-07-01</date><risdate>2021</risdate><volume>43</volume><issue>7</issue><spage>2400</spage><epage>2412</epage><pages>2400-2412</pages><issn>0162-8828</issn><eissn>1939-3539</eissn><eissn>2160-9292</eissn><coden>ITPIDJ</coden><abstract>Datastream analysis aims at extracting discriminative information for classification from continuously incoming samples. It is extremely challenging to detect novel data while incrementally updating the model efficiently and stably, especially for high-dimensional and/or large-scale data streams. This paper proposes an efficient framework for novelty detection and incremental learning for unlabeled chunk data streams. First, an accurate factorization-free kernel discriminative analysis (FKDA-X) is put forward through solving a linear system in the kernel space. FKDA-X produces a Reproducing Kernel Hilbert Space (RKHS), in which unlabeled chunk data can be detected and classified by multiple known-classes in a single decision model with a deterministic classification boundary. Moreover, based on FKDA-X, two optimal methods FKDA-CX and FKDA-C are proposed. FKDA-CX uses the micro-cluster centers of original data as the input to achieve excellent performance in novelty detection. FKDA-C and incremental FKDA-C (IFKDA-C) using the class centers of original data as their input have extremely fast speed in online learning. 
Theoretical analysis and experimental validation on under-sampled and large-scale real-world datasets demonstrate that the proposed algorithms make it possible to learn unlabeled chunk data streams with significantly lower computational costs and comparable accuracies than the state-of-the-art approaches.</abstract><cop>LOS ALAMITOS</cop><pub>IEEE</pub><pmid>31940520</pmid><doi>10.1109/TPAMI.2020.2965531</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0002-8991-4188</orcidid><orcidid>https://orcid.org/0000-0002-0302-5102</orcidid><orcidid>https://orcid.org/0000-0003-4056-548X</orcidid><orcidid>https://orcid.org/0000-0001-8962-540X</orcidid><orcidid>https://orcid.org/0000-0002-4516-9729</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 0162-8828
ispartof	IEEE transactions on pattern analysis and machine intelligence, 2021-07, Vol.43 (7), p.2400-2412
issn	0162-8828 1939-3539 2160-9292
language	eng
recordid	cdi_ieee_primary_8955936
source	IEEE Electronic Library (IEL)
subjects	Algorithms ; Classification ; Computer Science ; Computer Science, Artificial Intelligence ; Data models ; Data stream ; Data transmission ; Distance learning ; Engineering ; Engineering, Electrical & Electronic ; Fans ; Feature extraction ; feature selection ; Hilbert space ; Kernel ; Kernels ; Linear systems ; Machine learning ; novelty detection ; online learning ; Science & Technology ; Streaming media ; Technology
title	Novelty Detection and Online Learning for Chunk Data Streams
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-15T19%3A38%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Novelty%20Detection%20and%20Online%20Learning%20for%20Chunk%20Data%20Streams&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Wang,%20Yi&rft.date=2021-07-01&rft.volume=43&rft.issue=7&rft.spage=2400&rft.epage=2412&rft.pages=2400-2412&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2020.2965531&rft_dat=%3Cproquest_RIE%3E2539352006%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2539352006&rft_id=info:pmid/31940520&rft_ieee_id=8955936&rfr_iscdi=true