Data Stream Clustering With Affinity Propagation

Data stream clustering provides insights into the underlying patterns of data flows. This paper focuses on selecting the best representatives from clusters of streaming data. There are two main challenges: how to cluster with the best representatives and how to handle the evolving patterns that are...

Detailed description

Bibliographic details
Published in: IEEE transactions on knowledge and data engineering, 2014-07, Vol. 26 (7), p. 1644-1656
Main authors: Xiangliang Zhang; Furtlehner, Cyril; Germain-Renaud, Cecile; Sebag, Michele
Format: Article
Language: eng
container_end_page 1656
container_issue 7
container_start_page 1644
container_title IEEE transactions on knowledge and data engineering
container_volume 26
creator Xiangliang Zhang
Furtlehner, Cyril
Germain-Renaud, Cecile
Sebag, Michele
description Data stream clustering provides insights into the underlying patterns of data flows. This paper focuses on selecting the best representatives from clusters of streaming data. There are two main challenges: how to cluster with the best representatives and how to handle the evolving patterns that are important characteristics of streaming data with dynamic distributions. We employ the Affinity Propagation (AP) algorithm presented in 2007 by Frey and Dueck for the first challenge, as it offers good guarantees of clustering optimality for selecting exemplars. The second challenging problem is solved by change detection. The presented StrAP algorithm combines AP with a statistical change point detection test; the clustering model is rebuilt whenever the test detects a change in the underlying data distribution. Besides the validation on two benchmark data sets, the presented algorithm is validated on a real-world application, monitoring the data flow of jobs submitted to the EGEE grid.
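The rebuild-on-change idea in the description can be illustrated with a minimal sketch. It is not the authors' StrAP implementation. Several things are assumptions made only for illustration: the change test is a Page-Hinkley test (the description only says "a statistical change point detection test"), the clustering step is replaced by a one-dimensional medoid as a stand-in for Affinity Propagation, and the names `PageHinkley`, `medoid`, `run_stream`, and the values of `delta`, `threshold`, and `window` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a signal."""
    delta: float = 0.1       # tolerated deviation (assumed value)
    threshold: float = 5.0   # alarm threshold (assumed value)
    n: int = 0
    mean: float = 0.0
    cum: float = 0.0
    cum_min: float = 0.0

    def update(self, x: float) -> bool:
        """Feed one observation; return True when a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n      # running mean
        self.cum += x - self.mean - self.delta     # cumulative deviation
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


def medoid(points):
    """Stand-in for the clustering step (NOT Affinity Propagation):
    the point with the smallest total distance to all others."""
    return min(points, key=lambda c: sum(abs(c - p) for p in points))


def run_stream(stream, window=20):
    """Return (indices where a change was detected, final exemplar)."""
    recent = deque(maxlen=window)
    exemplar = None
    detector = PageHinkley()
    rebuilds = []
    for i, x in enumerate(stream):
        recent.append(x)
        if exemplar is None:
            # (Re)build the model once a full window of items is available.
            if len(recent) == window:
                exemplar = medoid(recent)
            continue
        # Monitor how poorly the current model fits each new item.
        if detector.update(abs(x - exemplar)):
            rebuilds.append(i)
            recent.clear()             # keep only post-change data
            recent.append(x)
            exemplar = None            # model is rebuilt when the window refills
            detector = PageHinkley()   # restart the test
    return rebuilds, exemplar
```

For example, `run_stream([0.0] * 50 + [10.0] * 50)` signals a change when the values jump and then rebuilds the exemplar from the items seen after the jump. Discarding the old window on an alarm is a simplification chosen for this sketch; the description does not say which data the rebuilt model draws on.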
doi_str_mv 10.1109/TKDE.2013.146
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_6585253</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6585253</ieee_id><sourcerecordid>3387985901</sourcerecordid><originalsourceid>FETCH-LOGICAL-c358t-6f233230a3d9405a6c81b8b1885c1ec9c020e6605e1fe96ffbde487accbebbe33</originalsourceid><addsrcrecordid>eNo9kE1LAzEQhoMoWKtHT14WPHnYmtl8bHIsbbViQcGKx5BNkzal3a1JKvTfu8uKp3kZHl5mHoRuAY8AsHxcvk5nowIDGQHlZ2gAjIm8AAnnbcYUckpoeYmuYtxijEUpYIDwVCedfaRg9T6b7I4x2eDrdfbl0yYbO-drn07Ze2gOeq2Tb-prdOH0LtqbvzlEn0-z5WSeL96eXybjRW4IEynnriCkIFiTlaSYaW4EVKICIZgBa6TBBbacY2bBWcmdq1aWilIbU9mqsoQM0UPfu9E7dQh-r8NJNdqr-Xihul37AS8khR9o2fuePYTm-2hjUtvmGOr2PAWMSkahlLil8p4yoYkxWPdfC1h1AlUnUHUCVSuw5e963ltr_1nOBCsYIb8Dv2oC</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1549541790</pqid></control><display><type>article</type><title>Data Stream Clustering With Affinity Propagation</title><source>IEEE Electronic Library (IEL)</source><creator>Xiangliang Zhang ; Furtlehner, Cyril ; Germain-Renaud, Cecile ; Sebag, Michele</creator><creatorcontrib>Xiangliang Zhang ; Furtlehner, Cyril ; Germain-Renaud, Cecile ; Sebag, Michele</creatorcontrib><description>Data stream clustering provides insights into the underlying patterns of data flows. This paper focuses on selecting the best representatives from clusters of streaming data. There are two main challenges: how to cluster with the best representatives and how to handle the evolving patterns that are important characteristics of streaming data with dynamic distributions. We employ the Affinity Propagation (AP) algorithm presented in 2007 by Frey and Dueck for the first challenge, as it offers good guarantees of clustering optimality for selecting exemplars. The second challenging problem is solved by change detection. 
The presented StrAP algorithm combines AP with a statistical change point detection test; the clustering model is rebuilt whenever the test detects a change in the underlying data distribution. Besides the validation on two benchmark data sets, the presented algorithm is validated on a real-world application, monitoring the data flow of jobs submitted to the EGEE grid.</description><identifier>ISSN: 1041-4347</identifier><identifier>EISSN: 1558-2191</identifier><identifier>DOI: 10.1109/TKDE.2013.146</identifier><identifier>CODEN: ITKEEH</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Affinity Propagation ; Algorithms ; Artificial Intelligence ; Change detection algorithms ; Clustering algorithms ; Computational modeling ; Computer Science ; Data models ; Data Stream Clustering ; Machine Learning ; Monitoring ; Optimization ; Reservoirs</subject><ispartof>IEEE transactions on knowledge and data engineering, 2014-07, Vol.26 (7), p.1644-1656</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) Jul 2014</rights><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c358t-6f233230a3d9405a6c81b8b1885c1ec9c020e6605e1fe96ffbde487accbebbe33</citedby><cites>FETCH-LOGICAL-c358t-6f233230a3d9405a6c81b8b1885c1ec9c020e6605e1fe96ffbde487accbebbe33</cites><orcidid>0000-0002-3574-5665</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6585253$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>230,314,780,784,796,885,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6585253$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://inria.hal.science/hal-00862941$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>Xiangliang Zhang</creatorcontrib><creatorcontrib>Furtlehner, Cyril</creatorcontrib><creatorcontrib>Germain-Renaud, Cecile</creatorcontrib><creatorcontrib>Sebag, Michele</creatorcontrib><title>Data Stream Clustering With Affinity Propagation</title><title>IEEE transactions on knowledge and data engineering</title><addtitle>TKDE</addtitle><description>Data stream clustering provides insights into the underlying patterns of data flows. This paper focuses on selecting the best representatives from clusters of streaming data. There are two main challenges: how to cluster with the best representatives and how to handle the evolving patterns that are important characteristics of streaming data with dynamic distributions. We employ the Affinity Propagation (AP) algorithm presented in 2007 by Frey and Dueck for the first challenge, as it offers good guarantees of clustering optimality for selecting exemplars. 
The second challenging problem is solved by change detection. The presented StrAP algorithm combines AP with a statistical change point detection test; the clustering model is rebuilt whenever the test detects a change in the underlying data distribution. Besides the validation on two benchmark data sets, the presented algorithm is validated on a real-world application, monitoring the data flow of jobs submitted to the EGEE grid.</description><subject>Affinity Propagation</subject><subject>Algorithms</subject><subject>Artificial Intelligence</subject><subject>Change detection algorithms</subject><subject>Clustering algorithms</subject><subject>Computational modeling</subject><subject>Computer Science</subject><subject>Data models</subject><subject>Data Stream Clustering</subject><subject>Machine Learning</subject><subject>Monitoring</subject><subject>Optimization</subject><subject>Reservoirs</subject><issn>1041-4347</issn><issn>1558-2191</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2014</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kE1LAzEQhoMoWKtHT14WPHnYmtl8bHIsbbViQcGKx5BNkzal3a1JKvTfu8uKp3kZHl5mHoRuAY8AsHxcvk5nowIDGQHlZ2gAjIm8AAnnbcYUckpoeYmuYtxijEUpYIDwVCedfaRg9T6b7I4x2eDrdfbl0yYbO-drn07Ze2gOeq2Tb-prdOH0LtqbvzlEn0-z5WSeL96eXybjRW4IEynnriCkIFiTlaSYaW4EVKICIZgBa6TBBbacY2bBWcmdq1aWilIbU9mqsoQM0UPfu9E7dQh-r8NJNdqr-Xihul37AS8khR9o2fuePYTm-2hjUtvmGOr2PAWMSkahlLil8p4yoYkxWPdfC1h1AlUnUHUCVSuw5e963ltr_1nOBCsYIb8Dv2oC</recordid><startdate>20140701</startdate><enddate>20140701</enddate><creator>Xiangliang Zhang</creator><creator>Furtlehner, Cyril</creator><creator>Germain-Renaud, Cecile</creator><creator>Sebag, Michele</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><general>Institute of Electrical and Electronics Engineers</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>1XC</scope><scope>VOOES</scope><orcidid>https://orcid.org/0000-0002-3574-5665</orcidid></search><sort><creationdate>20140701</creationdate><title>Data Stream Clustering With Affinity Propagation</title><author>Xiangliang Zhang ; Furtlehner, Cyril ; Germain-Renaud, Cecile ; Sebag, Michele</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c358t-6f233230a3d9405a6c81b8b1885c1ec9c020e6605e1fe96ffbde487accbebbe33</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2014</creationdate><topic>Affinity Propagation</topic><topic>Algorithms</topic><topic>Artificial Intelligence</topic><topic>Change detection algorithms</topic><topic>Clustering algorithms</topic><topic>Computational modeling</topic><topic>Computer Science</topic><topic>Data models</topic><topic>Data Stream Clustering</topic><topic>Machine Learning</topic><topic>Monitoring</topic><topic>Optimization</topic><topic>Reservoirs</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Xiangliang Zhang</creatorcontrib><creatorcontrib>Furtlehner, Cyril</creatorcontrib><creatorcontrib>Germain-Renaud, Cecile</creatorcontrib><creatorcontrib>Sebag, Michele</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research 
Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Hyper Article en Ligne (HAL)</collection><collection>Hyper Article en Ligne (HAL) (Open Access)</collection><jtitle>IEEE transactions on knowledge and data engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Xiangliang Zhang</au><au>Furtlehner, Cyril</au><au>Germain-Renaud, Cecile</au><au>Sebag, Michele</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Data Stream Clustering With Affinity Propagation</atitle><jtitle>IEEE transactions on knowledge and data engineering</jtitle><stitle>TKDE</stitle><date>2014-07-01</date><risdate>2014</risdate><volume>26</volume><issue>7</issue><spage>1644</spage><epage>1656</epage><pages>1644-1656</pages><issn>1041-4347</issn><eissn>1558-2191</eissn><coden>ITKEEH</coden><abstract>Data stream clustering provides insights into the underlying patterns of data flows. This paper focuses on selecting the best representatives from clusters of streaming data. There are two main challenges: how to cluster with the best representatives and how to handle the evolving patterns that are important characteristics of streaming data with dynamic distributions. We employ the Affinity Propagation (AP) algorithm presented in 2007 by Frey and Dueck for the first challenge, as it offers good guarantees of clustering optimality for selecting exemplars. The second challenging problem is solved by change detection. The presented StrAP algorithm combines AP with a statistical change point detection test; the clustering model is rebuilt whenever the test detects a change in the underlying data distribution. 
Besides the validation on two benchmark data sets, the presented algorithm is validated on a real-world application, monitoring the data flow of jobs submitted to the EGEE grid.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TKDE.2013.146</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0002-3574-5665</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1041-4347
ispartof IEEE transactions on knowledge and data engineering, 2014-07, Vol.26 (7), p.1644-1656
issn 1041-4347
1558-2191
language eng
recordid cdi_ieee_primary_6585253
source IEEE Electronic Library (IEL)
subjects Affinity Propagation
Algorithms
Artificial Intelligence
Change detection algorithms
Clustering algorithms
Computational modeling
Computer Science
Data models
Data Stream Clustering
Machine Learning
Monitoring
Optimization
Reservoirs
title Data Stream Clustering With Affinity Propagation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T01%3A56%3A14IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Data%20Stream%20Clustering%20With%20Affinity%20Propagation&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Xiangliang%20Zhang&rft.date=2014-07-01&rft.volume=26&rft.issue=7&rft.spage=1644&rft.epage=1656&rft.pages=1644-1656&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2013.146&rft_dat=%3Cproquest_RIE%3E3387985901%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1549541790&rft_id=info:pmid/&rft_ieee_id=6585253&rfr_iscdi=true