Data Stream Clustering With Affinity Propagation
Data stream clustering provides insights into the underlying patterns of data flows. This paper focuses on selecting the best representatives from clusters of streaming data. There are two main challenges: how to cluster with the best representatives and how to handle the evolving patterns that are...
Published in: | IEEE transactions on knowledge and data engineering 2014-07, Vol.26 (7), p.1644-1656 |
---|---|
Main authors: | Xiangliang Zhang ; Furtlehner, Cyril ; Germain-Renaud, Cecile ; Sebag, Michele |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 1656 |
---|---|
container_issue | 7 |
container_start_page | 1644 |
container_title | IEEE transactions on knowledge and data engineering |
container_volume | 26 |
creator | Xiangliang Zhang ; Furtlehner, Cyril ; Germain-Renaud, Cecile ; Sebag, Michele |
description | Data stream clustering provides insights into the underlying patterns of data flows. This paper focuses on selecting the best representatives from clusters of streaming data. There are two main challenges: how to cluster with the best representatives and how to handle the evolving patterns that are important characteristics of streaming data with dynamic distributions. We employ the Affinity Propagation (AP) algorithm presented in 2007 by Frey and Dueck for the first challenge, as it offers good guarantees of clustering optimality for selecting exemplars. The second challenging problem is solved by change detection. The presented StrAP algorithm combines AP with a statistical change point detection test; the clustering model is rebuilt whenever the test detects a change in the underlying data distribution. Besides the validation on two benchmark data sets, the presented algorithm is validated on a real-world application, monitoring the data flow of jobs submitted to the EGEE grid. |
doi_str_mv | 10.1109/TKDE.2013.146 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_6585253</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6585253</ieee_id><sourcerecordid>3387985901</sourcerecordid><originalsourceid>FETCH-LOGICAL-c358t-6f233230a3d9405a6c81b8b1885c1ec9c020e6605e1fe96ffbde487accbebbe33</originalsourceid><addsrcrecordid>eNo9kE1LAzEQhoMoWKtHT14WPHnYmtl8bHIsbbViQcGKx5BNkzal3a1JKvTfu8uKp3kZHl5mHoRuAY8AsHxcvk5nowIDGQHlZ2gAjIm8AAnnbcYUckpoeYmuYtxijEUpYIDwVCedfaRg9T6b7I4x2eDrdfbl0yYbO-drn07Ze2gOeq2Tb-prdOH0LtqbvzlEn0-z5WSeL96eXybjRW4IEynnriCkIFiTlaSYaW4EVKICIZgBa6TBBbacY2bBWcmdq1aWilIbU9mqsoQM0UPfu9E7dQh-r8NJNdqr-Xihul37AS8khR9o2fuePYTm-2hjUtvmGOr2PAWMSkahlLil8p4yoYkxWPdfC1h1AlUnUHUCVSuw5e963ltr_1nOBCsYIb8Dv2oC</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1549541790</pqid></control><display><type>article</type><title>Data Stream Clustering With Affinity Propagation</title><source>IEEE Electronic Library (IEL)</source><creator>Xiangliang Zhang ; Furtlehner, Cyril ; Germain-Renaud, Cecile ; Sebag, Michele</creator><creatorcontrib>Xiangliang Zhang ; Furtlehner, Cyril ; Germain-Renaud, Cecile ; Sebag, Michele</creatorcontrib><description>Data stream clustering provides insights into the underlying patterns of data flows. This paper focuses on selecting the best representatives from clusters of streaming data. There are two main challenges: how to cluster with the best representatives and how to handle the evolving patterns that are important characteristics of streaming data with dynamic distributions. We employ the Affinity Propagation (AP) algorithm presented in 2007 by Frey and Dueck for the first challenge, as it offers good guarantees of clustering optimality for selecting exemplars. The second challenging problem is solved by change detection. 
The presented StrAP algorithm combines AP with a statistical change point detection test; the clustering model is rebuilt whenever the test detects a change in the underlying data distribution. Besides the validation on two benchmark data sets, the presented algorithm is validated on a real-world application, monitoring the data flow of jobs submitted to the EGEE grid.</description><identifier>ISSN: 1041-4347</identifier><identifier>EISSN: 1558-2191</identifier><identifier>DOI: 10.1109/TKDE.2013.146</identifier><identifier>CODEN: ITKEEH</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Affinity Propagation ; Algorithms ; Artificial Intelligence ; Change detection algorithms ; Clustering algorithms ; Computational modeling ; Computer Science ; Data models ; Data Stream Clustering ; Machine Learning ; Monitoring ; Optimization ; Reservoirs</subject><ispartof>IEEE transactions on knowledge and data engineering, 2014-07, Vol.26 (7), p.1644-1656</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) Jul 2014</rights><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c358t-6f233230a3d9405a6c81b8b1885c1ec9c020e6605e1fe96ffbde487accbebbe33</citedby><cites>FETCH-LOGICAL-c358t-6f233230a3d9405a6c81b8b1885c1ec9c020e6605e1fe96ffbde487accbebbe33</cites><orcidid>0000-0002-3574-5665</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6585253$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>230,314,780,784,796,885,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6585253$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://inria.hal.science/hal-00862941$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>Xiangliang Zhang</creatorcontrib><creatorcontrib>Furtlehner, Cyril</creatorcontrib><creatorcontrib>Germain-Renaud, Cecile</creatorcontrib><creatorcontrib>Sebag, Michele</creatorcontrib><title>Data Stream Clustering With Affinity Propagation</title><title>IEEE transactions on knowledge and data engineering</title><addtitle>TKDE</addtitle><description>Data stream clustering provides insights into the underlying patterns of data flows. This paper focuses on selecting the best representatives from clusters of streaming data. There are two main challenges: how to cluster with the best representatives and how to handle the evolving patterns that are important characteristics of streaming data with dynamic distributions. We employ the Affinity Propagation (AP) algorithm presented in 2007 by Frey and Dueck for the first challenge, as it offers good guarantees of clustering optimality for selecting exemplars. 
The second challenging problem is solved by change detection. The presented StrAP algorithm combines AP with a statistical change point detection test; the clustering model is rebuilt whenever the test detects a change in the underlying data distribution. Besides the validation on two benchmark data sets, the presented algorithm is validated on a real-world application, monitoring the data flow of jobs submitted to the EGEE grid.</description><subject>Affinity Propagation</subject><subject>Algorithms</subject><subject>Artificial Intelligence</subject><subject>Change detection algorithms</subject><subject>Clustering algorithms</subject><subject>Computational modeling</subject><subject>Computer Science</subject><subject>Data models</subject><subject>Data Stream Clustering</subject><subject>Machine Learning</subject><subject>Monitoring</subject><subject>Optimization</subject><subject>Reservoirs</subject><issn>1041-4347</issn><issn>1558-2191</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2014</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kE1LAzEQhoMoWKtHT14WPHnYmtl8bHIsbbViQcGKx5BNkzal3a1JKvTfu8uKp3kZHl5mHoRuAY8AsHxcvk5nowIDGQHlZ2gAjIm8AAnnbcYUckpoeYmuYtxijEUpYIDwVCedfaRg9T6b7I4x2eDrdfbl0yYbO-drn07Ze2gOeq2Tb-prdOH0LtqbvzlEn0-z5WSeL96eXybjRW4IEynnriCkIFiTlaSYaW4EVKICIZgBa6TBBbacY2bBWcmdq1aWilIbU9mqsoQM0UPfu9E7dQh-r8NJNdqr-Xihul37AS8khR9o2fuePYTm-2hjUtvmGOr2PAWMSkahlLil8p4yoYkxWPdfC1h1AlUnUHUCVSuw5e963ltr_1nOBCsYIb8Dv2oC</recordid><startdate>20140701</startdate><enddate>20140701</enddate><creator>Xiangliang Zhang</creator><creator>Furtlehner, Cyril</creator><creator>Germain-Renaud, Cecile</creator><creator>Sebag, Michele</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><general>Institute of Electrical and Electronics Engineers</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>1XC</scope><scope>VOOES</scope><orcidid>https://orcid.org/0000-0002-3574-5665</orcidid></search><sort><creationdate>20140701</creationdate><title>Data Stream Clustering With Affinity Propagation</title><author>Xiangliang Zhang ; Furtlehner, Cyril ; Germain-Renaud, Cecile ; Sebag, Michele</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c358t-6f233230a3d9405a6c81b8b1885c1ec9c020e6605e1fe96ffbde487accbebbe33</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2014</creationdate><topic>Affinity Propagation</topic><topic>Algorithms</topic><topic>Artificial Intelligence</topic><topic>Change detection algorithms</topic><topic>Clustering algorithms</topic><topic>Computational modeling</topic><topic>Computer Science</topic><topic>Data models</topic><topic>Data Stream Clustering</topic><topic>Machine Learning</topic><topic>Monitoring</topic><topic>Optimization</topic><topic>Reservoirs</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Xiangliang Zhang</creatorcontrib><creatorcontrib>Furtlehner, Cyril</creatorcontrib><creatorcontrib>Germain-Renaud, Cecile</creatorcontrib><creatorcontrib>Sebag, Michele</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research 
Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Hyper Article en Ligne (HAL)</collection><collection>Hyper Article en Ligne (HAL) (Open Access)</collection><jtitle>IEEE transactions on knowledge and data engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Xiangliang Zhang</au><au>Furtlehner, Cyril</au><au>Germain-Renaud, Cecile</au><au>Sebag, Michele</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Data Stream Clustering With Affinity Propagation</atitle><jtitle>IEEE transactions on knowledge and data engineering</jtitle><stitle>TKDE</stitle><date>2014-07-01</date><risdate>2014</risdate><volume>26</volume><issue>7</issue><spage>1644</spage><epage>1656</epage><pages>1644-1656</pages><issn>1041-4347</issn><eissn>1558-2191</eissn><coden>ITKEEH</coden><abstract>Data stream clustering provides insights into the underlying patterns of data flows. This paper focuses on selecting the best representatives from clusters of streaming data. There are two main challenges: how to cluster with the best representatives and how to handle the evolving patterns that are important characteristics of streaming data with dynamic distributions. We employ the Affinity Propagation (AP) algorithm presented in 2007 by Frey and Dueck for the first challenge, as it offers good guarantees of clustering optimality for selecting exemplars. The second challenging problem is solved by change detection. The presented StrAP algorithm combines AP with a statistical change point detection test; the clustering model is rebuilt whenever the test detects a change in the underlying data distribution. 
Besides the validation on two benchmark data sets, the presented algorithm is validated on a real-world application, monitoring the data flow of jobs submitted to the EGEE grid.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TKDE.2013.146</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0002-3574-5665</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1041-4347 |
ispartof | IEEE transactions on knowledge and data engineering, 2014-07, Vol.26 (7), p.1644-1656 |
issn | 1041-4347 ; 1558-2191 |
language | eng |
recordid | cdi_ieee_primary_6585253 |
source | IEEE Electronic Library (IEL) |
subjects | Affinity Propagation ; Algorithms ; Artificial Intelligence ; Change detection algorithms ; Clustering algorithms ; Computational modeling ; Computer Science ; Data models ; Data Stream Clustering ; Machine Learning ; Monitoring ; Optimization ; Reservoirs |
title | Data Stream Clustering With Affinity Propagation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T01%3A56%3A14IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Data%20Stream%20Clustering%20With%20Affinity%20Propagation&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Xiangliang%20Zhang&rft.date=2014-07-01&rft.volume=26&rft.issue=7&rft.spage=1644&rft.epage=1656&rft.pages=1644-1656&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2013.146&rft_dat=%3Cproquest_RIE%3E3387985901%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1549541790&rft_id=info:pmid/&rft_ieee_id=6585253&rfr_iscdi=true |
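
The description field above says that StrAP rebuilds its clustering model whenever a statistical change point detection test fires. The record does not name the test or give any parameters, so the following is only a minimal sketch of that trigger logic under stated assumptions: it uses a Page-Hinkley test as a stand-in detector, the names `PageHinkley` and `rebuild_points` and the values of `delta` and `threshold` are hypothetical, and the Affinity Propagation step itself is left out (the function only reports the stream positions at which a rebuild would be requested).

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream.

    Assumption: used here as a stand-in for the unspecified
    "statistical change point detection test" in the description.
    """

    def __init__(self, delta=0.05, threshold=5.0):
        self.delta = delta          # tolerated drift per observation
        self.threshold = threshold  # alarm threshold on the test statistic
        self.reset()

    def reset(self):
        self.n = 0          # observations seen since the last reset
        self.mean = 0.0     # running mean of the observations
        self.cum = 0.0      # cumulative deviation from the running mean
        self.cum_min = 0.0  # smallest cumulative deviation seen so far

    def update(self, x):
        """Consume one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


def rebuild_points(stream, delta=0.05, threshold=5.0):
    """Return the stream indices at which a model rebuild would be triggered.

    Hypothetical helper: a full system would re-run the clustering
    (Affinity Propagation in the paper) at each returned index.
    """
    detector = PageHinkley(delta, threshold)
    points = []
    for i, x in enumerate(stream):
        if detector.update(x):
            points.append(i)
            detector.reset()  # start a fresh test after each rebuild
    return points


if __name__ == "__main__":
    # Monitored quantity jumps from 0.0 to 1.0 halfway through the stream.
    print(rebuild_points([0.0] * 50 + [1.0] * 50))
```

On a stream whose monitored value jumps halfway through, the detector raises a single alarm a few observations after the jump and stays silent on a constant stream. Resetting the detector after each alarm mirrors the idea of starting afresh once the model has been rebuilt; what quantity to monitor is a design choice the record does not specify.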