Streaming-data algorithms for high-quality clustering
Streaming data analysis has recently attracted attention in numerous applications including telephone records, Web documents and click streams. For such analysis, single-pass algorithms that consume a small amount of memory are critical. We describe such a streaming algorithm that effectively cluste...
Main authors: | O'Callaghan, L. ; Mishra, N. ; Meyerson, A. ; Guha, S. ; Motwani, R. |
---|---|
Format: | Conference proceedings |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 694 |
---|---|
container_issue | |
container_start_page | 685 |
container_title | |
container_volume | |
creator | O'Callaghan, L. ; Mishra, N. ; Meyerson, A. ; Guha, S. ; Motwani, R. |
description | Streaming data analysis has recently attracted attention in numerous applications including telephone records, Web documents and click streams. For such analysis, single-pass algorithms that consume a small amount of memory are critical. We describe such a streaming algorithm that effectively clusters large data streams. We also provide empirical evidence of the algorithm's performance on synthetic and real data streams. |
doi_str_mv | 10.1109/ICDE.2002.994785 |
format | Conference Proceeding |
fullrecord | <record><control><sourceid>pascalfrancis_6IE</sourceid><recordid>TN_cdi_ieee_primary_994785</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>994785</ieee_id><sourcerecordid>15812160</sourcerecordid><originalsourceid>FETCH-LOGICAL-c223t-17d5fff8285ec6ec9d2d2479e7ea5a3546c581cd05f23f26665600001ddb30ad3</originalsourceid><addsrcrecordid>eNo9kDtPAzEQhC0eElFIj6iuoXTw2rf2uUQhgUiRKACJLlr8SIwuD-xLkX_PoSCmmWK-Wa2GsRsQYwBh7-eTx-lYCiHH1tamwTM2kMogF1J_nLORNY0w2iKgArhgAxBaca0aecVGpXyJXrYGQDFg-NrlQJu0XXFPHVXUrnY5detNqeIuV-u0WvPvA7WpO1auPZQu5J69ZpeR2hJGfz5k77Pp2-SZL16e5pOHBXdSqo6D8RhjbGSDwengrJde1sYGEwhJYa0dNuC8wChVlFpr1L_PgfefSpBXQ3Z3urun4qiNmbYuleU-pw3l4xL6tgQteu72xKUQwn982kb9ABI-VLg</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Streaming-data algorithms for high-quality clustering</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>O'Callaghan, L. ; Mishra, N. ; Meyerson, A. ; Guha, S. ; Motwani, R.</creator><creatorcontrib>O'Callaghan, L. ; Mishra, N. ; Meyerson, A. ; Guha, S. ; Motwani, R.</creatorcontrib><description>Streaming data analysis has recently attracted attention in numerous applications including telephone records, Web documents and click streams. For such analysis, single-pass algorithms that consume a small amount of memory are critical. We describe such a streaming algorithm that effectively clusters large data streams. 
We also provide empirical evidence of the algorithm's performance on synthetic and real data streams.</description><identifier>ISSN: 1063-6382</identifier><identifier>ISBN: 9780769515311</identifier><identifier>ISBN: 0769515312</identifier><identifier>EISSN: 2375-026X</identifier><identifier>DOI: 10.1109/ICDE.2002.994785</identifier><language>eng</language><publisher>Los Alamitos CA: IEEE</publisher><subject>Algorithm design and analysis ; Applied sciences ; Clustering algorithms ; Computer science ; Computer science; control theory; systems ; Data analysis ; Data engineering ; Exact sciences and technology ; Information systems. Data bases ; Lab-on-a-chip ; Laboratories ; Memory organisation. Data processing ; Partitioning algorithms ; Software ; Telecommunications ; Telecommunications and information theory ; Telephony ; Teleprocessing networks. Isdn</subject><ispartof>Proceedings 18th International Conference on Data Engineering, 2002, p.685-694</ispartof><rights>2004 INIST-CNRS</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c223t-17d5fff8285ec6ec9d2d2479e7ea5a3546c581cd05f23f26665600001ddb30ad3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/994785$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,4050,4051,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/994785$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=15812160$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>O'Callaghan, L.</creatorcontrib><creatorcontrib>Mishra, N.</creatorcontrib><creatorcontrib>Meyerson, A.</creatorcontrib><creatorcontrib>Guha, 
S.</creatorcontrib><creatorcontrib>Motwani, R.</creatorcontrib><title>Streaming-data algorithms for high-quality clustering</title><title>Proceedings 18th International Conference on Data Engineering</title><addtitle>ICDE</addtitle><description>Streaming data analysis has recently attracted attention in numerous applications including telephone records, Web documents and click streams. For such analysis, single-pass algorithms that consume a small amount of memory are critical. We describe such a streaming algorithm that effectively clusters large data streams. We also provide empirical evidence of the algorithm's performance on synthetic and real data streams.</description><subject>Algorithm design and analysis</subject><subject>Applied sciences</subject><subject>Clustering algorithms</subject><subject>Computer science</subject><subject>Computer science; control theory; systems</subject><subject>Data analysis</subject><subject>Data engineering</subject><subject>Exact sciences and technology</subject><subject>Information systems. Data bases</subject><subject>Lab-on-a-chip</subject><subject>Laboratories</subject><subject>Memory organisation. Data processing</subject><subject>Partitioning algorithms</subject><subject>Software</subject><subject>Telecommunications</subject><subject>Telecommunications and information theory</subject><subject>Telephony</subject><subject>Teleprocessing networks. 
Isdn</subject><issn>1063-6382</issn><issn>2375-026X</issn><isbn>9780769515311</isbn><isbn>0769515312</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2002</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo9kDtPAzEQhC0eElFIj6iuoXTw2rf2uUQhgUiRKACJLlr8SIwuD-xLkX_PoSCmmWK-Wa2GsRsQYwBh7-eTx-lYCiHH1tamwTM2kMogF1J_nLORNY0w2iKgArhgAxBaca0aecVGpXyJXrYGQDFg-NrlQJu0XXFPHVXUrnY5detNqeIuV-u0WvPvA7WpO1auPZQu5J69ZpeR2hJGfz5k77Pp2-SZL16e5pOHBXdSqo6D8RhjbGSDwengrJde1sYGEwhJYa0dNuC8wChVlFpr1L_PgfefSpBXQ3Z3urun4qiNmbYuleU-pw3l4xL6tgQteu72xKUQwn982kb9ABI-VLg</recordid><startdate>2002</startdate><enddate>2002</enddate><creator>O'Callaghan, L.</creator><creator>Mishra, N.</creator><creator>Meyerson, A.</creator><creator>Guha, S.</creator><creator>Motwani, R.</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope><scope>IQODW</scope></search><sort><creationdate>2002</creationdate><title>Streaming-data algorithms for high-quality clustering</title><author>O'Callaghan, L. ; Mishra, N. ; Meyerson, A. ; Guha, S. ; Motwani, R.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c223t-17d5fff8285ec6ec9d2d2479e7ea5a3546c581cd05f23f26665600001ddb30ad3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2002</creationdate><topic>Algorithm design and analysis</topic><topic>Applied sciences</topic><topic>Clustering algorithms</topic><topic>Computer science</topic><topic>Computer science; control theory; systems</topic><topic>Data analysis</topic><topic>Data engineering</topic><topic>Exact sciences and technology</topic><topic>Information systems. Data bases</topic><topic>Lab-on-a-chip</topic><topic>Laboratories</topic><topic>Memory organisation. 
Data processing</topic><topic>Partitioning algorithms</topic><topic>Software</topic><topic>Telecommunications</topic><topic>Telecommunications and information theory</topic><topic>Telephony</topic><topic>Teleprocessing networks. Isdn</topic><toplevel>online_resources</toplevel><creatorcontrib>O'Callaghan, L.</creatorcontrib><creatorcontrib>Mishra, N.</creatorcontrib><creatorcontrib>Meyerson, A.</creatorcontrib><creatorcontrib>Guha, S.</creatorcontrib><creatorcontrib>Motwani, R.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>O'Callaghan, L.</au><au>Mishra, N.</au><au>Meyerson, A.</au><au>Guha, S.</au><au>Motwani, R.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Streaming-data algorithms for high-quality clustering</atitle><btitle>Proceedings 18th International Conference on Data Engineering</btitle><stitle>ICDE</stitle><date>2002</date><risdate>2002</risdate><spage>685</spage><epage>694</epage><pages>685-694</pages><issn>1063-6382</issn><eissn>2375-026X</eissn><isbn>9780769515311</isbn><isbn>0769515312</isbn><abstract>Streaming data analysis has recently attracted attention in numerous applications including telephone records, Web documents and click streams. For such analysis, single-pass algorithms that consume a small amount of memory are critical. We describe such a streaming algorithm that effectively clusters large data streams. 
We also provide empirical evidence of the algorithm's performance on synthetic and real data streams.</abstract><cop>Los Alamitos CA</cop><pub>IEEE</pub><doi>10.1109/ICDE.2002.994785</doi><tpages>10</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1063-6382 |
ispartof | Proceedings 18th International Conference on Data Engineering, 2002, p.685-694 |
issn | 1063-6382 ; 2375-026X |
language | eng |
recordid | cdi_ieee_primary_994785 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Algorithm design and analysis ; Applied sciences ; Clustering algorithms ; Computer science ; Computer science; control theory; systems ; Data analysis ; Data engineering ; Exact sciences and technology ; Information systems. Data bases ; Lab-on-a-chip ; Laboratories ; Memory organisation. Data processing ; Partitioning algorithms ; Software ; Telecommunications ; Telecommunications and information theory ; Telephony ; Teleprocessing networks. Isdn |
title | Streaming-data algorithms for high-quality clustering |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T21%3A52%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Streaming-data%20algorithms%20for%20high-quality%20clustering&rft.btitle=Proceedings%2018th%20International%20Conference%20on%20Data%20Engineering&rft.au=O'Callaghan,%20L.&rft.date=2002&rft.spage=685&rft.epage=694&rft.pages=685-694&rft.issn=1063-6382&rft.eissn=2375-026X&rft.isbn=9780769515311&rft.isbn_list=0769515312&rft_id=info:doi/10.1109/ICDE.2002.994785&rft_dat=%3Cpascalfrancis_6IE%3E15812160%3C/pascalfrancis_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=994785&rfr_iscdi=true |