Streaming-data algorithms for high-quality clustering

Streaming data analysis has recently attracted attention in numerous applications including telephone records, Web documents and click streams. For such analysis, single-pass algorithms that consume a small amount of memory are critical. We describe such a streaming algorithm that effectively cluste...

Bibliographic details
Main authors: O'Callaghan, L., Mishra, N., Meyerson, A., Guha, S., Motwani, R.
Format: Conference proceeding
Language: eng
container_end_page 694
container_issue
container_start_page 685
container_title
container_volume
creator O'Callaghan, L.
Mishra, N.
Meyerson, A.
Guha, S.
Motwani, R.
description Streaming data analysis has recently attracted attention in numerous applications including telephone records, Web documents and click streams. For such analysis, single-pass algorithms that consume a small amount of memory are critical. We describe such a streaming algorithm that effectively clusters large data streams. We also provide empirical evidence of the algorithm's performance on synthetic and real data streams.
doi_str_mv 10.1109/ICDE.2002.994785
format Conference Proceeding
fullrecord <record><control><sourceid>pascalfrancis_6IE</sourceid><recordid>TN_cdi_ieee_primary_994785</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>994785</ieee_id><sourcerecordid>15812160</sourcerecordid><originalsourceid>FETCH-LOGICAL-c223t-17d5fff8285ec6ec9d2d2479e7ea5a3546c581cd05f23f26665600001ddb30ad3</originalsourceid><addsrcrecordid>eNo9kDtPAzEQhC0eElFIj6iuoXTw2rf2uUQhgUiRKACJLlr8SIwuD-xLkX_PoSCmmWK-Wa2GsRsQYwBh7-eTx-lYCiHH1tamwTM2kMogF1J_nLORNY0w2iKgArhgAxBaca0aecVGpXyJXrYGQDFg-NrlQJu0XXFPHVXUrnY5detNqeIuV-u0WvPvA7WpO1auPZQu5J69ZpeR2hJGfz5k77Pp2-SZL16e5pOHBXdSqo6D8RhjbGSDwengrJde1sYGEwhJYa0dNuC8wChVlFpr1L_PgfefSpBXQ3Z3urun4qiNmbYuleU-pw3l4xL6tgQteu72xKUQwn982kb9ABI-VLg</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Streaming-data algorithms for high-quality clustering</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>O'Callaghan, L. ; Mishra, N. ; Meyerson, A. ; Guha, S. ; Motwani, R.</creator><creatorcontrib>O'Callaghan, L. ; Mishra, N. ; Meyerson, A. ; Guha, S. ; Motwani, R.</creatorcontrib><description>Streaming data analysis has recently attracted attention in numerous applications including telephone records, Web documents and click streams. For such analysis, single-pass algorithms that consume a small amount of memory are critical. We describe such a streaming algorithm that effectively clusters large data streams. 
We also provide empirical evidence of the algorithm's performance on synthetic and real data streams.</description><identifier>ISSN: 1063-6382</identifier><identifier>ISBN: 9780769515311</identifier><identifier>ISBN: 0769515312</identifier><identifier>EISSN: 2375-026X</identifier><identifier>DOI: 10.1109/ICDE.2002.994785</identifier><language>eng</language><publisher>Los Alamitos CA: IEEE</publisher><subject>Algorithm design and analysis ; Applied sciences ; Clustering algorithms ; Computer science ; Computer science; control theory; systems ; Data analysis ; Data engineering ; Exact sciences and technology ; Information systems. Data bases ; Lab-on-a-chip ; Laboratories ; Memory organisation. Data processing ; Partitioning algorithms ; Software ; Telecommunications ; Telecommunications and information theory ; Telephony ; Teleprocessing networks. Isdn</subject><ispartof>Proceedings 18th International Conference on Data Engineering, 2002, p.685-694</ispartof><rights>2004 INIST-CNRS</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c223t-17d5fff8285ec6ec9d2d2479e7ea5a3546c581cd05f23f26665600001ddb30ad3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/994785$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,4050,4051,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/994785$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=15812160$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>O'Callaghan, L.</creatorcontrib><creatorcontrib>Mishra, N.</creatorcontrib><creatorcontrib>Meyerson, A.</creatorcontrib><creatorcontrib>Guha, 
S.</creatorcontrib><creatorcontrib>Motwani, R.</creatorcontrib><title>Streaming-data algorithms for high-quality clustering</title><title>Proceedings 18th International Conference on Data Engineering</title><addtitle>ICDE</addtitle><description>Streaming data analysis has recently attracted attention in numerous applications including telephone records, Web documents and click streams. For such analysis, single-pass algorithms that consume a small amount of memory are critical. We describe such a streaming algorithm that effectively clusters large data streams. We also provide empirical evidence of the algorithm's performance on synthetic and real data streams.</description><subject>Algorithm design and analysis</subject><subject>Applied sciences</subject><subject>Clustering algorithms</subject><subject>Computer science</subject><subject>Computer science; control theory; systems</subject><subject>Data analysis</subject><subject>Data engineering</subject><subject>Exact sciences and technology</subject><subject>Information systems. Data bases</subject><subject>Lab-on-a-chip</subject><subject>Laboratories</subject><subject>Memory organisation. Data processing</subject><subject>Partitioning algorithms</subject><subject>Software</subject><subject>Telecommunications</subject><subject>Telecommunications and information theory</subject><subject>Telephony</subject><subject>Teleprocessing networks. 
Isdn</subject><issn>1063-6382</issn><issn>2375-026X</issn><isbn>9780769515311</isbn><isbn>0769515312</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2002</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo9kDtPAzEQhC0eElFIj6iuoXTw2rf2uUQhgUiRKACJLlr8SIwuD-xLkX_PoSCmmWK-Wa2GsRsQYwBh7-eTx-lYCiHH1tamwTM2kMogF1J_nLORNY0w2iKgArhgAxBaca0aecVGpXyJXrYGQDFg-NrlQJu0XXFPHVXUrnY5detNqeIuV-u0WvPvA7WpO1auPZQu5J69ZpeR2hJGfz5k77Pp2-SZL16e5pOHBXdSqo6D8RhjbGSDwengrJde1sYGEwhJYa0dNuC8wChVlFpr1L_PgfefSpBXQ3Z3urun4qiNmbYuleU-pw3l4xL6tgQteu72xKUQwn982kb9ABI-VLg</recordid><startdate>2002</startdate><enddate>2002</enddate><creator>O'Callaghan, L.</creator><creator>Mishra, N.</creator><creator>Meyerson, A.</creator><creator>Guha, S.</creator><creator>Motwani, R.</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope><scope>IQODW</scope></search><sort><creationdate>2002</creationdate><title>Streaming-data algorithms for high-quality clustering</title><author>O'Callaghan, L. ; Mishra, N. ; Meyerson, A. ; Guha, S. ; Motwani, R.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c223t-17d5fff8285ec6ec9d2d2479e7ea5a3546c581cd05f23f26665600001ddb30ad3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2002</creationdate><topic>Algorithm design and analysis</topic><topic>Applied sciences</topic><topic>Clustering algorithms</topic><topic>Computer science</topic><topic>Computer science; control theory; systems</topic><topic>Data analysis</topic><topic>Data engineering</topic><topic>Exact sciences and technology</topic><topic>Information systems. Data bases</topic><topic>Lab-on-a-chip</topic><topic>Laboratories</topic><topic>Memory organisation. 
Data processing</topic><topic>Partitioning algorithms</topic><topic>Software</topic><topic>Telecommunications</topic><topic>Telecommunications and information theory</topic><topic>Telephony</topic><topic>Teleprocessing networks. Isdn</topic><toplevel>online_resources</toplevel><creatorcontrib>O'Callaghan, L.</creatorcontrib><creatorcontrib>Mishra, N.</creatorcontrib><creatorcontrib>Meyerson, A.</creatorcontrib><creatorcontrib>Guha, S.</creatorcontrib><creatorcontrib>Motwani, R.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>O'Callaghan, L.</au><au>Mishra, N.</au><au>Meyerson, A.</au><au>Guha, S.</au><au>Motwani, R.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Streaming-data algorithms for high-quality clustering</atitle><btitle>Proceedings 18th International Conference on Data Engineering</btitle><stitle>ICDE</stitle><date>2002</date><risdate>2002</risdate><spage>685</spage><epage>694</epage><pages>685-694</pages><issn>1063-6382</issn><eissn>2375-026X</eissn><isbn>9780769515311</isbn><isbn>0769515312</isbn><abstract>Streaming data analysis has recently attracted attention in numerous applications including telephone records, Web documents and click streams. For such analysis, single-pass algorithms that consume a small amount of memory are critical. We describe such a streaming algorithm that effectively clusters large data streams. 
We also provide empirical evidence of the algorithm's performance on synthetic and real data streams.</abstract><cop>Los Alamitos CA</cop><pub>IEEE</pub><doi>10.1109/ICDE.2002.994785</doi><tpages>10</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1063-6382
ispartof Proceedings 18th International Conference on Data Engineering, 2002, p.685-694
issn 1063-6382
2375-026X
language eng
recordid cdi_ieee_primary_994785
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Algorithm design and analysis
Applied sciences
Clustering algorithms
Computer science
Computer science; control theory; systems
Data analysis
Data engineering
Exact sciences and technology
Information systems. Data bases
Lab-on-a-chip
Laboratories
Memory organisation. Data processing
Partitioning algorithms
Software
Telecommunications
Telecommunications and information theory
Telephony
Teleprocessing networks. Isdn
title Streaming-data algorithms for high-quality clustering
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T21%3A52%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Streaming-data%20algorithms%20for%20high-quality%20clustering&rft.btitle=Proceedings%2018th%20International%20Conference%20on%20Data%20Engineering&rft.au=O'Callaghan,%20L.&rft.date=2002&rft.spage=685&rft.epage=694&rft.pages=685-694&rft.issn=1063-6382&rft.eissn=2375-026X&rft.isbn=9780769515311&rft.isbn_list=0769515312&rft_id=info:doi/10.1109/ICDE.2002.994785&rft_dat=%3Cpascalfrancis_6IE%3E15812160%3C/pascalfrancis_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=994785&rfr_iscdi=true