Towards a Simple Clustering Criterion Based on Minimum Length Encoding
We propose a simple and intuitive clustering evaluation criterion based on the minimum description length principle which yields a particularly simple way of describing and encoding a set of examples. The basic idea is to view a clustering as a restriction of the attribute domains, given an example’...
Main authors: | Ludl, Marcus-Christopher ; Widmer, Gerhard |
---|---|
Format: | Book chapter |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 270 |
---|---|
container_issue | |
container_start_page | 258 |
container_title | |
container_volume | 2430 |
creator | Ludl, Marcus-Christopher ; Widmer, Gerhard |
description | We propose a simple and intuitive clustering evaluation criterion based on the minimum description length principle which yields a particularly simple way of describing and encoding a set of examples. The basic idea is to view a clustering as a restriction of the attribute domains, given an example’s cluster membership. As a special operational case we develop the so-called rectangular uniform message length measure that can be used to evaluate clusterings described as sets of hyper-rectangles. We theoretically prove that this measure punishes cluster boundaries in regions of uniform instance distribution (i.e., unintuitive clusterings), and we experimentally compare a simple clustering algorithm using this measure with the well-known algorithms KMeans and AutoClass. |
doi_str_mv | 10.1007/3-540-36755-1_22 |
format | Book Chapter |
fullrecord | <record><control><sourceid>proquest_pasca</sourceid><recordid>TN_cdi_pascalfrancis_primary_14655655</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>EBC3073102_28_272</sourcerecordid><originalsourceid>FETCH-LOGICAL-p310t-f42286c3844ac4cbd75fbeab2c4a016567bb0d8cd82a883a522ad9871d2b2a563</originalsourceid><addsrcrecordid>eNo9kE1PwzAMhsOnmGB3jr1w7EjspEmPMPElDXEAzpGbZlDo2pJ0Qvx7UkBYlmzZ72vJD2Ongi8E5_occyV5joVWKhcWYIfNS20wDX9mfJfNRCFEjijLvf-dnNblPptx5JCXWuIhm5WooBRaiSM2j_GNp0CQyogZu37qPynUMaPssdkMrc-W7TaOPjTdS7YMzdT1XXZJ0ddZau6brtlsN9nKdy_ja3bVub5O0hN2sKY2-vlfPWbP11dPy9t89XBzt7xY5QMKPuZrCWAKh0ZKctJVtVbrylMFThIXhSp0VfHauNoAGYOkAKgujRY1VECqwGN29nt3oOioXQfqXBPtEJoNhS8rZKFUyqRb_OriMH3ig636_j1awe0E16JNrOwPSDvBTQb8Oxz6j62Po_WTw_luDNS6VxoSiWiR6_QIWDAWNOA34351_Q</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>book_chapter</recordtype><pqid>EBC3073102_28_272</pqid></control><display><type>book_chapter</type><title>Towards a Simple Clustering Criterion Based on Minimum Length Encoding</title><source>Springer Books</source><creator>Ludl, Marcus-Christopher ; Widmer, Gerhard</creator><contributor>Toivonen, Hannu ; Elomaa, Tapio ; Mannila, Heikki ; Mannila, Heikki ; Toivonen, Hannu ; Elomaa, Tapio</contributor><creatorcontrib>Ludl, Marcus-Christopher ; Widmer, Gerhard ; Toivonen, Hannu ; Elomaa, Tapio ; Mannila, Heikki ; Mannila, Heikki ; Toivonen, Hannu ; Elomaa, Tapio</creatorcontrib><description>We propose a simple and intuitive clustering evaluation criterion based on the minimum description length principle which yields a particularly simple way of describing and encoding a set of examples. The basic idea is to view a clustering as a restriction of the attribute domains, given an example’s cluster membership. As a special operational case we develop the so-called rectangular uniform message length measure that can be used to evaluate clusterings described as sets of hyper-rectangles. 
We theoretically prove that this measure punishes cluster boundaries in regions of uniform instance distribution (i.e., unintuitive clusterings), and we experimentally compare a simple clustering algorithm using this measure with the well-known algorithms KMeans and AutoClass.</description><identifier>ISSN: 0302-9743</identifier><identifier>ISBN: 9783540440369</identifier><identifier>ISBN: 3540440364</identifier><identifier>EISSN: 1611-3349</identifier><identifier>EISBN: 9783540367550</identifier><identifier>EISBN: 3540367551</identifier><identifier>DOI: 10.1007/3-540-36755-1_22</identifier><identifier>OCLC: 935291751</identifier><identifier>LCCallNum: Q334-342</identifier><language>eng</language><publisher>Germany: Springer Berlin / Heidelberg</publisher><subject>Applied sciences ; Artificial intelligence ; Candidate Cluster ; Computer science; control theory; systems ; Exact sciences and technology ; Instance Space ; Learning and adaptive systems ; Message Length ; Minimum Description Length ; Synthetic Dataset</subject><ispartof>Lecture notes in computer science, 2002, Vol.2430, p.258-270</ispartof><rights>Springer-Verlag Berlin Heidelberg 2002</rights><rights>2003 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><relation>Lecture Notes in Computer Science</relation></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Uhttps://ebookcentral.proquest.com/covers/3073102-l.jpg</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/3-540-36755-1_22$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/3-540-36755-1_22$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>309,310,775,776,780,785,786,789,4036,4037,27902,38232,41418,42487</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=14655655$$DView record in Pascal 
Francis$$Hfree_for_read</backlink></links><search><contributor>Toivonen, Hannu</contributor><contributor>Elomaa, Tapio</contributor><contributor>Mannila, Heikki</contributor><contributor>Mannila, Heikki</contributor><contributor>Toivonen, Hannu</contributor><contributor>Elomaa, Tapio</contributor><creatorcontrib>Ludl, Marcus-Christopher</creatorcontrib><creatorcontrib>Widmer, Gerhard</creatorcontrib><title>Towards a Simple Clustering Criterion Based on Minimum Length Encoding</title><title>Lecture notes in computer science</title><description>We propose a simple and intuitive clustering evaluation criterion based on the minimum description length principle which yields a particularly simple way of describing and encoding a set of examples. The basic idea is to view a clustering as a restriction of the attribute domains, given an example’s cluster membership. As a special operational case we develop the so-called rectangular uniform message length measure that can be used to evaluate clusterings described as sets of hyper-rectangles. 
We theoretically prove that this measure punishes cluster boundaries in regions of uniform instance distribution (i.e., unintuitive clusterings), and we experimentally compare a simple clustering algorithm using this measure with the well-known algorithms KMeans and AutoClass.</description><subject>Applied sciences</subject><subject>Artificial intelligence</subject><subject>Candidate Cluster</subject><subject>Computer science; control theory; systems</subject><subject>Exact sciences and technology</subject><subject>Instance Space</subject><subject>Learning and adaptive systems</subject><subject>Message Length</subject><subject>Minimum Description Length</subject><subject>Synthetic Dataset</subject><issn>0302-9743</issn><issn>1611-3349</issn><isbn>9783540440369</isbn><isbn>3540440364</isbn><isbn>9783540367550</isbn><isbn>3540367551</isbn><fulltext>true</fulltext><rsrctype>book_chapter</rsrctype><creationdate>2002</creationdate><recordtype>book_chapter</recordtype><recordid>eNo9kE1PwzAMhsOnmGB3jr1w7EjspEmPMPElDXEAzpGbZlDo2pJ0Qvx7UkBYlmzZ72vJD2Ongi8E5_occyV5joVWKhcWYIfNS20wDX9mfJfNRCFEjijLvf-dnNblPptx5JCXWuIhm5WooBRaiSM2j_GNp0CQyogZu37qPynUMaPssdkMrc-W7TaOPjTdS7YMzdT1XXZJ0ddZau6brtlsN9nKdy_ja3bVub5O0hN2sKY2-vlfPWbP11dPy9t89XBzt7xY5QMKPuZrCWAKh0ZKctJVtVbrylMFThIXhSp0VfHauNoAGYOkAKgujRY1VECqwGN29nt3oOioXQfqXBPtEJoNhS8rZKFUyqRb_OriMH3ig636_j1awe0E16JNrOwPSDvBTQb8Oxz6j62Po_WTw_luDNS6VxoSiWiR6_QIWDAWNOA34351_Q</recordid><startdate>2002</startdate><enddate>2002</enddate><creator>Ludl, Marcus-Christopher</creator><creator>Widmer, Gerhard</creator><general>Springer Berlin / Heidelberg</general><general>Springer Berlin Heidelberg</general><general>Springer</general><scope>FFUUA</scope><scope>IQODW</scope></search><sort><creationdate>2002</creationdate><title>Towards a Simple Clustering Criterion Based on Minimum Length Encoding</title><author>Ludl, Marcus-Christopher ; Widmer, 
Gerhard</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p310t-f42286c3844ac4cbd75fbeab2c4a016567bb0d8cd82a883a522ad9871d2b2a563</frbrgroupid><rsrctype>book_chapters</rsrctype><prefilter>book_chapters</prefilter><language>eng</language><creationdate>2002</creationdate><topic>Applied sciences</topic><topic>Artificial intelligence</topic><topic>Candidate Cluster</topic><topic>Computer science; control theory; systems</topic><topic>Exact sciences and technology</topic><topic>Instance Space</topic><topic>Learning and adaptive systems</topic><topic>Message Length</topic><topic>Minimum Description Length</topic><topic>Synthetic Dataset</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ludl, Marcus-Christopher</creatorcontrib><creatorcontrib>Widmer, Gerhard</creatorcontrib><collection>ProQuest Ebook Central - Book Chapters - Demo use only</collection><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ludl, Marcus-Christopher</au><au>Widmer, Gerhard</au><au>Toivonen, Hannu</au><au>Elomaa, Tapio</au><au>Mannila, Heikki</au><au>Mannila, Heikki</au><au>Toivonen, Hannu</au><au>Elomaa, Tapio</au><format>book</format><genre>bookitem</genre><ristype>CHAP</ristype><atitle>Towards a Simple Clustering Criterion Based on Minimum Length Encoding</atitle><btitle>Lecture notes in computer science</btitle><seriestitle>Lecture Notes in Computer Science</seriestitle><date>2002</date><risdate>2002</risdate><volume>2430</volume><spage>258</spage><epage>270</epage><pages>258-270</pages><issn>0302-9743</issn><eissn>1611-3349</eissn><isbn>9783540440369</isbn><isbn>3540440364</isbn><eisbn>9783540367550</eisbn><eisbn>3540367551</eisbn><abstract>We propose a simple and intuitive clustering evaluation criterion based on the minimum description length principle which yields a particularly simple way of describing and 
encoding a set of examples. The basic idea is to view a clustering as a restriction of the attribute domains, given an example’s cluster membership. As a special operational case we develop the so-called rectangular uniform message length measure that can be used to evaluate clusterings described as sets of hyper-rectangles. We theoretically prove that this measure punishes cluster boundaries in regions of uniform instance distribution (i.e., unintuitive clusterings), and we experimentally compare a simple clustering algorithm using this measure with the well-known algorithms KMeans and AutoClass.</abstract><cop>Germany</cop><pub>Springer Berlin / Heidelberg</pub><doi>10.1007/3-540-36755-1_22</doi><oclcid>935291751</oclcid><tpages>13</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0302-9743 |
ispartof | Lecture notes in computer science, 2002, Vol.2430, p.258-270 |
issn | 0302-9743 1611-3349 |
language | eng |
recordid | cdi_pascalfrancis_primary_14655655 |
source | Springer Books |
subjects | Applied sciences ; Artificial intelligence ; Candidate Cluster ; Computer science; control theory; systems ; Exact sciences and technology ; Instance Space ; Learning and adaptive systems ; Message Length ; Minimum Description Length ; Synthetic Dataset |
title | Towards a Simple Clustering Criterion Based on Minimum Length Encoding |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T09%3A21%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pasca&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=bookitem&rft.atitle=Towards%20a%20Simple%20Clustering%20Criterion%20Based%20on%20Minimum%20Length%20Encoding&rft.btitle=Lecture%20notes%20in%20computer%20science&rft.au=Ludl,%20Marcus-Christopher&rft.date=2002&rft.volume=2430&rft.spage=258&rft.epage=270&rft.pages=258-270&rft.issn=0302-9743&rft.eissn=1611-3349&rft.isbn=9783540440369&rft.isbn_list=3540440364&rft_id=info:doi/10.1007/3-540-36755-1_22&rft_dat=%3Cproquest_pasca%3EEBC3073102_28_272%3C/proquest_pasca%3E%3Curl%3E%3C/url%3E&rft.eisbn=9783540367550&rft.eisbn_list=3540367551&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=EBC3073102_28_272&rft_id=info:pmid/&rfr_iscdi=true |