A Simple Algorithm for Topic Identification in 0–1 Data
Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this resul...
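The abstract refers to the lift statistics of attributes. As a minimal sketch, assuming the standard definition lift(a, b) = P(a=1, b=1) / (P(a=1) · P(b=1)) estimated from row frequencies, the pairwise lifts of a small 0–1 dataset can be collected into a matrix as follows. The function name `lift_matrix` and the toy data are hypothetical illustrations; this is not the chapter's topic identification algorithm, only the statistic it starts from.

```python
from typing import List


def lift_matrix(rows: List[List[int]]) -> List[List[float]]:
    """Pairwise lift for the columns of a 0-1 data matrix.

    lift(a, b) = P(a=1, b=1) / (P(a=1) * P(b=1)), estimated from row counts.
    Pairs involving a column that is never 1 are left as NaN.
    """
    n = len(rows)
    m = len(rows[0])
    # Marginal frequency of each attribute (column).
    p = [sum(r[j] for r in rows) / n for j in range(m)]
    lift = [[float("nan")] * m for _ in range(m)]
    for a in range(m):
        for b in range(m):
            if p[a] > 0 and p[b] > 0:
                # Joint frequency of attributes a and b occurring together.
                p_ab = sum(r[a] * r[b] for r in rows) / n
                lift[a][b] = p_ab / (p[a] * p[b])
    return lift


# Toy data (assumed): attributes 0 and 1 always co-occur, attribute 2 never
# occurs together with them.
data = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
]
lift = lift_matrix(data)
```

A lift above 1 indicates that two attributes co-occur more often than independence would predict, which matches the abstract's notion of variables whose occurrences are positively connected; a lift below 1 indicates the opposite.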
Main authors: | Seppänen, Jouni K. ; Bingham, Ella ; Mannila, Heikki |
---|---|
Format: | Book chapter |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 434 |
---|---|
container_issue | |
container_start_page | 423 |
container_title | |
container_volume | |
creator | Seppänen, Jouni K. ; Bingham, Ella ; Mannila, Heikki |
description | Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this result to obtain a simple algorithm for finding topics in 0–1 data. We also show that a problem related to the identification of topics is NP-hard. We give experimental results on the topic identification problem, both on generated and real data. |
doi_str_mv | 10.1007/978-3-540-39804-2_38 |
format | Book Chapter |
fullrecord | <record><control><sourceid>pascalfrancis_sprin</sourceid><recordid>TN_cdi_pascalfrancis_primary_15618329</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>15618329</sourcerecordid><originalsourceid>FETCH-LOGICAL-p272t-f2e04f91bc89a97651d8996d81bbba3e81edc15eb819899ed13702690e3636243</originalsourceid><addsrcrecordid>eNotkL1OwzAUhc2fRCl9AwYvjIZ7fRPHHqvyV6kSA0Vis5zEKYY0iZIsbLwDb8iT4Lbc5UjnO7rDx9gVwg0CZLcm04JEmoAgoyER0pI-YhcUm33xdswmqBAFUWJO2Czud0wC6DQ7ZRMgkMJkCZ2z2TB8QDySiVQ4YWbOX8K2qz2f15u2D-P7lldtz9dtFwq-LH0zhioUbgxtw0PD4ff7B_mdG90lO6tcPfjZf07Z68P9evEkVs-Py8V8JTqZyVFU0kNSGcwLbZzJVIqlNkaVGvM8d-Q1-rLA1OcaTQS-RMpAKgOeFCmZ0JRdH_52bihcXfWuKcJguz5sXf9lMVWoSZq4k4fdEFGz8b3N2_ZzsAh2J9FGKZZs1GL3zuxOIv0Bv3deGA</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>book_chapter</recordtype></control><display><type>book_chapter</type><title>A Simple Algorithm for Topic Identification in 0–1 Data</title><source>Springer Books</source><creator>Seppänen, Jouni K. ; Bingham, Ella ; Mannila, Heikki</creator><contributor>Lavrač, Nada ; Blockeel, Hendrik ; Todorovski, Ljupčo ; Gamberger, Dragan</contributor><creatorcontrib>Seppänen, Jouni K. ; Bingham, Ella ; Mannila, Heikki ; Lavrač, Nada ; Blockeel, Hendrik ; Todorovski, Ljupčo ; Gamberger, Dragan</creatorcontrib><description>Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this result to obtain a simple algorithm for finding topics in 0–1 data. We also show that a problem related to the identification of topics is NP-hard. 
We give experimental results on the topic identification problem, both on generated and real data.</description><identifier>ISSN: 0302-9743</identifier><identifier>ISBN: 9783540200857</identifier><identifier>ISBN: 3540200851</identifier><identifier>EISSN: 1611-3349</identifier><identifier>EISBN: 354039804X</identifier><identifier>EISBN: 9783540398042</identifier><identifier>DOI: 10.1007/978-3-540-39804-2_38</identifier><language>eng</language><publisher>Berlin, Heidelberg: Springer Berlin Heidelberg</publisher><subject>Applied sciences ; Artificial intelligence ; Computer science; control theory; systems ; Exact sciences and technology ; Independent Component Analysis ; Latent Semantic Analysis ; Learning and adaptive systems ; Nonnegative Matrix Factorization ; Topic Model ; Truth Assignment</subject><ispartof>Knowledge Discovery in Databases: PKDD 2003, 2003, p.423-434</ispartof><rights>Springer-Verlag Berlin Heidelberg 2003</rights><rights>2004 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><relation>Lecture Notes in Computer Science</relation></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/978-3-540-39804-2_38$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/978-3-540-39804-2_38$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>310,311,780,781,785,790,791,794,4051,4052,27930,38260,41447,42516</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=15618329$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><contributor>Lavrač, Nada</contributor><contributor>Blockeel, Hendrik</contributor><contributor>Todorovski, Ljupčo</contributor><contributor>Gamberger, Dragan</contributor><creatorcontrib>Seppänen, 
Jouni K.</creatorcontrib><creatorcontrib>Bingham, Ella</creatorcontrib><creatorcontrib>Mannila, Heikki</creatorcontrib><title>A Simple Algorithm for Topic Identification in 0–1 Data</title><title>Knowledge Discovery in Databases: PKDD 2003</title><description>Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this result to obtain a simple algorithm for finding topics in 0–1 data. We also show that a problem related to the identification of topics is NP-hard. We give experimental results on the topic identification problem, both on generated and real data.</description><subject>Applied sciences</subject><subject>Artificial intelligence</subject><subject>Computer science; control theory; systems</subject><subject>Exact sciences and technology</subject><subject>Independent Component Analysis</subject><subject>Latent Semantic Analysis</subject><subject>Learning and adaptive systems</subject><subject>Nonnegative Matrix Factorization</subject><subject>Topic Model</subject><subject>Truth 
Assignment</subject><issn>0302-9743</issn><issn>1611-3349</issn><isbn>9783540200857</isbn><isbn>3540200851</isbn><isbn>354039804X</isbn><isbn>9783540398042</isbn><fulltext>true</fulltext><rsrctype>book_chapter</rsrctype><creationdate>2003</creationdate><recordtype>book_chapter</recordtype><recordid>eNotkL1OwzAUhc2fRCl9AwYvjIZ7fRPHHqvyV6kSA0Vis5zEKYY0iZIsbLwDb8iT4Lbc5UjnO7rDx9gVwg0CZLcm04JEmoAgoyER0pI-YhcUm33xdswmqBAFUWJO2Czud0wC6DQ7ZRMgkMJkCZ2z2TB8QDySiVQ4YWbOX8K2qz2f15u2D-P7lldtz9dtFwq-LH0zhioUbgxtw0PD4ff7B_mdG90lO6tcPfjZf07Z68P9evEkVs-Py8V8JTqZyVFU0kNSGcwLbZzJVIqlNkaVGvM8d-Q1-rLA1OcaTQS-RMpAKgOeFCmZ0JRdH_52bihcXfWuKcJguz5sXf9lMVWoSZq4k4fdEFGz8b3N2_ZzsAh2J9FGKZZs1GL3zuxOIv0Bv3deGA</recordid><startdate>2003</startdate><enddate>2003</enddate><creator>Seppänen, Jouni K.</creator><creator>Bingham, Ella</creator><creator>Mannila, Heikki</creator><general>Springer Berlin Heidelberg</general><general>Springer</general><scope>IQODW</scope></search><sort><creationdate>2003</creationdate><title>A Simple Algorithm for Topic Identification in 0–1 Data</title><author>Seppänen, Jouni K. 
; Bingham, Ella ; Mannila, Heikki</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p272t-f2e04f91bc89a97651d8996d81bbba3e81edc15eb819899ed13702690e3636243</frbrgroupid><rsrctype>book_chapters</rsrctype><prefilter>book_chapters</prefilter><language>eng</language><creationdate>2003</creationdate><topic>Applied sciences</topic><topic>Artificial intelligence</topic><topic>Computer science; control theory; systems</topic><topic>Exact sciences and technology</topic><topic>Independent Component Analysis</topic><topic>Latent Semantic Analysis</topic><topic>Learning and adaptive systems</topic><topic>Nonnegative Matrix Factorization</topic><topic>Topic Model</topic><topic>Truth Assignment</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Seppänen, Jouni K.</creatorcontrib><creatorcontrib>Bingham, Ella</creatorcontrib><creatorcontrib>Mannila, Heikki</creatorcontrib><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Seppänen, Jouni K.</au><au>Bingham, Ella</au><au>Mannila, Heikki</au><au>Lavrač, Nada</au><au>Blockeel, Hendrik</au><au>Todorovski, Ljupčo</au><au>Gamberger, Dragan</au><format>book</format><genre>bookitem</genre><ristype>CHAP</ristype><atitle>A Simple Algorithm for Topic Identification in 0–1 Data</atitle><btitle>Knowledge Discovery in Databases: PKDD 2003</btitle><seriestitle>Lecture Notes in Computer Science</seriestitle><date>2003</date><risdate>2003</risdate><spage>423</spage><epage>434</epage><pages>423-434</pages><issn>0302-9743</issn><eissn>1611-3349</eissn><isbn>9783540200857</isbn><isbn>3540200851</isbn><eisbn>354039804X</eisbn><eisbn>9783540398042</eisbn><abstract>Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. 
In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this result to obtain a simple algorithm for finding topics in 0–1 data. We also show that a problem related to the identification of topics is NP-hard. We give experimental results on the topic identification problem, both on generated and real data.</abstract><cop>Berlin, Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/978-3-540-39804-2_38</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0302-9743 |
ispartof | Knowledge Discovery in Databases: PKDD 2003, 2003, p.423-434 |
issn | 0302-9743 ; 1611-3349 |
language | eng |
recordid | cdi_pascalfrancis_primary_15618329 |
source | Springer Books |
subjects | Applied sciences ; Artificial intelligence ; Computer science; control theory; systems ; Exact sciences and technology ; Independent Component Analysis ; Latent Semantic Analysis ; Learning and adaptive systems ; Nonnegative Matrix Factorization ; Topic Model ; Truth Assignment |
title | A Simple Algorithm for Topic Identification in 0–1 Data |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-14T16%3A03%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_sprin&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=bookitem&rft.atitle=A%20Simple%20Algorithm%20for%20Topic%20Identification%20in%200%E2%80%931%20Data&rft.btitle=Knowledge%20Discovery%20in%20Databases:%20PKDD%202003&rft.au=Sepp%C3%A4nen,%20Jouni%20K.&rft.date=2003&rft.spage=423&rft.epage=434&rft.pages=423-434&rft.issn=0302-9743&rft.eissn=1611-3349&rft.isbn=9783540200857&rft.isbn_list=3540200851&rft_id=info:doi/10.1007/978-3-540-39804-2_38&rft_dat=%3Cpascalfrancis_sprin%3E15618329%3C/pascalfrancis_sprin%3E%3Curl%3E%3C/url%3E&rft.eisbn=354039804X&rft.eisbn_list=9783540398042&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |