A Simple Algorithm for Topic Identification in 0–1 Data

Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this resul...

Bibliographic details
Main authors: Seppänen, Jouni K.; Bingham, Ella; Mannila, Heikki
Format: Book chapter
Language: English
Subjects:
Online access: Full text
container_end_page 434
container_issue
container_start_page 423
container_title
container_volume
creator Seppänen, Jouni K.
Bingham, Ella
Mannila, Heikki
description Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this result to obtain a simple algorithm for finding topics in 0–1 data. We also show that a problem related to the identification of topics is NP-hard. We give experimental results on the topic identification problem, both on generated and real data.
doi_str_mv 10.1007/978-3-540-39804-2_38
format Book Chapter
fullrecord <record><control><sourceid>pascalfrancis_sprin</sourceid><recordid>TN_cdi_pascalfrancis_primary_15618329</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>15618329</sourcerecordid><originalsourceid>FETCH-LOGICAL-p272t-f2e04f91bc89a97651d8996d81bbba3e81edc15eb819899ed13702690e3636243</originalsourceid><addsrcrecordid>eNotkL1OwzAUhc2fRCl9AwYvjIZ7fRPHHqvyV6kSA0Vis5zEKYY0iZIsbLwDb8iT4Lbc5UjnO7rDx9gVwg0CZLcm04JEmoAgoyER0pI-YhcUm33xdswmqBAFUWJO2Czud0wC6DQ7ZRMgkMJkCZ2z2TB8QDySiVQ4YWbOX8K2qz2f15u2D-P7lldtz9dtFwq-LH0zhioUbgxtw0PD4ff7B_mdG90lO6tcPfjZf07Z68P9evEkVs-Py8V8JTqZyVFU0kNSGcwLbZzJVIqlNkaVGvM8d-Q1-rLA1OcaTQS-RMpAKgOeFCmZ0JRdH_52bihcXfWuKcJguz5sXf9lMVWoSZq4k4fdEFGz8b3N2_ZzsAh2J9FGKZZs1GL3zuxOIv0Bv3deGA</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>book_chapter</recordtype></control><display><type>book_chapter</type><title>A Simple Algorithm for Topic Identification in 0–1 Data</title><source>Springer Books</source><creator>Seppänen, Jouni K. ; Bingham, Ella ; Mannila, Heikki</creator><contributor>Lavrač, Nada ; Blockeel, Hendrik ; Todorovski, Ljupčo ; Gamberger, Dragan</contributor><creatorcontrib>Seppänen, Jouni K. ; Bingham, Ella ; Mannila, Heikki ; Lavrač, Nada ; Blockeel, Hendrik ; Todorovski, Ljupčo ; Gamberger, Dragan</creatorcontrib><description>Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this result to obtain a simple algorithm for finding topics in 0–1 data. We also show that a problem related to the identification of topics is NP-hard. 
We give experimental results on the topic identification problem, both on generated and real data.</description><identifier>ISSN: 0302-9743</identifier><identifier>ISBN: 9783540200857</identifier><identifier>ISBN: 3540200851</identifier><identifier>EISSN: 1611-3349</identifier><identifier>EISBN: 354039804X</identifier><identifier>EISBN: 9783540398042</identifier><identifier>DOI: 10.1007/978-3-540-39804-2_38</identifier><language>eng</language><publisher>Berlin, Heidelberg: Springer Berlin Heidelberg</publisher><subject>Applied sciences ; Artificial intelligence ; Computer science; control theory; systems ; Exact sciences and technology ; Independent Component Analysis ; Latent Semantic Analysis ; Learning and adaptive systems ; Nonnegative Matrix Factorization ; Topic Model ; Truth Assignment</subject><ispartof>Knowledge Discovery in Databases: PKDD 2003, 2003, p.423-434</ispartof><rights>Springer-Verlag Berlin Heidelberg 2003</rights><rights>2004 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><relation>Lecture Notes in Computer Science</relation></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/978-3-540-39804-2_38$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/978-3-540-39804-2_38$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>310,311,780,781,785,790,791,794,4051,4052,27930,38260,41447,42516</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=15618329$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><contributor>Lavrač, Nada</contributor><contributor>Blockeel, Hendrik</contributor><contributor>Todorovski, Ljupčo</contributor><contributor>Gamberger, Dragan</contributor><creatorcontrib>Seppänen, 
Jouni K.</creatorcontrib><creatorcontrib>Bingham, Ella</creatorcontrib><creatorcontrib>Mannila, Heikki</creatorcontrib><title>A Simple Algorithm for Topic Identification in 0–1 Data</title><title>Knowledge Discovery in Databases: PKDD 2003</title><description>Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this result to obtain a simple algorithm for finding topics in 0–1 data. We also show that a problem related to the identification of topics is NP-hard. We give experimental results on the topic identification problem, both on generated and real data.</description><subject>Applied sciences</subject><subject>Artificial intelligence</subject><subject>Computer science; control theory; systems</subject><subject>Exact sciences and technology</subject><subject>Independent Component Analysis</subject><subject>Latent Semantic Analysis</subject><subject>Learning and adaptive systems</subject><subject>Nonnegative Matrix Factorization</subject><subject>Topic Model</subject><subject>Truth 
Assignment</subject><issn>0302-9743</issn><issn>1611-3349</issn><isbn>9783540200857</isbn><isbn>3540200851</isbn><isbn>354039804X</isbn><isbn>9783540398042</isbn><fulltext>true</fulltext><rsrctype>book_chapter</rsrctype><creationdate>2003</creationdate><recordtype>book_chapter</recordtype><recordid>eNotkL1OwzAUhc2fRCl9AwYvjIZ7fRPHHqvyV6kSA0Vis5zEKYY0iZIsbLwDb8iT4Lbc5UjnO7rDx9gVwg0CZLcm04JEmoAgoyER0pI-YhcUm33xdswmqBAFUWJO2Czud0wC6DQ7ZRMgkMJkCZ2z2TB8QDySiVQ4YWbOX8K2qz2f15u2D-P7lldtz9dtFwq-LH0zhioUbgxtw0PD4ff7B_mdG90lO6tcPfjZf07Z68P9evEkVs-Py8V8JTqZyVFU0kNSGcwLbZzJVIqlNkaVGvM8d-Q1-rLA1OcaTQS-RMpAKgOeFCmZ0JRdH_52bihcXfWuKcJguz5sXf9lMVWoSZq4k4fdEFGz8b3N2_ZzsAh2J9FGKZZs1GL3zuxOIv0Bv3deGA</recordid><startdate>2003</startdate><enddate>2003</enddate><creator>Seppänen, Jouni K.</creator><creator>Bingham, Ella</creator><creator>Mannila, Heikki</creator><general>Springer Berlin Heidelberg</general><general>Springer</general><scope>IQODW</scope></search><sort><creationdate>2003</creationdate><title>A Simple Algorithm for Topic Identification in 0–1 Data</title><author>Seppänen, Jouni K. 
; Bingham, Ella ; Mannila, Heikki</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p272t-f2e04f91bc89a97651d8996d81bbba3e81edc15eb819899ed13702690e3636243</frbrgroupid><rsrctype>book_chapters</rsrctype><prefilter>book_chapters</prefilter><language>eng</language><creationdate>2003</creationdate><topic>Applied sciences</topic><topic>Artificial intelligence</topic><topic>Computer science; control theory; systems</topic><topic>Exact sciences and technology</topic><topic>Independent Component Analysis</topic><topic>Latent Semantic Analysis</topic><topic>Learning and adaptive systems</topic><topic>Nonnegative Matrix Factorization</topic><topic>Topic Model</topic><topic>Truth Assignment</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Seppänen, Jouni K.</creatorcontrib><creatorcontrib>Bingham, Ella</creatorcontrib><creatorcontrib>Mannila, Heikki</creatorcontrib><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Seppänen, Jouni K.</au><au>Bingham, Ella</au><au>Mannila, Heikki</au><au>Lavrač, Nada</au><au>Blockeel, Hendrik</au><au>Todorovski, Ljupčo</au><au>Gamberger, Dragan</au><format>book</format><genre>bookitem</genre><ristype>CHAP</ristype><atitle>A Simple Algorithm for Topic Identification in 0–1 Data</atitle><btitle>Knowledge Discovery in Databases: PKDD 2003</btitle><seriestitle>Lecture Notes in Computer Science</seriestitle><date>2003</date><risdate>2003</risdate><spage>423</spage><epage>434</epage><pages>423-434</pages><issn>0302-9743</issn><eissn>1611-3349</eissn><isbn>9783540200857</isbn><isbn>3540200851</isbn><eisbn>354039804X</eisbn><eisbn>9783540398042</eisbn><abstract>Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. 
In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this result to obtain a simple algorithm for finding topics in 0–1 data. We also show that a problem related to the identification of topics is NP-hard. We give experimental results on the topic identification problem, both on generated and real data.</abstract><cop>Berlin, Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/978-3-540-39804-2_38</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0302-9743
ispartof Knowledge Discovery in Databases: PKDD 2003, 2003, p.423-434
issn 0302-9743
1611-3349
language eng
recordid cdi_pascalfrancis_primary_15618329
source Springer Books
subjects Applied sciences
Artificial intelligence
Computer science; control theory; systems
Exact sciences and technology
Independent Component Analysis
Latent Semantic Analysis
Learning and adaptive systems
Nonnegative Matrix Factorization
Topic Model
Truth Assignment
title A Simple Algorithm for Topic Identification in 0–1 Data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-14T16%3A03%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_sprin&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=bookitem&rft.atitle=A%20Simple%20Algorithm%20for%20Topic%20Identification%20in%200%E2%80%931%20Data&rft.btitle=Knowledge%20Discovery%20in%20Databases:%20PKDD%202003&rft.au=Sepp%C3%A4nen,%20Jouni%20K.&rft.date=2003&rft.spage=423&rft.epage=434&rft.pages=423-434&rft.issn=0302-9743&rft.eissn=1611-3349&rft.isbn=9783540200857&rft.isbn_list=3540200851&rft_id=info:doi/10.1007/978-3-540-39804-2_38&rft_dat=%3Cpascalfrancis_sprin%3E15618329%3C/pascalfrancis_sprin%3E%3Curl%3E%3C/url%3E&rft.eisbn=354039804X&rft.eisbn_list=9783540398042&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true