A Simple Algorithm for Topic Identification in 0–1 Data

Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this resul...

Bibliographic details
Main authors:	Seppänen, Jouni K.; Bingham, Ella; Mannila, Heikki
Format:	Book chapter
Language:	eng
Subjects:	Applied sciences ; Artificial intelligence ; Computer science; control theory; systems ; Exact sciences and technology ; Independent Component Analysis ; Latent Semantic Analysis ; Learning and adaptive systems ; Nonnegative Matrix Factorization ; Topic Model ; Truth Assignment
Online access:	Full text

container_end_page	434
container_issue
container_start_page	423
container_title
container_volume
creator	Seppänen, Jouni K. ; Bingham, Ella ; Mannila, Heikki
description	Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this result to obtain a simple algorithm for finding topics in 0–1 data. We also show that a problem related to the identification of topics is NP-hard. We give experimental results on the topic identification problem, both on generated and real data.
doi_str_mv	10.1007/978-3-540-39804-2_38
format	Book Chapter
fullrecord	<record><control><sourceid>pascalfrancis_sprin</sourceid><recordid>TN_cdi_pascalfrancis_primary_15618329</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>15618329</sourcerecordid><originalsourceid>FETCH-LOGICAL-p272t-f2e04f91bc89a97651d8996d81bbba3e81edc15eb819899ed13702690e3636243</originalsourceid><addsrcrecordid>eNotkL1OwzAUhc2fRCl9AwYvjIZ7fRPHHqvyV6kSA0Vis5zEKYY0iZIsbLwDb8iT4Lbc5UjnO7rDx9gVwg0CZLcm04JEmoAgoyER0pI-YhcUm33xdswmqBAFUWJO2Czud0wC6DQ7ZRMgkMJkCZ2z2TB8QDySiVQ4YWbOX8K2qz2f15u2D-P7lldtz9dtFwq-LH0zhioUbgxtw0PD4ff7B_mdG90lO6tcPfjZf07Z68P9evEkVs-Py8V8JTqZyVFU0kNSGcwLbZzJVIqlNkaVGvM8d-Q1-rLA1OcaTQS-RMpAKgOeFCmZ0JRdH_52bihcXfWuKcJguz5sXf9lMVWoSZq4k4fdEFGz8b3N2_ZzsAh2J9FGKZZs1GL3zuxOIv0Bv3deGA</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>book_chapter</recordtype></control><display><type>book_chapter</type><title>A Simple Algorithm for Topic Identification in 0–1 Data</title><source>Springer Books</source><creator>Seppänen, Jouni K. ; Bingham, Ella ; Mannila, Heikki</creator><contributor>Lavrač, Nada ; Blockeel, Hendrik ; Todorovski, Ljupčo ; Gamberger, Dragan</contributor><creatorcontrib>Seppänen, Jouni K. ; Bingham, Ella ; Mannila, Heikki ; Lavrač, Nada ; Blockeel, Hendrik ; Todorovski, Ljupčo ; Gamberger, Dragan</creatorcontrib><description>Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this result to obtain a simple algorithm for finding topics in 0–1 data. We also show that a problem related to the identification of topics is NP-hard. 
We give experimental results on the topic identification problem, both on generated and real data.</description><identifier>ISSN: 0302-9743</identifier><identifier>ISBN: 9783540200857</identifier><identifier>ISBN: 3540200851</identifier><identifier>EISSN: 1611-3349</identifier><identifier>EISBN: 354039804X</identifier><identifier>EISBN: 9783540398042</identifier><identifier>DOI: 10.1007/978-3-540-39804-2_38</identifier><language>eng</language><publisher>Berlin, Heidelberg: Springer Berlin Heidelberg</publisher><subject>Applied sciences ; Artificial intelligence ; Computer science; control theory; systems ; Exact sciences and technology ; Independent Component Analysis ; Latent Semantic Analysis ; Learning and adaptive systems ; Nonnegative Matrix Factorization ; Topic Model ; Truth Assignment</subject><ispartof>Knowledge Discovery in Databases: PKDD 2003, 2003, p.423-434</ispartof><rights>Springer-Verlag Berlin Heidelberg 2003</rights><rights>2004 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><relation>Lecture Notes in Computer Science</relation></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/978-3-540-39804-2_38$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/978-3-540-39804-2_38$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>310,311,780,781,785,790,791,794,4051,4052,27930,38260,41447,42516</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=15618329$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><contributor>Lavrač, Nada</contributor><contributor>Blockeel, Hendrik</contributor><contributor>Todorovski, Ljupčo</contributor><contributor>Gamberger, Dragan</contributor><creatorcontrib>Seppänen, 
Jouni K.</creatorcontrib><creatorcontrib>Bingham, Ella</creatorcontrib><creatorcontrib>Mannila, Heikki</creatorcontrib><title>A Simple Algorithm for Topic Identification in 0–1 Data</title><title>Knowledge Discovery in Databases: PKDD 2003</title><description>Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this result to obtain a simple algorithm for finding topics in 0–1 data. We also show that a problem related to the identification of topics is NP-hard. We give experimental results on the topic identification problem, both on generated and real data.</description><subject>Applied sciences</subject><subject>Artificial intelligence</subject><subject>Computer science; control theory; systems</subject><subject>Exact sciences and technology</subject><subject>Independent Component Analysis</subject><subject>Latent Semantic Analysis</subject><subject>Learning and adaptive systems</subject><subject>Nonnegative Matrix Factorization</subject><subject>Topic Model</subject><subject>Truth 
Assignment</subject><issn>0302-9743</issn><issn>1611-3349</issn><isbn>9783540200857</isbn><isbn>3540200851</isbn><isbn>354039804X</isbn><isbn>9783540398042</isbn><fulltext>true</fulltext><rsrctype>book_chapter</rsrctype><creationdate>2003</creationdate><recordtype>book_chapter</recordtype><recordid>eNotkL1OwzAUhc2fRCl9AwYvjIZ7fRPHHqvyV6kSA0Vis5zEKYY0iZIsbLwDb8iT4Lbc5UjnO7rDx9gVwg0CZLcm04JEmoAgoyER0pI-YhcUm33xdswmqBAFUWJO2Czud0wC6DQ7ZRMgkMJkCZ2z2TB8QDySiVQ4YWbOX8K2qz2f15u2D-P7lldtz9dtFwq-LH0zhioUbgxtw0PD4ff7B_mdG90lO6tcPfjZf07Z68P9evEkVs-Py8V8JTqZyVFU0kNSGcwLbZzJVIqlNkaVGvM8d-Q1-rLA1OcaTQS-RMpAKgOeFCmZ0JRdH_52bihcXfWuKcJguz5sXf9lMVWoSZq4k4fdEFGz8b3N2_ZzsAh2J9FGKZZs1GL3zuxOIv0Bv3deGA</recordid><startdate>2003</startdate><enddate>2003</enddate><creator>Seppänen, Jouni K.</creator><creator>Bingham, Ella</creator><creator>Mannila, Heikki</creator><general>Springer Berlin Heidelberg</general><general>Springer</general><scope>IQODW</scope></search><sort><creationdate>2003</creationdate><title>A Simple Algorithm for Topic Identification in 0–1 Data</title><author>Seppänen, Jouni K. 
; Bingham, Ella ; Mannila, Heikki</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p272t-f2e04f91bc89a97651d8996d81bbba3e81edc15eb819899ed13702690e3636243</frbrgroupid><rsrctype>book_chapters</rsrctype><prefilter>book_chapters</prefilter><language>eng</language><creationdate>2003</creationdate><topic>Applied sciences</topic><topic>Artificial intelligence</topic><topic>Computer science; control theory; systems</topic><topic>Exact sciences and technology</topic><topic>Independent Component Analysis</topic><topic>Latent Semantic Analysis</topic><topic>Learning and adaptive systems</topic><topic>Nonnegative Matrix Factorization</topic><topic>Topic Model</topic><topic>Truth Assignment</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Seppänen, Jouni K.</creatorcontrib><creatorcontrib>Bingham, Ella</creatorcontrib><creatorcontrib>Mannila, Heikki</creatorcontrib><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Seppänen, Jouni K.</au><au>Bingham, Ella</au><au>Mannila, Heikki</au><au>Lavrač, Nada</au><au>Blockeel, Hendrik</au><au>Todorovski, Ljupčo</au><au>Gamberger, Dragan</au><format>book</format><genre>bookitem</genre><ristype>CHAP</ristype><atitle>A Simple Algorithm for Topic Identification in 0–1 Data</atitle><btitle>Knowledge Discovery in Databases: PKDD 2003</btitle><seriestitle>Lecture Notes in Computer Science</seriestitle><date>2003</date><risdate>2003</risdate><spage>423</spage><epage>434</epage><pages>423-434</pages><issn>0302-9743</issn><eissn>1611-3349</eissn><isbn>9783540200857</isbn><isbn>3540200851</isbn><eisbn>354039804X</eisbn><eisbn>9783540398042</eisbn><abstract>Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. 
In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this result to obtain a simple algorithm for finding topics in 0–1 data. We also show that a problem related to the identification of topics is NP-hard. We give experimental results on the topic identification problem, both on generated and real data.</abstract><cop>Berlin, Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/978-3-540-39804-2_38</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 0302-9743
ispartof	Knowledge Discovery in Databases: PKDD 2003, 2003, p.423-434
issn	0302-9743 1611-3349
language	eng
recordid	cdi_pascalfrancis_primary_15618329
source	Springer Books
subjects	Applied sciences ; Artificial intelligence ; Computer science; control theory; systems ; Exact sciences and technology ; Independent Component Analysis ; Latent Semantic Analysis ; Learning and adaptive systems ; Nonnegative Matrix Factorization ; Topic Model ; Truth Assignment
title	A Simple Algorithm for Topic Identification in 0–1 Data
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-14T16%3A03%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_sprin&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=bookitem&rft.atitle=A%20Simple%20Algorithm%20for%20Topic%20Identification%20in%200%E2%80%931%20Data&rft.btitle=Knowledge%20Discovery%20in%20Databases:%20PKDD%202003&rft.au=Sepp%C3%A4nen,%20Jouni%20K.&rft.date=2003&rft.spage=423&rft.epage=434&rft.pages=423-434&rft.issn=0302-9743&rft.eissn=1611-3349&rft.isbn=9783540200857&rft.isbn_list=3540200851&rft_id=info:doi/10.1007/978-3-540-39804-2_38&rft_dat=%3Cpascalfrancis_sprin%3E15618329%3C/pascalfrancis_sprin%3E%3Curl%3E%3C/url%3E&rft.eisbn=354039804X&rft.eisbn_list=9783540398042&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true