Document Classification Through Interactive Supervision of Document and Term Labels

Effective incorporation of human expertise, while exerting a low cognitive load, is a critical aspect of real-life text classification applications that is not adequately addressed by batch-supervised high-accuracy learners. Standard text classifiers are supervised in only one way: assigning labels...

Bibliographic details
Main authors: Godbole, Shantanu; Harpale, Abhay; Sarawagi, Sunita; Chakrabarti, Soumen
Format: Book chapter
Language: eng
Subjects:
Online access: Full text
container_end_page 196
container_issue
container_start_page 185
container_title
container_volume
creator Godbole, Shantanu
Harpale, Abhay
Sarawagi, Sunita
Chakrabarti, Soumen
description Effective incorporation of human expertise, while exerting a low cognitive load, is a critical aspect of real-life text classification applications that is not adequately addressed by batch-supervised high-accuracy learners. Standard text classifiers are supervised in only one way: assigning labels to whole documents. They are thus deprived of the enormous wisdom that humans carry about the significance of words and phrases in context. We present HIClass, an interactive and exploratory labeling package that actively collects user opinion on feature representations and choices, as well as whole-document labels, while minimizing redundancy in the input sought. Preliminary experience suggests that, starting with essentially an unlabeled corpus, very little cognitive labor suffices to set up a labeled collection on which standard classifiers perform well.
doi_str_mv 10.1007/978-3-540-30116-5_19
format Book Chapter
fullrecord <record><control><sourceid>pascalfrancis_sprin</sourceid><recordid>TN_cdi_pascalfrancis_primary_16177321</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>16177321</sourcerecordid><originalsourceid>FETCH-LOGICAL-p274t-5e4934d13bdc4863be93326459e21effdd1d781742d2c41d165490315b3fb953</originalsourceid><addsrcrecordid>eNo9kM1OwzAQhM2fRFX6Bhxy4Wjwep04PqJCoVIlDs3dcmKnDaRJZKeVeHvcFnUvK83MrjQfIY_AnoEx-aJkTpGmglFkABlNNagrMosyRvGkpddkAhkARRTq5uJxBJazWzKJKU6VFHhPZiF8sziQS8X4hKzf-mq_c92YzFsTQlM3lRmbvkuKre_3m22y7EbnTTU2B5es94PzhyYc_b5OLqems0nh_C5ZmdK14YHc1aYNbva_p6RYvBfzT7r6-ljOX1d04FKMNHVCobCApa1EnmHpFCLPRKocB1fX1oKVOUjBLa8E2NhTKIaQlliXKsUpeTq_HUyoTFt701VN0INvdsb_6ghESuQQc_ycC9HqNs7rsu9_ggamj4B1pKVRR176BFMfAeMfoRxobg</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>book_chapter</recordtype></control><display><type>book_chapter</type><title>Document Classification Through Interactive Supervision of Document and Term Labels</title><source>Springer Books</source><creator>Godbole, Shantanu ; Harpale, Abhay ; Sarawagi, Sunita ; Chakrabarti, Soumen</creator><contributor>Boulicaut, Jean-François ; Giannotti, Fosca ; Esposito, Floriana ; Pedreschi, Dino</contributor><creatorcontrib>Godbole, Shantanu ; Harpale, Abhay ; Sarawagi, Sunita ; Chakrabarti, Soumen ; Boulicaut, Jean-François ; Giannotti, Fosca ; Esposito, Floriana ; Pedreschi, Dino</creatorcontrib><description>Effective incorporation of human expertise, while exerting a low cognitive load, is a critical aspect of real-life text classification applications that is not adequately addressed by batch-supervised high-accuracy learners. Standard text classifiers are supervised in only one way: assigning labels to whole documents. They are thus deprived of the enormous wisdom that humans carry about the significance of words and phrases in context. 
We present HIClass, an interactive and exploratory labeling package that actively collects user opinion on feature representations and choices, as well as whole-document labels, while minimizing redundancy in the input sought. Preliminary experience suggests that, starting with essentially an unlabeled corpus, very little cognitive labor suffices to set up a labeled collection on which standard classifiers perform well.</description><identifier>ISSN: 0302-9743</identifier><identifier>ISBN: 9783540231080</identifier><identifier>ISBN: 3540231080</identifier><identifier>EISSN: 1611-3349</identifier><identifier>EISBN: 9783540301165</identifier><identifier>EISBN: 354030116X</identifier><identifier>DOI: 10.1007/978-3-540-30116-5_19</identifier><language>eng</language><publisher>Berlin, Heidelberg: Springer Berlin Heidelberg</publisher><subject>Active Learning ; Applied sciences ; Cognitive Load ; Computer science; control theory; systems ; Data processing. List processing. Character string processing ; Exact sciences and technology ; Label Document ; Linear Additive Model ; Memory organisation. 
Data processing ; Software ; Support Vector Machine</subject><ispartof>Knowledge Discovery in Databases: PKDD 2004, 2004, p.185-196</ispartof><rights>Springer-Verlag Berlin Heidelberg 2004</rights><rights>2004 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><relation>Lecture Notes in Computer Science</relation></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/978-3-540-30116-5_19$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/978-3-540-30116-5_19$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>309,310,779,780,784,789,790,793,4050,4051,27925,38255,41442,42511</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=16177321$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><contributor>Boulicaut, Jean-François</contributor><contributor>Giannotti, Fosca</contributor><contributor>Esposito, Floriana</contributor><contributor>Pedreschi, Dino</contributor><creatorcontrib>Godbole, Shantanu</creatorcontrib><creatorcontrib>Harpale, Abhay</creatorcontrib><creatorcontrib>Sarawagi, Sunita</creatorcontrib><creatorcontrib>Chakrabarti, Soumen</creatorcontrib><title>Document Classification Through Interactive Supervision of Document and Term Labels</title><title>Knowledge Discovery in Databases: PKDD 2004</title><description>Effective incorporation of human expertise, while exerting a low cognitive load, is a critical aspect of real-life text classification applications that is not adequately addressed by batch-supervised high-accuracy learners. Standard text classifiers are supervised in only one way: assigning labels to whole documents. 
They are thus deprived of the enormous wisdom that humans carry about the significance of words and phrases in context. We present HIClass, an interactive and exploratory labeling package that actively collects user opinion on feature representations and choices, as well as whole-document labels, while minimizing redundancy in the input sought. Preliminary experience suggests that, starting with essentially an unlabeled corpus, very little cognitive labor suffices to set up a labeled collection on which standard classifiers perform well.</description><subject>Active Learning</subject><subject>Applied sciences</subject><subject>Cognitive Load</subject><subject>Computer science; control theory; systems</subject><subject>Data processing. List processing. Character string processing</subject><subject>Exact sciences and technology</subject><subject>Label Document</subject><subject>Linear Additive Model</subject><subject>Memory organisation. Data processing</subject><subject>Software</subject><subject>Support Vector Machine</subject><issn>0302-9743</issn><issn>1611-3349</issn><isbn>9783540231080</isbn><isbn>3540231080</isbn><isbn>9783540301165</isbn><isbn>354030116X</isbn><fulltext>true</fulltext><rsrctype>book_chapter</rsrctype><creationdate>2004</creationdate><recordtype>book_chapter</recordtype><recordid>eNo9kM1OwzAQhM2fRFX6Bhxy4Wjwep04PqJCoVIlDs3dcmKnDaRJZKeVeHvcFnUvK83MrjQfIY_AnoEx-aJkTpGmglFkABlNNagrMosyRvGkpddkAhkARRTq5uJxBJazWzKJKU6VFHhPZiF8sziQS8X4hKzf-mq_c92YzFsTQlM3lRmbvkuKre_3m22y7EbnTTU2B5es94PzhyYc_b5OLqems0nh_C5ZmdK14YHc1aYNbva_p6RYvBfzT7r6-ljOX1d04FKMNHVCobCApa1EnmHpFCLPRKocB1fX1oKVOUjBLa8E2NhTKIaQlliXKsUpeTq_HUyoTFt701VN0INvdsb_6ghESuQQc_ycC9HqNs7rsu9_ggamj4B1pKVRR176BFMfAeMfoRxobg</recordid><startdate>2004</startdate><enddate>2004</enddate><creator>Godbole, Shantanu</creator><creator>Harpale, Abhay</creator><creator>Sarawagi, Sunita</creator><creator>Chakrabarti, Soumen</creator><general>Springer Berlin 
Heidelberg</general><general>Springer</general><scope>IQODW</scope></search><sort><creationdate>2004</creationdate><title>Document Classification Through Interactive Supervision of Document and Term Labels</title><author>Godbole, Shantanu ; Harpale, Abhay ; Sarawagi, Sunita ; Chakrabarti, Soumen</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p274t-5e4934d13bdc4863be93326459e21effdd1d781742d2c41d165490315b3fb953</frbrgroupid><rsrctype>book_chapters</rsrctype><prefilter>book_chapters</prefilter><language>eng</language><creationdate>2004</creationdate><topic>Active Learning</topic><topic>Applied sciences</topic><topic>Cognitive Load</topic><topic>Computer science; control theory; systems</topic><topic>Data processing. List processing. Character string processing</topic><topic>Exact sciences and technology</topic><topic>Label Document</topic><topic>Linear Additive Model</topic><topic>Memory organisation. Data processing</topic><topic>Software</topic><topic>Support Vector Machine</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Godbole, Shantanu</creatorcontrib><creatorcontrib>Harpale, Abhay</creatorcontrib><creatorcontrib>Sarawagi, Sunita</creatorcontrib><creatorcontrib>Chakrabarti, Soumen</creatorcontrib><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Godbole, Shantanu</au><au>Harpale, Abhay</au><au>Sarawagi, Sunita</au><au>Chakrabarti, Soumen</au><au>Boulicaut, Jean-François</au><au>Giannotti, Fosca</au><au>Esposito, Floriana</au><au>Pedreschi, Dino</au><format>book</format><genre>bookitem</genre><ristype>CHAP</ristype><atitle>Document Classification Through Interactive Supervision of Document and Term Labels</atitle><btitle>Knowledge Discovery in Databases: PKDD 2004</btitle><seriestitle>Lecture Notes in Computer 
Science</seriestitle><date>2004</date><risdate>2004</risdate><spage>185</spage><epage>196</epage><pages>185-196</pages><issn>0302-9743</issn><eissn>1611-3349</eissn><isbn>9783540231080</isbn><isbn>3540231080</isbn><eisbn>9783540301165</eisbn><eisbn>354030116X</eisbn><abstract>Effective incorporation of human expertise, while exerting a low cognitive load, is a critical aspect of real-life text classification applications that is not adequately addressed by batch-supervised high-accuracy learners. Standard text classifiers are supervised in only one way: assigning labels to whole documents. They are thus deprived of the enormous wisdom that humans carry about the significance of words and phrases in context. We present HIClass, an interactive and exploratory labeling package that actively collects user opinion on feature representations and choices, as well as whole-document labels, while minimizing redundancy in the input sought. Preliminary experience suggests that, starting with essentially an unlabeled corpus, very little cognitive labor suffices to set up a labeled collection on which standard classifiers perform well.</abstract><cop>Berlin, Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/978-3-540-30116-5_19</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0302-9743
ispartof Knowledge Discovery in Databases: PKDD 2004, 2004, p.185-196
issn 0302-9743
1611-3349
language eng
recordid cdi_pascalfrancis_primary_16177321
source Springer Books
subjects Active Learning
Applied sciences
Cognitive Load
Computer science; control theory; systems
Data processing. List processing. Character string processing
Exact sciences and technology
Label Document
Linear Additive Model
Memory organisation. Data processing
Software
Support Vector Machine
title Document Classification Through Interactive Supervision of Document and Term Labels
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T15%3A55%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_sprin&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=bookitem&rft.atitle=Document%20Classification%20Through%20Interactive%20Supervision%20of%20Document%20and%20Term%20Labels&rft.btitle=Knowledge%20Discovery%20in%20Databases:%20PKDD%202004&rft.au=Godbole,%20Shantanu&rft.date=2004&rft.spage=185&rft.epage=196&rft.pages=185-196&rft.issn=0302-9743&rft.eissn=1611-3349&rft.isbn=9783540231080&rft.isbn_list=3540231080&rft_id=info:doi/10.1007/978-3-540-30116-5_19&rft_dat=%3Cpascalfrancis_sprin%3E16177321%3C/pascalfrancis_sprin%3E%3Curl%3E%3C/url%3E&rft.eisbn=9783540301165&rft.eisbn_list=354030116X&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true