A Probabilistic Active Learning Algorithm Based on Fisher Information Ratio

The task of labeling samples is demanding and expensive. Active learning aims to generate the smallest possible training data set that results in a classifier with high performance in the test phase. It usually consists of two steps of selecting a set of queries and requesting their labels. Among th...

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence 2018-08, Vol.40 (8), p.2023-2029
Main authors: Sourati, Jamshid; Akcakaya, Murat; Erdogmus, Deniz; Leen, Todd K.; Dy, Jennifer G.
Format: Article
Language: English
container_end_page 2029
container_issue 8
container_start_page 2023
container_title IEEE transactions on pattern analysis and machine intelligence
container_volume 40
creator Sourati, Jamshid
Akcakaya, Murat
Erdogmus, Deniz
Leen, Todd K.
Dy, Jennifer G.
description The task of labeling samples is demanding and expensive. Active learning aims to generate the smallest possible training data set that results in a classifier with high performance in the test phase. It usually consists of two steps of selecting a set of queries and requesting their labels. Among the suggested objectives to score the query sets, information theoretic measures have become very popular. Yet among them, those based on Fisher information (FI) have the advantage of considering the diversity among the queries and tractable computations. In this work, we provide a practical algorithm based on Fisher information ratio to obtain query distribution for a general framework where, in contrast to the previous FI-based querying methods, we make no assumptions over the test distribution. The empirical results on synthetic and real-world data sets indicate that this algorithm gives competitive results.
doi_str_mv 10.1109/TPAMI.2017.2743707
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TPAMI_2017_2743707</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8016395</ieee_id><sourcerecordid>1988258157</sourcerecordid><originalsourceid>FETCH-LOGICAL-c395t-ed9da20dd1084595bb0bf3379771a127b713a7176e56441e44ba77cca3697f913</originalsourceid><addsrcrecordid>eNpdkUtLAzEQx4MoWh9fQEEWvHjZmsmjkxxX8VGsKKLnkN3Naso-NNkKfnu3tvbgaSDzmz8zvxByDHQMQPXFy1P2MB0zCjhmKDhS3CIj0FynXHK9TUYUJixViqk9sh_jnFIQkvJdsseUkgqVGJH7LHkKXW5zX_vY-yLJit5_uWTmbGh9-5Zk9VsXfP_eJJc2ujLp2uTGx3cXkmlbdaGxvR-enpflkOxUto7uaF0PyOvN9cvVXTp7vJ1eZbO04Fr2qSt1aRktS6BKSC3znOYV56gRwQLDHIFbBJw4ORECnBC5RSwKyycaKw38gJyvcj9C97lwsTeNj4Wra9u6bhEN6OFmqUDigJ79Q-fdIrTDdoYBCqE4Y8tAtqKK0MUYXGU-gm9s-DZAzVK1-VVtlqrNWvUwdLqOXuSNKzcjf24H4GQFeOfcpq2GTxk08B-XTICE</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2174483221</pqid></control><display><type>article</type><title>A Probabilistic Active Learning Algorithm Based on Fisher Information Ratio</title><source>IEEE Electronic Library (IEL)</source><creator>Sourati, Jamshid ; Akcakaya, Murat ; Erdogmus, Deniz ; Leen, Todd K. ; Dy, Jennifer G.</creator><creatorcontrib>Sourati, Jamshid ; Akcakaya, Murat ; Erdogmus, Deniz ; Leen, Todd K. ; Dy, Jennifer G.</creatorcontrib><description>The task of labeling samples is demanding and expensive. Active learning aims to generate the smallest possible training data set that results in a classifier with high performance in the test phase. It usually consists of two steps of selecting a set of queries and requesting their labels. Among the suggested objectives to score the query sets, information theoretic measures have become very popular. Yet among them, those based on Fisher information (FI) have the advantage of considering the diversity among the queries and tractable computations. 
In this work, we provide a practical algorithm based on Fisher information ratio to obtain query distribution for a general framework where, in contrast to the previous FI-based querying methods, we make no assumptions over the test distribution. The empirical results on synthetic and real-world data sets indicate that this algorithm gives competitive results.</description><identifier>ISSN: 0162-8828</identifier><identifier>EISSN: 1939-3539</identifier><identifier>EISSN: 2160-9292</identifier><identifier>DOI: 10.1109/TPAMI.2017.2743707</identifier><identifier>PMID: 28858784</identifier><identifier>CODEN: ITPIDJ</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>Active learning ; Algorithms ; Approximation algorithms ; Computational complexity ; Computer Simulation ; Databases, Factual - statistics &amp; numerical data ; discriminative classification ; Finite impulse response filters ; Fisher information ; Humans ; Information theory ; Labels ; Machine learning ; Models, Statistical ; Monte Carlo Method ; Optimization ; Probabilistic logic ; probabilistic querying ; Proposals ; Queries ; Training</subject><ispartof>IEEE transactions on pattern analysis and machine intelligence, 2018-08, Vol.40 (8), p.2023-2029</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2018</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c395t-ed9da20dd1084595bb0bf3379771a127b713a7176e56441e44ba77cca3697f913</citedby><cites>FETCH-LOGICAL-c395t-ed9da20dd1084595bb0bf3379771a127b713a7176e56441e44ba77cca3697f913</cites><orcidid>0000-0003-1853-7271</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8016395$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/8016395$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/28858784$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Sourati, Jamshid</creatorcontrib><creatorcontrib>Akcakaya, Murat</creatorcontrib><creatorcontrib>Erdogmus, Deniz</creatorcontrib><creatorcontrib>Leen, Todd K.</creatorcontrib><creatorcontrib>Dy, Jennifer G.</creatorcontrib><title>A Probabilistic Active Learning Algorithm Based on Fisher Information Ratio</title><title>IEEE transactions on pattern analysis and machine intelligence</title><addtitle>TPAMI</addtitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><description>The task of labeling samples is demanding and expensive. Active learning aims to generate the smallest possible training data set that results in a classifier with high performance in the test phase. It usually consists of two steps of selecting a set of queries and requesting their labels. Among the suggested objectives to score the query sets, information theoretic measures have become very popular. 
Yet among them, those based on Fisher information (FI) have the advantage of considering the diversity among the queries and tractable computations. In this work, we provide a practical algorithm based on Fisher information ratio to obtain query distribution for a general framework where, in contrast to the previous FI-based querying methods, we make no assumptions over the test distribution. The empirical results on synthetic and real-world data sets indicate that this algorithm gives competitive results.</description><subject>Active learning</subject><subject>Algorithms</subject><subject>Approximation algorithms</subject><subject>Computational complexity</subject><subject>Computer Simulation</subject><subject>Databases, Factual - statistics &amp; numerical data</subject><subject>discriminative classification</subject><subject>Finite impulse response filters</subject><subject>Fisher information</subject><subject>Humans</subject><subject>Information theory</subject><subject>Labels</subject><subject>Machine learning</subject><subject>Models, Statistical</subject><subject>Monte Carlo Method</subject><subject>Optimization</subject><subject>Probabilistic logic</subject><subject>probabilistic 
querying</subject><subject>Proposals</subject><subject>Queries</subject><subject>Training</subject><issn>0162-8828</issn><issn>1939-3539</issn><issn>2160-9292</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><sourceid>EIF</sourceid><recordid>eNpdkUtLAzEQx4MoWh9fQEEWvHjZmsmjkxxX8VGsKKLnkN3Naso-NNkKfnu3tvbgaSDzmz8zvxByDHQMQPXFy1P2MB0zCjhmKDhS3CIj0FynXHK9TUYUJixViqk9sh_jnFIQkvJdsseUkgqVGJH7LHkKXW5zX_vY-yLJit5_uWTmbGh9-5Zk9VsXfP_eJJc2ujLp2uTGx3cXkmlbdaGxvR-enpflkOxUto7uaF0PyOvN9cvVXTp7vJ1eZbO04Fr2qSt1aRktS6BKSC3znOYV56gRwQLDHIFbBJw4ORECnBC5RSwKyycaKw38gJyvcj9C97lwsTeNj4Wra9u6bhEN6OFmqUDigJ79Q-fdIrTDdoYBCqE4Y8tAtqKK0MUYXGU-gm9s-DZAzVK1-VVtlqrNWvUwdLqOXuSNKzcjf24H4GQFeOfcpq2GTxk08B-XTICE</recordid><startdate>20180801</startdate><enddate>20180801</enddate><creator>Sourati, Jamshid</creator><creator>Akcakaya, Murat</creator><creator>Erdogmus, Deniz</creator><creator>Leen, Todd K.</creator><creator>Dy, Jennifer G.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0003-1853-7271</orcidid></search><sort><creationdate>20180801</creationdate><title>A Probabilistic Active Learning Algorithm Based on Fisher Information Ratio</title><author>Sourati, Jamshid ; Akcakaya, Murat ; Erdogmus, Deniz ; Leen, Todd K. 
; Dy, Jennifer G.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c395t-ed9da20dd1084595bb0bf3379771a127b713a7176e56441e44ba77cca3697f913</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Active learning</topic><topic>Algorithms</topic><topic>Approximation algorithms</topic><topic>Computational complexity</topic><topic>Computer Simulation</topic><topic>Databases, Factual - statistics &amp; numerical data</topic><topic>discriminative classification</topic><topic>Finite impulse response filters</topic><topic>Fisher information</topic><topic>Humans</topic><topic>Information theory</topic><topic>Labels</topic><topic>Machine learning</topic><topic>Models, Statistical</topic><topic>Monte Carlo Method</topic><topic>Optimization</topic><topic>Probabilistic logic</topic><topic>probabilistic querying</topic><topic>Proposals</topic><topic>Queries</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sourati, Jamshid</creatorcontrib><creatorcontrib>Akcakaya, Murat</creatorcontrib><creatorcontrib>Erdogmus, Deniz</creatorcontrib><creatorcontrib>Leen, Todd K.</creatorcontrib><creatorcontrib>Dy, Jennifer G.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science 
Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Sourati, Jamshid</au><au>Akcakaya, Murat</au><au>Erdogmus, Deniz</au><au>Leen, Todd K.</au><au>Dy, Jennifer G.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Probabilistic Active Learning Algorithm Based on Fisher Information Ratio</atitle><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle><stitle>TPAMI</stitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><date>2018-08-01</date><risdate>2018</risdate><volume>40</volume><issue>8</issue><spage>2023</spage><epage>2029</epage><pages>2023-2029</pages><issn>0162-8828</issn><eissn>1939-3539</eissn><eissn>2160-9292</eissn><coden>ITPIDJ</coden><abstract>The task of labeling samples is demanding and expensive. Active learning aims to generate the smallest possible training data set that results in a classifier with high performance in the test phase. It usually consists of two steps of selecting a set of queries and requesting their labels. Among the suggested objectives to score the query sets, information theoretic measures have become very popular. Yet among them, those based on Fisher information (FI) have the advantage of considering the diversity among the queries and tractable computations. In this work, we provide a practical algorithm based on Fisher information ratio to obtain query distribution for a general framework where, in contrast to the previous FI-based querying methods, we make no assumptions over the test distribution. 
The empirical results on synthetic and real-world data sets indicate that this algorithm gives competitive results.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>28858784</pmid><doi>10.1109/TPAMI.2017.2743707</doi><tpages>7</tpages><orcidid>https://orcid.org/0000-0003-1853-7271</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0162-8828
ispartof IEEE transactions on pattern analysis and machine intelligence, 2018-08, Vol.40 (8), p.2023-2029
issn 0162-8828
1939-3539
2160-9292
language eng
recordid cdi_crossref_primary_10_1109_TPAMI_2017_2743707
source IEEE Electronic Library (IEL)
subjects Active learning
Algorithms
Approximation algorithms
Computational complexity
Computer Simulation
Databases, Factual - statistics & numerical data
discriminative classification
Finite impulse response filters
Fisher information
Humans
Information theory
Labels
Machine learning
Models, Statistical
Monte Carlo Method
Optimization
Probabilistic logic
probabilistic querying
Proposals
Queries
Training
title A Probabilistic Active Learning Algorithm Based on Fisher Information Ratio
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T10%3A05%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Probabilistic%20Active%20Learning%20Algorithm%20Based%20on%20Fisher%20Information%20Ratio&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Sourati,%20Jamshid&rft.date=2018-08-01&rft.volume=40&rft.issue=8&rft.spage=2023&rft.epage=2029&rft.pages=2023-2029&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2017.2743707&rft_dat=%3Cproquest_RIE%3E1988258157%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2174483221&rft_id=info:pmid/28858784&rft_ieee_id=8016395&rfr_iscdi=true