Guided Attention Inference Network

Bibliographic Details

Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020-12, Vol. 42 (12), pp. 2996-3010
Authors: Li, Kunpeng; Wu, Ziyan; Peng, Kuan-Chuan; Ernst, Jan; Fu, Yun
Format: Article
Language: English
Description

With only coarse labels, weakly supervised learning typically uses top-down attention maps, generated by back-propagating gradients, as priors for tasks such as object localization and semantic segmentation. While these attention maps are intuitive and informative explanations of a deep neural network, there is no effective mechanism to manipulate the network's attention during the learning process. In this paper, we address three shortcomings of previous approaches to modeling such attention maps in one common framework. First, we make attention maps a natural and explicit component of the training pipeline, so that they are trainable end to end. Second, we provide self-guidance directly on these maps by drawing supervision from the network itself to improve them towards specific target tasks. Lastly, we propose a design that seamlessly bridges the gap between weak supervision and extra supervision when the latter is available. Despite its simplicity, experiments on the semantic segmentation task demonstrate the effectiveness of our method. The proposed framework thus not only explains the focus of the learner but also feeds direct guidance back towards specific tasks. Under mild assumptions, our method can also be understood as a plug-in for existing convolutional neural networks that improves their generalization performance.
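To make the self-guidance idea in the abstract concrete, below is a minimal PyTorch sketch of one plausible reading: a Grad-CAM-style attention map is computed inside the training graph (so it is end-to-end trainable), used to erase the attended image region, and the network is penalized if the erased image still scores highly for the target class. The class name `GuidedAttention`, the VGG-16 backbone, and the hyperparameter values (`alpha`, `sigma`, `omega`) are illustrative assumptions, not the authors' reference implementation.

```python
# Sketch of self-guided attention training, assuming a Grad-CAM-style map
# and an "erase and re-score" mining loss; names and values are illustrative.
import torch
import torch.nn.functional as F
from torch import nn
from torchvision import models

class GuidedAttention(nn.Module):
    """Computes a trainable attention map in the forward pass, erases the
    attended region, and minimizes the erased image's target-class score."""

    def __init__(self, num_classes: int, alpha: float = 1.0,
                 sigma: float = 0.5, omega: float = 10.0):
        super().__init__()
        backbone = models.vgg16(weights=None)       # any CNN backbone works
        self.features = backbone.features           # conv feature extractor
        self.classifier = nn.Sequential(            # small classification head
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(512, num_classes))
        self.alpha, self.sigma, self.omega = alpha, sigma, omega

    def attention_map(self, images, class_idx):
        feats = self.features(images)                # (B, 512, h, w)
        scores = self.classifier(feats)              # (B, C) class scores
        target = scores.gather(1, class_idx[:, None]).sum()
        # Gradients of the target score w.r.t. the feature maps, kept in the
        # graph (create_graph=True) so the map itself is end-to-end trainable.
        grads = torch.autograd.grad(target, feats, create_graph=True)[0]
        weights = grads.mean(dim=(2, 3), keepdim=True)   # GAP of gradients
        cam = F.relu((weights * feats).sum(1, keepdim=True))
        cam = F.interpolate(cam, size=images.shape[2:], mode="bilinear",
                            align_corners=False)
        cam = cam / (cam.amax(dim=(2, 3), keepdim=True) + 1e-8)  # scale to [0,1]
        return scores, cam

    def forward(self, images, labels):
        scores, cam = self.attention_map(images, labels)
        loss_cl = F.cross_entropy(scores, labels)
        # Soft threshold T(A) = sigmoid(omega * (A - sigma)), then erase the
        # attended region: I* = I - T(A) * I.
        mask = torch.sigmoid(self.omega * (cam - self.sigma))
        erased = images - images * mask
        # Attention mining: the erased image should hold no more evidence for
        # the target class, so its (sigmoid) class score is minimized directly.
        erased_scores = self.classifier(self.features(erased))
        loss_am = torch.sigmoid(
            erased_scores.gather(1, labels[:, None])).mean()
        return loss_cl + self.alpha * loss_am

model = GuidedAttention(num_classes=20)
images = torch.randn(2, 3, 224, 224)
labels = torch.tensor([3, 7])
loss = model(images, labels)
loss.backward()
```

The abstract's bridge to extra supervision would then be, for example, one additional pixel-wise loss between `cam` and a ground-truth mask on the images where such masks exist; the weakly labeled remainder trains through the two terms above unchanged.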

DOI: 10.1109/TPAMI.2019.2921543
ISSN: 0162-8828
EISSN: 1939-3539, 2160-9292
PMID: 31180839
Source: IEEE Electronic Library (IEL)
Subjects:
Artificial neural networks
Back propagation
biased data
Convolutional neural network
Convolutional neural networks
Image segmentation
Machine learning
network attention
Neural networks
Semantic segmentation
Semantics
Supervised learning
Training data
Visualization
weakly supervised learning