Guided Attention Inference Network
With only coarse labels, weakly supervised learning typically uses top-down attention maps generated by back-propagating gradients as priors for tasks such as object localization and semantic segmentation. While these attention maps are intuitive and informative explanations of deep neural networks, there is no effective mechanism to manipulate the network's attention during the learning process. In this paper, we address three shortcomings of previous approaches to modeling such attention maps in one common framework. First, we make attention maps a natural and explicit component of the training pipeline, so that they are end-to-end trainable. Second, we provide self-guidance directly on these maps, exploring supervision from the network itself to improve them towards specific target tasks. Third, we propose a design that seamlessly bridges the gap between weak and extra supervision when the latter is available. Despite its simplicity, our experiments on the semantic segmentation task demonstrate the effectiveness of the method. In addition, the proposed framework not only explains the focus of the learner but also feeds back direct guidance towards specific tasks. Under mild assumptions, our method can also be understood as a plug-in to existing convolutional neural networks that improves their generalization performance.
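As a rough, hypothetical sketch of the self-guidance mechanism the abstract describes (assumptions, not the authors' released code: the ResNet-18 backbone, the `omega` and `sigma` masking parameters, and the single-label cross-entropy loss are all illustrative choices), the network's own Grad-CAM-style attention map can be kept differentiable, used to soft-mask the input, and the classifier penalized for any class evidence that survives the masking:

```python
# Minimal sketch of trainable attention with self-guidance, assuming a
# PyTorch setup; this illustrates the idea in the abstract, not the
# authors' released implementation.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18  # illustrative backbone choice

model = resnet18(num_classes=20)
features = {}

# Keep the last conv feature map in the graph so the attention map stays
# end-to-end trainable, as the abstract requires.
def hook(module, inputs, output):
    features["map"] = output

model.layer4.register_forward_hook(hook)

def attention_map(images, labels):
    """Differentiable Grad-CAM-style attention for the ground-truth class."""
    scores = model(images)
    class_scores = scores.gather(1, labels[:, None]).sum()
    # create_graph=True keeps the gradient itself differentiable, so a loss
    # defined on the attention map can back-propagate into the network.
    grads = torch.autograd.grad(class_scores, features["map"],
                                create_graph=True)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)          # GAP over space
    cam = F.relu((weights * features["map"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=images.shape[2:], mode="bilinear",
                        align_corners=False)
    return scores, cam / (cam.amax(dim=(2, 3), keepdim=True) + 1e-8)

def self_guidance_loss(images, labels, omega=10.0, sigma=0.5):
    """Classification loss plus an attention-mining term (omega/sigma assumed)."""
    scores, cam = attention_map(images, labels)
    cls_loss = F.cross_entropy(scores, labels)
    # Soft-mask out the attended regions: a well-focused attention map should
    # leave little class evidence behind, so the masked class score is driven
    # down by the second term.
    mask = torch.sigmoid(omega * (cam - sigma))
    masked_scores = model(images - images * mask)
    am_loss = torch.sigmoid(masked_scores).gather(1, labels[:, None]).mean()
    return cls_loss + am_loss
```

When extra pixel-level supervision is available, a third term comparing `cam` with the ground-truth mask would play the role of the weak-to-full supervision bridge the abstract mentions.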
Saved in:

Published in: | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020-12, Vol. 42 (12), p. 2996-3010 |
---|---|
Main Authors: | Li, Kunpeng; Wu, Ziyan; Peng, Kuan-Chuan; Ernst, Jan; Fu, Yun |
Format: | Article |
Language: | eng |
Subjects: | weakly supervised learning; network attention; semantic segmentation; convolutional neural networks |
Online Access: | Order full text |
field | value |
---|---|
container_end_page | 3010 |
container_issue | 12 |
container_start_page | 2996 |
container_title | IEEE transactions on pattern analysis and machine intelligence |
container_volume | 42 |
creator | Li, Kunpeng; Wu, Ziyan; Peng, Kuan-Chuan; Ernst, Jan; Fu, Yun |
description | With only coarse labels, weakly supervised learning typically uses top-down attention maps generated by back-propagating gradients as priors for tasks such as object localization and semantic segmentation. While these attention maps are intuitive and informative explanations of deep neural networks, there is no effective mechanism to manipulate the network's attention during the learning process. In this paper, we address three shortcomings of previous approaches to modeling such attention maps in one common framework. First, we make attention maps a natural and explicit component of the training pipeline, so that they are end-to-end trainable. Second, we provide self-guidance directly on these maps, exploring supervision from the network itself to improve them towards specific target tasks. Third, we propose a design that seamlessly bridges the gap between weak and extra supervision when the latter is available. Despite its simplicity, our experiments on the semantic segmentation task demonstrate the effectiveness of the method. In addition, the proposed framework not only explains the focus of the learner but also feeds back direct guidance towards specific tasks. Under mild assumptions, our method can also be understood as a plug-in to existing convolutional neural networks that improves their generalization performance. |
doi_str_mv | 10.1109/TPAMI.2019.2921543 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0162-8828 |
ispartof | IEEE transactions on pattern analysis and machine intelligence, 2020-12, Vol.42 (12), p.2996-3010 |
issn | 0162-8828; 1939-3539; 2160-9292 |
language | eng |
recordid | cdi_proquest_journals_2457972755 |
source | IEEE Electronic Library (IEL) |
subjects | Artificial neural networks; Back propagation; biased data; Convolutional neural network; Convolutional neural networks; Image segmentation; Machine learning; network attention; Neural networks; Semantic segmentation; Semantics; Supervised learning; Training data; Visualization; weakly supervised learning |
title | Guided Attention Inference Network |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-03T15%3A47%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Guided%20Attention%20Inference%20Network&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Li,%20Kunpeng&rft.date=2020-12-01&rft.volume=42&rft.issue=12&rft.spage=2996&rft.epage=3010&rft.pages=2996-3010&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2019.2921543&rft_dat=%3Cproquest_RIE%3E2457972755%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2457972755&rft_id=info:pmid/31180839&rft_ieee_id=8733010&rfr_iscdi=true |