Guided Attention Inference Network
With only coarse labels, weakly supervised learning typically uses top-down attention maps generated by back-propagating gradients as priors for tasks such as object localization and semantic segmentation. While these attention maps are intuitive and informative explanations of deep neural networks, there is no effective mechanism to manipulate the network's attention during the learning process. In this paper, we address three shortcomings of previous approaches to modeling such attention maps in one common framework. First, we make attention maps a natural and explicit component of the training pipeline, so that they are end-to-end trainable. Second, we provide self-guidance directly on these maps, exploring supervision from the network itself to improve them towards specific target tasks. Third, we propose a design that seamlessly bridges the gap between weak and extra supervision when the latter is available. Despite its simplicity, our experiments on the semantic segmentation task demonstrate the effectiveness of the method. In addition, the proposed framework not only explains the focus of the learner but also feeds back direct guidance towards specific tasks. Under mild assumptions, our method can also be understood as a plug-in to existing convolutional neural networks that improves their generalization performance.
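As a rough, hypothetical sketch of the self-guidance mechanism the abstract describes (assumptions, not the authors' released code: the ResNet-18 backbone, the `omega` and `sigma` masking parameters, and the single-label cross-entropy loss are all illustrative choices), the network's own Grad-CAM-style attention map can be kept differentiable, used to soft-mask the input, and the classifier penalized for any class evidence that survives the masking:

```python
# Minimal sketch of trainable attention with self-guidance, assuming a
# PyTorch setup; this illustrates the idea in the abstract, not the
# authors' released implementation.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18  # illustrative backbone choice

model = resnet18(num_classes=20)
features = {}

# Keep the last conv feature map in the graph so the attention map stays
# end-to-end trainable, as the abstract requires.
def hook(module, inputs, output):
    features["map"] = output

model.layer4.register_forward_hook(hook)

def attention_map(images, labels):
    """Differentiable Grad-CAM-style attention for the ground-truth class."""
    scores = model(images)
    class_scores = scores.gather(1, labels[:, None]).sum()
    # create_graph=True keeps the gradient itself differentiable, so a loss
    # defined on the attention map can back-propagate into the network.
    grads = torch.autograd.grad(class_scores, features["map"],
                                create_graph=True)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)          # GAP over space
    cam = F.relu((weights * features["map"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=images.shape[2:], mode="bilinear",
                        align_corners=False)
    return scores, cam / (cam.amax(dim=(2, 3), keepdim=True) + 1e-8)

def self_guidance_loss(images, labels, omega=10.0, sigma=0.5):
    """Classification loss plus an attention-mining term (omega/sigma assumed)."""
    scores, cam = attention_map(images, labels)
    cls_loss = F.cross_entropy(scores, labels)
    # Soft-mask out the attended regions: a well-focused attention map should
    # leave little class evidence behind, so the masked class score is driven
    # down by the second term.
    mask = torch.sigmoid(omega * (cam - sigma))
    masked_scores = model(images - images * mask)
    am_loss = torch.sigmoid(masked_scores).gather(1, labels[:, None]).mean()
    return cls_loss + am_loss
```

When extra pixel-level supervision is available, a third term comparing `cam` with the ground-truth mask would play the role of the weak-to-full supervision bridge the abstract mentions.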
Saved in:

Published in: | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020-12, Vol. 42 (12), p. 2996-3010 |
---|---|
Main Authors: | Li, Kunpeng; Wu, Ziyan; Peng, Kuan-Chuan; Ernst, Jan; Fu, Yun |
Format: | Article |
Language: | eng |
Subjects: | weakly supervised learning; network attention; semantic segmentation; convolutional neural networks |
Online Access: | Order full text |
field | value |
---|---|
container_end_page | 3010 |
container_issue | 12 |
container_start_page | 2996 |
container_title | IEEE transactions on pattern analysis and machine intelligence |
container_volume | 42 |
creator | Li, Kunpeng; Wu, Ziyan; Peng, Kuan-Chuan; Ernst, Jan; Fu, Yun |
description | With only coarse labels, weakly supervised learning typically uses top-down attention maps generated by back-propagating gradients as priors for tasks such as object localization and semantic segmentation. While these attention maps are intuitive and informative explanations of deep neural networks, there is no effective mechanism to manipulate the network's attention during the learning process. In this paper, we address three shortcomings of previous approaches to modeling such attention maps in one common framework. First, we make attention maps a natural and explicit component of the training pipeline, so that they are end-to-end trainable. Second, we provide self-guidance directly on these maps, exploring supervision from the network itself to improve them towards specific target tasks. Third, we propose a design that seamlessly bridges the gap between weak and extra supervision when the latter is available. Despite its simplicity, our experiments on the semantic segmentation task demonstrate the effectiveness of the method. In addition, the proposed framework not only explains the focus of the learner but also feeds back direct guidance towards specific tasks. Under mild assumptions, our method can also be understood as a plug-in to existing convolutional neural networks that improves their generalization performance. |
doi_str_mv | 10.1109/TPAMI.2019.2921543 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0162-8828 |
ispartof | IEEE transactions on pattern analysis and machine intelligence, 2020-12, Vol.42 (12), p.2996-3010 |
issn | 0162-8828; 1939-3539; 2160-9292 |
language | eng |
recordid | cdi_proquest_journals_2457972755 |
source | IEEE Electronic Library (IEL) |
subjects | Artificial neural networks; Back propagation; biased data; Convolutional neural network; Convolutional neural networks; Image segmentation; Machine learning; network attention; Neural networks; Semantic segmentation; Semantics; Supervised learning; Training data; Visualization; weakly supervised learning |
title | Guided Attention Inference Network |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-03T15%3A47%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Guided%20Attention%20Inference%20Network&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Li,%20Kunpeng&rft.date=2020-12-01&rft.volume=42&rft.issue=12&rft.spage=2996&rft.epage=3010&rft.pages=2996-3010&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2019.2921543&rft_dat=%3Cproquest_RIE%3E2457972755%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2457972755&rft_id=info:pmid/31180839&rft_ieee_id=8733010&rfr_iscdi=true |