CNN Fixations: An Unraveling Approach to Visualize the Discriminative Image Regions

Deep convolutional neural networks (CNNs) have revolutionized computer vision research and have seen unprecedented adoption for multiple tasks, such as classification, detection, and caption generation. However, they offer little transparency into their inner workings and are often treated as black boxes that deliver excellent performance. In this paper, we aim to alleviate this opaqueness of CNNs by providing visual explanations for the network's predictions. Our approach can analyze a variety of CNN-based models trained for computer vision applications, such as object recognition and caption generation. Unlike existing methods, we achieve this by unraveling the forward pass operation. The proposed method exploits feature dependencies across the layer hierarchy and uncovers the discriminative image locations that guide the network's predictions. We name these locations CNN fixations, loosely analogous to human eye fixations. Our approach is a generic method that requires no architectural changes, additional training, or gradient computation to compute the important image locations (CNN fixations). We demonstrate through a variety of applications that our approach is able to localize the discriminative image locations across different network architectures, diverse vision tasks, and data modalities.
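
The abstract sketches the mechanism: starting from the predicted label, the forward pass is unraveled layer by layer, at each step keeping only the units that contributed most strongly to the units already selected in the layer above, until the trace reaches image locations. Purely to illustrate that backtracking principle, below is a minimal NumPy sketch for a stack of fully connected ReLU layers. It is not the authors' implementation (the paper also traces through convolutional and pooling layers to reach pixel coordinates); the function trace_fixations, the top-k selection rule, and the toy network are assumptions made for this sketch.

    import numpy as np

    def trace_fixations(weights, layer_inputs, start_units, k=2):
        # weights[l]      : weight matrix of layer l, shape (n_out, n_in)
        # layer_inputs[l] : activations fed into layer l, shape (n_in,)
        # start_units     : output-unit indices to explain, e.g. the
        #                   predicted-class neuron of the final layer
        # k               : number of top contributors kept per unit
        units = set(start_units)
        # Walk from the last layer back toward the input.
        for W, a in zip(reversed(weights), reversed(layer_inputs)):
            prev = set()
            for i in units:
                # The forward-pass contribution of input unit j to
                # output unit i is W[i, j] * a[j]; keep only the
                # strongest positive contributors.
                contrib = W[i] * a
                for j in np.argsort(contrib)[-k:]:
                    if contrib[j] > 0:
                        prev.add(int(j))
            units = prev
        return units  # indices of the most discriminative input units

    # Toy demo: a random 8 -> 6 -> 4 -> 3 ReLU network.
    rng = np.random.default_rng(0)
    weights = [rng.standard_normal((6, 8)),
               rng.standard_normal((4, 6)),
               rng.standard_normal((3, 4))]

    x = rng.random(8)
    layer_inputs = []
    for W in weights:
        layer_inputs.append(x)
        x = np.maximum(W @ x, 0.0)

    predicted = int(np.argmax(x))
    print("fixated input units:",
          sorted(trace_fixations(weights, layer_inputs, [predicted])))

Roughly speaking, the same top-contributor rule applied within each selected neuron's receptive field is what would let such a trace descend through convolutional layers and land on pixel locations rather than feature indices, consistent with the gradient-free, forward-pass-only character described in the abstract.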

Bibliographic details
Published in: IEEE transactions on image processing, 2019-05, Vol. 28 (5), p. 2116-2125
Main authors: Mopuri, Konda Reddy; Garg, Utsav; Venkatesh Babu, R.
Format: Article
Language: English
Subjects: Artificial neural networks; Black boxes; CNN visualization; Computer architecture; Computer vision; Convolution; Explainable AI; label localization; Network architecture; Neurons; Object recognition; Task analysis; Training; visual explanations; Visualization; weakly supervised localization
Online access: Order full text
Publisher: IEEE (United States)
Source: IEEE Electronic Library (IEL)
DOI: 10.1109/TIP.2018.2881920
ISSN: 1057-7149
EISSN: 1941-0042
PMID: 30452367
CODEN: IIPRE4