CNN Fixations: An Unraveling Approach to Visualize the Discriminative Image Regions
Deep convolutional neural networks (CNNs) have revolutionized computer vision research and have seen unprecedented adoption for multiple tasks, such as classification, detection, and caption generation. However, they offer little transparency into their inner workings and are often treated as black boxes that deliver excellent performance.
Saved in:
Published in: | IEEE transactions on image processing 2019-05, Vol.28 (5), p.2116-2125 |
---|---|
Main Authors: | Mopuri, Konda Reddy ; Garg, Utsav ; Venkatesh Babu, R. |
Format: | Article |
Language: | eng |
Subjects: | Artificial neural networks ; Black boxes ; CNN visualization ; Computer architecture ; Computer vision ; Convolution ; Explainable AI ; label localization ; Network architecture ; Neurons ; Object recognition ; Task analysis ; Training ; visual explanations ; Visualization ; weakly supervised localization |
Online Access: | Order full text |
container_end_page | 2125 |
---|---|
container_issue | 5 |
container_start_page | 2116 |
container_title | IEEE transactions on image processing |
container_volume | 28 |
creator | Mopuri, Konda Reddy ; Garg, Utsav ; Venkatesh Babu, R. |
description | Deep convolutional neural networks (CNNs) have revolutionized computer vision research and have seen unprecedented adoption for multiple tasks, such as classification, detection, and caption generation. However, they offer little transparency into their inner workings and are often treated as black boxes that deliver excellent performance. In this paper, we aim to alleviate this opaqueness of CNNs by providing visual explanations for the network's predictions. Our approach can analyze a variety of CNN-based models trained for computer vision applications, such as object recognition and caption generation. Unlike the existing methods, we achieve this via unraveling the forward pass operation. The proposed method exploits feature dependencies across the layer hierarchy and uncovers the discriminative image locations that guide the network's predictions. We name these locations CNN fixations, loosely analogous to human eye fixations. Our approach is a generic method that requires no architectural changes, additional training, or gradient computation, and computes the important image locations (CNN fixations). We demonstrate through a variety of applications that our approach is able to localize the discriminative image locations across different network architectures, diverse vision tasks, and data modalities. |
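The "unraveling" of the forward pass described above — starting from the predicted label and, layer by layer, keeping only the units that contributed positive evidence, until input locations are reached — can be sketched roughly as follows. This is a toy NumPy illustration under assumed random weights on a hypothetical two-layer fully connected net, not the authors' implementation; all names (`backtrack`, `W1`, `W2`) are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy fully connected net: input -> ReLU hidden -> class logits.
# Weights are random stand-ins; in practice they come from a trained model.
W1 = rng.normal(size=(6, 10))   # hidden x input
W2 = rng.normal(size=(3, 6))    # classes x hidden

x = rng.random(10)              # "image" flattened to a vector
h = np.maximum(W1 @ x, 0)       # hidden activations
logits = W2 @ h
pred = int(np.argmax(logits))   # predicted class neuron

def backtrack(weights, acts, units):
    """For each selected unit in the upper layer, keep the lower-layer
    units whose contribution (weight * activation) to it is positive."""
    keep = set()
    for u in units:
        contrib = weights[u] * acts
        keep.update(np.flatnonzero(contrib > 0).tolist())
    return sorted(keep)

hidden_fix = backtrack(W2, h, [pred])     # discriminative hidden units
input_fix = backtrack(W1, x, hidden_fix)  # discriminative input locations
print(pred, hidden_fix, input_fix)
```

In a real CNN the same idea would be applied per layer type (e.g. tracing a convolutional unit back to the strongest contributors inside its receptive field), which is what lets the method work without gradients or retraining.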
doi_str_mv | 10.1109/TIP.2018.2881920 |
format | Article |
eissn | 1941-0042 |
pmid | 30452367 |
coden | IIPRE4 |
publisher | United States: IEEE |
orcidid | 0000-0001-8894-7212 ; 0000-0002-1926-1804 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1057-7149 |
ispartof | IEEE transactions on image processing, 2019-05, Vol.28 (5), p.2116-2125 |
issn | 1057-7149 1941-0042 |
language | eng |
recordid | cdi_ieee_primary_8537979 |
source | IEEE Electronic Library (IEL) |
subjects | Artificial neural networks ; Black boxes ; CNN visualization ; Computer architecture ; Computer vision ; Convolution ; Explainable AI ; label localization ; Network architecture ; Neurons ; Object recognition ; Task analysis ; Training ; visual explanations ; Visualization ; weakly supervised localization |
title | CNN Fixations: An Unraveling Approach to Visualize the Discriminative Image Regions |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T03%3A33%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=CNN%20Fixations:%20An%20Unraveling%20Approach%20to%20Visualize%20the%20Discriminative%20Image%20Regions&rft.jtitle=IEEE%20transactions%20on%20image%20processing&rft.au=Mopuri,%20Konda%20Reddy&rft.date=2019-05-01&rft.volume=28&rft.issue=5&rft.spage=2116&rft.epage=2125&rft.pages=2116-2125&rft.issn=1057-7149&rft.eissn=1941-0042&rft.coden=IIPRE4&rft_id=info:doi/10.1109/TIP.2018.2881920&rft_dat=%3Cproquest_RIE%3E2136066102%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2169439245&rft_id=info:pmid/30452367&rft_ieee_id=8537979&rfr_iscdi=true |