ABSNet: Aesthetics-Based Saliency Network Using Multi-Task Convolutional Network

Visual saliency, a visual attention mechanism for analyzing visual scenes, has been shown to correlate closely with semantic information such as faces. Although many semantic-information-guided saliency models have been proposed, to the best of our knowledge, no semantic information from the affective domain has been employed for saliency detection. Aesthetics, the affective perceptual quality that integrates factors such as scene composition and contrast, can benefit visual attention, which depends heavily on these same factors. In this letter, we propose an end-to-end multi-task framework called aesthetics-based saliency network (ABSNet). We use three common shared backbones and design two distinct branches, one for each task. Mean squared error (MSE) loss and Earth Mover's Distance (EMD) loss are jointly adopted to alternately train the shared network and the individual branches for the different tasks, helping the model extract more effective features for visual perception. Moreover, our model is resolution-friendly and can predict saliency for images of arbitrary size. Experiments show that the proposed multi-task method is superior to its single-task version and outperforms state-of-the-art saliency methods.
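The abstract describes the training scheme only at a high level. Below is a minimal, hypothetical PyTorch sketch of such a two-branch multi-task setup, assuming a VGG16 feature extractor as the shared backbone, a fully convolutional saliency head (which keeps the model resolution-friendly), and a NIMA-style EMD loss over discretized aesthetic score distributions. All names, shapes, and design choices are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch, not the authors' code: a shared backbone with two task
# branches, trained with MSE (saliency) and EMD (aesthetics) losses.
import torch
import torch.nn as nn
import torchvision.models as models

class ABSNetSketch(nn.Module):
    def __init__(self, num_bins: int = 10):
        super().__init__()
        # Shared backbone; the letter reports using three common backbones,
        # VGG16 features are just one plausible stand-in here.
        self.backbone = models.vgg16(weights=None).features
        # Saliency branch: fully convolutional, so arbitrary input sizes work.
        self.saliency_head = nn.Sequential(
            nn.Conv2d(512, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 1, kernel_size=1),
            nn.Sigmoid(),
        )
        # Aesthetics branch: global pooling + distribution over score bins.
        self.aesthetic_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(512, num_bins),
            nn.Softmax(dim=1),
        )

    def forward(self, x):
        feats = self.backbone(x)  # (B, 512, H/32, W/32)
        return self.saliency_head(feats), self.aesthetic_head(feats)

def emd_loss(pred, target, r: int = 2):
    # Earth Mover's Distance between two discrete score distributions,
    # computed NIMA-style from their cumulative distribution functions.
    cdf_diff = torch.cumsum(pred, dim=1) - torch.cumsum(target, dim=1)
    return cdf_diff.abs().pow(r).mean(dim=1).pow(1.0 / r).mean()

mse_loss = nn.MSELoss()
# Alternating training, as the abstract describes: step on a saliency batch
# with MSE against ground-truth maps, then on an aesthetics batch with EMD
# against human score histograms; the backbone is shared by both steps.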

Bibliographic Details
Published in: IEEE signal processing letters, 2020, Vol. 27, pp. 2014-2018
Main authors: Liu, Jing; Lv, Jincheng; Yuan, Min; Zhang, Jing; Su, Yuting
Format: Article
Language: English
Subjects: Aesthetics; Aesthetics assessment; Feature extraction; multi-task learning; Prediction algorithms; Salience; Saliency detection; Semantics; Signal processing algorithms; Task analysis; Visual aspects; Visual perception; visual saliency detection; Visualization
Online access: Order full text
DOI: 10.1109/LSP.2020.3035065
Publisher: New York: IEEE
CODEN: ISPLEM
ISSN: 1070-9908
EISSN: 1558-2361
Source: IEEE Electronic Library (IEL)
URL: https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T09%3A53%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=ABSNet:%20Aesthetics-Based%20Saliency%20Network%20Using%20Multi-Task%20Convolutional%20Network&rft.jtitle=IEEE%20signal%20processing%20letters&rft.au=Liu,%20Jing&rft.date=2020&rft.volume=27&rft.spage=2014&rft.epage=2018&rft.pages=2014-2018&rft.issn=1070-9908&rft.eissn=1558-2361&rft.coden=ISPLEM&rft_id=info:doi/10.1109/LSP.2020.3035065&rft_dat=%3Cproquest_RIE%3E2465446141%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2465446141&rft_id=info:pmid/&rft_ieee_id=9246211&rfr_iscdi=true