SG-FCN: A Motion and Memory-Based Deep Learning Model for Video Saliency Detection

Data-driven saliency detection has attracted strong interest as a result of applying convolutional neural networks to the detection of eye fixations. Although a number of image-based salient object and fixation detection models have been proposed, video fixation detection still requires more exploration. Different from image analysis, motion and temporal information is a crucial factor affecting human attention when viewing video sequences. Although existing models based on local contrast and low-level features have been extensively researched, they failed to simultaneously consider interframe motion and temporal information across neighboring video frames, leading to unsatisfactory performance when handling complex scenes. To this end, we propose a novel and efficient video eye fixation detection model to improve the saliency detection performance. By simulating the memory mechanism and visual attention mechanism of human beings when watching a video, we propose a step-gained fully convolutional network by combining the memory information on the time axis with the motion information on the space axis while storing the saliency information of the current frame. The model is obtained through hierarchical training, which ensures the accuracy of the detection. Extensive experiments in comparison with 11 state-of-the-art methods are carried out, and the results show that our proposed model outperforms all 11 methods across a number of publicly available datasets.
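The abstract describes fusing a motion cue on the space axis with a memory cue on the time axis while carrying the current frame's saliency forward. The following is a minimal Python/PyTorch sketch of that fusion idea only; the frame-difference motion cue, the layer widths, and the MotionMemoryFusion name are illustrative assumptions, not the authors' actual step-gained FCN or its hierarchical training scheme.

    # Hypothetical sketch: combine a motion cue (interframe difference) with a
    # memory cue (the previous frame's saliency map) to predict the current
    # frame's saliency. Not the paper's SG-FCN; sizes are illustrative.
    import torch
    import torch.nn as nn

    class MotionMemoryFusion(nn.Module):
        def __init__(self):
            super().__init__()
            # 3 channels: current RGB frame; 3: frame difference (motion cue);
            # 1: previous saliency map (memory cue).
            self.encode = nn.Sequential(
                nn.Conv2d(3 + 3 + 1, 32, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(32, 32, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
            )
            self.predict = nn.Conv2d(32, 1, kernel_size=1)  # per-pixel saliency

        def forward(self, frame, prev_frame, prev_saliency):
            motion = frame - prev_frame              # crude spatial motion cue
            x = torch.cat([frame, motion, prev_saliency], dim=1)
            return torch.sigmoid(self.predict(self.encode(x)))

    # Usage: process frames sequentially, carrying the predicted saliency map
    # forward as the memory input for the next time step.
    model = MotionMemoryFusion()
    frames = torch.rand(5, 1, 3, 64, 64)             # 5 synthetic RGB frames
    saliency = torch.zeros(1, 1, 64, 64)             # empty memory at t = 0
    prev = frames[0]
    for t in range(1, 5):
        saliency = model(frames[t], prev, saliency)  # (1, 1, 64, 64) map
        prev = frames[t]

Carrying the predicted map forward loosely mirrors the abstract's "storing the saliency information of the current frame"; the paper itself obtains its model through hierarchical training of a step-gained fully convolutional network.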

Bibliographic Details
Published in: IEEE transactions on cybernetics, 2019-08, Vol.49 (8), p.2900-2911
Main Authors: Sun, Meijun; Zhou, Ziqi; Hu, Qinghua; Wang, Zheng; Jiang, Jianmin
Format: Article
Language: English
Subjects: Artificial neural networks; Axis movements; Computational modeling; Computer simulation; Deep learning; Eye fixation detection; Feature extraction; Fixation; fully convolutional neural networks; Human motion; Image analysis; Image detection; Predictive models; Salience; Saliency detection; Training; video saliency; Video sequences; Visualization
Online Access: Order full text
DOI: 10.1109/TCYB.2018.2832053
ISSN: 2168-2267
EISSN: 2168-2275
PMID: 29993731
Publisher: IEEE (United States)
Source: IEEE Electronic Library (IEL)
subjects Artificial neural networks
Axis movements
Computational modeling
Computer simulation
Deep learning
Eye fixation detection
Feature extraction
Fixation
fully convolutional neural networks
Human motion
Image analysis
Image detection
Predictive models
Salience
Saliency detection
Training
video saliency
Video sequences
Visualization
title SG-FCN: A Motion and Memory-Based Deep Learning Model for Video Saliency Detection
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T11%3A37%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SG-FCN:%20A%20Motion%20and%20Memory-Based%20Deep%20Learning%20Model%20for%20Video%20Saliency%20Detection&rft.jtitle=IEEE%20transactions%20on%20cybernetics&rft.au=Sun,%20Meijun&rft.date=2019-08-01&rft.volume=49&rft.issue=8&rft.spage=2900&rft.epage=2911&rft.pages=2900-2911&rft.issn=2168-2267&rft.eissn=2168-2275&rft.coden=ITCEB8&rft_id=info:doi/10.1109/TCYB.2018.2832053&rft_dat=%3Cproquest_RIE%3E2068343091%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2222213368&rft_id=info:pmid/29993731&rft_ieee_id=8365810&rfr_iscdi=true