SG-FCN: A Motion and Memory-Based Deep Learning Model for Video Saliency Detection

Data-driven saliency detection has attracted strong interest as a result of applying convolutional neural networks to the detection of eye fixations. Although a number of image-based salient object and fixation detection models have been proposed, video fixation detection still requires more exploration. Different from image analysis, motion and temporal information is a crucial factor affecting human attention when viewing video sequences. Although existing models based on local contrast and low-level features have been extensively researched, they failed to simultaneously consider interframe motion and temporal information across neighboring video frames, leading to unsatisfactory performance when handling complex scenes. To this end, we propose a novel and efficient video eye fixation detection model to improve the saliency detection performance. By simulating the memory mechanism and visual attention mechanism of human beings when watching a video, we propose a step-gained fully convolutional network by combining the memory information on the time axis with the motion information on the space axis while storing the saliency information of the current frame. The model is obtained through hierarchical training, which ensures the accuracy of the detection. Extensive experiments in comparison with 11 state-of-the-art methods are carried out, and the results show that our proposed model outperforms all 11 methods across a number of publicly available datasets.
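The abstract describes fusing a motion cue on the space axis with a memory cue on the time axis while carrying the current frame's saliency forward. The following is a minimal Python/PyTorch sketch of that fusion idea only; the frame-difference motion cue, the layer widths, and the MotionMemoryFusion name are illustrative assumptions, not the authors' actual step-gained FCN or its hierarchical training scheme.

    # Hypothetical sketch: combine a motion cue (interframe difference) with a
    # memory cue (the previous frame's saliency map) to predict the current
    # frame's saliency. Not the paper's SG-FCN; sizes are illustrative.
    import torch
    import torch.nn as nn

    class MotionMemoryFusion(nn.Module):
        def __init__(self):
            super().__init__()
            # 3 channels: current RGB frame; 3: frame difference (motion cue);
            # 1: previous saliency map (memory cue).
            self.encode = nn.Sequential(
                nn.Conv2d(3 + 3 + 1, 32, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(32, 32, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
            )
            self.predict = nn.Conv2d(32, 1, kernel_size=1)  # per-pixel saliency

        def forward(self, frame, prev_frame, prev_saliency):
            motion = frame - prev_frame              # crude spatial motion cue
            x = torch.cat([frame, motion, prev_saliency], dim=1)
            return torch.sigmoid(self.predict(self.encode(x)))

    # Usage: process frames sequentially, carrying the predicted saliency map
    # forward as the memory input for the next time step.
    model = MotionMemoryFusion()
    frames = torch.rand(5, 1, 3, 64, 64)             # 5 synthetic RGB frames
    saliency = torch.zeros(1, 1, 64, 64)             # empty memory at t = 0
    prev = frames[0]
    for t in range(1, 5):
        saliency = model(frames[t], prev, saliency)  # (1, 1, 64, 64) map
        prev = frames[t]

Carrying the predicted map forward loosely mirrors the abstract's "storing the saliency information of the current frame"; the paper itself obtains its model through hierarchical training of a step-gained fully convolutional network.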

Bibliographic Details
Published in: IEEE transactions on cybernetics, 2019-08, Vol.49 (8), p.2900-2911
Main Authors: Sun, Meijun; Zhou, Ziqi; Hu, Qinghua; Wang, Zheng; Jiang, Jianmin
Format: Article
Language: English
Subjects: Artificial neural networks; Axis movements; Computational modeling; Computer simulation; Deep learning; Eye fixation detection; Feature extraction; Fixation; fully convolutional neural networks; Human motion; Image analysis; Image detection; Predictive models; Salience; Saliency detection; Training; video saliency; Video sequences; Visualization
Online Access: Order full text
DOI: 10.1109/TCYB.2018.2832053
ISSN: 2168-2267
EISSN: 2168-2275
PMID: 29993731
Publisher: IEEE (United States)
Source: IEEE Electronic Library (IEL)
subjects Artificial neural networks
Axis movements
Computational modeling
Computer simulation
Deep learning
Eye fixation detection
Feature extraction
Fixation
fully convolutional neural networks
Human motion
Image analysis
Image detection
Predictive models
Salience
Saliency detection
Training
video saliency
Video sequences
Visualization
title SG-FCN: A Motion and Memory-Based Deep Learning Model for Video Saliency Detection
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T11%3A37%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SG-FCN:%20A%20Motion%20and%20Memory-Based%20Deep%20Learning%20Model%20for%20Video%20Saliency%20Detection&rft.jtitle=IEEE%20transactions%20on%20cybernetics&rft.au=Sun,%20Meijun&rft.date=2019-08-01&rft.volume=49&rft.issue=8&rft.spage=2900&rft.epage=2911&rft.pages=2900-2911&rft.issn=2168-2267&rft.eissn=2168-2275&rft.coden=ITCEB8&rft_id=info:doi/10.1109/TCYB.2018.2832053&rft_dat=%3Cproquest_RIE%3E2068343091%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2222213368&rft_id=info:pmid/29993731&rft_ieee_id=8365810&rfr_iscdi=true