SG-FCN: A Motion and Memory-Based Deep Learning Model for Video Saliency Detection
Data-driven saliency detection has attracted strong interest as a result of applying convolutional neural networks to the detection of eye fixations. Although a number of image-based salient object and fixation detection models have been proposed, video fixation detection still requires more exploration. Different from image analysis, motion and temporal information is a crucial factor affecting human attention when viewing video sequences. Although existing models based on local contrast and low-level features have been extensively researched, they failed to simultaneously consider interframe motion and temporal information across neighboring video frames, leading to unsatisfactory performance when handling complex scenes. To this end, we propose a novel and efficient video eye fixation detection model to improve the saliency detection performance. By simulating the memory mechanism and visual attention mechanism of human beings when watching a video, we propose a step-gained fully convolutional network by combining the memory information on the time axis with the motion information on the space axis while storing the saliency information of the current frame. The model is obtained through hierarchical training, which ensures the accuracy of the detection. Extensive experiments in comparison with 11 state-of-the-art methods are carried out, and the results show that our proposed model outperforms all 11 methods across a number of publicly available datasets.
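The abstract describes two signals being fused inside a fully convolutional network: motion across neighboring frames (the space axis) and a memory of the saliency just computed (the time axis). Below is a minimal, hypothetical PyTorch sketch of that fusion pattern only, not the authors' SG-FCN: it assumes simple frame differencing as the motion cue and feeds the previous frame's saliency map back in as memory, and all module names and layer sizes are invented for illustration.

```python
# Illustrative sketch only -- NOT the paper's SG-FCN. It shows the general
# pattern the abstract describes: fuse spatial features of the current frame,
# an inter-frame motion cue, and the previously stored saliency map ("memory")
# in a fully convolutional network. All names and sizes are hypothetical.
import torch
import torch.nn as nn

class MotionMemoryFCNSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Spatial stream: encodes appearance of the current RGB frame.
        self.spatial = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        # Motion stream: encodes a crude motion cue (frame difference).
        self.motion = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
        )
        # Fusion head: spatial features + motion features + previous saliency
        # map (1 channel of temporal "memory") -> new saliency map in [0, 1].
        self.fuse = nn.Sequential(
            nn.Conv2d(32 + 32 + 1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1), nn.Sigmoid(),
        )

    def forward(self, frame_t, frame_prev, saliency_prev):
        motion_cue = frame_t - frame_prev  # motion information on the space axis
        feats = torch.cat(
            [self.spatial(frame_t), self.motion(motion_cue), saliency_prev], dim=1
        )
        return self.fuse(feats)  # stored and reused as memory for the next frame

# Frame-by-frame usage: each output becomes the memory input of the next step.
net = MotionMemoryFCNSketch()
frame_prev = torch.rand(1, 3, 64, 64)
saliency_prev = torch.zeros(1, 1, 64, 64)  # no memory before the first frame
frame_t = torch.rand(1, 3, 64, 64)
saliency_t = net(frame_t, frame_prev, saliency_prev)  # shape (1, 1, 64, 64)
```

Feeding the previous output back in is what keeps per-frame predictions temporally coherent; the paper's actual step-gained architecture and hierarchical training scheme are not reproduced here.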
Saved in:
Published in: | IEEE transactions on cybernetics 2019-08, Vol.49 (8), p.2900-2911 |
---|---|
Main authors: | Sun, Meijun; Zhou, Ziqi; Hu, Qinghua; Wang, Zheng; Jiang, Jianmin |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 2911 |
---|---|
container_issue | 8 |
container_start_page | 2900 |
container_title | IEEE transactions on cybernetics |
container_volume | 49 |
creator | Sun, Meijun; Zhou, Ziqi; Hu, Qinghua; Wang, Zheng; Jiang, Jianmin |
description | Data-driven saliency detection has attracted strong interest as a result of applying convolutional neural networks to the detection of eye fixations. Although a number of image-based salient object and fixation detection models have been proposed, video fixation detection still requires more exploration. Different from image analysis, motion and temporal information is a crucial factor affecting human attention when viewing video sequences. Although existing models based on local contrast and low-level features have been extensively researched, they failed to simultaneously consider interframe motion and temporal information across neighboring video frames, leading to unsatisfactory performance when handling complex scenes. To this end, we propose a novel and efficient video eye fixation detection model to improve the saliency detection performance. By simulating the memory mechanism and visual attention mechanism of human beings when watching a video, we propose a step-gained fully convolutional network by combining the memory information on the time axis with the motion information on the space axis while storing the saliency information of the current frame. The model is obtained through hierarchical training, which ensures the accuracy of the detection. Extensive experiments in comparison with 11 state-of-the-art methods are carried out, and the results show that our proposed model outperforms all 11 methods across a number of publicly available datasets. |
doi_str_mv | 10.1109/TCYB.2018.2832053 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 2168-2267; EISSN: 2168-2275; DOI: 10.1109/TCYB.2018.2832053; PMID: 29993731; CODEN: ITCEB8 |
ispartof | IEEE transactions on cybernetics, 2019-08, Vol.49 (8), p.2900-2911 |
issn | 2168-2267 2168-2275 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TCYB_2018_2832053 |
source | IEEE Electronic Library (IEL) |
subjects | Artificial neural networks; Axis movements; Computational modeling; Computer simulation; Deep learning; Eye fixation detection; Feature extraction; Fixation; fully convolutional neural networks; Human motion; Image analysis; Image detection; Predictive models; Salience; Saliency detection; Training; video saliency; Video sequences; Visualization |
title | SG-FCN: A Motion and Memory-Based Deep Learning Model for Video Saliency Detection |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T11%3A37%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SG-FCN:%20A%20Motion%20and%20Memory-Based%20Deep%20Learning%20Model%20for%20Video%20Saliency%20Detection&rft.jtitle=IEEE%20transactions%20on%20cybernetics&rft.au=Sun,%20Meijun&rft.date=2019-08-01&rft.volume=49&rft.issue=8&rft.spage=2900&rft.epage=2911&rft.pages=2900-2911&rft.issn=2168-2267&rft.eissn=2168-2275&rft.coden=ITCEB8&rft_id=info:doi/10.1109/TCYB.2018.2832053&rft_dat=%3Cproquest_RIE%3E2068343091%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2222213368&rft_id=info:pmid/29993731&rft_ieee_id=8365810&rfr_iscdi=true |