Hybrid Attention Mechanism And Forward Feedback Unit for RGB-D Salient Object Detection

RGB-D salient object detection (SOD) is an important pre-processing step for many computer vision tasks and has received much attention in recent years. However, extracting more effective features and effectively fusing RGB and depth modality features remain challenges that restrict the development of SOD. In this paper, we propose an effective network architecture called FFMA-Net: 1) we replace the baseline's backbone network with a ResNet34 model to extract more effective features from the input data; 2) we design the HAM module to refine the features extracted by the ResNet34 model at different stages, ensuring the effectiveness of the features from each stage; 3) we propose the FFU module to perform multi-scale fusion of features from different stages, yielding more semantically rich features that are crucial for the model's decoding stage. Our model outperforms the latest methods on six RGB-D datasets across all evaluation metrics; in particular, the F-measure improves by approximately 5% on both the SSD and LFSD datasets.
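
The abstract names FFMA-Net's three components (a ResNet34 backbone, the HAM module, and the FFU module) but describes none of their internals. Purely as an illustrative sketch in PyTorch, the code below assumes a conventional channel-plus-spatial attention design for HAM and an upsample-and-concatenate fusion for FFU; every class name, channel size, and wiring choice is a hypothetical reading of the abstract, not the authors' published implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class HAM(nn.Module):
    # Hypothetical Hybrid Attention Module: channel attention followed by
    # spatial attention, refining the features of one encoder stage.
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.channel_fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        # Channel attention computed from globally average-pooled statistics.
        w = self.channel_fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)
        x = x * w
        # Spatial attention computed from per-pixel mean/max over channels.
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.max(dim=1, keepdim=True).values], dim=1)
        return x * torch.sigmoid(self.spatial_conv(s))

class FFU(nn.Module):
    # Hypothetical Forward Feedback Unit: upsamples deeper (more semantic)
    # features and fuses them with shallower ones for multi-scale decoding.
    def __init__(self, deep_channels, shallow_channels, out_channels):
        super().__init__()
        self.fuse = nn.Conv2d(deep_channels + shallow_channels,
                              out_channels, kernel_size=3, padding=1)

    def forward(self, deep, shallow):
        deep = F.interpolate(deep, size=shallow.shape[2:],
                             mode="bilinear", align_corners=False)
        return F.relu(self.fuse(torch.cat([deep, shallow], dim=1)))

# Example wiring with ResNet34-like stage shapes (stages 3 and 4 output
# 256 and 512 channels, respectively).
f3 = torch.randn(1, 256, 28, 28)   # stage-3 features
f4 = torch.randn(1, 512, 14, 14)   # stage-4 features
fused = FFU(512, 256, 256)(HAM(512)(f4), HAM(256)(f3))
print(fused.shape)                 # torch.Size([1, 256, 28, 28])

A complete FFMA-Net would additionally need the RGB and depth encoder streams and a decoder, which the abstract does not specify; the sketch only shows how per-stage refinement and cross-stage fusion could compose.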

Bibliographic Details
Published in: IEEE Access, 2023-01, Vol. 11, p. 1-1
Main authors: Li, Haitang; Han, Yibo; Li, Peiling; Li, Xiaohui; Shi, Lijuan
Format: Article
Language: English
DOI: 10.1109/ACCESS.2023.3309636
ISSN / EISSN: 2169-3536
Publisher: IEEE, Piscataway
Source: IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals
Online access: Full text
Subjects:
Computer architecture
Computer networks
Computer vision
Data mining
Datasets
Decoding
Encoding
Feature extraction
Feedforward neural networks
Forward Feedback Unit
Hybrid Attention Mechanism
Modules
Object detection
Object recognition
RGB-D salient object detection
Salience
Task analysis
Visualization