DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network

The convolutional neural network achieves excellent semantic segmentation results in artificially annotated datasets with complex scenes. However, semantic segmentation methods still suffer from several problems such as low use rate of the features, high computational complexity, and being far from...

Bibliographic details
Published in: Mathematical problems in engineering 2022-06, Vol.2022, p.1-10
Main authors: Xing, Yongfeng, Zhong, Luo, Zhong, Xian
Format: Article
Language: eng
Online access: Full text
container_end_page 10
container_issue
container_start_page 1
container_title Mathematical problems in engineering
container_volume 2022
creator Xing, Yongfeng
Zhong, Luo
Zhong, Xian
description Convolutional neural networks achieve excellent semantic segmentation results on manually annotated datasets with complex scenes. However, semantic segmentation methods still suffer from several problems, such as low feature utilization, high computational complexity, and a gap to practical real-time application, which make image semantic segmentation challenging. Two factors are critical to the semantic segmentation task: global context and multilevel semantics. However, generating these two factors always leads to high complexity. To address this, we propose a novel structure, the dual attention fusion module (DAFM), which eliminates structural redundancy. Unlike most existing algorithms, we combine the attention mechanism with the depth pyramid pool module (DPPM), rather than complex expansion convolution, to extract accurate dense features for pixel labeling. Specifically, we introduce a DPPM that applies the spatial pyramid structure to the output and combines it with global pooling. The DAFM is introduced in each decoder layer. Finally, the low-level and high-level features are fused to obtain the semantic segmentation result. Experiments and visualization results on the Cityscapes and CamVid datasets show that, in real-time semantic segmentation, we achieve a satisfactory balance between accuracy and speed, which demonstrates the effectiveness of the proposed algorithm. In particular, on a single 1080 Ti GPU, ResNet-18 produces 75.53% MIoU at 70 FPS on Cityscapes and 73.96% MIoU at 109 FPS on CamVid.
doi_str_mv 10.1155/2022/6195148
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2678217083</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2678217083</sourcerecordid><originalsourceid>FETCH-LOGICAL-c337t-1ec39c4c093e6216790b0d37b0d6e37cf50e9fe07f7557164c9dfd30bbee8b983</originalsourceid><addsrcrecordid>eNp9kEtPwzAQhC0EEqVw4wdY4gihfsRxwq30AUgFpLZI3CLH3rQpeZTYUcW_x6U9c9mZ1X7akQaha0ruKRViwAhjg4gmgobxCepREfHAe3nqPWFhQBn_PEcX1m4IYVTQuIe68XC-gNUbuAc8xHNQZbAsKsALqFTtCu3NqoLaKVc0NX4Ft24MflQWDPb7uFMlHjrngf152tk_qjFdCVjVBk9q3RhogzH8KfY5u6b9ukRnuSotXB21jz6mk-XoOZi9P72MhrNAcy5dQEHzRIeaJBwiRiOZkIwYLv2IgEudCwJJDkTmUghJo1AnJjecZBlAnCUx76Obw99t23x3YF26abq29pEpi2TMqCQx99TdgdJtY20Lebpti0q1Pykl6b7YdF9seizW47cHfF3URu2K_-lfOhB3NQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2678217083</pqid></control><display><type>article</type><title>DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network</title><source>Wiley Online Library Open Access</source><source>EZB-FREE-00999 freely available EZB journals</source><source>Alma/SFX Local Collection</source><creator>Xing, Yongfeng ; Zhong, Luo ; Zhong, Xian</creator><contributor>Che, Hangjun ; Hangjun Che</contributor><creatorcontrib>Xing, Yongfeng ; Zhong, Luo ; Zhong, Xian ; Che, Hangjun ; Hangjun Che</creatorcontrib><description>The convolutional neural network achieves excellent semantic segmentation results in artificially annotated datasets with complex scenes. However, semantic segmentation methods still suffer from several problems such as low use rate of the features, high computational complexity, and being far from practical real-time application, which bring about challenges for the image semantic segmentation. Two factors are very critical to semantic segmentation task: global context and multilevel semantics. 
However, generating these two factors will always lead to high complexity. In order to solve this, we propose a novel structure, dual attention fusion module (DAFM), by eliminating structural redundancy. Unlike most of the existing algorithms, we combine the attention mechanism with the depth pyramid pool module (DPPM) to extract accurate dense features for pixel labeling rather than complex expansion convolution. Specifically, we introduce a DPPM to execute the spatial pyramid structure in output and combine the global pool method. The DAFM is introduced in each decoder layer. Finally, the low-level features and high-level features are fused to obtain semantic segmentation result. The experiments and visualization results on Cityscapes and CamVid datasets show that, in real-time semantic segmentation, we have achieved a satisfactory balance between accuracy and speed, which proves the effectiveness of the proposed algorithm. In particular, on a single 1080ti GPU computer, ResNet-18 produces 75.53% MIoU at 70 FPS on Cityscapes and 73.96% MIoU at 109 FPS on CamVid.</description><identifier>ISSN: 1024-123X</identifier><identifier>EISSN: 1563-5147</identifier><identifier>DOI: 10.1155/2022/6195148</identifier><language>eng</language><publisher>New York: Hindawi</publisher><subject>Accuracy ; Algorithms ; Artificial neural networks ; Classification ; Coders ; Complexity ; Datasets ; Encoders-Decoders ; Engineering ; Feature extraction ; Image segmentation ; Methods ; Modules ; Neural networks ; Real time ; Redundancy ; Semantic segmentation ; Semantics</subject><ispartof>Mathematical problems in engineering, 2022-06, Vol.2022, p.1-10</ispartof><rights>Copyright © 2022 Yongfeng Xing et al.</rights><rights>Copyright © 2022 Yongfeng Xing et al. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c337t-1ec39c4c093e6216790b0d37b0d6e37cf50e9fe07f7557164c9dfd30bbee8b983</citedby><cites>FETCH-LOGICAL-c337t-1ec39c4c093e6216790b0d37b0d6e37cf50e9fe07f7557164c9dfd30bbee8b983</cites><orcidid>0000-0002-2346-9659 ; 0000-0003-4400-0523 ; 0000-0003-0977-7919</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27903,27904</link.rule.ids></links><search><contributor>Che, Hangjun</contributor><contributor>Hangjun Che</contributor><creatorcontrib>Xing, Yongfeng</creatorcontrib><creatorcontrib>Zhong, Luo</creatorcontrib><creatorcontrib>Zhong, Xian</creatorcontrib><title>DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network</title><title>Mathematical problems in engineering</title><description>The convolutional neural network achieves excellent semantic segmentation results in artificially annotated datasets with complex scenes. However, semantic segmentation methods still suffer from several problems such as low use rate of the features, high computational complexity, and being far from practical real-time application, which bring about challenges for the image semantic segmentation. Two factors are very critical to semantic segmentation task: global context and multilevel semantics. However, generating these two factors will always lead to high complexity. In order to solve this, we propose a novel structure, dual attention fusion module (DAFM), by eliminating structural redundancy. 
Unlike most of the existing algorithms, we combine the attention mechanism with the depth pyramid pool module (DPPM) to extract accurate dense features for pixel labeling rather than complex expansion convolution. Specifically, we introduce a DPPM to execute the spatial pyramid structure in output and combine the global pool method. The DAFM is introduced in each decoder layer. Finally, the low-level features and high-level features are fused to obtain semantic segmentation result. The experiments and visualization results on Cityscapes and CamVid datasets show that, in real-time semantic segmentation, we have achieved a satisfactory balance between accuracy and speed, which proves the effectiveness of the proposed algorithm. In particular, on a single 1080ti GPU computer, ResNet-18 produces 75.53% MIoU at 70 FPS on Cityscapes and 73.96% MIoU at 109 FPS on CamVid.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Artificial neural networks</subject><subject>Classification</subject><subject>Coders</subject><subject>Complexity</subject><subject>Datasets</subject><subject>Encoders-Decoders</subject><subject>Engineering</subject><subject>Feature extraction</subject><subject>Image segmentation</subject><subject>Methods</subject><subject>Modules</subject><subject>Neural networks</subject><subject>Real time</subject><subject>Redundancy</subject><subject>Semantic 
segmentation</subject><subject>Semantics</subject><issn>1024-123X</issn><issn>1563-5147</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>RHX</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNp9kEtPwzAQhC0EEqVw4wdY4gihfsRxwq30AUgFpLZI3CLH3rQpeZTYUcW_x6U9c9mZ1X7akQaha0ruKRViwAhjg4gmgobxCepREfHAe3nqPWFhQBn_PEcX1m4IYVTQuIe68XC-gNUbuAc8xHNQZbAsKsALqFTtCu3NqoLaKVc0NX4Ft24MflQWDPb7uFMlHjrngf152tk_qjFdCVjVBk9q3RhogzH8KfY5u6b9ukRnuSotXB21jz6mk-XoOZi9P72MhrNAcy5dQEHzRIeaJBwiRiOZkIwYLv2IgEudCwJJDkTmUghJo1AnJjecZBlAnCUx76Obw99t23x3YF26abq29pEpi2TMqCQx99TdgdJtY20Lebpti0q1Pykl6b7YdF9seizW47cHfF3URu2K_-lfOhB3NQ</recordid><startdate>20220606</startdate><enddate>20220606</enddate><creator>Xing, Yongfeng</creator><creator>Zhong, Luo</creator><creator>Zhong, Xian</creator><general>Hindawi</general><general>Hindawi Limited</general><scope>RHU</scope><scope>RHW</scope><scope>RHX</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7TB</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>CWDGH</scope><scope>DWQXO</scope><scope>FR3</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>KR7</scope><scope>L6V</scope><scope>M7S</scope><scope>P5Z</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><orcidid>https://orcid.org/0000-0002-2346-9659</orcidid><orcidid>https://orcid.org/0000-0003-4400-0523</orcidid><orcidid>https://orcid.org/0000-0003-0977-7919</orcidid></search><sort><creationdate>20220606</creationdate><title>DARSegNet: A Real-Time Semantic 
Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network</title><author>Xing, Yongfeng ; Zhong, Luo ; Zhong, Xian</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c337t-1ec39c4c093e6216790b0d37b0d6e37cf50e9fe07f7557164c9dfd30bbee8b983</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Artificial neural networks</topic><topic>Classification</topic><topic>Coders</topic><topic>Complexity</topic><topic>Datasets</topic><topic>Encoders-Decoders</topic><topic>Engineering</topic><topic>Feature extraction</topic><topic>Image segmentation</topic><topic>Methods</topic><topic>Modules</topic><topic>Neural networks</topic><topic>Real time</topic><topic>Redundancy</topic><topic>Semantic segmentation</topic><topic>Semantics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Xing, Yongfeng</creatorcontrib><creatorcontrib>Zhong, Luo</creatorcontrib><creatorcontrib>Zhong, Xian</creatorcontrib><collection>Hindawi Publishing Complete</collection><collection>Hindawi Publishing Subscription Journals</collection><collection>Hindawi Publishing Open Access</collection><collection>CrossRef</collection><collection>Mechanical &amp; Transportation Engineering Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community 
College</collection><collection>Middle East &amp; Africa Database</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>Civil Engineering Abstracts</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><jtitle>Mathematical problems in engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Xing, Yongfeng</au><au>Zhong, Luo</au><au>Zhong, Xian</au><au>Che, Hangjun</au><au>Hangjun Che</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network</atitle><jtitle>Mathematical problems in engineering</jtitle><date>2022-06-06</date><risdate>2022</risdate><volume>2022</volume><spage>1</spage><epage>10</epage><pages>1-10</pages><issn>1024-123X</issn><eissn>1563-5147</eissn><abstract>The convolutional neural network achieves excellent semantic segmentation results in artificially annotated datasets with complex scenes. 
However, semantic segmentation methods still suffer from several problems such as low use rate of the features, high computational complexity, and being far from practical real-time application, which bring about challenges for the image semantic segmentation. Two factors are very critical to semantic segmentation task: global context and multilevel semantics. However, generating these two factors will always lead to high complexity. In order to solve this, we propose a novel structure, dual attention fusion module (DAFM), by eliminating structural redundancy. Unlike most of the existing algorithms, we combine the attention mechanism with the depth pyramid pool module (DPPM) to extract accurate dense features for pixel labeling rather than complex expansion convolution. Specifically, we introduce a DPPM to execute the spatial pyramid structure in output and combine the global pool method. The DAFM is introduced in each decoder layer. Finally, the low-level features and high-level features are fused to obtain semantic segmentation result. The experiments and visualization results on Cityscapes and CamVid datasets show that, in real-time semantic segmentation, we have achieved a satisfactory balance between accuracy and speed, which proves the effectiveness of the proposed algorithm. In particular, on a single 1080ti GPU computer, ResNet-18 produces 75.53% MIoU at 70 FPS on Cityscapes and 73.96% MIoU at 109 FPS on CamVid.</abstract><cop>New York</cop><pub>Hindawi</pub><doi>10.1155/2022/6195148</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0002-2346-9659</orcidid><orcidid>https://orcid.org/0000-0003-4400-0523</orcidid><orcidid>https://orcid.org/0000-0003-0977-7919</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1024-123X
ispartof Mathematical problems in engineering, 2022-06, Vol.2022, p.1-10
issn 1024-123X
1563-5147
language eng
recordid cdi_proquest_journals_2678217083
source Wiley Online Library Open Access; EZB-FREE-00999 freely available EZB journals; Alma/SFX Local Collection
subjects Accuracy
Algorithms
Artificial neural networks
Classification
Coders
Complexity
Datasets
Encoders-Decoders
Engineering
Feature extraction
Image segmentation
Methods
Modules
Neural networks
Real time
Redundancy
Semantic segmentation
Semantics
title DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T13%3A39%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DARSegNet:%20A%20Real-Time%20Semantic%20Segmentation%20Method%20Based%20on%20Dual%20Attention%20Fusion%20Module%20and%20Encoder-Decoder%20Network&rft.jtitle=Mathematical%20problems%20in%20engineering&rft.au=Xing,%20Yongfeng&rft.date=2022-06-06&rft.volume=2022&rft.spage=1&rft.epage=10&rft.pages=1-10&rft.issn=1024-123X&rft.eissn=1563-5147&rft_id=info:doi/10.1155/2022/6195148&rft_dat=%3Cproquest_cross%3E2678217083%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2678217083&rft_id=info:pmid/&rfr_iscdi=true
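
The description reports accuracy as MIoU (mean intersection over union). As a rough illustration of how that metric is commonly computed, and not the authors' evaluation code, the sketch below builds a confusion matrix from flattened per-pixel labels and averages the per-class IoU. The function name `mean_iou`, the `ignore_index=255` default, and the choice to skip classes absent from both prediction and ground truth are assumptions made for this example.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred and target are flat sequences of integer class ids, one per pixel.
    ignore_index is an assumed convention for unlabeled pixels.
    """
    # confusion[t][p] counts pixels whose true class is t and predicted class is p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel: excluded from the metric
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                   # true c, predicted otherwise
        fp = sum(row[c] for row in confusion) - tp    # predicted c, true otherwise
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both pred and target
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]  # last pixel is unlabeled
    print(mean_iou(pred, target, num_classes=3))
```

In this toy input the per-class IoU values are 1/2, 2/3, and 1, so the mean is 13/18. Real benchmarks accumulate the confusion matrix over the whole validation set before averaging, rather than averaging per-image scores.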