DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network

The convolutional neural network achieves excellent semantic segmentation results in artificially annotated datasets with complex scenes. However, semantic segmentation methods still suffer from several problems such as low use rate of the features, high computational complexity, and being far from...

Detailed description

Bibliographic details
Published in:	Mathematical problems in engineering, 2022-06, Vol.2022, p.1-10
Main authors:	Xing, Yongfeng; Zhong, Luo; Zhong, Xian
Format:	Article
Language:	eng
Subjects:	Accuracy; Algorithms; Artificial neural networks; Classification; Coders; Complexity; Datasets; Encoders-Decoders; Engineering; Feature extraction; Image segmentation; Methods; Modules; Neural networks; Real time; Redundancy; Semantic segmentation; Semantics
Online access:	Full text

container_end_page	10
container_issue
container_start_page	1
container_title	Mathematical problems in engineering
container_volume	2022
creator	Xing, Yongfeng; Zhong, Luo; Zhong, Xian
description	Convolutional neural networks achieve excellent semantic segmentation results on manually annotated datasets with complex scenes. However, semantic segmentation methods still suffer from several problems, such as low feature utilization, high computational complexity, and a large gap to practical real-time application, which pose challenges for image semantic segmentation. Two factors are critical to the semantic segmentation task: global context and multilevel semantics. However, generating these two factors always leads to high complexity. To address this, we propose a novel structure, the dual attention fusion module (DAFM), which eliminates structural redundancy. Unlike most existing algorithms, we combine the attention mechanism with the depth pyramid pool module (DPPM), rather than complex expansion convolution, to extract accurate dense features for pixel labeling. Specifically, we introduce a DPPM to apply the spatial pyramid structure to the output and combine it with the global pool method. The DAFM is introduced in each decoder layer. Finally, the low-level and high-level features are fused to obtain the semantic segmentation result. Experiments and visualization results on the Cityscapes and CamVid datasets show that, in real-time semantic segmentation, we achieve a satisfactory balance between accuracy and speed, which demonstrates the effectiveness of the proposed algorithm. In particular, on a single 1080Ti GPU computer, ResNet-18 produces 75.53% MIoU at 70 FPS on Cityscapes and 73.96% MIoU at 109 FPS on CamVid.
doi_str_mv	10.1155/2022/6195148
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2678217083</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2678217083</sourcerecordid><originalsourceid>FETCH-LOGICAL-c337t-1ec39c4c093e6216790b0d37b0d6e37cf50e9fe07f7557164c9dfd30bbee8b983</originalsourceid><addsrcrecordid>eNp9kEtPwzAQhC0EEqVw4wdY4gihfsRxwq30AUgFpLZI3CLH3rQpeZTYUcW_x6U9c9mZ1X7akQaha0ruKRViwAhjg4gmgobxCepREfHAe3nqPWFhQBn_PEcX1m4IYVTQuIe68XC-gNUbuAc8xHNQZbAsKsALqFTtCu3NqoLaKVc0NX4Ft24MflQWDPb7uFMlHjrngf152tk_qjFdCVjVBk9q3RhogzH8KfY5u6b9ukRnuSotXB21jz6mk-XoOZi9P72MhrNAcy5dQEHzRIeaJBwiRiOZkIwYLv2IgEudCwJJDkTmUghJo1AnJjecZBlAnCUx76Obw99t23x3YF26abq29pEpi2TMqCQx99TdgdJtY20Lebpti0q1Pykl6b7YdF9seizW47cHfF3URu2K_-lfOhB3NQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2678217083</pqid></control><display><type>article</type><title>DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network</title><source>Wiley Online Library Open Access</source><source>EZB-FREE-00999 freely available EZB journals</source><source>Alma/SFX Local Collection</source><creator>Xing, Yongfeng ; Zhong, Luo ; Zhong, Xian</creator><contributor>Che, Hangjun ; Hangjun Che</contributor><creatorcontrib>Xing, Yongfeng ; Zhong, Luo ; Zhong, Xian ; Che, Hangjun ; Hangjun Che</creatorcontrib><description>The convolutional neural network achieves excellent semantic segmentation results in artificially annotated datasets with complex scenes. However, semantic segmentation methods still suffer from several problems such as low use rate of the features, high computational complexity, and being far from practical real-time application, which bring about challenges for the image semantic segmentation. Two factors are very critical to semantic segmentation task: global context and multilevel semantics. 
However, generating these two factors will always lead to high complexity. In order to solve this, we propose a novel structure, dual attention fusion module (DAFM), by eliminating structural redundancy. Unlike most of the existing algorithms, we combine the attention mechanism with the depth pyramid pool module (DPPM) to extract accurate dense features for pixel labeling rather than complex expansion convolution. Specifically, we introduce a DPPM to execute the spatial pyramid structure in output and combine the global pool method. The DAFM is introduced in each decoder layer. Finally, the low-level features and high-level features are fused to obtain semantic segmentation result. The experiments and visualization results on Cityscapes and CamVid datasets show that, in real-time semantic segmentation, we have achieved a satisfactory balance between accuracy and speed, which proves the effectiveness of the proposed algorithm. In particular, on a single 1080ti GPU computer, ResNet-18 produces 75.53% MIoU at 70 FPS on Cityscapes and 73.96% MIoU at 109 FPS on CamVid.</description><identifier>ISSN: 1024-123X</identifier><identifier>EISSN: 1563-5147</identifier><identifier>DOI: 10.1155/2022/6195148</identifier><language>eng</language><publisher>New York: Hindawi</publisher><subject>Accuracy ; Algorithms ; Artificial neural networks ; Classification ; Coders ; Complexity ; Datasets ; Encoders-Decoders ; Engineering ; Feature extraction ; Image segmentation ; Methods ; Modules ; Neural networks ; Real time ; Redundancy ; Semantic segmentation ; Semantics</subject><ispartof>Mathematical problems in engineering, 2022-06, Vol.2022, p.1-10</ispartof><rights>Copyright © 2022 Yongfeng Xing et al.</rights><rights>Copyright © 2022 Yongfeng Xing et al. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c337t-1ec39c4c093e6216790b0d37b0d6e37cf50e9fe07f7557164c9dfd30bbee8b983</citedby><cites>FETCH-LOGICAL-c337t-1ec39c4c093e6216790b0d37b0d6e37cf50e9fe07f7557164c9dfd30bbee8b983</cites><orcidid>0000-0002-2346-9659 ; 0000-0003-4400-0523 ; 0000-0003-0977-7919</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27903,27904</link.rule.ids></links><search><contributor>Che, Hangjun</contributor><contributor>Hangjun Che</contributor><creatorcontrib>Xing, Yongfeng</creatorcontrib><creatorcontrib>Zhong, Luo</creatorcontrib><creatorcontrib>Zhong, Xian</creatorcontrib><title>DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network</title><title>Mathematical problems in engineering</title><description>The convolutional neural network achieves excellent semantic segmentation results in artificially annotated datasets with complex scenes. However, semantic segmentation methods still suffer from several problems such as low use rate of the features, high computational complexity, and being far from practical real-time application, which bring about challenges for the image semantic segmentation. Two factors are very critical to semantic segmentation task: global context and multilevel semantics. However, generating these two factors will always lead to high complexity. In order to solve this, we propose a novel structure, dual attention fusion module (DAFM), by eliminating structural redundancy. 
Unlike most of the existing algorithms, we combine the attention mechanism with the depth pyramid pool module (DPPM) to extract accurate dense features for pixel labeling rather than complex expansion convolution. Specifically, we introduce a DPPM to execute the spatial pyramid structure in output and combine the global pool method. The DAFM is introduced in each decoder layer. Finally, the low-level features and high-level features are fused to obtain semantic segmentation result. The experiments and visualization results on Cityscapes and CamVid datasets show that, in real-time semantic segmentation, we have achieved a satisfactory balance between accuracy and speed, which proves the effectiveness of the proposed algorithm. In particular, on a single 1080ti GPU computer, ResNet-18 produces 75.53% MIoU at 70 FPS on Cityscapes and 73.96% MIoU at 109 FPS on CamVid.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Artificial neural networks</subject><subject>Classification</subject><subject>Coders</subject><subject>Complexity</subject><subject>Datasets</subject><subject>Encoders-Decoders</subject><subject>Engineering</subject><subject>Feature extraction</subject><subject>Image segmentation</subject><subject>Methods</subject><subject>Modules</subject><subject>Neural networks</subject><subject>Real time</subject><subject>Redundancy</subject><subject>Semantic 
segmentation</subject><subject>Semantics</subject><issn>1024-123X</issn><issn>1563-5147</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>RHX</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNp9kEtPwzAQhC0EEqVw4wdY4gihfsRxwq30AUgFpLZI3CLH3rQpeZTYUcW_x6U9c9mZ1X7akQaha0ruKRViwAhjg4gmgobxCepREfHAe3nqPWFhQBn_PEcX1m4IYVTQuIe68XC-gNUbuAc8xHNQZbAsKsALqFTtCu3NqoLaKVc0NX4Ft24MflQWDPb7uFMlHjrngf152tk_qjFdCVjVBk9q3RhogzH8KfY5u6b9ukRnuSotXB21jz6mk-XoOZi9P72MhrNAcy5dQEHzRIeaJBwiRiOZkIwYLv2IgEudCwJJDkTmUghJo1AnJjecZBlAnCUx76Obw99t23x3YF26abq29pEpi2TMqCQx99TdgdJtY20Lebpti0q1Pykl6b7YdF9seizW47cHfF3URu2K_-lfOhB3NQ</recordid><startdate>20220606</startdate><enddate>20220606</enddate><creator>Xing, Yongfeng</creator><creator>Zhong, Luo</creator><creator>Zhong, Xian</creator><general>Hindawi</general><general>Hindawi Limited</general><scope>RHU</scope><scope>RHW</scope><scope>RHX</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7TB</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>CWDGH</scope><scope>DWQXO</scope><scope>FR3</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>KR7</scope><scope>L6V</scope><scope>M7S</scope><scope>P5Z</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><orcidid>https://orcid.org/0000-0002-2346-9659</orcidid><orcidid>https://orcid.org/0000-0003-4400-0523</orcidid><orcidid>https://orcid.org/0000-0003-0977-7919</orcidid></search><sort><creationdate>20220606</creationdate><title>DARSegNet: A Real-Time Semantic 
Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network</title><author>Xing, Yongfeng ; Zhong, Luo ; Zhong, Xian</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c337t-1ec39c4c093e6216790b0d37b0d6e37cf50e9fe07f7557164c9dfd30bbee8b983</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Artificial neural networks</topic><topic>Classification</topic><topic>Coders</topic><topic>Complexity</topic><topic>Datasets</topic><topic>Encoders-Decoders</topic><topic>Engineering</topic><topic>Feature extraction</topic><topic>Image segmentation</topic><topic>Methods</topic><topic>Modules</topic><topic>Neural networks</topic><topic>Real time</topic><topic>Redundancy</topic><topic>Semantic segmentation</topic><topic>Semantics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Xing, Yongfeng</creatorcontrib><creatorcontrib>Zhong, Luo</creatorcontrib><creatorcontrib>Zhong, Xian</creatorcontrib><collection>Hindawi Publishing Complete</collection><collection>Hindawi Publishing Subscription Journals</collection><collection>Hindawi Publishing Open Access</collection><collection>CrossRef</collection><collection>Mechanical & Transportation Engineering Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community 
College</collection><collection>Middle East & Africa Database</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>Civil Engineering Abstracts</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><jtitle>Mathematical problems in engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Xing, Yongfeng</au><au>Zhong, Luo</au><au>Zhong, Xian</au><au>Che, Hangjun</au><au>Hangjun Che</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network</atitle><jtitle>Mathematical problems in engineering</jtitle><date>2022-06-06</date><risdate>2022</risdate><volume>2022</volume><spage>1</spage><epage>10</epage><pages>1-10</pages><issn>1024-123X</issn><eissn>1563-5147</eissn><abstract>The convolutional neural network achieves excellent semantic segmentation results in artificially annotated datasets with complex scenes. 
However, semantic segmentation methods still suffer from several problems such as low use rate of the features, high computational complexity, and being far from practical real-time application, which bring about challenges for the image semantic segmentation. Two factors are very critical to semantic segmentation task: global context and multilevel semantics. However, generating these two factors will always lead to high complexity. In order to solve this, we propose a novel structure, dual attention fusion module (DAFM), by eliminating structural redundancy. Unlike most of the existing algorithms, we combine the attention mechanism with the depth pyramid pool module (DPPM) to extract accurate dense features for pixel labeling rather than complex expansion convolution. Specifically, we introduce a DPPM to execute the spatial pyramid structure in output and combine the global pool method. The DAFM is introduced in each decoder layer. Finally, the low-level features and high-level features are fused to obtain semantic segmentation result. The experiments and visualization results on Cityscapes and CamVid datasets show that, in real-time semantic segmentation, we have achieved a satisfactory balance between accuracy and speed, which proves the effectiveness of the proposed algorithm. In particular, on a single 1080ti GPU computer, ResNet-18 produces 75.53% MIoU at 70 FPS on Cityscapes and 73.96% MIoU at 109 FPS on CamVid.</abstract><cop>New York</cop><pub>Hindawi</pub><doi>10.1155/2022/6195148</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0002-2346-9659</orcidid><orcidid>https://orcid.org/0000-0003-4400-0523</orcidid><orcidid>https://orcid.org/0000-0003-0977-7919</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1024-123X
ispartof	Mathematical problems in engineering, 2022-06, Vol.2022, p.1-10
issn	1024-123X 1563-5147
language	eng
recordid	cdi_proquest_journals_2678217083
source	Wiley Online Library Open Access; EZB-FREE-00999 freely available EZB journals; Alma/SFX Local Collection
subjects	Accuracy; Algorithms; Artificial neural networks; Classification; Coders; Complexity; Datasets; Encoders-Decoders; Engineering; Feature extraction; Image segmentation; Methods; Modules; Neural networks; Real time; Redundancy; Semantic segmentation; Semantics
title	DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T13%3A39%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DARSegNet:%20A%20Real-Time%20Semantic%20Segmentation%20Method%20Based%20on%20Dual%20Attention%20Fusion%20Module%20and%20Encoder-Decoder%20Network&rft.jtitle=Mathematical%20problems%20in%20engineering&rft.au=Xing,%20Yongfeng&rft.date=2022-06-06&rft.volume=2022&rft.spage=1&rft.epage=10&rft.pages=1-10&rft.issn=1024-123X&rft.eissn=1563-5147&rft_id=info:doi/10.1155/2022/6195148&rft_dat=%3Cproquest_cross%3E2678217083%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2678217083&rft_id=info:pmid/&rfr_iscdi=true