DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network
The convolutional neural network achieves excellent semantic segmentation results in artificially annotated datasets with complex scenes. However, semantic segmentation methods still suffer from several problems such as low use rate of the features, high computational complexity, and being far from...
Published in: | Mathematical problems in engineering 2022-06, Vol.2022, p.1-10 |
---|---|
Main authors: | Xing, Yongfeng; Zhong, Luo; Zhong, Xian |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 10 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | Mathematical problems in engineering |
container_volume | 2022 |
creator | Xing, Yongfeng Zhong, Luo Zhong, Xian |
description | Convolutional neural networks achieve excellent semantic segmentation results on manually annotated datasets with complex scenes. However, semantic segmentation methods still suffer from several problems, such as low feature utilization, high computational complexity, and a gap to practical real-time application, which pose challenges for image semantic segmentation. Two factors are critical to the semantic segmentation task: global context and multilevel semantics. However, generating these two factors always leads to high complexity. To address this, we propose a novel structure, the dual attention fusion module (DAFM), which eliminates structural redundancy. Unlike most existing algorithms, we combine the attention mechanism with a depth pyramid pool module (DPPM) to extract accurate dense features for pixel labeling, rather than relying on complex expansion convolution. Specifically, the DPPM applies a spatial pyramid structure to the output and combines it with the global pool method. The DAFM is introduced in each decoder layer. Finally, the low-level and high-level features are fused to obtain the semantic segmentation result. Experiments and visualization results on the Cityscapes and CamVid datasets show that, in real-time semantic segmentation, the method achieves a satisfactory balance between accuracy and speed, which demonstrates the effectiveness of the proposed algorithm. In particular, on a single 1080Ti GPU, ResNet-18 produces 75.53% MIoU at 70 FPS on Cityscapes and 73.96% MIoU at 109 FPS on CamVid. |
doi_str_mv | 10.1155/2022/6195148 |
format | Article |
contributor | Che, Hangjun |
date | 2022-06-06 |
oa | free_for_read |
orcidid | https://orcid.org/0000-0002-2346-9659; https://orcid.org/0000-0003-4400-0523; https://orcid.org/0000-0003-0977-7919 |
publisher | New York: Hindawi |
rights | Copyright © 2022 Yongfeng Xing et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. https://creativecommons.org/licenses/by/4.0 |
toplevel | peer_reviewed; online_resources |
fulltext | fulltext |
identifier | ISSN: 1024-123X |
ispartof | Mathematical problems in engineering, 2022-06, Vol.2022, p.1-10 |
issn | 1024-123X 1563-5147 |
language | eng |
recordid | cdi_proquest_journals_2678217083 |
source | Wiley Online Library Open Access; EZB-FREE-00999 freely available EZB journals; Alma/SFX Local Collection |
subjects | Accuracy Algorithms Artificial neural networks Classification Coders Complexity Datasets Encoders-Decoders Engineering Feature extraction Image segmentation Methods Modules Neural networks Real time Redundancy Semantic segmentation Semantics |
title | DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T13%3A39%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DARSegNet:%20A%20Real-Time%20Semantic%20Segmentation%20Method%20Based%20on%20Dual%20Attention%20Fusion%20Module%20and%20Encoder-Decoder%20Network&rft.jtitle=Mathematical%20problems%20in%20engineering&rft.au=Xing,%20Yongfeng&rft.date=2022-06-06&rft.volume=2022&rft.spage=1&rft.epage=10&rft.pages=1-10&rft.issn=1024-123X&rft.eissn=1563-5147&rft_id=info:doi/10.1155/2022/6195148&rft_dat=%3Cproquest_cross%3E2678217083%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2678217083&rft_id=info:pmid/&rfr_iscdi=true |