An attention-fused network for semantic segmentation of very-high-resolution remote sensing imagery


Full description

Saved in:
Bibliographic details
Published in: ISPRS journal of photogrammetry and remote sensing 2021-07, Vol.177, p.238-262
Main authors: Yang, Xuan; Li, Shanshan; Chen, Zhengchao; Chanussot, Jocelyn; Jia, Xiuping; Zhang, Bing; Li, Baipeng; Chen, Pan
Format: Article
Language: English
Subjects:
Online access: Full text
description Semantic segmentation is a fundamental task in deep learning. In recent years, with the growth of remote sensing big data, it has been applied increasingly to remote sensing imagery. Deep convolutional neural networks (DCNNs) face the challenge of feature fusion: fusing multisource very-high-resolution remote sensing data increases the information a network can learn from, which helps DCNNs classify target objects correctly; at the same time, fusing high-level abstract features with low-level spatial features improves classification accuracy at the borders between target objects. In this paper, we propose a multipath encoder structure to extract features from multipath inputs, a multipath attention-fused block module to fuse multipath features, and a refinement attention-fused block module to fuse high-level abstract features with low-level spatial features. Building on these modules, we propose a novel convolutional neural network architecture, the attention-fused network (AFNet). AFNet achieves state-of-the-art performance, with an overall accuracy of 91.7% and a mean F1 score of 90.96% on the ISPRS Vaihingen 2D dataset and an overall accuracy of 92.1% and a mean F1 score of 93.44% on the ISPRS Potsdam 2D dataset.
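The attention-weighted fusion the abstract describes can be sketched in a few lines. The following is a minimal NumPy illustration, not the authors' implementation: it gates two same-shape feature maps (e.g. one path from spectral imagery, one from elevation data, as in the ISPRS benchmarks) with per-channel attention weights and sums them. The function names and the simple sigmoid-of-pooled-features gate are assumptions made for illustration only.

```python
import numpy as np

def channel_attention(features):
    """Squeeze spatial dims and produce per-channel gate weights in (0, 1).

    features: array of shape (C, H, W). Global average pooling over H and W
    yields one value per channel, passed through a sigmoid (an assumed,
    simplified stand-in for a learned attention branch).
    """
    pooled = features.mean(axis=(1, 2))          # (C,)
    return 1.0 / (1.0 + np.exp(-pooled))         # sigmoid gate, (C,)

def attention_fuse(path_a, path_b):
    """Fuse two same-shape feature maps with channel-attention gates."""
    gate_a = channel_attention(path_a)[:, None, None]  # (C, 1, 1), broadcasts
    gate_b = channel_attention(path_b)[:, None, None]
    return gate_a * path_a + gate_b * path_b

# Toy multipath features standing in for two encoder paths.
rng = np.random.default_rng(0)
irrg = rng.standard_normal((8, 16, 16))  # e.g. features from an IRRG image path
dsm = rng.standard_normal((8, 16, 16))   # e.g. features from an elevation path
fused = attention_fuse(irrg, dsm)
print(fused.shape)  # (8, 16, 16)
```

The same gating idea carries over to the refinement step described above: upsample the high-level (coarse) features to the low-level feature resolution first, then apply the fusion. In a real network the gates would be produced by learned layers rather than a fixed sigmoid of pooled activations.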
doi_str_mv 10.1016/j.isprsjprs.2021.05.004
format Article
identifier ISSN: 0924-2716
ispartof ISPRS journal of photogrammetry and remote sensing, 2021-07, Vol.177, p.238-262
issn 0924-2716
1872-8235
language eng
recordid cdi_hal_primary_oai_HAL_hal_03430200v1
source Elsevier ScienceDirect Journals Complete
subjects Attention-fused network
Convolutional neural network
Deep learning
Engineering Sciences
ISPRS
Semantic segmentation
Signal and Image processing
Very-high-resolution imagery
title An attention-fused network for semantic segmentation of very-high-resolution remote sensing imagery
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T16%3A30%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-elsevier_hal_p&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20attention-fused%20network%20for%20semantic%20segmentation%20of%20very-high-resolution%20remote%20sensing%20imagery&rft.jtitle=ISPRS%20journal%20of%20photogrammetry%20and%20remote%20sensing&rft.au=Yang,%20Xuan&rft.date=2021-07-01&rft.volume=177&rft.spage=238&rft.epage=262&rft.pages=238-262&rft.issn=0924-2716&rft.eissn=1872-8235&rft_id=info:doi/10.1016/j.isprsjprs.2021.05.004&rft_dat=%3Celsevier_hal_p%3ES0924271621001295%3C/elsevier_hal_p%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_els_id=S0924271621001295&rfr_iscdi=true