Explicit High-Level Semantic Network for Domain Generalization in Hyperspectral Image Classification


Bibliographic Details
Published in: IEEE Transactions on Geoscience and Remote Sensing, 2024, Vol. 62, p. 1-14
Main authors: Wang, Xusheng, Dong, Shoubin, Zheng, Xiaorou, Lu, Runuo, Jia, Jianxin
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page 14
container_issue
container_start_page 1
container_title IEEE transactions on geoscience and remote sensing
container_volume 62
creator Wang, Xusheng
Dong, Shoubin
Zheng, Xiaorou
Lu, Runuo
Jia, Jianxin
description When applied across different scenes, hyperspectral image (HSI) classification models often struggle to generalize because of data distribution disparities and label scarcity, leading to domain shift (DS) problems. Recently, high-level semantics from text have shown potential to address the DS problem by improving the generalization capability of image encoders through the alignment of image-text pairs. However, the main challenge still lies in crafting texts that accurately represent the intricate interrelationships and the fragmented nature of land cover in HSIs, and in effectively extracting spectral-spatial features from HSI data. This article proposes a domain generalization (DG) method, EHSnet, which addresses these issues by leveraging multilayered explicit high-level semantic (EHS) information from different types of texts to provide precisely relevant semantic information for the image encoder. A multilayered EHS information paradigm is defined to capture the HSI's intricate interrelationships and fragmented land-cover features, and a dual-residual encoder connected by a 2-D convolution is designed, combining CNNs with a residual structure and Vision Transformers (ViTs) with short-range cross-layer connections to explore the spectral-spatial features of HSIs. By aligning text features with image features in the semantic space, EHSnet improves the representation capability of the image encoder and gains zero-shot generalization ability for cross-scene tasks. Extensive experiments on three hyperspectral datasets (Houston, Pavia, and XS) validate the effectiveness and superiority of EHSnet, with the Kappa coefficient improved by 8.17%, 3.22%, and 3.62%, respectively, over state-of-the-art (SOTA) methods. The code is available at https://github.com/SCUT-CCNL/EHSnet .
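The zero-shot mechanism the abstract describes (classifying by matching image features against text features in a shared semantic space) can be illustrated with a minimal sketch. This is not the authors' EHSnet implementation; the function names, toy embeddings, and the use of cosine similarity over L2-normalized features are illustrative assumptions based on the common CLIP-style alignment scheme:

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # Project feature vectors onto the unit hypersphere so that
    # dot products between them equal cosine similarities.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def zero_shot_classify(image_features, class_text_features):
    """Assign each image embedding to the class whose text embedding
    is most similar (hypothetical sketch, not EHSnet's actual code).

    image_features:      (N, D) image-encoder outputs.
    class_text_features: (C, D) text-encoder outputs, one per class description.
    Returns an (N,) array of predicted class indices.
    """
    img = l2_normalize(np.asarray(image_features, dtype=np.float64))
    txt = l2_normalize(np.asarray(class_text_features, dtype=np.float64))
    logits = img @ txt.T  # (N, C) cosine similarities in the shared space
    return logits.argmax(axis=1)

# Toy demo: three "pixel" embeddings scored against two class-description
# embeddings; the third pixel is closer to the second class direction.
images = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0]])
texts = np.array([[1.0, 0.0], [0.0, 1.0]])
preds = zero_shot_classify(images, texts)
```

Because classes are represented by text embeddings rather than a learned classifier head, a model trained this way can score classes in an unseen target scene without any target labels, which is the source of the cross-scene zero-shot ability claimed for EHSnet.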
doi_str_mv 10.1109/TGRS.2024.3495765
format Article
fulltext fulltext_linktorsrc
identifier ISSN: 0196-2892
ispartof IEEE transactions on geoscience and remote sensing, 2024, Vol.62, p.1-14
issn 0196-2892
1558-0644
language eng
recordid cdi_proquest_journals_3133488689
source IEEE Electronic Library (IEL)
subjects Classification
Coders
Contrastive learning
Data mining
Data models
Datasets
Domain generalization (DG)
Feature extraction
hyperspectral image (HSI) classification
Hyperspectral imaging
Image classification
Image processing
Information processing
Land cover
Land surface
multiple modality
Representation learning
Semantics
Spatial data
Texts
Training
Visualization
visual–language model
title Explicit High-Level Semantic Network for Domain Generalization in Hyperspectral Image Classification
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T19%3A35%3A18IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Explicit%20High-Level%20Semantic%20Network%20for%20Domain%20Generalization%20in%20Hyperspectral%20Image%20Classification&rft.jtitle=IEEE%20transactions%20on%20geoscience%20and%20remote%20sensing&rft.au=Wang,%20Xusheng&rft.date=2024&rft.volume=62&rft.spage=1&rft.epage=14&rft.pages=1-14&rft.issn=0196-2892&rft.eissn=1558-0644&rft.coden=IGRSD2&rft_id=info:doi/10.1109/TGRS.2024.3495765&rft_dat=%3Cproquest_RIE%3E3133488689%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3133488689&rft_id=info:pmid/&rft_ieee_id=10750220&rfr_iscdi=true