Explicit High-Level Semantic Network for Domain Generalization in Hyperspectral Image Classification
When applied across different scenes, hyperspectral image (HSI) classification models often struggle to generalize due to data distribution disparities and label scarcity, leading to domain shift (DS) problems. Recently, high-level semantics from text have demonstrated the potential to...
Saved in:
Published in: | IEEE transactions on geoscience and remote sensing 2024, Vol.62, p.1-14 |
---|---|
Main authors: | Wang, Xusheng; Dong, Shoubin; Zheng, Xiaorou; Lu, Runuo; Jia, Jianxin |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 14 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE transactions on geoscience and remote sensing |
container_volume | 62 |
creator | Wang, Xusheng; Dong, Shoubin; Zheng, Xiaorou; Lu, Runuo; Jia, Jianxin |
description | When applied across different scenes, hyperspectral image (HSI) classification models often struggle to generalize due to data distribution disparities and label scarcity, leading to domain shift (DS) problems. Recently, high-level semantics from text have demonstrated the potential to address the DS problem by improving the generalization capability of image encoders through the alignment of image-text pairs. However, the main challenge still lies in crafting texts that accurately represent the intricate interrelationships and fragmented nature of land cover in HSIs, and in effectively extracting spectral-spatial features from HSI data. This article proposes a domain generalization (DG) method, EHSnet, which addresses these issues by leveraging multilayered explicit high-level semantic (EHS) information from different types of texts to provide precisely relevant semantic information to the image encoder. A multilayered EHS information paradigm is defined to capture the intricate interrelationships and fragmented land-cover features of HSIs, and a dual-residual encoder connected by a 2-D convolution is designed, combining CNNs with a residual structure and Vision Transformers (ViTs) with short-range cross-layer connections to explore the spectral-spatial features of HSIs. By aligning text features with image features in the semantic space, EHSnet improves the representation capability of the image encoder and gains zero-shot generalization ability for cross-scene tasks. Extensive experiments on three hyperspectral datasets (Houston, Pavia, and XS) validate the effectiveness and superiority of EHSnet, with the Kappa coefficient improved by 8.17%, 3.22%, and 3.62%, respectively, over state-of-the-art (SOTA) methods. The code is available at https://github.com/SCUT-CCNL/EHSnet . |
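The abstract describes aligning image features with text features in a shared semantic space so the image encoder inherits the text's high-level semantics. As a rough, hypothetical sketch of that general mechanism (a CLIP-style symmetric contrastive loss, not the authors' actual EHSnet implementation — function names and the temperature value are illustrative assumptions), the alignment objective over a batch of matched image-text pairs can be written in NumPy:

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # Project features onto the unit sphere so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def image_text_alignment_loss(image_feats, text_feats, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of matched image-text pairs.

    image_feats, text_feats: (N, D) arrays; row i of each is a matched pair.
    Returns the average of the image->text and text->image cross-entropies,
    which is minimized when each image is most similar to its own text.
    """
    img = l2_normalize(image_feats)
    txt = l2_normalize(text_feats)
    logits = img @ txt.T / temperature       # (N, N) similarity matrix
    labels = np.arange(len(logits))          # matched pairs lie on the diagonal

    def cross_entropy(z, y):
        z = z - z.max(axis=1, keepdims=True)  # numerical stability
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(y)), y].mean()

    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))

# Perfectly aligned pairs drive the loss toward zero; mismatched pairs do not.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))
aligned = image_text_alignment_loss(feats, feats)
shuffled = image_text_alignment_loss(feats, rng.normal(size=(8, 16)))
```

In this framing, zero-shot cross-scene classification follows for free: a test pixel's feature is compared against the text features of every class name, and the highest-similarity class wins, with no target-domain labels required.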
doi_str_mv | 10.1109/TGRS.2024.3495765 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_3133488689</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10750220</ieee_id><sourcerecordid>3133488689</sourcerecordid><originalsourceid>FETCH-LOGICAL-c919-b4eba9d40772727336fb00fee1d33014513de0bf54db939fca1e3edc31c6bf1e3</originalsourceid><addsrcrecordid>eNpNkFtLAzEQhYMoWKs_QPAh4PPWZJO95FFqbQtFwfY9ZLOTmro3k61af73p5UHmYYbDOTPDh9AtJSNKiXhYTd-Wo5jEfMS4SLI0OUMDmiR5RFLOz9GAUJFGcS7iS3Tl_YYQyhOaDVA5-ekqq22PZ3b9Hi3gCyq8hFo1vdX4Bfrv1n1g0zr81NbKNngKDThV2V_V27bBQZntOnC-A90HHc9rtQY8rpT31lh9cF2jC6MqDzenPkSr58lqPIsWr9P5-HERaUFFVHAolCg5ybI4FGOpKQgxALRk7PAvK4EUJuFlIZgwWlFgUGpGdVqYMA_R_XFt59rPLfhebtqta8JFyShjPM_TXAQXPbq0a713YGTnbK3cTlIi9yzlnqXcs5QnliFzd8xYAPjnzxISx4T9Acg7cak</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3133488689</pqid></control><display><type>article</type><title>Explicit High-Level Semantic Network for Domain Generalization in Hyperspectral Image Classification</title><source>IEEE Electronic Library (IEL)</source><creator>Wang, Xusheng ; Dong, Shoubin ; Zheng, Xiaorou ; Lu, Runuo ; Jia, Jianxin</creator><creatorcontrib>Wang, Xusheng ; Dong, Shoubin ; Zheng, Xiaorou ; Lu, Runuo ; Jia, Jianxin</creatorcontrib><description>When applied across different scenes, hyperspectral image (HSI) classification models often struggle to generalize due to the data distribution disparities and labels' scarcity, leading to domain shift (DS) problems. Recently, the high-level semantics from text has demonstrated the potential to address the DS problem, by improving the generalization capability of image encoders through aligning image-text pairs. 
However, the main challenge still lies in crafting appropriate texts that accurately represent the intricate interrelationships and the fragmented nature of land cover in HSIs and effectively extracting spectral-spatial features from HSI data. This article proposes a domain generalization (DG) method, EHSnet, to address these issues by leveraging multilayered explicit high-level semantic (EHS) information from different types of texts to provide precisely relevant semantic information for the image encoder. A multilayered EHS information paradigm is well-defined, aiming to extract the HSI's intricate interrelationships and the fragmented land-cover features, and a dual-residual encoder connected by a 2-D convolution is designed, which combines CNNs with residual structure and Vision Transformers (ViTs) with short-range cross-layer connections to explore the spectral-spatial features of HSIs. By aligning text features with image features in the semantic space, EHSnet improves the representation capability of the image encoder and is endowed with zero-shot generalization ability for cross-scene tasks. Extensive experiments conducted on three hyperspectral datasets, including Houston, Pavia, and XS datasets, validate the effectiveness and superiority of EHSnet, with the Kappa coefficient improved by 8.17%, 3.22%, and 3.62% across three datasets compared to the state-of-the-art (SOTA) methods. 
The code is available at https://github.com/SCUT-CCNL/EHSnet .</description><identifier>ISSN: 0196-2892</identifier><identifier>EISSN: 1558-0644</identifier><identifier>DOI: 10.1109/TGRS.2024.3495765</identifier><identifier>CODEN: IGRSD2</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Classification ; Coders ; Contrastive learning ; Data mining ; Data models ; Datasets ; Domain generalization (DG) ; Feature extraction ; hyperspectral image (HSI) classification ; Hyperspectral imaging ; Image classification ; Image processing ; Information processing ; Land cover ; Land surface ; multiple modality ; Representation learning ; Semantics ; Spatial data ; Texts ; Training ; Visualization ; visual–language model</subject><ispartof>IEEE transactions on geoscience and remote sensing, 2024, Vol.62, p.1-14</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c919-b4eba9d40772727336fb00fee1d33014513de0bf54db939fca1e3edc31c6bf1e3</cites><orcidid>0000-0002-4941-5907 ; 0009-0008-5429-3984 ; 0000-0003-0153-850X ; 0000-0003-4366-4547</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10750220$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,777,781,793,4010,27904,27905,27906,54739</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10750220$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Wang, Xusheng</creatorcontrib><creatorcontrib>Dong, Shoubin</creatorcontrib><creatorcontrib>Zheng, Xiaorou</creatorcontrib><creatorcontrib>Lu, Runuo</creatorcontrib><creatorcontrib>Jia, Jianxin</creatorcontrib><title>Explicit High-Level Semantic Network for Domain 
Generalization in Hyperspectral Image Classification</title><title>IEEE transactions on geoscience and remote sensing</title><addtitle>TGRS</addtitle><description>When applied across different scenes, hyperspectral image (HSI) classification models often struggle to generalize due to the data distribution disparities and labels' scarcity, leading to domain shift (DS) problems. Recently, the high-level semantics from text has demonstrated the potential to address the DS problem, by improving the generalization capability of image encoders through aligning image-text pairs. However, the main challenge still lies in crafting appropriate texts that accurately represent the intricate interrelationships and the fragmented nature of land cover in HSIs and effectively extracting spectral-spatial features from HSI data. This article proposes a domain generalization (DG) method, EHSnet, to address these issues by leveraging multilayered explicit high-level semantic (EHS) information from different types of texts to provide precisely relevant semantic information for the image encoder. A multilayered EHS information paradigm is well-defined, aiming to extract the HSI's intricate interrelationships and the fragmented land-cover features, and a dual-residual encoder connected by a 2-D convolution is designed, which combines CNNs with residual structure and Vision Transformers (ViTs) with short-range cross-layer connections to explore the spectral-spatial features of HSIs. By aligning text features with image features in the semantic space, EHSnet improves the representation capability of the image encoder and is endowed with zero-shot generalization ability for cross-scene tasks. Extensive experiments conducted on three hyperspectral datasets, including Houston, Pavia, and XS datasets, validate the effectiveness and superiority of EHSnet, with the Kappa coefficient improved by 8.17%, 3.22%, and 3.62% across three datasets compared to the state-of-the-art (SOTA) methods. 
The code is available at https://github.com/SCUT-CCNL/EHSnet .</description><subject>Classification</subject><subject>Coders</subject><subject>Contrastive learning</subject><subject>Data mining</subject><subject>Data models</subject><subject>Datasets</subject><subject>Domain generalization (DG)</subject><subject>Feature extraction</subject><subject>hyperspectral image (HSI) classification</subject><subject>Hyperspectral imaging</subject><subject>Image classification</subject><subject>Image processing</subject><subject>Information processing</subject><subject>Land cover</subject><subject>Land surface</subject><subject>multiple modality</subject><subject>Representation learning</subject><subject>Semantics</subject><subject>Spatial data</subject><subject>Texts</subject><subject>Training</subject><subject>Visualization</subject><subject>visual–language model</subject><issn>0196-2892</issn><issn>1558-0644</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkFtLAzEQhYMoWKs_QPAh4PPWZJO95FFqbQtFwfY9ZLOTmro3k61af73p5UHmYYbDOTPDh9AtJSNKiXhYTd-Wo5jEfMS4SLI0OUMDmiR5RFLOz9GAUJFGcS7iS3Tl_YYQyhOaDVA5-ekqq22PZ3b9Hi3gCyq8hFo1vdX4Bfrv1n1g0zr81NbKNngKDThV2V_V27bBQZntOnC-A90HHc9rtQY8rpT31lh9cF2jC6MqDzenPkSr58lqPIsWr9P5-HERaUFFVHAolCg5ybI4FGOpKQgxALRk7PAvK4EUJuFlIZgwWlFgUGpGdVqYMA_R_XFt59rPLfhebtqta8JFyShjPM_TXAQXPbq0a713YGTnbK3cTlIi9yzlnqXcs5QnliFzd8xYAPjnzxISx4T9Acg7cak</recordid><startdate>2024</startdate><enddate>2024</enddate><creator>Wang, Xusheng</creator><creator>Dong, Shoubin</creator><creator>Zheng, Xiaorou</creator><creator>Lu, Runuo</creator><creator>Jia, Jianxin</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7UA</scope><scope>8FD</scope><scope>C1K</scope><scope>F1W</scope><scope>FR3</scope><scope>H8D</scope><scope>H96</scope><scope>KR7</scope><scope>L.G</scope><scope>L7M</scope><orcidid>https://orcid.org/0000-0002-4941-5907</orcidid><orcidid>https://orcid.org/0009-0008-5429-3984</orcidid><orcidid>https://orcid.org/0000-0003-0153-850X</orcidid><orcidid>https://orcid.org/0000-0003-4366-4547</orcidid></search><sort><creationdate>2024</creationdate><title>Explicit High-Level Semantic Network for Domain Generalization in Hyperspectral Image Classification</title><author>Wang, Xusheng ; Dong, Shoubin ; Zheng, Xiaorou ; Lu, Runuo ; Jia, Jianxin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c919-b4eba9d40772727336fb00fee1d33014513de0bf54db939fca1e3edc31c6bf1e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Classification</topic><topic>Coders</topic><topic>Contrastive learning</topic><topic>Data mining</topic><topic>Data models</topic><topic>Datasets</topic><topic>Domain generalization (DG)</topic><topic>Feature extraction</topic><topic>hyperspectral image (HSI) classification</topic><topic>Hyperspectral imaging</topic><topic>Image classification</topic><topic>Image processing</topic><topic>Information processing</topic><topic>Land cover</topic><topic>Land surface</topic><topic>multiple modality</topic><topic>Representation learning</topic><topic>Semantics</topic><topic>Spatial data</topic><topic>Texts</topic><topic>Training</topic><topic>Visualization</topic><topic>visual–language model</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Xusheng</creatorcontrib><creatorcontrib>Dong, Shoubin</creatorcontrib><creatorcontrib>Zheng, Xiaorou</creatorcontrib><creatorcontrib>Lu, 
Runuo</creatorcontrib><creatorcontrib>Jia, Jianxin</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Water Resources Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ASFA: Aquatic Sciences and Fisheries Abstracts</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Aquatic Science & Fisheries Abstracts (ASFA) 2: Ocean Technology, Policy & Non-Living Resources</collection><collection>Civil Engineering Abstracts</collection><collection>Aquatic Science & Fisheries Abstracts (ASFA) Professional</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE transactions on geoscience and remote sensing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Wang, Xusheng</au><au>Dong, Shoubin</au><au>Zheng, Xiaorou</au><au>Lu, Runuo</au><au>Jia, Jianxin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Explicit High-Level Semantic Network for Domain Generalization in Hyperspectral Image Classification</atitle><jtitle>IEEE transactions on geoscience and remote sensing</jtitle><stitle>TGRS</stitle><date>2024</date><risdate>2024</risdate><volume>62</volume><spage>1</spage><epage>14</epage><pages>1-14</pages><issn>0196-2892</issn><eissn>1558-0644</eissn><coden>IGRSD2</coden><abstract>When applied across different scenes, hyperspectral image (HSI) classification models often struggle to generalize due to the data distribution disparities and labels' scarcity, leading to domain shift (DS) problems. 
Recently, the high-level semantics from text has demonstrated the potential to address the DS problem, by improving the generalization capability of image encoders through aligning image-text pairs. However, the main challenge still lies in crafting appropriate texts that accurately represent the intricate interrelationships and the fragmented nature of land cover in HSIs and effectively extracting spectral-spatial features from HSI data. This article proposes a domain generalization (DG) method, EHSnet, to address these issues by leveraging multilayered explicit high-level semantic (EHS) information from different types of texts to provide precisely relevant semantic information for the image encoder. A multilayered EHS information paradigm is well-defined, aiming to extract the HSI's intricate interrelationships and the fragmented land-cover features, and a dual-residual encoder connected by a 2-D convolution is designed, which combines CNNs with residual structure and Vision Transformers (ViTs) with short-range cross-layer connections to explore the spectral-spatial features of HSIs. By aligning text features with image features in the semantic space, EHSnet improves the representation capability of the image encoder and is endowed with zero-shot generalization ability for cross-scene tasks. Extensive experiments conducted on three hyperspectral datasets, including Houston, Pavia, and XS datasets, validate the effectiveness and superiority of EHSnet, with the Kappa coefficient improved by 8.17%, 3.22%, and 3.62% across three datasets compared to the state-of-the-art (SOTA) methods. 
The code is available at https://github.com/SCUT-CCNL/EHSnet .</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TGRS.2024.3495765</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0002-4941-5907</orcidid><orcidid>https://orcid.org/0009-0008-5429-3984</orcidid><orcidid>https://orcid.org/0000-0003-0153-850X</orcidid><orcidid>https://orcid.org/0000-0003-4366-4547</orcidid></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0196-2892 |
ispartof | IEEE transactions on geoscience and remote sensing, 2024, Vol.62, p.1-14 |
issn | 0196-2892; 1558-0644 |
language | eng |
recordid | cdi_proquest_journals_3133488689 |
source | IEEE Electronic Library (IEL) |
subjects | Classification; Coders; Contrastive learning; Data mining; Data models; Datasets; Domain generalization (DG); Feature extraction; hyperspectral image (HSI) classification; Hyperspectral imaging; Image classification; Image processing; Information processing; Land cover; Land surface; multiple modality; Representation learning; Semantics; Spatial data; Texts; Training; Visualization; visual–language model |
title | Explicit High-Level Semantic Network for Domain Generalization in Hyperspectral Image Classification |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T19%3A35%3A18IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Explicit%20High-Level%20Semantic%20Network%20for%20Domain%20Generalization%20in%20Hyperspectral%20Image%20Classification&rft.jtitle=IEEE%20transactions%20on%20geoscience%20and%20remote%20sensing&rft.au=Wang,%20Xusheng&rft.date=2024&rft.volume=62&rft.spage=1&rft.epage=14&rft.pages=1-14&rft.issn=0196-2892&rft.eissn=1558-0644&rft.coden=IGRSD2&rft_id=info:doi/10.1109/TGRS.2024.3495765&rft_dat=%3Cproquest_RIE%3E3133488689%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3133488689&rft_id=info:pmid/&rft_ieee_id=10750220&rfr_iscdi=true |