Masked Auto-Encoding Spectral-Spatial Transformer for Hyperspectral Image Classification

Deep learning has certainly become the dominant trend in hyperspectral (HS) remote sensing (RS) image classification owing to its excellent capabilities to extract highly discriminating spectral-spatial features. In this context, transformer networks have recently shown prominent results in distingu...

Bibliographic details
Published in:	IEEE transactions on geoscience and remote sensing 2022, Vol.60, p.1-14
Main authors:	Ibanez, Damian; Fernandez-Beltran, Ruben; Pla, Filiberto; Yokoya, Naoto
Format:	Article
Language:	eng
Subjects:	Atmospheric effects; Atmospheric models; Classification; Complexity theory; Computer architecture; Deep learning; Feature extraction; Hyperspectral (HS) imaging; Hyperspectral imaging; Image classification; Image reconstruction; Machine learning; mask auto-encoders (MAEs); Remote sensing; Thermal noise; Transformers; Vision Transformers (ViTs)

container_end_page	14
container_issue
container_start_page	1
container_title	IEEE transactions on geoscience and remote sensing
container_volume	60
creator	Ibanez, Damian; Fernandez-Beltran, Ruben; Pla, Filiberto; Yokoya, Naoto
description	Deep learning has become the dominant trend in hyperspectral (HS) remote sensing (RS) image classification owing to its excellent capability to extract highly discriminative spectral-spatial features. In this context, transformer networks have recently shown prominent results in distinguishing even the most subtle spectral differences because of their potential to characterize sequential spectral data. Nonetheless, many complexities affecting HS remote sensing data (e.g., atmospheric effects, thermal noise, quantization noise) may severely undermine such potential, since no mechanism for mitigating noisy feature patterns has yet been developed within transformer networks. To address the problem, this article presents a novel masked auto-encoding spectral-spatial transformer (MAEST), which combines two collaborative branches: 1) a reconstruction path, which dynamically uncovers the most robust encoding features based on a masking auto-encoding strategy, and 2) a classification path, which embeds these features into a transformer network to classify the data, focusing on the features that better reconstruct the input. Unlike existing models, this design aims to learn refined transformer features that account for the aforementioned complexities of the HS remote sensing image domain. The experimental comparison, including several state-of-the-art methods and benchmark datasets, shows the superior results obtained by MAEST. The code for this article will be available at https://github.com/ibanezfd/MAEST .
doi_str_mv	10.1109/TGRS.2022.3217892
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TGRS_2022_3217892</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9931741</ieee_id><sourcerecordid>2736886490</sourcerecordid><originalsourceid>FETCH-LOGICAL-c402t-1c200c16ccf7481ed62c9c3e019b7ee417ed88f00b036c8d5d8104610558816a3</originalsourceid><addsrcrecordid>eNo9kE1rwkAQhpfSQq3tDyi9BHqOndlsNpujiFXBUqgWelvWzURiY5LuxoP_3hWlp4HheefjYewZYYQI-dt69rUaceB8lHDMVM5v2ADTVMUghbhlA8Bcxjz079mD9zsAFClmA_bzYfwvFdH40LfxtLFtUTXbaNWR7Z2p41Vn-srU0dqZxpet25OLQonmx46cv1LRYm-2FE1q431VVjZE2uaR3ZWm9vR0rUP2_T5dT-bx8nO2mIyXsRXA-xgtB7AorS0zoZAKyW1uEwr3bjIigRkVSpUAG0ikVUVaKAQhEcJvCqVJhuz1Mrdz7d-BfK937cE1YaXmWSKVkiKHQOGFsq713lGpO1ftjTtqBH0WqM8C9VmgvgoMmZdLpiKifz7PE8wEJic6n2xP</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2736886490</pqid></control><display><type>article</type><title>Masked Auto-Encoding Spectral-Spatial Transformer for Hyperspectral Image Classification</title><source>IEEE Electronic Library (IEL)</source><creator>Ibanez, Damian ; Fernandez-Beltran, Ruben ; Pla, Filiberto ; Yokoya, Naoto</creator><creatorcontrib>Ibanez, Damian ; Fernandez-Beltran, Ruben ; Pla, Filiberto ; Yokoya, Naoto</creatorcontrib><description>Deep learning has certainly become the dominant trend in hyperspectral (HS) remote sensing (RS) image classification owing to its excellent capabilities to extract highly discriminating spectral-spatial features. In this context, transformer networks have recently shown prominent results in distinguishing even the most subtle spectral differences because of their potential to characterize sequential spectral data. 
Nonetheless, many complexities affecting HS remote sensing data (e.g., atmospheric effects, thermal noise, quantization noise) may severely undermine such potential since no mode of relieving noisy feature patterns has still been developed within transformer networks. To address the problem, this article presents a novel masked auto-encoding spectral-spatial transformer (MAEST), which gathers two different collaborative branches: 1) a reconstruction path, which dynamically uncovers the most robust encoding features based on a masking auto-encoding strategy, and 2) a classification path, which embeds these features onto a transformer network to classify the data focusing on the features that better reconstruct the input. Unlike other existing models, this novel design pursues to learn refined transformer features considering the aforementioned complexities of the HS remote sensing image domain. The experimental comparison, including several state-of-the-art methods and benchmark datasets, shows the superior results obtained by MAEST. The codes of this article will be available at https://github.com/ibanezfd/MAEST .</description><identifier>ISSN: 0196-2892</identifier><identifier>EISSN: 1558-0644</identifier><identifier>DOI: 10.1109/TGRS.2022.3217892</identifier><identifier>CODEN: IGRSD2</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Atmospheric effects ; Atmospheric models ; Classification ; Complexity theory ; Computer architecture ; Deep learning ; Feature extraction ; Hyperspectral (HS) imaging ; Hyperspectral imaging ; Image classification ; Image reconstruction ; Machine learning ; mask auto-encoders (MAEs) ; Remote sensing ; Thermal noise ; Transformers ; Vision Transformers (ViTs)</subject><ispartof>IEEE transactions on geoscience and remote sensing, 2022, Vol.60, p.1-14</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c402t-1c200c16ccf7481ed62c9c3e019b7ee417ed88f00b036c8d5d8104610558816a3</citedby><cites>FETCH-LOGICAL-c402t-1c200c16ccf7481ed62c9c3e019b7ee417ed88f00b036c8d5d8104610558816a3</cites><orcidid>0000-0003-0054-3489 ; 0000-0002-7321-4590 ; 0000-0003-1374-8416 ; 0000-0002-3252-1252</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9931741$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,777,781,793,4010,27904,27905,27906,54739</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9931741$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Ibanez, Damian</creatorcontrib><creatorcontrib>Fernandez-Beltran, Ruben</creatorcontrib><creatorcontrib>Pla, Filiberto</creatorcontrib><creatorcontrib>Yokoya, Naoto</creatorcontrib><title>Masked Auto-Encoding Spectral-Spatial Transformer for Hyperspectral Image Classification</title><title>IEEE transactions on geoscience and remote sensing</title><addtitle>TGRS</addtitle><description>Deep learning has certainly become the dominant trend in hyperspectral (HS) remote sensing (RS) image classification owing to its excellent capabilities to extract highly discriminating spectral-spatial features. In this context, transformer networks have recently shown prominent results in distinguishing even the most subtle spectral differences because of their potential to characterize sequential spectral data. Nonetheless, many complexities affecting HS remote sensing data (e.g., atmospheric effects, thermal noise, quantization noise) may severely undermine such potential since no mode of relieving noisy feature patterns has still been developed within transformer networks. 
To address the problem, this article presents a novel masked auto-encoding spectral-spatial transformer (MAEST), which gathers two different collaborative branches: 1) a reconstruction path, which dynamically uncovers the most robust encoding features based on a masking auto-encoding strategy, and 2) a classification path, which embeds these features onto a transformer network to classify the data focusing on the features that better reconstruct the input. Unlike other existing models, this novel design pursues to learn refined transformer features considering the aforementioned complexities of the HS remote sensing image domain. The experimental comparison, including several state-of-the-art methods and benchmark datasets, shows the superior results obtained by MAEST. The codes of this article will be available at https://github.com/ibanezfd/MAEST .</description><subject>Atmospheric effects</subject><subject>Atmospheric models</subject><subject>Classification</subject><subject>Complexity theory</subject><subject>Computer architecture</subject><subject>Deep learning</subject><subject>Feature extraction</subject><subject>Hyperspectral (HS) imaging</subject><subject>Hyperspectral imaging</subject><subject>Image classification</subject><subject>Image reconstruction</subject><subject>Machine learning</subject><subject>mask auto-encoders (MAEs)</subject><subject>Remote sensing</subject><subject>Thermal noise</subject><subject>Transformers</subject><subject>Vision Transformers 
(ViTs)</subject><issn>0196-2892</issn><issn>1558-0644</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kE1rwkAQhpfSQq3tDyi9BHqOndlsNpujiFXBUqgWelvWzURiY5LuxoP_3hWlp4HheefjYewZYYQI-dt69rUaceB8lHDMVM5v2ADTVMUghbhlA8Bcxjz079mD9zsAFClmA_bzYfwvFdH40LfxtLFtUTXbaNWR7Z2p41Vn-srU0dqZxpet25OLQonmx46cv1LRYm-2FE1q431VVjZE2uaR3ZWm9vR0rUP2_T5dT-bx8nO2mIyXsRXA-xgtB7AorS0zoZAKyW1uEwr3bjIigRkVSpUAG0ikVUVaKAQhEcJvCqVJhuz1Mrdz7d-BfK937cE1YaXmWSKVkiKHQOGFsq713lGpO1ftjTtqBH0WqM8C9VmgvgoMmZdLpiKifz7PE8wEJic6n2xP</recordid><startdate>2022</startdate><enddate>2022</enddate><creator>Ibanez, Damian</creator><creator>Fernandez-Beltran, Ruben</creator><creator>Pla, Filiberto</creator><creator>Yokoya, Naoto</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7UA</scope><scope>8FD</scope><scope>C1K</scope><scope>F1W</scope><scope>FR3</scope><scope>H8D</scope><scope>H96</scope><scope>KR7</scope><scope>L.G</scope><scope>L7M</scope><orcidid>https://orcid.org/0000-0003-0054-3489</orcidid><orcidid>https://orcid.org/0000-0002-7321-4590</orcidid><orcidid>https://orcid.org/0000-0003-1374-8416</orcidid><orcidid>https://orcid.org/0000-0002-3252-1252</orcidid></search><sort><creationdate>2022</creationdate><title>Masked Auto-Encoding Spectral-Spatial Transformer for Hyperspectral Image Classification</title><author>Ibanez, Damian ; Fernandez-Beltran, Ruben ; Pla, Filiberto ; Yokoya, Naoto</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c402t-1c200c16ccf7481ed62c9c3e019b7ee417ed88f00b036c8d5d8104610558816a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Atmospheric effects</topic><topic>Atmospheric 
models</topic><topic>Classification</topic><topic>Complexity theory</topic><topic>Computer architecture</topic><topic>Deep learning</topic><topic>Feature extraction</topic><topic>Hyperspectral (HS) imaging</topic><topic>Hyperspectral imaging</topic><topic>Image classification</topic><topic>Image reconstruction</topic><topic>Machine learning</topic><topic>mask auto-encoders (MAEs)</topic><topic>Remote sensing</topic><topic>Thermal noise</topic><topic>Transformers</topic><topic>Vision Transformers (ViTs)</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ibanez, Damian</creatorcontrib><creatorcontrib>Fernandez-Beltran, Ruben</creatorcontrib><creatorcontrib>Pla, Filiberto</creatorcontrib><creatorcontrib>Yokoya, Naoto</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Water Resources Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ASFA: Aquatic Sciences and Fisheries Abstracts</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Aquatic Science & Fisheries Abstracts (ASFA) 2: Ocean Technology, Policy & Non-Living Resources</collection><collection>Civil Engineering Abstracts</collection><collection>Aquatic Science & Fisheries Abstracts (ASFA) Professional</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE transactions on geoscience and remote sensing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ibanez, Damian</au><au>Fernandez-Beltran, Ruben</au><au>Pla, Filiberto</au><au>Yokoya, 
Naoto</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Masked Auto-Encoding Spectral-Spatial Transformer for Hyperspectral Image Classification</atitle><jtitle>IEEE transactions on geoscience and remote sensing</jtitle><stitle>TGRS</stitle><date>2022</date><risdate>2022</risdate><volume>60</volume><spage>1</spage><epage>14</epage><pages>1-14</pages><issn>0196-2892</issn><eissn>1558-0644</eissn><coden>IGRSD2</coden><abstract>Deep learning has certainly become the dominant trend in hyperspectral (HS) remote sensing (RS) image classification owing to its excellent capabilities to extract highly discriminating spectral-spatial features. In this context, transformer networks have recently shown prominent results in distinguishing even the most subtle spectral differences because of their potential to characterize sequential spectral data. Nonetheless, many complexities affecting HS remote sensing data (e.g., atmospheric effects, thermal noise, quantization noise) may severely undermine such potential since no mode of relieving noisy feature patterns has still been developed within transformer networks. To address the problem, this article presents a novel masked auto-encoding spectral-spatial transformer (MAEST), which gathers two different collaborative branches: 1) a reconstruction path, which dynamically uncovers the most robust encoding features based on a masking auto-encoding strategy, and 2) a classification path, which embeds these features onto a transformer network to classify the data focusing on the features that better reconstruct the input. Unlike other existing models, this novel design pursues to learn refined transformer features considering the aforementioned complexities of the HS remote sensing image domain. The experimental comparison, including several state-of-the-art methods and benchmark datasets, shows the superior results obtained by MAEST. 
The codes of this article will be available at https://github.com/ibanezfd/MAEST .</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TGRS.2022.3217892</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0003-0054-3489</orcidid><orcidid>https://orcid.org/0000-0002-7321-4590</orcidid><orcidid>https://orcid.org/0000-0003-1374-8416</orcidid><orcidid>https://orcid.org/0000-0002-3252-1252</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 0196-2892
ispartof	IEEE transactions on geoscience and remote sensing, 2022, Vol.60, p.1-14
issn	0196-2892 1558-0644
language	eng
recordid	cdi_crossref_primary_10_1109_TGRS_2022_3217892
source	IEEE Electronic Library (IEL)
subjects	Atmospheric effects; Atmospheric models; Classification; Complexity theory; Computer architecture; Deep learning; Feature extraction; Hyperspectral (HS) imaging; Hyperspectral imaging; Image classification; Image reconstruction; Machine learning; mask auto-encoders (MAEs); Remote sensing; Thermal noise; Transformers; Vision Transformers (ViTs)
title	Masked Auto-Encoding Spectral-Spatial Transformer for Hyperspectral Image Classification
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T12%3A36%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Masked%20Auto-Encoding%20Spectral-Spatial%20Transformer%20for%20Hyperspectral%20Image%20Classification&rft.jtitle=IEEE%20transactions%20on%20geoscience%20and%20remote%20sensing&rft.au=Ibanez,%20Damian&rft.date=2022&rft.volume=60&rft.spage=1&rft.epage=14&rft.pages=1-14&rft.issn=0196-2892&rft.eissn=1558-0644&rft.coden=IGRSD2&rft_id=info:doi/10.1109/TGRS.2022.3217892&rft_dat=%3Cproquest_RIE%3E2736886490%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2736886490&rft_id=info:pmid/&rft_ieee_id=9931741&rfr_iscdi=true