Progressive Downsampling Transformer with Convolution-based Decoder and Its Application in Gear Pitting Measurement

In Transformers for semantic segmentation, the patch embedding usually has only one convolutional layer with a large stride, which reduces feature extraction capability. Additionally, the complex decoder results in high computational cost. To address these two issues, we put forward...
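
The contrast between a single large-stride patch embedding and a progressive one can be illustrated with plain shape arithmetic. The sketch below is not the authors' configuration: the kernel sizes, strides, and paddings (one 7x7 convolution with stride 4 versus two 3x3 convolutions with stride 2) are assumptions chosen only to show that both routes reach the same 4x reduction, while the progressive route passes through an intermediate resolution where features can be refined.

```python
def conv_out(size, kernel, stride, padding):
    """Output length along one spatial axis for a standard convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def single_step(size):
    # Assumed layer: one 7x7 convolution, stride 4, padding 3.
    return [conv_out(size, kernel=7, stride=4, padding=3)]


def progressive(size):
    # Assumed layers: two 3x3 convolutions, each stride 2, padding 1.
    sizes = []
    for _ in range(2):
        size = conv_out(size, kernel=3, stride=2, padding=1)
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    side = 512  # hypothetical input side length in pixels
    print("single large-stride layer:", single_step(side))
    print("progressive downsampling: ", progressive(side))
```

Both variants end at the same spatial size; the difference is that the progressive stack downsamples by a factor of 2 at a time instead of discarding resolution in one step.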

Detailed description

Bibliographic details
Published in:	IEEE transactions on instrumentation and measurement 2023-01, Vol.72, p.1-1
Main authors:	Qin, Yi ; Wang, Sijun ; Xi, Dejun ; Liang, Chen
Format:	Article
Language:	eng
Subjects:	Coders ; Convolution ; Convolutional neural networks ; Decoders ; Decoding ; Embedding ; Fatigue tests ; Feature extraction ; features refined ; Gears ; neural network ; Object segmentation ; pitting measurement ; Semantic segmentation ; Transformer ; Transformers

container_end_page	1
container_issue
container_start_page	1
container_title	IEEE transactions on instrumentation and measurement
container_volume	72
creator	Qin, Yi ; Wang, Sijun ; Xi, Dejun ; Liang, Chen
description	In Transformers for semantic segmentation, the patch embedding usually has only one convolutional layer with a large stride, which reduces feature extraction capability. Additionally, the complex decoder results in high computational cost. To address these two issues, we put forward a progressive downsampling Transformer with convolution-based decoder (PDCDT), a simple, efficient, yet powerful framework. Specifically, progressive downsampling layers for patch embedding are designed to refine the extracted features and reduce information loss at each stage of the hierarchical Transformer encoder. Meanwhile, a simple decoder based on a convolution (conv) module is proposed for aggregating the characteristic information from the multiscale output layers of the encoder; it achieves dimensional transformation and information interaction with fewer parameters than the decoders used in existing Transformers. Extensive experiments show that PDCDT achieves competitive results on ADE20K (47.9% mIoU) and Cityscapes (82.6% mIoU). Finally, PDCDT is applied to gear pitting measurement in a gear contact fatigue test, and the comparative results indicate that PDCDT can improve the accuracy of pitting detection.
doi_str_mv	10.1109/TIM.2023.3250305
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TIM_2023_3250305</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10056324</ieee_id><sourcerecordid>2785445951</sourcerecordid><originalsourceid>FETCH-LOGICAL-c292t-61eb41b1307fb0fb05cd7cfbfe425551360968a3982b078d12e784f3b0a74fc3</originalsourceid><addsrcrecordid>eNpNkE1PwkAQhjdGExG9e_Cwiefi7Fe3PRJQJIHIgXuzbadYQndxt4X4720DB5NJ5jDP-07yEPLMYMIYpG_b5XrCgYuJ4AoEqBsyYkrpKI1jfktGACyJUqnie_IQwh4AdCz1iISNdzuPIdQnpHN3tsE0x0Ntd3TrjQ2V8w16eq7bbzpz9uQOXVs7G-UmYEnnWLiyPxtb0mUb6PTYRwszELS2dIHG003dtkPdGk3oPDZo20dyV5lDwKfrHpPtx_t29hmtvhbL2XQVFTzlbRQzzCXLmQBd5dCPKkpdVHmFkiulmIghjRMj0oTnoJOScdSJrEQORsuqEGPyeqk9evfTYWizveu87T9mXCdKSpX2JWMCF6rwLgSPVXb0dWP8b8YgG8xmvdlsMJtdzfaRl0ukRsR_OKhYcCn-ACVAdkU</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2785445951</pqid></control><display><type>article</type><title>Progressive Downsampling Transformer with Convolution-based Decoder and Its Application in Gear Pitting Measurement</title><source>IEEE Electronic Library (IEL)</source><creator>Qin, Yi ; Wang, Sijun ; Xi, Dejun ; Liang, Chen</creator><creatorcontrib>Qin, Yi ; Wang, Sijun ; Xi, Dejun ; Liang, Chen</creatorcontrib><description>In Transformer for semantic segmentation, patch embedding usually has only one convolutional layer with a large stride, leading to the decrease of feature extraction capability. Additionally, the complex decoder results in high computation cost. To address above-mentioned two issues, we put forward a progressive downsampling Transformer with convolution-based decoder (PDCDT), which is a simple, efficient yet powerful framework. Specifically, progressive downsampling layers for patch embedding are designed to refine the extracted features and reduce information loss at each stage of the hierarchical Transformer encoder. 
Meanwhile, a simple decoder based on a convolution (conv) module is proposed for aggregating the characteristic information from multiscale output layers of the encoder, and it can realize dimensional transformation and information interaction with fewer parameters than the decoders used in the existing Transformers. Extensive experiments show that PDCDT achieves competitive results on ADE20K (47.9% mIoU) and Cityscapes (82.6% mIoU). Finally, PDCDT is applied to gear pitting measurement in gear contact fatigue test, and the comparative results indicate that PDCDT can improve the accuracy of pitting detection.</description><identifier>ISSN: 0018-9456</identifier><identifier>EISSN: 1557-9662</identifier><identifier>DOI: 10.1109/TIM.2023.3250305</identifier><identifier>CODEN: IEIMAO</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Coders ; Convolution ; Convolutional neural networks ; Decoders ; Decoding ; Embedding ; Fatigue tests ; Feature extraction ; features refined ; Gears ; neural network ; Object segmentation ; pitting measurement ; Semantic segmentation ; Transformer ; Transformers</subject><ispartof>IEEE transactions on instrumentation and measurement, 2023-01, Vol.72, p.1-1</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2023</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c292t-61eb41b1307fb0fb05cd7cfbfe425551360968a3982b078d12e784f3b0a74fc3</citedby><cites>FETCH-LOGICAL-c292t-61eb41b1307fb0fb05cd7cfbfe425551360968a3982b078d12e784f3b0a74fc3</cites><orcidid>0000-0001-5591-5951 ; 0000-0002-2160-4300 ; 0000-0001-7968-6938 ; 0000-0001-8057-9187</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10056324$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,777,781,793,27906,27907,54740</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10056324$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Qin, Yi</creatorcontrib><creatorcontrib>Wang, Sijun</creatorcontrib><creatorcontrib>Xi, Dejun</creatorcontrib><creatorcontrib>Liang, Chen</creatorcontrib><title>Progressive Downsampling Transformer with Convolution-based Decoder and Its Application in Gear Pitting Measurement</title><title>IEEE transactions on instrumentation and measurement</title><addtitle>TIM</addtitle><description>In Transformer for semantic segmentation, patch embedding usually has only one convolutional layer with a large stride, leading to the decrease of feature extraction capability. Additionally, the complex decoder results in high computation cost. To address above-mentioned two issues, we put forward a progressive downsampling Transformer with convolution-based decoder (PDCDT), which is a simple, efficient yet powerful framework. Specifically, progressive downsampling layers for patch embedding are designed to refine the extracted features and reduce information loss at each stage of the hierarchical Transformer encoder. 
Meanwhile, a simple decoder based on a convolution (conv) module is proposed for aggregating the characteristic information from multiscale output layers of the encoder, and it can realize dimensional transformation and information interaction with fewer parameters than the decoders used in the existing Transformers. Extensive experiments show that PDCDT achieves competitive results on ADE20K (47.9% mIoU) and Cityscapes (82.6% mIoU). Finally, PDCDT is applied to gear pitting measurement in gear contact fatigue test, and the comparative results indicate that PDCDT can improve the accuracy of pitting detection.</description><subject>Coders</subject><subject>Convolution</subject><subject>Convolutional neural networks</subject><subject>Decoders</subject><subject>Decoding</subject><subject>Embedding</subject><subject>Fatigue tests</subject><subject>Feature extraction</subject><subject>features refined</subject><subject>Gears</subject><subject>neural network</subject><subject>Object segmentation</subject><subject>pitting measurement</subject><subject>Semantic segmentation</subject><subject>Transformer</subject><subject>Transformers</subject><issn>0018-9456</issn><issn>1557-9662</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkE1PwkAQhjdGExG9e_Cwiefi7Fe3PRJQJIHIgXuzbadYQndxt4X4720DB5NJ5jDP-07yEPLMYMIYpG_b5XrCgYuJ4AoEqBsyYkrpKI1jfktGACyJUqnie_IQwh4AdCz1iISNdzuPIdQnpHN3tsE0x0Ntd3TrjQ2V8w16eq7bbzpz9uQOXVs7G-UmYEnnWLiyPxtb0mUb6PTYRwszELS2dIHG003dtkPdGk3oPDZo20dyV5lDwKfrHpPtx_t29hmtvhbL2XQVFTzlbRQzzCXLmQBd5dCPKkpdVHmFkiulmIghjRMj0oTnoJOScdSJrEQORsuqEGPyeqk9evfTYWizveu87T9mXCdKSpX2JWMCF6rwLgSPVXb0dWP8b8YgG8xmvdlsMJtdzfaRl0ukRsR_OKhYcCn-ACVAdkU</recordid><startdate>20230101</startdate><enddate>20230101</enddate><creator>Qin, Yi</creator><creator>Wang, Sijun</creator><creator>Xi, Dejun</creator><creator>Liang, Chen</creator><general>IEEE</general><general>The 
Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SP</scope><scope>7U5</scope><scope>8FD</scope><scope>L7M</scope><orcidid>https://orcid.org/0000-0001-5591-5951</orcidid><orcidid>https://orcid.org/0000-0002-2160-4300</orcidid><orcidid>https://orcid.org/0000-0001-7968-6938</orcidid><orcidid>https://orcid.org/0000-0001-8057-9187</orcidid></search><sort><creationdate>20230101</creationdate><title>Progressive Downsampling Transformer with Convolution-based Decoder and Its Application in Gear Pitting Measurement</title><author>Qin, Yi ; Wang, Sijun ; Xi, Dejun ; Liang, Chen</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c292t-61eb41b1307fb0fb05cd7cfbfe425551360968a3982b078d12e784f3b0a74fc3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Coders</topic><topic>Convolution</topic><topic>Convolutional neural networks</topic><topic>Decoders</topic><topic>Decoding</topic><topic>Embedding</topic><topic>Fatigue tests</topic><topic>Feature extraction</topic><topic>features refined</topic><topic>Gears</topic><topic>neural network</topic><topic>Object segmentation</topic><topic>pitting measurement</topic><topic>Semantic segmentation</topic><topic>Transformer</topic><topic>Transformers</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Qin, Yi</creatorcontrib><creatorcontrib>Wang, Sijun</creatorcontrib><creatorcontrib>Xi, Dejun</creatorcontrib><creatorcontrib>Liang, Chen</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Electronics & Communications 
Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>Technology Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE transactions on instrumentation and measurement</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Qin, Yi</au><au>Wang, Sijun</au><au>Xi, Dejun</au><au>Liang, Chen</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Progressive Downsampling Transformer with Convolution-based Decoder and Its Application in Gear Pitting Measurement</atitle><jtitle>IEEE transactions on instrumentation and measurement</jtitle><stitle>TIM</stitle><date>2023-01-01</date><risdate>2023</risdate><volume>72</volume><spage>1</spage><epage>1</epage><pages>1-1</pages><issn>0018-9456</issn><eissn>1557-9662</eissn><coden>IEIMAO</coden><abstract>In Transformer for semantic segmentation, patch embedding usually has only one convolutional layer with a large stride, leading to the decrease of feature extraction capability. Additionally, the complex decoder results in high computation cost. To address above-mentioned two issues, we put forward a progressive downsampling Transformer with convolution-based decoder (PDCDT), which is a simple, efficient yet powerful framework. Specifically, progressive downsampling layers for patch embedding are designed to refine the extracted features and reduce information loss at each stage of the hierarchical Transformer encoder. Meanwhile, a simple decoder based on a convolution (conv) module is proposed for aggregating the characteristic information from multiscale output layers of the encoder, and it can realize dimensional transformation and information interaction with fewer parameters than the decoders used in the existing Transformers. 
Extensive experiments show that PDCDT achieves competitive results on ADE20K (47.9% mIoU) and Cityscapes (82.6% mIoU). Finally, PDCDT is applied to gear pitting measurement in gear contact fatigue test, and the comparative results indicate that PDCDT can improve the accuracy of pitting detection.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TIM.2023.3250305</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0001-5591-5951</orcidid><orcidid>https://orcid.org/0000-0002-2160-4300</orcidid><orcidid>https://orcid.org/0000-0001-7968-6938</orcidid><orcidid>https://orcid.org/0000-0001-8057-9187</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 0018-9456
ispartof	IEEE transactions on instrumentation and measurement, 2023-01, Vol.72, p.1-1
issn	0018-9456 1557-9662
language	eng
recordid	cdi_crossref_primary_10_1109_TIM_2023_3250305
source	IEEE Electronic Library (IEL)
subjects	Coders ; Convolution ; Convolutional neural networks ; Decoders ; Decoding ; Embedding ; Fatigue tests ; Feature extraction ; features refined ; Gears ; neural network ; Object segmentation ; pitting measurement ; Semantic segmentation ; Transformer ; Transformers
title	Progressive Downsampling Transformer with Convolution-based Decoder and Its Application in Gear Pitting Measurement
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T11%3A41%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Progressive%20Downsampling%20Transformer%20with%20Convolution-based%20Decoder%20and%20Its%20Application%20in%20Gear%20Pitting%20Measurement&rft.jtitle=IEEE%20transactions%20on%20instrumentation%20and%20measurement&rft.au=Qin,%20Yi&rft.date=2023-01-01&rft.volume=72&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=0018-9456&rft.eissn=1557-9662&rft.coden=IEIMAO&rft_id=info:doi/10.1109/TIM.2023.3250305&rft_dat=%3Cproquest_RIE%3E2785445951%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2785445951&rft_id=info:pmid/&rft_ieee_id=10056324&rfr_iscdi=true