Progressive Downsampling Transformer with Convolution-based Decoder and Its Application in Gear Pitting Measurement

Bibliographic details
Published in: IEEE Transactions on Instrumentation and Measurement, 2023-01, Vol. 72, p. 1-1
Main authors: Qin, Yi; Wang, Sijun; Xi, Dejun; Liang, Chen
Format: Article
Language: English
Description
Abstract: In Transformers for semantic segmentation, patch embedding usually consists of a single convolutional layer with a large stride, which reduces feature extraction capability. Additionally, a complex decoder incurs high computational cost. To address these two issues, we put forward a progressive downsampling Transformer with a convolution-based decoder (PDCDT), a simple, efficient, yet powerful framework. Specifically, progressive downsampling layers for patch embedding are designed to refine the extracted features and reduce information loss at each stage of the hierarchical Transformer encoder. Meanwhile, a simple decoder based on a convolution (conv) module is proposed to aggregate the feature information from the multiscale output layers of the encoder; it realizes dimensional transformation and information interaction with fewer parameters than the decoders used in existing Transformers. Extensive experiments show that PDCDT achieves competitive results on ADE20K (47.9% mIoU) and Cityscapes (82.6% mIoU). Finally, PDCDT is applied to gear pitting measurement in a gear contact fatigue test, and the comparative results indicate that PDCDT can improve the accuracy of pitting detection.
ISSN: 0018-9456, 1557-9662
DOI: 10.1109/TIM.2023.3250305
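
Note on the abstract: the contrast it draws between a single large-stride patch embedding and progressive downsampling can be illustrated with the standard convolution output-size formula. The sketch below is only an illustration. The kernel sizes, strides, paddings, and the 512-pixel input are assumptions chosen for the example and are not taken from the article, whose actual layer configuration is not given in this record.

```python
def conv_out(size, kernel, stride, padding):
    """Output length along one spatial axis of a convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def embed_out(size, layers):
    """Apply a stack of (kernel, stride, padding) conv layers to a spatial size."""
    for kernel, stride, padding in layers:
        size = conv_out(size, kernel, stride, padding)
    return size


# Hypothetical configurations, chosen only for illustration:
# one conv layer that downsamples by 4 in a single step ...
SINGLE = [(7, 4, 3)]
# ... versus two stride-2 conv layers that reach the same resolution gradually.
PROGRESSIVE = [(3, 2, 1), (3, 2, 1)]

single_size = embed_out(512, SINGLE)
progressive_size = embed_out(512, PROGRESSIVE)
print(single_size, progressive_size)  # both 128 for a 512-pixel input
```

Under these assumed settings both variants produce the same output resolution; the progressive variant simply reaches it through an intermediate 256-pixel feature map, which is the kind of gradual reduction the abstract associates with lower information loss.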