PTM: Torus Masking for 3D Representation Learning Guided by Robust and Trusted Teachers


Bibliographic details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, 2024-12, Vol. 34 (12), p. 12158-12170
Main authors: Cheng, Haozhe; Zhu, Jihua; Hu, Naiwen; Chen, Jinqian; Yan, Wenbiao
Format: Article
Language: English
Description
Summary: 3D Masked Point Modeling (MPM) typically discards points or patches, either randomly or in blocks, and then reconstructs them, offering a promising avenue for exploring geometric representation. By surveying current masking strategies, we have found that randomly masked regions are provided with excessive context, which reduces modeling difficulty but impedes knowledge transfer. Block-masked regions, by contrast, lack sufficient guidance, resulting in significant generated noise. To address these issues, we propose PTM, a novel Transformer-style 3D MPM method employing a torus masking strategy. Specifically, a high-density area is chosen as the masked region, and a torus is formed by retaining a small-radius neighborhood around the center point. To mitigate torus modeling noise, the designed robust teacher model captures density scale to construct a noise embedding, utilizing a reverse fit function to assist reconstruction. Furthermore, the proposed trusted teacher model defines the multi-modal global descriptor as subjective evidence. On a semantic level, we form semi-subjective trusted evidence to guide reconstruction by evaluating the contribution of each piece of subjective evidence to the 3D representation. Downstream fine-tuning tasks validate the state-of-the-art performance of PTM in multi-scale point cloud classification and segmentation.
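
The torus masking step described in the summary can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `torus_mask`, the k-nearest-neighbor density proxy, and the two radii are assumptions chosen for illustration, and the robust and trusted teacher models are not covered.

```python
import numpy as np


def torus_mask(points, k_density=16, outer_radius=0.4, inner_radius=0.1):
    """Hypothetical torus-style mask over an (N, 3) point array.

    Returns (mask, center), where mask[i] is True if point i is masked.
    All parameter names and defaults are illustrative assumptions.
    """
    # Pairwise Euclidean distances, shape (N, N). Requires N > k_density.
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)

    # Density proxy (assumption): distance to the k-th nearest neighbor.
    # Index 0 of each sorted row is the point itself, so index k is the
    # k-th neighbor. A smaller distance means a denser neighborhood.
    kth = np.partition(dists, k_density, axis=1)[:, k_density]
    center = int(np.argmin(kth))

    # Mask the shell around the densest point: inside the outer ball,
    # but outside the small-radius neighborhood that is retained.
    d = dists[center]
    mask = (d <= outer_radius) & (d > inner_radius)
    return mask, center
```

Calling `torus_mask(points)` on an `(N, 3)` array yields a boolean mask in which the center point and its immediate neighborhood stay visible while the surrounding shell is hidden, which is the property the summary attributes to the torus strategy.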
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2024.3430904