PointNet-Transformer Fusion Network for In-Cabin Occupancy Monitoring With mm-Wave Radar


Full Description

Bibliographic Details
Published in: IEEE Sensors Journal, 2024-02, Vol. 24 (4), pp. 5370-5382
Main Authors: Xiao, Zhiqiang; Ye, Kuntao; Cui, Guolong
Format: Article
Language: English
Description
Abstract: Due to the irregular distribution of 3-D point clouds and the random movement of passengers, it is difficult to determine passenger occupancy accurately and rapidly with multiple-input multiple-output (MIMO) radar. In this study, we present a lightweight neural network, dubbed the PointNet and transformer fusion neural network (PTFNet), which fuses PointNet and a transformer to quickly and accurately detect the occupancy of three rear seats and two footwells. PTFNet adopts an input transform block to encode the 3-D point cloud directly, which makes it more lightweight than conventional voxelization-, heatmap-, and PointNet-based algorithms. Meanwhile, a cross-attention mechanism block inspired by the transformer is employed to efficiently extract features from 3-D point clouds, improving detection accuracy. An in-cabin occupancy monitoring system (OMS) is implemented in a real vehicle to obtain 3-D point cloud datasets, which are then used to evaluate the detection performance of PTFNet. Our datasets cover both the rear seat area and the footwell area; the footwell space is considered for the first time in point cloud datasets. The experimental results demonstrate that PTFNet outperforms other popular approaches in detection accuracy by at least 1.6%, while memory consumption and running time are reduced by at least 44% and 34%, respectively.
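The abstract does not give implementation details of PTFNet's cross-attention block. As a rough illustration of the general idea (not the authors' actual architecture), the sketch below shows scaled dot-product cross-attention in which one learnable query per monitored zone (three rear seats plus two footwells, per the abstract) aggregates per-point radar features into a fixed-size zone descriptor; all names, dimensions, and the numpy implementation are assumptions for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    # scaled dot-product cross-attention: each query attends over all points
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)   # (num_queries, num_points)
    weights = softmax(scores, axis=-1)       # rows sum to 1
    return weights @ values                  # (num_queries, d)

rng = np.random.default_rng(0)
num_points, d = 128, 32
# stand-in for per-point features from a PointNet-style shared MLP
point_feats = rng.standard_normal((num_points, d))
# one hypothetical learnable query per zone: 3 rear seats + 2 footwells
zone_queries = rng.standard_normal((5, d))

zone_feats = cross_attention(zone_queries, point_feats, point_feats)
print(zone_feats.shape)  # (5, 32): one fixed-size descriptor per zone
```

Because attention pools over a variable-length point set into a fixed number of zone descriptors, this style of block sidesteps the irregular point distribution mentioned in the abstract without voxelizing the cloud.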
ISSN: 1530-437X
EISSN: 1558-1748
DOI:10.1109/JSEN.2023.3347893