A novel multi‐model 3D object detection framework with adaptive voxel‐image feature fusion
Published in: | IET Computer Vision 2024-08, Vol.18 (5), p.640-651 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | The multifaceted nature of sensor data has long been a hurdle for those seeking to harness its full potential in the field of 3D object detection. Although the utilisation of point clouds as input has yielded exceptional results, the challenge of effectively combining the complementary properties of multi‐sensor data looms large. This work presents a new approach to multi‐model 3D object detection, called adaptive voxel‐image feature fusion (AVIFF). Adaptive voxel‐image feature fusion is an end‐to‐end single‐shot framework that can dynamically and adaptively fuse point cloud and image features, resulting in a more comprehensive and integrated analysis of the camera sensor and the LiDAR sensor data. With the aid of the adaptive feature fusion module, spatialised image features can be adroitly fused with voxel‐based point cloud features, while the dense fusion module ensures the preservation of the distinctive characteristics of 3D point cloud data through the use of a heterogeneous architecture. Notably, the authors’ framework features a novel generalised intersection over union loss function that enhances the perceptibility of object localisation and rotation in 3D space. Comprehensive experimentation has validated the efficacy of the authors’ proposed modules, firmly establishing AVIFF as a novel framework in the field of 3D object detection.
A voxel‐based single‐shot multi‐model network for 3D object detection is introduced, namely AVIFF. The authors made some new attempts in fusing features of point cloud and image by designing the adaptive feature fusion (AFF) module and dense fusion (DF) module. Besides, the authors introduced GIoU loss into 3D space to increase the localisation and rotation perception capabilities of the authors’ framework. |
---|---|
ISSN: | 1751-9632, 1751-9640 |
DOI: | 10.1049/cvi2.12269 |
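
The summary mentions a generalised intersection over union (GIoU) loss in 3D space, but this record does not give its formulation. As a rough illustration only, the sketch below computes the standard GIoU for axis-aligned 3D boxes and the loss 1 - GIoU. It is an assumption-laden simplification: rotation, which the authors' loss is said to account for, is ignored, and the function names and the (x1, y1, z1, x2, y2, z2) box layout are hypothetical rather than taken from the paper.

```python
def giou_3d_axis_aligned(box_a, box_b):
    """GIoU of two axis-aligned 3D boxes.

    Each box is (x1, y1, z1, x2, y2, z2) with min corner first.
    Hypothetical helper: rotation is NOT handled here.
    """
    def volume(b):
        return (max(b[3] - b[0], 0.0)
                * max(b[4] - b[1], 0.0)
                * max(b[5] - b[2], 0.0))

    inter = 1.0  # volume of the overlap region
    hull = 1.0   # volume of the smallest axis-aligned box enclosing both
    for i in range(3):
        lo = max(box_a[i], box_b[i])
        hi = min(box_a[i + 3], box_b[i + 3])
        inter *= max(hi - lo, 0.0)
        hull *= max(box_a[i + 3], box_b[i + 3]) - min(box_a[i], box_b[i])

    union = volume(box_a) + volume(box_b) - inter
    if union <= 0.0 or hull <= 0.0:
        raise ValueError("boxes must have positive volume")

    iou = inter / union
    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    return iou - (hull - union) / hull


def giou_loss_3d(box_a, box_b):
    """Loss form commonly used with GIoU: 0 for identical boxes, up to 2."""
    return 1.0 - giou_3d_axis_aligned(box_a, box_b)
```

Unlike plain IoU, this quantity stays informative when the boxes do not overlap: it becomes negative and moves further from zero as the boxes drift apart, which is why GIoU is often preferred as a regression loss.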