Voxel-CIM: An Efficient Compute-in-Memory Accelerator for Voxel-based Point Cloud Neural Networks

Bibliographic details
Main authors: Lin, Xipeng; Huang, Shanshi; Jiang, Hongwu
Format: Article
Language: eng
Description
Summary: 3D point cloud perception has become fundamental to a wide range of applications. With the rapid development of neural networks, voxel-based networks in particular have attracted great attention for their excellent performance. Various accelerator designs have been proposed to improve the hardware performance of voxel-based networks, especially to speed up the map search process. However, several challenges remain: (1) massive off-chip data access volume caused by map search operations, notably at high resolution and with dense point distributions; (2) frequent data movement for data-intensive convolution operations; and (3) imbalanced workload caused by the irregular sparsity of point data. To address these challenges, we propose Voxel-CIM, an efficient Compute-in-Memory based accelerator for voxel-based neural network processing. To reduce off-chip memory access for map search, a depth-encoding-based output-major search approach is introduced to maximize data reuse, achieving a stable $O(N)$-level data access volume across various situations. Voxel-CIM also adopts the in-memory computing paradigm and uses novel weight mapping strategies to efficiently process sparse 3D convolutions and 2D convolutions. Implemented in 22 nm technology and evaluated on representative benchmarks, Voxel-CIM achieves on average 4.5 to 7.0$\times$ higher energy efficiency (10.8 TOPS/W), a 2.4 to 5.4$\times$ speedup on the detection task, and a 1.2 to 8.1$\times$ speedup on the segmentation task compared to state-of-the-art point cloud accelerators and powerful GPUs.
DOI: 10.48550/arxiv.2409.19077
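
For readers unfamiliar with the "map search" step named in the summary, the following is a minimal sketch of a generic hash-table map search for a stride-1 sparse 3D convolution. It is not the depth-encoding-based output-major method the paper proposes, whose details are not given in this record. The function name `build_kernel_maps`, its parameters, and the sample voxel coordinates are all hypothetical and chosen only for illustration.

```python
from itertools import product


def build_kernel_maps(coords, kernel_size=3):
    """Generic map search for a stride-1 sparse 3D convolution (illustrative only).

    coords: list of (x, y, z) integer tuples for the occupied voxels.
    Returns {kernel_offset: [(input_index, output_index), ...]}.
    """
    # Hash table from voxel coordinate to its position in the list.
    index_of = {c: i for i, c in enumerate(coords)}
    r = kernel_size // 2
    maps = {}
    # Visit every kernel offset, e.g. 27 offsets for a 3x3x3 kernel.
    for offset in product(range(-r, r + 1), repeat=3):
        pairs = []
        # With stride 1, output voxels share the input coordinates.
        for out_idx, (x, y, z) in enumerate(coords):
            neighbor = (x + offset[0], y + offset[1], z + offset[2])
            in_idx = index_of.get(neighbor)
            if in_idx is not None:
                pairs.append((in_idx, out_idx))
        if pairs:
            maps[offset] = pairs
    return maps


# Hypothetical input: two adjacent voxels and one isolated voxel.
voxels = [(0, 0, 0), (1, 0, 0), (5, 5, 5)]
maps = build_kernel_maps(voxels)
```

In this sketch each output voxel performs one lookup per kernel offset, so the lookup count grows with both the number of voxels and the kernel volume. This is the kind of access pattern that map search optimizations aim to reduce.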