Improving autonomous driving systems with CPU extensions for point cloud processing


Bibliographic details
Main author: Exenberger Becker, Pedro Henrique
Format: Dissertation
Language: English
Description
Summary: (English) Autonomous Driving Systems (ADS) are on the cusp of large-scale adoption, promising fewer accidents and substantial market potential. However, their complex software and sensor data create pressure for better hardware support in this safety-critical scenario, where high performance is mandatory to meet latency deadlines. Energy efficiency, cost, and volume must also be first-class concerns for market feasibility, which calls computer architects into action. To enrich hardware support for ADS, we carry out a performance and power characterization of Autoware.ai, a state-of-the-art ADS software stack. We find that significant time is spent processing data from Light Imaging Detection and Ranging (LiDAR) sensors, which are widely used by ADS. LiDAR captures 3D point clouds for tasks such as segmentation, localization, and object detection. Despite its importance, hardware support for LiDAR has only recently gained traction. Further, while most point cloud processing algorithms run on CPUs, recent works propose costly hardware accelerators. Instead, we aim to use existing general-purpose hardware and software for point cloud processing, with only minor CPU augmentations. To that end, we introduce a small set of CPU instructions targeting point cloud neighbor search based on k-d trees, a key operation used in various algorithms. The first technique we propose is K-D Bonsai, which reduces data movement during neighbor search by compressing k-d tree leaves at run time, exploiting value similarity. K-D Bonsai further compresses the data using a reduced floating-point representation, exploiting the physically limited range of point cloud values collected with LiDAR. We implement K-D Bonsai through a small set of new CPU instructions to compress, decompress, and operate on points. To maintain baseline accuracy, we carefully craft the instructions to detect precision loss due to compression, allowing re-computation in full precision when necessary.
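The idea of searching compressed leaves and falling back to full precision can be illustrated in software. The following is only a minimal sketch under stated assumptions, not the instructions defined in the dissertation: it assumes IEEE 754 half precision as the reduced format, a radius search over a single k-d tree leaf, and a simple error-bound test to decide when a result is uncertain. All function names (`to_half`, `compress_leaf`, `radius_search_leaf`) are hypothetical.

```python
import math
import struct


def to_half(x: float) -> float:
    """Round x to IEEE 754 half precision and return it as a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def compress_leaf(points):
    """Store every coordinate in half precision (assumed reduced format).

    Also returns the largest per-coordinate rounding error seen in the leaf,
    which the search uses to detect possibly wrong answers.
    """
    compressed = [tuple(to_half(c) for c in p) for p in points]
    max_err = max(
        (abs(c - h) for p, q in zip(points, compressed) for c, h in zip(p, q)),
        default=0.0,
    )
    return compressed, max_err


def radius_search_leaf(points, compressed, max_err, query, radius):
    """Return (indices of points within radius of query, fallback count)."""
    # A per-coordinate error of max_err moves a 3D point by at most
    # max_err * sqrt(3), so the distance to the query changes by at most that.
    slack = max_err * math.sqrt(3.0)
    hits, fallbacks = [], 0
    for i, q in enumerate(compressed):
        d = math.dist(q, query)
        if d <= radius - slack:
            hits.append(i)  # inside even in the worst case
        elif d > radius + slack:
            continue  # outside even in the worst case
        else:
            # Precision loss could flip the answer: recompute in full precision.
            fallbacks += 1
            if math.dist(points[i], query) <= radius:
                hits.append(i)
    return hits, fallbacks


if __name__ == "__main__":
    leaf = [(10.013, 2.507, -1.201), (10.021, 2.498, -1.195), (10.5, 2.9, -1.0)]
    packed, err = compress_leaf(leaf)
    print(radius_search_leaf(leaf, packed, err, (10.0, 2.5, -1.2), 0.5))
```

In this sketch most points are classified from the compressed copy alone, and only points whose distance falls within the error margin of the search radius trigger the full-precision path, which is how the result stays identical to an uncompressed search.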
Therefore, K-D Bonsai reduces data movement, improving performance and energy efficiency while guaranteeing baseline accuracy and programmability. K-D Bonsai improves the end-to-end latency of the segmentation task of Autoware.ai by 9.26% on average, 12.19% in tail latency, and reduces energy consumption by 10.84%. Unlike the expensive accelerators proposed in related work, K-D Bonsai improves neighbor search with minimal area increase (0.36%). In the second technique, we found that consecutive neighbor search queries are often si