YOLOBench: Benchmarking Efficient Object Detectors on Embedded Systems
Abstract: We present YOLOBench, a benchmark comprising 550+ YOLO-based object detection models evaluated on 4 different datasets and 4 different embedded hardware platforms (x86 CPU, ARM CPU, Nvidia GPU, NPU). We collect accuracy and latency numbers for a variety of YOLO-based one-stage detectors at different model scales by performing a fair, controlled comparison of these detectors with a fixed training environment (code and training hyperparameters). Pareto-optimality analysis of the collected data reveals that, if modern detection heads and training techniques are incorporated into the learning process, multiple architectures of the YOLO series achieve a good accuracy-latency trade-off, including older models like YOLOv3 and YOLOv4. We also evaluate training-free accuracy estimators used in neural architecture search on YOLOBench and demonstrate that, while most state-of-the-art zero-cost accuracy estimators are outperformed by a simple baseline such as MAC count, some of them can be effectively used to predict Pareto-optimal detection models. We showcase this by using a zero-cost proxy to identify a YOLO architecture competitive against a state-of-the-art YOLOv8 model on a Raspberry Pi 4 CPU. The code and data are available at https://github.com/Deeplite/deeplite-torch-zoo
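The Pareto-optimality analysis mentioned in the abstract amounts to keeping only those models for which no other model is simultaneously faster and more accurate. The sketch below is a minimal, illustrative Python version of that selection step; the model names and (latency, mAP) numbers are made-up placeholders rather than actual YOLOBench measurements, and the helper function is not part of the deeplite-torch-zoo API.

```python
# Illustrative sketch: selecting Pareto-optimal detectors from measured
# (latency, accuracy) pairs. Lower latency and higher mAP are both better.
from typing import Dict, List, Tuple


def pareto_front(models: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return models not dominated by any other model.

    A model is dominated if some other model has latency <= its latency and
    mAP >= its mAP, with at least one of the two strictly better.
    """
    optimal = []
    for name, (lat, acc) in models.items():
        dominated = any(
            o_lat <= lat and o_acc >= acc and (o_lat, o_acc) != (lat, acc)
            for o_name, (o_lat, o_acc) in models.items()
            if o_name != name
        )
        if not dominated:
            optimal.append(name)
    return optimal


# Placeholder numbers only (latency in ms, mAP in %), not real benchmark data.
measurements = {
    "yolov3-tiny": (12.0, 33.1),
    "yolov4-s": (18.5, 40.2),
    "yolov5-m": (30.0, 44.7),
    "yolov8-s": (17.0, 42.5),
}
print(pareto_front(measurements))  # e.g. ['yolov3-tiny', 'yolov5-m', 'yolov8-s']
```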
DOI: 10.48550/arxiv.2307.13901