Optimizing YOLO Architectures for Optimal Road Damage Detection and Classification: A Comparative Study from YOLOv7 to YOLOv10
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | Maintaining roadway infrastructure is essential for ensuring a safe,
efficient, and sustainable transportation system. However, manual data
collection for detecting road damage is time-consuming, labor-intensive, and
poses safety risks. Recent advancements in artificial intelligence,
particularly deep learning, offer a promising solution for automating this
process using road images. This paper presents a comprehensive workflow for
road damage detection using deep learning models, focusing on optimizations for
inference speed while preserving detection accuracy. Specifically, to
accommodate hardware limitations, large images are cropped, and lightweight
models are utilized. Additionally, an external pothole dataset is incorporated
to enhance the detection of this underrepresented damage class. The proposed
approach employs multiple model architectures, including a custom YOLOv7 model
with Coordinate Attention layers and a Tiny YOLOv7 model, which are trained and
combined to maximize detection performance. The models are further
reparameterized to optimize inference efficiency. Experimental results
demonstrate that the ensemble of the custom YOLOv7 model with three Coordinate
Attention layers and the default Tiny YOLOv7 model achieves an F1 score of
0.7027 with an inference speed of 0.0547 seconds per image. The complete
pipeline, including data preprocessing, model training, and inference scripts,
is publicly available on the project's GitHub repository, enabling
reproducibility and facilitating further research. |
---|---|
DOI: | 10.48550/arxiv.2410.08409 |
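
The abstract says that two trained detectors are combined into an ensemble but does not state the merging rule. One common way to do this is to pool the boxes from every model and run greedy per-class non-maximum suppression over the pooled set. The sketch below illustrates that idea only; it is not taken from the paper's repository. The `Detection` tuple layout, the function names `iou` and `ensemble_nms`, the 0.5 IoU threshold, and the sample boxes are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed layout (illustrative only): (x1, y1, x2, y2, score, class_id)
Detection = Tuple[float, float, float, float, float, int]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def ensemble_nms(per_model: List[List[Detection]],
                 iou_thr: float = 0.5) -> List[Detection]:
    """Pool detections from all models, then apply greedy per-class NMS."""
    # Flatten the per-model lists and visit boxes from highest to lowest score.
    pooled = sorted((d for dets in per_model for d in dets),
                    key=lambda d: d[4], reverse=True)
    kept: List[Detection] = []
    for det in pooled:
        # Keep a box unless an already-kept box of the same class overlaps it
        # by more than the threshold.
        if all(k[5] != det[5] or iou(k, det) <= iou_thr for k in kept):
            kept.append(det)
    return kept


# Hypothetical outputs of two detectors on the same image crop.
model_a = [(10, 10, 50, 50, 0.90, 0), (100, 100, 150, 150, 0.60, 1)]
model_b = [(12, 12, 52, 52, 0.80, 0), (200, 200, 240, 240, 0.70, 0)]

merged = ensemble_nms([model_a, model_b])
print(len(merged))  # 3: the near-duplicate 0.80 box is suppressed
```

In this example the two class-0 boxes near the top-left corner overlap with an IoU of about 0.82, so only the higher-scoring one survives, while the two non-overlapping boxes pass through unchanged. Pooling before suppression lets a lightweight model contribute detections the larger model missed without duplicating the ones both found.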