Detection and classification of vehicles using audio visual cues

Bibliographic details
Published in:	Multimedia tools and applications 2023-11, Vol.82 (28), p.44087-44106
Main authors:	S., Anuja Prasad; Mary, Leena; Koshy, Bino I.
Format:	Article
Language:	English
Subjects:	Algorithms; Artificial neural networks; Automatic vehicle identification systems; Cameras; Computer Communication Networks; Computer Science; Data Structures and Information Theory; Frames (data processing); Image classification; Motor vehicles; Multimedia Information Systems; Special Purpose and Application-Based Systems; System effectiveness; Traffic information; Vehicles
Online access:	Full text

Description
Abstract:	This paper presents a software-based vehicle detection and classification system that classifies traffic into four classes: two-wheeler, three-wheeler, car, and heavy motor vehicle. It uses traffic video collected by a camera mounted on a vehicle parked by the side of a two-lane undivided road. Video frames containing vehicles are identified in two ways: by automatically detecting peaks in the Short-Time Energy (STE) of the corresponding audio signal, and by adaptive background subtraction of the video frames followed by blob subtraction and morphological operations. This can yield multiple images of the same vehicle; such duplicates are eliminated using a Speeded-Up Robust Features (SURF) matching algorithm. The resulting images are classified in three different ways. In System 1, SVM classifiers trained on explicit features, namely Histogram of Oriented Gradients (HOG), Local Binary Pattern (LBP), and KAZE, are used and their performance is compared. In System 2, the task is performed using a deep neural network, namely the Single Shot MultiBox Detector (SSD). The accuracy of the SSD system deteriorates when it is tested on video collected by another camera in a different environment. This issue is addressed in System 3 by retraining the SSD on the new set of images, without the use of manually labeled images. The effectiveness of all the proposed systems is validated using the collected heterogeneous traffic data.
ISSN:	1380-7501 1573-7721
DOI:	10.1007/s11042-023-14868-2
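
The audio cue named in the abstract, peak detection on Short-Time Energy (STE), can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the frame length, hop size, threshold rule, or peak spacing, so `FRAME_LEN`, `HOP`, the half-of-maximum threshold, and `min_gap` below are all assumptions, and the input is a synthetic tone rather than real traffic audio.

```python
import math

SAMPLE_RATE = 8000   # Hz (assumed)
FRAME_LEN = 800      # 100 ms analysis frame (assumed)
HOP = 400            # 50 ms hop between frames (assumed)


def short_time_energy(signal, frame_len, hop):
    """Return the energy (sum of squared samples) of each analysis frame."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def detect_peaks(energies, threshold, min_gap):
    """Return indices of local maxima above threshold, at least min_gap frames apart."""
    peaks = []
    for i in range(1, len(energies) - 1):
        is_local_max = energies[i - 1] < energies[i] >= energies[i + 1]
        if not is_local_max or energies[i] < threshold:
            continue
        if peaks and i - peaks[-1] < min_gap:
            # Too close to the previous peak: keep only the stronger one.
            if energies[i] > energies[peaks[-1]]:
                peaks[-1] = i
            continue
        peaks.append(i)
    return peaks


def envelope(t):
    """Faint background level plus two louder bursts centred at 1.0 s and 3.0 s."""
    return (0.05
            + math.exp(-((t - 1.0) / 0.15) ** 2)
            + math.exp(-((t - 3.0) / 0.15) ** 2))


# Synthetic 4-second signal: a 200 Hz tone whose loudness follows the envelope.
# The two bursts stand in for passing vehicles.
signal = [envelope(n / SAMPLE_RATE) * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
          for n in range(4 * SAMPLE_RATE)]

energies = short_time_energy(signal, FRAME_LEN, HOP)
# Assumed threshold rule: half of the largest frame energy.
peaks = detect_peaks(energies, threshold=0.5 * max(energies), min_gap=10)
# Convert each peak frame index to the time of the frame centre, in seconds.
peak_times = [(i * HOP + FRAME_LEN / 2) / SAMPLE_RATE for i in peaks]
print(peak_times)
```

On this synthetic input the detected peak times coincide with the two burst centres. In a pipeline like the one described, such timestamps would be used to select the video frames around each peak for further processing.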