Dumodds: Dual modeling approach for drowsiness detection based on spatial and spatio-temporal features



Bibliographic Details
Published in: Engineering Applications of Artificial Intelligence, 2023-03, Vol. 119, p. 105759, Article 105759
Main authors: Pandey, Nageshwar Nath; Muppalaneni, Naresh Babu
Format: Article
Language: English
Online access: Full text
Description
Abstract: Road accidents have been a significant problem in recent years, and statistics show that driver drowsiness is a primary cause; many valuable lives have been lost as a result. A reliable detection system is therefore required. For this analysis, we chose a large, realistic drowsiness video dataset created by the University of Texas and selected only the extreme classes of videos, alert and drowsy. We then created two distinct models: Model-A for temporal features and Model-B for spatio-temporal features. In the first model, a computer-vision technique, YOLOv3, retrieves temporal characteristics, which are then processed using long short-term memory (LSTM). Here, we addressed the occlusion issue by imposing a condition on each frame; since discarding occluded frames during this procedure leads to overfitting, the problem is handled with TransGAN's augmentation approach. The second model, on the other hand, extracts spatial information using a convolutional neural network (CNN), InceptionV3, whose output is subsequently processed using LSTM. Even though Model-A is more complicated and has lower accuracy (86%) than Model-B (97.5%), the investigation reveals that Model-A is much superior to Model-B with regard to training time. These differences are highlighted through AUC-ROC scores and confusion matrices.
ISSN:0952-1976
1873-6769
DOI:10.1016/j.engappai.2022.105759
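
The overall pattern described in the abstract — per-frame features (e.g. from a CNN such as InceptionV3, or detections from YOLOv3) rolled through an LSTM whose final state yields an alert/drowsy score — can be sketched minimally. This is an illustrative toy in plain NumPy, not the authors' implementation; all sizes, weights, and names (`LSTMCell`, `classify_sequence`) are made up for the example, and the weights are random rather than trained.

```python
import numpy as np

# Illustrative sketch only (not the paper's code): a sequence of
# per-frame feature vectors is processed by a single hand-rolled LSTM
# cell, and the final hidden state is mapped to a binary score.
rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    def __init__(self, n_in, n_hidden):
        # one stacked weight matrix covering input, forget, cell, output gates
        self.W = rng.standard_normal((4 * n_hidden, n_in + n_hidden)) * 0.1
        self.b = np.zeros(4 * n_hidden)
        self.n_hidden = n_hidden

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)          # the four gate pre-activations
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        return h, c

def classify_sequence(frame_feats, cell, w_out, b_out):
    """Roll the LSTM over per-frame feature vectors; return P(drowsy)."""
    h = np.zeros(cell.n_hidden)
    c = np.zeros(cell.n_hidden)
    for x in frame_feats:
        h, c = cell.step(x, h, c)
    return sigmoid(w_out @ h + b_out)

# Toy sizes: real CNN features (e.g. InceptionV3 pooled output) are larger.
n_feat, n_hidden, n_frames = 64, 32, 30
cell = LSTMCell(n_feat, n_hidden)
w_out = rng.standard_normal(n_hidden) * 0.1
clip = rng.standard_normal((n_frames, n_feat))  # stand-in for extracted features
p_drowsy = classify_sequence(clip, cell, w_out, 0.0)
print(f"P(drowsy) = {float(p_drowsy):.3f}")
```

In practice the feature extractor and LSTM would be trained end-to-end in a deep-learning framework; the sketch only shows the data flow (frame features in, sequence state carried forward, one probability out) that both Model-A and Model-B share.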