Extracting Fallen Objects on the Road From Accident Reports Using a Natural Language Processing Model-Based Approach

Bibliographic details
Published in: IEEE Access 2023, Vol. 11, p. 139521-139533
Main authors: Lee, Seung-Seok; Cha, So-Mi; Ko, Bonggyun; Park, Je Jin
Format: Article
Language: English
Online access: Full text
Description
Summary: Keyword extraction is an effective way to quickly identify key elements in text. Applied to incident report analysis, it can accelerate the identification of key factors that play a role in accidents. Our research presents an innovative process for extracting keywords from incident reports with pre-trained natural language processing models. We used fine-tuning to integrate a BiLSTM-CRF with a fully connected layer and pre-trained natural language models. Keyphrase extraction is approached as a sequence labeling task. To analyze incident reports from Korea, we employ pre-trained models customized for the Korean context, such as KoBERT and KoELECTRA. Our approach is assessed using a range of metrics, including accuracy, area under the curve (AUC), F1-score, slot error rate (SER), and simple matching coefficient (SMC). In contrast to traditional approaches, which mainly concentrate on document summarization, our research provides a distinct method tailored to identifying falling objects as the main cause of accidents. Our findings demonstrate that the ELECTRA-based model with a BiLSTM-CRF outperforms other models, achieving an accuracy of 0.943, an AUC of 0.991, and a low SER of 0.075. Its F1-score and SMC are close to those of the BERT-based model with a BiLSTM-CRF, with no significant differences observed within the 95% confidence interval. These results underscore the potential of fine-tuning pre-trained models for post-hoc traffic accident analysis. This method offers a swift preliminary step to identify the key factors before human analysis, presenting a multifaceted strategy to enhance road safety and prevent accidents.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3339774
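
The summary frames keyphrase extraction as sequence labeling and lists the simple matching coefficient (SMC) among its metrics. The following is a minimal sketch of that framing, not the authors' implementation. It assumes a B/I/O tag scheme (the record does not state which scheme the paper uses), takes the tag sequence as already predicted by some model, and uses English tokens purely for illustration. The function names `extract_keyphrases` and `simple_matching_coefficient` are hypothetical.

```python
from typing import List


def extract_keyphrases(tokens: List[str], tags: List[str]) -> List[str]:
    """Group tokens tagged B (begin) / I (inside) into phrases.

    Assumption: a B/I/O scheme. Tokens tagged O are skipped, and an
    I tag with no preceding B is ignored.
    """
    phrases: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                phrases.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


def simple_matching_coefficient(gold: List[str], pred: List[str]) -> float:
    """SMC over binary per-token indicators (keyword token or not).

    SMC = (positions where both sequences agree) / (total positions).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("sequences must be non-empty")
    gold_bits = [tag != "O" for tag in gold]
    pred_bits = [tag != "O" for tag in pred]
    matches = sum(g == p for g, p in zip(gold_bits, pred_bits))
    return matches / len(gold_bits)


# Illustrative data only, not taken from the paper's dataset.
tokens = ["A", "steel", "pipe", "fell", "from", "the", "truck"]
gold_tags = ["O", "B", "I", "O", "O", "O", "O"]
pred_tags = ["O", "B", "I", "O", "O", "O", "B"]

gold_phrases = extract_keyphrases(tokens, gold_tags)
pred_phrases = extract_keyphrases(tokens, pred_tags)
smc = simple_matching_coefficient(gold_tags, pred_tags)
```

In this toy example the predicted tags recover the gold phrase and add one spurious keyword token, so six of the seven token positions agree. In the pipeline described in the summary, the tag sequence would come from a fine-tuned model such as KoELECTRA with a BiLSTM-CRF head rather than being written by hand.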