Question answering method for infrastructure damage information retrieval from textual data using bidirectional encoder representations from transformers


Bibliographic details
Published in: Automation in construction 2022-02, Vol.134, p.104061, Article 104061
Main authors: Kim, Yohan; Bang, Seongdeok; Sohn, Jiu; Kim, Hyoungkwan
Format: Article
Language: English
Description
Summary: Manual searching for infrastructure damage information from large amounts of textual data requires considerable time and effort. A fast and accurate collection of damage information from such data is necessary for effective infrastructure planning. In this study, a question answering method was proposed to provide users with infrastructure damage information from textual data automatically. The proposed method relies on a natural language model called bidirectional encoder representations from transformers for information retrieval. From the 143 reports collected from the National Hurricane Center, 533 question-answer pairs were formulated. The proposed model was trained with 435 pairs and tested with the remainder. The model was also tested with 43 question-answer pairs created using earthquake-related textual data and achieved F1-scores of 90.5% and 83.6% for the hurricane and earthquake datasets, respectively.
• Question answering method extracts infrastructure damage information from text data.
• Proposed method based on bidirectional encoder representations from transformers.
• Training and test datasets consist of 533 question-answer pairs.
• F1-scores for hurricane and earthquake data are 90.5% and 83.6%, respectively.
ISSN: 0926-5805, 1872-7891
DOI: 10.1016/j.autcon.2021.104061
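
Note on the reported F1-scores: the summary does not state how F1 was computed. The sketch below assumes the token-overlap F1 commonly used for extractive question answering, applied to one predicted answer span and one reference answer; a dataset-level score such as the one for the 98 held-out hurricane pairs (533 minus 435) would then be an average over pairs. The function name `token_f1` and the whitespace tokenization are illustrative assumptions, not taken from the article.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer span.

    Assumption: lowercase + whitespace tokenization; the article may have
    used a different normalization.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical answer pair, not drawn from the article's datasets.
    print(token_f1("power outages in Florida", "power outages"))
```

In this hypothetical pair the prediction contains both reference tokens plus two extra ones, so recall is 1.0 and precision is 0.5.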