What is this article about? Generative summarization with the BERT model in the geosciences domain

Bibliographic details
Published in: Earth Science Informatics 2022-03, Vol.15 (1), p.21-36
Main authors: Ma, Kai; Tian, Miao; Tan, Yongjian; Xie, Xuejing; Qiu, Qinjun
Format: Article
Language: English
Description
Abstract: In recent years, large amounts of data have accumulated in geological journals and reports. These sources contain a wealth of information but have not been fully exploited or mined. Automatic information extraction offers an effective way to make new discoveries and pursue further analysis, which is of great value in supporting the work of users, researchers and decision makers. In this paper, we fine-tune the bidirectional encoder representations from transformers (BERT) model and apply it to automatically generate the title for a given input summary, based on a collection of published literature samples. The framework contains an encoder module, a decoder module and a training module. The core stages of summary generation combine the encoder and decoder modules, and a multi-stage function then connects the modules, giving the text summarization model a multi-task learning architecture. Compared with the baseline models, our proposed model obtains the best results on the constructed dataset. Based on the proposed model, we developed an automatic geological briefing generation platform, which serves as an online platform to support the mining of key areas and the visual presentation and analysis of the literature.
ISSN: 1865-0473, 1865-0481
DOI: 10.1007/s12145-021-00695-2