Survey of BERT-Base Models for Scientific Text Classification: COVID-19 Case Study


Bibliographic Details
Published in: Applied Sciences 2022-03, Vol. 12 (6), p. 2891
Main authors: Khadhraoui, Mayara; Bellaaj, Hatem; Ammar, Mehdi Ben; Hamam, Habib; Jmaiel, Mohamed
Format: Article
Language: English
Online access: Full text
Description
Abstract: On 30 January 2020, the World Health Organization announced a new coronavirus, which later turned out to be very dangerous. Since that date, COVID-19 has spread to become a pandemic that has now affected practically all regions of the world. Since then, many researchers in medicine have contributed to fighting COVID-19. In this context, and given the rapid growth of scientific publications related to this global pandemic, manual text and data retrieval has become a challenging task. To address this challenge, we propose CovBERT, a pre-trained language model based on BERT that automates the literature review process. CovBERT relies on pre-training on a large corpus of scientific publications in the biomedical domain related to COVID-19 to increase its performance on the literature review task. We evaluate CovBERT on short-text classification using our scientific dataset of biomedical articles on COVID-19, entitled COV-Dat-20, and demonstrate statistically significant improvements from using BERT.
ISSN: 2076-3417
DOI: 10.3390/app12062891
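
The abstract describes adapting a BERT-base model to biomedical COVID-19 text and then using it to classify short scientific texts. The sketch below shows only the generic fine-tuning-for-classification step with the Hugging Face transformers library; the checkpoint name, label set, and example texts are illustrative assumptions, not the authors' actual CovBERT model or their COV-Dat-20 dataset.

# Minimal sketch, assuming a generic BERT-base checkpoint and a made-up
# three-class label scheme; this is not the authors' CovBERT setup.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["epidemiology", "treatment", "diagnosis"]   # hypothetical classes

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS))

# Toy labeled examples standing in for short texts (titles/abstract
# snippets) from a COVID-19 literature dataset.
texts = [
    "Transmission dynamics of SARS-CoV-2 in urban areas",
    "Remdesivir efficacy in hospitalized COVID-19 patients",
    "RT-PCR sensitivity for early SARS-CoV-2 detection",
]
labels = torch.tensor([0, 1, 2])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few toy passes over the toy batch
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=128, return_tensors="pt")
    out = model(**batch, labels=labels)  # cross-entropy loss computed internally
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Inference on a new short text
model.eval()
with torch.no_grad():
    enc = tokenizer("Vaccine-induced antibody response over six months",
                    return_tensors="pt", truncation=True)
    pred = model(**enc).logits.argmax(dim=-1).item()
print(LABELS[pred])

A domain-adapted model such as the CovBERT described in the abstract would be obtained by first continuing pre-training on a biomedical COVID-19 corpus and then plugging the resulting checkpoint into the same classification step shown above.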