A large-scale dataset for korean document-level relation extraction from encyclopedia texts

Document-level relation extraction (RE) aims to predict the relational facts between two given entities from a document. Unlike widespread research on document-level RE in English, Korean document-level RE research is still at the very beginning due to the absence of a dataset. To accelerate the stu...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Applied intelligence (Dordrecht, Netherlands) Netherlands), 2024-09, Vol.54 (17-18), p.8681-8701
Hauptverfasser: Son, Suhyune, Lim, Jungwoo, Koo, Seonmin, Kim, Jinsung, Kim, Younghoon, Lim, Youngsik, Hyun, Dongseok, Lim, Heuiseok
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Document-level relation extraction (RE) aims to predict the relational facts between two given entities from a document. Unlike widespread research on document-level RE in English, Korean document-level RE research is still at the very beginning due to the absence of a dataset. To accelerate the studies, we present TREK ( T oward Document-Level R elation E xtraction in K orean) dataset constructed from Korean encyclopedia documents written by the domain experts. We provide detailed statistical analyses for our large-scale dataset and human evaluation results suggest the assured quality of TREK . Also, we introduce the document-level RE model that considers the named entity-type while considering the Korean language’s properties. In the experiments, we demonstrate that our proposed model outperforms the baselines and conduct qualitative analysis.
ISSN:0924-669X
1573-7497
DOI:10.1007/s10489-024-05605-9