Korean Historical Documents Analysis with Improved Dynamic Word Embedding

Historical documents refer to records or books that provide textual information about the thoughts and consciousness of past civilisations, and therefore, they have historical significance. These documents are used as key sources for historical studies as they provide information over several histor...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Applied sciences 2020-11, Vol.10 (21), p.7939
Hauptverfasser: Jin, KyoHoon, Wi, JeongA, Kang, KyeongPil, Kim, YoungBin
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Historical documents refer to records or books that provide textual information about the thoughts and consciousness of past civilisations, and therefore, they have historical significance. These documents are used as key sources for historical studies as they provide information over several historical periods. Many studies have analysed various historical documents using deep learning; however, studies that employ changes in information over time are lacking. In this study, we propose a deep-learning approach using improved dynamic word embedding to determine the characteristics of 27 kings mentioned in the Annals of the Joseon Dynasty, which contains a record of 500 years. The characteristics of words for each king were quantitated based on dynamic word embedding; further, this information was applied to named entity recognition and neural machine translation.In experiments, we confirmed that the method we proposed showed better performance than other methods. In the named entity recognition task, the F1-score was 0.68; in the neural machine translation task, the BLEU4 score was 0.34. We demonstrated that this approach can be used to extract information about diplomatic relationships with neighbouring countries and the economic conditions of the Joseon Dynasty.
ISSN:2076-3417
2076-3417
DOI:10.3390/app10217939