Research on high-performance English translation based on topic model

Detailed description

Bibliographic details
Published in: Digital Communications and Networks, 2023-04, Vol. 9 (2), p. 505-511
Main authors: Shen, Yumin; Guo, Hongyu
Format: Article
Language: English
Description
Summary: Retelling extraction is an important branch of Natural Language Processing (NLP), and high-quality retelling resources are very helpful for improving the performance of machine translation. However, traditional methods based on a bilingual parallel corpus often ignore the document background in the process of retelling acquisition and application. To solve this problem, we introduce topic model information into the translation mode and propose a topic-based statistical machine translation method to improve translation performance. In this method, Probabilistic Latent Semantic Analysis (PLSA) is used to obtain the co-occurrence relationship between words and documents by hybrid matrix decomposition. We then design a decoder to simplify the decoding process. Experiments show that the proposed method can effectively improve the accuracy of translation.
ISSN: 2352-8648, 2468-5925
DOI: 10.1016/j.dcan.2022.03.015
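
The summary says that PLSA is used to relate words and documents through a decomposition of their co-occurrence data. As a rough illustration of that general idea only, the sketch below fits a textbook PLSA model with the EM algorithm on a toy document-word count matrix. It is not the authors' implementation: the function names (plsa, normalise), the parameters (n_topics, n_iter, seed) and the toy counts are all assumptions made for this example, and the record gives no details about the paper's decoder or how topic information enters the translation step.

```python
import random


def normalise(row):
    """Scale a list of non-negative numbers so it sums to 1 (uniform if all zero)."""
    total = sum(row)
    if total == 0.0:
        return [1.0 / len(row)] * len(row)
    return [x / total for x in row]


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit a basic PLSA model with EM (illustrative sketch, hypothetical API).

    counts[d][w] is the number of times word w occurs in document d.
    Returns (p_z_d, p_w_z), where p_z_d[d][z] approximates P(z|d)
    and p_w_z[z][w] approximates P(w|z).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    # Random positive initialisation, normalised into probability distributions.
    p_z_d = [normalise([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalise([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior P(z|d,w) is proportional to P(z|d) * P(w|z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                norm = sum(joint)
                if norm == 0.0:
                    continue
                for z in range(n_topics):
                    resp = n * joint[z] / norm
                    # Accumulate expected counts for the M-step.
                    new_z_d[d][z] += resp
                    new_w_z[z][w] += resp
        # M-step: renormalise the expected counts into distributions.
        p_z_d = [normalise(row) for row in new_z_d]
        p_w_z = [normalise(row) for row in new_w_z]
    return p_z_d, p_w_z


if __name__ == "__main__":
    # Toy corpus (made up): documents 0-1 use words 0-1, documents 2-3 use words 2-3.
    toy_counts = [[4, 2, 0, 0], [3, 3, 0, 0], [0, 0, 5, 1], [0, 0, 2, 4]]
    doc_topics, topic_words = plsa(toy_counts, n_topics=2)
    for d, dist in enumerate(doc_topics):
        print("document", d, [round(p, 3) for p in dist])
```

In a topic-based translation setting, per-document topic distributions such as doc_topics could in principle be used to weight candidate translations by how well they fit the document's topics; how the paper actually does this is not stated in this record.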