Research on high-performance English translation based on topic model
Published in: | Digital communications and networks 2023-04, Vol.9 (2), p.505-511 |
---|---|
Main authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | Retelling extraction is an important branch of Natural Language Processing (NLP), and high-quality retelling resources are very helpful for improving the performance of machine translation. However, traditional methods based on the bilingual parallel corpus often ignore the document background in the process of retelling acquisition and application. To solve this problem, we introduce topic model information into the translation mode and propose a topic-based statistical machine translation method to improve translation performance. In this method, Probabilistic Latent Semantic Analysis (PLSA) is used to obtain the co-occurrence relationship between words and documents by hybrid matrix decomposition. Then we design a decoder to simplify the decoding process. Experiments show that the proposed method can effectively improve the accuracy of translation. |
---|---|
ISSN: | 2352-8648 2468-5925 |
DOI: | 10.1016/j.dcan.2022.03.015 |