Transfer learning parallel statement pair extraction method and device based on language similarity

Bibliographic Details
Main Authors: WANG ZHENHAN, MAO CUNLI, YU ZHENGTAO, HUANG YUXIN, MAN ZHIBO, GAO SHENGXIANG
Format: Patent
Language: chi ; eng
Description
Summary: The invention relates to a transfer learning parallel statement pair extraction method and device based on language similarity, and belongs to the field of natural language processing. The method first preprocesses the Thai and Lao corpora and replaces Thai sub-words and words based on phonetic symbols to obtain a unified representation of Thai and Lao statements. It then migrates a Chinese-Thai parallel statement pair extraction model to a Chinese-Lao model using data migration and model migration based on Thai-Lao language similarity. Finally, the pre-trained parallel statement pair extraction model is used to predict the Chinese-Lao parallel statement pairs input into the model. The method provided by the invention can effectively model language similarity, and it migrates the resource-rich Chinese-Thai statement pair extraction model to a resource-scarce Chinese-Lao statement pair extraction model, so that the purpose o