Data poisoning sample generator for Chinese-English neural machine translation model

Bibliographic details
Main authors: ZHAO XIANGGUO, SUN YONGJIAO, LU QING, JI HANGXU, ZHOU XIANWEI
Format: Patent
Language: Chinese; English
Description
Abstract: The invention provides a data poisoning sample generator for a Chinese-English neural machine translation model, and relates to the technical field of data poisoning. The method comprises the following steps: acquiring syntactic information, such as the dependency relationships, of a sentence sequence; inputting the sentence sequence and the processed sentence sequence into a BERT model to obtain a feature vector for the sentence sequence and a feature vector for each word in it; constructing a graph based on the dependency relationships; obtaining context semantic feature vectors with a graph attention network; obtaining feature vectors of the word entities; fusing the feature vectors into a multi-feature fusion feature vector; sending the multi-feature fusion feature vector to a relation classifier for relation classification; accessing a large model; and using the large model to generate samples of Chinese-English bilingual sentence pairs according to the obtained relationship. According
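
The graph and fusion steps named in the abstract can be illustrated with a small sketch. This is not the patented implementation. It assumes hand-written dependency edges in place of a parser, made-up 2-dimensional vectors in place of BERT outputs, and a single dot-product attention pass in place of a trained graph attention network. All function names (`build_adjacency`, `graph_attention`, `fuse`) are hypothetical.

```python
import math


def build_adjacency(num_words, dependency_edges):
    """Undirected neighbor lists (with self-loops) from (head, dependent) index pairs."""
    neighbors = [{i} for i in range(num_words)]
    for head, dep in dependency_edges:
        neighbors[head].add(dep)
        neighbors[dep].add(head)
    return [sorted(n) for n in neighbors]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def graph_attention(features, neighbors):
    """One attention pass: each word becomes a softmax-weighted mix of its neighbors.

    Assumption: plain dot-product scores. A real graph attention layer
    learns its projection and scoring weights during training.
    """
    dim = len(features[0])
    output = []
    for i, nbrs in enumerate(neighbors):
        scores = [sum(a * b for a, b in zip(features[i], features[j])) for j in nbrs]
        weights = softmax(scores)
        output.append(
            [sum(w * features[j][k] for w, j in zip(weights, nbrs)) for k in range(dim)]
        )
    return output


def fuse(sentence_vec, word_vecs, context, entity_a, entity_b):
    """Concatenate sentence, entity, and entity-context vectors into one list."""
    return (
        sentence_vec
        + word_vecs[entity_a] + word_vecs[entity_b]
        + context[entity_a] + context[entity_b]
    )


# Toy 4-word sentence; the vectors stand in for per-word BERT features.
word_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
# Mean of the word vectors stands in for the sentence-level BERT feature.
sentence_vec = [sum(col) / len(word_vecs) for col in zip(*word_vecs)]
edges = [(1, 0), (1, 2), (2, 3)]  # hypothetical (head, dependent) pairs

neighbors = build_adjacency(len(word_vecs), edges)
context = graph_attention(word_vecs, neighbors)
fused = fuse(sentence_vec, word_vecs, context, entity_a=0, entity_b=2)
```

In this sketch `fused` is the input a relation classifier would receive. The abstract does not specify the classifier or how its predicted relation is passed to the large model, so those steps are left out.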