Data poisoning sample generator for Chinese-English neural machine translation model

Bibliographic details
Main authors:	ZHAO XIANGGUO, SUN YONGJIAO, LU QING, JI HANGXU, ZHOU XIANWEI
Format:	Patent
Language:	Chinese; English
Subjects:	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS

Description
Abstract:	The invention provides a data poisoning sample generator for a Chinese-English neural machine translation model and relates to the technical field of data poisoning. The method comprises the following steps: acquiring syntactic information, such as the dependency relationships, of a sentence sequence; inputting the sentence sequence and the processed sentence sequence into a BERT model to obtain a feature vector of the sentence sequence and a feature vector of each word in it; constructing a graph based on the dependency relationships; acquiring context semantic feature vectors using a graph attention network; obtaining feature vectors of the word entities; fusing the feature vectors into a multi-feature fusion feature vector; sending the multi-feature fusion feature vector to a relation classifier for relation classification; accessing a large model; and using the large model to generate samples of Chinese-English bilingual sentence pairs according to the obtained relationship. According
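
The graph-construction, attention, and fusion steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `build_adjacency`, `graph_attention`, and `fuse` are hypothetical, the toy vectors stand in for BERT outputs, and the attention pass uses parameter-free scaled dot-product scores in place of the learned projections of a real graph attention network.

```python
import math


def build_adjacency(num_words, dep_edges):
    """Undirected adjacency matrix with self-loops from (head, dependent) pairs."""
    adj = [[i == j for j in range(num_words)] for i in range(num_words)]
    for head, dep in dep_edges:
        adj[head][dep] = True
        adj[dep][head] = True
    return adj


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def graph_attention(features, adj):
    """One simplified attention pass over the dependency graph.

    Each word becomes a softmax-weighted average of its neighbours.
    Assumption: scores are scaled dot products, not learned GAT weights.
    """
    dim = len(features[0])
    context = []
    for i, row in enumerate(adj):
        nbrs = [j for j, linked in enumerate(row) if linked]
        scores = [dot(features[i], features[j]) / math.sqrt(dim) for j in nbrs]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        context.append([
            sum(w / total * features[j][k] for w, j in zip(weights, nbrs))
            for k in range(dim)
        ])
    return context


def fuse(sentence_vec, entity_vecs, context_vecs, entity_ids):
    """Concatenate sentence, entity, and graph-context vectors into one vector."""
    fused = list(sentence_vec)
    for vec in entity_vecs:
        fused.extend(vec)
    for idx in entity_ids:
        fused.extend(context_vecs[idx])
    return fused


# Toy data: four words with 2-dimensional stand-ins for BERT word vectors.
word_vecs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, 1.0]]
sentence_vec = [0.6, 0.6]
dep_edges = [(1, 0), (1, 3), (3, 2)]  # hypothetical (head, dependent) indices
entity_ids = [0, 3]                   # hypothetical positions of the two entities

adjacency = build_adjacency(len(word_vecs), dep_edges)
context_vecs = graph_attention(word_vecs, adjacency)
fused_vec = fuse(sentence_vec, [word_vecs[i] for i in entity_ids],
                 context_vecs, entity_ids)
```

The fused vector would then be passed to a relation classifier; concatenation is only one possible fusion choice, and the abstract does not state which one the invention uses.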