Data poisoning sample generator for Chinese-English neural machine translation model
Main authors: | , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Summary: | The invention provides a data poisoning sample generator for a Chinese-English neural machine translation model and relates to the technical field of data poisoning. The method comprises the following steps: acquiring syntactic information, such as the dependency relationships, of a sentence sequence; inputting the sentence sequence and the processed sentence sequence into a BERT model to obtain a feature vector for the sentence sequence and a feature vector for each word in it; constructing a graph from the dependency relationships; obtaining context semantic feature vectors with a graph attention network; obtaining feature vectors of the word entities; fusing these feature vectors into a multi-feature fusion vector; sending the multi-feature fusion vector to a relation classifier for relation classification; accessing a large model; and using the large model to generate Chinese-English bilingual sentence-pair samples according to the obtained relation. According |
---|
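The feature-extraction and relation-classification steps named in the summary can be illustrated with a minimal sketch. It is not the patent's implementation: the abstract does not give layer sizes, attention heads, or the fusion rule, so everything below is an assumption. Random vectors stand in for BERT outputs, the sentence vector is a mean over word vectors, the graph attention layer is a single-head layer in the usual formulation, fusion is plain concatenation, and all function names (`dependency_adjacency`, `gat_layer`, `fuse`, `classify`) are hypothetical. The large-model generation step is left out.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(values):
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dependency_adjacency(heads):
    """heads[i] is the index of token i's syntactic head, -1 for the root.
    Returns a symmetric boolean adjacency matrix with self-loops."""
    n = len(heads)
    adj = [[i == j for j in range(n)] for i in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child][head] = adj[head][child] = True
    return adj


def gat_layer(x, adj, w, a_src, a_dst):
    """Single-head graph attention layer (assumed form, not from the patent).
    x: n word vectors; w: d_in x d_out projection; a_src, a_dst: d_out vectors."""
    cols = list(zip(*w))
    h = [[dot(row, col) for col in cols] for row in x]  # projected features
    out, attn = [], []
    for i, h_i in enumerate(h):
        nbrs = [j for j, linked in enumerate(adj[i]) if linked]
        raw = [dot(a_src, h_i) + dot(a_dst, h[j]) for j in nbrs]
        # LeakyReLU (slope 0.2), then softmax over dependency neighbours only
        alpha = softmax([s if s > 0 else 0.2 * s for s in raw])
        out.append([sum(al * h[j][k] for al, j in zip(alpha, nbrs))
                    for k in range(len(h_i))])
        attn.append(dict(zip(nbrs, alpha)))
    return out, attn


def fuse(sentence_vec, context_vecs, word_vecs, e1, e2):
    """Concatenate sentence, context, and entity word features (assumed rule)."""
    return (list(sentence_vec) + context_vecs[e1] + context_vecs[e2]
            + word_vecs[e1] + word_vecs[e2])


def classify(fused, class_weights):
    """Linear relation classifier followed by softmax."""
    return softmax([dot(fused, w) for w in class_weights])


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = ["She", "translated", "the", "sentence"]
    heads = [1, -1, 3, 1]  # hand-written dependency heads for the toy sentence
    # Placeholder for BERT word vectors (random, 8-dimensional)
    x = [[rng.gauss(0, 1) for _ in range(8)] for _ in tokens]
    sentence_vec = [sum(col) / len(x) for col in zip(*x)]
    w = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(8)]
    a_src = [rng.gauss(0, 1) for _ in range(4)]
    a_dst = [rng.gauss(0, 1) for _ in range(4)]
    ctx, _ = gat_layer(x, dependency_adjacency(heads), w, a_src, a_dst)
    fused = fuse(sentence_vec, ctx, x, 0, 3)
    weights = [[rng.gauss(0, 1) for _ in fused] for _ in range(3)]
    print(classify(fused, weights))  # untrained, so the probabilities are arbitrary
```

Restricting attention to dependency neighbours is what ties the context vectors to the syntactic graph; a trained system would learn `w`, `a_src`, `a_dst`, and the classifier weights rather than sample them.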