Neural machine translation method and apparatus
Main author: | |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Summary: | The present invention provides a method of generating training data to which explicit word-alignment information is added without impairing sub-word tokens, and a neural machine translation method and apparatus including the method. The method of generating training data includes the steps of: (1) separating basic word boundaries through morphological analysis or named entity recognition of a sentence of a bilingual corpus used for learning; (2) extracting explicit word-alignment information from the sentence of the bilingual corpus used for learning; (3) further dividing the word boundaries separated in step (1) into sub-word tokens; (4) generating new source language training data by using the outputs of steps (1) and (3); and (5) generating new target language training data by using the explicit word-alignment information generated in step (2) and the target language outputs of steps (1) and (3). |
---|
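
The five steps in the summary can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: whitespace splitting stands in for morphological analysis or named entity recognition in step (1), fixed-size chunking stands in for a learned sub-word model in step (3), the alignment of step (2) is passed in as precomputed `(source_index, target_index)` pairs, and the `piece|index` output format and all function names are hypothetical.

```python
from typing import List, Tuple


def split_words(sentence: str) -> List[str]:
    # Step (1) stand-in: whitespace splitting, not real morphological analysis or NER.
    return sentence.split()


def split_subwords(word: str, size: int = 3) -> List[str]:
    # Step (3) stand-in: fixed-size chunks, not a learned sub-word model.
    return [word[i:i + size] for i in range(0, len(word), size)]


def make_source(words: List[str]) -> List[str]:
    # Step (4): tag every sub-word with the index of the word it came from,
    # so the word boundaries from step (1) survive sub-word splitting.
    out = []
    for w_idx, word in enumerate(words):
        for piece in split_subwords(word):
            out.append(f"{piece}|{w_idx}")
    return out


def make_target(words: List[str], alignment: List[Tuple[int, int]]) -> List[str]:
    # Step (5): tag every target sub-word with the index of the aligned
    # source word from step (2), or "-" when the target word is unaligned.
    tgt_to_src = {t: s for s, t in alignment}
    out = []
    for t_idx, word in enumerate(words):
        tag = tgt_to_src.get(t_idx, "-")
        for piece in split_subwords(word):
            out.append(f"{piece}|{tag}")
    return out


if __name__ == "__main__":
    src_words = split_words("machine translation")
    tgt_words = split_words("maschinelle uebersetzung")
    # Hypothetical alignment: source word 0 <-> target word 0, 1 <-> 1.
    pairs = [(0, 0), (1, 1)]
    print(make_source(src_words))
    print(make_target(tgt_words, pairs))
```

In this sketch the sub-word pieces themselves are left unchanged and the alignment is carried as a separate tag on each piece, which is one possible reading of "without impairing sub-word tokens"; the patent's actual data format may differ.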