CORPUS CONVERSION APPARATUS AND COMPUTER PROGRAM
Main author: | |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Summary: | PROBLEM TO BE SOLVED: To provide a corpus conversion apparatus capable of preparing a parallel translation corpus that allows a statistical machine translation apparatus to be trained more effectively. SOLUTION: The corpus conversion apparatus 20, which converts a Japanese-English parallel translation corpus 50 into a training parallel translation corpus suited to statistical machine translation, comprises: a classification processing part 52 for classifying the translation pairs included in the parallel translation corpus 50 into synonymous sentence classes, each of which consists of translation pairs judged to be mutually synonymous in accordance with prescribed standards; a Japanese representative sentence determination part 56 for determining a representative sentence that represents the Japanese sentences of the translation pairs included in each synonymous sentence class; and a Japanese substitution processing part 60 for preparing a Japanese-English corpus 22 by substantially substituting, for each Japanese sentence of the translation pairs in each synonymous sentence class classified by the classification processing part 52, the representative sentence determined by the Japanese representative sentence determination part 56. COPYRIGHT: (C)2007,JPO&INPIT |
---|
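
The three parts named in the summary (classification, representative sentence determination, substitution) can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not state the "prescribed standards" for synonymy or how the representative sentence is chosen, so both rules below are assumptions made only for illustration: translation pairs count as synonymous when their English sides match after case and whitespace normalization, and the representative is the most frequent Japanese sentence in the class. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def normalize(english):
    """Hypothetical class key: lowercase, collapse runs of whitespace."""
    return " ".join(english.lower().split())


def classify(pairs):
    """Group (japanese, english) pairs into synonymous sentence classes.

    Assumption: pairs are synonymous when their normalized English sides
    are equal. The abstract only says "prescribed standards".
    """
    classes = defaultdict(list)
    for japanese, english in pairs:
        classes[normalize(english)].append((japanese, english))
    return classes


def representative(members):
    """Choose one Japanese sentence to stand for a whole class.

    Assumption: the most frequent Japanese sentence wins; on a tie,
    Counter.most_common returns the sentence encountered first.
    """
    counts = Counter(japanese for japanese, _ in members)
    return counts.most_common(1)[0][0]


def convert(pairs):
    """Replace each Japanese sentence with its class representative.

    The English sides and the order of the pairs are left unchanged.
    """
    classes = classify(pairs)
    chosen = {key: representative(members) for key, members in classes.items()}
    return [(chosen[normalize(english)], english) for _, english in pairs]


if __name__ == "__main__":
    corpus = [
        ("ありがとう", "Thank you"),
        ("ありがとうございます", "Thank you"),
        ("ありがとう", "thank you"),
        ("こんにちは", "Hello"),
    ]
    for japanese, english in convert(corpus):
        print(japanese, "|", english)
```

In this toy corpus the three "thank you" pairs fall into one class, and the polite variant is replaced by the more frequent plain form. The intended effect is that the training corpus shows one consistent Japanese rendering per class rather than several competing ones.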