Cross-lingual document similarity estimation and dictionary generation with comparable corpora
This paper proposes an approach for performing bilingual dictionary generation even when trained on widely available comparable bilingual corpora. We also show its capability to provide cross-lingual similarity estimates that correlate well with human judgments. We implement an approach using a nonl...
Gespeichert in:
Veröffentlicht in: | Knowledge and information systems 2019-03, Vol.58 (3), p.729-743 |
---|---|
Hauptverfasser: | , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | This paper proposes an approach for performing bilingual dictionary generation even when trained on widely available comparable bilingual corpora. We also show its capability to provide cross-lingual similarity estimates that correlate well with human judgments. We implement an approach using a nonlinear bilingual translation model that we train using comparable corpora. We propose a method using word embeddings and kernel approximation to train scalable nonlinear transformations. We demonstrate that this novel method works better on a majority of evaluated language pairs. |
---|---|
ISSN: | 0219-1377 0219-3116 |
DOI: | 10.1007/s10115-018-1179-9 |