Towards Accurate Translation via Semantically Appropriate Application of Lexical Constraints
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Summary: | Lexically-constrained NMT (LNMT) aims to incorporate user-provided
terminology into translations. Despite its practical advantages, existing work
has not evaluated LNMT models under challenging real-world conditions. In this
paper, we focus on two important but under-studied issues that lie in the
current evaluation process of LNMT studies. The model needs to cope with
challenging lexical constraints that are "homographs" or "unseen" during
training. To this end, we first design a homograph disambiguation module to
differentiate the meanings of homographs. Moreover, we propose PLUMCOT, which
integrates contextually rich information about unseen lexical constraints from
pre-trained language models and strengthens a copy mechanism of the pointer
network via direct supervision of a copying score. We also release HOLLY, an
evaluation benchmark for assessing the ability of a model to cope with
"homographic" and "unseen" lexical constraints. Experiments on HOLLY and the
previous test setup show the effectiveness of our method. The effects of
PLUMCOT are shown to be remarkable in "unseen" constraints. Our dataset is
available at https://github.com/papago-lab/HOLLY-benchmark |
DOI: | 10.48550/arxiv.2306.12089 |
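
The summary mentions strengthening the copy mechanism of a pointer network through direct supervision of a copying score. As a rough illustration of that general idea, and not the PLUMCOT implementation, the sketch below mixes a generation distribution with a copy distribution through a scalar gate and scores the gate against 0/1 copy labels using binary cross-entropy. The function names, the toy vocabulary, and the choice of binary cross-entropy are assumptions made for this example only.

```python
import math


def mix_distributions(p_gen, p_copy, gate):
    """Pointer-style mixture: gate * p_copy + (1 - gate) * p_gen.

    p_gen and p_copy map tokens to probabilities; gate is the copying
    score in [0, 1]. All names here are hypothetical.
    """
    vocab = set(p_gen) | set(p_copy)
    return {
        w: gate * p_copy.get(w, 0.0) + (1.0 - gate) * p_gen.get(w, 0.0)
        for w in vocab
    }


def gate_supervision_loss(gates, copy_labels, eps=1e-9):
    """Mean binary cross-entropy between copying scores and 0/1 labels.

    A label of 1 marks a target position that should be copied from a
    lexical constraint, 0 a position that should be generated.
    (Assumed loss form; the paper may define its supervision differently.)
    """
    total = 0.0
    for g, y in zip(gates, copy_labels):
        g = min(max(g, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(g) + (1 - y) * math.log(1.0 - g)
    return total / len(gates)


# Toy example: the constraint asks for "shore" at this decoding step.
p_gen = {"bank": 0.6, "shore": 0.3, "river": 0.1}
p_copy = {"shore": 1.0}
mixed = mix_distributions(p_gen, p_copy, gate=0.8)

# Gates that agree with the copy labels give a lower loss.
good_loss = gate_supervision_loss([0.9, 0.2], [1, 0])
bad_loss = gate_supervision_loss([0.1, 0.8], [1, 0])
```

With a high gate, most of the probability mass moves to the constrained token, and the supervision term rewards gates that are high exactly where a constraint token is expected.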