Lattice-Based Transformer Encoder for Neural Machine Translation
Format: Article
Language: English
Abstract: Neural machine translation (NMT) takes deterministic sequences as source representations. However, both word-level and subword-level segmentation offer multiple ways to split a source sequence, depending on the choice of word segmenter or subword vocabulary size. We hypothesize that this diversity in segmentations may affect NMT performance. To integrate different segmentations with the state-of-the-art NMT model, the Transformer, we propose lattice-based encoders that explore effective word or subword representations automatically during training. We propose two methods: 1) lattice positional encoding and 2) lattice-aware self-attention. The two methods can be used together and are complementary to each other, further improving translation performance. Experimental results show the superiority of lattice-based encoders with word-level and subword-level representations over the conventional Transformer encoder.
DOI: 10.48550/arxiv.1906.01282
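To make the two ideas named in the abstract more concrete, the following is a minimal, illustrative sketch in PyTorch, not the authors' implementation. It assumes each lattice node is a word or subword unit covering a character span [start, end) of the source sentence, derives a node's position from its span start (a stand-in for lattice positional encoding), and restricts attention to nodes with non-conflicting spans (a stand-in for lattice-aware self-attention); the function names, the span-based compatibility rule, and the single-head attention are all assumptions for illustration.

```python
# Hedged sketch of lattice positional encoding and lattice-aware
# self-attention over a flattened list of lattice nodes. Each node i is
# described by a character span spans[i] = (start, end) in the source.
import math
import torch

def sinusoidal_encoding(positions, d_model):
    """Standard Transformer sinusoidal encoding evaluated at given positions."""
    pe = torch.zeros(len(positions), d_model)
    pos = torch.tensor(positions, dtype=torch.float32).unsqueeze(1)
    div = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32)
                    * (-math.log(10000.0) / d_model))
    pe[:, 0::2] = torch.sin(pos * div)
    pe[:, 1::2] = torch.cos(pos * div)
    return pe

def lattice_positional_encoding(spans, d_model):
    """Position each lattice node by the start of its character span,
    so competing segmentations of the same characters share positions."""
    return sinusoidal_encoding([start for start, _ in spans], d_model)

def lattice_attention_mask(spans):
    """Boolean mask: True where node i may attend to node j.

    Overlapping-but-different spans come from competing segmentations of
    the same characters, so they are masked out (an assumed rule)."""
    n = len(spans)
    mask = torch.zeros(n, n, dtype=torch.bool)
    for i, (si, ei) in enumerate(spans):
        for j, (sj, ej) in enumerate(spans):
            mask[i, j] = (i == j) or ei <= sj or ej <= si
    return mask

def lattice_self_attention(x, spans):
    """Single-head scaled dot-product attention restricted by the lattice mask."""
    d = x.size(-1)
    scores = x @ x.transpose(-2, -1) / math.sqrt(d)
    scores = scores.masked_fill(~lattice_attention_mask(spans), float("-inf"))
    return torch.softmax(scores, dim=-1) @ x

if __name__ == "__main__":
    # Toy lattice over a 6-character source: two competing segmentations,
    # {[0,2), [2,6)} and {[0,3), [3,6)}, flattened into one node list.
    spans = [(0, 2), (2, 6), (0, 3), (3, 6)]
    d_model = 16
    x = torch.randn(len(spans), d_model) + lattice_positional_encoding(spans, d_model)
    out = lattice_self_attention(x, spans)
    print(out.shape)  # torch.Size([4, 16])
```

In this sketch the two mechanisms compose naturally: positions are added to node embeddings before attention, and the mask only changes which nodes can see each other, which mirrors how the abstract describes the methods as usable together.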