Unsupervised Text Style Transfer with Padded Masked Language Models
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Abstract: | We propose Masker, an unsupervised text-editing method for style transfer. To
tackle cases when no parallel source-target pairs are available, we train
masked language models (MLMs) for both the source and the target domain. Then
we find the text spans where the two models disagree the most in terms of
likelihood. This allows us to identify the source tokens to delete to transform
the source text to match the style of the target domain. The deleted tokens are
replaced with the target MLM, and by using a padded MLM variant, we avoid
having to predetermine the number of inserted tokens. Our experiments on
sentence fusion and sentiment transfer demonstrate that Masker performs
competitively in a fully unsupervised setting. Moreover, in low-resource
settings, it improves supervised methods' accuracy by over 10 percentage points
when pre-training them on silver training data generated by Masker. |
DOI: | 10.48550/arxiv.2010.01054 |
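
The two steps the abstract describes, locating the span where the source and target models disagree most and then refilling it with a padded set of mask slots, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the unigram scorers stand in for real MLMs, the gap (source log-likelihood minus target log-likelihood of the original span) is one simple reading of "disagree the most" rather than the paper's exact scoring function, and the names `unigram_scorer`, `find_disagreement_span`, `fill_with_padded_slots`, and `toy_predict` are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

PAD = "[PAD]"
MASK = "[MASK]"

# Hypothetical scorer type: log-likelihood of tokens[start:end].
SpanScorer = Callable[[Sequence[str], int, int], float]


def unigram_scorer(probs: Dict[str, float], floor: float = 1e-6) -> SpanScorer:
    """Toy stand-in for an MLM: sums unigram log-probabilities over a span."""
    def score(tokens: Sequence[str], start: int, end: int) -> float:
        return sum(math.log(probs.get(t, floor)) for t in tokens[start:end])
    return score


def find_disagreement_span(
    tokens: Sequence[str],
    source_score: SpanScorer,
    target_score: SpanScorer,
    max_span_len: int = 2,
) -> Tuple[int, int, float]:
    """Return (start, end, gap) for the span the source model finds likely
    and the target model finds unlikely (assumed scoring rule)."""
    if not tokens:
        raise ValueError("tokens must be non-empty")
    best = (0, 1, -math.inf)
    for start in range(len(tokens)):
        last_end = min(start + max_span_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            gap = source_score(tokens, start, end) - target_score(tokens, start, end)
            if gap > best[2]:
                best = (start, end, gap)
    return best


def fill_with_padded_slots(
    tokens: Sequence[str],
    start: int,
    end: int,
    predict: Callable[[List[str], int, int], List[str]],
    num_slots: int = 2,
) -> List[str]:
    """Replace tokens[start:end] with a fixed number of mask slots, let the
    target-side predictor fill them, then drop slots predicted as PAD."""
    masked = list(tokens[:start]) + [MASK] * num_slots + list(tokens[end:])
    predictions = predict(masked, start, num_slots)
    if len(predictions) != num_slots:
        raise ValueError("predictor must return one token per slot")
    kept = [t for t in predictions if t != PAD]
    return list(tokens[:start]) + kept + list(tokens[end:])


def toy_predict(masked: List[str], start: int, num_slots: int) -> List[str]:
    """Toy target-side predictor: one real token, remaining slots padded."""
    return ["great"] + [PAD] * (num_slots - 1)


# Toy negative-to-positive sentiment example with made-up probabilities.
source_probs = {"the": 0.10, "food": 0.05, "was": 0.10, "terrible": 0.05}
target_probs = {"the": 0.12, "food": 0.06, "was": 0.12, "terrible": 0.0005, "great": 0.05}

sentence = ["the", "food", "was", "terrible"]
start, end, gap = find_disagreement_span(
    sentence, unigram_scorer(source_probs), unigram_scorer(target_probs)
)
result = fill_with_padded_slots(sentence, start, end, toy_predict)
print(sentence[start:end], "->", result)
```

Because the predictor may emit the padding token for any slot, the number of inserted tokens does not have to be fixed in advance; only an upper bound (`num_slots` in this sketch) is chosen.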