Learning to Paraphrase Sentences to Different Complexity Levels

Detailed description

Bibliographic details
Published in:	Transactions of the Association for Computational Linguistics 2023-11, Vol.11, p.1332-1354
Main authors:	Chi, Alison; Chen, Li-Kuang; Chang, Yi-Chen; Lee, Shu-Hui; Chang, Jason S.
Format:	Article
Language:	English
Subjects:	Classification; Classifiers; Datasets; Language; Language modeling; Large language models; Linguistics; Multitasking; Natural language processing; Paraphrase; Readability; Sentences; Simplification; Task complexity; Writing
Online access:	Full text

Description
Abstract:	While sentence simplification is an active research topic in NLP, its adjacent tasks of sentence complexification and same-level paraphrasing are not. To train models on all three tasks, we present two new unsupervised datasets. We compare these datasets, one labeled by a weak classifier and the other by a rule-based approach, with a single supervised dataset. Using these three datasets for training, we perform extensive experiments on both multitasking and prompting strategies. Compared to other systems trained on unsupervised parallel data, models trained on our weak classifier labeled dataset achieve state-of-the-art performance on the ASSET simplification benchmark. Our models also outperform previous work on sentence-level targeting. Finally, we establish how a handful of Large Language Models perform on these tasks under a zero-shot setting.
ISSN:	2307-387X
DOI:	10.1162/tacl_a_00606