Continued Pretraining for Better Zero- and Few-Shot Promptability
Main authors: | , , , , , , |
Format: | Article |
Language: | eng |
Abstract: | Recently introduced language model prompting methods can achieve high
accuracy in zero- and few-shot settings while requiring few to no learned
task-specific parameters. Nevertheless, these methods still often trail behind
full model finetuning. In this work, we investigate if a dedicated continued
pretraining stage could improve "promptability", i.e., zero-shot performance
with natural language prompts or few-shot performance with prompt tuning. We
reveal settings where existing continued pretraining methods lack
promptability. We also identify current methodological gaps, which we fill with
thorough large-scale experiments. We demonstrate that a simple recipe,
continued pretraining that incorporates a trainable prompt during multi-task
learning, leads to improved promptability in both zero- and few-shot settings
compared to existing methods, up to 31% relative. On the other hand, we find
that continued pretraining using MAML-style meta-learning, a method that
directly optimizes few-shot promptability, yields subpar performance. We
validate our findings with two prompt tuning methods, and, based on our
results, we provide concrete recommendations to optimize promptability for
different use cases. |
DOI: | 10.48550/arxiv.2210.10258 |
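
The abstract's core recipe is continued pretraining with a trainable soft prompt shared across a multi-task mixture while the underlying language model stays frozen. Below is a minimal sketch of that idea, not the paper's released code: the backbone, embedding sizes, and random "multi-task" batches are toy assumptions for illustration only, and the real setup would use a pretrained LM and the paper's task mixture.

```python
# Minimal sketch (assumed, illustrative only) of prompt tuning during
# multi-task continued pretraining: only a soft prompt is trained.
import torch
import torch.nn as nn

class SoftPromptModel(nn.Module):
    def __init__(self, base_lm: nn.Module, embed: nn.Embedding, prompt_len: int = 20):
        super().__init__()
        self.base_lm = base_lm  # frozen backbone
        self.embed = embed      # frozen token embeddings
        for p in list(base_lm.parameters()) + list(embed.parameters()):
            p.requires_grad_(False)
        d = embed.embedding_dim
        # The only trainable parameters: a sequence of "soft" prompt vectors.
        self.soft_prompt = nn.Parameter(torch.randn(prompt_len, d) * 0.02)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        tok = self.embed(input_ids)                               # (B, T, d)
        prompt = self.soft_prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
        return self.base_lm(torch.cat([prompt, tok], dim=1))      # (B, P+T, vocab)

# Toy frozen "LM": token embeddings plus a small feed-forward head.
vocab, d = 100, 32
embed = nn.Embedding(vocab, d)
base_lm = nn.Sequential(nn.Linear(d, d), nn.Tanh(), nn.Linear(d, vocab))
model = SoftPromptModel(base_lm, embed, prompt_len=8)

# Multi-task continued pretraining loop: batches drawn from several tasks
# update only the shared soft prompt (random tensors stand in for real data).
opt = torch.optim.Adam([model.soft_prompt], lr=1e-3)
for step in range(3):
    input_ids = torch.randint(0, vocab, (4, 10))   # placeholder task batch
    labels = torch.randint(0, vocab, (4,))         # placeholder target tokens
    logits = model(input_ids)[:, -1, :]            # predict from the last position
    loss = nn.functional.cross_entropy(logits, labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After this stage, the pretrained soft prompt (or the backbone it was trained alongside) serves as the starting point for zero-shot prompting or few-shot prompt tuning on downstream tasks.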