A Universal Discriminator for Zero-Shot Generalization
Saved in:
Main Authors: , , , ,
Format: Article
Language: English
Subjects:
Online Access: Order full text
Abstract: Generative modeling has been the dominant approach for large-scale pretraining and zero-shot generalization. In this work, we challenge this convention by showing that discriminative approaches perform substantially better than generative ones on a large number of NLP tasks. Technically, we train a single discriminator to predict whether a text sample comes from the true data distribution, similar to GANs. Since many NLP tasks can be formulated as selecting from a few options, we use this discriminator to score the concatenation of the input with each option and predict which concatenation has the highest probability of coming from the true data distribution. This simple formulation achieves state-of-the-art zero-shot results on the T0 benchmark, outperforming T0 by 16.0%, 7.8%, and 11.5%, respectively, at different model scales. In the finetuning setting, our approach also achieves new state-of-the-art results on a wide range of NLP tasks, with only 1/4 of the parameters of previous methods. Meanwhile, our approach requires minimal prompting effort, which largely improves robustness and is essential for real-world applications. Furthermore, we also jointly train a generalized UD (universal discriminator) in combination with generative tasks, which maintains its advantage on discriminative tasks while simultaneously working on generative tasks.
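
To make the selection rule in the abstract concrete, here is a minimal sketch, assuming a Transformers sequence classifier with a single logit standing in for the discriminator. The `roberta-base` backbone, the `pick_option` helper, and the example prompt are illustrative assumptions only; the paper's actual discriminator, training procedure, and prompt format are not specified here.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical backbone standing in for a trained universal discriminator;
# the single-logit head here is freshly initialized, not the paper's model.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=1)
model.eval()

def pick_option(context: str, options: list[str]) -> str:
    """Return the option whose concatenation with the input scores highest
    under the discriminator's 'comes from the true data distribution' logit."""
    scores = []
    for option in options:
        # Encode the (input, option) pair as one sequence and score it.
        enc = tokenizer(context, option, return_tensors="pt", truncation=True)
        with torch.no_grad():
            scores.append(model(**enc).logits.squeeze().item())
    return max(zip(scores, options))[1]

# Example zero-shot multiple-choice query (prompt wording is an assumption).
print(pick_option("The movie was a delight to watch. Sentiment:", ["positive", "negative"]))
```

The key design point the abstract describes is that one discriminator handles any task expressible as option selection: only the candidate strings change per task, so no task-specific head or elaborate prompt engineering is required.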
DOI: 10.48550/arxiv.2211.08099