Open-Domain Text Evaluation via Contrastive Distribution Methods
Saved in:
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Online access: | Order full text |
Summary: | Recent advancements in open-domain text generation, driven by the power of
large pre-trained language models (LLMs), have demonstrated remarkable
performance. However, assessing these models' generation quality remains a
challenge. In this paper, we introduce a novel method for evaluating
open-domain text generation called Contrastive Distribution Methods (CDM).
Leveraging the connection between increasing model parameters and enhanced LLM
performance, CDM creates a mapping from the _contrast_ of two probabilistic
distributions -- one known to be superior to the other -- to quality measures.
We investigate CDM for open-domain text generation evaluation under two
paradigms: 1) _Generative_ CDM, which harnesses the contrast of two language
models' distributions to generate synthetic examples for training
discriminator-based metrics; 2) _Discriminative_ CDM, which directly uses
distribution disparities between two language models for evaluation. Our
experiments on coherence evaluation for multi-turn dialogue and commonsense
evaluation for controllable generation demonstrate that CDM correlates better
with human judgment than existing automatic evaluation metrics, highlighting
the strong performance and generalizability of our approach. |
---|---|
DOI: | 10.48550/arxiv.2306.11879 |
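
To make the _Discriminative_ CDM paradigm from the summary concrete, here is a minimal sketch of the underlying idea: score a candidate text by how much more likely a stronger language model finds it than a weaker one. This is an illustration under stated assumptions, not the paper's exact setup; in particular, the model pair (gpt2-large vs. gpt2) and the per-token averaging are illustrative choices.

```python
# Minimal sketch of a discriminative contrastive score: the difference in
# average per-token log-likelihood between a stronger and a weaker LM.
# Model pair and averaging are illustrative assumptions, not the paper's setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def avg_log_likelihood(model, tokenizer, text: str) -> float:
    """Average per-token log-likelihood of `text` under `model`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Predict token t+1 from positions up to t, then look up the
    # log-probability the model assigned to the actual next token.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    token_lp = log_probs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return token_lp.mean().item()

def contrastive_score(strong, weak, tokenizer, text: str) -> float:
    """Positive when the stronger model prefers the text more than the
    weaker one does; usable as a relative quality signal."""
    return (avg_log_likelihood(strong, tokenizer, text)
            - avg_log_likelihood(weak, tokenizer, text))

tok = AutoTokenizer.from_pretrained("gpt2")  # shared vocabulary for both models
strong = AutoModelForCausalLM.from_pretrained("gpt2-large").eval()
weak = AutoModelForCausalLM.from_pretrained("gpt2").eval()

print(contrastive_score(strong, weak, tok, "The cafe was quiet, so we stayed."))
```

Averaging per token (rather than summing) keeps the score comparable across candidate texts of different lengths.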