Improving Label Quality by Jointly Modeling Items and Annotators
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | We propose a fully Bayesian framework for learning ground truth labels from
noisy annotators.
Our framework ensures scalability by factoring a generative, Bayesian soft
clustering model over label distributions into the classic Dawid and Skene
joint annotator-data model. Earlier research along these lines has neither
fully incorporated label distributions nor explored clustering by annotators
only or data only. Our framework incorporates all of these properties in two forms:
(1) a graphical model designed to provide better ground truth estimates of
annotator responses as input to any black box supervised learning
algorithm, and
(2) a standalone neural model whose internal structure captures many of the
properties of the graphical model.
We conduct supervised learning experiments using both models and compare their
performance to that of one baseline and a state-of-the-art model. |
---|---|
DOI: | 10.48550/arxiv.2106.10600 |