Noisy-Labeled NER with Confidence Estimation
Saved in:
Main authors:
Format: Article
Language: eng
Keywords:
Online access: Order full text
Abstract: Recent studies in deep learning have shown significant progress in named entity recognition (NER). Most existing works assume clean data annotation, yet a fundamental challenge in real-world scenarios is the large amount of noise from a variety of sources (e.g., pseudo, weak, or distant annotations). This work studies NER under a noisy-labeled setting with calibrated confidence estimation. Based on empirical observations of the different training dynamics of noisy and clean labels, we propose strategies for estimating confidence scores under local and global independence assumptions. We partially marginalize out labels of low confidence with a CRF model. We further propose a calibration method for confidence scores based on the structure of entity labels. We integrate our approach into a self-training framework to boost performance. Experiments in general noisy settings with four languages and in distantly labeled settings demonstrate the effectiveness of our method. Our code can be found at https://github.com/liukun95/Noisy-NER-Confidence-Estimation
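The phrase "partially marginalize out labels of low confidence with a CRF model" can be made concrete with a short sketch. The code below is not the authors' implementation (see the linked repository for that); it is a minimal illustration assuming per-token confidence scores are already available, a hypothetical threshold `tau`, and a single unbatched sentence. Positions whose confidence clears the threshold are clamped to their observed label, while low-confidence positions are summed over all labels inside the CRF forward pass.

```python
import torch

def constrained_log_partition(emissions, transitions, allowed):
    """Forward algorithm over the label sequences permitted by `allowed`.

    emissions:   (seq_len, num_labels) per-token label scores
    transitions: (num_labels, num_labels) score for label i -> label j
    allowed:     (seq_len, num_labels) bool; True = label permitted there
    Returns the log-sum-exp of path scores restricted to `allowed`.
    """
    neg_inf = -1e4  # effectively forbids a label without numeric overflow
    score = emissions[0].masked_fill(~allowed[0], neg_inf)
    for t in range(1, emissions.size(0)):
        # combine score of previous label i, transition i -> j, emission for j
        step = score.unsqueeze(1) + transitions + emissions[t].unsqueeze(0)
        score = torch.logsumexp(step, dim=0).masked_fill(~allowed[t], neg_inf)
    return torch.logsumexp(score, dim=0)

def partial_marginal_loss(emissions, transitions, labels, confidence, tau=0.7):
    """Negative log-likelihood with low-confidence labels marginalized out."""
    num_labels = emissions.size(1)
    clamped = torch.nn.functional.one_hot(labels, num_labels).bool()
    free = torch.ones_like(clamped)
    # Trust the annotation only where confidence >= tau (tau is an assumption).
    allowed = torch.where(confidence.unsqueeze(1) >= tau, clamped, free)
    log_numer = constrained_log_partition(emissions, transitions, allowed)
    log_denom = constrained_log_partition(emissions, transitions, free)
    return -(log_numer - log_denom)

# Toy usage: 5 tokens, 3 labels, two positions flagged as unreliable.
emissions = torch.randn(5, 3, requires_grad=True)
transitions = torch.randn(3, 3, requires_grad=True)
labels = torch.tensor([0, 1, 1, 2, 0])
confidence = torch.tensor([0.9, 0.3, 0.95, 0.2, 0.8])
loss = partial_marginal_loss(emissions, transitions, labels, confidence)
loss.backward()
```

Using a large negative constant rather than true negative infinity keeps gradients finite; the paper's actual confidence estimation, calibration, and batching details differ and are described in the repository above.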
DOI: 10.48550/arxiv.2104.04318