Overcoming label noise in audio event detection using sequential labeling
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | This paper addresses the noisy-label issue in audio event detection (AED) by
refining strong labels into sequential labels, from which the inaccurate
timestamps are removed. In AED, a strong label records the occurrence of a
specific event together with the timestamps marking the start and end of the
event in an audio clip. These timestamps depend on the subjective judgment of
each annotator, so label noise in them is inevitable. In contrast, weak labels
indicate only the occurrence of a specific event. They are free of the label
noise caused by timestamps, but they carry no time information. To fully
exploit the information in the available strong and weak labels, we propose an
AED scheme that converts the strong labels into sequential labels and trains
with them in addition to the given strong and weak labels. Using sequential
labels consistently improved performance, particularly the segment-based
F-score, by focusing on the occurrences of events. In the mean-teacher-based
approach to semi-supervised learning, adding an early step with sequential
prediction, alongside supervised learning with sequential labels, mitigated
label noise and inaccurate predictions from the teacher model and improved the
segment-based F-score significantly while maintaining the event-based
F-score. |
---|---|
DOI: | 10.48550/arxiv.2007.05191 |