Audio-visual atoms for generic video concept classification

We investigate the challenging issue of joint audio-visual analysis of generic videos targeting at concept detection. We extract a novel local representation, Audio-Visual Atom (AVA), which is defined as a region track associated with regional visual features and audio onset features. We develop a h...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:ACM transactions on multimedia computing communications and applications 2010-08, Vol.6 (3), p.1-19
Hauptverfasser: Jiang, Wei, Cotton, Courtenay, Chang, Shih-Fu, Ellis, Dan, Loui, Alexander C.
Format: Artikel
Sprache:eng
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:We investigate the challenging issue of joint audio-visual analysis of generic videos targeting at concept detection. We extract a novel local representation, Audio-Visual Atom (AVA), which is defined as a region track associated with regional visual features and audio onset features. We develop a hierarchical algorithm to extract visual atoms from generic videos, and locate energy onsets from the corresponding soundtrack by time-frequency analysis. Audio atoms are extracted around energy onsets. Visual and audio atoms form AVAs, based on which discriminative audio-visual codebooks are constructed for concept detection. Experiments over Kodak's consumer benchmark videos confirm the effectiveness of our approach.
ISSN:1551-6857
1551-6865
DOI:10.1145/1823746.1823748