Audio-visual atoms for generic video concept classification
We investigate the challenging issue of joint audio-visual analysis of generic videos targeting at concept detection. We extract a novel local representation, Audio-Visual Atom (AVA), which is defined as a region track associated with regional visual features and audio onset features. We develop a h...
Gespeichert in:
Veröffentlicht in: | ACM transactions on multimedia computing communications and applications 2010-08, Vol.6 (3), p.1-19 |
---|---|
Hauptverfasser: | , , , , |
Format: | Artikel |
Sprache: | eng |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | We investigate the challenging issue of joint audio-visual analysis of generic videos targeting at concept detection. We extract a novel local representation, Audio-Visual Atom (AVA), which is defined as a region track associated with regional visual features and audio onset features. We develop a hierarchical algorithm to extract visual atoms from generic videos, and locate energy onsets from the corresponding soundtrack by time-frequency analysis. Audio atoms are extracted around energy onsets. Visual and audio atoms form AVAs, based on which discriminative audio-visual codebooks are constructed for concept detection. Experiments over Kodak's consumer benchmark videos confirm the effectiveness of our approach. |
---|---|
ISSN: | 1551-6857 1551-6865 |
DOI: | 10.1145/1823746.1823748 |