Structural Segmentation of Musical Audio by Constrained Clustering

We describe a method of segmenting musical audio into structural sections based on a hierarchical labeling of spectral features. Frames of audio are first labeled as belonging to one of a number of discrete states using a hidden Markov model trained on the features. Histograms of neighboring frames...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	IEEE transactions on audio, speech, and language processing speech, and language processing, 2008-02, Vol.16 (2), p.318-326
Hauptverfasser:	Levy, M., Sandler, M.
Format:	Artikel
Sprache:	eng
Schlagworte:	Audio Bridges Clustering Clustering algorithms Constraints Continuity Frames Hidden Markov models Histograms Instruments Labeling Mathematical models Multiple signal classification music Music information retrieval Natural language processing Pattern recognition Segmentation Spectra
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	We describe a method of segmenting musical audio into structural sections based on a hierarchical labeling of spectral features. Frames of audio are first labeled as belonging to one of a number of discrete states using a hidden Markov model trained on the features. Histograms of neighboring frames are then clustered into segment-types representing distinct distributions of states, using a clustering algorithm in which temporal continuity is expressed as a set of constraints modeled by a hidden Markov random field. We give experimental results which show that in many cases the resulting segmentations correspond well to conventional notions of musical form. We show further how the constrained clustering approach can easily be extended to include prior musical knowledge, input from other machine approaches, or semi-supervision.
ISSN:	1558-7916 2329-9290 1558-7924 2329-9304
DOI:	10.1109/TASL.2007.910781