Feature clustering for instrument classification


Bibliographic details
Published in:	Computational statistics 2011-06, Vol.26 (2), p.279-291
Main authors:	Ligges, Uwe; Krey, Sebastian
Format:	Article
Language:	English
Subjects:	Classification; Clustering; Computer programs; Economic Theory/Quantitative Economics/Mathematical Methods; Errors; Feature extraction; Library cataloging; Mathematical analysis; Mathematics and Statistics; Matlab; Methods; Musical instruments; Musical recordings; Original Paper; Probability and Statistics in Computer Science; Probability Theory and Stochastic Processes; Programming languages; Sound; Speech processing; Statistics; Studies; Support vector machines; Time series; Windows (intervals)

Description
Abstract:	We propose a method for classifying instruments from a piece of sound. Features are derived from a pre-filtered time series divided into small windows. Features from the (transformed) spectrum, Perceptual Linear Prediction (PLP), and Mel Frequency Cepstral Coefficients (MFCCs), as known from speech processing, are then selected. k-means clustering is applied to reduce the number of features for the classification task. An SVM classifier with a polynomial kernel yields good results: the misclassification error is roughly 19% for 59 different classes of instruments. As expected, the misclassification error is smaller for a problem with fewer classes. The rastamat library (Ellis, "PLP and RASTA (and MFCC, and inversion) in Matlab", http://www.ee.columbia.edu/~dpwe/resources/matlab/rastamat/, online web resource, 2005) has been ported from Matlab to R, so feature extraction as known from speech processing is now easily available in the statistical programming language R. This software has been used on a cluster of machines for the computationally intensive evaluation of the proposed method.
ISSN:	0943-4062, 1613-9658
DOI:	10.1007/s00180-011-0234-8
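
Illustration:	The abstract outlines a pipeline of windowing, per-window feature extraction, and k-means clustering to shrink the feature set before classification. The Python sketch below shows one plausible reading of that idea and is not the authors' R implementation. All function names are hypothetical, `toy_features` is only a stand-in for the spectral, PLP, and MFCC features, and the window size, hop, and k are arbitrary assumptions. The SVM step is omitted.

```python
import random


def split_into_windows(signal, size, hop):
    """Cut a 1-D sequence into windows of `size` samples, advancing by `hop`."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def toy_features(window):
    """Stand-in features (NOT PLP/MFCC): mean absolute amplitude and
    zero-crossing rate of one window."""
    mean_abs = sum(abs(x) for x in window) / len(window)
    crossings = sum(1 for a, b in zip(window, window[1:]) if a * b < 0)
    return [mean_abs, crossings / (len(window) - 1)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centroids.append(
                    [sum(m[d] for m in members) / len(members) for d in range(dim)]
                )
            else:
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def summarize(window_features, k):
    """Collapse many per-window vectors into one fixed-length vector made of
    the k sorted cluster centroids (assumed reduction scheme)."""
    centroids = sorted(kmeans(window_features, k))
    return [value for centroid in centroids for value in centroid]
```

Under these assumptions, a recording of any length is reduced to a vector of k times the per-window feature dimension, for example `summarize([toy_features(w) for w in split_into_windows(samples, 512, 256)], k=3)`. Such fixed-length vectors could then be passed to a classifier such as an SVM with a polynomial kernel.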