Non-Negative Multilinear Principal Component Analysis of Auditory Temporal Modulations for Music Genre Classification

Motivated by psychophysiological investigations on the human auditory system, a bio-inspired two-dimensional auditory representation of music signals is exploited, that captures the slow temporal modulations. Although each recording is represented by a second-order tensor (i.e., a matrix), a third-o...

Full description


Bibliographic details
Published in:	IEEE transactions on audio, speech, and language processing, 2010-03, Vol.18 (3), p.576-588
Main authors:	Panagakis, Y., Kotropoulos, C., Arce, G.R.
Format:	Article
Language:	eng
Subjects:	Algorithms ; Auditory representations ; Auditory system ; Classification ; Feature extraction ; Humans ; Mathematical analysis ; Matrix decomposition ; Modulation ; Multiple signal classification ; Music ; music genre classification ; non-negative multilinear principal components analysis (NMPCA) ; non-negative tensor factorization (NTF) ; nonnegative matrix factorization (NMF) ; Principal component analysis ; Principal components analysis ; Psychology ; Scattering ; Singular value decomposition ; Studies ; Subspaces ; Tensile stress ; Tensors
Online access:	Order full text

container_end_page	588
container_issue	3
container_start_page	576
container_title	IEEE transactions on audio, speech, and language processing
container_volume	18
creator	Panagakis, Y. ; Kotropoulos, C. ; Arce, G.R.
description	Motivated by psychophysiological investigations on the human auditory system, a bio-inspired two-dimensional auditory representation of music signals is exploited, that captures the slow temporal modulations. Although each recording is represented by a second-order tensor (i.e., a matrix), a third-order tensor is needed to represent a music corpus. Non-negative multilinear principal component analysis (NMPCA) is proposed for the unsupervised dimensionality reduction of the third-order tensors. The NMPCA maximizes the total tensor scatter while preserving the non-negativity of auditory representations. An algorithm for NMPCA is derived by exploiting the structure of the Grassmann manifold. The NMPCA is compared against three multilinear subspace analysis techniques, namely the non-negative tensor factorization, the high-order singular value decomposition, and the multilinear principal component analysis as well as their linear counterparts, i.e., the non-negative matrix factorization, the singular value decomposition, and the principal components analysis in extracting features that are subsequently classified by either support vector machine or nearest neighbor classifiers. Three different sets of experiments conducted on the GTZAN and the ISMIR2004 Genre datasets demonstrate the superiority of NMPCA against the aforementioned subspace analysis techniques in extracting more discriminating features, especially when the training set has small cardinality. The best classification accuracies reported in the paper exceed those obtained by the state-of-the-art music genre classification algorithms applied to both datasets.
doi_str_mv	10.1109/TASL.2009.2036813
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_miscellaneous_875030002</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5337979</ieee_id><sourcerecordid>2567680431</sourcerecordid><originalsourceid>FETCH-LOGICAL-c422t-38661d27c63f666a02cec478137d347aa7d4cb4ad576cc9ca81bb5a02f9e2fc33</originalsourceid><addsrcrecordid>eNqFkUlLBDEQhRtRcP0B4iV48dSarZPOcRjcYFzA8Rwy6WqJZJIx6Rbm35txxIMXL1UF76sHVa-qTgm-JASrq_nkZXZJMValMNEStlMdkKZpa6ko3_2didivDnN-x5gzwclBNT7GUD_CmxncJ6CH0Q_OuwAmoefkgnUr49E0LlcxQBjQJBi_zi6j2KPJ2LkhpjWaQ5FT4R5iN_piFENGfUzFLTuLbiEkQFNvcna9s9_6cbXXG5_h5KcfVa831_PpXT17ur2fTma15ZQONWuFIB2VVrBeCGEwtWC5LNfJjnFpjOy4XXDTNVJYq6xpyWLRFKxXQHvL2FF1sfVdpfgxQh700mUL3psAccy6lQ1mGGP6LynLv7BkbON5_od8j2Mqj8laEUmpwpwXiGwhm2LOCXq9Sm5p0loTrDeB6U1gehOY_gms7JxtdxwA_PINY1JJxb4Abu-S-g</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>917229044</pqid></control><display><type>article</type><title>Non-Negative Multilinear Principal Component Analysis of Auditory Temporal Modulations for Music Genre Classification</title><source>IEEE Electronic Library (IEL)</source><creator>Panagakis, Y. ; Kotropoulos, C. ; Arce, G.R.</creator><creatorcontrib>Panagakis, Y. ; Kotropoulos, C. ; Arce, G.R.</creatorcontrib><description>Motivated by psychophysiological investigations on the human auditory system, a bio-inspired two-dimensional auditory representation of music signals is exploited, that captures the slow temporal modulations. Although each recording is represented by a second-order tensor (i.e., a matrix), a third-order tensor is needed to represent a music corpus. Non-negative multilinear principal component analysis (NMPCA) is proposed for the unsupervised dimensionality reduction of the third-order tensors. The NMPCA maximizes the total tensor scatter while preserving the non-negativity of auditory representations. 
An algorithm for NMPCA is derived by exploiting the structure of the Grassmann manifold. The NMPCA is compared against three multilinear subspace analysis techniques, namely the non-negative tensor factorization, the high-order singular value decomposition, and the multilinear principal component analysis as well as their linear counterparts, i.e., the non-negative matrix factorization, the singular value decomposition, and the principal components analysis in extracting features that are subsequently classified by either support vector machine or nearest neighbor classifiers. Three different sets of experiments conducted on the GTZAN and the ISMIR2004 Genre datasets demonstrate the superiority of NMPCA against the aforementioned subspace analysis techniques in extracting more discriminating features, especially when the training set has small cardinality. The best classification accuracies reported in the paper exceed those obtained by the state-of-the-art music genre classification algorithms applied to both datasets.</description><identifier>ISSN: 1558-7916</identifier><identifier>ISSN: 2329-9290</identifier><identifier>EISSN: 1558-7924</identifier><identifier>EISSN: 2329-9304</identifier><identifier>DOI: 10.1109/TASL.2009.2036813</identifier><identifier>CODEN: ITASD8</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Algorithms ; Auditory representations ; Auditory system ; Classification ; Feature extraction ; Humans ; Mathematical analysis ; Matrix decomposition ; Modulation ; Multiple signal classification ; Music ; music genre classification ; non-negative multilinear principal components analysis (NMPCA) ; non-negative tensor factorization (NTF) ; nonnegative matrix factorization (NMF) ; Principal component analysis ; Principal components analysis ; Psychology ; Scattering ; Singular value decomposition ; Studies ; Subspaces ; Tensile stress ; Tensors</subject><ispartof>IEEE transactions on audio, speech, and language 
processing, 2010-03, Vol.18 (3), p.576-588</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Mar 2010</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c422t-38661d27c63f666a02cec478137d347aa7d4cb4ad576cc9ca81bb5a02f9e2fc33</citedby><cites>FETCH-LOGICAL-c422t-38661d27c63f666a02cec478137d347aa7d4cb4ad576cc9ca81bb5a02f9e2fc33</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5337979$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27923,27924,54757</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5337979$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Panagakis, Y.</creatorcontrib><creatorcontrib>Kotropoulos, C.</creatorcontrib><creatorcontrib>Arce, G.R.</creatorcontrib><title>Non-Negative Multilinear Principal Component Analysis of Auditory Temporal Modulations for Music Genre Classification</title><title>IEEE transactions on audio, speech, and language processing</title><addtitle>TASL</addtitle><description>Motivated by psychophysiological investigations on the human auditory system, a bio-inspired two-dimensional auditory representation of music signals is exploited, that captures the slow temporal modulations. Although each recording is represented by a second-order tensor (i.e., a matrix), a third-order tensor is needed to represent a music corpus. Non-negative multilinear principal component analysis (NMPCA) is proposed for the unsupervised dimensionality reduction of the third-order tensors. The NMPCA maximizes the total tensor scatter while preserving the non-negativity of auditory representations. An algorithm for NMPCA is derived by exploiting the structure of the Grassmann manifold. 
The NMPCA is compared against three multilinear subspace analysis techniques, namely the non-negative tensor factorization, the high-order singular value decomposition, and the multilinear principal component analysis as well as their linear counterparts, i.e., the non-negative matrix factorization, the singular value decomposition, and the principal components analysis in extracting features that are subsequently classified by either support vector machine or nearest neighbor classifiers. Three different sets of experiments conducted on the GTZAN and the ISMIR2004 Genre datasets demonstrate the superiority of NMPCA against the aforementioned subspace analysis techniques in extracting more discriminating features, especially when the training set has small cardinality. The best classification accuracies reported in the paper exceed those obtained by the state-of-the-art music genre classification algorithms applied to both datasets.</description><subject>Algorithms</subject><subject>Auditory representations</subject><subject>Auditory system</subject><subject>Classification</subject><subject>Feature extraction</subject><subject>Humans</subject><subject>Mathematical analysis</subject><subject>Matrix decomposition</subject><subject>Modulation</subject><subject>Multiple signal classification</subject><subject>Music</subject><subject>music genre classification</subject><subject>non-negative multilinear principal components analysis (NMPCA)</subject><subject>non-negative tensor factorization (NTF)</subject><subject>nonnegative matrix factorization (NMF)</subject><subject>Principal component analysis</subject><subject>Principal components analysis</subject><subject>Psychology</subject><subject>Scattering</subject><subject>Singular value decomposition</subject><subject>Studies</subject><subject>Subspaces</subject><subject>Tensile 
stress</subject><subject>Tensors</subject><issn>1558-7916</issn><issn>2329-9290</issn><issn>1558-7924</issn><issn>2329-9304</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2010</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNqFkUlLBDEQhRtRcP0B4iV48dSarZPOcRjcYFzA8Rwy6WqJZJIx6Rbm35txxIMXL1UF76sHVa-qTgm-JASrq_nkZXZJMValMNEStlMdkKZpa6ko3_2didivDnN-x5gzwclBNT7GUD_CmxncJ6CH0Q_OuwAmoefkgnUr49E0LlcxQBjQJBi_zi6j2KPJ2LkhpjWaQ5FT4R5iN_piFENGfUzFLTuLbiEkQFNvcna9s9_6cbXXG5_h5KcfVa831_PpXT17ur2fTma15ZQONWuFIB2VVrBeCGEwtWC5LNfJjnFpjOy4XXDTNVJYq6xpyWLRFKxXQHvL2FF1sfVdpfgxQh700mUL3psAccy6lQ1mGGP6LynLv7BkbON5_od8j2Mqj8laEUmpwpwXiGwhm2LOCXq9Sm5p0loTrDeB6U1gehOY_gms7JxtdxwA_PINY1JJxb4Abu-S-g</recordid><startdate>201003</startdate><enddate>201003</enddate><creator>Panagakis, Y.</creator><creator>Kotropoulos, C.</creator><creator>Arce, G.R.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>201003</creationdate><title>Non-Negative Multilinear Principal Component Analysis of Auditory Temporal Modulations for Music Genre Classification</title><author>Panagakis, Y. ; Kotropoulos, C. 
; Arce, G.R.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c422t-38661d27c63f666a02cec478137d347aa7d4cb4ad576cc9ca81bb5a02f9e2fc33</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Algorithms</topic><topic>Auditory representations</topic><topic>Auditory system</topic><topic>Classification</topic><topic>Feature extraction</topic><topic>Humans</topic><topic>Mathematical analysis</topic><topic>Matrix decomposition</topic><topic>Modulation</topic><topic>Multiple signal classification</topic><topic>Music</topic><topic>music genre classification</topic><topic>non-negative multilinear principal components analysis (NMPCA)</topic><topic>non-negative tensor factorization (NTF)</topic><topic>nonnegative matrix factorization (NMF)</topic><topic>Principal component analysis</topic><topic>Principal components analysis</topic><topic>Psychology</topic><topic>Scattering</topic><topic>Singular value decomposition</topic><topic>Studies</topic><topic>Subspaces</topic><topic>Tensile stress</topic><topic>Tensors</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Panagakis, Y.</creatorcontrib><creatorcontrib>Kotropoulos, C.</creatorcontrib><creatorcontrib>Arce, G.R.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts 
Professional</collection><jtitle>IEEE transactions on audio, speech, and language processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Panagakis, Y.</au><au>Kotropoulos, C.</au><au>Arce, G.R.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Non-Negative Multilinear Principal Component Analysis of Auditory Temporal Modulations for Music Genre Classification</atitle><jtitle>IEEE transactions on audio, speech, and language processing</jtitle><stitle>TASL</stitle><date>2010-03</date><risdate>2010</risdate><volume>18</volume><issue>3</issue><spage>576</spage><epage>588</epage><pages>576-588</pages><issn>1558-7916</issn><issn>2329-9290</issn><eissn>1558-7924</eissn><eissn>2329-9304</eissn><coden>ITASD8</coden><abstract>Motivated by psychophysiological investigations on the human auditory system, a bio-inspired two-dimensional auditory representation of music signals is exploited, that captures the slow temporal modulations. Although each recording is represented by a second-order tensor (i.e., a matrix), a third-order tensor is needed to represent a music corpus. Non-negative multilinear principal component analysis (NMPCA) is proposed for the unsupervised dimensionality reduction of the third-order tensors. The NMPCA maximizes the total tensor scatter while preserving the non-negativity of auditory representations. An algorithm for NMPCA is derived by exploiting the structure of the Grassmann manifold. 
The NMPCA is compared against three multilinear subspace analysis techniques, namely the non-negative tensor factorization, the high-order singular value decomposition, and the multilinear principal component analysis as well as their linear counterparts, i.e., the non-negative matrix factorization, the singular value decomposition, and the principal components analysis in extracting features that are subsequently classified by either support vector machine or nearest neighbor classifiers. Three different sets of experiments conducted on the GTZAN and the ISMIR2004 Genre datasets demonstrate the superiority of NMPCA against the aforementioned subspace analysis techniques in extracting more discriminating features, especially when the training set has small cardinality. The best classification accuracies reported in the paper exceed those obtained by the state-of-the-art music genre classification algorithms applied to both datasets.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TASL.2009.2036813</doi><tpages>13</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1558-7916
ispartof	IEEE transactions on audio, speech, and language processing, 2010-03, Vol.18 (3), p.576-588
issn	1558-7916 2329-9290 1558-7924 2329-9304
language	eng
recordid	cdi_proquest_miscellaneous_875030002
source	IEEE Electronic Library (IEL)
subjects	Algorithms ; Auditory representations ; Auditory system ; Classification ; Feature extraction ; Humans ; Mathematical analysis ; Matrix decomposition ; Modulation ; Multiple signal classification ; Music ; music genre classification ; non-negative multilinear principal components analysis (NMPCA) ; non-negative tensor factorization (NTF) ; nonnegative matrix factorization (NMF) ; Principal component analysis ; Principal components analysis ; Psychology ; Scattering ; Singular value decomposition ; Studies ; Subspaces ; Tensile stress ; Tensors
title	Non-Negative Multilinear Principal Component Analysis of Auditory Temporal Modulations for Music Genre Classification
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T17%3A52%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Non-Negative%20Multilinear%20Principal%20Component%20Analysis%20of%20Auditory%20Temporal%20Modulations%20for%20Music%20Genre%20Classification&rft.jtitle=IEEE%20transactions%20on%20audio,%20speech,%20and%20language%20processing&rft.au=Panagakis,%20Y.&rft.date=2010-03&rft.volume=18&rft.issue=3&rft.spage=576&rft.epage=588&rft.pages=576-588&rft.issn=1558-7916&rft.eissn=1558-7924&rft.coden=ITASD8&rft_id=info:doi/10.1109/TASL.2009.2036813&rft_dat=%3Cproquest_RIE%3E2567680431%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=917229044&rft_id=info:pmid/&rft_ieee_id=5337979&rfr_iscdi=true
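
Illustrative note on the description field: the abstract states that each recording is a second-order tensor (a matrix), that a corpus forms a third-order tensor, and that NMPCA maximizes the total tensor scatter. The following is a minimal sketch of those two notions only, not the authors' algorithm: it performs no projection, applies no non-negativity constraint, and does no Grassmann-manifold optimization. The function names and toy values are hypothetical, and the scatter is assumed to be the usual sum of squared Frobenius distances from the mean matrix.

```python
def stack_corpus(recordings):
    """Stack equally sized matrices into a third-order tensor (nested lists).

    Axis 0 indexes recordings; axes 1 and 2 index the rows and columns of each
    recording's two-dimensional representation.
    """
    rows, cols = len(recordings[0]), len(recordings[0][0])
    for rec in recordings:
        if len(rec) != rows or any(len(r) != cols for r in rec):
            raise ValueError("all recordings must share the same shape")
    return [[list(r) for r in rec] for rec in recordings]


def mean_matrix(corpus):
    """Element-wise mean of the matrices stacked along axis 0."""
    n = len(corpus)
    rows, cols = len(corpus[0]), len(corpus[0][0])
    return [[sum(rec[i][j] for rec in corpus) / n for j in range(cols)]
            for i in range(rows)]


def total_scatter(corpus):
    """Sum over recordings of the squared Frobenius distance to the mean matrix."""
    mean = mean_matrix(corpus)
    return sum((rec[i][j] - mean[i][j]) ** 2
               for rec in corpus
               for i in range(len(mean))
               for j in range(len(mean[0])))


# Toy corpus: two 2x2 non-negative "recordings" (made-up values).
corpus = stack_corpus([
    [[1.0, 2.0], [3.0, 4.0]],
    [[3.0, 2.0], [1.0, 4.0]],
])
scatter = total_scatter(corpus)
```

In the setting the abstract describes, the scatter would presumably be evaluated on the projected (dimension-reduced) tensors rather than on the raw corpus as here; the sketch only makes the quantity concrete.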