Automatic detection of head voice in sung musical signals via machine learning classification of time-varying partial intensities
The automatic detection of portions of a musical signal produced according to time-varying performance parameters is an important problem in musical signal processing. The present work attempts such a task: the algorithms presented seek to determine from a sung input signal which portions of the signal are sung using the head voice, also known as falsetto in the case of a male singer.
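The QIFFT feature this record's abstract describes — amplitudes of low harmonics relative to the fundamental, estimated by fitting a parabola through three log-magnitude FFT samples around each spectral peak — can be sketched as follows. This is a minimal illustration under stated assumptions (Hann window, known fundamental frequency `f0`, and "first four harmonics" read as the four partials above the fundamental); the function names and frame handling are illustrative, not details from the paper.

```python
import numpy as np

def qifft_peak(log_mag, k):
    """Quadratic (parabolic) interpolation of a spectral peak at bin k.

    Fits a parabola through the log magnitudes at bins k-1, k, k+1 and
    returns (fractional bin offset, interpolated peak log magnitude).
    """
    a, b, c = log_mag[k - 1], log_mag[k], log_mag[k + 1]
    p = 0.5 * (a - c) / (a - 2.0 * b + c)    # offset in bins, |p| <= 0.5
    return p, b - 0.25 * (a - c) * p         # interpolated peak height

def relative_harmonic_levels(frame, fs, f0, n_harmonics=4):
    """Log amplitudes of the first n_harmonics partials above the
    fundamental, relative to the fundamental, estimated by QIFFT."""
    n = len(frame)
    spec = np.fft.rfft(frame * np.hanning(n))
    log_mag = np.log(np.abs(spec) + 1e-12)
    levels = []
    for h in range(1, n_harmonics + 2):           # fundamental + harmonics
        k = int(round(h * f0 * n / fs))           # bin nearest h * f0
        k += np.argmax(log_mag[k - 1:k + 2]) - 1  # snap to the local max
        levels.append(qifft_peak(log_mag, k)[1])
    levels = np.asarray(levels)
    return levels[1:] - levels[0]                 # relative to fundamental
```

Quadratic interpolation of the log magnitude is exact for a Gaussian window's main lobe; for other windows such as the Hann it introduces a small bias that shrinks with zero-padding, which is the subject of the Abe and Smith paper the abstract cites.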
Saved in:
Published in: | The Journal of the Acoustical Society of America 2006-11, Vol.120 (5_Supplement), p.3029-3029 |
---|---|
Main authors: | Cassidy, Ryan J.; Mysore, Gautham J. |
Format: | Article |
Language: | English |
Online access: | Full text |
container_end_page | 3029 |
---|---|
container_issue | 5_Supplement |
container_start_page | 3029 |
container_title | The Journal of the Acoustical Society of America |
container_volume | 120 |
creator | Cassidy, Ryan J. Mysore, Gautham J. |
description | The automatic detection of portions of a musical signal produced according to time-varying performance parameters is an important problem in musical signal processing. The present work attempts such a task: the algorithms presented seek to determine from a sung input signal which portions of the signal are sung using the head voice, also known as falsetto in the case of a male singer. In the authors’ prior work [Mysore et al., Asilomar Conf. Signal. Sys. Comp. (2006) (submitted)], a machine learning technique known as a support vector classifier [Boyd and Vandenberghe, 2004] was used to identify falsetto portions of a sung signal using the mel-frequency cepstral coefficients (MFCCs) of that signal (computed at a frame rate of 50 Hz). In the present work, the time-varying amplitudes of the first four harmonics, relative to the intensity of the fundamental, and as estimated by the quadratically interpolated fast Fourier transform (QIFFT) [Abe and Smith, ICASSP 2005], are used as a basis for classification. Preliminary experiments show a successful classification rate of over 95% for the QIFFT-based technique, compared to approximately 90% success with the prior MFCC-based approach. [Ryan J. Cassidy supported by the Natural Sciences and Engineering Research Council of Canada.] |
doi_str_mv | 10.1121/1.4787140 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 0001-4966 |
ispartof | The Journal of the Acoustical Society of America, 2006-11, Vol.120 (5_Supplement), p.3029-3029 |
issn | 0001-4966 1520-8524 |
language | eng |
source | AIP Journals Complete; AIP Acoustical Society of America |
title | Automatic detection of head voice in sung musical signals via machine learning classification of time-varying partial intensities |
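The abstract's other ingredient, the support vector classifier, is not detailed in this record. As an illustrative stand-in (not the authors' method), the sketch below trains a linear SVM with the Pegasos stochastic sub-gradient method on per-frame feature vectors such as the four relative harmonic levels; all data, labels, and parameters here are hypothetical.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=100, seed=0):
    """Linear SVM via the Pegasos stochastic sub-gradient method.

    X: (n, d) feature matrix (append a constant column for a bias term);
    y: labels in {-1, +1}. Minimizes the L2-regularized hinge loss.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, t = np.zeros(d), 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)                       # decaying step size
            if y[i] * (X[i] @ w) < 1.0:                 # hinge margin violated
                w = (1.0 - eta * lam) * w + eta * y[i] * X[i]
            else:
                w = (1.0 - eta * lam) * w               # shrink (regularizer only)
    return w

def predict(w, X):
    """Classify rows of X as +1 or -1 by the sign of the score."""
    return np.where(X @ w >= 0.0, 1, -1)
```

One plausible framing, consistent with the abstract: head-voice frames tend to carry weaker upper partials than modal-voice frames, so the two classes form separable clusters in the space of relative harmonic levels, and a per-frame classifier labels each 20 ms frame as head voice or not.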