On the Parameter Estimation of Sinusoidal Models for Speech and Audio Signals

In this paper, we examine the parameter estimation performance of three well-known sinusoidal models for speech and audio. The first one is the standard Sinusoidal Model (SM), which is based on the Fast Fourier Transform (FFT). The second is the Exponentially Damped Sinusoidal Model (EDSM) which has...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	arXiv.org 2024-01
1. Verfasser:	Kafentzis, George P
Format:	Artikel
Sprache:	eng
Schlagworte:	Audio signals Basis functions Fast Fourier transformations Fourier transforms Mathematical models Parameter estimation Parameter robustness Signal reconstruction Sine waves Speech Subspace methods
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Kafentzis, George P
description	In this paper, we examine the parameter estimation performance of three well-known sinusoidal models for speech and audio. The first one is the standard Sinusoidal Model (SM), which is based on the Fast Fourier Transform (FFT). The second is the Exponentially Damped Sinusoidal Model (EDSM) which has been proposed in the last decade, and utilizes a subspace method for parameter estimation, and finally the extended adaptive Quasi-Harmonic Model (eaQHM), which has been recently proposed for AM-FM decomposition, and estimates the signal parameters using Least Squares on a set of basis function that are adaptive to the local characteristics of the signal. The parameter estimation of each model is briefly described and its performance is compared to the others in terms of signal reconstruction accuracy versus window size on a variety of synthetic signals and versus the number of sinusoids on real signals. The latter include highly non stationary signals, such as singing voices and guitar solos. The advantages and disadvantages of each model are presented via synthetic signals and then the application on real signals is discussed. Conclusively, eaQHM outperforms EDS in medium-to-large window size analysis, whereas EDSM yields higher reconstruction values for smaller analysis window sizes. Thus, a future research direction appears to be the merge of adaptivity of the eaQHM and parameter estimation robustness of the EDSM in a new paradigm for high-quality analysis and resynthesis of general audio signals.
format	Article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2909328035</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2909328035</sourcerecordid><originalsourceid>FETCH-proquest_journals_29093280353</originalsourceid><addsrcrecordid>eNqNyssKgkAUANAhCJLyHy60FqaZLF1GGG2kwPYy5DVHdK7N4_8z6ANanc1ZsEhIuUuyvRArFjvXc87F4SjSVEasvBnwHcJdWTWiRwuF83pUXpMBaqHSJjjSjRqgpAYHBy1ZqCbEZwfKNHAKjaa5vYwa3IYt2xmMf67Z9lI8ztdksvQO6HzdU7DfWYuc51JkXKbyv_UBOpY9KQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2909328035</pqid></control><display><type>article</type><title>On the Parameter Estimation of Sinusoidal Models for Speech and Audio Signals</title><source>Free E- Journals</source><creator>Kafentzis, George P</creator><creatorcontrib>Kafentzis, George P</creatorcontrib><description>In this paper, we examine the parameter estimation performance of three well-known sinusoidal models for speech and audio. The first one is the standard Sinusoidal Model (SM), which is based on the Fast Fourier Transform (FFT). The second is the Exponentially Damped Sinusoidal Model (EDSM) which has been proposed in the last decade, and utilizes a subspace method for parameter estimation, and finally the extended adaptive Quasi-Harmonic Model (eaQHM), which has been recently proposed for AM-FM decomposition, and estimates the signal parameters using Least Squares on a set of basis function that are adaptive to the local characteristics of the signal. The parameter estimation of each model is briefly described and its performance is compared to the others in terms of signal reconstruction accuracy versus window size on a variety of synthetic signals and versus the number of sinusoids on real signals. The latter include highly non stationary signals, such as singing voices and guitar solos. The advantages and disadvantages of each model are presented via synthetic signals and then the application on real signals is discussed. Conclusively, eaQHM outperforms EDS in medium-to-large window size analysis, whereas EDSM yields higher reconstruction values for smaller analysis window sizes. Thus, a future research direction appears to be the merge of adaptivity of the eaQHM and parameter estimation robustness of the EDSM in a new paradigm for high-quality analysis and resynthesis of general audio signals.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Audio signals ; Basis functions ; Fast Fourier transformations ; Fourier transforms ; Mathematical models ; Parameter estimation ; Parameter robustness ; Signal reconstruction ; Sine waves ; Speech ; Subspace methods</subject><ispartof>arXiv.org, 2024-01</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,784</link.rule.ids></links><search><creatorcontrib>Kafentzis, George P</creatorcontrib><title>On the Parameter Estimation of Sinusoidal Models for Speech and Audio Signals</title><title>arXiv.org</title><description>In this paper, we examine the parameter estimation performance of three well-known sinusoidal models for speech and audio. The first one is the standard Sinusoidal Model (SM), which is based on the Fast Fourier Transform (FFT). The second is the Exponentially Damped Sinusoidal Model (EDSM) which has been proposed in the last decade, and utilizes a subspace method for parameter estimation, and finally the extended adaptive Quasi-Harmonic Model (eaQHM), which has been recently proposed for AM-FM decomposition, and estimates the signal parameters using Least Squares on a set of basis function that are adaptive to the local characteristics of the signal. The parameter estimation of each model is briefly described and its performance is compared to the others in terms of signal reconstruction accuracy versus window size on a variety of synthetic signals and versus the number of sinusoids on real signals. The latter include highly non stationary signals, such as singing voices and guitar solos. The advantages and disadvantages of each model are presented via synthetic signals and then the application on real signals is discussed. Conclusively, eaQHM outperforms EDS in medium-to-large window size analysis, whereas EDSM yields higher reconstruction values for smaller analysis window sizes. Thus, a future research direction appears to be the merge of adaptivity of the eaQHM and parameter estimation robustness of the EDSM in a new paradigm for high-quality analysis and resynthesis of general audio signals.</description><subject>Audio signals</subject><subject>Basis functions</subject><subject>Fast Fourier transformations</subject><subject>Fourier transforms</subject><subject>Mathematical models</subject><subject>Parameter estimation</subject><subject>Parameter robustness</subject><subject>Signal reconstruction</subject><subject>Sine waves</subject><subject>Speech</subject><subject>Subspace methods</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNyssKgkAUANAhCJLyHy60FqaZLF1GGG2kwPYy5DVHdK7N4_8z6ANanc1ZsEhIuUuyvRArFjvXc87F4SjSVEasvBnwHcJdWTWiRwuF83pUXpMBaqHSJjjSjRqgpAYHBy1ZqCbEZwfKNHAKjaa5vYwa3IYt2xmMf67Z9lI8ztdksvQO6HzdU7DfWYuc51JkXKbyv_UBOpY9KQ</recordid><startdate>20240102</startdate><enddate>20240102</enddate><creator>Kafentzis, George P</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240102</creationdate><title>On the Parameter Estimation of Sinusoidal Models for Speech and Audio Signals</title><author>Kafentzis, George P</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_29093280353</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Audio signals</topic><topic>Basis functions</topic><topic>Fast Fourier transformations</topic><topic>Fourier transforms</topic><topic>Mathematical models</topic><topic>Parameter estimation</topic><topic>Parameter robustness</topic><topic>Signal reconstruction</topic><topic>Sine waves</topic><topic>Speech</topic><topic>Subspace methods</topic><toplevel>online_resources</toplevel><creatorcontrib>Kafentzis, George P</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Access via ProQuest (Open Access)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kafentzis, George P</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>On the Parameter Estimation of Sinusoidal Models for Speech and Audio Signals</atitle><jtitle>arXiv.org</jtitle><date>2024-01-02</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>In this paper, we examine the parameter estimation performance of three well-known sinusoidal models for speech and audio. The first one is the standard Sinusoidal Model (SM), which is based on the Fast Fourier Transform (FFT). The second is the Exponentially Damped Sinusoidal Model (EDSM) which has been proposed in the last decade, and utilizes a subspace method for parameter estimation, and finally the extended adaptive Quasi-Harmonic Model (eaQHM), which has been recently proposed for AM-FM decomposition, and estimates the signal parameters using Least Squares on a set of basis function that are adaptive to the local characteristics of the signal. The parameter estimation of each model is briefly described and its performance is compared to the others in terms of signal reconstruction accuracy versus window size on a variety of synthetic signals and versus the number of sinusoids on real signals. The latter include highly non stationary signals, such as singing voices and guitar solos. The advantages and disadvantages of each model are presented via synthetic signals and then the application on real signals is discussed. Conclusively, eaQHM outperforms EDS in medium-to-large window size analysis, whereas EDSM yields higher reconstruction values for smaller analysis window sizes. Thus, a future research direction appears to be the merge of adaptivity of the eaQHM and parameter estimation robustness of the EDSM in a new paradigm for high-quality analysis and resynthesis of general audio signals.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2024-01
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2909328035
source	Free E- Journals
subjects	Audio signals Basis functions Fast Fourier transformations Fourier transforms Mathematical models Parameter estimation Parameter robustness Signal reconstruction Sine waves Speech Subspace methods
title	On the Parameter Estimation of Sinusoidal Models for Speech and Audio Signals
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T08%3A49%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=On%20the%20Parameter%20Estimation%20of%20Sinusoidal%20Models%20for%20Speech%20and%20Audio%20Signals&rft.jtitle=arXiv.org&rft.au=Kafentzis,%20George%20P&rft.date=2024-01-02&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2909328035%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2909328035&rft_id=info:pmid/&rfr_iscdi=true