Statistical analysis of the autoregressive modeling of reverberant speech

Hands-free speech input is required in many modern telecommunication applications that employ autoregressive (AR) techniques such as linear predictive coding. When the hands-free input is obtained in enclosed reverberant spaces such as typical office rooms, the speech signal is distorted by the room...

Bibliographic details
Published in:	The Journal of the Acoustical Society of America 2006-12, Vol.120 (6), p.4031-4039
Main authors:	Gaubitch, Nikolay D.; Ward, Darren B.; Naylor, Patrick A.
Format:	Article
Language:	English
Subjects:	Acoustic signal processing; Acoustics; Architectural acoustics; Exact sciences and technology; Fundamental areas of phenomenology (including applications); Humans; Models, Biological; Physics; Speech Perception; Speech Production Measurement
Online access:	Full text

container_end_page	4039
container_issue	6
container_start_page	4031
container_title	The Journal of the Acoustical Society of America
container_volume	120
creator	Gaubitch, Nikolay D.; Ward, Darren B.; Naylor, Patrick A.
description	Hands-free speech input is required in many modern telecommunication applications that employ autoregressive (AR) techniques such as linear predictive coding. When the hands-free input is obtained in enclosed reverberant spaces such as typical office rooms, the speech signal is distorted by the room transfer function. This paper utilizes theoretical results from statistical room acoustics to analyze the AR modeling of speech under these reverberant conditions. Three cases are considered: (i) AR coefficients calculated from a single observation; (ii) AR coefficients calculated jointly from an M-channel observation (M > 1); and (iii) AR coefficients calculated from the output of a delay-and-sum beamformer. The statistical analysis, with supporting simulations, shows that the spatial expectation of the AR coefficients for cases (i) and (ii) is approximately equal to the coefficients of the original speech, while for case (iii) there is a discrepancy, which can be significant, due to spatial correlation between the microphones. It is subsequently demonstrated that at each individual source-microphone position (without spatial expectation), the M-channel AR coefficients from case (ii) provide the best approximation to the clean speech coefficients when microphones are closely spaced (< 0.3 m).
doi_str_mv	10.1121/1.2356840
format	Article
addtitle	J Acoust Soc Am
coden	JASMAN
eissn	1520-8524
pmid	17225429
publisher	Woodbury, NY: Acoustical Society of America
rights	2006 Acoustical Society of America; 2007 INIST-CNRS
tpages	9
linktohtml	https://pubs.aip.org/jasa/article-lookup/doi/10.1121/1.2356840
backlink	http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=18367067 (View record in Pascal Francis)
backlink	https://www.ncbi.nlm.nih.gov/pubmed/17225429 (View this record in MEDLINE/PubMed)
fulltext	fulltext
identifier	ISSN: 0001-4966
ispartof	The Journal of the Acoustical Society of America, 2006-12, Vol.120 (6), p.4031-4039
issn	0001-4966 1520-8524
language	eng
recordid	cdi_proquest_miscellaneous_85660095
source	MEDLINE; AIP Journals Complete; AIP Acoustical Society of America
subjects	Acoustic signal processing; Acoustics; Architectural acoustics; Exact sciences and technology; Fundamental areas of phenomenology (including applications); Humans; Models, Biological; Physics; Speech Perception; Speech Production Measurement
title	Statistical analysis of the autoregressive modeling of reverberant speech
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T13%3A54%3A57IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Statistical%20analysis%20of%20the%20autoregressive%20modeling%20of%20reverberant%20speech&rft.jtitle=The%20Journal%20of%20the%20Acoustical%20Society%20of%20America&rft.au=Gaubitch,%20Nikolay%20D.&rft.date=2006-12&rft.volume=120&rft.issue=6&rft.spage=4031&rft.epage=4039&rft.pages=4031-4039&rft.issn=0001-4966&rft.eissn=1520-8524&rft.coden=JASMAN&rft_id=info:doi/10.1121/1.2356840&rft_dat=%3Cproquest_cross%3E68295369%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=68295369&rft_id=info:pmid/17225429&rfr_iscdi=true