Comparing performance of spectral distance measures and neural network methods for vowel recognition

Neural networks were trained to classify single 20 ms frames of vowels using either perceptually-based spectral representations or LPC spectra as input. Classification performance was compared with performance of several distance measures using nearest-neighbor and mean-distance decision criteria. T...

Bibliographic details
Published in: Computer speech & language 1989, Vol.3 (1), p.21-34
Main authors: Kamm, Candace A., Streeter, Lynn A., Kane-Esrig, Yana, Burr, David J.
Format: Article
Language: eng
Online access: Full text
container_end_page 34
container_issue 1
container_start_page 21
container_title Computer speech & language
container_volume 3
creator Kamm, Candace A.
Streeter, Lynn A.
Kane-Esrig, Yana
Burr, David J.
description Neural networks were trained to classify single 20 ms frames of vowels using either perceptually-based spectral representations or LPC spectra as input. Classification performance was compared with performance of several distance measures using nearest-neighbor and mean-distance decision criteria. The non-network distance measures included LPC-residual and cepstral distance measures used in conventional automatic speech recognition systems, as well as a formant-based measure and a new elastic distance measure that explicitly corrects for the effects of spectral tilt. Using an optimal error rate criterion, vowels were discriminated best using the elastic distance measure with the perceptually-based spectrum. Neural networks with LPC spectra as input performed comparably to the better conventional distance measures. While the performance of networks trained with perceptually-based spectral inputs was poorer than that of networks trained with LPC spectra, the features represented by the hidden nodes of this network were more consistent with factors related to human vowel perception.
doi_str_mv 10.1016/0885-2308(89)90012-0
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_85332337</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>0885230889900120</els_id><sourcerecordid>58170774</sourcerecordid><originalsourceid>FETCH-LOGICAL-c311t-359c6789ff5041bd5f6ef2118b7dc560024c12c2898d8db852091e5ecc9cfafa3</originalsourceid><addsrcrecordid>eNqFkU2LFDEQhoMoOK7-Aw8BQfTQWkk66eQiyOAXLHjZPYdMUlmzdnfapHuX_feb2REPHvRUUPXUC_UUIS8ZvGPA1HvQWnZcgH6jzVsDwHgHj8iOgZGdFko8Jrs_yFPyrNZrAFCyH3Yk7PO0uJLmK7pgiblMbvZIc6R1Qb8WN9KQ6vrQnNDVrWClbg50xu04nHG9zeVnm60_cqi0JdCbfIsjLejz1ZzWlOfn5El0Y8UXv-sZufz86WL_tTv__uXb_uN55wVjayek8WrQJkYJPTsEGRVGzpg-DMFLBcB7z7jn2uigw0FLDoahRO-Njy46cUZen3KXkn9tWFc7pepxHN2MeatWSyG4EMN_QanZAMPQN_DVX-B13srcjrBMgFJS6Z43qj9RvuRaC0a7lDS5cmcZ2OOH7FG_Peq32tiHD1loax9Oa9ic3CQstvqEzXRITd5qQ07_DrgH0-iYsw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1306656842</pqid></control><display><type>article</type><title>Comparing performance of spectral distance measures and neural network methods for vowel recognition</title><source>Elsevier ScienceDirect Journals</source><source>Periodicals Index Online</source><creator>Kamm, Candace A. ; Streeter, Lynn A. ; Kane-Esrig, Yana ; Burr, David J.</creator><creatorcontrib>Kamm, Candace A. ; Streeter, Lynn A. ; Kane-Esrig, Yana ; Burr, David J.</creatorcontrib><description>Neural networks were trained to classify single 20 ms frames of vowels using either perceptually-based spectral representations or LPC spectra as input. Classification performance was compared with performance of several distance measures using nearest-neighbor and mean-distance decision criteria. 
The non-network distance measures included LPC-residual and cepstral distance measures used in conventional automatic speech recognition systems, as well as a formant-based measure and a new elastic distance measure that explicitly corrects for the effects of spectral tilt. Using an optimal error rate criterion, vowels were discriminated best using the elastic distance measure with the perceptually-based spectrum. Neural networks with LPC spectra as input performed comparably to the better conventional distance measures. While the performance of networks trained with perceptually-based spectral inputs was poorer than that of networks trained with LPC spectra, the features represented by the hidden nodes of this network were more consistent with factors related to human vowel perception.</description><identifier>ISSN: 0885-2308</identifier><identifier>EISSN: 1095-8363</identifier><identifier>DOI: 10.1016/0885-2308(89)90012-0</identifier><identifier>CODEN: CSPLEO</identifier><language>eng</language><publisher>London: Elsevier Ltd</publisher><ispartof>Computer speech &amp; language, 1989, Vol.3 (1), p.21-34</ispartof><rights>1989</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c311t-359c6789ff5041bd5f6ef2118b7dc560024c12c2898d8db852091e5ecc9cfafa3</citedby><cites>FETCH-LOGICAL-c311t-359c6789ff5041bd5f6ef2118b7dc560024c12c2898d8db852091e5ecc9cfafa3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/0885-2308(89)90012-0$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,776,780,3536,4009,27848,27902,27903,27904,45974</link.rule.ids></links><search><creatorcontrib>Kamm, Candace A.</creatorcontrib><creatorcontrib>Streeter, Lynn A.</creatorcontrib><creatorcontrib>Kane-Esrig, Yana</creatorcontrib><creatorcontrib>Burr, David 
J.</creatorcontrib><title>Comparing performance of spectral distance measures and neural network methods for vowel recognition</title><title>Computer speech &amp; language</title><description>Neural networks were trained to classify single 20 ms frames of vowels using either perceptually-based spectral representations or LPC spectra as input. Classification performance was compared with performance of several distance measures using nearest-neighbor and mean-distance decision criteria. The non-network distance measures included LPC-residual and cepstral distance measures used in conventional automatic speech recognition systems, as well as a formant-based measure and a new elastic distance measure that explicitly corrects for the effects of spectral tilt. Using an optimal error rate criterion, vowels were discriminated best using the elastic distance measure with the perceptually-based spectrum. Neural networks with LPC spectra as input performed comparably to the better conventional distance measures. 
While the performance of networks trained with perceptually-based spectral inputs was poorer than that of networks trained with LPC spectra, the features represented by the hidden nodes of this network were more consistent with factors related to human vowel perception.</description><issn>0885-2308</issn><issn>1095-8363</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>1989</creationdate><recordtype>article</recordtype><sourceid>K30</sourceid><recordid>eNqFkU2LFDEQhoMoOK7-Aw8BQfTQWkk66eQiyOAXLHjZPYdMUlmzdnfapHuX_feb2REPHvRUUPXUC_UUIS8ZvGPA1HvQWnZcgH6jzVsDwHgHj8iOgZGdFko8Jrs_yFPyrNZrAFCyH3Yk7PO0uJLmK7pgiblMbvZIc6R1Qb8WN9KQ6vrQnNDVrWClbg50xu04nHG9zeVnm60_cqi0JdCbfIsjLejz1ZzWlOfn5El0Y8UXv-sZufz86WL_tTv__uXb_uN55wVjayek8WrQJkYJPTsEGRVGzpg-DMFLBcB7z7jn2uigw0FLDoahRO-Njy46cUZen3KXkn9tWFc7pepxHN2MeatWSyG4EMN_QanZAMPQN_DVX-B13srcjrBMgFJS6Z43qj9RvuRaC0a7lDS5cmcZ2OOH7FG_Peq32tiHD1loax9Oa9ic3CQstvqEzXRITd5qQ07_DrgH0-iYsw</recordid><startdate>1989</startdate><enddate>1989</enddate><creator>Kamm, Candace A.</creator><creator>Streeter, Lynn A.</creator><creator>Kane-Esrig, Yana</creator><creator>Burr, David J.</creator><general>Elsevier Ltd</general><general>Academic 
Press</general><scope>AAYXX</scope><scope>CITATION</scope><scope>HVZBN</scope><scope>K30</scope><scope>PAAUG</scope><scope>PAWHS</scope><scope>PAWZZ</scope><scope>PAXOH</scope><scope>PBHAV</scope><scope>PBQSW</scope><scope>PBYQZ</scope><scope>PCIWU</scope><scope>PCMID</scope><scope>PCZJX</scope><scope>PDGRG</scope><scope>PDWWI</scope><scope>PETMR</scope><scope>PFVGT</scope><scope>PGXDX</scope><scope>PIHIL</scope><scope>PISVA</scope><scope>PJCTQ</scope><scope>PJTMS</scope><scope>PLCHJ</scope><scope>PMHAD</scope><scope>PNQDJ</scope><scope>POUND</scope><scope>PPLAD</scope><scope>PQAPC</scope><scope>PQCAN</scope><scope>PQCMW</scope><scope>PQEME</scope><scope>PQHKH</scope><scope>PQMID</scope><scope>PQNCT</scope><scope>PQNET</scope><scope>PQSCT</scope><scope>PQSET</scope><scope>PSVJG</scope><scope>PVMQY</scope><scope>PZGFC</scope><scope>7T9</scope><scope>8BM</scope></search><sort><creationdate>1989</creationdate><title>Comparing performance of spectral distance measures and neural network methods for vowel recognition</title><author>Kamm, Candace A. ; Streeter, Lynn A. 
; Kane-Esrig, Yana ; Burr, David J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c311t-359c6789ff5041bd5f6ef2118b7dc560024c12c2898d8db852091e5ecc9cfafa3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>1989</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Kamm, Candace A.</creatorcontrib><creatorcontrib>Streeter, Lynn A.</creatorcontrib><creatorcontrib>Kane-Esrig, Yana</creatorcontrib><creatorcontrib>Burr, David J.</creatorcontrib><collection>CrossRef</collection><collection>Periodicals Index Online Segment 24</collection><collection>Periodicals Index Online</collection><collection>Primary Sources Access—Foundation Edition (Plan E) - West</collection><collection>Primary Sources Access (Plan D) - International</collection><collection>Primary Sources Access &amp; Build (Plan A) - MEA</collection><collection>Primary Sources Access—Foundation Edition (Plan E) - Midwest</collection><collection>Primary Sources Access—Foundation Edition (Plan E) - Northeast</collection><collection>Primary Sources Access (Plan D) - Southeast</collection><collection>Primary Sources Access (Plan D) - North Central</collection><collection>Primary Sources Access—Foundation Edition (Plan E) - Southeast</collection><collection>Primary Sources Access (Plan D) - South Central</collection><collection>Primary Sources Access &amp; Build (Plan A) - UK / I</collection><collection>Primary Sources Access (Plan D) - Canada</collection><collection>Primary Sources Access (Plan D) - EMEALA</collection><collection>Primary Sources Access—Foundation Edition (Plan E) - North Central</collection><collection>Primary Sources Access—Foundation Edition (Plan E) - South Central</collection><collection>Primary Sources Access &amp; Build (Plan A) - International</collection><collection>Primary Sources Access—Foundation Edition (Plan E) - 
International</collection><collection>Primary Sources Access (Plan D) - West</collection><collection>Periodicals Index Online Segments 1-50</collection><collection>Primary Sources Access (Plan D) - APAC</collection><collection>Primary Sources Access (Plan D) - Midwest</collection><collection>Primary Sources Access (Plan D) - MEA</collection><collection>Primary Sources Access—Foundation Edition (Plan E) - Canada</collection><collection>Primary Sources Access—Foundation Edition (Plan E) - UK / I</collection><collection>Primary Sources Access—Foundation Edition (Plan E) - EMEALA</collection><collection>Primary Sources Access &amp; Build (Plan A) - APAC</collection><collection>Primary Sources Access &amp; Build (Plan A) - Canada</collection><collection>Primary Sources Access &amp; Build (Plan A) - West</collection><collection>Primary Sources Access &amp; Build (Plan A) - EMEALA</collection><collection>Primary Sources Access (Plan D) - Northeast</collection><collection>Primary Sources Access &amp; Build (Plan A) - Midwest</collection><collection>Primary Sources Access &amp; Build (Plan A) - North Central</collection><collection>Primary Sources Access &amp; Build (Plan A) - Northeast</collection><collection>Primary Sources Access &amp; Build (Plan A) - South Central</collection><collection>Primary Sources Access &amp; Build (Plan A) - Southeast</collection><collection>Primary Sources Access (Plan D) - UK / I</collection><collection>Primary Sources Access—Foundation Edition (Plan E) - APAC</collection><collection>Primary Sources Access—Foundation Edition (Plan E) - MEA</collection><collection>Linguistics and Language Behavior Abstracts (LLBA)</collection><collection>ComDisDome</collection><jtitle>Computer speech &amp; language</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kamm, Candace A.</au><au>Streeter, Lynn A.</au><au>Kane-Esrig, Yana</au><au>Burr, David 
J.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Comparing performance of spectral distance measures and neural network methods for vowel recognition</atitle><jtitle>Computer speech &amp; language</jtitle><date>1989</date><risdate>1989</risdate><volume>3</volume><issue>1</issue><spage>21</spage><epage>34</epage><pages>21-34</pages><issn>0885-2308</issn><eissn>1095-8363</eissn><coden>CSPLEO</coden><abstract>Neural networks were trained to classify single 20 ms frames of vowels using either perceptually-based spectral representations or LPC spectra as input. Classification performance was compared with performance of several distance measures using nearest-neighbor and mean-distance decision criteria. The non-network distance measures included LPC-residual and cepstral distance measures used in conventional automatic speech recognition systems, as well as a formant-based measure and a new elastic distance measure that explicitly corrects for the effects of spectral tilt. Using an optimal error rate criterion, vowels were discriminated best using the elastic distance measure with the perceptually-based spectrum. Neural networks with LPC spectra as input performed comparably to the better conventional distance measures. While the performance of networks trained with perceptually-based spectral inputs was poorer than that of networks trained with LPC spectra, the features represented by the hidden nodes of this network were more consistent with factors related to human vowel perception.</abstract><cop>London</cop><pub>Elsevier Ltd</pub><doi>10.1016/0885-2308(89)90012-0</doi><tpages>14</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0885-2308
ispartof Computer speech & language, 1989, Vol.3 (1), p.21-34
issn 0885-2308
1095-8363
language eng
recordid cdi_proquest_miscellaneous_85332337
source Elsevier ScienceDirect Journals; Periodicals Index Online
title Comparing performance of spectral distance measures and neural network methods for vowel recognition
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T19%3A44%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Comparing%20performance%20of%20spectral%20distance%20measures%20and%20neural%20network%20methods%20for%20vowel%20recognition&rft.jtitle=Computer%20speech%20&%20language&rft.au=Kamm,%20Candace%20A.&rft.date=1989&rft.volume=3&rft.issue=1&rft.spage=21&rft.epage=34&rft.pages=21-34&rft.issn=0885-2308&rft.eissn=1095-8363&rft.coden=CSPLEO&rft_id=info:doi/10.1016/0885-2308(89)90012-0&rft_dat=%3Cproquest_cross%3E58170774%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1306656842&rft_id=info:pmid/&rft_els_id=0885230889900120&rfr_iscdi=true