Automatic Recognition of Spoken Digits
The recognizer discussed will automatically recognize telephone-quality digits spoken at normal speech rates by a single individual, with an accuracy varying between 97 and 99 percent. After some preliminary analysis of the speech of any individual, the circuit can be adjusted to deliver a similar a...
Published in: | The Journal of the Acoustical Society of America 1952-11, Vol.24 (6), p.637-642 |
---|---|
Main authors: | Davis, K. H.; Biddulph, R.; Balashek, S. |
Format: | Article |
Language: | eng |
Online access: | Full text |
container_end_page | 642 |
---|---|
container_issue | 6 |
container_start_page | 637 |
container_title | The Journal of the Acoustical Society of America |
container_volume | 24 |
creator | Davis, K. H.; Biddulph, R.; Balashek, S. |
description | The recognizer discussed will automatically recognize telephone-quality digits spoken at normal speech rates by a single individual, with an accuracy varying between 97 and 99 percent. After some preliminary analysis of the speech of any individual, the circuit can be adjusted to deliver a similar accuracy on the speech of that individual. The circuit is not, however, in its present configuration, capable of performing equally well on the speech of a series of talkers without recourse to such adjustment.
Circuitry involves division of the speech spectrum into two frequency bands, one below and the other above 900 cps. Axis-crossing counts are then individually made of both band energies to determine the frequency of the maximum syllabic rate energy with each band. Simultaneous two-dimensional frequency portrayal is found to possess recognition significance. Standards are then determined, one for each digit of the ten-digit series, and are built into the recognizer as a form of elemental memory. By means of a series of calculations performed automatically on the spoken input digit, a best match type comparison is made with each of the ten standard digit patterns and the digit of best match selected. |
doi_str_mv | 10.1121/1.1906946 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 0001-4966 |
ispartof | The Journal of the Acoustical Society of America, 1952-11, Vol.24 (6), p.637-642 |
issn | 0001-4966 1520-8524 |
language | eng |
recordid | cdi_crossref_primary_10_1121_1_1906946 |
source | AIP Acoustical Society of America |
title | Automatic Recognition of Spoken Digits |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-09T18%3A00%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Automatic%20Recognition%20of%20Spoken%20Digits&rft.jtitle=The%20Journal%20of%20the%20Acoustical%20Society%20of%20America&rft.au=Davis,%20K.%20H.&rft.date=1952-11-01&rft.volume=24&rft.issue=6&rft.spage=637&rft.epage=642&rft.pages=637-642&rft.issn=0001-4966&rft.eissn=1520-8524&rft_id=info:doi/10.1121/1.1906946&rft_dat=%3Ccrossref%3E10_1121_1_1906946%3C/crossref%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
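
The processing chain summarized in the description (split the spectrum at 900 cps, count axis crossings in each band, compare against ten stored standards) can be sketched in software. The following is only an illustrative sketch, not the authors' analog circuit. The one-pole filter, the function names, the reduction of each digit to a single pair of band frequencies, and the Euclidean nearest-match rule are all assumptions introduced here. Any stored standards would have to be measured from a real talker; the values in the usage note are made up.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """First-order low-pass filter (assumed stand-in for the analog band filter)."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


def crossing_frequency(samples, sample_rate):
    """Estimate the dominant frequency from the axis-crossing count.

    A sinusoid crosses the axis twice per cycle, so the estimate is
    crossings / (2 * duration).
    """
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0.0) != (b < 0.0)
    )
    duration = len(samples) / sample_rate
    return crossings / (2.0 * duration)


def band_features(samples, sample_rate, split_hz=900.0):
    """Return (low-band frequency, high-band frequency) for one utterance."""
    low = one_pole_lowpass(samples, split_hz, sample_rate)
    # The high band is taken as the residual; a real design would use a
    # sharper filter pair.
    high = [x - l for x, l in zip(samples, low)]
    return (
        crossing_frequency(low, sample_rate),
        crossing_frequency(high, sample_rate),
    )


def best_match(features, standards):
    """Pick the stored standard closest to the measured feature pair."""
    return min(standards, key=lambda digit: math.dist(features, standards[digit]))
```

For example, with hypothetical standards `{"1": (300.0, 2000.0), "2": (500.0, 1200.0)}`, calling `best_match((480.0, 1300.0), standards)` selects `"2"`. A closer imitation of the description would compare the full two-dimensional frequency pattern over the duration of the digit rather than a single point.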