A model for context effects in speech recognition
A model is presented that quantifies the effect of context on speech recognition. In this model, a speech stimulus is considered as a concatenation of a number of equivalent elements (e.g., phonemes constituting a word). The model employs probabilities that individual elements are recognized and chances that missed elements are guessed using contextual information. Predictions are given of the probability that the entire stimulus, or part of it, is reproduced correctly. The model can be applied to both speech recognition and visual recognition of printed text.
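The probabilistic structure described in the abstract — each element recognized with some probability, and each missed element possibly guessed from context — can be illustrated with a toy calculation. This is a minimal sketch under a simple independence assumption, not the authors' published formulation; the function name and the parameters `n`, `p`, and `c` are hypothetical illustrations.

```python
from math import comb

def word_recognition_prob(n: int, p: float, c: float) -> float:
    """Toy estimate of whole-word recognition probability.

    n: number of equivalent elements (e.g., phonemes) in the word
    p: probability that a single element is recognized directly
    c: probability that a single missed element is guessed from context

    Assumes all elements are independent and that each missed element is
    guessed independently with probability c -- a simplification; the
    published model treats guessing via contextual information in more
    detail.
    """
    total = 0.0
    for k in range(n + 1):  # k = number of elements recognized directly
        p_k_recognized = comb(n, k) * p**k * (1 - p) ** (n - k)
        total += p_k_recognized * c ** (n - k)  # remaining n-k elements guessed
    return total
```

Under these assumptions the sum collapses, by the binomial theorem, to the closed form `(p + (1 - p) * c) ** n`: each element is correct either directly (probability `p`) or by a contextual guess (probability `(1 - p) * c`).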
Published in: | The Journal of the Acoustical Society of America 1993, Vol.93 (1), p.499-509 |
---|---|
Main authors: | BRONKHORST, A. W; BOSMAN, A. J; SMOORENBURG, G. F |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 509 |
---|---|
container_issue | 1 |
container_start_page | 499 |
container_title | The Journal of the Acoustical Society of America |
container_volume | 93 |
creator | BRONKHORST, A. W; BOSMAN, A. J; SMOORENBURG, G. F |
description | A model is presented that quantifies the effect of context on speech recognition. In this model, a speech stimulus is considered as a concatenation of a number of equivalent elements (e.g., phonemes constituting a word). The model employs probabilities that individual elements are recognized and chances that missed elements are guessed using contextual information. Predictions are given of the probability that the entire stimulus, or part of it, is reproduced correctly. The model can be applied to both speech recognition and visual recognition of printed text. It has been verified with data obtained with syllables of the consonant-vowel-consonant (CVC) type presented near the reception threshold in quiet and in noise, with the results of an experiment using orthographic presentation of incomplete CVC syllables and with results of word counts in a CVC lexicon. A remarkable outcome of the analysis is that the cues which occur only in spoken language (e.g., coarticulatory cues) seem to have a much greater influence on recognition performance when the stimuli are presented near the threshold in noise than when they are presented near the absolute threshold. Demonstrations are given of further predictions provided by the model: word recognition as a function of signal-to-noise ratio, closed-set word recognition, recognition of interrupted speech, and sentence recognition. |
doi_str_mv | 10.1121/1.406844 |
format | Article |
fullrecord | PMID: 8423265; EISSN: 1520-8524; CODEN: JASMAN; Publisher: Acoustical Society of America, Woodbury, NY; Rights: 1993 INIST-CNRS; peer reviewed |
fulltext | fulltext |
identifier | ISSN: 0001-4966 |
ispartof | The Journal of the Acoustical Society of America, 1993, Vol.93 (1), p.499-509 |
issn | 0001-4966; 1520-8524 |
language | eng |
recordid | cdi_proquest_miscellaneous_75571809 |
source | MEDLINE; AIP Acoustical Society of America |
subjects | Acoustic Stimulation; Adult; Biological and medical sciences; Female; Fundamental and applied biological sciences. Psychology; Humans; Language; Male; Photic Stimulation; Production and perception of spoken language; Psychoacoustics; Psycholinguistics; Psychology. Psychoanalysis. Psychiatry; Psychology. Psychophysiology; Semantics; Speech Perception |
title | A model for context effects in speech recognition |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T20%3A24%3A30IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20model%20for%20context%20effects%20in%20speech%20recognition&rft.jtitle=The%20Journal%20of%20the%20Acoustical%20Society%20of%20America&rft.au=BRONKHORST,%20A.%20W&rft.date=1993&rft.volume=93&rft.issue=1&rft.spage=499&rft.epage=509&rft.pages=499-509&rft.issn=0001-4966&rft.eissn=1520-8524&rft.coden=JASMAN&rft_id=info:doi/10.1121/1.406844&rft_dat=%3Cproquest_cross%3E58305742%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=58305742&rft_id=info:pmid/8423265&rfr_iscdi=true |