A model for context effects in speech recognition


Detailed description

Bibliographic details
Published in: The Journal of the Acoustical Society of America, 1993, Vol. 93 (1), p. 499-509
Main authors: BRONKHORST, A. W; BOSMAN, A. J; SMOORENBURG, G. F
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 509
container_issue 1
container_start_page 499
container_title The Journal of the Acoustical Society of America
container_volume 93
creator BRONKHORST, A. W
BOSMAN, A. J
SMOORENBURG, G. F
description A model is presented that quantifies the effect of context on speech recognition. In this model, a speech stimulus is considered as a concatenation of a number of equivalent elements (e.g., phonemes constituting a word). The model employs probabilities that individual elements are recognized and chances that missed elements are guessed using contextual information. Predictions are given of the probability that the entire stimulus, or part of it, is reproduced correctly. The model can be applied to both speech recognition and visual recognition of printed text. It has been verified with data obtained with syllables of the consonant-vowel-consonant (CVC) type presented near the reception threshold in quiet and in noise, with the results of an experiment using orthographic presentation of incomplete CVC syllables and with results of word counts in a CVC lexicon. A remarkable outcome of the analysis is that the cues which occur only in spoken language (e.g., coarticulatory cues) seem to have a much greater influence on recognition performance when the stimuli are presented near the threshold in noise than when they are presented near the absolute threshold. Demonstrations are given of further predictions provided by the model: word recognition as a function of signal-to-noise ratio, closed-set word recognition, recognition of interrupted speech, and sentence recognition.
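The probabilistic structure described in the abstract can be illustrated with a minimal sketch. This is an assumption-laden simplification, not the authors' actual formulation (their model conditions the guessing chance on which elements were recognized): here each of n elements is assumed to be recognized independently with probability p, and a missed element is guessed from context with a fixed probability c, so the whole-stimulus probability is (p + (1 - p)·c)^n.

```python
def element_success(p_recognize, p_guess):
    """Probability that a single element ends up correct:
    either recognized directly, or missed but then guessed
    from contextual information."""
    return p_recognize + (1.0 - p_recognize) * p_guess

def whole_word_probability(p_recognize, p_guess, n_elements=3):
    """Probability that the entire stimulus (e.g., a CVC word,
    n_elements=3) is reproduced correctly, assuming the elements
    are independent and equivalent (a simplifying assumption)."""
    return element_success(p_recognize, p_guess) ** n_elements

# Without context (no guessing), a CVC word at 80% per-phoneme
# recognition is correct only 0.8**3 = 51.2% of the time.
no_context = whole_word_probability(0.8, 0.0)

# With a 50% chance of guessing each missed phoneme from context,
# the effective per-element probability rises to 0.9, and word
# recognition to 0.9**3 = 72.9%.
with_context = whole_word_probability(0.8, 0.5)
```

Even this toy version shows the qualitative point of the model: a modest per-element context benefit compounds across elements into a large gain in whole-word recognition.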
doi_str_mv 10.1121/1.406844
format Article
identifier PMID: 8423265
identifier CODEN: JASMAN
publisher Woodbury, NY: Acoustical Society of America
fulltext fulltext
identifier ISSN: 0001-4966
ispartof The Journal of the Acoustical Society of America, 1993, Vol.93 (1), p.499-509
issn 0001-4966
1520-8524
language eng
recordid cdi_proquest_miscellaneous_75571809
source MEDLINE; AIP Acoustical Society of America
subjects Acoustic Stimulation
Adult
Biological and medical sciences
Female
Fundamental and applied biological sciences. Psychology
Humans
Language
Male
Photic Stimulation
Production and perception of spoken language
Psychoacoustics
Psycholinguistics
Psychology. Psychoanalysis. Psychiatry
Psychology. Psychophysiology
Semantics
Speech Perception
title A model for context effects in speech recognition