Stochastic automata for language modeling

Bibliographic details
Published in: Computer speech & language 1996-10, Vol.10 (4), p.265-293
Main authors: Riccardi, Giuseppe; Pieraccini, Roberto; Bocchieri, Enrico
Format: Article
Language: eng
Subjects: Applied linguistics; Computational linguistics; Linguistics
Online access: Full text
container_end_page 293
container_issue 4
container_start_page 265
container_title Computer speech & language
container_volume 10
creator Riccardi, Giuseppe
Pieraccini, Roberto
Bocchieri, Enrico
description Stochastic language models are widely used in spoken language understanding to recognize and interpret the speech signal: the speech samples are decoded into word transcriptions by means of acoustic and syntactic models and then interpreted according to a semantic model. Both for speech recognition and understanding, search algorithms use stochastic models to extract the most likely uttered sentence and its corresponding interpretation. The design of the language models has to be effective, so that it tightly constrains the search algorithms, and efficient, so that it complies with storage space limits. In this work we present the Variable N-gram Stochastic Automaton (VNSA) language model, which provides a unified formalism for building a wide class of language models. First, this approach allows for the use of accurate language models for large-vocabulary speech recognition by using the standard search algorithm in the one-pass Viterbi decoder. Second, the unified formalism is an effective approach to incorporating different sources of information for computing the probability of word sequences. Third, the VNSAs are well suited for those applications where speech and language decoding cascades are implemented through weighted rational transductions. The VNSAs have been compared to standard bigram and trigram language models, and their reduced set of parameters does not degrade performance in terms of perplexity. The design of a stochastic language model through the VNSA is described and applied to word- and phrase-class-based language models. The effectiveness of VNSAs has been tested within the Air Travel Information System (ATIS) task by building the language model for the speech recognition and language understanding system.
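The abstract describes variable-order n-gram models realized as stochastic automata, with states for word histories of varying length and fallback to shorter histories, evaluated by perplexity against bigrams and trigrams. As a rough, hypothetical illustration of that idea only (not the authors' actual VNSA construction, whose state selection and smoothing are specific to the paper), the Python sketch below stores variable-length histories as automaton states, backs off to shorter histories for unseen contexts, and scores sentences by log probability and perplexity. The class name, the add-one smoothing, and the backoff rule are all assumptions made for this sketch.

```python
import math
from collections import defaultdict

# Minimal sketch of a variable-order n-gram model viewed as a
# stochastic automaton: each state is a word history of length
# 0..max_order-1; histories never seen in training back off to
# shorter ones. Illustrative only -- not the paper's VNSA algorithm.
class VariableNgramAutomaton:
    def __init__(self, max_order=3):
        self.max_order = max_order
        # state (tuple of history words) -> next word -> count
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, sentences):
        pad = self.max_order - 1
        for words in sentences:
            seq = ["<s>"] * pad + words + ["</s>"]
            for i in range(pad, len(seq)):
                # Record the word under every history length, so the
                # automaton has shorter states available for backoff.
                for n in range(self.max_order):
                    self.counts[tuple(seq[i - n:i])][seq[i]] += 1

    def prob(self, state, word):
        # Back off to shorter histories until a seen state is reached.
        while state and state not in self.counts:
            state = state[1:]
        seen = self.counts[state]
        total = sum(seen.values())
        # Add-one smoothing over observed words plus one unseen slot
        # (a crude stand-in for the smoothing used in the paper).
        return (seen.get(word, 0) + 1) / (total + len(seen) + 1)

    def sentence_logprob(self, words):
        pad = self.max_order - 1
        seq = ["<s>"] * pad + words + ["</s>"]
        return sum(
            math.log(self.prob(tuple(seq[i - pad:i]), seq[i]))
            for i in range(pad, len(seq))
        )

    def perplexity(self, sentences):
        logprob, n_words = 0.0, 0
        for words in sentences:
            logprob += self.sentence_logprob(words)
            n_words += len(words) + 1  # +1 for the </s> event
        return math.exp(-logprob / n_words)

# Usage on toy ATIS-like sentences (hypothetical data):
lm = VariableNgramAutomaton(max_order=3)
lm.train([["show", "me", "flights", "to", "boston"],
          ["list", "flights", "from", "denver", "to", "boston"]])
print(lm.perplexity([["show", "me", "flights", "from", "denver"]]))
```

In the paper's setting the automaton plugs directly into a one-pass Viterbi decoder, and backoff would typically be realized as transitions in the automaton itself rather than the dictionary lookup used above, which merely mimics that behavior for scoring text.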
doi_str_mv 10.1006/csla.1996.0014
format Article
coden CSPLEO
publisher Oxford: Elsevier Ltd
fulltext fulltext
identifier ISSN: 0885-2308
ispartof Computer speech & language, 1996-10, Vol.10 (4), p.265-293
issn 0885-2308
1095-8363
language eng
recordid cdi_proquest_miscellaneous_85657343
source Periodicals Index Online; Access via ScienceDirect (Elsevier)
subjects Applied linguistics
Computational linguistics
Linguistics
title Stochastic automata for language modeling
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T08%3A14%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Stochastic%20automata%20for%20language%20modeling&rft.jtitle=Computer%20speech%20&%20language&rft.au=Riccardi,%20Giuseppe&rft.date=1996-10-01&rft.volume=10&rft.issue=4&rft.spage=265&rft.epage=293&rft.pages=265-293&rft.issn=0885-2308&rft.eissn=1095-8363&rft.coden=CSPLEO&rft_id=info:doi/10.1006/csla.1996.0014&rft_dat=%3Cproquest_cross%3E23453823%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1839894501&rft_id=info:pmid/&rft_els_id=S0885230896900145&rfr_iscdi=true