Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription

This paper describes a new method for semi-supervised discriminative language modeling, which is designed to improve the robustness of a discriminative language model (LM) obtained from manually transcribed (labeled) data. The discriminative LM is implemented as a log-linear model, which employs a s...

Bibliographic Details
Published in: IEICE Transactions on Information and Systems, 2012/11/01, Vol.E95.D(11), pp.2674-2681
Main authors: KOBAYASHI, Akio; OKU, Takahiro; IMAI, Toru; NAKAGAWA, Seiichi
Format: Article
Language: eng
Online access: Full text
container_end_page 2681
container_issue 11
container_start_page 2674
container_title IEICE Transactions on Information and Systems
container_volume E95.D
creator KOBAYASHI, Akio
OKU, Takahiro
IMAI, Toru
NAKAGAWA, Seiichi
description This paper describes a new method for semi-supervised discriminative language modeling, designed to improve the robustness of a discriminative language model (LM) obtained from manually transcribed (labeled) data. The discriminative LM is implemented as a log-linear model that employs a set of linguistic features derived from word or phoneme sequences. The proposed semi-supervised discriminative modeling is formulated as a multi-objective optimization programming problem (MOP) consisting of two objective functions, one defined on labeled lattices and the other on automatic speech recognition (ASR) lattices used as unlabeled data. The objectives are coherently designed based on expected risks that reflect information about word errors in the training data. The model is trained in a discriminative manner and obtained as a solution to the MOP. In transcribing Japanese broadcast programs, the proposed method achieved a 6.3% relative reduction in word error rate compared with a conventional trigram LM.
doi_str_mv 10.1587/transinf.E95.D.2674
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1315698083</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1315698083</sourcerecordid><originalsourceid>FETCH-LOGICAL-c541t-f4274dcb07ce4234b2b79057c9122beca31fa04c83b1957f87a88840939117883</originalsourceid><addsrcrecordid>eNpdkElP8zAQhi0EEmX5BVxyQfouKR4vsX0EyqoiJBZxtCauUwxp0s9Okfj3JCpUiNNopOed5SHkCOgYpFYnXcQmhaYaXxg5noxZocQWGYESMgdewDYZUQNFriVnu2QvpTdKQTOQI_LyENJ7fobJz7JHvwj542rp40cY-klILoZFaLALHz6bYjNf4dxnd-3M16GZZ1Ubs7PY4sxh6rKn4Yg-sOxC2xyQnQrr5A-_6z55vrx4Or_Op_dXN-en09xJAV1eCabEzJVUOS8YFyUrlaFSOQOMld4hhwqpcJqXYKSqtEKttaCGGwClNd8n_9Zzl7H9v_Kps4v-al_X2Ph2lSxwkIXRVPMe5WvUxTal6Cu77L_D-GmB2kGj_dFoe412YgeNfer4ewEmh3XVIy6kTZQV0gjOB-52zb2lrpe0ATB2wdX-72yAX0s2kHvFaH3DvwDdZ5CE</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1315698083</pqid></control><display><type>article</type><title>Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription</title><source>J-STAGE Free</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>KOBAYASHI, Akio ; OKU, Takahiro ; IMAI, Toru ; NAKAGAWA, Seiichi</creator><creatorcontrib>KOBAYASHI, Akio ; OKU, Takahiro ; IMAI, Toru ; NAKAGAWA, Seiichi</creatorcontrib><description>This paper describes a new method for semi-supervised discriminative language modeling, which is designed to improve the robustness of a discriminative language model (LM) obtained from manually transcribed (labeled) data. The discriminative LM is implemented as a log-linear model, which employs a set of linguistic features derived from word or phoneme sequences. 
The proposed semi-supervised discriminative modeling is formulated as a multi-objective optimization programming problem (MOP), which consists of two objective functions defined on both labeled lattices and automatic speech recognition (ASR) lattices as unlabeled data. The objectives are coherently designed based on the expected risks that reflect information about word errors for the training data. The model is trained in a discriminative manner and acquired as a solution to the MOP problem. In transcribing Japanese broadcast programs, the proposed method reduced relatively a word error rate by 6.3% compared with that achieved by a conventional trigram LM.</description><identifier>ISSN: 0916-8532</identifier><identifier>EISSN: 1745-1361</identifier><identifier>DOI: 10.1587/transinf.E95.D.2674</identifier><language>eng</language><publisher>Oxford: The Institute of Electronics, Information and Communication Engineers</publisher><subject>Applied sciences ; Artificial intelligence ; Bayes risk minimization ; Broadcasting ; Broadcasting. Videocommunications. Audiovisual ; Computer science; control theory; systems ; discriminative training ; Errors ; Exact sciences and technology ; Information, signal and communications theory ; language modeling ; Lattices ; Linguistics ; Mathematical models ; Miscellaneous ; Optimization ; Programming ; Robustness ; semi-supervised training ; Signal processing ; Speech and sound recognition and synthesis. 
Linguistics ; Speech processing ; Telecommunications ; Telecommunications and information theory</subject><ispartof>IEICE Transactions on Information and Systems, 2012/11/01, Vol.E95.D(11), pp.2674-2681</ispartof><rights>2012 The Institute of Electronics, Information and Communication Engineers</rights><rights>2015 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c541t-f4274dcb07ce4234b2b79057c9122beca31fa04c83b1957f87a88840939117883</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>315,781,785,1884,4025,27928,27929,27930</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=26594334$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>KOBAYASHI, Akio</creatorcontrib><creatorcontrib>OKU, Takahiro</creatorcontrib><creatorcontrib>IMAI, Toru</creatorcontrib><creatorcontrib>NAKAGAWA, Seiichi</creatorcontrib><title>Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription</title><title>IEICE Transactions on Information and Systems</title><addtitle>IEICE Trans. Inf. &amp; Syst.</addtitle><description>This paper describes a new method for semi-supervised discriminative language modeling, which is designed to improve the robustness of a discriminative language model (LM) obtained from manually transcribed (labeled) data. The discriminative LM is implemented as a log-linear model, which employs a set of linguistic features derived from word or phoneme sequences. The proposed semi-supervised discriminative modeling is formulated as a multi-objective optimization programming problem (MOP), which consists of two objective functions defined on both labeled lattices and automatic speech recognition (ASR) lattices as unlabeled data. 
The objectives are coherently designed based on the expected risks that reflect information about word errors for the training data. The model is trained in a discriminative manner and acquired as a solution to the MOP problem. In transcribing Japanese broadcast programs, the proposed method reduced relatively a word error rate by 6.3% compared with that achieved by a conventional trigram LM.</description><subject>Applied sciences</subject><subject>Artificial intelligence</subject><subject>Bayes risk minimization</subject><subject>Broadcasting</subject><subject>Broadcasting. Videocommunications. Audiovisual</subject><subject>Computer science; control theory; systems</subject><subject>discriminative training</subject><subject>Errors</subject><subject>Exact sciences and technology</subject><subject>Information, signal and communications theory</subject><subject>language modeling</subject><subject>Lattices</subject><subject>Linguistics</subject><subject>Mathematical models</subject><subject>Miscellaneous</subject><subject>Optimization</subject><subject>Programming</subject><subject>Robustness</subject><subject>semi-supervised training</subject><subject>Signal processing</subject><subject>Speech and sound recognition and synthesis. 
Linguistics</subject><subject>Speech processing</subject><subject>Telecommunications</subject><subject>Telecommunications and information theory</subject><issn>0916-8532</issn><issn>1745-1361</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><recordid>eNpdkElP8zAQhi0EEmX5BVxyQfouKR4vsX0EyqoiJBZxtCauUwxp0s9Okfj3JCpUiNNopOed5SHkCOgYpFYnXcQmhaYaXxg5noxZocQWGYESMgdewDYZUQNFriVnu2QvpTdKQTOQI_LyENJ7fobJz7JHvwj542rp40cY-klILoZFaLALHz6bYjNf4dxnd-3M16GZZ1Ubs7PY4sxh6rKn4Yg-sOxC2xyQnQrr5A-_6z55vrx4Or_Op_dXN-en09xJAV1eCabEzJVUOS8YFyUrlaFSOQOMld4hhwqpcJqXYKSqtEKttaCGGwClNd8n_9Zzl7H9v_Kps4v-al_X2Ph2lSxwkIXRVPMe5WvUxTal6Cu77L_D-GmB2kGj_dFoe412YgeNfer4ewEmh3XVIy6kTZQV0gjOB-52zb2lrpe0ATB2wdX-72yAX0s2kHvFaH3DvwDdZ5CE</recordid><startdate>2012</startdate><enddate>2012</enddate><creator>KOBAYASHI, Akio</creator><creator>OKU, Takahiro</creator><creator>IMAI, Toru</creator><creator>NAKAGAWA, Seiichi</creator><general>The Institute of Electronics, Information and Communication Engineers</general><general>Oxford University Press</general><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>2012</creationdate><title>Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription</title><author>KOBAYASHI, Akio ; OKU, Takahiro ; IMAI, Toru ; NAKAGAWA, Seiichi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c541t-f4274dcb07ce4234b2b79057c9122beca31fa04c83b1957f87a88840939117883</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Applied sciences</topic><topic>Artificial intelligence</topic><topic>Bayes risk minimization</topic><topic>Broadcasting</topic><topic>Broadcasting. Videocommunications. 
Audiovisual</topic><topic>Computer science; control theory; systems</topic><topic>discriminative training</topic><topic>Errors</topic><topic>Exact sciences and technology</topic><topic>Information, signal and communications theory</topic><topic>language modeling</topic><topic>Lattices</topic><topic>Linguistics</topic><topic>Mathematical models</topic><topic>Miscellaneous</topic><topic>Optimization</topic><topic>Programming</topic><topic>Robustness</topic><topic>semi-supervised training</topic><topic>Signal processing</topic><topic>Speech and sound recognition and synthesis. Linguistics</topic><topic>Speech processing</topic><topic>Telecommunications</topic><topic>Telecommunications and information theory</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>KOBAYASHI, Akio</creatorcontrib><creatorcontrib>OKU, Takahiro</creatorcontrib><creatorcontrib>IMAI, Toru</creatorcontrib><creatorcontrib>NAKAGAWA, Seiichi</creatorcontrib><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEICE Transactions on Information and Systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>KOBAYASHI, Akio</au><au>OKU, Takahiro</au><au>IMAI, Toru</au><au>NAKAGAWA, Seiichi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription</atitle><jtitle>IEICE Transactions on Information and Systems</jtitle><addtitle>IEICE Trans. Inf. 
&amp; Syst.</addtitle><date>2012</date><risdate>2012</risdate><volume>E95.D</volume><issue>11</issue><spage>2674</spage><epage>2681</epage><pages>2674-2681</pages><issn>0916-8532</issn><eissn>1745-1361</eissn><abstract>This paper describes a new method for semi-supervised discriminative language modeling, which is designed to improve the robustness of a discriminative language model (LM) obtained from manually transcribed (labeled) data. The discriminative LM is implemented as a log-linear model, which employs a set of linguistic features derived from word or phoneme sequences. The proposed semi-supervised discriminative modeling is formulated as a multi-objective optimization programming problem (MOP), which consists of two objective functions defined on both labeled lattices and automatic speech recognition (ASR) lattices as unlabeled data. The objectives are coherently designed based on the expected risks that reflect information about word errors for the training data. The model is trained in a discriminative manner and acquired as a solution to the MOP problem. In transcribing Japanese broadcast programs, the proposed method reduced relatively a word error rate by 6.3% compared with that achieved by a conventional trigram LM.</abstract><cop>Oxford</cop><pub>The Institute of Electronics, Information and Communication Engineers</pub><doi>10.1587/transinf.E95.D.2674</doi><tpages>8</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0916-8532
ispartof IEICE Transactions on Information and Systems, 2012/11/01, Vol.E95.D(11), pp.2674-2681
issn 0916-8532
1745-1361
language eng
recordid cdi_proquest_miscellaneous_1315698083
source J-STAGE Free; EZB-FREE-00999 freely available EZB journals
subjects Applied sciences
Artificial intelligence
Bayes risk minimization
Broadcasting
Broadcasting. Videocommunications. Audiovisual
Computer science; control theory; systems
discriminative training
Errors
Exact sciences and technology
Information, signal and communications theory
language modeling
Lattices
Linguistics
Mathematical models
Miscellaneous
Optimization
Programming
Robustness
semi-supervised training
Signal processing
Speech and sound recognition and synthesis. Linguistics
Speech processing
Telecommunications
Telecommunications and information theory
title Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-12T23%3A25%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Risk-Based%20Semi-Supervised%20Discriminative%20Language%20Modeling%20for%20Broadcast%20Transcription&rft.jtitle=IEICE%20Transactions%20on%20Information%20and%20Systems&rft.au=KOBAYASHI,%20Akio&rft.date=2012&rft.volume=E95.D&rft.issue=11&rft.spage=2674&rft.epage=2681&rft.pages=2674-2681&rft.issn=0916-8532&rft.eissn=1745-1361&rft_id=info:doi/10.1587/transinf.E95.D.2674&rft_dat=%3Cproquest_cross%3E1315698083%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1315698083&rft_id=info:pmid/&rfr_iscdi=true