Utterance Classification Using Linguistic and Non-linguistic Information for Network-Based Speech-to-Speech Translation Systems

Network-based mobile services, such as speech-to-speech translation and voice search, enable the construction of large-scale log databases that include speech. We have developed a smartphone application called VoiceTra for speech-to-speech translation and have collected 10,000,000 utterances so far. Thi...

Bibliographic details
Main authors: Sugiura, Komei; Ryong Lee; Kashioka, Hideki; Zettsu, Koji; Kidawara, Yutaka
Format: Conference proceeding
Language: English
container_end_page 216
container_issue
container_start_page 212
container_title
container_volume 2
creator Sugiura, Komei
Ryong Lee
Kashioka, Hideki
Zettsu, Koji
Kidawara, Yutaka
description Network-based mobile services, such as speech-to-speech translation and voice search, enable the construction of large-scale log databases that include speech. We have developed a smartphone application called VoiceTra for speech-to-speech translation and have collected 10,000,000 utterances so far. This huge corpus is unique in its size and its spatiotemporal information: it contains information on anonymized user locations. This spatiotemporal corpus can be used to improve the accuracy of speech recognition and machine translation, and it opens the door to studying the location dependency of vocabulary and to new location-based services. This paper first analyzes the corpus and then presents a novel method for classifying utterances using linguistic and non-linguistic information. L2-regularized logistic regression is used for utterance classification. Experiments on the VoiceTra log corpus showed that the proposed method outperformed baseline methods in terms of F-measure.
doi_str_mv 10.1109/MDM.2013.96
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_6569092</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6569092</ieee_id><sourcerecordid>6569092</sourcerecordid><originalsourceid>FETCH-LOGICAL-i175t-329fb9a05661d9fb7e6b4dd9830dab820faf8a143563c01bb2d53dac62b2ea3f3</originalsourceid><addsrcrecordid>eNpFjT1PwzAYhM2XRCidGFn8B1xe-42deITyVaktQ1uJrXJiBwxpUsVGqBN_nUhBYrl7dDrdEXLFYcI56JvF_WIigONEqyNyAZnSMtUZvh6TRGAmGaBIT8hYZzlPVYYKVC5PScKl5EyJVJ6TcQgfAMABJUeVkJ9NjK4zTenotDYh-MqXJvq2oZvgmzc67-XLh-hLahpLl23D6v9o1lRttxv6PdGli99t98nuTHCWrvbOle8stmwguu6PQj3UV4cQ3S5ckrPK1MGN_3xENo8P6-kzm788zaa3c-Z5JiNDoatCG5BKcdtj5lSRWqtzBGuKXEBlqtzwFKXCEnhRCCvRmlKJQjiDFY7I9bDrnXPbfed3pjtslVQatMBfaqtllw</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Utterance Classification Using Linguistic and Non-linguistic Information for Network-Based Speech-to-Speech Translation Systems</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Sugiura, Komei ; Ryong Lee ; Kashioka, Hideki ; Zettsu, Koji ; Kidawara, Yutaka</creator><creatorcontrib>Sugiura, Komei ; Ryong Lee ; Kashioka, Hideki ; Zettsu, Koji ; Kidawara, Yutaka</creatorcontrib><description>Network-based mobile services, such as speech-to-speech translation and voice search, enable the construction of large-scale log database including speech. We have developed a smartphone application called VoiceTra for speech-to-speech translation and have collected 10,000,000 utterances so far. This huge corpus is unique in size and spatio-temporal information; it contains information on anonymized user locations. 
This spatiotemporal corpus can be used for improving the accuracy of its speech recognition and machine translation, and it will open the door for the study of the location dependency of vocabulary and new applications for location-based services. This paper first analyzes the corpus and then presents a novel method for classifying utterances using linguistic and non-linguistic information. L2-regularized Logistic Regression is used for utterance classification. Our experiments performed on the VoiceTra log corpus revealed that our proposed method outperformed baseline methods in terms of F measure.</description><identifier>ISSN: 1551-6245</identifier><identifier>ISBN: 9781467360685</identifier><identifier>ISBN: 1467360686</identifier><identifier>EISSN: 2375-0324</identifier><identifier>EISBN: 076954973X</identifier><identifier>EISBN: 9780769549736</identifier><identifier>DOI: 10.1109/MDM.2013.96</identifier><identifier>CODEN: IEEPAD</identifier><language>eng</language><publisher>IEEE</publisher><subject>Business ; GIS ; Knowledge discovery ; Mobile communication ; Pragmatics ; smartphone ; Speech ; Speech recognition ; speech-to-speech translation ; Vectors</subject><ispartof>2013 IEEE 14th International Conference on Mobile Data Management, 2013, Vol.2, p.212-216</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6569092$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,2052,27902,54895</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6569092$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Sugiura, Komei</creatorcontrib><creatorcontrib>Ryong Lee</creatorcontrib><creatorcontrib>Kashioka, Hideki</creatorcontrib><creatorcontrib>Zettsu, 
Koji</creatorcontrib><creatorcontrib>Kidawara, Yutaka</creatorcontrib><title>Utterance Classification Using Linguistic and Non-linguistic Information for Network-Based Speech-to-Speech Translation Systems</title><title>2013 IEEE 14th International Conference on Mobile Data Management</title><addtitle>mdm</addtitle><description>Network-based mobile services, such as speech-to-speech translation and voice search, enable the construction of large-scale log database including speech. We have developed a smartphone application called VoiceTra for speech-to-speech translation and have collected 10,000,000 utterances so far. This huge corpus is unique in size and spatio-temporal information; it contains information on anonymized user locations. This spatiotemporal corpus can be used for improving the accuracy of its speech recognition and machine translation, and it will open the door for the study of the location dependency of vocabulary and new applications for location-based services. This paper first analyzes the corpus and then presents a novel method for classifying utterances using linguistic and non-linguistic information. L2-regularized Logistic Regression is used for utterance classification. 
Our experiments performed on the VoiceTra log corpus revealed that our proposed method outperformed baseline methods in terms of F measure.</description><subject>Business</subject><subject>GIS</subject><subject>Knowledge discovery</subject><subject>Mobile communication</subject><subject>Pragmatics</subject><subject>smartphone</subject><subject>Speech</subject><subject>Speech recognition</subject><subject>speech-to-speech translation</subject><subject>Vectors</subject><issn>1551-6245</issn><issn>2375-0324</issn><isbn>9781467360685</isbn><isbn>1467360686</isbn><isbn>076954973X</isbn><isbn>9780769549736</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2013</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNpFjT1PwzAYhM2XRCidGFn8B1xe-42deITyVaktQ1uJrXJiBwxpUsVGqBN_nUhBYrl7dDrdEXLFYcI56JvF_WIigONEqyNyAZnSMtUZvh6TRGAmGaBIT8hYZzlPVYYKVC5PScKl5EyJVJ6TcQgfAMABJUeVkJ9NjK4zTenotDYh-MqXJvq2oZvgmzc67-XLh-hLahpLl23D6v9o1lRttxv6PdGli99t98nuTHCWrvbOle8stmwguu6PQj3UV4cQ3S5ckrPK1MGN_3xENo8P6-kzm788zaa3c-Z5JiNDoatCG5BKcdtj5lSRWqtzBGuKXEBlqtzwFKXCEnhRCCvRmlKJQjiDFY7I9bDrnXPbfed3pjtslVQatMBfaqtllw</recordid><startdate>201306</startdate><enddate>201306</enddate><creator>Sugiura, Komei</creator><creator>Ryong Lee</creator><creator>Kashioka, Hideki</creator><creator>Zettsu, Koji</creator><creator>Kidawara, Yutaka</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201306</creationdate><title>Utterance Classification Using Linguistic and Non-linguistic Information for Network-Based Speech-to-Speech Translation Systems</title><author>Sugiura, Komei ; Ryong Lee ; Kashioka, Hideki ; Zettsu, Koji ; Kidawara, 
Yutaka</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i175t-329fb9a05661d9fb7e6b4dd9830dab820faf8a143563c01bb2d53dac62b2ea3f3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Business</topic><topic>GIS</topic><topic>Knowledge discovery</topic><topic>Mobile communication</topic><topic>Pragmatics</topic><topic>smartphone</topic><topic>Speech</topic><topic>Speech recognition</topic><topic>speech-to-speech translation</topic><topic>Vectors</topic><toplevel>online_resources</toplevel><creatorcontrib>Sugiura, Komei</creatorcontrib><creatorcontrib>Ryong Lee</creatorcontrib><creatorcontrib>Kashioka, Hideki</creatorcontrib><creatorcontrib>Zettsu, Koji</creatorcontrib><creatorcontrib>Kidawara, Yutaka</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Sugiura, Komei</au><au>Ryong Lee</au><au>Kashioka, Hideki</au><au>Zettsu, Koji</au><au>Kidawara, Yutaka</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Utterance Classification Using Linguistic and Non-linguistic Information for Network-Based Speech-to-Speech Translation Systems</atitle><btitle>2013 IEEE 14th International Conference on Mobile Data 
Management</btitle><stitle>mdm</stitle><date>2013-06</date><risdate>2013</risdate><volume>2</volume><spage>212</spage><epage>216</epage><pages>212-216</pages><issn>1551-6245</issn><eissn>2375-0324</eissn><isbn>9781467360685</isbn><isbn>1467360686</isbn><eisbn>076954973X</eisbn><eisbn>9780769549736</eisbn><coden>IEEPAD</coden><abstract>Network-based mobile services, such as speech-to-speech translation and voice search, enable the construction of large-scale log database including speech. We have developed a smartphone application called VoiceTra for speech-to-speech translation and have collected 10,000,000 utterances so far. This huge corpus is unique in size and spatio-temporal information; it contains information on anonymized user locations. This spatiotemporal corpus can be used for improving the accuracy of its speech recognition and machine translation, and it will open the door for the study of the location dependency of vocabulary and new applications for location-based services. This paper first analyzes the corpus and then presents a novel method for classifying utterances using linguistic and non-linguistic information. L2-regularized Logistic Regression is used for utterance classification. Our experiments performed on the VoiceTra log corpus revealed that our proposed method outperformed baseline methods in terms of F measure.</abstract><pub>IEEE</pub><doi>10.1109/MDM.2013.96</doi><tpages>5</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1551-6245
ispartof 2013 IEEE 14th International Conference on Mobile Data Management, 2013, Vol.2, p.212-216
issn 1551-6245
2375-0324
language eng
recordid cdi_ieee_primary_6569092
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Business
GIS
Knowledge discovery
Mobile communication
Pragmatics
smartphone
Speech
Speech recognition
speech-to-speech translation
Vectors
title Utterance Classification Using Linguistic and Non-linguistic Information for Network-Based Speech-to-Speech Translation Systems
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T19%3A39%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Utterance%20Classification%20Using%20Linguistic%20and%20Non-linguistic%20Information%20for%20Network-Based%20Speech-to-Speech%20Translation%20Systems&rft.btitle=2013%20IEEE%2014th%20International%20Conference%20on%20Mobile%20Data%20Management&rft.au=Sugiura,%20Komei&rft.date=2013-06&rft.volume=2&rft.spage=212&rft.epage=216&rft.pages=212-216&rft.issn=1551-6245&rft.eissn=2375-0324&rft.isbn=9781467360685&rft.isbn_list=1467360686&rft.coden=IEEPAD&rft_id=info:doi/10.1109/MDM.2013.96&rft_dat=%3Cieee_6IE%3E6569092%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=076954973X&rft.eisbn_list=9780769549736&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6569092&rfr_iscdi=true
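
The description field above names L2-regularized logistic regression and the F-measure but gives no implementation details. The following is only a minimal sketch of those two ideas, not the authors' method: the feature design (bag-of-words tokens plus a single place tag standing in for non-linguistic information), the toy utterances, the labels, and all function names are assumptions made up for illustration.

```python
import math

def featurize(text, place):
    """Bag-of-words features (linguistic) plus one place tag (non-linguistic).
    The place tag is a hypothetical stand-in for location information."""
    feats = {"w=" + tok: 1.0 for tok in text.lower().split()}
    feats["place=" + place] = 1.0
    feats["bias"] = 1.0
    return feats

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, feats):
    z = sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return sigmoid(z)

def train(samples, l2=0.01, lr=0.5, epochs=200):
    """Batch gradient descent on mean log loss + (l2 / 2) * ||w||^2."""
    weights = {}
    n = len(samples)
    for _ in range(epochs):
        grad = {}
        for feats, label in samples:
            err = predict_proba(weights, feats) - label
            for name, value in feats.items():
                grad[name] = grad.get(name, 0.0) + err * value / n
        for name in set(grad) | set(weights):
            # The l2 * w term is the gradient of the L2 penalty.
            g = grad.get(name, 0.0) + l2 * weights.get(name, 0.0)
            weights[name] = weights.get(name, 0.0) - lr * g
    return weights

def f_measure(gold, pred):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy data invented for illustration: label 1 = travel-related utterance.
data = [
    ("where is the train station", "station", 1),
    ("how much is the ticket", "station", 1),
    ("which platform for the airport", "station", 1),
    ("testing testing one two", "home", 0),
    ("hello hello can you hear me", "home", 0),
    ("one two three testing", "home", 0),
]
samples = [(featurize(t, p), y) for t, p, y in data]
weights = train(samples)
gold = [y for _, y in samples]
pred = [1 if predict_proba(weights, f) >= 0.5 else 0 for f, _ in samples]
score = f_measure(gold, pred)
```

Here `score` is computed on the training utterances only, so it says nothing about generalization; a real evaluation would use held-out data, and the regularization strength `l2` would normally be tuned rather than fixed.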