Dealing with untranscribed speech

With the advent of social networks, there has been an exponential growth in multimedia data, including speech. This speech data is typically conversational, casual, and recorded in real environments. An important characteristic of this speech data is the unavailability of corresponding transcripts (text) o...

Bibliographic details
Main author: Prahallad, K.
Format: Conference proceeding
Language: eng
container_end_page 2
container_issue
container_start_page 1
container_title
container_volume
creator Prahallad, K.
description With the advent of social networks, there has been an exponential growth in multimedia data, including speech. This speech data is typically conversational, casual, and recorded in real environments. An important characteristic of this speech data is the unavailability of corresponding transcripts (text) or language information. In this work, we discuss technologies for dealing with speech data that lacks corresponding transcripts and/or language information. A traditional approach is to adopt acoustic models from existing benchmark databases (of known languages) to obtain a first-level transcription and then perform bootstrapping. We show inherent limitations of such approaches and argue that signal processing algorithms based on speech production knowledge play an important role in dealing with such speech data. This paper discusses some of the ongoing work at our lab in this direction, which includes building audio search, speech summarization, speech synthesis, and voice conversion using untranscribed speech.
doi_str_mv 10.1109/SPCOM.2012.6290249
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_6290249</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6290249</ieee_id><sourcerecordid>6290249</sourcerecordid><originalsourceid>FETCH-LOGICAL-i90t-d089b2d3befbeb20cc25ff6207eae58e18c2f9f77a3881dd3e706be3778a41593</originalsourceid><addsrcrecordid>eNo1j1tLw0AUhNcbWGv-gL7EH7Dx7DnJXh4lXqFSwb6X3eSsXamhJBHx3xuxzsvAfMzACHGhoFAK3PXrS718LhAUFhodYOkOROaMVaU29BvjoZihdiRJK30kzv4BmeMJKF1J0GBPRTYM7zBpqmJFM3F1y36burf8K42b_LMbe98NTZ8Ct_mwY2425-Ik-u3A2d7nYnV_t6of5WL58FTfLGRyMMoWrAvYUuAYOCA0DVYxagTDnivLyjYYXTTGk7WqbYkN6MBkjPWlqhzNxeXfbGLm9a5PH77_Xu_P0g_2l0NK</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Dealing with untranscribed speech</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Prahallad, K.</creator><creatorcontrib>Prahallad, K.</creatorcontrib><description>With the advent of social networks, there has been an exponential growth in multimedia data including speech. This speech data is typically conversational, casual and recorded in real environment. An important characteristic of this speech data is unavailability of corresponding transcripts (text) or the language information. In this work, we discuss technologies dealing with speech data without any corresponding transcripts and/or language information. A traditional way is to adopt acoustic models from existing benchmark databases (of known languages) for obtaining a first-level transcription and then perform bootstrapping. We show inherent limitations of such approaches, and argue that signal processing algorithms based on speech production knowledge play an important role in dealing with such speech data. 
This paper discusses some of the ongoing work at our lab in this direction which includes building audio search, speech summarization, speech synthesis and voice conversion using untranscribed speech.</description><identifier>ISSN: 2165-0608</identifier><identifier>ISBN: 1467320137</identifier><identifier>ISBN: 9781467320139</identifier><identifier>EISSN: 2693-3616</identifier><identifier>EISBN: 9781467320122</identifier><identifier>EISBN: 1467320129</identifier><identifier>EISBN: 1467320145</identifier><identifier>EISBN: 9781467320146</identifier><identifier>DOI: 10.1109/SPCOM.2012.6290249</identifier><language>eng</language><publisher>IEEE</publisher><subject>Acoustics ; Adaptation models ; Buildings ; Production ; Signal processing algorithms ; Speech ; Speech processing</subject><ispartof>2012 International Conference on Signal Processing and Communications (SPCOM), 2012, p.1-2</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6290249$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6290249$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Prahallad, K.</creatorcontrib><title>Dealing with untranscribed speech</title><title>2012 International Conference on Signal Processing and Communications (SPCOM)</title><addtitle>SPCOM</addtitle><description>With the advent of social networks, there has been an exponential growth in multimedia data including speech. This speech data is typically conversational, casual and recorded in real environment. An important characteristic of this speech data is unavailability of corresponding transcripts (text) or the language information. 
In this work, we discuss technologies dealing with speech data without any corresponding transcripts and/or language information. A traditional way is to adopt acoustic models from existing benchmark databases (of known languages) for obtaining a first-level transcription and then perform bootstrapping. We show inherent limitations of such approaches, and argue that signal processing algorithms based on speech production knowledge play an important role in dealing with such speech data. This paper discusses some of the ongoing work at our lab in this direction which includes building audio search, speech summarization, speech synthesis and voice conversion using untranscribed speech.</description><subject>Acoustics</subject><subject>Adaptation models</subject><subject>Buildings</subject><subject>Production</subject><subject>Signal processing algorithms</subject><subject>Speech</subject><subject>Speech processing</subject><issn>2165-0608</issn><issn>2693-3616</issn><isbn>1467320137</isbn><isbn>9781467320139</isbn><isbn>9781467320122</isbn><isbn>1467320129</isbn><isbn>1467320145</isbn><isbn>9781467320146</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2012</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo1j1tLw0AUhNcbWGv-gL7EH7Dx7DnJXh4lXqFSwb6X3eSsXamhJBHx3xuxzsvAfMzACHGhoFAK3PXrS718LhAUFhodYOkOROaMVaU29BvjoZihdiRJK30kzv4BmeMJKF1J0GBPRTYM7zBpqmJFM3F1y36burf8K42b_LMbe98NTZ8Ct_mwY2425-Ik-u3A2d7nYnV_t6of5WL58FTfLGRyMMoWrAvYUuAYOCA0DVYxagTDnivLyjYYXTTGk7WqbYkN6MBkjPWlqhzNxeXfbGLm9a5PH77_Xu_P0g_2l0NK</recordid><startdate>201207</startdate><enddate>201207</enddate><creator>Prahallad, K.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201207</creationdate><title>Dealing with untranscribed speech</title><author>Prahallad, 
K.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i90t-d089b2d3befbeb20cc25ff6207eae58e18c2f9f77a3881dd3e706be3778a41593</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Acoustics</topic><topic>Adaptation models</topic><topic>Buildings</topic><topic>Production</topic><topic>Signal processing algorithms</topic><topic>Speech</topic><topic>Speech processing</topic><toplevel>online_resources</toplevel><creatorcontrib>Prahallad, K.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Prahallad, K.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Dealing with untranscribed speech</atitle><btitle>2012 International Conference on Signal Processing and Communications (SPCOM)</btitle><stitle>SPCOM</stitle><date>2012-07</date><risdate>2012</risdate><spage>1</spage><epage>2</epage><pages>1-2</pages><issn>2165-0608</issn><eissn>2693-3616</eissn><isbn>1467320137</isbn><isbn>9781467320139</isbn><eisbn>9781467320122</eisbn><eisbn>1467320129</eisbn><eisbn>1467320145</eisbn><eisbn>9781467320146</eisbn><abstract>With the advent of social networks, there has been an exponential growth in multimedia data including speech. This speech data is typically conversational, casual and recorded in real environment. An important characteristic of this speech data is unavailability of corresponding transcripts (text) or the language information. 
In this work, we discuss technologies dealing with speech data without any corresponding transcripts and/or language information. A traditional way is to adopt acoustic models from existing benchmark databases (of known languages) for obtaining a first-level transcription and then perform bootstrapping. We show inherent limitations of such approaches, and argue that signal processing algorithms based on speech production knowledge play an important role in dealing with such speech data. This paper discusses some of the ongoing work at our lab in this direction which includes building audio search, speech summarization, speech synthesis and voice conversion using untranscribed speech.</abstract><pub>IEEE</pub><doi>10.1109/SPCOM.2012.6290249</doi><tpages>2</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 2165-0608
ispartof 2012 International Conference on Signal Processing and Communications (SPCOM), 2012, p.1-2
issn 2165-0608
2693-3616
language eng
recordid cdi_ieee_primary_6290249
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Acoustics
Adaptation models
Buildings
Production
Signal processing algorithms
Speech
Speech processing
title Dealing with untranscribed speech
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-30T07%3A49%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Dealing%20with%20untranscribed%20speech&rft.btitle=2012%20International%20Conference%20on%20Signal%20Processing%20and%20Communications%20(SPCOM)&rft.au=Prahallad,%20K.&rft.date=2012-07&rft.spage=1&rft.epage=2&rft.pages=1-2&rft.issn=2165-0608&rft.eissn=2693-3616&rft.isbn=1467320137&rft.isbn_list=9781467320139&rft_id=info:doi/10.1109/SPCOM.2012.6290249&rft_dat=%3Cieee_6IE%3E6290249%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781467320122&rft.eisbn_list=1467320129&rft.eisbn_list=1467320145&rft.eisbn_list=9781467320146&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6290249&rfr_iscdi=true