Audio-to-visual conversion for multimedia communication

Although humans rely primarily on hearing to process speech, they can also extract a great deal of information with their eyes through lipreading. This skill becomes extremely important when the acoustic signal is degraded by noise. It would, therefore, be beneficial to find methods to reinforce acoustic speech with a synthesized visual signal in high-noise environments. This paper addresses the interaction between acoustic speech and visible speech. Algorithms for converting audible speech into visible speech are examined, and applications that can utilize this conversion process are presented. Our results demonstrate that it is possible to animate a natural-looking talking head using acoustic speech as input.
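The abstract is only a summary, but the process it names (driving a visual speech signal from acoustic speech) can be illustrated with a small sketch. The Python snippet below is a hypothetical, minimal illustration and not the authors' algorithm: it maps fixed-length audio frames to coarse mouth-shape ("viseme") labels by nearest-centroid lookup over two toy acoustic features. The feature choice, centroid values, and class names are assumptions made purely for the example.

# A minimal, self-contained sketch (not the paper's algorithm): each audio frame
# is reduced to two toy acoustic features -- log energy and zero-crossing rate --
# and labelled with the nearest mouth-shape ("viseme") centroid. The feature set,
# centroid values, and class names are hypothetical placeholders.
import numpy as np

FRAME_LEN = 256  # samples per analysis frame (assuming 8 kHz audio)

# Hypothetical viseme centroids in (log-energy, zero-crossing-rate) space.
VISEME_CENTROIDS = {
    "closed":     np.array([-6.0, 0.05]),  # near-silence / bilabial closure
    "open_vowel": np.array([-1.0, 0.10]),  # strong, low-frequency voiced frame
    "fricative":  np.array([-3.0, 0.45]),  # noisy frame with many zero crossings
}

def frame_features(frame: np.ndarray) -> np.ndarray:
    """Return [log energy, zero-crossing rate] for one audio frame."""
    log_energy = np.log(np.mean(frame ** 2) + 1e-12)
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0
    return np.array([log_energy, zcr])

def audio_to_visemes(signal: np.ndarray) -> list:
    """Assign each non-overlapping frame of `signal` to its nearest viseme."""
    labels = []
    for start in range(0, len(signal) - FRAME_LEN + 1, FRAME_LEN):
        feats = frame_features(signal[start:start + FRAME_LEN])
        nearest = min(VISEME_CENTROIDS,
                      key=lambda name: np.linalg.norm(feats - VISEME_CENTROIDS[name]))
        labels.append(nearest)
    return labels

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(FRAME_LEN) / 8000.0
    demo = np.concatenate([
        0.001 * rng.standard_normal(FRAME_LEN),    # near-silence
        0.8 * np.sin(2 * np.pi * 200.0 * t),       # vowel-like tone
        0.3 * rng.standard_normal(2 * FRAME_LEN),  # noise-like fricative
    ])
    print(audio_to_visemes(demo))
    # Likely output with this seed: ['closed', 'open_vowel', 'fricative', 'fricative']

A practical audio-to-visual converter would replace these toy features with a richer acoustic representation and learn the mapping, along with its temporal smoothing, from recorded audiovisual data; the sketch only shows the overall frame-by-frame structure of such a conversion.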

Bibliographic Details
Published in: IEEE Transactions on Industrial Electronics (1982), 1998-02, Vol. 45 (1), p. 15-22
Main Authors: Rao, R.R.; Tsuhan Chen; Mersereau, R.M.
Format: Article
Language: English
DOI: 10.1109/41.661300
ISSN: 0278-0046
EISSN: 1557-9948
Subjects: Acoustic noise; Auditory system; Data mining; Degradation; Eyes; Humans; Multimedia communication; Speech processing; Speech synthesis; Working environment noise
Online Access: Order full text