Audio-to-visual conversion for multimedia communication

Although humans rely primarily on hearing to process speech, they can also extract a great deal of information with their eyes through lipreading. This skill becomes extremely important when the acoustic signal is degraded by noise. It would, therefore, be beneficial to find methods to reinforce acoustic speech with a synthesized visual signal in high-noise environments. This paper addresses the interaction between acoustic speech and visible speech. Algorithms for converting audible speech into visible speech are examined, and applications that can utilize this conversion process are presented. Our results demonstrate that it is possible to animate a natural-looking talking head using acoustic speech as input.
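The abstract is only a summary, but the process it names (driving a visual speech signal from acoustic speech) can be illustrated with a small sketch. The Python snippet below is a hypothetical, minimal illustration and not the authors' algorithm: it maps fixed-length audio frames to coarse mouth-shape ("viseme") labels by nearest-centroid lookup over two toy acoustic features. The feature choice, centroid values, and class names are assumptions made purely for the example.

# A minimal, self-contained sketch (not the paper's algorithm): each audio frame
# is reduced to two toy acoustic features -- log energy and zero-crossing rate --
# and labelled with the nearest mouth-shape ("viseme") centroid. The feature set,
# centroid values, and class names are hypothetical placeholders.
import numpy as np

FRAME_LEN = 256  # samples per analysis frame (assuming 8 kHz audio)

# Hypothetical viseme centroids in (log-energy, zero-crossing-rate) space.
VISEME_CENTROIDS = {
    "closed":     np.array([-6.0, 0.05]),  # near-silence / bilabial closure
    "open_vowel": np.array([-1.0, 0.10]),  # strong, low-frequency voiced frame
    "fricative":  np.array([-3.0, 0.45]),  # noisy frame with many zero crossings
}

def frame_features(frame: np.ndarray) -> np.ndarray:
    """Return [log energy, zero-crossing rate] for one audio frame."""
    log_energy = np.log(np.mean(frame ** 2) + 1e-12)
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0
    return np.array([log_energy, zcr])

def audio_to_visemes(signal: np.ndarray) -> list:
    """Assign each non-overlapping frame of `signal` to its nearest viseme."""
    labels = []
    for start in range(0, len(signal) - FRAME_LEN + 1, FRAME_LEN):
        feats = frame_features(signal[start:start + FRAME_LEN])
        nearest = min(VISEME_CENTROIDS,
                      key=lambda name: np.linalg.norm(feats - VISEME_CENTROIDS[name]))
        labels.append(nearest)
    return labels

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(FRAME_LEN) / 8000.0
    demo = np.concatenate([
        0.001 * rng.standard_normal(FRAME_LEN),    # near-silence
        0.8 * np.sin(2 * np.pi * 200.0 * t),       # vowel-like tone
        0.3 * rng.standard_normal(2 * FRAME_LEN),  # noise-like fricative
    ])
    print(audio_to_visemes(demo))
    # Likely output with this seed: ['closed', 'open_vowel', 'fricative', 'fricative']

A practical audio-to-visual converter would replace these toy features with a richer acoustic representation and learn the mapping, along with its temporal smoothing, from recorded audiovisual data; the sketch only shows the overall frame-by-frame structure of such a conversion.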

Bibliographic Details
Published in: IEEE Transactions on Industrial Electronics (1982), 1998-02, Vol. 45 (1), p. 15-22
Main Authors: Rao, R.R.; Tsuhan Chen; Mersereau, R.M.
Format: Article
Language: English
DOI: 10.1109/41.661300
ISSN: 0278-0046
EISSN: 1557-9948
Subjects: Acoustic noise; Auditory system; Data mining; Degradation; Eyes; Humans; Multimedia communication; Speech processing; Speech synthesis; Working environment noise
Online Access: Order full text