Audio-to-visual conversion for multimedia communication
Although humans rely primarily on hearing to process speech, they can also extract a great deal of information with their eyes through lipreading. This skill becomes extremely important when the acoustic signal is degraded by noise. It would, therefore, be beneficial to find methods to reinforce acoustic speech with a synthesized visual signal for high noise environments. This paper addresses the interaction between acoustic speech and visible speech. Algorithms for converting audible speech into visible speech are examined, and applications which can utilize this conversion process are presented. Our results demonstrate that it is possible to animate a natural-looking talking head using acoustic speech as an input.
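To make the idea of audio-to-visual conversion concrete, the following is a minimal illustrative sketch of the general pipeline the abstract describes: extract per-frame acoustic features from a speech signal, then learn a mapping from those features to visual (mouth-shape) parameters that could drive a talking-head model. The feature choice (frame energy and spectral centroid), the simple least-squares linear mapping, and all names and parameters here are assumptions for illustration; they are not the algorithm proposed in the paper.

```python
# Illustrative sketch only: a generic audio-to-visual mapping, not the paper's method.
import numpy as np

FRAME_LEN = 256   # samples per analysis frame (assumed)
HOP = 128         # frame hop in samples (assumed)

def acoustic_features(signal, sr):
    """Compute per-frame log energy and spectral centroid from a mono signal."""
    feats = []
    for start in range(0, len(signal) - FRAME_LEN, HOP):
        frame = signal[start:start + FRAME_LEN] * np.hanning(FRAME_LEN)
        spectrum = np.abs(np.fft.rfft(frame))
        freqs = np.fft.rfftfreq(FRAME_LEN, d=1.0 / sr)
        energy = np.log(np.sum(frame ** 2) + 1e-8)
        centroid = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-8)
        feats.append([energy, centroid])
    return np.asarray(feats)                      # shape: (frames, 2)

def fit_audio_to_visual(feats, mouth_params):
    """Least-squares linear map from acoustic features to mouth-shape parameters."""
    X = np.hstack([feats, np.ones((len(feats), 1))])   # append bias column
    W, *_ = np.linalg.lstsq(X, mouth_params, rcond=None)
    return W

def predict_mouth_params(feats, W):
    """Apply the learned map to produce per-frame visual parameters."""
    X = np.hstack([feats, np.ones((len(feats), 1))])
    return X @ W

if __name__ == "__main__":
    sr = 8000
    audio = np.random.randn(sr)                   # stand-in for recorded speech
    feats = acoustic_features(audio, sr)
    # Stand-in "ground truth" mouth openings, e.g. tracked from training video.
    mouth = np.random.rand(len(feats), 1)
    W = fit_audio_to_visual(feats, mouth)
    print(predict_mouth_params(feats, W)[:5])
```

In practice the mapping would be trained on synchronized audio and facial-tracking data, and the predicted parameters would drive a facial animation model frame by frame.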
Saved in:
Published in: | IEEE transactions on industrial electronics (1982) 1998-02, Vol.45 (1), p.15-22 |
---|---|
Main authors: | Rao, R.R.; Tsuhan Chen; Mersereau, R.M. |
Format: | Article |
Language: | eng |
Subjects: | Acoustic noise; Auditory system; Data mining; Degradation; Eyes; Humans; Multimedia communication; Speech processing; Speech synthesis; Working environment noise |
Online access: | Order full text |
container_end_page | 22 |
---|---|
container_issue | 1 |
container_start_page | 15 |
container_title | IEEE transactions on industrial electronics (1982) |
container_volume | 45 |
creator | Rao, R.R.; Tsuhan Chen; Mersereau, R.M. |
description | Although humans rely primarily on hearing to process speech, they can also extract a great deal of information with their eyes through lipreading. This skill becomes extremely important when the acoustic signal is degraded by noise. It would, therefore, be beneficial to find methods to reinforce acoustic speech with a synthesized visual signal for high noise environments. This paper addresses the interaction between acoustic speech and visible speech. Algorithms for converting audible speech into visible speech are examined, and applications which can utilize this conversion process are presented. Our results demonstrate that it is possible to animate a natural-looking talking head using acoustic speech as an input. |
doi_str_mv | 10.1109/41.661300 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0278-0046 |
ispartof | IEEE transactions on industrial electronics (1982), 1998-02, Vol.45 (1), p.15-22 |
issn | 0278-0046 (print); 1557-9948 (electronic) |
language | eng |
recordid | cdi_crossref_primary_10_1109_41_661300 |
source | IEEE Electronic Library (IEL) |
subjects | Acoustic noise; Auditory system; Data mining; Degradation; Eyes; Humans; Multimedia communication; Speech processing; Speech synthesis; Working environment noise |
title | Audio-to-visual conversion for multimedia communication |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T09%3A56%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Audio-to-visual%20conversion%20for%20multimedia%20communication&rft.jtitle=IEEE%20transactions%20on%20industrial%20electronics%20(1982)&rft.au=Rao,%20R.R.&rft.date=1998-02-01&rft.volume=45&rft.issue=1&rft.spage=15&rft.epage=22&rft.pages=15-22&rft.issn=0278-0046&rft.eissn=1557-9948&rft.coden=ITIED6&rft_id=info:doi/10.1109/41.661300&rft_dat=%3Cproquest_RIE%3E28264306%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=28264306&rft_id=info:pmid/&rft_ieee_id=661300&rfr_iscdi=true |