Dynamic 3-D Visualization of Vocal Tract Shaping During Speech

Noninvasive imaging is widely used in speech research as a means to investigate the shaping and dynamics of the vocal tract during speech production. 3-D dynamic MRI would be a major advance, as it would provide 3-D dynamic visualization of the entire vocal tract. We present a novel method for the c...

Full description

Bibliographic details
Published in:	IEEE transactions on medical imaging, 2013-05, Vol.32 (5), p.838-848
Main authors:	Yinghua Zhu ; Yoon-Chul Kim ; Proctor, M. I. ; Narayanan, S. S. ; Nayak, K. S.
Format:	Article
Language:	eng
Subjects:	Adult ; Articulation ; dynamic time warping ; Humans ; Image reconstruction ; Imaging, Three-Dimensional - methods ; Magnetic resonance imaging ; Magnetic Resonance Imaging - methods ; Male ; Mel frequency cepstral coefficient ; real-time magnetic resonance imaging (MRI) ; Real-time systems ; retrospective gating ; Signal Processing, Computer-Assisted ; Speech ; speech production ; Speech Production Measurement - methods ; Tongue ; Vocal Cords - anatomy & histology ; Vocal Cords - physiology ; vocal tract shaping
Online access:	Order full text

container_end_page	848
container_issue	5
container_start_page	838
container_title	IEEE transactions on medical imaging
container_volume	32
creator	Yinghua Zhu ; Yoon-Chul Kim ; Proctor, M. I. ; Narayanan, S. S. ; Nayak, K. S.
description	Noninvasive imaging is widely used in speech research as a means to investigate the shaping and dynamics of the vocal tract during speech production. 3-D dynamic MRI would be a major advance, as it would provide 3-D dynamic visualization of the entire vocal tract. We present a novel method for the creation of 3-D dynamic movies of vocal tract shaping based on the acquisition of 2-D dynamic data from parallel slices and temporal alignment of the image sequences using audio information. Multiple sagittal 2-D real-time movies with synchronized audio recordings are acquired for English vowel-consonant-vowel stimuli /ala/, /ara/, /asa/, and /aʃa/. Audio data are aligned using mel-frequency cepstral coefficients (MFCC) extracted from windowed intervals of the speech signal. Sagittal image sequences acquired from all slices are then aligned using dynamic time warping (DTW). The aligned image sequences enable dynamic 3-D visualization by creating synthesized movies of the moving airway in the coronal planes, visualizing desired tissue surfaces and tube-shaped vocal tract airway after manual segmentation of targeted articulators and smoothing. The resulting volumes allow for dynamic 3-D visualization of salient aspects of lingual articulation, including the formation of tongue grooves and sublingual cavities, with a temporal resolution of 78 ms.
doi_str_mv	10.1109/TMI.2012.2230017
format	Article
fulltext	fulltext_linktorsrc
identifier	ISSN: 0278-0062
ispartof	IEEE transactions on medical imaging, 2013-05, Vol.32 (5), p.838-848
issn	0278-0062 1558-254X
language	eng
recordid	cdi_pubmed_primary_23204279
source	IEEE Electronic Library (IEL)
subjects	Adult ; Articulation ; dynamic time warping ; Humans ; Image reconstruction ; Imaging, Three-Dimensional - methods ; Magnetic resonance imaging ; Magnetic Resonance Imaging - methods ; Male ; Mel frequency cepstral coefficient ; real-time magnetic resonance imaging (MRI) ; Real-time systems ; retrospective gating ; Signal Processing, Computer-Assisted ; Speech ; speech production ; Speech Production Measurement - methods ; Tongue ; Vocal Cords - anatomy & histology ; Vocal Cords - physiology ; vocal tract shaping
title	Dynamic 3-D Visualization of Vocal Tract Shaping During Speech
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T16%3A06%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Dynamic%203-D%20Visualization%20of%20Vocal%20Tract%20Shaping%20During%20Speech&rft.jtitle=IEEE%20transactions%20on%20medical%20imaging&rft.au=Yinghua%20Zhu&rft.date=2013-05-01&rft.volume=32&rft.issue=5&rft.spage=838&rft.epage=848&rft.pages=838-848&rft.issn=0278-0062&rft.eissn=1558-254X&rft.coden=ITMID4&rft_id=info:doi/10.1109/TMI.2012.2230017&rft_dat=%3Cproquest_RIE%3E1349400940%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1349400940&rft_id=info:pmid/23204279&rft_ieee_id=6362229&rfr_iscdi=true
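
Illustrative note on the alignment step named in the description field: the abstract says that per-slice recordings are aligned with dynamic time warping (DTW) over MFCC features, but this record does not state the step pattern, distance measure, or window settings the authors used. The sketch below is therefore only a textbook DTW with Euclidean frame distance, offered as one plausible reading. The function name `dtw_align` and the toy one-dimensional "feature" vectors are hypothetical stand-ins; MFCC extraction itself is not shown.

```python
import math


def dtw_align(ref, query):
    """Textbook DTW between two sequences of feature vectors (assumed setup,
    not the paper's confirmed configuration).

    Returns the cumulative alignment cost and the warping path as a list of
    (ref_index, query_index) pairs.
    """
    n, m = len(ref), len(query)
    # cost[i][j]: minimal cumulative distance aligning ref[:i] with query[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(ref[i - 1], query[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # match both frames
                cost[i - 1][j],      # advance ref only
                cost[i][j - 1],      # advance query only
            )

    # Backtrack from the end to recover which frames were paired.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# Toy usage: the second sequence repeats its first frame, as if one
# utterance started slightly later than the reference.
reference_frames = [[0.0], [1.0], [2.0]]
delayed_frames = [[0.0], [0.0], [1.0], [2.0]]
total_cost, warp_path = dtw_align(reference_frames, delayed_frames)
print(total_cost, warp_path)
```

In a setting like the one the abstract describes, the warping path obtained from the audio features of two slices could be used to pick, for each reference time point, the matching image frame from the other slice; how the authors actually map the path onto image frames is not specified in this record.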