Dynamic 3-D Visualization of Vocal Tract Shaping During Speech

Noninvasive imaging is widely used in speech research as a means to investigate the shaping and dynamics of the vocal tract during speech production. 3-D dynamic MRI would be a major advance, as it would provide 3-D dynamic visualization of the entire vocal tract. We present a novel method for the c...

Detailed description

Bibliographic details
Published in: IEEE transactions on medical imaging 2013-05, Vol.32 (5), p.838-848
Main authors: Yinghua Zhu; Yoon-Chul Kim; Proctor, M. I.; Narayanan, S. S.; Nayak, K. S.
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page 848
container_issue 5
container_start_page 838
container_title IEEE transactions on medical imaging
container_volume 32
creator Yinghua Zhu
Yoon-Chul Kim
Proctor, M. I.
Narayanan, S. S.
Nayak, K. S.
description Noninvasive imaging is widely used in speech research as a means to investigate the shaping and dynamics of the vocal tract during speech production. 3-D dynamic MRI would be a major advance, as it would provide 3-D dynamic visualization of the entire vocal tract. We present a novel method for the creation of 3-D dynamic movies of vocal tract shaping based on the acquisition of 2-D dynamic data from parallel slices and temporal alignment of the image sequences using audio information. Multiple sagittal 2-D real-time movies with synchronized audio recordings are acquired for English vowel-consonant-vowel stimuli /ala/, /ara/, /asa/, and /a∫a/. Audio data are aligned using mel-frequency cepstral coefficients (MFCC) extracted from windowed intervals of the speech signal. Sagittal image sequences acquired from all slices are then aligned using dynamic time warping (DTW). The aligned image sequences enable dynamic 3-D visualization by creating synthesized movies of the moving airway in the coronal planes, visualizing desired tissue surfaces and tube-shaped vocal tract airway after manual segmentation of targeted articulators and smoothing. The resulting volumes allow for dynamic 3-D visualization of salient aspects of lingual articulation, including the formation of tongue grooves and sublingual cavities, with a temporal resolution of 78 ms.
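The alignment step named in the description (dynamic time warping over per-frame audio features) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `euclidean` and `dtw_path` are hypothetical, the step pattern and distance measure are assumptions, and the short one-dimensional vectors below are toy stand-ins for MFCC frames, which in practice would come from a feature extractor.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_path(ref, qry):
    """Align two feature sequences with classic dynamic time warping.

    ref, qry: lists of feature vectors (toy stand-ins for MFCC frames).
    Returns (total_cost, path), where path is a list of (ref_idx, qry_idx)
    pairs from the first frames to the last frames.
    """
    n, m = len(ref), len(qry)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning ref[:i] with qry[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(ref[i - 1], qry[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # ref advances alone
                                 cost[i][j - 1],      # qry advances alone
                                 cost[i - 1][j - 1])  # both advance
    # Backtrack from the end of both sequences to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# Toy example: the second sequence repeats its first frame once.
reference = [[0.0], [1.0], [2.0]]
query = [[0.0], [0.0], [1.0], [2.0]]
total, path = dtw_path(reference, query)
```

In a setting like the one described, the resulting path would indicate which frame of one slice's recording corresponds to which frame of a reference recording, so that image frames acquired alongside the audio could be resampled onto a common time axis.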
doi_str_mv 10.1109/TMI.2012.2230017
format Article
fulltext fulltext_linktorsrc
identifier ISSN: 0278-0062
ispartof IEEE transactions on medical imaging, 2013-05, Vol.32 (5), p.838-848
issn 0278-0062
1558-254X
language eng
recordid cdi_pubmed_primary_23204279
source IEEE Electronic Library (IEL)
subjects Adult
Articulation
dynamic time warping
Humans
Image reconstruction
Imaging, Three-Dimensional - methods
Magnetic resonance imaging
Magnetic Resonance Imaging - methods
Male
Mel frequency cepstral coefficient
real-time magnetic resonance imaging (MRI)
Real-time systems
retrospective gating
Signal Processing, Computer-Assisted
Speech
speech production
Speech Production Measurement - methods
Tongue
Vocal Cords - anatomy & histology
Vocal Cords - physiology
vocal tract shaping
title Dynamic 3-D Visualization of Vocal Tract Shaping During Speech
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T16%3A06%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Dynamic%203-D%20Visualization%20of%20Vocal%20Tract%20Shaping%20During%20Speech&rft.jtitle=IEEE%20transactions%20on%20medical%20imaging&rft.au=Yinghua%20Zhu&rft.date=2013-05-01&rft.volume=32&rft.issue=5&rft.spage=838&rft.epage=848&rft.pages=838-848&rft.issn=0278-0062&rft.eissn=1558-254X&rft.coden=ITMID4&rft_id=info:doi/10.1109/TMI.2012.2230017&rft_dat=%3Cproquest_RIE%3E1349400940%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1349400940&rft_id=info:pmid/23204279&rft_ieee_id=6362229&rfr_iscdi=true