Dynamic 3-D Visualization of Vocal Tract Shaping During Speech
Noninvasive imaging is widely used in speech research as a means to investigate the shaping and dynamics of the vocal tract during speech production. 3-D dynamic MRI would be a major advance, as it would provide 3-D dynamic visualization of the entire vocal tract. We present a novel method for the c...
Published in: | IEEE transactions on medical imaging 2013-05, Vol.32 (5), p.838-848 |
---|---|
Main authors: | Yinghua Zhu ; Yoon-Chul Kim ; Proctor, M. I. ; Narayanan, S. S. ; Nayak, K. S. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 848 |
---|---|
container_issue | 5 |
container_start_page | 838 |
container_title | IEEE transactions on medical imaging |
container_volume | 32 |
creator | Yinghua Zhu ; Yoon-Chul Kim ; Proctor, M. I. ; Narayanan, S. S. ; Nayak, K. S. |
description | Noninvasive imaging is widely used in speech research as a means to investigate the shaping and dynamics of the vocal tract during speech production. 3-D dynamic MRI would be a major advance, as it would provide 3-D dynamic visualization of the entire vocal tract. We present a novel method for the creation of 3-D dynamic movies of vocal tract shaping based on the acquisition of 2-D dynamic data from parallel slices and temporal alignment of the image sequences using audio information. Multiple sagittal 2-D real-time movies with synchronized audio recordings are acquired for English vowel-consonant-vowel stimuli /ala/, /ara/, /asa/, and /a∫a/. Audio data are aligned using mel-frequency cepstral coefficients (MFCC) extracted from windowed intervals of the speech signal. Sagittal image sequences acquired from all slices are then aligned using dynamic time warping (DTW). The aligned image sequences enable dynamic 3-D visualization by creating synthesized movies of the moving airway in the coronal planes, visualizing desired tissue surfaces and tube-shaped vocal tract airway after manual segmentation of targeted articulators and smoothing. The resulting volumes allow for dynamic 3-D visualization of salient aspects of lingual articulation, including the formation of tongue grooves and sublingual cavities, with a temporal resolution of 78 ms. |
doi_str_mv | 10.1109/TMI.2012.2230017 |
format | Article |
fullrecord | Source: proquest_RIE (record ID TN_cdi_pubmed_primary_23204279; IEEE ID 6362229; ProQuest ID 1349400940). Record type: article. Abbreviated title: IEEE Trans Med Imaging (TMI). EISSN: 1558-254X. PMID: 23204279. CODEN: ITMID4. Publisher: United States: IEEE. Pages: 11. Open access: free_for_read. |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0278-0062 |
ispartof | IEEE transactions on medical imaging, 2013-05, Vol.32 (5), p.838-848 |
issn | 0278-0062 1558-254X |
language | eng |
recordid | cdi_pubmed_primary_23204279 |
source | IEEE Electronic Library (IEL) |
subjects | Adult ; Articulation ; dynamic time warping ; Humans ; Image reconstruction ; Imaging, Three-Dimensional - methods ; Magnetic resonance imaging ; Magnetic Resonance Imaging - methods ; Male ; Mel frequency cepstral coefficient ; real-time magnetic resonance imaging (MRI) ; Real-time systems ; retrospective gating ; Signal Processing, Computer-Assisted ; Speech ; speech production ; Speech Production Measurement - methods ; Tongue ; Vocal Cords - anatomy & histology ; Vocal Cords - physiology ; vocal tract shaping |
title | Dynamic 3-D Visualization of Vocal Tract Shaping During Speech |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T16%3A06%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Dynamic%203-D%20Visualization%20of%20Vocal%20Tract%20Shaping%20During%20Speech&rft.jtitle=IEEE%20transactions%20on%20medical%20imaging&rft.au=Yinghua%20Zhu&rft.date=2013-05-01&rft.volume=32&rft.issue=5&rft.spage=838&rft.epage=848&rft.pages=838-848&rft.issn=0278-0062&rft.eissn=1558-254X&rft.coden=ITMID4&rft_id=info:doi/10.1109/TMI.2012.2230017&rft_dat=%3Cproquest_RIE%3E1349400940%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1349400940&rft_id=info:pmid/23204279&rft_ieee_id=6362229&rfr_iscdi=true |
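
The description above mentions aligning per-slice recordings with dynamic time warping (DTW) over MFCC features. The following is only a minimal sketch of generic DTW, not the authors' implementation: the function name `dtw_path`, the Euclidean frame distance, and the three-way step pattern are assumptions made for illustration, and the short lists of numbers stand in for real MFCC frames.

```python
import math


def dtw_path(a, b):
    """Align two sequences of feature vectors with basic DTW.

    Hypothetical helper for illustration only. Returns (total_cost, path),
    where path is a list of (index_in_a, index_in_b) pairs.
    """
    n, m = len(a), len(b)

    def dist(u, v):
        # Euclidean distance between two frames (an assumed choice).
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],  # match
                                 cost[i - 1][j],      # repeat frame of b
                                 cost[i][j - 1])      # repeat frame of a

    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return cost[n][m], path


# Toy 1-D "feature frames": the second sequence holds the middle frame longer.
reference = [[0.0], [1.0], [2.0]]
other = [[0.0], [1.0], [1.0], [2.0]]
total, path = dtw_path(reference, other)
print(total, path)
```

In a pipeline like the one described, the warping path computed from the audio features of two slices would then indicate which image frames of one slice correspond to which frames of the other; how the original work maps audio frames to image frames is not stated in this record.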