Improved vowel region detection from a continuous speech using post processing of vowel onset points and vowel end-points

Vowels are produced with an open configuration of the vocal tract, without any audible friction. The acoustic signal is relatively loud with varying strength of impulse-like excitation. Vowels possess significant energy content in the low-frequency bands of the speech signal. Acoustic events such as...

Bibliographic details
Published in: Multimedia tools and applications 2018-02, Vol.77 (4), p.4753-4767
Main authors: Thirumuru, Ramakrishna; Gangashetty, Suryakanth V.; Vuppala, Anil Kumar
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 4767
container_issue 4
container_start_page 4753
container_title Multimedia tools and applications
container_volume 77
creator Thirumuru, Ramakrishna
Gangashetty, Suryakanth V.
Vuppala, Anil Kumar
description Vowels are produced with an open configuration of the vocal tract, without any audible friction. The acoustic signal is relatively loud, with varying strength of impulse-like excitation. Vowels possess significant energy content in the low-frequency bands of the speech signal. Acoustic events such as the vowel onset point (VOP) and vowel end-point (VEP) can be used as landmarks to detect vowel regions in a speech signal. In this paper, a two-stage algorithm is proposed to detect precise vowel regions. In the first stage, the speech signal is processed using zero frequency filtering to emphasize the energy content in the low-frequency bands of speech. The zero frequency filtered signal predominantly contains the low-frequency content of the speech signal, as it is filtered around 0 Hz. This process is followed by the extraction of dominant spectral peaks from the magnitude spectrum around the glottal closure regions of the speech signal. The vowel onset points and vowel end-points are obtained by convolving the enhanced spectral contour of the zero frequency filtered signal with a first-order Gaussian differentiator. In the second stage, post-processing is carried out in the regions around each VOP and VEP to remove spurious vowel regions based on the uniformity of epoch intervals. In addition, the positions of the VOPs and VEPs are corrected using the strength of excitation of the speech signal. The performance of the proposed vowel region detection method is compared with existing state-of-the-art methods on the TIMIT acoustic-phonetic speech corpus. The authors report that the method significantly improves vowel region detection in clean and noisy environments.
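The convolution step named in the abstract can be illustrated with a minimal sketch. It assumes an evidence contour is already available (a synthetic rectangular contour stands in for the enhanced spectral contour) and that, for a single vowel-like region, the VOP and VEP can be read off as the largest positive and largest negative peak of the convolved output. The function names, the kernel half-width of 20 samples, and sigma of 5 are illustrative assumptions, not values taken from the paper, and zero frequency filtering and the post-processing stage are not shown.

```python
import math


def gaussian_differentiator(half_width, sigma):
    """First-order Gaussian differentiator: derivative of an unnormalised Gaussian.

    Returns 2 * half_width + 1 taps. Taps are positive for negative lags and
    negative for positive lags, so a rising edge gives a positive response.
    """
    return [
        -(k / sigma ** 2) * math.exp(-(k * k) / (2.0 * sigma ** 2))
        for k in range(-half_width, half_width + 1)
    ]


def convolve_same(contour, kernel):
    """Zero-padded convolution returning an output as long as the contour."""
    half = len(kernel) // 2
    n = len(contour)
    out = []
    for i in range(n):
        acc = 0.0
        for j, tap in enumerate(kernel):
            m = i - (j - half)  # contour index paired with this kernel lag
            if 0 <= m < n:
                acc += contour[m] * tap
        out.append(acc)
    return out


def detect_single_vowel_region(contour, half_width=20, sigma=5.0):
    """Return (vop_index, vep_index) for a contour with ONE vowel-like region.

    Simplification (assumption): the VOP is the largest positive peak and the
    VEP the largest negative peak; real speech needs proper peak picking.
    """
    response = convolve_same(contour, gaussian_differentiator(half_width, sigma))
    vop = max(range(len(response)), key=response.__getitem__)
    vep = min(range(len(response)), key=response.__getitem__)
    return vop, vep


# Synthetic evidence contour: high between samples 50 and 149, low elsewhere.
contour = [1.0 if 50 <= i < 150 else 0.0 for i in range(200)]
vop, vep = detect_single_vowel_region(contour)
```

On this synthetic contour the two indices should fall within a sample of the rising and falling edges. A contour from real speech would contain several regions, so the global maximum and minimum would have to be replaced by thresholded local peak picking.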
doi_str_mv 10.1007/s11042-017-5044-8
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2002610234</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2002610234</sourcerecordid><originalsourceid>FETCH-LOGICAL-c316t-436bb8432d4248c986608f7f5bf77b24f35596eef1532a143e58c8361d56768a3</originalsourceid><addsrcrecordid>eNp1kEtLAzEUhQdRsFZ_gLuA62jeSZdSfBQKbnQdZjI3dUqb1GSm0n9v6iiuXOXk5pzvhlNV15TcUkL0XaaUCIYJ1VgSIbA5qSZUao61ZvS0aG4I1pLQ8-oi5zUhVEkmJtVhsd2luIcW7eMnbFCCVRcDaqEH1x-VT3GLauRi6LswxCGjvANw72jIXVihXcw9KgQH-fse_Q8ohgzlJXahz6gOv3wILR6Hl9WZrzcZrn7OafX2-PA6f8bLl6fF_H6JHaeqx4KrpjGCs1YwYdzMKEWM1142XuuGCc-lnCkATyVnNRUcpHGGK9pKpZWp-bS6Gbnllx8D5N6u45BCWWkZIUxRwrgoLjq6XIo5J_B2l7ptnQ6WEnts2I4N29KwPTZsTcmwMZOLN6wg_ZH_D30B_SV_XQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2002610234</pqid></control><display><type>article</type><title>Improved vowel region detection from a continuous speech using post processing of vowel onset points and vowel end-points</title><source>SpringerLink Journals - AutoHoldings</source><creator>Thirumuru, Ramakrishna ; Gangashetty, Suryakanth V. ; Vuppala, Anil Kumar</creator><creatorcontrib>Thirumuru, Ramakrishna ; Gangashetty, Suryakanth V. ; Vuppala, Anil Kumar</creatorcontrib><description>Vowels are produced with an open configuration of the vocal tract, without any audible friction. The acoustic signal is relatively loud with varying strength of impulse-like excitation. Vowels possess significant energy content in the low-frequency bands of the speech signal. Acoustic events such as vowel onset point (VOP) and vowel end-point (VEP) can be used as landmarks to detect vowel regions in a speech signal. In this paper, a two-stage algorithm is proposed to detect precise vowel regions. In the first level, the speech signal is processed using zero frequency filtering to emphasize energy content in low-frequency bands of speech. 
Zero frequency filtered signal predominantly contains low-frequency content of the speech signal as it is filtered around 0 Hz. This process is followed by the extraction of dominant spectral peaks from the magnitude spectrum around glottal closure regions of the speech signal. The vowel onset points and vowel end-points are obtained by convolving the enhanced spectral contour of zero frequency filtered signal with first order Gaussian differentiator. In the next level, a post-processing is carried out in the regions around VOP and VEP to remove spurious vowel regions based on uniformity of epoch intervals. In addition, the positions of VOPs and VEPs are also corrected using the strength of the excitation of the speech signal. The performance of the proposed vowel region detection method is compared with the existing state of art methods on TIMIT acoustic-phonetic speech corpus. It is reported that this method produced significant improvement in vowel region detection in clean and noisy environments.</description><identifier>ISSN: 1380-7501</identifier><identifier>EISSN: 1573-7721</identifier><identifier>DOI: 10.1007/s11042-017-5044-8</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Acoustic noise ; Acoustics ; Computer Communication Networks ; Computer Science ; Data Structures and Information Theory ; Excitation ; Filtration ; Frequencies ; Gaussian process ; Multimedia Information Systems ; Post-production processing ; Signal processing ; Special Purpose and Application-Based Systems ; Speech ; Speech disorders ; Verbal learning ; Vocal tract ; Voice recognition ; Vowels</subject><ispartof>Multimedia tools and applications, 2018-02, Vol.77 (4), p.4753-4767</ispartof><rights>Springer Science+Business Media, LLC 2017</rights><rights>Multimedia Tools and Applications is a copyright of Springer, (2017). 
All Rights Reserved.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c316t-436bb8432d4248c986608f7f5bf77b24f35596eef1532a143e58c8361d56768a3</citedby><cites>FETCH-LOGICAL-c316t-436bb8432d4248c986608f7f5bf77b24f35596eef1532a143e58c8361d56768a3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11042-017-5044-8$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11042-017-5044-8$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27903,27904,41467,42536,51297</link.rule.ids></links><search><creatorcontrib>Thirumuru, Ramakrishna</creatorcontrib><creatorcontrib>Gangashetty, Suryakanth V.</creatorcontrib><creatorcontrib>Vuppala, Anil Kumar</creatorcontrib><title>Improved vowel region detection from a continuous speech using post processing of vowel onset points and vowel end-points</title><title>Multimedia tools and applications</title><addtitle>Multimed Tools Appl</addtitle><description>Vowels are produced with an open configuration of the vocal tract, without any audible friction. The acoustic signal is relatively loud with varying strength of impulse-like excitation. Vowels possess significant energy content in the low-frequency bands of the speech signal. Acoustic events such as vowel onset point (VOP) and vowel end-point (VEP) can be used as landmarks to detect vowel regions in a speech signal. In this paper, a two-stage algorithm is proposed to detect precise vowel regions. In the first level, the speech signal is processed using zero frequency filtering to emphasize energy content in low-frequency bands of speech. Zero frequency filtered signal predominantly contains low-frequency content of the speech signal as it is filtered around 0 Hz. 
This process is followed by the extraction of dominant spectral peaks from the magnitude spectrum around glottal closure regions of the speech signal. The vowel onset points and vowel end-points are obtained by convolving the enhanced spectral contour of zero frequency filtered signal with first order Gaussian differentiator. In the next level, a post-processing is carried out in the regions around VOP and VEP to remove spurious vowel regions based on uniformity of epoch intervals. In addition, the positions of VOPs and VEPs are also corrected using the strength of the excitation of the speech signal. The performance of the proposed vowel region detection method is compared with the existing state of art methods on TIMIT acoustic-phonetic speech corpus. It is reported that this method produced significant improvement in vowel region detection in clean and noisy environments.</description><subject>Acoustic noise</subject><subject>Acoustics</subject><subject>Computer Communication Networks</subject><subject>Computer Science</subject><subject>Data Structures and Information Theory</subject><subject>Excitation</subject><subject>Filtration</subject><subject>Frequencies</subject><subject>Gaussian process</subject><subject>Multimedia Information Systems</subject><subject>Post-production processing</subject><subject>Signal processing</subject><subject>Special Purpose and Application-Based Systems</subject><subject>Speech</subject><subject>Speech disorders</subject><subject>Verbal learning</subject><subject>Vocal tract</subject><subject>Voice 
recognition</subject><subject>Vowels</subject><issn>1380-7501</issn><issn>1573-7721</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>8G5</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNp1kEtLAzEUhQdRsFZ_gLuA62jeSZdSfBQKbnQdZjI3dUqb1GSm0n9v6iiuXOXk5pzvhlNV15TcUkL0XaaUCIYJ1VgSIbA5qSZUao61ZvS0aG4I1pLQ8-oi5zUhVEkmJtVhsd2luIcW7eMnbFCCVRcDaqEH1x-VT3GLauRi6LswxCGjvANw72jIXVihXcw9KgQH-fse_Q8ohgzlJXahz6gOv3wILR6Hl9WZrzcZrn7OafX2-PA6f8bLl6fF_H6JHaeqx4KrpjGCs1YwYdzMKEWM1142XuuGCc-lnCkATyVnNRUcpHGGK9pKpZWp-bS6Gbnllx8D5N6u45BCWWkZIUxRwrgoLjq6XIo5J_B2l7ptnQ6WEnts2I4N29KwPTZsTcmwMZOLN6wg_ZH_D30B_SV_XQ</recordid><startdate>20180201</startdate><enddate>20180201</enddate><creator>Thirumuru, Ramakrishna</creator><creator>Gangashetty, Suryakanth V.</creator><creator>Vuppala, Anil Kumar</creator><general>Springer US</general><general>Springer Nature 
B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7SC</scope><scope>7WY</scope><scope>7WZ</scope><scope>7XB</scope><scope>87Z</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>8FL</scope><scope>8G5</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BEZIV</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FRNLG</scope><scope>F~G</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K60</scope><scope>K6~</scope><scope>K7-</scope><scope>L.-</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0C</scope><scope>M0N</scope><scope>M2O</scope><scope>MBDVC</scope><scope>P5Z</scope><scope>P62</scope><scope>PQBIZ</scope><scope>PQBZA</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope></search><sort><creationdate>20180201</creationdate><title>Improved vowel region detection from a continuous speech using post processing of vowel onset points and vowel end-points</title><author>Thirumuru, Ramakrishna ; Gangashetty, Suryakanth V. 
; Vuppala, Anil Kumar</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c316t-436bb8432d4248c986608f7f5bf77b24f35596eef1532a143e58c8361d56768a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Acoustic noise</topic><topic>Acoustics</topic><topic>Computer Communication Networks</topic><topic>Computer Science</topic><topic>Data Structures and Information Theory</topic><topic>Excitation</topic><topic>Filtration</topic><topic>Frequencies</topic><topic>Gaussian process</topic><topic>Multimedia Information Systems</topic><topic>Post-production processing</topic><topic>Signal processing</topic><topic>Special Purpose and Application-Based Systems</topic><topic>Speech</topic><topic>Speech disorders</topic><topic>Verbal learning</topic><topic>Vocal tract</topic><topic>Voice recognition</topic><topic>Vowels</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Thirumuru, Ramakrishna</creatorcontrib><creatorcontrib>Gangashetty, Suryakanth V.</creatorcontrib><creatorcontrib>Vuppala, Anil Kumar</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>ABI/INFORM Collection</collection><collection>ABI/INFORM Global (PDF only)</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>ABI/INFORM Global (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection (Alumni Edition)</collection><collection>Research Library (Alumni 
Edition)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Business Premium Collection</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Business Premium Collection (Alumni)</collection><collection>ABI/INFORM Global (Corporate)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Business Collection (Alumni Edition)</collection><collection>ProQuest Business Collection</collection><collection>Computer Science Database</collection><collection>ABI/INFORM Professional Advanced</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ABI/INFORM Global</collection><collection>Computing Database</collection><collection>Research Library</collection><collection>Research Library (Corporate)</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest One Business</collection><collection>ProQuest One Business (Alumni)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central Basic</collection><jtitle>Multimedia tools and 
applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Thirumuru, Ramakrishna</au><au>Gangashetty, Suryakanth V.</au><au>Vuppala, Anil Kumar</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Improved vowel region detection from a continuous speech using post processing of vowel onset points and vowel end-points</atitle><jtitle>Multimedia tools and applications</jtitle><stitle>Multimed Tools Appl</stitle><date>2018-02-01</date><risdate>2018</risdate><volume>77</volume><issue>4</issue><spage>4753</spage><epage>4767</epage><pages>4753-4767</pages><issn>1380-7501</issn><eissn>1573-7721</eissn><abstract>Vowels are produced with an open configuration of the vocal tract, without any audible friction. The acoustic signal is relatively loud with varying strength of impulse-like excitation. Vowels possess significant energy content in the low-frequency bands of the speech signal. Acoustic events such as vowel onset point (VOP) and vowel end-point (VEP) can be used as landmarks to detect vowel regions in a speech signal. In this paper, a two-stage algorithm is proposed to detect precise vowel regions. In the first level, the speech signal is processed using zero frequency filtering to emphasize energy content in low-frequency bands of speech. Zero frequency filtered signal predominantly contains low-frequency content of the speech signal as it is filtered around 0 Hz. This process is followed by the extraction of dominant spectral peaks from the magnitude spectrum around glottal closure regions of the speech signal. The vowel onset points and vowel end-points are obtained by convolving the enhanced spectral contour of zero frequency filtered signal with first order Gaussian differentiator. In the next level, a post-processing is carried out in the regions around VOP and VEP to remove spurious vowel regions based on uniformity of epoch intervals. 
In addition, the positions of VOPs and VEPs are also corrected using the strength of the excitation of the speech signal. The performance of the proposed vowel region detection method is compared with the existing state of art methods on TIMIT acoustic-phonetic speech corpus. It is reported that this method produced significant improvement in vowel region detection in clean and noisy environments.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11042-017-5044-8</doi><tpages>15</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1380-7501
ispartof Multimedia tools and applications, 2018-02, Vol.77 (4), p.4753-4767
issn 1380-7501
1573-7721
language eng
recordid cdi_proquest_journals_2002610234
source SpringerLink Journals - AutoHoldings
subjects Acoustic noise
Acoustics
Computer Communication Networks
Computer Science
Data Structures and Information Theory
Excitation
Filtration
Frequencies
Gaussian process
Multimedia Information Systems
Post-production processing
Signal processing
Special Purpose and Application-Based Systems
Speech
Speech disorders
Verbal learning
Vocal tract
Voice recognition
Vowels
title Improved vowel region detection from a continuous speech using post processing of vowel onset points and vowel end-points
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T03%3A48%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Improved%20vowel%20region%20detection%20from%20a%20continuous%20speech%20using%20post%20processing%20of%20vowel%20onset%20points%20and%20vowel%20end-points&rft.jtitle=Multimedia%20tools%20and%20applications&rft.au=Thirumuru,%20Ramakrishna&rft.date=2018-02-01&rft.volume=77&rft.issue=4&rft.spage=4753&rft.epage=4767&rft.pages=4753-4767&rft.issn=1380-7501&rft.eissn=1573-7721&rft_id=info:doi/10.1007/s11042-017-5044-8&rft_dat=%3Cproquest_cross%3E2002610234%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2002610234&rft_id=info:pmid/&rfr_iscdi=true