Automatic Evaluation of Voice Quality Using Text-Based Laryngograph Measurements and Prosodic Analysis

Due to low intra- and interrater reliability, perceptual voice evaluation should be supported by objective, automatic methods. In this study, text-based, computer-aided prosodic analysis and measurements of connected speech were combined in order to model perceptual evaluation of the German Roughnes...

Bibliographic details
Published in: Computational and mathematical methods in medicine 2015-01, Vol.2015 (2015), p.1-11
Main authors: Ptok, Martin, Matoušek, Václav, Döllinger, Michael, Schwemmle, Cornelia, Haderlein, Tino, Nöth, Elmar
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 11
container_issue 2015
container_start_page 1
container_title Computational and mathematical methods in medicine
container_volume 2015
creator Ptok, Martin
Matoušek, Václav
Döllinger, Michael
Schwemmle, Cornelia
Haderlein, Tino
Nöth, Elmar
description Due to low intra- and interrater reliability, perceptual voice evaluation should be supported by objective, automatic methods. In this study, text-based, computer-aided prosodic analysis and measurements of connected speech were combined in order to model perceptual evaluation of the German Roughness-Breathiness-Hoarseness (RBH) scheme. 58 connected speech samples (43 women and 15 men; 48.7±17.8 years) containing the German version of the text “The North Wind and the Sun” were evaluated perceptually by 19 speech and voice therapy students according to the RBH scale. For the human-machine correlation, Support Vector Regression with measurements of the vocal fold cycle irregularities (CFx) and the closed phases of vocal fold vibration (CQx) of the Laryngograph and 33 features from a prosodic analysis module were used to model the listeners’ ratings. The best human-machine results for roughness were obtained from a combination of six prosodic features and CFx (r=0.71, ρ=0.57). These correlations were approximately the same as the interrater agreement among human raters (r=0.65, ρ=0.61). CQx was one of the substantial features of the hoarseness model. For hoarseness and breathiness, the human-machine agreement was substantially lower. Nevertheless, the automatic analysis method can serve as the basis for a meaningful objective support for perceptual analysis.
doi_str_mv 10.1155/2015/316325
format Article
publisher Cairo, Egypt: Hindawi Publishing Corporation
contributor Bursac, Zoran
pmid 26136813
rights Copyright © 2015 Tino Haderlein et al.
sourcetype Open Access Repository
linktohtml https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4468283/
linktopdf https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4468283/pdf/
fulltext fulltext
identifier ISSN: 1748-670X
ispartof Computational and mathematical methods in medicine, 2015-01, Vol.2015 (2015), p.1-11
issn 1748-670X
1748-6718
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4468283
source Wiley-Blackwell Open Access Collection; MEDLINE; PubMed Central; Alma/SFX Local Collection; EZB Electronic Journals Library; PubMed Central Open Access
subjects Adolescent
Adult
Aged
Aged, 80 and over
Child
Female
Hoarseness - diagnosis
Humans
Male
Middle Aged
Regression Analysis
Reproducibility of Results
Signal Processing, Computer-Assisted
Software
Sound Spectrography - methods
Speech
Speech Perception
Speech Therapy
Voice Disorders - diagnosis
Voice Quality
Young Adult
title Automatic Evaluation of Voice Quality Using Text-Based Laryngograph Measurements and Prosodic Analysis
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T15%3A53%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Automatic%20Evaluation%20of%20Voice%20Quality%20Using%20Text-Based%20Laryngograph%20Measurements%20and%20Prosodic%20Analysis&rft.jtitle=Computational%20and%20mathematical%20methods%20in%20medicine&rft.au=Ptok,%20Martin&rft.date=2015-01-01&rft.volume=2015&rft.issue=2015&rft.spage=1&rft.epage=11&rft.pages=1-11&rft.issn=1748-670X&rft.eissn=1748-6718&rft_id=info:doi/10.1155/2015/316325&rft_dat=%3Cproquest_pubme%3E1694965347%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1694965347&rft_id=info:pmid/26136813&rfr_iscdi=true