I-vector Based Within Speaker Voice Quality Identification on connected speech
Voice disorders affect a large portion of the population, especially heavy voice users such as teachers or call-center workers. Most voice disorders can be treated effectively with behavioral voice therapy, which teaches patients to replace problematic, habituated voice production mechanics with opt...
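Published in: | arXiv.org 2021-02 |
---|---|
Main authors: | Feng, Chuyao ; Eva van Leer ; Mackenzie Lee Curtis ; Anderson, David V |
Format: | Article |
Language: | eng |
Subjects: | Algorithms ; Disorders ; Feedback ; Signal processing ; Speech ; Therapy ; Training ; Voice |
Online access: | Full text |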
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Feng, Chuyao ; Eva van Leer ; Mackenzie Lee Curtis ; Anderson, David V |
description | Voice disorders affect a large portion of the population, especially heavy voice users such as teachers or call-center workers. Most voice disorders can be treated effectively with behavioral voice therapy, which teaches patients to replace problematic, habituated voice production mechanics with optimal voice production technique(s), yielding improved voice quality. However, treatment often fails because patients have difficulty differentiating their habitual voice from the target technique on their own when clinician feedback is unavailable between therapy sessions. Therefore, with the long-term aim of extending clinician feedback to extra-clinical settings, we built two systems that automatically differentiate various voice qualities produced by the same individual. We hypothesized that 1) a system based on i-vectors could classify these qualities as if they represented different speakers and 2) such a system would outperform one based on traditional voice signal processing algorithms. Training recordings were provided by thirteen amateur actors, each producing five perceptually different voice qualities in connected speech: normal, breathy, fry, twang, and hyponasal. As hypothesized, the i-vector system outperformed the acoustic measure system in classification accuracy (97.5% compared to 77.2%). This result is expected because the i-vector system maps features to an integrated space that represents each voice quality better than the 22-feature space of the baseline system. An i-vector based system therefore has potential for clinical application in voice therapy and voice training. |
format | Article |
fullrecord | <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2489934181</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2489934181</sourcerecordid><originalsourceid>FETCH-proquest_journals_24899341813</originalsourceid><addsrcrecordid>eNqNi9EKgjAYRkcQJOU7DLoWdJult0WRN0EUdSlj_uJvstk2g96-XfQAwYHv4nxnRiLGeZYUgrEFiZ3r0zRlmy3Lcx6Rc5W8QXlj6U46aOgDfYeaXkeQT7D0blABvUxyQP-hVQPaY4tKejSaBpTROuQhdCOA6lZk3srBQfzbJVkfD7f9KRmteU3gfN2byeqgaiaKsuQiKzL-3-sLp1Q-DA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2489934181</pqid></control><display><type>article</type><title>I-vector Based Within Speaker Voice Quality Identification on connected speech</title><source>Free E- Journals</source><creator>Feng, Chuyao ; Eva van Leer ; Mackenzie Lee Curtis ; Anderson, David V</creator><creatorcontrib>Feng, Chuyao ; Eva van Leer ; Mackenzie Lee Curtis ; Anderson, David V</creatorcontrib><description>Voice disorders affect a large portion of the population, especially heavy voice users such as teachers or call-center workers. Most voice disorders can be treated effectively with behavioral voice therapy, which teaches patients to replace problematic, habituated voice production mechanics with optimal voice production technique(s), yielding improved voice quality. However, treatment often fails because patients have difficulty differentiating their habitual voice from the target technique independently, when clinician feedback is unavailable between therapy sessions. Therefore, with the long term aim to extend clinician feedback to extra-clinical settings, we built two systems that automatically differentiate various voice qualities produced by the same individual. 
We hypothesized that 1) a system based on i-vectors could classify these qualities as if they represent different speakers and 2) such a system would outperform one based on traditional voice signal processing algorithms. Training recordings were provided by thirteen amateur actors, each producing 5 perceptually different voice qualities in connected speech: normal, breathy, fry, twang, and hyponasal. As hypothesized, the i-vector system outperformed the acoustic measure system in classification accuracy (i.e. 97.5\% compared to 77.2\%, respectively). Findings are expected because the i-vector system maps features to an integrated space which better represents each voice quality than the 22-feature space of the baseline system. Therefore, an i-vector based system has potential for clinical application in voice therapy and voice training.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Algorithms ; Disorders ; Feedback ; Signal processing ; Speech ; Therapy ; Training ; Voice</subject><ispartof>arXiv.org, 2021-02</ispartof><rights>2021. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>776,780</link.rule.ids></links><search><creatorcontrib>Feng, Chuyao</creatorcontrib><creatorcontrib>Eva van Leer</creatorcontrib><creatorcontrib>Mackenzie Lee Curtis</creatorcontrib><creatorcontrib>Anderson, David V</creatorcontrib><title>I-vector Based Within Speaker Voice Quality Identification on connected speech</title><title>arXiv.org</title><description>Voice disorders affect a large portion of the population, especially heavy voice users such as teachers or call-center workers. Most voice disorders can be treated effectively with behavioral voice therapy, which teaches patients to replace problematic, habituated voice production mechanics with optimal voice production technique(s), yielding improved voice quality. However, treatment often fails because patients have difficulty differentiating their habitual voice from the target technique independently, when clinician feedback is unavailable between therapy sessions. Therefore, with the long term aim to extend clinician feedback to extra-clinical settings, we built two systems that automatically differentiate various voice qualities produced by the same individual. We hypothesized that 1) a system based on i-vectors could classify these qualities as if they represent different speakers and 2) such a system would outperform one based on traditional voice signal processing algorithms. Training recordings were provided by thirteen amateur actors, each producing 5 perceptually different voice qualities in connected speech: normal, breathy, fry, twang, and hyponasal. 
As hypothesized, the i-vector system outperformed the acoustic measure system in classification accuracy (i.e. 97.5\% compared to 77.2\%, respectively). Findings are expected because the i-vector system maps features to an integrated space which better represents each voice quality than the 22-feature space of the baseline system. Therefore, an i-vector based system has potential for clinical application in voice therapy and voice training.</description><subject>Algorithms</subject><subject>Disorders</subject><subject>Feedback</subject><subject>Signal processing</subject><subject>Speech</subject><subject>Therapy</subject><subject>Training</subject><subject>Voice</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><recordid>eNqNi9EKgjAYRkcQJOU7DLoWdJult0WRN0EUdSlj_uJvstk2g96-XfQAwYHv4nxnRiLGeZYUgrEFiZ3r0zRlmy3Lcx6Rc5W8QXlj6U46aOgDfYeaXkeQT7D0blABvUxyQP-hVQPaY4tKejSaBpTROuQhdCOA6lZk3srBQfzbJVkfD7f9KRmteU3gfN2byeqgaiaKsuQiKzL-3-sLp1Q-DA</recordid><startdate>20210215</startdate><enddate>20210215</enddate><creator>Feng, Chuyao</creator><creator>Eva van Leer</creator><creator>Mackenzie Lee Curtis</creator><creator>Anderson, David V</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20210215</creationdate><title>I-vector Based Within Speaker Voice Quality Identification on connected speech</title><author>Feng, Chuyao ; Eva van Leer ; Mackenzie Lee Curtis ; Anderson, David 
V</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_24899341813</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Disorders</topic><topic>Feedback</topic><topic>Signal processing</topic><topic>Speech</topic><topic>Therapy</topic><topic>Training</topic><topic>Voice</topic><toplevel>online_resources</toplevel><creatorcontrib>Feng, Chuyao</creatorcontrib><creatorcontrib>Eva van Leer</creatorcontrib><creatorcontrib>Mackenzie Lee Curtis</creatorcontrib><creatorcontrib>Anderson, David V</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Feng, Chuyao</au><au>Eva van Leer</au><au>Mackenzie Lee Curtis</au><au>Anderson, David V</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>I-vector Based Within Speaker Voice Quality 
Identification on connected speech</atitle><jtitle>arXiv.org</jtitle><date>2021-02-15</date><risdate>2021</risdate><eissn>2331-8422</eissn><abstract>Voice disorders affect a large portion of the population, especially heavy voice users such as teachers or call-center workers. Most voice disorders can be treated effectively with behavioral voice therapy, which teaches patients to replace problematic, habituated voice production mechanics with optimal voice production technique(s), yielding improved voice quality. However, treatment often fails because patients have difficulty differentiating their habitual voice from the target technique independently, when clinician feedback is unavailable between therapy sessions. Therefore, with the long term aim to extend clinician feedback to extra-clinical settings, we built two systems that automatically differentiate various voice qualities produced by the same individual. We hypothesized that 1) a system based on i-vectors could classify these qualities as if they represent different speakers and 2) such a system would outperform one based on traditional voice signal processing algorithms. Training recordings were provided by thirteen amateur actors, each producing 5 perceptually different voice qualities in connected speech: normal, breathy, fry, twang, and hyponasal. As hypothesized, the i-vector system outperformed the acoustic measure system in classification accuracy (i.e. 97.5\% compared to 77.2\%, respectively). Findings are expected because the i-vector system maps features to an integrated space which better represents each voice quality than the 22-feature space of the baseline system. Therefore, an i-vector based system has potential for clinical application in voice therapy and voice training.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2021-02 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2489934181 |
source | Free E- Journals |
subjects | Algorithms ; Disorders ; Feedback ; Signal processing ; Speech ; Therapy ; Training ; Voice |
title | I-vector Based Within Speaker Voice Quality Identification on connected speech |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T11%3A08%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=I-vector%20Based%20Within%20Speaker%20Voice%20Quality%20Identification%20on%20connected%20speech&rft.jtitle=arXiv.org&rft.au=Feng,%20Chuyao&rft.date=2021-02-15&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2489934181%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2489934181&rft_id=info:pmid/&rfr_iscdi=true |
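
The abstract in this record describes treating each voice quality as if it were a separate speaker and classifying fixed-length i-vectors. The sketch below illustrates only that last classification step, and only under loud assumptions: the record does not say which classifier the authors used, so cosine scoring against per-class mean vectors is swapped in here as one common back-end for i-vectors. The i-vector extraction itself is not shown; the vectors are synthetic Gaussian stand-ins, and every function name, the dimension of 50, and the noise level are hypothetical choices for illustration.

```python
import numpy as np

# The five voice qualities named in the abstract.
QUALITIES = ["normal", "breathy", "fry", "twang", "hyponasal"]


def length_normalize(x):
    """Scale each row vector to unit Euclidean length."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def fit_class_means(vectors, labels, n_classes):
    """Average the length-normalized vectors of each class, then renormalize."""
    vectors = length_normalize(vectors)
    means = np.stack([vectors[labels == k].mean(axis=0) for k in range(n_classes)])
    return length_normalize(means)


def predict(vectors, class_means):
    """Assign each vector to the class mean with the highest cosine similarity."""
    scores = length_normalize(vectors) @ class_means.T
    return scores.argmax(axis=1)


def make_synthetic(rng, centers, n_per_class, noise):
    """Hypothetical stand-in for extracted i-vectors: class centre plus Gaussian noise."""
    labels = np.repeat(np.arange(len(centers)), n_per_class)
    jitter = noise * rng.standard_normal((labels.size, centers.shape[1]))
    return centers[labels] + jitter, labels


rng = np.random.default_rng(0)
centers = rng.standard_normal((len(QUALITIES), 50))  # assumed 50-dimensional vectors
train_x, train_y = make_synthetic(rng, centers, n_per_class=40, noise=0.5)
test_x, test_y = make_synthetic(rng, centers, n_per_class=20, noise=0.5)

class_means = fit_class_means(train_x, train_y, len(QUALITIES))
pred = predict(test_x, class_means)
accuracy = float((pred == test_y).mean())
print(f"accuracy on synthetic data: {accuracy:.3f}")
```

The accuracy printed here reflects only how well separated the synthetic clusters are; it says nothing about the 97.5% and 77.2% figures reported in the abstract, which come from real recordings.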