Gender-ambiguous voice generation through feminine speaking style transfer in male voices
Main authors: | Koutsogiannaki, Maria; Dowall, Shafel Mc; Agiomyrgiannakis, Ioannis |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Sound |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Koutsogiannaki, Maria Dowall, Shafel Mc Agiomyrgiannakis, Ioannis |
description | Recently, and under the umbrella of Responsible AI, efforts have been made to
develop gender-ambiguous synthetic speech to represent with a single voice all
individuals in the gender spectrum. However, research efforts have completely
overlooked the speaking style despite differences found among binary and
non-binary populations. In this work, we synthesise gender-ambiguous speech by
combining the timbre of a male speaker with the manner of speech of a female
speaker using voice morphing and pitch shifting towards the male-female
boundary. Subjective evaluations indicate that the morphed samples that convey
the female speech style are perceived as more ambiguous than those that undergo
plain pitch transformations, suggesting that the speaking style can be a
contributing factor in creating gender-ambiguous speech. To our knowledge, this
is the first study that explicitly uses the transfer of the speaking style to
create gender-ambiguous voices. |
doi_str_mv | 10.48550/arxiv.2403.07661 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2403_07661</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2403_07661</sourcerecordid><sourcetype>Open Access Repository</sourcetype><recordtype>article</recordtype></control><display><type>article</type><title>Gender-ambiguous voice generation through feminine speaking style transfer in male voices</title><source>arXiv.org</source><creator>Koutsogiannaki, Maria ; Dowall, Shafel Mc ; Agiomyrgiannakis, Ioannis</creator><identifier>DOI: 10.48550/arxiv.2403.07661</identifier><language>eng</language><subject>Computer Science - Sound</subject><creationdate>2024-03</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa></display><links><linktorsrc>https://arxiv.org/abs/2403.07661</linktorsrc><backlink>https://doi.org/10.48550/arXiv.2403.07661</backlink></links><addata><date>2024-03-12</date><doi>10.48550/arxiv.2403.07661</doi></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2403.07661 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2403_07661 |
source | arXiv.org |
subjects | Computer Science - Sound |
title | Gender-ambiguous voice generation through feminine speaking style transfer in male voices |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T02%3A45%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Gender-ambiguous%20voice%20generation%20through%20feminine%20speaking%20style%20transfer%20in%20male%20voices&rft.au=Koutsogiannaki,%20Maria&rft.date=2024-03-12&rft_id=info:doi/10.48550/arxiv.2403.07661&rft_dat=%3Carxiv_GOX%3E2403_07661%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |