AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR

Africa has a very low doctor-to-patient ratio. At very busy clinics, doctors could see 30+ patients per day -- a heavy patient burden compared with developed countries -- but productivity tools such as clinical automatic speech recognition (ASR) are lacking for these overworked clinicians. However,...

Bibliographic details
Main authors:	Olatunji, Tobi; Afonja, Tejumade; Yadavalli, Aditya; Emezue, Chris Chinenye; Singh, Sahib; Dossou, Bonaventure F. P; Osuchukwu, Joanne; Osei, Salomey; Tonja, Atnafu Lambebo; Etori, Naome; Mbataku, Clinton
Format:	Article
Language:	eng
Subjects:	Computer Science - Computation and Language

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Olatunji, Tobi ; Afonja, Tejumade ; Yadavalli, Aditya ; Emezue, Chris Chinenye ; Singh, Sahib ; Dossou, Bonaventure F. P ; Osuchukwu, Joanne ; Osei, Salomey ; Tonja, Atnafu Lambebo ; Etori, Naome ; Mbataku, Clinton
description	Africa has a very low doctor-to-patient ratio. At very busy clinics, doctors could see 30+ patients per day -- a heavy patient burden compared with developed countries -- but productivity tools such as clinical automatic speech recognition (ASR) are lacking for these overworked clinicians. However, clinical ASR is mature, even ubiquitous, in developed nations, and clinician-reported performance of commercial clinical ASR systems is generally satisfactory. Furthermore, the recent performance of general domain ASR is approaching human accuracy. However, several gaps exist. Several publications have highlighted racial bias with speech-to-text algorithms and performance on minority accents lags significantly. To our knowledge, there is no publicly available research or benchmark on accented African clinical ASR, and speech data is non-existent for the majority of African accents. We release AfriSpeech, 200hrs of Pan-African English speech, 67,577 clips from 2,463 unique speakers across 120 indigenous accents from 13 countries for clinical and general domain ASR, a benchmark test set, with publicly available pre-trained models with SOTA performance on the AfriSpeech benchmark.
doi_str_mv	10.48550/arxiv.2310.00274
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2310_00274</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2310_00274</sourcerecordid><originalsourceid>FETCH-LOGICAL-a674-e74b45ea9f4d4cabe27ab0b621e4e4635df230cf4add133b4a258076611b19e63</originalsourceid><addsrcrecordid>eNotj09LAzEUxHPxINUP4Ml8gdT8eZttvS1bbQsFxRY8Li_JCwa2aUkX0W_vtvU0wzAz8GPsQckpzKpKPmH5Sd9TbcZASl3DLftsYknbI5H_ElrKZ_6OWZwzj5k33lMeKPBrgS9wwBMNPB4Kb_uUx1LPMQe-pExl9IvDHtO4237csZuI_Ynu_3XCdq8vu3YlNm_LddtsBNoaBNXgoCKcRwjg0ZGu0UlntSIgsKYKURvpI2AIyhgHqKuZrK1Vyqk5WTNhj9fbC1l3LGmP5bc7E3YXQvMHEwFJ6w</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR</title><source>arXiv.org</source><creator>Olatunji, Tobi ; Afonja, Tejumade ; Yadavalli, Aditya ; Emezue, Chris Chinenye ; Singh, Sahib ; Dossou, Bonaventure F. P ; Osuchukwu, Joanne ; Osei, Salomey ; Tonja, Atnafu Lambebo ; Etori, Naome ; Mbataku, Clinton</creator><creatorcontrib>Olatunji, Tobi ; Afonja, Tejumade ; Yadavalli, Aditya ; Emezue, Chris Chinenye ; Singh, Sahib ; Dossou, Bonaventure F. P ; Osuchukwu, Joanne ; Osei, Salomey ; Tonja, Atnafu Lambebo ; Etori, Naome ; Mbataku, Clinton</creatorcontrib><description>Africa has a very low doctor-to-patient ratio. At very busy clinics, doctors could see 30+ patients per day -- a heavy patient burden compared with developed countries -- but productivity tools such as clinical automatic speech recognition (ASR) are lacking for these overworked clinicians. However, clinical ASR is mature, even ubiquitous, in developed nations, and clinician-reported performance of commercial clinical ASR systems is generally satisfactory. Furthermore, the recent performance of general domain ASR is approaching human accuracy. However, several gaps exist. 
Several publications have highlighted racial bias with speech-to-text algorithms and performance on minority accents lags significantly. To our knowledge, there is no publicly available research or benchmark on accented African clinical ASR, and speech data is non-existent for the majority of African accents. We release AfriSpeech, 200hrs of Pan-African English speech, 67,577 clips from 2,463 unique speakers across 120 indigenous accents from 13 countries for clinical and general domain ASR, a benchmark test set, with publicly available pre-trained models with SOTA performance on the AfriSpeech benchmark.</description><identifier>DOI: 10.48550/arxiv.2310.00274</identifier><language>eng</language><subject>Computer Science - Computation and Language</subject><creationdate>2023-09</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2310.00274$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2310.00274$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Olatunji, Tobi</creatorcontrib><creatorcontrib>Afonja, Tejumade</creatorcontrib><creatorcontrib>Yadavalli, Aditya</creatorcontrib><creatorcontrib>Emezue, Chris Chinenye</creatorcontrib><creatorcontrib>Singh, Sahib</creatorcontrib><creatorcontrib>Dossou, Bonaventure F. 
P</creatorcontrib><creatorcontrib>Osuchukwu, Joanne</creatorcontrib><creatorcontrib>Osei, Salomey</creatorcontrib><creatorcontrib>Tonja, Atnafu Lambebo</creatorcontrib><creatorcontrib>Etori, Naome</creatorcontrib><creatorcontrib>Mbataku, Clinton</creatorcontrib><title>AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR</title><description>Africa has a very low doctor-to-patient ratio. At very busy clinics, doctors could see 30+ patients per day -- a heavy patient burden compared with developed countries -- but productivity tools such as clinical automatic speech recognition (ASR) are lacking for these overworked clinicians. However, clinical ASR is mature, even ubiquitous, in developed nations, and clinician-reported performance of commercial clinical ASR systems is generally satisfactory. Furthermore, the recent performance of general domain ASR is approaching human accuracy. However, several gaps exist. Several publications have highlighted racial bias with speech-to-text algorithms and performance on minority accents lags significantly. To our knowledge, there is no publicly available research or benchmark on accented African clinical ASR, and speech data is non-existent for the majority of African accents. 
We release AfriSpeech, 200hrs of Pan-African English speech, 67,577 clips from 2,463 unique speakers across 120 indigenous accents from 13 countries for clinical and general domain ASR, a benchmark test set, with publicly available pre-trained models with SOTA performance on the AfriSpeech benchmark.</description><subject>Computer Science - Computation and Language</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj09LAzEUxHPxINUP4Ml8gdT8eZttvS1bbQsFxRY8Li_JCwa2aUkX0W_vtvU0wzAz8GPsQckpzKpKPmH5Sd9TbcZASl3DLftsYknbI5H_ElrKZ_6OWZwzj5k33lMeKPBrgS9wwBMNPB4Kb_uUx1LPMQe-pExl9IvDHtO4237csZuI_Ynu_3XCdq8vu3YlNm_LddtsBNoaBNXgoCKcRwjg0ZGu0UlntSIgsKYKURvpI2AIyhgHqKuZrK1Vyqk5WTNhj9fbC1l3LGmP5bc7E3YXQvMHEwFJ6w</recordid><startdate>20230930</startdate><enddate>20230930</enddate><creator>Olatunji, Tobi</creator><creator>Afonja, Tejumade</creator><creator>Yadavalli, Aditya</creator><creator>Emezue, Chris Chinenye</creator><creator>Singh, Sahib</creator><creator>Dossou, Bonaventure F. P</creator><creator>Osuchukwu, Joanne</creator><creator>Osei, Salomey</creator><creator>Tonja, Atnafu Lambebo</creator><creator>Etori, Naome</creator><creator>Mbataku, Clinton</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230930</creationdate><title>AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR</title><author>Olatunji, Tobi ; Afonja, Tejumade ; Yadavalli, Aditya ; Emezue, Chris Chinenye ; Singh, Sahib ; Dossou, Bonaventure F. 
P ; Osuchukwu, Joanne ; Osei, Salomey ; Tonja, Atnafu Lambebo ; Etori, Naome ; Mbataku, Clinton</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a674-e74b45ea9f4d4cabe27ab0b621e4e4635df230cf4add133b4a258076611b19e63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Computation and Language</topic><toplevel>online_resources</toplevel><creatorcontrib>Olatunji, Tobi</creatorcontrib><creatorcontrib>Afonja, Tejumade</creatorcontrib><creatorcontrib>Yadavalli, Aditya</creatorcontrib><creatorcontrib>Emezue, Chris Chinenye</creatorcontrib><creatorcontrib>Singh, Sahib</creatorcontrib><creatorcontrib>Dossou, Bonaventure F. P</creatorcontrib><creatorcontrib>Osuchukwu, Joanne</creatorcontrib><creatorcontrib>Osei, Salomey</creatorcontrib><creatorcontrib>Tonja, Atnafu Lambebo</creatorcontrib><creatorcontrib>Etori, Naome</creatorcontrib><creatorcontrib>Mbataku, Clinton</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Olatunji, Tobi</au><au>Afonja, Tejumade</au><au>Yadavalli, Aditya</au><au>Emezue, Chris Chinenye</au><au>Singh, Sahib</au><au>Dossou, Bonaventure F. P</au><au>Osuchukwu, Joanne</au><au>Osei, Salomey</au><au>Tonja, Atnafu Lambebo</au><au>Etori, Naome</au><au>Mbataku, Clinton</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR</atitle><date>2023-09-30</date><risdate>2023</risdate><abstract>Africa has a very low doctor-to-patient ratio. 
At very busy clinics, doctors could see 30+ patients per day -- a heavy patient burden compared with developed countries -- but productivity tools such as clinical automatic speech recognition (ASR) are lacking for these overworked clinicians. However, clinical ASR is mature, even ubiquitous, in developed nations, and clinician-reported performance of commercial clinical ASR systems is generally satisfactory. Furthermore, the recent performance of general domain ASR is approaching human accuracy. However, several gaps exist. Several publications have highlighted racial bias with speech-to-text algorithms and performance on minority accents lags significantly. To our knowledge, there is no publicly available research or benchmark on accented African clinical ASR, and speech data is non-existent for the majority of African accents. We release AfriSpeech, 200hrs of Pan-African English speech, 67,577 clips from 2,463 unique speakers across 120 indigenous accents from 13 countries for clinical and general domain ASR, a benchmark test set, with publicly available pre-trained models with SOTA performance on the AfriSpeech benchmark.</abstract><doi>10.48550/arxiv.2310.00274</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2310.00274
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2310_00274
source	arXiv.org
subjects	Computer Science - Computation and Language
title	AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T09%3A35%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=AfriSpeech-200:%20Pan-African%20Accented%20Speech%20Dataset%20for%20Clinical%20and%20General%20Domain%20ASR&rft.au=Olatunji,%20Tobi&rft.date=2023-09-30&rft_id=info:doi/10.48550/arxiv.2310.00274&rft_dat=%3Carxiv_GOX%3E2310_00274%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true