AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR
Africa has a very low doctor-to-patient ratio. At very busy clinics, doctors could see 30+ patients per day -- a heavy patient burden compared with developed countries -- but productivity tools such as clinical automatic speech recognition (ASR) are lacking for these overworked clinicians. However,...
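Main authors: | Olatunji, Tobi; Afonja, Tejumade; Yadavalli, Aditya; Emezue, Chris Chinenye; Singh, Sahib; Dossou, Bonaventure F. P; Osuchukwu, Joanne; Osei, Salomey; Tonja, Atnafu Lambebo; Etori, Naome; Mbataku, Clinton |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |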
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Olatunji, Tobi ; Afonja, Tejumade ; Yadavalli, Aditya ; Emezue, Chris Chinenye ; Singh, Sahib ; Dossou, Bonaventure F. P ; Osuchukwu, Joanne ; Osei, Salomey ; Tonja, Atnafu Lambebo ; Etori, Naome ; Mbataku, Clinton |
description | Africa has a very low doctor-to-patient ratio. At very busy clinics, doctors
could see 30+ patients per day -- a heavy patient burden compared with
developed countries -- but productivity tools such as clinical automatic speech
recognition (ASR) are lacking for these overworked clinicians. However,
clinical ASR is mature, even ubiquitous, in developed nations, and
clinician-reported performance of commercial clinical ASR systems is generally
satisfactory. Furthermore, the recent performance of general domain ASR is
approaching human accuracy. However, several gaps exist. Several publications
have highlighted racial bias with speech-to-text algorithms and performance on
minority accents lags significantly. To our knowledge, there is no publicly
available research or benchmark on accented African clinical ASR, and speech
data is non-existent for the majority of African accents. We release
AfriSpeech, 200hrs of Pan-African English speech, 67,577 clips from 2,463
unique speakers across 120 indigenous accents from 13 countries for clinical
and general domain ASR, a benchmark test set, with publicly available
pre-trained models with SOTA performance on the AfriSpeech benchmark. |
doi_str_mv | 10.48550/arxiv.2310.00274 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2310_00274</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2310_00274</sourcerecordid><originalsourceid>FETCH-LOGICAL-a674-e74b45ea9f4d4cabe27ab0b621e4e4635df230cf4add133b4a258076611b19e63</originalsourceid><addsrcrecordid>eNotj09LAzEUxHPxINUP4Ml8gdT8eZttvS1bbQsFxRY8Li_JCwa2aUkX0W_vtvU0wzAz8GPsQckpzKpKPmH5Sd9TbcZASl3DLftsYknbI5H_ElrKZ_6OWZwzj5k33lMeKPBrgS9wwBMNPB4Kb_uUx1LPMQe-pExl9IvDHtO4237csZuI_Ynu_3XCdq8vu3YlNm_LddtsBNoaBNXgoCKcRwjg0ZGu0UlntSIgsKYKURvpI2AIyhgHqKuZrK1Vyqk5WTNhj9fbC1l3LGmP5bc7E3YXQvMHEwFJ6w</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR</title><source>arXiv.org</source><creator>Olatunji, Tobi ; Afonja, Tejumade ; Yadavalli, Aditya ; Emezue, Chris Chinenye ; Singh, Sahib ; Dossou, Bonaventure F. P ; Osuchukwu, Joanne ; Osei, Salomey ; Tonja, Atnafu Lambebo ; Etori, Naome ; Mbataku, Clinton</creator><creatorcontrib>Olatunji, Tobi ; Afonja, Tejumade ; Yadavalli, Aditya ; Emezue, Chris Chinenye ; Singh, Sahib ; Dossou, Bonaventure F. P ; Osuchukwu, Joanne ; Osei, Salomey ; Tonja, Atnafu Lambebo ; Etori, Naome ; Mbataku, Clinton</creatorcontrib><description>Africa has a very low doctor-to-patient ratio. At very busy clinics, doctors
could see 30+ patients per day -- a heavy patient burden compared with
developed countries -- but productivity tools such as clinical automatic speech
recognition (ASR) are lacking for these overworked clinicians. However,
clinical ASR is mature, even ubiquitous, in developed nations, and
clinician-reported performance of commercial clinical ASR systems is generally
satisfactory. Furthermore, the recent performance of general domain ASR is
approaching human accuracy. However, several gaps exist. Several publications
have highlighted racial bias with speech-to-text algorithms and performance on
minority accents lags significantly. To our knowledge, there is no publicly
available research or benchmark on accented African clinical ASR, and speech
data is non-existent for the majority of African accents. We release
AfriSpeech, 200hrs of Pan-African English speech, 67,577 clips from 2,463
unique speakers across 120 indigenous accents from 13 countries for clinical
and general domain ASR, a benchmark test set, with publicly available
pre-trained models with SOTA performance on the AfriSpeech benchmark.</description><identifier>DOI: 10.48550/arxiv.2310.00274</identifier><language>eng</language><subject>Computer Science - Computation and Language</subject><creationdate>2023-09</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2310.00274$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2310.00274$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Olatunji, Tobi</creatorcontrib><creatorcontrib>Afonja, Tejumade</creatorcontrib><creatorcontrib>Yadavalli, Aditya</creatorcontrib><creatorcontrib>Emezue, Chris Chinenye</creatorcontrib><creatorcontrib>Singh, Sahib</creatorcontrib><creatorcontrib>Dossou, Bonaventure F. P</creatorcontrib><creatorcontrib>Osuchukwu, Joanne</creatorcontrib><creatorcontrib>Osei, Salomey</creatorcontrib><creatorcontrib>Tonja, Atnafu Lambebo</creatorcontrib><creatorcontrib>Etori, Naome</creatorcontrib><creatorcontrib>Mbataku, Clinton</creatorcontrib><title>AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR</title><description>Africa has a very low doctor-to-patient ratio. At very busy clinics, doctors
could see 30+ patients per day -- a heavy patient burden compared with
developed countries -- but productivity tools such as clinical automatic speech
recognition (ASR) are lacking for these overworked clinicians. However,
clinical ASR is mature, even ubiquitous, in developed nations, and
clinician-reported performance of commercial clinical ASR systems is generally
satisfactory. Furthermore, the recent performance of general domain ASR is
approaching human accuracy. However, several gaps exist. Several publications
have highlighted racial bias with speech-to-text algorithms and performance on
minority accents lags significantly. To our knowledge, there is no publicly
available research or benchmark on accented African clinical ASR, and speech
data is non-existent for the majority of African accents. We release
AfriSpeech, 200hrs of Pan-African English speech, 67,577 clips from 2,463
unique speakers across 120 indigenous accents from 13 countries for clinical
and general domain ASR, a benchmark test set, with publicly available
pre-trained models with SOTA performance on the AfriSpeech benchmark.</description><subject>Computer Science - Computation and Language</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj09LAzEUxHPxINUP4Ml8gdT8eZttvS1bbQsFxRY8Li_JCwa2aUkX0W_vtvU0wzAz8GPsQckpzKpKPmH5Sd9TbcZASl3DLftsYknbI5H_ElrKZ_6OWZwzj5k33lMeKPBrgS9wwBMNPB4Kb_uUx1LPMQe-pExl9IvDHtO4237csZuI_Ynu_3XCdq8vu3YlNm_LddtsBNoaBNXgoCKcRwjg0ZGu0UlntSIgsKYKURvpI2AIyhgHqKuZrK1Vyqk5WTNhj9fbC1l3LGmP5bc7E3YXQvMHEwFJ6w</recordid><startdate>20230930</startdate><enddate>20230930</enddate><creator>Olatunji, Tobi</creator><creator>Afonja, Tejumade</creator><creator>Yadavalli, Aditya</creator><creator>Emezue, Chris Chinenye</creator><creator>Singh, Sahib</creator><creator>Dossou, Bonaventure F. P</creator><creator>Osuchukwu, Joanne</creator><creator>Osei, Salomey</creator><creator>Tonja, Atnafu Lambebo</creator><creator>Etori, Naome</creator><creator>Mbataku, Clinton</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230930</creationdate><title>AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR</title><author>Olatunji, Tobi ; Afonja, Tejumade ; Yadavalli, Aditya ; Emezue, Chris Chinenye ; Singh, Sahib ; Dossou, Bonaventure F. 
P ; Osuchukwu, Joanne ; Osei, Salomey ; Tonja, Atnafu Lambebo ; Etori, Naome ; Mbataku, Clinton</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a674-e74b45ea9f4d4cabe27ab0b621e4e4635df230cf4add133b4a258076611b19e63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Computation and Language</topic><toplevel>online_resources</toplevel><creatorcontrib>Olatunji, Tobi</creatorcontrib><creatorcontrib>Afonja, Tejumade</creatorcontrib><creatorcontrib>Yadavalli, Aditya</creatorcontrib><creatorcontrib>Emezue, Chris Chinenye</creatorcontrib><creatorcontrib>Singh, Sahib</creatorcontrib><creatorcontrib>Dossou, Bonaventure F. P</creatorcontrib><creatorcontrib>Osuchukwu, Joanne</creatorcontrib><creatorcontrib>Osei, Salomey</creatorcontrib><creatorcontrib>Tonja, Atnafu Lambebo</creatorcontrib><creatorcontrib>Etori, Naome</creatorcontrib><creatorcontrib>Mbataku, Clinton</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Olatunji, Tobi</au><au>Afonja, Tejumade</au><au>Yadavalli, Aditya</au><au>Emezue, Chris Chinenye</au><au>Singh, Sahib</au><au>Dossou, Bonaventure F. P</au><au>Osuchukwu, Joanne</au><au>Osei, Salomey</au><au>Tonja, Atnafu Lambebo</au><au>Etori, Naome</au><au>Mbataku, Clinton</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR</atitle><date>2023-09-30</date><risdate>2023</risdate><abstract>Africa has a very low doctor-to-patient ratio. At very busy clinics, doctors
could see 30+ patients per day -- a heavy patient burden compared with
developed countries -- but productivity tools such as clinical automatic speech
recognition (ASR) are lacking for these overworked clinicians. However,
clinical ASR is mature, even ubiquitous, in developed nations, and
clinician-reported performance of commercial clinical ASR systems is generally
satisfactory. Furthermore, the recent performance of general domain ASR is
approaching human accuracy. However, several gaps exist. Several publications
have highlighted racial bias with speech-to-text algorithms and performance on
minority accents lags significantly. To our knowledge, there is no publicly
available research or benchmark on accented African clinical ASR, and speech
data is non-existent for the majority of African accents. We release
AfriSpeech, 200hrs of Pan-African English speech, 67,577 clips from 2,463
unique speakers across 120 indigenous accents from 13 countries for clinical
and general domain ASR, a benchmark test set, with publicly available
pre-trained models with SOTA performance on the AfriSpeech benchmark.</abstract><doi>10.48550/arxiv.2310.00274</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2310.00274 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2310_00274 |
source | arXiv.org |
subjects | Computer Science - Computation and Language |
title | AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T09%3A35%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=AfriSpeech-200:%20Pan-African%20Accented%20Speech%20Dataset%20for%20Clinical%20and%20General%20Domain%20ASR&rft.au=Olatunji,%20Tobi&rft.date=2023-09-30&rft_id=info:doi/10.48550/arxiv.2310.00274&rft_dat=%3Carxiv_GOX%3E2310_00274%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |