A large-scale and PCR-referenced vocal audio dataset for COVID-19
Main authors: | |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Summary: | The UK COVID-19 Vocal Audio Dataset is designed for the training and
evaluation of machine learning models that classify SARS-CoV-2 infection status
or associated respiratory symptoms using vocal audio. The UK Health Security
Agency recruited voluntary participants through the national Test and Trace
programme and the REACT-1 survey in England from March 2021 to March 2022,
during dominant transmission of the Alpha and Delta SARS-CoV-2 variants and
some Omicron variant sublineages. Audio recordings of volitional coughs,
exhalations, and speech were collected in the 'Speak up to help beat
coronavirus' digital survey alongside demographic, self-reported symptom and
respiratory condition data, and linked to SARS-CoV-2 test results. The UK
COVID-19 Vocal Audio Dataset represents the largest collection of SARS-CoV-2
PCR-referenced audio recordings to date. PCR results were linked to 70,794 of
72,999 participants and 24,155 of 25,776 positive cases. Respiratory symptoms
were reported by 45.62% of participants. This dataset has additional potential
uses for bioacoustics research, with 11.30% of participants reporting asthma, and
27.20% with linked influenza PCR test results. |
---|---|
DOI: | 10.48550/arxiv.2212.07738 |
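
The summary gives PCR linkage as raw counts (70,794 of 72,999 participants; 24,155 of 25,776 positive cases) rather than as rates. The short sketch below converts those counts to percentages. The variable and function names are illustrative and are not taken from the dataset or its documentation; only the four counts come from the record above.

```python
# Counts quoted in the summary; all names below are illustrative assumptions.
participants_total = 72_999
participants_pcr_linked = 70_794
positives_total = 25_776
positives_pcr_linked = 24_155


def linkage_rate(linked: int, total: int) -> float:
    """Return the share of records with a linked PCR result, in percent."""
    return 100.0 * linked / total


participant_rate = linkage_rate(participants_pcr_linked, participants_total)
positive_rate = linkage_rate(positives_pcr_linked, positives_total)

print(f"{participant_rate:.2f}% of participants have a linked PCR result")
print(f"{positive_rate:.2f}% of positive cases have a linked PCR result")
```

Both rates are derived purely from the counts in the summary, so they inherit whatever rounding or inclusion criteria the authors applied to those counts.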