Coughs: ESC-50 and FSDKaggle2018

Bibliographic details
Main authors:	Abdelkhalek, Mahmoud; Qiu, Jinyi; Hernandez, Michelle; Bozkurt, Alper; Lobaton, Edgar
Format:	Dataset
Language:	English
Keywords:	audio dataset Kaggle

Description
Summary:	This dataset consists of timestamps for coughs contained in files extracted from the ESC-50 and FSDKaggle2018 datasets.

Citation
This dataset was generated and used in our paper: Mahmoud Abdelkhalek, Jinyi Qiu, Michelle Hernandez, Alper Bozkurt, Edgar Lobaton, “Investigating the Relationship between Cough Detection and Sampling Frequency for Wearable Devices,” in the 43rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2021. Please cite this paper if you use the timestamps.csv file in your work.

Generation
The cough timestamps in the timestamps.csv file were generated using the cough templates given in figures 3 and 4 of the paper: A. H. Morice, G. A. Fontana, M. G. Belvisi, S. S. Birring, K. F. Chung, P. V. Dicpinigaitis, J. A. Kastelik, L. P. McGarvey, J. A. Smith, M. Tatar, J. Widdicombe, "ERS guidelines on the assessment of cough", European Respiratory Journal 2007 29: 1256-1276; DOI: 10.1183/09031936.00101006.

More precisely, 40 files labelled as "coughing" in the ESC-50 dataset and 273 files labelled as "Cough" in the FSDKaggle2018 dataset were manually searched using Audacity for segments of audio that closely matched these templates, both visually and by ear. Some files did not contain any coughs at all, while other files contained several coughs. Therefore, only the files that contained at least one cough are included in the coughs directory. In total, the timestamps of 768 cough segments with lengths ranging from 0.2 seconds to 0.9 seconds were extracted.

Description
The audio files are provided in WAV format in the coughs directory. Files named in the general format "*-*-*-24.wav" were extracted from the ESC-50 dataset, while all other files were extracted from the FSDKaggle2018 dataset.
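The file-naming rule can be applied programmatically to recover which source dataset a clip came from. The following is a minimal sketch, not part of the dataset itself. It assumes the ESC-50 pattern is the glob `*-*-*-24.wav` (ESC-50 names its clips fold-clip_id-take-target, and target 24 is the coughing class). The function name `source_dataset` and the example file names are hypothetical.

```python
from fnmatch import fnmatchcase

# Assumption: ESC-50 cough clips match "*-*-*-24.wav"
# (fold-clip_id-take-target, with target 24 = coughing).
ESC50_COUGH_GLOB = "*-*-*-24.wav"


def source_dataset(file_name: str) -> str:
    """Return the dataset a file in the coughs directory was extracted from."""
    if fnmatchcase(file_name, ESC50_COUGH_GLOB):
        return "ESC-50"
    # Per the description, every other file comes from FSDKaggle2018.
    return "FSDKaggle2018"


# Illustrative file names only; they are not taken from the dataset listing.
for name in ["1-19111-A-24.wav", "00044347.wav"]:
    print(name, "->", source_dataset(name))
```

Knowing the source of each file matters mainly for licensing, since the two source datasets are distributed under their own terms.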
The timestamps.csv file contains the timestamps for the coughs and consists of four columns: file_name, cough_number, start_time, end_time. Files in the file_name column can be found in the coughs directory. cough_number is the index of the cough within the corresponding file. For example, if the file X.wav contains 5 coughs, then X.wav is repeated 5 times in the file_name column, and cough_number runs from 1 to 5 across those rows. start_time and end_time are the start and end of a cough segment, both measured in seconds.

Licensing
The ESC-50 dataset as a whole is licensed under the Creative Commons Attribution-NonC
DOI:	10.5281/zenodo.5136591
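
The four-column layout of timestamps.csv described in the summary can be read with Python's standard csv module. The sketch below is illustrative only: it assumes the file starts with a header row naming the four columns, and the sample rows, the helper names `read_cough_segments` and `seconds_to_samples`, and the 16 kHz sample rate are hypothetical.

```python
import csv
import io
from collections import defaultdict


def read_cough_segments(csv_file):
    """Group cough segments by file name.

    Returns a dict mapping file_name to a list of
    (cough_number, start_time, end_time) tuples, sorted by cough_number.
    Times are in seconds, as in timestamps.csv.
    """
    segments = defaultdict(list)
    # Assumption: the first row is the header
    # file_name,cough_number,start_time,end_time
    for row in csv.DictReader(csv_file):
        segments[row["file_name"]].append(
            (int(row["cough_number"]),
             float(row["start_time"]),
             float(row["end_time"]))
        )
    return {name: sorted(items) for name, items in segments.items()}


def seconds_to_samples(start_time, end_time, sample_rate):
    """Convert a segment in seconds to (start, end) sample indices."""
    return round(start_time * sample_rate), round(end_time * sample_rate)


# Hypothetical rows in the documented format, not real dataset values.
SAMPLE_CSV = """file_name,cough_number,start_time,end_time
X.wav,1,0.50,0.90
X.wav,2,1.25,1.75
Y.wav,1,0.10,0.60
"""

segments = read_cough_segments(io.StringIO(SAMPLE_CSV))
for name, coughs in segments.items():
    for number, start, end in coughs:
        first, last = seconds_to_samples(start, end, sample_rate=16000)
        print(f"{name} cough {number}: samples {first} to {last}")
```

To read the real file, the same function can be given an open file handle, for example `open("timestamps.csv", newline="")`. The sample indices can then be used to slice the waveform of the matching file in the coughs directory, using that file's actual sample rate.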