DCASE 2024 Challenge Task 2 Development Dataset
Description This dataset is the "development dataset" for the DCASE 2024 Challenge Task 2. The data consists of the normal/anomalous operating sounds of seven types of real/toy machines. Each recording is a single-channel 10-second audio that includes both a machine's operating sound...
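Main authors: | Nishida, Tomoya ; Imoto, Keisuke ; Harada, Noboru ; Niizumi, Daisuke ; Albertini, Davide ; Sannino, Roberto ; Pradolini, Simone ; Augusti, Filippo ; Dohi, Kota ; Purohit, Harsh ; Endo, Takashi ; Kawaguchi, Yohei |
---|---|
Format: | Dataset |
Language: | eng |
Subjects: | |
Online access: | Order full text |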
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Nishida, Tomoya ; Imoto, Keisuke ; Harada, Noboru ; Niizumi, Daisuke ; Albertini, Davide ; Sannino, Roberto ; Pradolini, Simone ; Augusti, Filippo ; Dohi, Kota ; Purohit, Harsh ; Endo, Takashi ; Kawaguchi, Yohei |
description | Description
This dataset is the "development dataset" for the DCASE 2024 Challenge Task 2.
The data consists of the normal/anomalous operating sounds of seven types of real/toy machines. Each recording is a single-channel 10-second audio that includes both a machine's operating sound and environmental noise. The following seven types of real/toy machines are used in this task:
ToyCar
ToyTrain
Fan
Gearbox
Bearing
Slide rail
Valve
Overview of the task
Anomalous sound detection (ASD) is the task of identifying whether the sound emitted from a target machine is normal or anomalous. Automatic detection of mechanical failure is an essential technology in the fourth industrial revolution, which involves artificial-intelligence-based factory automation. Prompt detection of machine anomalies by observing sounds is useful for monitoring the condition of machines.
This task is the follow-up from DCASE 2020 Task 2 to DCASE 2023 Task 2. The task this year is to develop an ASD system that meets the following five requirements.
1. **Train a model using only normal sound** (unsupervised learning scenario) Because anomalies rarely occur and are highly diverse in real-world factories, it can be difficult to collect exhaustive patterns of anomalous sounds. Therefore, the system must detect unknown types of anomalous sounds that are not provided in the training data. This is the same requirement as in the previous tasks.
2. **Detect anomalies regardless of domain shifts** (domain generalization task) In real-world cases, the operational states of a machine or the environmental noise can change and cause domain shifts. Domain-generalization techniques can be useful for handling domain shifts that occur frequently or are hard to notice. In this task, the system is required to use domain-generalization techniques for handling these domain shifts. This requirement is the same as in DCASE 2022 Task 2 and DCASE 2023 Task 2.
3. **Train a model for a completely new machine type** For a completely new machine type, hyperparameters of the trained model cannot be tuned. Therefore, the system should have the ability to train models without additional hyperparameter tuning. This requirement is the same as in DCASE 2023 Task 2.
4. **Train a model using a limited number of machines from its machine type** While sounds from multiple machines of the same machine type can be used to enhance the detection performance, it is often the |
doi_str_mv | 10.5281/zenodo.10850879 |
format | Dataset |
fullrecord | <record><control><sourceid>datacite_PQ8</sourceid><recordid>TN_cdi_datacite_primary_10_5281_zenodo_10850879</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>10_5281_zenodo_10850879</sourcerecordid><originalsourceid>FETCH-datacite_primary_10_5281_zenodo_108508793</originalsourceid><addsrcrecordid>eNpjYBA3NNAzNbIw1K9KzctPydczNLAwNbAwt-Rk0Hdxdgx2VTAyMDJRcM5IzMlJzUtPVQhJLM5WMFJwSS1LzckvyE3NK1FwSSxJLE4t4WFgTUvMKU7lhdLcDPpuriHOHropQPnkzJLU-IKizNzEosp4Q4N4kJXxECvjYVYak64DAJmWOQM</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>dataset</recordtype></control><display><type>dataset</type><title>DCASE 2024 Challenge Task 2 Development Dataset</title><source>DataCite</source><creator>Nishida, Tomoya ; Imoto, Keisuke ; Harada, Noboru ; Niizumi, Daisuke ; Albertini, Davide ; Sannino, Roberto ; Pradolini, Simone ; Augusti, Filippo ; Dohi, Kota ; Purohit, Harsh ; Endo, Takashi ; Kawaguchi, Yohei</creator><creatorcontrib>Nishida, Tomoya ; Imoto, Keisuke ; Harada, Noboru ; Niizumi, Daisuke ; Albertini, Davide ; Sannino, Roberto ; Pradolini, Simone ; Augusti, Filippo ; Dohi, Kota ; Purohit, Harsh ; Endo, Takashi ; Kawaguchi, Yohei</creatorcontrib><description><Data files will be made accessible on April 1st, 2024.>
Description
This dataset is the "development dataset" for the DCASE 2024 Challenge Task 2.
The data consists of the normal/anomalous operating sounds of seven types of real/toy machines. Each recording is a single-channel 10-second audio that includes both a machine's operating sound and environmental noise. The following seven types of real/toy machines are used in this task:
ToyCar
ToyTrain
Fan
Gearbox
Bearing
Slide rail
Valve
Overview of the task
Anomalous sound detection (ASD) is the task of identifying whether the sound emitted from a target machine is normal or anomalous. Automatic detection of mechanical failure is an essential technology in the fourth industrial revolution, which involves artificial-intelligence-based factory automation. Prompt detection of machine anomalies by observing sounds is useful for monitoring the condition of machines.
This task is the follow-up from DCASE 2020 Task 2 to DCASE 2023 Task 2. The task this year is to develop an ASD system that meets the following five requirements.
1. **Train a model using only normal sound** (unsupervised learning scenario) Because anomalies rarely occur and are highly diverse in real-world factories, it can be difficult to collect exhaustive patterns of anomalous sounds. Therefore, the system must detect unknown types of anomalous sounds that are not provided in the training data. This is the same requirement as in the previous tasks.
2. **Detect anomalies regardless of domain shifts** (domain generalization task) In real-world cases, the operational states of a machine or the environmental noise can change and cause domain shifts. Domain-generalization techniques can be useful for handling domain shifts that occur frequently or are hard to notice. In this task, the system is required to use domain-generalization techniques for handling these domain shifts. This requirement is the same as in DCASE 2022 Task 2 and DCASE 2023 Task 2.
3. **Train a model for a completely new machine type** For a completely new machine type, hyperparameters of the trained model cannot be tuned. Therefore, the system should have the ability to train models without additional hyperparameter tuning. This requirement is the same as in DCASE 2023 Task 2.
4. **Train a model using a limited number of machines from its machine type** While sounds from multiple machines of the same machine type can be used to enhance the detection performance, it is often the case that only a limited number of machines are available for a machine type. In such a case, the system should be able to train models using a few machines from a machine type. This requirement is the same as in DCASE 2023 Task 2.
5. **Train a model both with or without attribute information** While additional attribute information can help enhance the detection performance, we cannot always obtain such information. Therefore, the system must work well both when attribute information is available and when it is not.
The last requirement is newly introduced in DCASE 2024 Task 2.
Definition
We first define key terms in this task: "machine type," "section," "source domain," "target domain," and "attributes."
"Machine type" indicates the type of machine, which in the development dataset is one of seven: fan, gearbox, bearing, slide rail, valve, ToyCar, and ToyTrain.
A section is defined as a subset of the dataset for calculating performance metrics.
The source domain is the domain under which most of the training data and some of the test data were recorded, and the target domain is a different set of domains under which some of the training data and some of the test data were recorded. There are differences between the source and target domains in terms of operating speed, machine load, viscosity, heating temperature, type of environmental noise, signal-to-noise ratio, etc.
Attributes are parameters that define states of machines or types of noise. For several machine types, the attributes are hidden.
Dataset
This dataset consists of seven machine types. For each machine type, one section is provided, and the section is a complete set of training and test data. For each section, this dataset provides (i) 990 clips of normal sounds in the source domain for training, (ii) ten clips of normal sounds in the target domain for training, and (iii) 100 clips each of normal and anomalous sounds for the test. The source/target domain of each sample is provided. Additionally, the attributes of each sample in the training and test data are provided in the file names and attribute csv files.
File names and attribute csv files
File names and attribute csv files provide reference labels for each clip. The reference labels given for each training/test clip include the machine type, the section index, normal/anomaly information, and attributes describing conditions other than normal/anomaly. The machine type is given by the directory name. The section index is given by the file name. For the datasets other than the evaluation dataset, the normal/anomaly information and the attributes are also given by the file name. Note that for machine types whose attribute information is hidden, the attribute information in each file name is labeled only as "noAttributes". Attribute csv files provide easy access to the attributes that cause domain shifts. These files list the file names, the names of the parameters that cause domain shifts (domain shift parameter, dp), and the values or types of these parameters (domain shift value, dv). Each row takes the following format:
[filename (string)], [d1p (string)], [d1v (int | float | string)], [d2p], [d2v]...
For machine types that have their attribute information hidden, all columns except the filename column are left blank for each row.
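As a minimal sketch of how a row in this format could be read, the Python snippet below pairs each domain shift parameter with its value. The function name `parse_attribute_row` and the sample rows (file names, `speed`, `noise`, and their values) are hypothetical and chosen for illustration only. The sketch assumes plain comma-separated text and treats blank columns as hidden attributes; if the files carry a header row, it would need to be skipped first.

```python
import csv
import io


def parse_attribute_row(row):
    """Split one row into (filename, {parameter: value}).

    Columns after the file name are read as alternating
    parameter/value pairs (d1p, d1v, d2p, d2v, ...).
    Blank pairs are skipped, so a machine type with hidden
    attributes yields an empty dict.
    """
    filename = row[0].strip()
    cells = [cell.strip() for cell in row[1:]]
    attributes = {}
    for name, value in zip(cells[0::2], cells[1::2]):
        if name:
            attributes[name] = value  # values are kept as strings
    return filename, attributes


# Hypothetical rows, for illustration only (not taken from the dataset).
sample = io.StringIO(
    "example_clip_a.wav,speed,28,noise,factory1\n"
    "example_clip_b.wav,,,,\n"
)
parsed = [parse_attribute_row(row) for row in csv.reader(sample) if row]
for filename, attributes in parsed:
    print(filename, attributes)
```

Values are left as strings because the format allows int, float, or string; converting them is left to the caller, who knows which parameters are numeric.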
Recording procedure
Normal and anomalous operating sounds of machines and their related equipment were recorded. Anomalous sounds were collected by deliberately damaging target machines. To simplify the task, we use only the first channel of multi-channel recordings; all recordings are regarded as single-channel recordings of a fixed microphone. We mixed each target machine sound with environmental noise, and only noisy recordings are provided as training/test data. The environmental noise samples were recorded in several real factory environments. We will publish papers on the dataset to explain the details of the recording procedure by the submission deadline.
Directory structure
- /dev_data
    - /raw
        - /fan
            - /train (only normal clips)
                - /section_00_source_train_normal_0001_.wav
                - ...
                - /section_00_source_train_normal_0990_.wav
                - /section_00_target_train_normal_0001_.wav
                - ...
                - /section_00_target_train_normal_0010_.wav
            - /test
                - /section_00_source_test_normal_0001_.wav
                - ...
                - /section_00_source_test_normal_0050_.wav
                - /section_00_source_test_anomaly_0001_.wav
                - ...
                - /section_00_source_test_anomaly_0050_.wav
                - /section_00_target_test_normal_0001_.wav
                - ...
                - /section_00_target_test_normal_0050_.wav
                - /section_00_target_test_anomaly_0001_.wav
                - ...
                - /section_00_target_test_anomaly_0050_.wav
            - attributes_00.csv (attribute csv for section 00)
        - /gearbox (The other machine types have the same directory structure as fan.)
        - /bearing
        - /slider (`slider` means "slide rail")
        - /ToyCar
        - /ToyTrain
        - /valve
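The labels encoded in this layout can be recovered from a clip path alone. The Python sketch below is one possible way to do so; the function name `parse_clip_name` is hypothetical, and the sketch assumes that whatever follows the clip index in the file name is the attribute text (it may be empty, as in the listing above).

```python
from collections import Counter
from pathlib import PurePosixPath


def parse_clip_name(path):
    """Read the labels encoded in a clip path such as
    fan/train/section_00_source_train_normal_0001_.wav
    """
    p = PurePosixPath(path)
    # Split into at most 7 fields; the last one keeps any underscores.
    parts = p.stem.split("_", 6)
    if len(parts) != 7 or parts[0] != "section":
        raise ValueError("unexpected file name: " + p.name)
    _, section, domain, split, label, index, attributes = parts
    return {
        "machine_type": p.parent.parent.name,  # directory above train/test
        "section": section,
        "domain": domain,          # "source" or "target"
        "split": split,            # "train" or "test"
        "label": label,            # "normal" or "anomaly"
        "index": int(index),
        "attributes": attributes,  # text after the clip index; may be empty
    }


paths = [
    "fan/train/section_00_source_train_normal_0001_.wav",
    "fan/train/section_00_target_train_normal_0010_.wav",
    "fan/test/section_00_target_test_anomaly_0050_.wav",
]
clips = [parse_clip_name(path) for path in paths]
# Tally clips per (split, domain, label) to check a section's composition.
counts = Counter((c["split"], c["domain"], c["label"]) for c in clips)
print(counts)
```

Run over a full section, the same tally could be compared against the clip counts stated in the Dataset section (990 source and ten target training clips, and 100 normal plus 100 anomalous test clips).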
Baseline system
The baseline system is available in the GitHub repository <https://github.com/nttcslab/dcase2023_task2_baseline_ae>. It provides a simple entry-level approach that gives reasonable performance on the Task 2 dataset. It is a good starting point, especially for entry-level researchers who want to become familiar with the anomalous-sound-detection task.
Condition of use
This dataset was created jointly by Hitachi, Ltd. and NTT Corporation and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.
Citation
Contact
If there is any problem, please contact us:
Tomoya Nishida, tomoya.nishida.ax@hitachi.com
Keisuke Imoto, keisuke.imoto@ieee.org
Noboru Harada, noboru@ieee.org
Daisuke Niizumi, daisuke.niizumi.dt@hco.ntt.co.jp
Yohei Kawaguchi, yohei.kawaguchi.xk@hitachi.com</description><identifier>DOI: 10.5281/zenodo.10850879</identifier><language>eng</language><publisher>Zenodo</publisher><subject>acoustic condition monitoring ; acoustic event detection ; acoustic scene classification ; acoustic signal processing ; anomalous sound detection ; anomaly detection ; audio ; computational auditory scene analysis ; DCASE ; domain generalization ; domain shift ; machine fault diagnosis ; machine learning ; sound ; unsupervised learning</subject><creationdate>2024</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><orcidid>0000-0002-2329-5441 ; 0000-0002-7941-9984 ; 0000-0002-1759-4533 ; 0000-0002-5063-0508 ; 0000-0001-6478-6792 ; 0000-0002-0703-8293</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>777,1888</link.rule.ids><linktorsrc>$$Uhttps://commons.datacite.org/doi.org/10.5281/zenodo.10850879$$EView_record_in_DataCite.org$$FView_record_in_$$GDataCite.org$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Nishida, Tomoya</creatorcontrib><creatorcontrib>Imoto, Keisuke</creatorcontrib><creatorcontrib>Harada, Noboru</creatorcontrib><creatorcontrib>Niizumi, Daisuke</creatorcontrib><creatorcontrib>Albertini, Davide</creatorcontrib><creatorcontrib>Sannino, Roberto</creatorcontrib><creatorcontrib>Pradolini, Simone</creatorcontrib><creatorcontrib>Augusti, Filippo</creatorcontrib><creatorcontrib>Dohi, Kota</creatorcontrib><creatorcontrib>Purohit, Harsh</creatorcontrib><creatorcontrib>Endo, Takashi</creatorcontrib><creatorcontrib>Kawaguchi, Yohei</creatorcontrib><title>DCASE 2024 Challenge Task 2 Development Dataset</title><description><Data files will be made accessible on April 1st, 2024.>
Contact
If there is any problem, please contact us:
Tomoya Nishida, tomoya.nishida.ax@hitachi.com
Keisuke Imoto, keisuke.imoto@ieee.org
Noboru Harada, noboru@ieee.org
Daisuke Niizumi, daisuke.niizumi.dt@hco.ntt.co.jp
Yohei Kawaguchi, yohei.kawaguchi.xk@hitachi.com</description><subject>acoustic condition monitoring</subject><subject>acoustic event detection</subject><subject>acoustic scene classification</subject><subject>acoustic signal processing</subject><subject>anomalous sound detection</subject><subject>anomaly detection</subject><subject>audio</subject><subject>computational auditory scene analysis</subject><subject>DCASE</subject><subject>domain generalization</subject><subject>domain shift</subject><subject>machine fault diagnosis</subject><subject>machine learning</subject><subject>sound</subject><subject>unsupervised learning</subject><fulltext>true</fulltext><rsrctype>dataset</rsrctype><creationdate>2024</creationdate><recordtype>dataset</recordtype><sourceid>PQ8</sourceid><recordid>eNpjYBA3NNAzNbIw1K9KzctPydczNLAwNbAwt-Rk0Hdxdgx2VTAyMDJRcM5IzMlJzUtPVQhJLM5WMFJwSS1LzckvyE3NK1FwSSxJLE4t4WFgTUvMKU7lhdLcDPpuriHOHropQPnkzJLU-IKizNzEosp4Q4N4kJXxECvjYVYak64DAJmWOQM</recordid><startdate>20240401</startdate><enddate>20240401</enddate><creator>Nishida, Tomoya</creator><creator>Imoto, Keisuke</creator><creator>Harada, Noboru</creator><creator>Niizumi, Daisuke</creator><creator>Albertini, Davide</creator><creator>Sannino, Roberto</creator><creator>Pradolini, Simone</creator><creator>Augusti, Filippo</creator><creator>Dohi, Kota</creator><creator>Purohit, Harsh</creator><creator>Endo, Takashi</creator><creator>Kawaguchi, Yohei</creator><general>Zenodo</general><scope>DYCCY</scope><scope>PQ8</scope><orcidid>https://orcid.org/0000-0002-2329-5441</orcidid><orcidid>https://orcid.org/0000-0002-7941-9984</orcidid><orcidid>https://orcid.org/0000-0002-1759-4533</orcidid><orcidid>https://orcid.org/0000-0002-5063-0508</orcidid><orcidid>https://orcid.org/0000-0001-6478-6792</orcidid><orcidid>https://orcid.org/0000-0002-0703-8293</orcidid></search><sort><creationdate>20240401</creationdate><title>DCASE 2024 Challenge Task 2 Development Dataset</title><author>Nishida, Tomoya ; Imoto, 
Keisuke ; Harada, Noboru ; Niizumi, Daisuke ; Albertini, Davide ; Sannino, Roberto ; Pradolini, Simone ; Augusti, Filippo ; Dohi, Kota ; Purohit, Harsh ; Endo, Takashi ; Kawaguchi, Yohei</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-datacite_primary_10_5281_zenodo_108508793</frbrgroupid><rsrctype>datasets</rsrctype><prefilter>datasets</prefilter><language>eng</language><creationdate>2024</creationdate><topic>acoustic condition monitoring</topic><topic>acoustic event detection</topic><topic>acoustic scene classification</topic><topic>acoustic signal processing</topic><topic>anomalous sound detection</topic><topic>anomaly detection</topic><topic>audio</topic><topic>computational auditory scene analysis</topic><topic>DCASE</topic><topic>domain generalization</topic><topic>domain shift</topic><topic>machine fault diagnosis</topic><topic>machine learning</topic><topic>sound</topic><topic>unsupervised learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Nishida, Tomoya</creatorcontrib><creatorcontrib>Imoto, Keisuke</creatorcontrib><creatorcontrib>Harada, Noboru</creatorcontrib><creatorcontrib>Niizumi, Daisuke</creatorcontrib><creatorcontrib>Albertini, Davide</creatorcontrib><creatorcontrib>Sannino, Roberto</creatorcontrib><creatorcontrib>Pradolini, Simone</creatorcontrib><creatorcontrib>Augusti, Filippo</creatorcontrib><creatorcontrib>Dohi, Kota</creatorcontrib><creatorcontrib>Purohit, Harsh</creatorcontrib><creatorcontrib>Endo, Takashi</creatorcontrib><creatorcontrib>Kawaguchi, Yohei</creatorcontrib><collection>DataCite (Open Access)</collection><collection>DataCite</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Nishida, Tomoya</au><au>Imoto, Keisuke</au><au>Harada, Noboru</au><au>Niizumi, Daisuke</au><au>Albertini, Davide</au><au>Sannino, Roberto</au><au>Pradolini, Simone</au><au>Augusti, Filippo</au><au>Dohi, Kota</au><au>Purohit, 
Harsh</au><au>Endo, Takashi</au><au>Kawaguchi, Yohei</au><format>book</format><genre>unknown</genre><ristype>DATA</ristype><title>DCASE 2024 Challenge Task 2 Development Dataset</title><date>2024-04-01</date><risdate>2024</risdate><abstract><Data files will be made accessible on April 1st, 2024.>
Description
This dataset is the "development dataset" for the DCASE 2024 Challenge Task 2.
The data consists of the normal/anomalous operating sounds of seven types of real/toy machines. Each recording is a single-channel 10-second audio that includes both a machine's operating sound and environmental noise. The following seven types of real/toy machines are used in this task:
ToyCar
ToyTrain
Fan
Gearbox
Bearing
Slide rail
Valve
Overview of the task
Anomalous sound detection (ASD) is the task of identifying whether the sound emitted from a target machine is normal or anomalous. Automatic detection of mechanical failure is an essential technology in the fourth industrial revolution, which involves artificial-intelligence-based factory automation. Prompt detection of machine anomalies by observing sounds is useful for monitoring the condition of machines.
This task is the follow-up from DCASE 2020 Task 2 to DCASE 2023 Task 2. The task this year is to develop an ASD system that meets the following five requirements.
1. **Train a model using only normal sound** (unsupervised learning scenario) Because anomalies rarely occur and are highly diverse in real-world factories, it can be difficult to collect exhaustive patterns of anomalous sounds. Therefore, the system must detect unknown types of anomalous sounds that are not provided in the training data. This is the same requirement as in the previous tasks.
2. **Detect anomalies regardless of domain shifts** (domain generalization task) In real-world cases, the operational states of a machine or the environmental noise can change to cause domain shifts. Domain-generalization techniques can be useful for handling domain shifts that occur frequently or are hard-to-notice. In this task, the system is required to use domain-generalization techniques for handling these domain shifts. This requirement is the same as in DCASE 2022 Task 2 and DCASE 2023 Task 2.
3. **Train a model for a completely new machine type** For a completely new machine type, hyperparameters of the trained model cannot be tuned. Therefore, the system should have the ability to train models without additional hyperparameter tuning. This requirement is the same as in DCASE 2023 Task 2.
4. **Train a model using a limited number of machines from its machine type** While sounds from multiple machines of the same machine type can be used to enhance the detection performance, it is often the case that only a limited number of machines are available for a machine type. In such a case, the system should be able to train models using a few machines from a machine type. This requirement is the same as in DCASE 2023 Task 2.
5 . **Train a model both with or without attribute information**While additional attribute information can help enhance the detection performance, we cannot always obtain such information. Therefore, the system must work well both when attribute information is available and when it is not.
The last requirement is newly introduced in DCASE 2024 Task2.
Definition
We first define key terms in this task: "machine type," "section," "source domain," "target domain," and "attributes.".
"Machine type" indicates the type of machine, which in the development dataset is one of seven: fan, gearbox, bearing, slide rail, valve, ToyCar, and ToyTrain.
A section is defined as a subset of the dataset for calculating performance metrics.
The source domain is the domain under which most of the training data and some of the test data were recorded, and the target domain is a different set of domains under which some of the training data and some of the test data were recorded. There are differences between the source and target domains in terms of operating speed, machine load, viscosity, heating temperature, type of environmental noise, signal-to-noise ratio, etc.
Attributes are parameters that define states of machines or types of noise. For several machine types, the attributes are hidden.
Dataset
This dataset consists of seven machine types. For each machine type, one section is provided, and the section is a complete set of training and test data. For each section, this dataset provides (i) 990 clips of normal sounds in the source domain for training, (ii) ten clips of normal sounds in the target domain for training, and (iii) 100 clips each of normal and anomalous sounds for the test. The source/target domain of each sample is provided. Additionally, the attributes of each sample in the training and test data are provided in the file names and attribute csv files.
File names and attribute csv files
File names and attribute csv files provide reference labels for each clip. The given reference labels for each training/test clip include machine type, section index, normal/anomaly information, and attributes regarding the condition other than normal/anomaly. The machine type is given by the directory name. The section index is given by their respective file names. For the datasets other than the evaluation dataset, the normal/anomaly information and the attributes are given by their respective file names. Note that for machine types that has its attribute information hidden, the attribute information in each file names are only labeled as "noAttributes". Attribute csv files are for easy access to attributes that cause domain shifts. In these files, the file names, name of parameters that cause domain shifts (domain shift parameter, dp), and the value or type of these parameters (domain shift value, dv) are listed. Each row takes the following format:
[filename (string)], [d1p (string)], [d1v (int | float | string)], [d2p], [d2v]...
For machine types that have their attribute information hidden, all columns except the filename column are left blank for each row.
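As a minimal sketch of reading this row format, the Python snippet below pairs each domain shift parameter with its value. The sample rows, parameter names, and values are hypothetical and made up for illustration. The snippet also assumes there is no header row and keeps every value as a string; if the real files have a header, skip it first.

```python
import csv
import io


def parse_attribute_row(row):
    """Split one attribute csv row into (filename, {parameter: value})."""
    filename, rest = row[0], row[1:]
    attributes = {}
    # The remaining columns alternate: d1p, d1v, d2p, d2v, ...
    for name, value in zip(rest[0::2], rest[1::2]):
        if name:  # blank columns mean the attributes are hidden
            attributes[name] = value
    return filename, attributes


# Hypothetical rows for illustration only (not taken from the dataset).
sample = io.StringIO(
    "section_00_source_train_normal_0001_example.wav,speed,3,noise,factoryA\n"
    "section_00_source_train_normal_0002_noAttributes.wav,,,,\n"
)
parsed = [parse_attribute_row(row) for row in csv.reader(sample)]

for filename, attributes in parsed:
    print(filename, attributes)
```

Rows for machine types with hidden attributes yield an empty dictionary, because every column after the filename is blank.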
Recording procedure
Normal/anomalous operating sounds of machines and their related equipment were recorded. Anomalous sounds were collected by deliberately damaging target machines. To simplify the task, we use only the first channel of multi-channel recordings; all recordings are regarded as single-channel recordings of a fixed microphone. We mixed a target machine sound with environmental noise, and only noisy recordings are provided as training/test data. The environmental noise samples were recorded in several real factory environments. We will publish papers on the dataset to explain the details of the recording procedure by the submission deadline.
Directory structure
- /dev_data
    - /raw
        - /fan
            - /train (only normal clips)
                - /section_00_source_train_normal_0001_.wav
                - ...
                - /section_00_source_train_normal_0990_.wav
                - /section_00_target_train_normal_0001_.wav
                - ...
                - /section_00_target_train_normal_0010_.wav
            - /test
                - /section_00_source_test_normal_0001_.wav
                - ...
                - /section_00_source_test_normal_0050_.wav
                - /section_00_source_test_anomaly_0001_.wav
                - ...
                - /section_00_source_test_anomaly_0050_.wav
                - /section_00_target_test_normal_0001_.wav
                - ...
                - /section_00_target_test_normal_0050_.wav
                - /section_00_target_test_anomaly_0001_.wav
                - ...
                - /section_00_target_test_anomaly_0050_.wav
            - attributes_00.csv (attribute csv for section 00)
        - /gearbox (The other machine types have the same directory structure as fan.)
        - /bearing
        - /slider (`slider` means "slide rail")
        - /ToyCar
        - /ToyTrain
        - /valve
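The reference labels encoded in this layout can be recovered with a regular expression, as in the sketch below. The pattern is inferred from the file names in the listing above and is an assumption, not an official specification. Whatever follows the clip index is treated as one opaque attribute string. The helper name `parse_clip_path` is hypothetical.

```python
import re
from pathlib import PurePosixPath

# Pattern inferred from the listing: section, domain, split, label, index,
# then an attribute string (possibly empty or "noAttributes").
CLIP_NAME = re.compile(
    r"section_(?P<section>\d+)_(?P<domain>source|target)_"
    r"(?P<split>train|test)_(?P<label>normal|anomaly)_"
    r"(?P<index>\d+)_(?P<attributes>.*)\.wav"
)


def parse_clip_path(path):
    """Return the labels encoded in a clip path as a dict of strings."""
    clip = PurePosixPath(path)
    match = CLIP_NAME.fullmatch(clip.name)
    if match is None:
        raise ValueError(f"unexpected clip name: {clip.name}")
    info = match.groupdict()
    # The machine type is the directory two levels above the clip.
    info["machine_type"] = clip.parent.parent.name
    return info


info = parse_clip_path(
    "dev_data/raw/fan/train/section_00_source_train_normal_0001_.wav"
)
print(info)
```

The section index and clip index are kept as zero-padded strings so that they can be matched against file names and attribute csv entries without reformatting.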
Baseline system
The baseline system is available in the GitHub repository <https://github.com/nttcslab/dcase2023_task2_baseline_ae>. The baseline systems provide a simple entry-level approach that gives reasonable performance on the Task 2 dataset. They are good starting points, especially for entry-level researchers who want to get familiar with the anomalous-sound-detection task.
Condition of use
This dataset was created jointly by Hitachi, Ltd. and NTT Corporation and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.
Citation
Contact
If there is any problem, please contact us:
Tomoya Nishida, tomoya.nishida.ax@hitachi.com
Keisuke Imoto, keisuke.imoto@ieee.org
Noboru Harada, noboru@ieee.org
Daisuke Niizumi, daisuke.niizumi.dt@hco.ntt.co.jp
Yohei Kawaguchi, yohei.kawaguchi.xk@hitachi.com</abstract><pub>Zenodo</pub><doi>10.5281/zenodo.10850879</doi><orcidid>https://orcid.org/0000-0002-2329-5441</orcidid><orcidid>https://orcid.org/0000-0002-7941-9984</orcidid><orcidid>https://orcid.org/0000-0002-1759-4533</orcidid><orcidid>https://orcid.org/0000-0002-5063-0508</orcidid><orcidid>https://orcid.org/0000-0001-6478-6792</orcidid><orcidid>https://orcid.org/0000-0002-0703-8293</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.5281/zenodo.10850879 |
ispartof | |
issn | |
language | eng |
recordid | cdi_datacite_primary_10_5281_zenodo_10850879 |
source | DataCite |
subjects | acoustic condition monitoring; acoustic event detection; acoustic scene classification; acoustic signal processing; anomalous sound detection; anomaly detection; audio; computational auditory scene analysis; DCASE; domain generalization; domain shift; machine fault diagnosis; machine learning; sound; unsupervised learning |
title | DCASE 2024 Challenge Task 2 Development Dataset |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T03%3A47%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-datacite_PQ8&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=unknown&rft.au=Nishida,%20Tomoya&rft.date=2024-04-01&rft_id=info:doi/10.5281/zenodo.10850879&rft_dat=%3Cdatacite_PQ8%3E10_5281_zenodo_10850879%3C/datacite_PQ8%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |