Robust and Interpretable Temporal Convolution Network for Event Detection in Lung Sound Recordings

This paper proposes a novel framework for lung sound event detection, segmenting continuous lung sound recordings into discrete events and performing recognition on each event. Exploiting the lightweight nature of Temporal Convolution Networks (TCNs) and their superior results compared to their recurrent counterparts, we propose a lightweight, yet robust, and completely interpretable framework for lung sound event detection.

Detailed Description

Saved in:
Bibliographic Details
Published in: arXiv.org 2021-06
Main Authors: Tharindu Fernando, Sridharan, Sridha, Denman, Simon, Ghaemmaghami, Houman, Fookes, Clinton
Format: Article
Language: eng
Subjects:
Online Access: Full text
container_title arXiv.org
creator Tharindu Fernando
Sridharan, Sridha
Denman, Simon
Ghaemmaghami, Houman
Fookes, Clinton
description This paper proposes a novel framework for lung sound event detection, segmenting continuous lung sound recordings into discrete events and performing recognition on each event. Exploiting the lightweight nature of Temporal Convolution Networks (TCNs) and their superior results compared to their recurrent counterparts, we propose a lightweight, yet robust, and completely interpretable framework for lung sound event detection. We propose the use of a multi-branch TCN architecture and exploit a novel fusion strategy to combine the resultant features from these branches. This not only allows the network to retain the most salient information across different temporal granularities and disregard irrelevant information, but also allows our network to process recordings of arbitrary length. Results: The proposed method is evaluated on multiple public and in-house benchmarks of irregular and noisy recordings of the respiratory auscultation process for the identification of numerous auscultation events including inhalation, exhalation, crackles, wheeze, stridor, and rhonchi. We exceed the state-of-the-art results in all evaluations. Furthermore, we empirically analyse the effect of the proposed multi-branch TCN architecture and the feature fusion strategy and provide quantitative and qualitative evaluations to illustrate their efficiency. Moreover, we provide an end-to-end model interpretation pipeline that interprets the operations of all the components of the proposed framework. Our analysis of different feature fusion strategies shows that the proposed feature concatenation method leads to better suppression of non-informative features, which drastically reduces the classifier overhead, resulting in a robust lightweight network. The lightweight nature of our model allows it to be deployed in end-user devices such as smartphones, and it has the ability to generate predictions in real-time.
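The abstract describes parallel TCN branches operating at different temporal granularities, fused by channel-wise concatenation before classification. The following is a minimal PyTorch sketch of that general pattern only; the branch count, kernel sizes, channel widths, pooling step, and classifier head are illustrative assumptions, not the authors' published architecture.

```python
# Minimal sketch of a multi-branch TCN with concatenation-based fusion.
# All names, sizes, and the dilation schedule are hypothetical choices.
import torch
import torch.nn as nn


class TCNBranch(nn.Module):
    """A stack of dilated causal 1-D convolutions at one temporal
    granularity (fixed kernel size, exponentially growing dilation)."""

    def __init__(self, in_ch, out_ch, kernel_size, n_layers):
        super().__init__()
        layers, ch = [], in_ch
        for i in range(n_layers):
            dilation = 2 ** i
            # Left-pad so each convolution is causal and length-preserving.
            layers += [
                nn.ConstantPad1d(((kernel_size - 1) * dilation, 0), 0.0),
                nn.Conv1d(ch, out_ch, kernel_size, dilation=dilation),
                nn.ReLU(),
            ]
            ch = out_ch
        self.net = nn.Sequential(*layers)

    def forward(self, x):  # x: (batch, channels, time)
        return self.net(x)


class MultiBranchTCN(nn.Module):
    """Parallel branches with different kernel sizes capture different
    temporal granularities; their outputs are fused by concatenation
    along the channel axis, then averaged over time so recordings of
    arbitrary length map to fixed-size event logits."""

    def __init__(self, in_ch, branch_ch, kernel_sizes, n_layers, n_classes):
        super().__init__()
        self.branches = nn.ModuleList(
            TCNBranch(in_ch, branch_ch, k, n_layers) for k in kernel_sizes
        )
        self.classifier = nn.Linear(branch_ch * len(kernel_sizes), n_classes)

    def forward(self, x):
        fused = torch.cat([branch(x) for branch in self.branches], dim=1)
        pooled = fused.mean(dim=-1)  # global average over the time axis
        return self.classifier(pooled)


# Example: six auscultation event classes (inhalation, exhalation,
# crackles, wheeze, stridor, rhonchi) from 64-band spectrogram segments.
model = MultiBranchTCN(in_ch=64, branch_ch=32,
                       kernel_sizes=(3, 5, 7), n_layers=4, n_classes=6)
logits = model(torch.randn(2, 64, 500))  # -> shape (2, 6)
```

Global average pooling is one simple way to realize the "arbitrary length" property the abstract claims; the paper's own fusion and suppression mechanism may differ in detail.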
format Article
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2021-06
issn 2331-8422
language eng
recordid cdi_proquest_journals_2547183646
source Free eJournals
subjects Auscultation
Convolution
Evaluation
Exhalation
Lightweight
Lungs
Respiration
Robustness
Sound recordings
title Robust and Interpretable Temporal Convolution Network for Event Detection in Lung Sound Recordings
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T08%3A01%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Robust%20and%20Interpretable%20Temporal%20Convolution%20Network%20for%20Event%20Detection%20in%20Lung%20Sound%20Recordings&rft.jtitle=arXiv.org&rft.au=Tharindu%20Fernando&rft.date=2021-06-30&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2547183646%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2547183646&rft_id=info:pmid/&rfr_iscdi=true