HeAR -- Health Acoustic Representations

Health acoustic sounds such as coughs and breaths are known to contain useful health signals with significant potential for monitoring health and disease, yet are underexplored in the medical machine learning community. The existing deep learning systems for health acoustics are often narrowly train...

Bibliographic details
Main authors:	Baur, Sebastien; Nabulsi, Zaid; Weng, Wei-Hung; Garrison, Jake; Blankemeier, Louis; Fishman, Sam; Chen, Christina; Kakarmath, Sujay; Maimbolwa, Minyoi; Sanjase, Nsala; Shuma, Brian; Matias, Yossi; Corrado, Greg S; Patel, Shwetak; Shetty, Shravya; Prabhakara, Shruthi; Muyoyeta, Monde; Ardila, Diego
Format:	Article
Language:	eng
Subjects:	Computer Science - Artificial Intelligence; Computer Science - Learning
Online access:	Order full text

creator	Baur, Sebastien; Nabulsi, Zaid; Weng, Wei-Hung; Garrison, Jake; Blankemeier, Louis; Fishman, Sam; Chen, Christina; Kakarmath, Sujay; Maimbolwa, Minyoi; Sanjase, Nsala; Shuma, Brian; Matias, Yossi; Corrado, Greg S; Patel, Shwetak; Shetty, Shravya; Prabhakara, Shruthi; Muyoyeta, Monde; Ardila, Diego
description	Health acoustic sounds such as coughs and breaths are known to contain useful health signals with significant potential for monitoring health and disease, yet are underexplored in the medical machine learning community. The existing deep learning systems for health acoustics are often narrowly trained and evaluated on a single task, which is limited by data and may hinder generalization to other tasks. To mitigate these gaps, we develop HeAR, a scalable self-supervised learning-based deep learning system using masked autoencoders trained on a large dataset of 313 million two-second long audio clips. Through linear probes, we establish HeAR as a state-of-the-art health audio embedding model on a benchmark of 33 health acoustic tasks across 6 datasets. By introducing this work, we hope to enable and accelerate further health acoustics research.
doi_str_mv	10.48550/arxiv.2403.02522
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2403_02522</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2403_02522</sourcerecordid><originalsourceid>FETCH-LOGICAL-a672-d29146866df0eba86d448c271522c1183378d885a8c0a9fd1c7abe31044b7d823</originalsourceid><addsrcrecordid>eNotzr0OgjAYheEuDka9ACfZnIr9o_0YiVExITEh7uSjLZFExQAavXsVnc72noeQOWehgihiK2yf9SMUismQiUiIMVmmPskDSoPU47k_BYlt7l1f2yD3t9Z3_tpjXzfXbkpGFZ47P_vvhBy3m-M6pdlht18nGUVtBHUi5kqD1q5ivkTQTimwwvDPl-UcpDTgACIEyzCuHLcGSy85U6o0DoSckMUvO0iLW1tfsH0VX3ExiOUbwsY4_A</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>HeAR -- Health Acoustic Representations</title><source>arXiv.org</source><creator>Baur, Sebastien ; Nabulsi, Zaid ; Weng, Wei-Hung ; Garrison, Jake ; Blankemeier, Louis ; Fishman, Sam ; Chen, Christina ; Kakarmath, Sujay ; Maimbolwa, Minyoi ; Sanjase, Nsala ; Shuma, Brian ; Matias, Yossi ; Corrado, Greg S ; Patel, Shwetak ; Shetty, Shravya ; Prabhakara, Shruthi ; Muyoyeta, Monde ; Ardila, Diego</creator><creatorcontrib>Baur, Sebastien ; Nabulsi, Zaid ; Weng, Wei-Hung ; Garrison, Jake ; Blankemeier, Louis ; Fishman, Sam ; Chen, Christina ; Kakarmath, Sujay ; Maimbolwa, Minyoi ; Sanjase, Nsala ; Shuma, Brian ; Matias, Yossi ; Corrado, Greg S ; Patel, Shwetak ; Shetty, Shravya ; Prabhakara, Shruthi ; Muyoyeta, Monde ; Ardila, Diego</creatorcontrib><description>Health acoustic sounds such as coughs and breaths are known to contain useful health signals with significant potential for monitoring health and disease, yet are underexplored in the medical machine learning community. The existing deep learning systems for health acoustics are often narrowly trained and evaluated on a single task, which is limited by data and may hinder generalization to other tasks. 
To mitigate these gaps, we develop HeAR, a scalable self-supervised learning-based deep learning system using masked autoencoders trained on a large dataset of 313 million two-second long audio clips. Through linear probes, we establish HeAR as a state-of-the-art health audio embedding model on a benchmark of 33 health acoustic tasks across 6 datasets. By introducing this work, we hope to enable and accelerate further health acoustics research.</description><identifier>DOI: 10.48550/arxiv.2403.02522</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Learning</subject><creationdate>2024-03</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2403.02522$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2403.02522$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Baur, Sebastien</creatorcontrib><creatorcontrib>Nabulsi, Zaid</creatorcontrib><creatorcontrib>Weng, Wei-Hung</creatorcontrib><creatorcontrib>Garrison, Jake</creatorcontrib><creatorcontrib>Blankemeier, Louis</creatorcontrib><creatorcontrib>Fishman, Sam</creatorcontrib><creatorcontrib>Chen, Christina</creatorcontrib><creatorcontrib>Kakarmath, Sujay</creatorcontrib><creatorcontrib>Maimbolwa, Minyoi</creatorcontrib><creatorcontrib>Sanjase, Nsala</creatorcontrib><creatorcontrib>Shuma, Brian</creatorcontrib><creatorcontrib>Matias, Yossi</creatorcontrib><creatorcontrib>Corrado, Greg S</creatorcontrib><creatorcontrib>Patel, Shwetak</creatorcontrib><creatorcontrib>Shetty, 
Shravya</creatorcontrib><creatorcontrib>Prabhakara, Shruthi</creatorcontrib><creatorcontrib>Muyoyeta, Monde</creatorcontrib><creatorcontrib>Ardila, Diego</creatorcontrib><title>HeAR -- Health Acoustic Representations</title><description>Health acoustic sounds such as coughs and breaths are known to contain useful health signals with significant potential for monitoring health and disease, yet are underexplored in the medical machine learning community. The existing deep learning systems for health acoustics are often narrowly trained and evaluated on a single task, which is limited by data and may hinder generalization to other tasks. To mitigate these gaps, we develop HeAR, a scalable self-supervised learning-based deep learning system using masked autoencoders trained on a large dataset of 313 million two-second long audio clips. Through linear probes, we establish HeAR as a state-of-the-art health audio embedding model on a benchmark of 33 health acoustic tasks across 6 datasets. By introducing this work, we hope to enable and accelerate further health acoustics research.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotzr0OgjAYheEuDka9ACfZnIr9o_0YiVExITEh7uSjLZFExQAavXsVnc72noeQOWehgihiK2yf9SMUismQiUiIMVmmPskDSoPU47k_BYlt7l1f2yD3t9Z3_tpjXzfXbkpGFZ47P_vvhBy3m-M6pdlht18nGUVtBHUi5kqD1q5ivkTQTimwwvDPl-UcpDTgACIEyzCuHLcGSy85U6o0DoSckMUvO0iLW1tfsH0VX3ExiOUbwsY4_A</recordid><startdate>20240304</startdate><enddate>20240304</enddate><creator>Baur, Sebastien</creator><creator>Nabulsi, Zaid</creator><creator>Weng, Wei-Hung</creator><creator>Garrison, Jake</creator><creator>Blankemeier, Louis</creator><creator>Fishman, Sam</creator><creator>Chen, Christina</creator><creator>Kakarmath, Sujay</creator><creator>Maimbolwa, 
Minyoi</creator><creator>Sanjase, Nsala</creator><creator>Shuma, Brian</creator><creator>Matias, Yossi</creator><creator>Corrado, Greg S</creator><creator>Patel, Shwetak</creator><creator>Shetty, Shravya</creator><creator>Prabhakara, Shruthi</creator><creator>Muyoyeta, Monde</creator><creator>Ardila, Diego</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240304</creationdate><title>HeAR -- Health Acoustic Representations</title><author>Baur, Sebastien ; Nabulsi, Zaid ; Weng, Wei-Hung ; Garrison, Jake ; Blankemeier, Louis ; Fishman, Sam ; Chen, Christina ; Kakarmath, Sujay ; Maimbolwa, Minyoi ; Sanjase, Nsala ; Shuma, Brian ; Matias, Yossi ; Corrado, Greg S ; Patel, Shwetak ; Shetty, Shravya ; Prabhakara, Shruthi ; Muyoyeta, Monde ; Ardila, Diego</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a672-d29146866df0eba86d448c271522c1183378d885a8c0a9fd1c7abe31044b7d823</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Baur, Sebastien</creatorcontrib><creatorcontrib>Nabulsi, Zaid</creatorcontrib><creatorcontrib>Weng, Wei-Hung</creatorcontrib><creatorcontrib>Garrison, Jake</creatorcontrib><creatorcontrib>Blankemeier, Louis</creatorcontrib><creatorcontrib>Fishman, Sam</creatorcontrib><creatorcontrib>Chen, Christina</creatorcontrib><creatorcontrib>Kakarmath, Sujay</creatorcontrib><creatorcontrib>Maimbolwa, Minyoi</creatorcontrib><creatorcontrib>Sanjase, Nsala</creatorcontrib><creatorcontrib>Shuma, Brian</creatorcontrib><creatorcontrib>Matias, Yossi</creatorcontrib><creatorcontrib>Corrado, Greg S</creatorcontrib><creatorcontrib>Patel, Shwetak</creatorcontrib><creatorcontrib>Shetty, Shravya</creatorcontrib><creatorcontrib>Prabhakara, Shruthi</creatorcontrib><creatorcontrib>Muyoyeta, 
Monde</creatorcontrib><creatorcontrib>Ardila, Diego</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Baur, Sebastien</au><au>Nabulsi, Zaid</au><au>Weng, Wei-Hung</au><au>Garrison, Jake</au><au>Blankemeier, Louis</au><au>Fishman, Sam</au><au>Chen, Christina</au><au>Kakarmath, Sujay</au><au>Maimbolwa, Minyoi</au><au>Sanjase, Nsala</au><au>Shuma, Brian</au><au>Matias, Yossi</au><au>Corrado, Greg S</au><au>Patel, Shwetak</au><au>Shetty, Shravya</au><au>Prabhakara, Shruthi</au><au>Muyoyeta, Monde</au><au>Ardila, Diego</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>HeAR -- Health Acoustic Representations</atitle><date>2024-03-04</date><risdate>2024</risdate><abstract>Health acoustic sounds such as coughs and breaths are known to contain useful health signals with significant potential for monitoring health and disease, yet are underexplored in the medical machine learning community. The existing deep learning systems for health acoustics are often narrowly trained and evaluated on a single task, which is limited by data and may hinder generalization to other tasks. To mitigate these gaps, we develop HeAR, a scalable self-supervised learning-based deep learning system using masked autoencoders trained on a large dataset of 313 million two-second long audio clips. Through linear probes, we establish HeAR as a state-of-the-art health audio embedding model on a benchmark of 33 health acoustic tasks across 6 datasets. By introducing this work, we hope to enable and accelerate further health acoustics research.</abstract><doi>10.48550/arxiv.2403.02522</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2403.02522
language	eng
recordid	cdi_arxiv_primary_2403_02522
source	arXiv.org
subjects	Computer Science - Artificial Intelligence; Computer Science - Learning
title	HeAR -- Health Acoustic Representations
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T03%3A37%3A36IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=HeAR%20--%20Health%20Acoustic%20Representations&rft.au=Baur,%20Sebastien&rft.date=2024-03-04&rft_id=info:doi/10.48550/arxiv.2403.02522&rft_dat=%3Carxiv_GOX%3E2403_02522%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
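
The url field above is an OpenURL (ctx_ver=Z39.88-2004) whose rft.* keys repeat the article metadata in percent-encoded form. As a minimal sketch, assuming only Python's standard urllib.parse module, those keys can be decoded as follows. The function name referent_fields and the shortened OPENURL constant are illustrative and not part of the record; the constant keeps only a subset of the parameters from the url field.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the record's OpenURL (assumption: only a subset of
# the original query parameters is kept for readability).
OPENURL = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.atitle=HeAR%20--%20Health%20Acoustic%20Representations"
    "&rft.au=Baur,%20Sebastien"
    "&rft.date=2024-03-04"
    "&rft_id=info:doi/10.48550/arxiv.2403.02522"
)


def referent_fields(url: str) -> dict[str, list[str]]:
    """Return the percent-decoded query parameters whose key starts with 'rft'."""
    query = urlsplit(url).query
    # parse_qs decodes percent-escapes and groups repeated keys into lists.
    params = parse_qs(query)
    return {key: values for key, values in params.items() if key.startswith("rft")}


fields = referent_fields(OPENURL)
print(fields["rft.atitle"][0])
print(fields["rft.au"][0])
print(fields["rft_id"][0])
```

Each value is a list because an OpenURL may repeat a key; the full url field, for instance, carries rft_id several times.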