Contrastive learning of heart and lung sounds for label-efficient diagnosis

Detailed description

Bibliographic details
Published in:	Patterns (New York, N.Y.), 2022-01, Vol.3 (1), p.100400-100400, Article 100400
Main authors:	Soni, Pratham N., Shi, Siyu, Sriram, Pranav R., Ng, Andrew Y., Rajpurkar, Pranav
Format:	Article
Language:	English
Subjects:	contrastive learning; heart sounds; lung sounds; medicine; self-supervised learning; unlabeled data
Online access:	Full text

Description
Abstract:	Data labeling is often the limiting step in machine learning because it requires time from trained experts. To address the limitation on labeled data, contrastive learning, among other unsupervised learning methods, leverages unlabeled data to learn representations of data. Here, we propose a contrastive learning framework that utilizes metadata for selecting positive and negative pairs when training on unlabeled data. We demonstrate its application in the healthcare domain on heart and lung sound recordings. The increasing availability of heart and lung sound recordings due to adoption of digital stethoscopes lends itself as an opportunity to demonstrate the application of our contrastive learning method. Compared to contrastive learning with augmentations, the contrastive learning model leveraging metadata for pair selection utilizes clinical information associated with lung and heart sound recordings. This approach uses shared context of the recordings on the patient level using clinical information including age, sex, weight, location of sounds, etc. We show improvement in downstream tasks for diagnosing heart and lung sounds when leveraging patient-specific representations in selecting positive and negative pairs. This study paves the path for medical applications of contrastive learning that leverage clinical information. We have made our code available here: https://github.com/stanfordmlgroup/selfsupervised-lungandheartsounds.
• Contrastive learning uses unlabeled data to learn representations
• A new contrastive learning framework is proposed for metadata pair selection
• We show its application in medical heart and lung sound data and metadata
• The contrastive learning strategy only needs 10% of labeled training data
Annotating data at scale is time-consuming, especially in specialized domains, such as healthcare, agriculture, and autonomous driving. The scarcity of labeled data can limit the effectiveness of supervised learning. In contrast, there is usually access to more unlabeled data. Unlabeled data can be used through unsupervised learning. One type of unsupervised learning is self-supervised learning, where representations of data are learned from unlabeled data through pretext tasks and are later used for supervised learning tasks. We propose a new contrastive learning framework that leverages metadata in selecting pairs during contrastive learning. We demonstrate the application of the framework in diagnosing heart and lung diseases through h
ISSN:	2666-3899
DOI:	10.1016/j.patter.2021.100400
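
Illustrative sketch:	The abstract describes choosing positive and negative pairs from metadata rather than from augmentations. The Python below is a minimal sketch of that idea, not the authors' implementation (their code is at the repository linked in the abstract). It assumes a hypothetical `patient_id` metadata field as the shared context, and it uses a generic InfoNCE-style loss over cosine similarities; the record does not state which loss, encoder, or metadata fields the paper actually uses. The function names `select_pairs` and `contrastive_loss` are invented for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[int, int]


def select_pairs(metadata: Sequence[Dict[str, str]], key: str) -> Tuple[List[Pair], List[Pair]]:
    """Split all index pairs into positives (same metadata value) and negatives."""
    positives: List[Pair] = []
    negatives: List[Pair] = []
    for i in range(len(metadata)):
        for j in range(i + 1, len(metadata)):
            if metadata[i][key] == metadata[j][key]:
                positives.append((i, j))
            else:
                negatives.append((i, j))
    return positives, negatives


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def contrastive_loss(
    embeddings: Sequence[Sequence[float]],
    positives: Sequence[Pair],
    negatives: Sequence[Pair],
    temperature: float = 0.1,
) -> float:
    """InfoNCE-style loss averaged over positive pairs.

    Each positive pair (i, j) is contrasted against every negative pair
    that involves recording i.
    """
    losses: List[float] = []
    for i, j in positives:
        pos = math.exp(cosine(embeddings[i], embeddings[j]) / temperature)
        neg = sum(
            math.exp(cosine(embeddings[a], embeddings[b]) / temperature)
            for a, b in negatives
            if i in (a, b)
        )
        losses.append(-math.log(pos / (pos + neg)))
    return sum(losses) / len(losses) if losses else 0.0


# Hypothetical example: two recordings from patient "A", one from patient "B".
metadata = [{"patient_id": "A"}, {"patient_id": "A"}, {"patient_id": "B"}]
positives, negatives = select_pairs(metadata, key="patient_id")

# Embeddings would normally come from an encoder; these are made-up vectors.
aligned = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
misaligned = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
loss_aligned = contrastive_loss(aligned, positives, negatives)
loss_misaligned = contrastive_loss(misaligned, positives, negatives)
```

In this toy setup the loss is lower when recordings from the same patient have similar embeddings, which is the behaviour a metadata-driven pairing scheme is meant to reward. Other metadata named in the abstract (age, sex, recording location) could be passed as `key` in the same way, under the same assumptions.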