MLVICX: Multi-Level Variance-Covariance Exploration for Chest X-Ray Self-Supervised Representation Learning
Published in: | IEEE Journal of Biomedical and Health Informatics, 2024-12, Vol. 28 (12), p. 7480-7490 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Abstract: | Self-supervised learning (SSL) reduces the need for manual annotation in deep learning models for medical image analysis. By learning representations from unlabelled data, self-supervised models perform well on tasks that require little to no fine-tuning. However, medical images such as chest X-rays are characterised by complex anatomical structures and diverse clinical conditions, so there is a need for representation learning techniques that encode fine-grained details while preserving the broader contextual information. In this context, we introduce MLVICX (Multi-Level Variance-Covariance Exploration for Chest X-ray Self-Supervised Representation Learning), an approach to capture rich representations in the form of embeddings from chest X-ray images. Central to our approach is a novel multi-level variance and covariance exploration strategy that enables the model to detect diagnostically meaningful patterns while reducing redundancy. MLVICX promotes the retention of critical medical insights by adapting global and local contextual details and enhancing the variance and covariance of the learned embeddings. We demonstrate the performance of MLVICX in advancing self-supervised chest X-ray representation learning through comprehensive experiments. The performance enhancements we observe across various downstream tasks highlight the significance of the proposed approach in enhancing the utility of chest X-ray embeddings for precision medical diagnosis and comprehensive image analysis. For pretraining, we used the NIH-Chest X-ray dataset. Downstream tasks used the NIH-Chest X-ray, Vinbig-CXR, RSNA pneumonia, and SIIM-ACR Pneumothorax datasets. Overall, we observe up to 3% performance gain over state-of-the-art (SOTA) SSL approaches in various downstream tasks. To demonstrate the generalizability of our method, we also conducted experiments on fundus images and observed superior performance on multiple datasets. Code is available on GitHub. |
---|---|
ISSN: | 2168-2194, 2168-2208 |
DOI: | 10.1109/JBHI.2024.3455337 |
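
The abstract refers to variance and covariance terms computed on embeddings at multiple levels but gives no formulas. As a rough illustration only, the sketch below shows a generic variance-covariance penalty of the kind used in VICReg-style self-supervised objectives, summed over several embedding levels. It is not taken from the paper: the function names, the weights, the target standard deviation, and the way levels are combined are all assumptions made for this example.

```python
import numpy as np


def variance_term(z, target_std=1.0, eps=1e-4):
    """Hinge penalty: positive when a dimension's std falls below target_std.

    z is an (n_samples, n_dims) batch of embeddings. target_std and eps are
    illustrative defaults, not values from the paper.
    """
    std = np.sqrt(z.var(axis=0, ddof=1) + eps)
    return float(np.mean(np.maximum(0.0, target_std - std)))


def covariance_term(z):
    """Sum of squared off-diagonal covariance entries, divided by n_dims.

    Penalises redundancy: it is zero when embedding dimensions are
    uncorrelated across the batch.
    """
    n, d = z.shape
    zc = z - z.mean(axis=0)
    cov = zc.T @ zc / (n - 1)
    off_diag = cov - np.diag(np.diag(cov))
    return float((off_diag ** 2).sum() / d)


def multi_level_penalty(levels, var_weight=1.0, cov_weight=1.0):
    """Hypothetical combination: sum both terms over a list of embedding batches.

    Each entry in levels stands for embeddings taken at one level
    (for example a local and a global one); this pairing is an assumption.
    """
    return sum(
        var_weight * variance_term(z) + cov_weight * covariance_term(z)
        for z in levels
    )
```

In this sketch, a collapsed batch (all rows identical) gets a large variance penalty, and a batch whose dimensions duplicate one another gets a large covariance penalty, which matches the abstract's stated goal of keeping informative patterns while reducing redundancy.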