Learning binary and sparse permutation-invariant representations for fast and memory efficient whole slide image search
Published in: | Computers in Biology and Medicine 2023-08, Vol.162, p.107026-107026, Article 107026 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Given their gigapixel sizes, representing whole slide images (WSIs) for classification and retrieval systems is a non-trivial task. Patch processing and multi-instance learning (MIL) are common approaches to analyzing WSIs. However, in end-to-end training, these methods consume a large amount of GPU memory because multiple sets of patches are processed simultaneously. Furthermore, compact WSI representations, binary and/or sparse, are urgently needed for real-time image retrieval within large medical archives. To address these challenges, we propose a novel framework for learning compact WSI representations utilizing deep conditional generative modeling and Fisher Vector theory. Training of our method is instance-based, which yields better memory and computational efficiency during training. To achieve efficient large-scale WSI search, we introduce new loss functions, namely gradient sparsity and gradient quantization losses, for learning sparse and binary permutation-invariant WSI representations called Conditioned Sparse Fisher Vector (C-Deep-SFV) and Conditioned Binary Fisher Vector (C-Deep-BFV). The learned WSI representations are validated on the largest public WSI archive, The Cancer Genome Atlas (TCGA), and on the Liver-Kidney-Stomach (LKS) dataset. For WSI search, the proposed method outperforms Yottixel and the Gaussian Mixture Model (GMM)-based Fisher Vector in terms of both retrieval accuracy and speed. For WSI classification, we achieve competitive performance against the state of the art on lung cancer data from TCGA and on the public benchmark LKS dataset.
•Multi-instance learning (MIL) is a common practice for processing WSIs as a set of patches.
•End-to-end training of MIL methods requires a large amount of GPU memory because multiple sets of patches are processed simultaneously.
•Furthermore, compact WSI representations, e.g., binary and/or sparse representations, are necessary for real-time image retrieval within large medical archives.
•We propose a novel framework for learning compact WSI representations utilizing deep conditional generative modeling and Fisher Vector theory.
•We introduce new loss functions for learning sparse and binary permutation-invariant WSI representations, called Conditioned Sparse Fisher Vector (C-Deep-SFV) and Conditioned Binary Fisher Vector (C-Deep-BFV).
•For validation, we use The Cancer Genome Atlas (TCGA) and the Liver-Kidney-Stomach (LKS) dataset.
•The proposed method outperforms Yott |
---|---|
ISSN: | 0010-4825 1879-0534 |
DOI: | 10.1016/j.compbiomed.2023.107026 |
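
The abstract motivates binary WSI representations by the need for fast, memory-efficient search. As a rough illustration of why binary codes help, the following is a minimal sketch of nearest-neighbour lookup by Hamming distance over bit-packed codes. It is not the paper's method: the random codes, the 64-bit code length, and the function names `pack_codes` and `hamming_search` are all assumptions made for this example.

```python
import numpy as np


def pack_codes(bits):
    """Pack an (n, d) array of 0/1 values into uint8 bytes, 8 bits per byte."""
    return np.packbits(np.asarray(bits, dtype=np.uint8), axis=1)


def hamming_search(query, database, k=3):
    """Return the indices and Hamming distances of the k codes nearest to `query`.

    `query` has shape (1, n_bytes) and `database` has shape (n, n_bytes).
    """
    # XOR leaves a 1 wherever two codes disagree; counting those bits
    # per row gives the Hamming distance to every database entry.
    diff = np.bitwise_xor(database, query)
    dists = np.unpackbits(diff, axis=1).sum(axis=1)
    order = np.argsort(dists, kind="stable")[:k]
    return order, dists[order]


# Hypothetical archive: 1000 slides, each described by a random 64-bit code.
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(1000, 64))
db = pack_codes(bits)  # 8 bytes per slide

# Query: the code of slide 42 with its first three bits flipped.
query_bits = bits[42].copy()
query_bits[:3] ^= 1
query = pack_codes(query_bits[None, :])

idx, d = hamming_search(query, db, k=3)
print(idx, d)
```

In this sketch each slide occupies 8 bytes, and the comparison reduces to XOR and bit counting, which is the general reason binary codes suit large archives.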