Learning binary and sparse permutation-invariant representations for fast and memory efficient whole slide image search

Bibliographic details
Published in: Computers in biology and medicine, 2023-08, Vol. 162, p. 107026-107026, Article 107026
Main authors: Hemati, Sobhan; Kalra, Shivam; Babaie, Morteza; Tizhoosh, H.R.
Format: Article
Language: English
Description
Summary: Given their gigapixel sizes, representing whole slide images (WSIs) for classification and retrieval systems is a non-trivial task. Patch processing and Multi-Instance Learning (MIL) are common approaches to analyzing WSIs. However, in end-to-end training, these methods consume large amounts of GPU memory because multiple sets of patches are processed simultaneously. Furthermore, compact WSI representations, in the form of binary and/or sparse representations, are urgently needed for real-time image retrieval within large medical archives. To address these challenges, we propose a novel framework for learning compact WSI representations utilizing deep conditional generative modeling and Fisher Vector theory. Our method is trained on individual instances, which makes training more efficient in both memory and computation. To achieve efficient large-scale WSI search, we introduce new loss functions, namely gradient sparsity and gradient quantization losses, for learning sparse and binary permutation-invariant WSI representations called Conditioned Sparse Fisher Vector (C-Deep-SFV) and Conditioned Binary Fisher Vector (C-Deep-BFV). The learned WSI representations are validated on the largest public WSI archive, The Cancer Genome Atlas (TCGA), and on the Liver-Kidney-Stomach (LKS) dataset. For WSI search, the proposed method outperforms Yottixel and the Gaussian Mixture Model (GMM)-based Fisher Vector in terms of both retrieval accuracy and speed. For WSI classification, we achieve performance competitive with state-of-the-art methods on lung cancer data from TCGA and on the public benchmark LKS dataset.
• Multi-Instance Learning (MIL) is a common practice for processing WSIs as sets of patches.
• End-to-end training of MIL methods consumes large amounts of GPU memory because multiple sets of patches are processed simultaneously.
• Furthermore, compact WSI representations, e.g., binary and/or sparse representations, are necessary for real-time image retrieval within large medical archives.
• We propose a novel framework for learning compact WSI representations utilizing deep conditional generative modeling and Fisher Vector theory.
• We introduce new loss functions for learning sparse and binary permutation-invariant WSI representations, called Conditioned Sparse Fisher Vector (C-Deep-SFV) and Conditioned Binary Fisher Vector (C-Deep-BFV).
• For validation, we use The Cancer Genome Atlas (TCGA) and the Liver-Kidney-Stomach (LKS) dataset.
• The proposed method outperforms Yott
ISSN: 0010-4825
EISSN: 1879-0534
DOI: 10.1016/j.compbiomed.2023.107026