Deep Multimodal Collaborative Learning for Polyp Re-Identification
Abstract: Colonoscopic Polyp Re-Identification aims to match the same polyp across a large gallery of images taken from different views and with different cameras, and plays an important role in the prevention and treatment of colorectal cancer in computer-aided diagnosis. However, traditional object ReID methods that directly adopt CNN models trained on the ImageNet dataset usually produce unsatisfactory retrieval performance on colonoscopic datasets because of the large domain gap. Worse still, these solutions typically learn unimodal representations from visual samples alone, failing to exploit complementary information from other modalities. To address this challenge, we propose a novel Deep Multimodal Collaborative Learning framework named DMCL for polyp re-identification, which effectively encourages modality collaboration and reinforces generalization capability in medical scenarios. On top of this, a dynamic multimodal feature fusion strategy is introduced to leverage the optimized multimodal representations for fusion via end-to-end training. Experiments on standard benchmarks show the benefits of the multimodal setting over state-of-the-art unimodal ReID models, especially when combined with the specialized multimodal fusion strategy, demonstrating that representation learning with multiple modalities is competitive with methods based on unimodal representation learning. We also hope that our method will shed light on related research, especially on multimodal collaborative learning. The code is publicly available at https://github.com/JeremyXSC/DMCL.
DOI: 10.48550/arxiv.2408.05914
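
The abstract describes a dynamic multimodal feature fusion strategy trained end-to-end. The paper's exact fusion design is not given here, so the following is only a minimal sketch of one common way such a module can be realized: a learned per-dimension gate that weights a visual embedding against a text embedding. The class name, dimensions, and gating choice are illustrative assumptions, not the DMCL implementation.

```python
import torch
import torch.nn as nn


class DynamicMultimodalFusion(nn.Module):
    """Illustrative gated fusion of visual and text embeddings (assumption, not the paper's API).

    A gate predicted from both modalities weights them per sample and per
    dimension, so the fusion adapts dynamically and is trained end-to-end
    together with the rest of the ReID model.
    """

    def __init__(self, visual_dim: int, text_dim: int, fused_dim: int = 512):
        super().__init__()
        self.visual_proj = nn.Linear(visual_dim, fused_dim)
        self.text_proj = nn.Linear(text_dim, fused_dim)
        # Gate in [0, 1] deciding how much of each modality to keep.
        self.gate = nn.Sequential(
            nn.Linear(2 * fused_dim, fused_dim),
            nn.Sigmoid(),
        )

    def forward(self, visual_feat: torch.Tensor, text_feat: torch.Tensor) -> torch.Tensor:
        v = self.visual_proj(visual_feat)          # (B, fused_dim)
        t = self.text_proj(text_feat)              # (B, fused_dim)
        g = self.gate(torch.cat([v, t], dim=-1))   # (B, fused_dim), values in [0, 1]
        return g * v + (1.0 - g) * t               # dynamically weighted fused representation


# Example: fuse a 2048-d CNN feature with a 768-d text feature for retrieval.
fusion = DynamicMultimodalFusion(visual_dim=2048, text_dim=768)
fused = fusion(torch.randn(4, 2048), torch.randn(4, 768))  # shape: (4, 512)
```

In such a sketch, the fused embedding would then feed the usual ReID losses (e.g. identity classification or triplet loss), so the gate learns which modality to trust for each sample.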