Multi-scale Feature Enhancement in Multi-task Learning for Medical Image Analysis
Format: Article
Language: English
Abstract: Traditional deep learning methods in medical imaging often focus solely on
segmentation or classification, limiting their ability to leverage shared
information. Multi-task learning (MTL) addresses this by combining both tasks
through shared representations but often struggles to balance local spatial
features for segmentation and global semantic features for classification,
leading to suboptimal performance. In this paper, we propose a simple yet
effective UNet-based MTL model, where features extracted by the encoder are
used to predict classification labels, while the decoder produces the
segmentation mask. The model introduces an advanced encoder incorporating a
novel ResFormer block that integrates local context from convolutional feature
extraction with long-range dependencies modeled by the Transformer. This design
captures broader contextual relationships and fine-grained details, improving
classification and segmentation accuracy. To enhance classification
performance, multi-scale features from different encoder levels are combined to
leverage the hierarchical representation of the input image. For segmentation,
the features passed to the decoder via skip connections are refined using a
novel dilated feature enhancement (DFE) module, which captures information at
different scales through three parallel convolution branches with varying
dilation rates. This allows the decoder to detect lesions of varying sizes with
greater accuracy. Experimental results across multiple medical datasets confirm
the superior performance of our model in both segmentation and classification
tasks, compared to state-of-the-art single-task and multi-task learning
methods.
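The multi-scale classification path described above pools features from several encoder levels and combines them, so the classifier sees both shallow local detail and deep semantic context. A minimal sketch of that idea follows; the paper works on 2-D feature maps inside a UNet encoder, while this toy version uses plain nested lists, and the function names, global-average pooling, and concatenation-based fusion are illustrative assumptions, not the authors' exact design.

```python
def global_avg_pool(feature_map):
    """Collapse each channel of a 2-D feature map (list of row-lists)
    to a single scalar by averaging all spatial positions."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_map]

def multiscale_classifier_input(encoder_levels):
    """Hypothetical multi-scale fusion: pool every encoder level to a
    vector and concatenate the vectors, so one classifier head can use
    the encoder's whole feature hierarchy at once."""
    fused = []
    for level in encoder_levels:          # shallow -> deep
        fused.extend(global_avg_pool(level))
    return fused

# Two toy "levels": one 1-channel 2x2 map and one 2-channel 2x2 map.
level1 = [[[1, 1], [1, 1]]]
level2 = [[[2, 2], [2, 2]], [[4, 4], [4, 4]]]
print(multiscale_classifier_input([level1, level2]))  # -> [1.0, 2.0, 4.0]
```

In a real network the pooled vectors would feed a small fully connected head that predicts the classification label.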
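The dilated feature enhancement (DFE) module runs three parallel convolution branches with different dilation rates, so each branch covers a different receptive field before the results are merged. The 1-D sketch below illustrates only that mechanism; the actual module convolves 2-D feature maps with learned kernels, and the fixed kernel, the fusion by summation, and the function names here are assumptions for illustration.

```python
def dilated_conv1d(signal, kernel, dilation):
    """1-D convolution where kernel taps are spaced `dilation` samples
    apart, enlarging the receptive field without adding parameters."""
    k = len(kernel)
    span = (k - 1) * dilation  # receptive field minus one
    return [sum(kernel[j] * signal[i + j * dilation] for j in range(k))
            for i in range(len(signal) - span)]

def dfe_sketch(signal, kernel, rates=(1, 2, 3)):
    """Hypothetical DFE-style fusion: run parallel branches with
    different dilation rates, crop to a common length, and sum, so
    small- and large-scale patterns contribute to the same output."""
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    n = min(len(b) for b in branches)  # crop to shortest branch
    return [sum(b[i] for b in branches) for i in range(n)]

print(dilated_conv1d([1, 2, 3, 4, 5], [1, 1], 2))  # -> [4, 6, 8]
print(dfe_sketch([1, 2, 3, 4, 5], [1, 1]))         # -> [12, 18]
```

Varying the dilation rate rather than the kernel size is what lets the decoder respond to lesions of different sizes at no extra parameter cost.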
DOI: 10.48550/arxiv.2412.00351