Deep learning–based harmonization of CT reconstruction kernels towards improved clinical task performance
Published in: | European Radiology 2023-04, Vol.33 (4), p.2426-2438 |
---|---|
Main authors: | , , , , , , |
Format: | Article |
Language: | eng |
Online access: | Full text |
Abstract: | Objectives
To develop a deep learning–based harmonization framework, assessing whether it can improve performance of radiomics models given different kernels in different clinical tasks and additionally generalize to mitigate the effects of new/unobserved kernels on radiomics features.
Methods
Patient data with 2 reconstruction kernels and phantom data with 22 reconstruction kernels were included. Eighty-five patients were studied for lymph node metastasis (LNM) prediction, and 164 patients for differential diagnosis between lung cancer (LC) and pulmonary tuberculosis (TB). Two convolutional neural network (CNN) models were developed to convert images (i) from B70f to B30f (CNNa) and (ii) from B30f to B70f (CNNb). Model performance between the two kernels was evaluated using AUC and compared with other well-known harmonization methods. Patient-normalized feature difference (PNFD) was used to identify kernels incompatible with the baseline (B30f/B70f), i.e., kernels with median PNFD > 1, and to measure the ability of the CNN models to convert these non-comparable kernels.
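The incompatibility rule in the Methods (median PNFD > 1) can be illustrated with a minimal sketch. The abstract does not give the PNFD formula, so the sketch assumes the per-feature PNFD values have already been computed; the function name `incompatible_kernels`, the kernel labels, and the numbers are hypothetical and used only for illustration.

```python
from statistics import median

# Cutoff taken from the abstract: a kernel is incompatible if median PNFD > 1.
PNFD_THRESHOLD = 1.0


def incompatible_kernels(pnfd_by_kernel, threshold=PNFD_THRESHOLD):
    """Return the kernels whose median PNFD exceeds the threshold, sorted by name.

    pnfd_by_kernel maps a kernel label to a list of precomputed PNFD values
    (one per radiomics feature). How PNFD itself is computed is not covered here.
    """
    return sorted(
        kernel
        for kernel, values in pnfd_by_kernel.items()
        if median(values) > threshold
    )


# Made-up PNFD values, purely illustrative (not from the study).
example = {
    "kernel_A": [0.2, 0.4, 0.9, 1.5],  # median 0.65, comparable to baseline
    "kernel_B": [0.8, 1.2, 1.9, 2.4],  # median 1.55, incompatible
}
print(incompatible_kernels(example))  # ['kernel_B']
```

Running the same check before and after harmonization would show whether a kernel moves from the incompatible set to the comparable one.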
Results
For LC versus pulmonary TB diagnosis, AUCs of CNNa vs. others were 0.85 vs. 0.54–0.74 (p = 0.0001–0.0003), and AUCs of CNNb vs. others were 0.87 vs. 0.54–0.86 (p = 0.0001–0.55). For LNM prediction, AUCs of CNNa vs. others were 0.68 vs. 0.56–0.61 (p = 0.10–0.39), and AUCs of CNNb vs. others were 0.78 vs. 0.70–0.73 (p = 0.07–0.40). After CNN harmonization, 17 of the 20 investigated unknown kernels (85%) produced comparable radiomics feature values relative to baseline (median PNFD decreased from 1.10–2.31 to 0.23–1.13).
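For readers less familiar with the AUC values reported above, the following minimal sketch computes AUC through its rank interpretation: the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The abstract does not state how the authors computed AUC or the associated p-values, so this is only the standard definition; the function name `auc_from_scores` and the toy data are hypothetical.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels (1 = positive case).
    scores: iterable of model scores, higher meaning more likely positive.
    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, not from the study: 3 of the 4 positive/negative pairs are ordered correctly.
print(auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of the size described here (85 and 164 patients).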
Conclusion
The CNN harmonization effectively improved performance of radiomics models between reconstruction kernels in different clinical tasks, and reduced feature differences between unknown kernels vs. baseline.
Key Points
• The soft (B30f) and sharp (B70f) kernels strongly affect radiomics reproducibility and generalizability.
• The convolutional neural network (CNN) harmonization methods performed better than location-scale (ComBat and centering-scaling) and matrix factorization harmonization methods (based on singular value decomposition (SVD) and independent component analysis (ICA)) in both clinical tasks.
• The CNN harmonization methods improve feature reproducibility not only between specific kernels (B30f and B70f) from the same scanner, but also between unobserved kernels from different scanners of different vendors. |
---|---|
ISSN: | 0938-7994, 1432-1084 |
DOI: | 10.1007/s00330-022-09229-w |