Optimizing Breast Cancer Mammogram Classification Through a Dual Approach: A Deep Learning Framework Combining ResNet50, SMOTE, and Fully Connected Layers for Balanced and Imbalanced Data

Bibliographic details
Published in: IEEE Access 2025, Vol. 13, p. 4815-4826
Main authors: Alshamrani, Abdullah Fahad A.; Saleh Zuhair Alshomrani, Faisal
Format: Article
Language: English
Description
Summary: Breast cancer is a global health concern where early and accurate diagnosis is crucial. Mammogram scans provide detailed imaging but require expert interpretation, which is time-consuming. While deep learning shows promise in medical image analysis, the prevalence of imbalanced datasets in medical diagnosis hinders the development of accurate and reliable classification models. We propose a novel deep-learning framework for breast cancer classification from mammogram scans. The framework addresses imbalanced data through a two-module pipeline incorporating the Synthetic Minority Over-sampling Technique (SMOTE). The first module applies SMOTE to the entire dataset to balance the class distribution, while the second sets aside a portion (20%) of the original imbalanced data for evaluation and applies SMOTE to the remaining 80%. The framework incorporates a blockwise Convolutional Neural Network (CNN), using VGG16 preprocessing for input standardization and ResNet50 for feature extraction. A fully connected classification model, consisting of multiple dense layers with batch normalization and dropout for regularization, was developed to classify the extracted features. The model architecture was iteratively refined to combat overfitting, with the final version incorporating three dense layers (128, 256, and 128 neurons) with a dropout rate of 0.5. Our model achieved 99% accuracy on the balanced dataset and 90% on the held-out imbalanced portion. The framework includes an interpretable visualization technique for randomly selected predictions across all classes. Our approach significantly improves diagnostic accuracy in breast cancer classification from mammogram scans, effectively addressing the challenge of imbalanced data in medical image analysis. This work contributes to medical image analysis and computer-aided diagnosis. The proposed techniques for handling imbalanced data and providing interpretable results can be extended to improve diagnostic accuracy across various medical conditions.
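The second module described in the summary (hold out 20% of the original imbalanced data, then apply SMOTE only to the remaining 80%) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `stratified_split`, `smote`, and `split_then_oversample` are hypothetical, the per-class (stratified) hold-out is an assumption the summary does not state, and the pure-Python interpolation below stands in for whatever SMOTE library the authors used. The plain feature lists stand in for feature vectors such as those a ResNet50 extractor would produce.

```python
import random
from collections import Counter


def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx), holding out test_fraction of each class.

    Stratifying per class is an assumption made for this sketch."""
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(test_fraction * len(idx))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx


def smote(minority, n_new, k, rng):
    """Create n_new synthetic vectors, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # All other minority samples, sorted by squared Euclidean distance.
        others = minority[:i] + minority[i + 1:]
        others.sort(key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def split_then_oversample(features, labels, test_fraction=0.2, k=5, seed=0):
    """Hold out an untouched test split, then balance only the training split."""
    rng = random.Random(seed)
    train_idx, test_idx = stratified_split(labels, test_fraction, rng)
    x_train = [features[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    x_test = [features[i] for i in test_idx]
    y_test = [labels[i] for i in test_idx]

    counts = Counter(y_train)
    target = max(counts.values())
    extra_x, extra_y = [], []
    for label, count in counts.items():
        if count < target:
            minority = [x for x, y in zip(x_train, y_train) if y == label]
            extra_x.extend(smote(minority, target - count, k, rng))
            extra_y.extend([label] * (target - count))
    # Synthetic samples are appended after the original training samples.
    return x_train + extra_x, y_train + extra_y, x_test, y_test
```

Calling `split_then_oversample(features, labels)` returns a balanced training set and a test set that contains only original samples in their original class proportions. Keeping synthetic samples out of the evaluation split is the design point that distinguishes this module from applying SMOTE to the entire dataset.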
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3524633