Evaluating Generalization, Bias, and Fairness in Deep Learning for Metal Surface Defect Detection: A Comparative Study
Published in: Processes 2024-03, Vol. 12 (3), p. 456
Main authors: ,
Format: Article
Language: English
Keywords:
Online access: Full text
Abstract: In recent years, deep learning models have improved accuracy in industrial defect detection, often using variants of YOLO (You Only Look Once) because of their high performance at low cost. However, the generalizability, fairness, and bias of their outcomes have not been examined, which may lead to overconfident predictions. Additionally, the complexity added by co-occurring defects and by single- and multi-class defects, and its effect on training, is not taken into consideration. This study addresses these critical gaps by introducing new methodologies for analyzing dataset complexity and evaluating model fairness. It proposes co-occurrence impact analysis, which examines how the co-occurrence of defects in sample images affects performance and adds new dimensions to dataset preparation and training; the aim is to increase model robustness in real-world scenarios where multiple defects often appear together. Our study also innovates in the evaluation of model fairness by adapting the disparate impact ratio (DIR) to consider the true positive rate (TPR) across different groups and by modifying the predictive parity difference (PPD) metric to focus on biases present in industrial quality control. Cross-validation experiments demonstrate that the model trained on combined datasets significantly outperforms the others in accuracy without overfitting, and that it yields increased fairness, as validated by our novel fairness metrics. Explainability analysis also provides valuable insights into the effects of different training regimes, notably absent in prior works. This work not only advances the field of deep learning for defect detection but also provides a strategic framework for future advancements, emphasizing the need for balanced datasets and for considerations of ethics, fairness, bias, and generalizability in the deployment of artificial intelligence in industry.
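The two adapted fairness metrics named in the abstract can be sketched concretely. The following minimal Python illustration assumes per-group detection counts as input; the function names, the group-to-count mapping, and the min/max formulations are illustrative assumptions, not the authors' exact implementation. Here DIR is taken as the ratio of the lowest to the highest per-group TPR, and PPD as the largest gap in per-group precision.

```python
# Illustrative sketch of TPR-based DIR and precision-based PPD.
# The paper's exact definitions may differ; counts and group names
# below are hypothetical per-defect-class tallies of (tp, fp, fn).

from typing import Dict, Tuple

def true_positive_rate(tp: int, fn: int) -> float:
    """TPR (recall): fraction of actual defects that were detected."""
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0

def precision(tp: int, fp: int) -> float:
    """Positive predictive value: fraction of detections that are correct."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

def disparate_impact_ratio(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """TPR-based DIR: lowest per-group TPR divided by the highest.
    A value near 1.0 means defects are detected equally well across
    groups; values well below 1.0 indicate disparate impact."""
    tprs = [true_positive_rate(tp, fn) for tp, _, fn in counts.values()]
    return min(tprs) / max(tprs) if max(tprs) > 0 else 0.0

def predictive_parity_difference(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """PPD: largest gap in precision between any two groups.
    0.0 means detections are equally trustworthy for every group."""
    precisions = [precision(tp, fp) for tp, fp, _ in counts.values()]
    return max(precisions) - min(precisions)

if __name__ == "__main__":
    # Hypothetical counts per defect class: (tp, fp, fn)
    counts = {
        "scratch": (90, 10, 10),
        "patch": (70, 15, 30),
        "inclusion": (60, 5, 40),
    }
    print(f"DIR (TPR-based): {disparate_impact_ratio(counts):.3f}")
    print(f"PPD (precision gap): {predictive_parity_difference(counts):.3f}")
```

Under this reading, a DIR near 1.0 together with a PPD near 0.0 would indicate detection quality that is balanced across defect groups; the paper's versions of these metrics may group or weight samples differently.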
ISSN: 2227-9717
DOI: 10.3390/pr12030456