Enhancing accident diagnosis in nuclear power plants through knowledge distillation: Bridging the gap between simulation and real-world scenarios
Published in: | Nuclear engineering and design 2024-09, Vol.426, p.113395, Article 113395 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | • Addressed the domain sensitivity issue in accident diagnosis for nuclear power plants using knowledge distillation techniques.
• Proposed training a lightweight student model using distilled knowledge from multiple teacher models, each capturing different aspects of the problem, such as noise and time trend information.
• Demonstrated through extensive experiments that combining knowledge from diverse teacher models significantly improves the student model’s performance and robustness on both simulation and real-world proxy data.
• Contributed to the development of effective, robust, and computationally efficient accident diagnosis systems, bridging the gap between simulation and real-world scenarios through knowledge distillation.
Accident diagnosis is critical for ensuring the safety and reliability of nuclear power plants (NPPs). However, the scarcity of real-world data poses a significant challenge to developing accurate and robust artificial intelligence (AI)-based diagnostic systems. This paper presents a novel approach to this domain sensitivity issue that leverages knowledge distillation techniques. We propose training a lightweight student model on distilled knowledge from multiple teacher models, each of which captures a different aspect of the problem, such as noise or time trends. Both the teacher and student models were trained on CNS simulation data and evaluated on real-world proxy data (PCTRAN simulation data).
Through a series of experiments, we demonstrated that our approach enhances the performance and robustness of student models. The student model improved on the untrained target domain when it used knowledge from noisy, time-trend-informed, and naive teacher models rather than from a single teacher model. Incorporating diverse knowledge from different teacher models helps the student model handle noisy data, capture temporal dependencies, and generalize better to unseen scenarios.
Moreover, our analysis of relative training time highlighted the computational benefits of a lighter student model. The student model required significantly less training time than the teacher model, making it more efficient to deploy and to update online in real-world scenarios. This computational efficiency is particularly valuable for NPPs, where timely decision-making and resource constraints are critical factors.
Our findings will contribute to |
---|---|
ISSN: | 0029-5493 1872-759X |
DOI: | 10.1016/j.nucengdes.2024.113395 |
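
The summary describes distilling knowledge from several teacher models into one lightweight student. The record does not give the loss function, so the following is only a minimal sketch of one common formulation, assuming the teachers' temperature-softened predictions are averaged and blended with an ordinary cross-entropy term. The function names, the `temperature` and `alpha` parameters, and the averaging rule are illustrative assumptions, not the authors' method.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def multi_teacher_kd_loss(student_logits, teacher_logits_list, label,
                          temperature=2.0, alpha=0.5):
    """Hypothetical loss: alpha * distillation term + (1 - alpha) * cross-entropy.

    Assumption: teacher knowledge is combined by averaging the teachers'
    softened class probabilities. Other combination rules are possible.
    """
    student_soft = softmax(student_logits, temperature)
    teacher_softs = [softmax(t, temperature) for t in teacher_logits_list]
    n = len(teacher_softs)
    avg_teacher = [sum(col) / n for col in zip(*teacher_softs)]
    # The T^2 factor is the usual rescaling of the softened distillation term.
    kd = kl_divergence(avg_teacher, student_soft) * temperature ** 2
    # Standard cross-entropy against the true accident class (temperature 1).
    ce = -math.log(softmax(student_logits)[label] + 1e-12)
    return alpha * kd + (1 - alpha) * ce


# Example: three teachers (e.g. noisy, time-trend, naive) and three accident classes.
teachers = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5], [2.5, 0.0, -1.5]]
loss = multi_teacher_kd_loss([1.0, 0.8, -0.2], teachers, label=0)
```

Setting `alpha` closer to 1 makes the student follow the averaged teachers more closely, while values closer to 0 emphasize the ground-truth labels.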