Negative Feedback Training: A Novel Concept to Improve Robustness of NVCIM DNN Accelerators
Format: Article
Language: English
Abstract: Compute-in-memory (CIM) accelerators built upon non-volatile memory (NVM)
devices excel in energy efficiency and latency when performing Deep Neural
Network (DNN) inference, thanks to their in-situ data processing capability.
However, the stochastic nature and intrinsic variations of NVM devices often
result in performance degradation in DNN inference. Introducing these non-ideal
device behaviors during DNN training enhances robustness, but drawbacks include
limited accuracy improvement, reduced prediction confidence, and convergence
issues. This arises from a mismatch between deterministic training and
non-deterministic device variations: such training, although it accounts for
variations, relies solely on the model's final output. In this work, we draw
inspiration from control theory and propose a novel training concept,
Negative Feedback Training (NFT), which leverages multi-scale noisy
information captured from the network. We develop two specific NFT instances, Oriented
Variational Forward (OVF) and Intermediate Representation Snapshot (IRS).
Extensive experiments show that our methods outperform existing
state-of-the-art methods with up to a 46.71% improvement in inference accuracy
while reducing epistemic uncertainty, boosting output confidence, and improving
convergence probability. Their effectiveness highlights the generality and
practicality of our NFT concept in enhancing DNN robustness against device
variations.
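The core idea sketched in the abstract, training under injected device-variation noise while drawing a corrective "feedback" signal from intermediate activations, can be illustrated with a short sketch. The PyTorch code below is a hypothetical interpretation, not the authors' implementation: the multiplicative Gaussian noise model, the `NoisyLinear`, `IRSNet`, and `nft_style_loss` names, and the `noise_std` and `feedback_weight` hyperparameters are all assumptions made for this example; the paper's actual OVF and IRS formulations may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoisyLinear(nn.Linear):
    """Linear layer that perturbs its weights with multiplicative Gaussian
    noise on every training forward pass, as a stand-in for NVM device
    variations (assumed noise model)."""
    def __init__(self, in_features, out_features, noise_std=0.1):
        super().__init__(in_features, out_features)
        self.noise_std = noise_std

    def forward(self, x):
        if self.training:
            noise = torch.randn_like(self.weight) * self.noise_std
            w = self.weight * (1.0 + noise)  # perturbed conductances
        else:
            w = self.weight
        return F.linear(x, w, self.bias)

class IRSNet(nn.Module):
    """Toy MLP that exposes an intermediate-representation 'snapshot' so a
    feedback loss can be computed on it, loosely in the spirit of IRS."""
    def __init__(self, noise_std=0.1):
        super().__init__()
        self.fc1 = NoisyLinear(784, 256, noise_std)
        self.fc2 = NoisyLinear(256, 10, noise_std)
        self.aux_head = nn.Linear(256, 10)  # auxiliary classifier on snapshot

    def forward(self, x):
        h = F.relu(self.fc1(x))        # intermediate representation
        return self.fc2(h), self.aux_head(h)

def nft_style_loss(logits, aux_logits, target, feedback_weight=0.3):
    """Main task loss plus a feedback term from the intermediate snapshot,
    so gradients correct the noisy intermediate features directly."""
    main = F.cross_entropy(logits, target)
    feedback = F.cross_entropy(aux_logits, target)
    return main + feedback_weight * feedback

# Minimal training step on random data, to show how the pieces fit together.
model = IRSNet(noise_std=0.1)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
logits, aux_logits = model(x)
loss = nft_style_loss(logits, aux_logits, y)
opt.zero_grad()
loss.backward()
opt.step()
```

In this reading, the auxiliary loss on the noisy intermediate representation plays the role of the negative-feedback signal: instead of supervising only the model's final output, it keeps intermediate features discriminative under injected variations. How the actual OVF and IRS instances define and combine these signals is specified in the paper itself.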
DOI: 10.48550/arxiv.2305.14561