Consistency-Guided Temperature Scaling Using Style and Content Information for Out-of-Domain Calibration
| Main author(s): | |
|---|---|
| Format: | Article |
| Language: | English |
| Subjects: | |
| Online access: | Order full text |
Abstract:
Research interests in the robustness of deep neural networks against domain shifts have been rapidly increasing in recent years. Most existing works, however, focus on improving the accuracy of the model rather than its calibration performance, which is another important requirement for trustworthy AI systems. Temperature scaling (TS), an accuracy-preserving post-hoc calibration method, has been proven effective in in-domain settings, but not in out-of-domain (OOD) settings, due to the difficulty of obtaining a validation set for the unseen domain beforehand. In this paper, we propose consistency-guided temperature scaling (CTS), a new temperature scaling strategy that can significantly enhance OOD calibration performance by providing mutual supervision among data samples in the source domains. Motivated by our observation that over-confidence stemming from inconsistent sample predictions is the main obstacle to OOD calibration, we propose to guide the scaling process by taking consistency into account in terms of two different aspects, style and content, which are the key components that can well represent data samples in multi-domain settings. Experimental results demonstrate that our proposed strategy outperforms existing works, achieving superior OOD calibration performance on various datasets. This is accomplished by employing only the source domains, without compromising accuracy, making our scheme directly applicable to various trustworthy AI systems.
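For context, the sketch below illustrates the standard temperature scaling baseline that the abstract refers to, not the paper's CTS method: a single temperature T is fitted on held-out in-domain validation logits by minimizing the negative log-likelihood, then reused at test time. Tensor names such as `val_logits` and `val_labels` are illustrative placeholders, and the optimizer settings are assumptions rather than values from the paper.

```python
# Minimal sketch of vanilla temperature scaling (post-hoc calibration).
# NOT the paper's CTS algorithm; it only shows the baseline that CTS extends.
import torch
import torch.nn.functional as F


def fit_temperature(val_logits: torch.Tensor, val_labels: torch.Tensor) -> float:
    """Fit a scalar temperature T > 0 on in-domain validation logits."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so that T stays positive
    optimizer = torch.optim.LBFGS([log_t], lr=0.1, max_iter=50)

    def closure():
        optimizer.zero_grad()
        # Dividing logits by T preserves the argmax, so accuracy is unchanged.
        loss = F.cross_entropy(val_logits / log_t.exp(), val_labels)
        loss.backward()
        return loss

    optimizer.step(closure)
    return log_t.exp().item()


def calibrated_probs(test_logits: torch.Tensor, temperature: float) -> torch.Tensor:
    """Apply the fitted temperature to test-time (possibly OOD) logits."""
    return F.softmax(test_logits / temperature, dim=-1)
```

The OOD difficulty described in the abstract is precisely that such a validation set from the unseen target domain is unavailable; CTS instead derives its supervision from consistency signals (style and content) among samples in the available source domains.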
DOI: 10.48550/arxiv.2402.15019