Style Normalization and Restitution for Domain Generalization and Adaptation

Bibliographic Details
Published in: IEEE Transactions on Multimedia, 2022, Vol. 24, pp. 3636-3651
Authors: Jin, Xin; Lan, Cuiling; Zeng, Wenjun; Chen, Zhibo
Format: Article
Language: English
Description
Abstract: For many computer vision applications, learned models usually perform well on the training datasets but suffer significant performance degradation when deployed in new environments, where there are typically style differences between the training and testing images. For high-level vision tasks, an effective domain-generalizable model is expected to learn feature representations that are both generalizable and discriminative. In this paper, we design a novel Style Normalization and Restitution (SNR) module to simultaneously ensure high generalization and discrimination capability of the networks. In SNR, we filter out style variations (e.g., illumination, color contrast) by performing Instance Normalization (IN) to obtain style-normalized features, where the discrepancy among different samples/domains is reduced. However, such a process is task-ignorant and inevitably removes some task-relevant discriminative information, which may hurt performance. To remedy this, we propose to distill task-relevant discriminative features from the residual (i.e., the difference between the original feature and the style-normalized feature) and add them back to the network to ensure high discrimination. Moreover, for better disentanglement, we enforce a dual restitution loss constraint to encourage better separation of task-relevant and task-irrelevant features. We validate the effectiveness of SNR on different vision tasks, including classification, semantic segmentation, and object detection. Experiments demonstrate that SNR improves the performance of networks for domain generalization (DG) and unsupervised domain adaptation (UDA).
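The abstract describes the SNR mechanism concretely enough to sketch. Below is a minimal, hypothetical PyTorch rendering based only on the description above; the class name `SNRBlock`, the SE-style channel gate used to split the residual, and the `reduction` parameter are assumptions for illustration, not the authors' actual implementation.

```python
import torch
import torch.nn as nn


class SNRBlock(nn.Module):
    """Sketch of a Style Normalization and Restitution block (assumed design)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Instance Normalization removes instance-specific style statistics.
        self.instance_norm = nn.InstanceNorm2d(channels, affine=True)
        # Assumed SE-style gate: predicts a per-channel mask in [0, 1] that
        # splits the residual into task-relevant and task-irrelevant parts.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor):
        normalized = self.instance_norm(x)    # style-normalized feature
        residual = x - normalized             # style plus lost discriminative info
        mask = self.gate(residual)            # channel-wise relevance weights
        relevant = residual * mask            # distilled task-relevant part (R+)
        irrelevant = residual * (1.0 - mask)  # task-irrelevant part (R-)
        restituted = normalized + relevant    # restitute discriminative info
        # R+ and R- would feed the paper's dual restitution loss during training.
        return restituted, relevant, irrelevant


# Usage: plug the block after a backbone stage.
features = torch.randn(2, 64, 32, 32)
out, r_plus, r_minus = SNRBlock(64)(features)
```

In this reading, the gate decides per channel how much of the residual carries task-relevant signal; the dual restitution loss described in the abstract would then push `relevant` to help the task and `irrelevant` not to.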
ISSN: 1520-9210, 1941-0077
DOI: 10.1109/TMM.2021.3104379