CLIP-guided black-box domain adaptation of image classification

Bibliographic details
Published in:	Signal, image and video processing, 2024-07, Vol.18 (5), p.4637-4646
Main authors:	Tian, Liang; Ye, Mao; Zhou, Lihua; He, Qichen
Format:	Article
Language:	English
Subjects:	Adaptation; Black boxes; Computer Imaging; Computer Science; Datasets; Image classification; Image Processing and Computer Vision; Labels; Multimedia Information Systems; Original Paper; Pattern Recognition and Graphics; Signal, Image and Speech Processing; Vision

Description
Abstract:	Recently, the remarkable success of large pre-trained models has attracted great attention. How to make full use of these models is an important open question. Black-box domain adaptation trains a target model through a cloud API offered by a large pre-trained model, without access to the model details or the source data. Existing black-box domain adaptation methods for image classification rely on the prediction results from the cloud API, but this information is very limited. On the other hand, the recently proposed vision-language model CLIP, trained on a large number of extensive datasets, aligns visual and text features in a common space, which provides useful auxiliary information. In this work, we propose a new black-box domain adaptation method guided by CLIP (BBC). The key idea is to generate more accurate pseudo-labels. Two strategies are adopted. The first, generation of joint pseudo-labels, combines the predictions from the cloud API and the CLIP model. The second, a structure-preserved pseudo-labeling strategy, further improves the pseudo-labels by using the previously stored predictions of the k closest neighbors. Experiments on three benchmark datasets show that our method achieves state-of-the-art results by a large margin.
ISSN:	1863-1703 1863-1711
DOI:	10.1007/s11760-024-03101-8
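
The two pseudo-labeling strategies named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the cloud API and CLIP predictions are combined, how neighbors are measured, or how neighbor predictions are aggregated, so everything specific here is an assumption made for illustration: the function names, the convex mixing weight alpha, cosine similarity over features, simple averaging, and the toy data. It is not the authors' implementation.

```python
import numpy as np


def joint_pseudo_labels(api_probs, clip_probs, alpha=0.5):
    """Blend cloud-API and CLIP class probabilities.

    ASSUMPTION: a convex combination with weight alpha; the paper's
    actual combination rule is not given in the abstract.
    """
    mixed = alpha * api_probs + (1.0 - alpha) * clip_probs
    return mixed / mixed.sum(axis=1, keepdims=True)


def knn_refine(features, stored_probs, k=2):
    """Replace each prediction with the mean stored prediction of its
    k nearest neighbours.

    ASSUMPTION: cosine similarity and plain averaging.
    """
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)             # a sample is not its own neighbour
    nn_idx = np.argsort(-sim, axis=1)[:, :k]   # indices of the k most similar samples
    return stored_probs[nn_idx].mean(axis=1)   # (n, k, C) -> (n, C)


# Toy data: six target samples, two classes, two feature clusters.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.95, 0.05],
                     [0.0, 1.0], [0.1, 0.9], [0.05, 0.95]])
api_probs = np.array([[0.9, 0.1], [0.3, 0.7], [0.8, 0.2],
                      [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]])
clip_probs = np.array([[0.8, 0.2], [0.5, 0.5], [0.9, 0.1],
                       [0.1, 0.9], [0.2, 0.8], [0.2, 0.8]])

joint = joint_pseudo_labels(api_probs, clip_probs)
refined = knn_refine(features, joint, k=2)

print(joint.argmax(axis=1))    # [0 1 0 1 1 1]
print(refined.argmax(axis=1))  # [0 0 0 1 1 1]
```

In this toy run the second sample is mislabeled after blending, and averaging over its two nearest neighbors corrects it, which is the intuition behind using neighborhood structure to clean pseudo-labels.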