CLIP-guided black-box domain adaptation of image classification

Bibliographic details
Published in: Signal, Image and Video Processing, 2024-07, Vol. 18(5), pp. 4637-4646
Main authors: Tian, Liang; Ye, Mao; Zhou, Lihua; He, Qichen
Format: Article
Language: English
Abstract: Recently, the significant success of large pre-trained models has attracted great attention. How to make full use of these models is an important issue. Black-box domain adaptation addresses it by training a target model through a cloud API backed by a large pre-trained model, without access to model details or source data. Existing black-box domain adaptation methods for image classification rely only on the prediction results returned by the cloud API, but this information is very limited. On the other hand, the recently proposed vision-language model CLIP, trained on large-scale datasets, aligns visual and text features in a common space and thus provides useful auxiliary information. In this work, we propose a new black-box domain adaptation method guided by CLIP (BBC). The key idea is to generate more accurate pseudo-labels. Two strategies are adopted. The first, generation of joint pseudo-labels, combines the predictions from the cloud API and the CLIP model. The second, a structure-preserved pseudo-labeling strategy, further improves the pseudo-labels using the previously stored predictions of each sample's k-closest neighbors. Experiments on three benchmark datasets show that our method achieves state-of-the-art results by a large margin.
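
The abstract describes the two pseudo-labeling strategies only at a high level. The minimal Python sketch below illustrates one plausible reading of them; the mixing weight lam, the neighbor count k, the cosine-similarity neighbor search, and all function names are illustrative assumptions, not the authors' actual implementation.

    import numpy as np

    def joint_pseudo_labels(p_api, p_clip, lam=0.5):
        # Combine (N, C) softmax outputs from the cloud API and CLIP.
        # lam is a hypothetical mixing weight; the abstract does not
        # specify the exact combination rule.
        p_joint = lam * p_api + (1.0 - lam) * p_clip
        return p_joint, p_joint.argmax(axis=1)

    def structure_preserved_labels(feats, stored_preds, k=5):
        # Refine labels by averaging the stored (N, C) predictions of
        # each sample's k-closest neighbors in feature space.
        f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
        sim = f @ f.T                           # (N, N) cosine similarities
        np.fill_diagonal(sim, -np.inf)          # exclude each sample itself
        nbrs = np.argsort(-sim, axis=1)[:, :k]  # indices of k neighbors
        refined = stored_preds[nbrs].mean(axis=1)
        return refined.argmax(axis=1)

In this reading, the joint labels initialize training, and the neighbor-averaged labels preserve the local structure of the target feature space when predictions are refreshed.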
ISSN: 1863-1703, 1863-1711
DOI: 10.1007/s11760-024-03101-8