Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study
Format: Article
Language: English
Abstract: Membership inference attacks (MIAs) aim to infer whether a data point has been used to train a machine learning model. These attacks can be employed to identify potential privacy vulnerabilities and detect unauthorized use of personal data. While MIAs have been traditionally studied for simple classification models, recent advancements in multi-modal pre-training, such as CLIP, have demonstrated remarkable zero-shot performance across a range of computer vision tasks. However, the sheer scale of data and models presents significant computational challenges for performing the attacks.

This paper takes a first step towards developing practical MIAs against large-scale multi-modal models. We introduce a simple baseline strategy by thresholding the cosine similarity between text and image features of a target point and propose further enhancing the baseline by aggregating cosine similarity across transformations of the target. We also present a new weakly supervised attack method that leverages ground-truth non-members (e.g., obtained by using the publication date of a target model and the timestamps of the open data) to further enhance the attack. Our evaluation shows that CLIP models are susceptible to our attack strategies, with our simple baseline achieving over $75\%$ membership identification accuracy. Furthermore, our enhanced attacks outperform the baseline across multiple models and datasets, with the weakly supervised attack demonstrating an average-case performance improvement of $17\%$ and being at least $7$X more effective at low false-positive rates. These findings highlight the importance of protecting the privacy of multi-modal foundational models, which were previously assumed to be less susceptible to MIAs due to less overfitting. Our code is available at https://github.com/ruoxi-jia-group/CLIP-MIA.
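The following is a minimal sketch of the cosine-similarity baseline and its augmentation-aggregated variant described in the abstract, written against the open-source OpenAI `clip` package. The model choice ("ViT-B/32"), the threshold `tau`, and the augmentation set are illustrative assumptions, not values taken from the paper; the authors' actual implementation is in the repository linked above.

```python
# Sketch of the baseline MIA: score a (image, caption) pair by the cosine
# similarity of its CLIP embeddings and predict "member" above a threshold.
import torch
import clip
from PIL import Image
from torchvision import transforms

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)  # assumed target model


def cosine_similarity(image: Image.Image, caption: str) -> float:
    """Cosine similarity between CLIP image and text embeddings."""
    image_input = preprocess(image).unsqueeze(0).to(device)
    text_input = clip.tokenize([caption]).to(device)
    with torch.no_grad():
        img_feat = model.encode_image(image_input)
        txt_feat = model.encode_text(text_input)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
    return (img_feat @ txt_feat.T).item()


def baseline_attack(image: Image.Image, caption: str, tau: float = 0.30) -> bool:
    """Baseline: threshold the raw cosine similarity (tau is a placeholder)."""
    return cosine_similarity(image, caption) > tau


def augmented_attack(image: Image.Image, caption: str,
                     tau: float = 0.30, n_aug: int = 8) -> bool:
    """Enhanced variant: aggregate similarity over random transformations
    of the target image before thresholding."""
    augment = transforms.Compose([
        transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
        transforms.RandomHorizontalFlip(),
    ])
    scores = [cosine_similarity(augment(image), caption) for _ in range(n_aug)]
    return sum(scores) / len(scores) > tau
```

In this reading, the weakly supervised attack would additionally use known non-members (e.g., images published after the model's release date) to calibrate the decision rule instead of a fixed `tau`; that calibration step is not shown here.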
DOI: 10.48550/arxiv.2310.00108