SoK: Reducing the Vulnerability of Fine-tuned Language Models to Membership Inference Attacks
Format: Article
Language: English
Abstract: Natural language processing models have experienced a significant upsurge in recent years, with numerous applications being built upon them. Many of these applications require fine-tuning generic base models on customized, proprietary datasets. This fine-tuning data is especially likely to contain personal or sensitive information about individuals, resulting in increased privacy risk. Membership inference attacks are the most commonly employed method for assessing the privacy leakage of a machine learning model. However, limited research is available on the factors that affect the vulnerability of language models to this kind of attack, or on the applicability of different defense strategies in the language domain. We provide the first systematic review of the vulnerability of fine-tuned large language models to membership inference attacks, the various factors that come into play, and the effectiveness of different defense strategies. We find that some training methods provide significantly reduced privacy risk, with the combination of differential privacy and low-rank adaptors achieving the best privacy protection against these attacks.
DOI: 10.48550/arxiv.2403.08481
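For readers unfamiliar with the attack class discussed in the abstract: a membership inference attack tries to decide whether a specific record was part of a model's training data. The sketch below shows the classic loss-threshold variant (in the style of Yeom et al., 2018), not the specific attacks evaluated in the paper; the model, loss function, and threshold are placeholders an attacker would supply and calibrate.

```python
import torch

def loss_threshold_mia(model: torch.nn.Module,
                       loss_fn,
                       inputs: torch.Tensor,
                       label: torch.Tensor,
                       threshold: float) -> bool:
    """Guess 'member' when the model's loss on (inputs, label) falls
    below a threshold calibrated on known non-member data. Fine-tuned
    models tend to fit their training records more tightly, so an
    unusually low loss is evidence of membership."""
    model.eval()
    with torch.no_grad():
        loss = loss_fn(model(inputs), label)
    return loss.item() < threshold

# Example: query a classifier with one candidate record (hypothetical
# names; any model returning logits works with CrossEntropyLoss).
# is_member = loss_threshold_mia(model, torch.nn.CrossEntropyLoss(),
#                                inputs, label, threshold=0.5)
```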
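The abstract's headline finding is that differential privacy combined with low-rank adaptors (LoRA) gives the strongest protection against these attacks. The sketch below is one plausible way to set up that combination, using the Hugging Face peft library for LoRA and Opacus for DP-SGD. It is not the paper's experimental configuration: the base model, adapter rank, noise multiplier, clipping norm, and toy dataset are all illustrative assumptions.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, get_peft_model
from opacus import PrivacyEngine

# Illustrative base model; the paper's models and tasks may differ.
model = AutoModelForSequenceClassification.from_pretrained("roberta-base")
tokenizer = AutoTokenizer.from_pretrained("roberta-base")

# LoRA: freeze the base weights and train only small low-rank adapter
# matrices injected into the attention projections (plus the task head).
lora_config = LoraConfig(task_type="SEQ_CLS", r=8, lora_alpha=16,
                         target_modules=["query", "value"])
model = get_peft_model(model, lora_config)

# Toy stand-in for a proprietary fine-tuning set.
texts = ["patient reports mild symptoms", "invoice approved by finance"]
labels = torch.tensor([0, 1])
enc = tokenizer(texts, padding=True, return_tensors="pt")
train_loader = DataLoader(
    TensorDataset(enc["input_ids"], enc["attention_mask"], labels),
    batch_size=2,
)

optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=5e-4)

# DP-SGD via Opacus: clip each example's gradient and add Gaussian
# noise; only the trainable (LoRA) parameters are affected.
privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=1.0,  # illustrative; sets the privacy/utility trade-off
    max_grad_norm=1.0,     # per-example gradient clipping bound
)

model.train()
for input_ids, attention_mask, batch_labels in train_loader:
    optimizer.zero_grad()
    out = model(input_ids=input_ids, attention_mask=attention_mask,
                labels=batch_labels)
    out.loss.backward()
    optimizer.step()
```

Intuitively, DP-SGD bounds each record's influence on the model by clipping per-example gradients and adding noise, while LoRA shrinks the trainable parameter set; together they limit how much any single fine-tuning record can be memorized, which is what membership inference attacks exploit.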