Red-Teaming Segment Anything Model
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Foundation models have emerged as pivotal tools, tackling many complex tasks
through pre-training on vast datasets and subsequent fine-tuning for specific
applications. The Segment Anything Model is one of the first and most
well-known foundation models for computer vision segmentation tasks. This work
presents a multi-faceted red-teaming analysis that tests the Segment Anything
Model against challenging tasks: (1) We analyze the impact of style transfer on
segmentation masks, demonstrating that applying adverse weather conditions and
raindrops to dashboard images of city roads significantly distorts generated
masks. (2) We focus on assessing whether the model can be used for attacks on
privacy, such as recognizing celebrities' faces, and show that the model
possesses some undesired knowledge in this task. (3) Finally, we check how
robust the model is to adversarial attacks on segmentation masks under text
prompts. We not only show the effectiveness of popular white-box attacks and
the model's resistance to black-box attacks but also introduce a novel approach,
the Focused Iterative Gradient Attack (FIGA), which combines white-box approaches
to construct an efficient attack that modifies fewer pixels.
All of our testing methods and analyses indicate a need for enhanced safety
measures in foundation models for image segmentation. |
---|---|
DOI: | 10.48550/arxiv.2404.02067 |