No Annotations for Object Detection in Art through Stable Diffusion
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Abstract: | Object detection in art is a valuable tool for the digital humanities, as it
allows for faster identification of objects in artistic and historical images
compared to humans. However, annotating such images poses significant
challenges due to the need for specialized domain expertise. We present NADA
(no annotations for detection in art), a pipeline that leverages diffusion
models' art-related knowledge for object detection in paintings without the
need for full bounding box supervision. Our method, which supports both
weakly-supervised and zero-shot scenarios and does not require any fine-tuning
of its pretrained components, consists of a class proposer based on large
vision-language models and a class-conditioned detector based on Stable
Diffusion. NADA is evaluated on two artwork datasets, ArtDL 2.0 and IconArt,
outperforming prior work in weakly-supervised detection, while being the first
work for zero-shot object detection in art. Code is available at
https://github.com/patrick-john-ramos/nada |
---|---|
DOI: | 10.48550/arxiv.2412.06286 |
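
The abstract describes a two-stage design: a class proposer followed by a class-conditioned detector. The sketch below only illustrates how two such stages could be chained. It is not the authors' implementation: `detect_objects`, `Detection`, and both callable interfaces are hypothetical names chosen for illustration, and the real models (a vision-language model and Stable Diffusion) are replaced by plain Python callables supplied by the caller.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

# Bounding box as (x_min, y_min, x_max, y_max); this format is an assumption.
Box = Tuple[float, float, float, float]


@dataclass
class Detection:
    label: str
    box: Box


# Hypothetical stage interfaces (not taken from the NADA code base):
# stage 1 maps an image to candidate class names,
# stage 2 maps an image plus one class name to boxes for that class.
ClassProposer = Callable[[Any], List[str]]
ClassConditionedDetector = Callable[[Any, str], List[Box]]


def detect_objects(
    image: Any,
    propose_classes: ClassProposer,
    locate_class: ClassConditionedDetector,
) -> List[Detection]:
    """Chain the two stages: propose classes, then localize each one."""
    detections: List[Detection] = []
    for label in propose_classes(image):
        for box in locate_class(image, label):
            detections.append(Detection(label=label, box=box))
    return detections
```

Passing both stages in as callables keeps the sketch independent of any particular model, so a proposer could be swapped without touching the detection loop.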