Evaluating Text-to-Image Diffusion Models for Texturing Synthetic Data
Format: Article
Language: English
Abstract: Building generic robotic manipulation systems often requires large amounts of real-world data, which can be difficult to collect. Synthetic data generation offers a promising alternative, but limiting the sim-to-real gap requires significant engineering effort. To reduce this effort, we investigate the use of pretrained text-to-image diffusion models for texturing synthetic images and compare this approach with using random textures, a common domain randomization technique in synthetic data generation. We focus on generating object-centric representations, such as keypoints and segmentation masks, which are important for robotic manipulation and require precise annotations. We evaluate the efficacy of both texturing methods by training models on the synthetic data and measuring their performance on real-world datasets for three object categories: shoes, T-shirts, and mugs. Surprisingly, we find that texturing with a diffusion model performs on par with random textures, despite producing seemingly more realistic images. Our results suggest that, for now, using diffusion models for texturing does not benefit synthetic data generation for robotics. The code, data, and trained models are available at https://github.com/tlpss/diffusing-synthetic-data.git.
DOI: 10.48550/arxiv.2411.10164
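The abstract describes the pipeline only at a high level, and this record contains no implementation details, so the snippet below is a minimal, hypothetical illustration of diffusion-based texturing rather than the authors' method. It assumes a depth-conditioned Stable Diffusion pipeline from the Hugging Face diffusers library; the model identifier, prompt, and file names are placeholders. See the linked repository for the actual implementation.

```python
# Hypothetical sketch (not the paper's implementation): retexturing a rendered
# synthetic image with a pretrained depth-conditioned Stable Diffusion model,
# using the Hugging Face `diffusers` library. Model id, prompt, and file paths
# are illustrative assumptions.
import torch
from PIL import Image
from diffusers import StableDiffusionDepth2ImgPipeline

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
).to("cuda")

# An untextured (or plainly textured) render from the simulator. Because only
# the appearance changes, annotations such as keypoints and segmentation masks
# produced by the renderer remain valid for the retextured image.
render = Image.open("synthetic_render.png").convert("RGB")

textured = pipe(
    prompt="a photo of a shoe on a wooden table, realistic lighting",
    image=render,
    strength=0.8,            # how far the output may deviate from the render
    num_inference_steps=30,
).images[0]

textured.save("diffusion_textured_render.png")
```

The random-texture baseline mentioned in the abstract would instead assign arbitrary images or procedural patterns to object and background materials inside the renderer itself, with no diffusion model involved.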