Towards Automatic Parsing of Structured Visual Content through the Use of Synthetic Data
Structured Visual Content (SVC) such as graphs, flow charts, or the like are used by authors to illustrate various concepts. While such depictions allow the average reader to better understand the contents, images containing SVCs are typically not machine-readable. This, in turn, not only hinders au...
Gespeichert in:
Hauptverfasser: | , , , , , , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Structured Visual Content (SVC) such as graphs, flow charts, or the like are
used by authors to illustrate various concepts. While such depictions allow the
average reader to better understand the contents, images containing SVCs are
typically not machine-readable. This, in turn, not only hinders automated
knowledge aggregation, but also the perception of displayed in-formation for
visually impaired people. In this work, we propose a synthetic dataset,
containing SVCs in the form of images as well as ground truths. We show the
usage of this dataset by an application that automatically extracts a graph
representation from an SVC image. This is done by training a model via common
supervised learning methods. As there currently exist no large-scale public
datasets for the detailed analysis of SVC, we propose the Synthetic SVC (SSVC)
dataset comprising 12,000 images with respective bounding box annotations and
detailed graph representations. Our dataset enables the development of strong
models for the interpretation of SVCs while skipping the time-consuming dense
data annotation. We evaluate our model on both synthetic and manually annotated
data and show the transferability of synthetic to real via various metrics,
given the presented application. Here, we evaluate that this proof of concept
is possible to some extend and lay down a solid baseline for this task. We
discuss the limitations of our approach for further improvements. Our utilized
metrics can be used as a tool for future comparisons in this domain. To enable
further research on this task, the dataset is publicly available at
https://bit.ly/3jN1pJJ |
---|---|
DOI: | 10.48550/arxiv.2204.14136 |