Automated Data Visualization from Natural Language via Large Language Models: An Exploratory Study
Format: Article
Language: English
Abstract: The Natural Language to Visualization (NL2Vis) task aims to transform natural-language descriptions into visual representations for a grounded table, enabling users to gain insights from vast amounts of data. Recently, many deep learning-based approaches have been developed for NL2Vis. Despite the considerable efforts made by these approaches, challenges persist in visualizing data sourced from unseen databases or spanning multiple tables. Taking inspiration from the remarkable generation capabilities of Large Language Models (LLMs), this paper conducts an empirical study to evaluate their potential in generating visualizations and explores the effectiveness of in-context learning prompts for enhancing this task. In particular, we first explore ways of transforming structured tabular data into sequential text prompts so as to feed them into LLMs, and analyze which table content contributes most to NL2Vis. Our findings suggest that transforming structured tabular data into programs is effective, and that it is essential to consider the table schema when formulating prompts. Furthermore, we evaluate two types of LLMs, fine-tuned models (e.g., T5-Small) and inference-only models (e.g., GPT-3.5), against state-of-the-art methods on the NL2Vis benchmark (i.e., nvBench). The experimental results reveal that LLMs outperform the baselines, with inference-only models consistently exhibiting performance improvements, at times even surpassing fine-tuned models when provided with certain few-shot demonstrations through in-context learning. Finally, we analyze when LLMs fail in NL2Vis, and propose iteratively updating the results using strategies such as chain-of-thought, role-playing, and code-interpreter. The experimental results confirm the efficacy of iterative updates and hold great potential for future study.
DOI: 10.48550/arxiv.2404.17136
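
This record contains only the abstract and no code, so the following Python sketch is purely illustrative of the abstract's first finding: serializing a table as a program (here, a CREATE TABLE statement plus a few sample rows, which makes the table schema explicit) and using it as the table portion of a few-shot LLM prompt. The function names, the example schema, and the prompt layout are assumptions for illustration, not the paper's actual implementation.

```python
# Minimal sketch, assuming a program-style table serialization: render the
# table as SQL (schema plus sample rows), then wrap it in a few-shot NL2Vis
# prompt. Everything below (names, schema, output convention) is hypothetical.

def table_to_program(table_name, columns, rows, max_rows=3):
    """Render a table as a CREATE TABLE statement followed by INSERT rows."""
    col_defs = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns)
    lines = [f"CREATE TABLE {table_name} (\n  {col_defs}\n);"]
    for row in rows[:max_rows]:  # a few sample rows give the model value context
        values = ", ".join(repr(v) for v in row)
        lines.append(f"INSERT INTO {table_name} VALUES ({values});")
    return "\n".join(lines)

def build_prompt(demonstrations, table_program, nl_query):
    """Assemble a few-shot in-context learning prompt for NL2Vis."""
    parts = []
    for demo_table, demo_query, demo_vis in demonstrations:
        parts.append(f"{demo_table}\n-- Question: {demo_query}\n-- Visualization: {demo_vis}")
    parts.append(f"{table_program}\n-- Question: {nl_query}\n-- Visualization:")
    return "\n\n".join(parts)

# Usage with a toy table and zero demonstrations (zero-shot):
columns = [("country", "TEXT"), ("year", "INT"), ("gdp", "REAL")]
rows = [("France", 2020, 2.6), ("Japan", 2020, 5.0)]
program = table_to_program("economy", columns, rows)
print(build_prompt([], program, "Show GDP per country in 2020 as a bar chart."))
```

Few-shot prompting then amounts to passing (table program, question, visualization) triples as `demonstrations`, matching the abstract's observation that such in-context examples can lift inference-only models past fine-tuned ones.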
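The abstract's closing strategies (chain-of-thought, role-playing, code-interpreter) can likewise be pictured as a repair loop. The sketch below is a hedged guess at such a loop, not the paper's method: `ask_llm` and `run_spec` are injected placeholders for any chat-completion client and any visualization-spec validator, and the prompt wording is invented.

```python
# Hedged sketch of an iterative-update loop: a role-playing system message
# with a chain-of-thought cue, plus code-interpreter-style repair, where a
# failed execution's error is fed back to the model for a corrected answer.

ROLE_PLAY_SYSTEM = (
    "You are an expert data-visualization engineer. "   # role-playing strategy
    "Think step by step before giving the final spec."  # chain-of-thought cue
)

def generate_with_repair(ask_llm, run_spec, prompt, max_rounds=3):
    """Generate a visualization spec, retrying with error feedback on failure.

    ask_llm(system, user) -> str : wraps some chat-completion API (assumed).
    run_spec(spec)               : raises if the spec fails to execute (assumed).
    """
    spec = ask_llm(ROLE_PLAY_SYSTEM, prompt)
    for _ in range(max_rounds):
        try:
            run_spec(spec)  # code-interpreter-style validation of the output
            return spec     # the spec executed cleanly; accept it
        except Exception as err:
            retry = (f"{prompt}\n\nYour previous answer failed with: {err}\n"
                     "Please return a corrected spec.")
            spec = ask_llm(ROLE_PLAY_SYSTEM, retry)
    return spec  # best effort after exhausting the repair rounds
```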