Lux: Always-on Visualization Recommendations for Exploratory Dataframe Workflows
Main authors: | , , , , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Summary: | Exploratory data science largely happens in computational notebooks with
dataframe APIs, such as pandas, that support flexible means to transform,
clean, and analyze data. Yet, visually exploring data in dataframes remains
tedious, requiring substantial programming effort for visualization and mental
effort to determine what analysis to perform next. We propose Lux, an always-on
framework for accelerating visual insight discovery in dataframe workflows.
When users print a dataframe in their notebooks, Lux recommends visualizations
to provide a quick overview of the patterns and trends and suggests promising
analysis directions. Lux features a high-level language for generating
visualizations on demand to encourage rapid visual experimentation with data.
We demonstrate that through the use of a careful design and three system
optimizations, Lux adds no more than two seconds of overhead on top of pandas
for over 98% of datasets in the UCI repository. We evaluate Lux in terms of
usability via a controlled first-use study and interviews with early adopters,
finding that Lux helps fulfill the needs of data scientists for visualization
support within their dataframe workflows. Lux has already been embraced by data
science practitioners, with over 3.1k stars on GitHub. |
DOI: | 10.48550/arxiv.2105.00121 |