AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions
Main authors: | , , , , , , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Abstract: | Data science tasks involving tabular data present complex challenges that
require sophisticated problem-solving approaches. We propose AutoKaggle, a
powerful and user-centric framework that assists data scientists in completing
daily data pipelines through a collaborative multi-agent system. AutoKaggle
implements an iterative development process that combines code execution,
debugging, and comprehensive unit testing to ensure code correctness and logic
consistency. The framework offers highly customizable workflows, allowing users
to intervene at each phase, thus integrating automated intelligence with human
expertise. Our universal data science toolkit, comprising validated functions
for data cleaning, feature engineering, and modeling, forms the foundation of
this solution, enhancing productivity by streamlining common tasks. We selected
8 Kaggle competitions to simulate data processing workflows in real-world
application scenarios. Evaluation results demonstrate that AutoKaggle achieves
a validation submission rate of 0.85 and a comprehensive score of 0.82 in
typical data science pipelines, fully proving its effectiveness and
practicality in handling complex data science tasks. |
---|---|
DOI: | 10.48550/arxiv.2410.20424 |