Infant Agent: A Tool-Integrated, Logic-Driven Agent with Cost-Effective API Usage

Bibliographic Details
Published in:	arXiv.org 2024-11
Main Authors:	Lei, Bin; Li, Yuchen; Zeng, Yiming; Ren, Tao; Luo, Yi; Shi, Tianyu; Gao, Zitian; Hu, Zeyu; Kang, Weitai; Chen, Qiuwu
Format:	Article
Language:	English
Subjects:	Accuracy; Infants; Large language models; Operators (mathematics); Reasoning; Task complexity
Online Access:	Full text

Description
Summary:	Despite the impressive capabilities of large language models (LLMs), they currently exhibit two primary limitations: (I) they struggle to autonomously solve real-world engineering problems, and (II) they remain challenged in reasoning through complex logic problems. To address these challenges, we developed the Infant Agent, integrating task-aware functions, operators, a hierarchical management system, and a memory retrieval mechanism. Together, these components enable large language models to sustain extended reasoning processes and handle complex, multi-step tasks efficiently, all while significantly reducing API costs. Using the Infant Agent, GPT-4o's accuracy on the SWE-bench-lite dataset rises from 0.33% to 30%, and its accuracy on the AIME-2024 mathematics competition rises from 13.3% to 37%.
ISSN:	2331-8422