FlexFL: Flexible and Effective Fault Localization with Open-Source Large Language Models
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Due to the impressive code comprehension ability of Large Language Models
(LLMs), a few studies have proposed to leverage LLMs to locate bugs, i.e.,
LLM-based FL, and demonstrated promising performance. However, first, these
methods are limited in flexibility. They rely on bug-triggering test cases to
perform FL and cannot make use of other available bug-related information,
e.g., bug reports. Second, they are built upon proprietary LLMs, which are,
although powerful, confronted with risks in data privacy. To address these
limitations, we propose a novel LLM-based FL framework named FlexFL, which can
flexibly leverage different types of bug-related information and effectively
work with open-source LLMs. FlexFL is composed of two stages. In the first
stage, FlexFL reduces the search space of buggy code using state-of-the-art FL
techniques of different families and provides a candidate list of bug-related
methods. In the second stage, FlexFL leverages LLMs to delve deeper to
double-check the code snippets of methods suggested by the first stage and
refine the fault localization results. In each stage, FlexFL constructs agents
based on open-source LLMs. The agents share a single pipeline that does not
presuppose any particular type of bug-related information and can invoke function
calls even when the LLM lacks out-of-the-box function-calling support. Extensive experimental results on
Defects4J demonstrate that FlexFL outperforms the baselines and can work with
different open-source LLMs. Specifically, FlexFL with the lightweight open-source
LLM Llama3-8B locates 42 and 63 more bugs than the two state-of-the-art
LLM-based FL approaches AutoFL and AgentFL, respectively, both of which use GPT-3.5. |
---|---|
DOI: | 10.48550/arxiv.2411.10714 |
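
The two-stage flow described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `merge_candidates` and `refine`, the reciprocal-rank voting used to combine rankings, and the `suspicion` callable that stands in for an LLM agent are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def merge_candidates(rankings: List[List[str]], top_k: int) -> List[str]:
    """Stage 1 (sketch): combine ranked method lists from several FL techniques.

    Reciprocal-rank voting is an assumption of this sketch, not a detail
    taken from the summary.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for position, method in enumerate(ranking, start=1):
            scores[method] = scores.get(method, 0.0) + 1.0 / position
    # Highest score first; ties are broken by method name for determinism.
    ordered = sorted(scores, key=lambda m: (-scores[m], m))
    return ordered[:top_k]


def refine(candidates: List[str], suspicion: Callable[[str], float]) -> List[str]:
    """Stage 2 (sketch): re-rank the candidates with a scorer.

    `suspicion` is a hypothetical stand-in for an LLM agent that inspects
    each method's code and returns a suspiciousness score.
    """
    # sorted() is stable, so equally scored methods keep their stage-1 order.
    return sorted(candidates, key=lambda m: -suspicion(m))


# Hypothetical rankings from two different FL techniques.
rankings = [
    ["A.foo", "B.bar", "C.baz"],
    ["B.bar", "A.foo", "D.qux"],
]
candidates = merge_candidates(rankings, top_k=3)

# Hypothetical scores that an LLM agent might assign after reading the code.
llm_scores = {"A.foo": 0.2, "B.bar": 0.9, "C.baz": 0.5}
final_ranking = refine(candidates, lambda m: llm_scores.get(m, 0.0))
```

The split keeps the expensive step small: only the `top_k` methods that survive the first stage are passed to the scorer in the second stage.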