Exploring Critical Testing Scenarios for Decision-Making Policies: An LLM Approach
Format: Article
Language: English
Abstract: Recent advances in decision-making policies have led to significant progress in fields such as autonomous driving and robotics. However, testing these policies remains crucial, given the existence of critical scenarios that may threaten their reliability. Despite ongoing research, challenges such as low testing efficiency and limited diversity persist due to the complexity of the decision-making policies and their environments. To address these challenges, this paper proposes an adaptable Large Language Model (LLM)-driven online testing framework to explore critical and diverse testing scenarios for decision-making policies. Specifically, we design a "generate-test-feedback" pipeline with templated prompt engineering to harness the world knowledge and reasoning abilities of LLMs. Additionally, a multi-scale scenario generation strategy is proposed to address the limitations of LLMs in making fine-grained adjustments, further enhancing testing efficiency. Finally, the proposed LLM-driven method is evaluated on five widely recognized benchmarks, and the experimental results demonstrate that our method significantly outperforms baseline methods in uncovering both critical and diverse scenarios. These findings suggest that LLM-driven methods hold significant promise for advancing the testing of decision-making policies.
DOI: 10.48550/arxiv.2412.06684
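
The abstract describes a "generate-test-feedback" loop driven by templated prompts, combined with a multi-scale generation step to offset the LLM's weakness at fine-grained numeric adjustments. Below is a minimal, hypothetical sketch of how such a loop might be wired together. Every name here (`PROMPT_TEMPLATE`, `query_llm`, `run_policy`, `refine`, `explore`) and the toy scoring stub are illustrative assumptions, not the paper's actual implementation.

```python
import random

# Hypothetical sketch of a "generate-test-feedback" testing loop.
# The LLM call and the policy rollout are replaced with toy stubs so
# the sketch runs end to end; neither reflects the paper's real system.

PROMPT_TEMPLATE = (
    "You are testing a decision-making policy for {domain}.\n"
    "Last scenario: {scenario}\n"
    "Feedback from the last run: {feedback}\n"
    "Propose a new scenario (JSON) that is more likely to expose a "
    "failure and is distinct from scenarios already tried."
)

def query_llm(prompt: str, scenario: dict) -> dict:
    # Stand-in for a real LLM call: here we just apply a coarse random
    # change to every numeric scenario parameter.
    return {k: v + random.uniform(-1.0, 1.0) for k, v in scenario.items()}

def run_policy(scenario: dict) -> float:
    # Stand-in for executing the policy under test in a simulator;
    # returns a criticality score in [0, 1] (higher = closer to failure).
    return min(1.0, abs(scenario["obstacle_speed"]) / 10 + random.uniform(0, 0.2))

def refine(scenario: dict, scale: float) -> dict:
    # Multi-scale step: fine-grained numeric perturbation at the given
    # scale, compensating for the LLM's coarse-grained proposals.
    return {k: v + random.uniform(-scale, scale) for k, v in scenario.items()}

def explore(initial: dict, domain: str, iterations: int = 20,
            threshold: float = 0.9) -> list:
    scenario, feedback, critical = dict(initial), "none yet", []
    for _ in range(iterations):
        # Generate: coarse scenario proposal from the (stubbed) LLM.
        prompt = PROMPT_TEMPLATE.format(domain=domain, scenario=scenario,
                                        feedback=feedback)
        scenario = query_llm(prompt, scenario)
        # Refine at decreasing scales, keeping changes that raise criticality.
        for scale in (1.0, 0.1, 0.01):
            candidate = refine(scenario, scale)
            if run_policy(candidate) > run_policy(scenario):
                scenario = candidate
        # Test and feed back: score the scenario and summarize the result
        # for the next prompt iteration.
        score = run_policy(scenario)
        feedback = f"criticality score {score:.2f}"
        if score >= threshold:
            critical.append(dict(scenario))
    return critical

if __name__ == "__main__":
    found = explore({"obstacle_speed": 2.0, "obstacle_distance": 15.0},
                    "autonomous driving")
    print(f"{len(found)} critical scenario(s) found")
```

In the paper's setting, `query_llm` would call an actual model and `run_policy` would roll out the policy under test in its environment; the decreasing-scale inner loop is one plausible reading of the multi-scale scenario generation strategy the abstract mentions.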