DialTest‐EA: An Enhanced Fuzzing Approach With Energy Adjustment for Dialogue Systems via Metamorphic Testing

Bibliographic details
Published in: Software Testing, Verification & Reliability, 2025-01, Vol. 35 (1), p. n/a
Main authors: Chen, Haibo; Chen, Jinfu; Wu, Yucheng; Cai, Saihua; Ahmad, Bilal; Huang, Rubing; Wang, Shengran; Zhang, Chi
Format: Article
Language: English
Description
Summary: Deep neural networks (DNNs) possess potent feature learning capability, enabling them to comprehend natural language, which strongly supports the development of dialogue systems. However, dialogue systems often behave incorrectly in corner cases, which may cause misunderstanding or economic loss. To test and debug dialogue systems, DialTest, a popular fuzzing framework based on metamorphic testing with Gini impurity guidance, has been proposed. However, DialTest treats all seeds (the initial test inputs from which mutated test inputs are generated) equally during the fuzzing process and does not differentiate between them, which limits its ability to detect incorrect behaviours. In this paper, we propose to enhance DialTest with a lightweight energy adjustment strategy, yielding DialTest with Energy Adjustment (DialTest‐EA). DialTest‐EA employs the ant colony optimization (ACO) algorithm to adjust the mutation energy of each seed adaptively, ensuring that promising seeds have more opportunities to generate subsequent test inputs. To evaluate the effectiveness of DialTest‐EA, we conduct a series of comparisons with the original DialTest and a random mutation strategy. The experimental results show that DialTest‐EA outperforms the compared methods in both the intent detection and slot filling tasks. Compared with the original DialTest, intent detection accuracy on the test cases generated by the proposed method is reduced by more than 14%, and slot filling accuracy is reduced by more than 8%; lower accuracy on generated test cases means that more incorrect behaviours are exposed. We introduce an adaptive energy adjustment strategy by implementing the ACO algorithm and propose an enhanced Gini‐guided fuzzing method called DialTest‐EA for dialogue systems. DialTest‐EA differentiates seeds according to the testing feedback and allocates testing resources adaptively through ACO optimization.
ISSN: 0960-0833, 1099-1689
DOI: 10.1002/stvr.1897
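
The summary describes Gini-guided fuzzing in which an ACO-style mechanism gives promising seeds more mutation energy. The following is a minimal sketch of that idea, not the authors' implementation: the record does not give the paper's formulas, so the reward definition (Gini impurity of a mutant's predicted distribution), the evaporation rate `rho`, the uniform initial pheromone, and the names `Seed`, `update_pheromone`, and `allocate_energy` are all assumptions made for illustration.

```python
from dataclasses import dataclass


def gini_impurity(probs):
    """Gini impurity 1 - sum(p_i^2) of a predicted class distribution."""
    return 1.0 - sum(p * p for p in probs)


@dataclass
class Seed:
    text: str
    pheromone: float = 1.0  # assumption: every seed starts with equal pheromone


def update_pheromone(seed, reward, rho=0.5):
    """ACO-style update: evaporate old pheromone, then deposit the reward.

    Both rho and the choice of reward are illustrative assumptions.
    """
    seed.pheromone = (1.0 - rho) * seed.pheromone + reward


def allocate_energy(seeds, budget):
    """Split a mutation budget across seeds in proportion to pheromone.

    Each seed keeps at least one mutation, so the total can slightly
    exceed the budget after rounding.
    """
    total = sum(s.pheromone for s in seeds)
    return [max(1, round(budget * s.pheromone / total)) for s in seeds]


seeds = [Seed("book a flight to Boston"), Seed("play some jazz")]
# Hypothetical model outputs for one mutant of each seed.
update_pheromone(seeds[0], gini_impurity([0.5, 0.5]))  # uncertain prediction
update_pheromone(seeds[1], gini_impurity([1.0, 0.0]))  # confident prediction
energy = allocate_energy(seeds, budget=30)
print(energy)
```

In this sketch the seed whose mutant left the model uncertain ends the round with more pheromone and therefore receives the larger share of the next round's mutations, which is the differentiation between seeds that the summary says the original DialTest lacks.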