Automatic Requirement Dependency Extraction Based on Integrated Active Learning Strategies

Bibliographic Details
Published in: International journal of automation and computing 2024-10, Vol.21 (5), p.993-1010
Main Authors: Guan, Hui; Cai, Guorong; Xu, Hang
Format: Article
Language: English
Online Access: Full text
Description
Abstract: Requirement dependency extraction is a cognitively challenging and error-prone task, so this paper proposes an automatic requirement dependency extraction method based on integrated active learning strategies. The coefficient of variation method is used to determine the weights of three impact factors: uncertainty probability, text similarity difference degree, and active learning variant prediction divergence degree. The three factors are combined with the proposed calculation formula to measure the information value of dependency pairs, and the top K dependency pairs with the highest comprehensive evaluation value are selected as the optimal samples. As the optimal samples are continuously added to the initial training set, the performance of the active learning model using different dependency features for requirement dependency extraction improves rapidly. Consequently, for the same number of samples, the method achieves higher evaluation measures for requirement dependency extraction than other active learning strategies. Finally, the proposed method using the PV-DM dependency feature improves the weight-F1 by 2.71%, the weight-recall by 2.45%, and the weight-precision by 2.64% in comparison with other strategies, and saves approximately 46% of the labelled data compared with the machine learning approach.
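The selection step summarized in the abstract can be illustrated with a minimal sketch. The abstract does not reproduce the paper's calculation formula, so the weighted sum used here is an assumption, as are the function names `cv_weights` and `select_top_k` and the example scores. The only part taken from the abstract is the overall shape: weights derived from the coefficient of variation of three factors, a combined value per dependency pair, and selection of the top K pairs.

```python
from statistics import mean, pstdev


def cv_weights(columns):
    """Weight each factor by its coefficient of variation (std / mean),
    normalised so the weights sum to 1. Falls back to equal weights
    when no factor varies at all."""
    cvs = []
    for col in columns:
        m = mean(col)
        cvs.append(pstdev(col) / m if m else 0.0)
    total = sum(cvs)
    if total == 0:
        return [1.0 / len(columns)] * len(columns)
    return [cv / total for cv in cvs]


def select_top_k(scores, k):
    """scores holds one (uncertainty, similarity_difference, divergence)
    tuple per candidate dependency pair. Returns the indices of the k
    pairs with the highest combined value, plus the factor weights.

    ASSUMPTION: the combined value is a plain weighted sum; the paper's
    actual formula is not given in the abstract."""
    columns = list(zip(*scores))
    weights = cv_weights(columns)
    values = [sum(w * x for w, x in zip(weights, row)) for row in scores]
    ranked = sorted(range(len(scores)), key=lambda i: values[i], reverse=True)
    return ranked[:k], weights


# Hypothetical factor scores for three unlabelled dependency pairs.
candidates = [
    (0.9, 0.5, 0.5),
    (0.1, 0.5, 0.5),
    (0.5, 0.5, 0.5),
]
top, weights = select_top_k(candidates, k=2)
print(top, weights)
```

In this toy input only the first factor varies across candidates, so it receives all of the weight. This is the intended behaviour of coefficient of variation weighting: a factor that does not discriminate between candidates contributes nothing to the ranking.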
ISSN: 2731-538X, 2153-182X, 1476-8186, 2731-5398, 2153-1838, 1751-8520
DOI: 10.1007/s11633-023-1420-1