An empirical comparison of four Java-based regression test selection techniques

Bibliographic details
Published in: The Journal of systems and software 2022-04, Vol. 186, p. 111174, Article 111174
Main authors: Shin, Min Kyung; Ghosh, Sudipto; Vijayasarathy, Leo R.
Format: Article
Language: English
Description
Summary: Regression testing is a critical but expensive activity that ensures previously tested functionality is not broken by changes made to the code. Regression test selection (RTS) techniques aim to select and run only those test cases impacted by code changes. The techniques differ in their selection accuracy, test suite size reduction, time to select and run the test cases, and the fault detection ability of the selected test cases. This paper presents an empirical comparison of four Java-based RTS techniques (Ekstazi, HyRTS, OpenClover and STARTS) using multiple revisions from five open source projects. The results show that STARTS selects more test cases than Ekstazi and HyRTS. OpenClover selects the most test cases. Safety violations measure the extent to which a technique misses test cases that should be selected; precision violations measure the extent to which it selects test cases that are not impacted. Using HyRTS as the baseline, OpenClover had significantly worse safety violations compared to STARTS and Ekstazi, and significantly worse precision violations compared to Ekstazi. While STARTS and Ekstazi did not differ on safety violations, Ekstazi had significantly fewer precision violations than STARTS. The average fault detection ability of the RTS techniques was 8.75% lower than that of the original test suite.
• OpenClover and STARTS selected the largest number of test cases among the RTS tools.
• The average end-to-end time reduction of the RTS tools was 40.49%.
• OpenClover had the worst safety and precision violations with HyRTS as the baseline.
• Selected tests had 8.75% lower fault detection ability than the original test suite.
ISSN: 0164-1212, 1873-1228
DOI: 10.1016/j.jss.2021.111174
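
The summary compares tools by safety and precision violations against a baseline (HyRTS), but this record does not give the formulas. The sketch below assumes one formulation that is common in the RTS literature: tests the baseline selected but the tool missed (safety), and tests the tool selected but the baseline did not (precision), each divided by the size of the union of both selections. The function name `violation_rates` and the test names are hypothetical, and the paper's exact definitions may differ.

```python
def violation_rates(baseline, selected):
    """Return (safety_violation, precision_violation) as fractions in [0, 1].

    Assumed formulation (not taken from the record):
      safety    = |baseline - selected| / |baseline ∪ selected|
      precision = |selected - baseline| / |baseline ∪ selected|
    """
    baseline, selected = set(baseline), set(selected)
    union = baseline | selected
    if not union:
        # Neither tool selected anything, so there is nothing to violate.
        return 0.0, 0.0
    missed = baseline - selected   # tests the tool should have selected
    extra = selected - baseline    # tests the tool selected unnecessarily
    return len(missed) / len(union), len(extra) / len(union)


# Hypothetical selections for a single revision.
hyrts_selection = {"FooTest", "BarTest", "BazTest"}
other_selection = {"FooTest", "BazTest", "QuxTest", "QuuxTest"}

safety, precision = violation_rates(hyrts_selection, other_selection)
print(f"safety violation: {safety:.2f}, precision violation: {precision:.2f}")
```

Under this formulation both values are 0 when a tool selects exactly the baseline's tests, and a tool that over-selects (as the summary reports for OpenClover) would show a higher precision violation.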