Quantifying, Characterizing, and Mitigating Flakily Covered Program Elements
Code coverage measures the degree to which source code elements (e.g., statements, branches) are invoked during testing. Despite growing evidence that coverage is a problematic measurement, it is often used to make decisions about where testing effort should be invested. For example, using coverage...
Gespeichert in:
Veröffentlicht in: | IEEE transactions on software engineering 2022-03, Vol.48 (3), p.1018-1029 |
---|---|
Hauptverfasser: | , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Code coverage measures the degree to which source code elements (e.g., statements, branches) are invoked during testing. Despite growing evidence that coverage is a problematic measurement, it is often used to make decisions about where testing effort should be invested. For example, using coverage as a guide, tests should be written to invoke the non-covered program elements. At their core, coverage measurements assume that invocation of a program element during any test is equally valuable. Yet in reality, some tests are more robust than others. As a concrete instance of this, we posit in this paper that program elements that are only covered by flaky tests, i.e., tests with non-deterministic behaviour, are also worthy of investment of additional testing effort. In this paper, we set out to quantify, characterize, and mitigate "flakily covered" program elements (i.e., those elements that are only covered by flaky tests). To that end, we perform an empirical study of three large software systems from the OpenStack community. In terms of quantification, we find that systems are disproportionately impacted by flakily covered statements with 5 and 10 percent of the covered statements in Nova and Neutron being flakily covered, respectively, while |
---|---|
ISSN: | 0098-5589 1939-3520 |
DOI: | 10.1109/TSE.2020.3010045 |