DatAFLow: Toward a Data-Flow-Guided Fuzzer

Coverage-guided greybox fuzzers rely on control-flow coverage feedback to explore a target program and uncover bugs. Compared to control-flow coverage, data-flow coverage offers a more fine-grained approximation of program behavior. Data-flow coverage captures behaviors not visible as control flow a...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	ACM transactions on software engineering and methodology 2023-09, Vol.32 (5), p.1-31, Article 132
Hauptverfasser:	Herrera, Adrian, Payer, Mathias, Hosking, Antony L.
Format:	Artikel
Sprache:	eng
Schlagworte:	Empirical software validation Software and its engineering Software maintenance tools Software testing and debugging
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Coverage-guided greybox fuzzers rely on control-flow coverage feedback to explore a target program and uncover bugs. Compared to control-flow coverage, data-flow coverage offers a more fine-grained approximation of program behavior. Data-flow coverage captures behaviors not visible as control flow and should intuitively discover more (or different) bugs. Despite this advantage, fuzzers guided by data-flow coverage have received relatively little attention, appearing mainly in combination with heavyweight program analyses (e.g., taint analysis, symbolic execution). Unfortunately, these more accurate analyses incur a high run-time penalty, impeding fuzzer throughput. Lightweight data-flow alternatives to control-flow fuzzing remain unexplored.We present datAFLow, a greybox fuzzer guided by lightweight data-flow profiling. We also establish a framework for reasoning about data-flow coverage, allowing the computational cost of exploration to be balanced with precision. Using this framework, we extensively evaluate datAFLow across different precisions, comparing it against state-of-the-art fuzzers guided by control flow, taint analysis, and data flow. Our results suggest that the ubiquity of control-flow-guided fuzzers is well-founded. The high run-time costs of data-flow-guided fuzzing (~10 × higher than control-flow-guided fuzzing) significantly reduces fuzzer iteration rates, adversely affecting bug discovery and coverage expansion. Despite this, datAFLow uncovered bugs that state-of-the-art control-flow-guided fuzzers (notably, AFL++) failed to find. This was because data-flow coverage revealed states in the target not visible under control-flow coverage. Thus, we encourage the community to continue exploring lightweight data-flow profiling; specifically, to lower run-time costs and to combine this profiling with control-flow coverage to maximize bug-finding potential.
ISSN:	1049-331X 1557-7392
DOI:	10.1145/3587156