RUGRAT: Evaluating program analysis and testing tools and compilers with large generated random benchmark applications

Summary Benchmarks are heavily used in different areas of computer science to evaluate algorithms and tools. In program analysis and testing, open‐source and commercial programs are routinely used as benchmarks to evaluate different aspects of algorithms and tools. Unfortunately, many of these progr...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Software, practice & experience practice & experience, 2016-03, Vol.46 (3), p.405-431
Hauptverfasser:	Hussain, Ishtiaque, Csallner, Christoph, Grechanik, Mark, Xie, Qing, Park, Sangmin, Taneja, Kunal, Mainul Hossain, B. M.
Format:	Artikel
Sprache:	eng
Schlagworte:	benchmark application generator benchmark applications stochastic parse tree
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Summary Benchmarks are heavily used in different areas of computer science to evaluate algorithms and tools. In program analysis and testing, open‐source and commercial programs are routinely used as benchmarks to evaluate different aspects of algorithms and tools. Unfortunately, many of these programs are written by programmers who introduce different biases, not to mention that it is very difficult to find programs that can serve as benchmarks with high reproducibility of results. We propose a novel approach for generating random benchmarks for evaluating program analysis and testing tools and compilers. Our approach uses stochastic parse trees, where language grammar production rules are assigned probabilities that specify the frequencies with which instantiations of these rules will appear in the generated programs. We implemented our tool for Java and applied it to generate a set of large benchmark programs of up to 5M lines of code each with which we evaluated different program analysis and testing tools and compilers. The generated benchmarks let us independently rediscover several issues in the evaluated tools. Copyright © 2014 John Wiley & Sons, Ltd.
ISSN:	0038-0644 1097-024X
DOI:	10.1002/spe.2290