Exploring Microtask Crowdsourcing as a Means of Fault Localization

Microtask crowdsourcing is the practice of breaking down an overarching task to be performed into numerous, small, and quick microtasks that are distributed to an unknown, large set of workers. Microtask crowdsourcing has shown potential in other disciplines, but with only a handful of approaches ex...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Adriano, Christian Medeiros, van der Hoek, Andre
Format:	Artikel
Sprache:	eng
Schlagworte:	Computer Science - Software Engineering
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Microtask crowdsourcing is the practice of breaking down an overarching task to be performed into numerous, small, and quick microtasks that are distributed to an unknown, large set of workers. Microtask crowdsourcing has shown potential in other disciplines, but with only a handful of approaches explored to date in software engineering, its potential in our field remains unclear. In this paper, we explore how microtask crowdsourcing might serve as a means of fault localization. We particularly take a first step in assessing whether a crowd of workers can correctly locate known faults in a few lines of code (code fragments) taken from different open source projects. Through Mechanical Turk, we collected the answers of hundreds of workers to a pre-determined set of template questions applied to the code fragments, with a replication factor of twenty answers per question. Our findings show that a crowd can correctly distinguish questions that cover lines of code that contain a fault from those that do not. We also show that various filters can be applied to identify the most effective subcrowds. Our findings also presented serious limitations in terms of the proportion of lines of code selected for inspection and the cost to collect answers. We describe the design of our experiment, discuss the results, and provide an extensive analysis of different filters and their effects in terms of speed, cost, and effectiveness. We conclude with a discussion of limitations and possible future experiments toward more full-fledged fault localization on a large scale involving more complex faults.
DOI:	10.48550/arxiv.1612.03015