Floating Forests: Quantitative Validation of Citizen Science Data Generated From Consensus Classifications
Saved in:
Main authors: | , , , , , , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Abstract: | Large-scale research endeavors can be hindered by logistical constraints
that limit the amount of available data. For example, global ecological questions
require a global dataset, and traditional sampling protocols are often too
inefficient for a small research team to collect an adequate amount of data.
Citizen science offers an alternative by crowdsourcing data collection. Despite
its growing popularity, the scientific community has been slow to embrace
citizen science, largely due to concerns about the quality of data collected by
citizen scientists. Using the citizen science project Floating Forests
(http://floatingforests.org), we show that consensus classifications made by
citizen scientists produce data of comparable quality to expert-generated
classifications. Floating Forests is a web-based project in which citizen
scientists view satellite photographs of coastlines and trace the borders of
kelp patches. Since its launch in 2014, over 7,000 citizen scientists have
classified over 750,000 images of kelp forests, largely in California and
Tasmania. Each image is classified by 15 users. We generated consensus
classifications by overlaying all citizen classifications and assessed accuracy
by comparison with expert classifications. The Matthews correlation coefficient
(MCC) was calculated for each user threshold (1-15), and the threshold with the
highest MCC was considered optimal. We showed that the optimal user threshold
was 4.2, with an MCC of 0.400 (0.023 SE) for Landsats 5 and 7 and an MCC of
0.639 (0.246 SE) for Landsat 8. These results suggest that citizen science data
derived from consensus classifications are of comparable accuracy to expert
classifications. Citizen science projects should implement methods such as
consensus classification in conjunction with a quantitative comparison to
expert-generated classifications to address concerns about data quality. |
DOI: | 10.48550/arxiv.1801.08522 |
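
The abstract compresses the consensus method into a few sentences. The Python sketch below illustrates one plausible reading of it, under stated assumptions: citizen classifications arrive as per-user boolean kelp masks, the consensus mask at threshold t keeps the pixels marked by at least t users, and each consensus mask is scored against an expert-traced mask with the Matthews correlation coefficient. The names user_masks and expert_mask are hypothetical, not the project's published code, and the paper's non-integer optimal threshold (4.2) presumably reflects averaging per-image optima, which this per-image sketch leaves to the caller.

```python
# Minimal sketch (assumed, not the project's published code) of the
# consensus-threshold evaluation described in the abstract: overlay per-user
# binary kelp masks, form a consensus mask at each user-count threshold
# (1-15), score each against an expert mask with the Matthews correlation
# coefficient, and keep the threshold with the highest MCC.
import numpy as np


def matthews_cc(pred, truth):
    """MCC computed from the 2x2 confusion matrix of two boolean masks."""
    tp = float(np.sum(pred & truth))
    tn = float(np.sum(~pred & ~truth))
    fp = float(np.sum(pred & ~truth))
    fn = float(np.sum(~pred & truth))
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0.0 else (tp * tn - fp * fn) / denom


def best_consensus_threshold(user_masks, expert_mask):
    """user_masks: (n_users, H, W) boolean array, one kelp mask per citizen.
    expert_mask: (H, W) boolean array traced by an expert.
    Returns the optimal threshold and the MCC at every threshold."""
    votes = user_masks.sum(axis=0)            # per-pixel count of "kelp" votes
    mcc_by_threshold = {}
    for t in range(1, user_masks.shape[0] + 1):
        consensus = votes >= t                # kelp if at least t users agree
        mcc_by_threshold[t] = matthews_cc(consensus, expert_mask)
    optimal = max(mcc_by_threshold, key=mcc_by_threshold.get)
    return optimal, mcc_by_threshold
```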