An evaluation of stringent filtering to improve species distribution models from citizen science data

Bibliographic details
Published in:	Diversity & distributions 2019-12, Vol.25 (12), p.1857-1869
Main authors:	Steen, Valerie A., Elphick, Chris S., Tingley, Morgan W.
Format:	Article
Language:	English
Subjects:	BIODIVERSITY RESEARCH; Birds; citizen science; Construction; Culling; Data collection; data filtering; Datasets; Design of experiments; eBird; Experimental design; Filtration; Geographical distribution; model evaluation; Noise prediction; observer expertise; occurrence data; Outliers (statistics); Polls & surveys; Predictions; Science; Spatial data; Species; species distribution models; Statistical methods; Studies; survey effort

Description
Abstract:	Aim: Citizen science data are increasingly used for modelling species distributions because they offer broad spatiotemporal coverage of local observations. However, such data are often collected without experimental design or set survey methods, raising the risk that bias and noise will compromise modelled predictions. We tested the ability of species distribution models (SDMs) built from these low-structure citizen science data to match the quality of SDMs from systematically collected data and tested whether stringent data filtering improved predictions. Location: Northeastern USA. Methods: We evaluated models built from eBird, a rapidly growing dataset of avian occurrences reported by birders, against models built from four independent, systematically collected datasets. We developed SDMs for 96 species using both data sources and compared their predictive abilities. We also tested whether culling eBird data by applying stringent data filters on survey effort or observer expertise improved predictions. Results: We found that SDMs built from low-structure citizen science data matched or exceeded performance of SDMs from systematically collected datasets for 12%–31% of species (x̄ = 22%), depending on the dataset. At least one culling option produced equivalent or better performance for 40%–70% of species (x̄ = 49%). Data culling by restricting survey effort improved predictions more than restricting by observer expertise. The optimal effort restriction differed by dataset, and for three of the datasets was further informed by species traits. Main conclusions: Species distribution models developed using low-structure citizen science data sometimes performed as well as those from systematic data. Culling generally improved models, but results were heterogeneous, prohibiting clear recommendations for how to cull. Our results indicate that the growing availability of citizen science data holds potential for creating high-quality spatial predictions, but that time should be invested in determining how best to cull datasets and that one-size-fits-all solutions beyond basic outlier filtering may be hard to find.
ISSN:	1366-9516 1472-4642
DOI:	10.1111/ddi.12985