Solving sampling bias problems in presence–absence or presence‐only species data using zero‐inflated models

Bibliographic details
Published in: Journal of biogeography 2022-01, Vol.49 (1), p.215-232
Main authors: Nolan, Victoria; Gilbert, Francis; Reader, Tom
Format: Article
Language: eng
Online access: Full text
Description
Summary:
Aim: Large databases of species records such as those generated through citizen science projects, archives or museum collections are being used with increasing frequency in species distribution modelling (SDM) for conservation and land management. Despite the broad spatial and temporal coverage of the data, its application is often limited by the issue of sampling bias and, consequently, zero inflation: there are more zeros (which are potentially ‘false absences’) in the data than expected. Here, we demonstrate how pooling species presence data into a ‘pseudo‐abundance’ count can allow identification and removal of sampling bias through the use of zero‐inflated (ZI) models, and thus solve a common SDM problem.
Location: All locations
Taxon: All taxa
Methods: We present the results of a series of simulations based on hypothetical ecological scenarios of data collection using random and non‐random sampling strategies. Our simulations assume that the locations of occurrence records are known at a high spatial resolution, but that the absence of occurrence records may reflect under‐sampling. To simulate pooling of presence–absence or presence‐only data, we count occurrence records at intermediate and coarse spatial resolutions, and use ZI models to predict the counts (species abundance per grid cell) from environmental layers.
Results: ZI models can successfully identify predictors of bias in species data and produce abundance prediction maps that are free from that bias. This phenomenon holds across multiple spatial scales, thereby presenting an advantage over presence‐only SDM methods such as binomial GLMs or MaxEnt, where information about species density is lost, and model performance declines at coarser scales.
Main Conclusions: Our results highlight the value of converting presence–absence or presence‐only species data to ‘pseudo‐abundance’ and using ZI models to address the problem of sampling bias. This method has huge potential for ecological researchers when using large species datasets for research and conservation.
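The pooling step and the zero‐inflated count model named in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `pool_to_counts`, `zip_pmf` and `zip_log_likelihood` are hypothetical, a zero‐inflated Poisson is assumed as one common form of ZI model, and the Poisson mean `lam` and the structural‐zero probability `pi` are held constant here, whereas the article predicts counts from environmental layers.

```python
import math
from collections import Counter


def pool_to_counts(points, cell_size, extent):
    """Count occurrence records per grid cell (a 'pseudo-abundance').

    points    -- iterable of (x, y) record coordinates
    cell_size -- side length of a square grid cell, in coordinate units
    extent    -- (n_cols, n_rows); cells without records get a zero,
                 records falling outside this grid are ignored
    """
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )
    n_cols, n_rows = extent
    return {(c, r): counts.get((c, r), 0)
            for c in range(n_cols) for r in range(n_rows)}


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson probability of observing count k.

    lam -- Poisson mean (assumed expected abundance in a sampled cell)
    pi  -- probability of a structural zero (assumed: cell never sampled)
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return (pi if k == 0 else 0.0) + (1.0 - pi) * poisson


def zip_log_likelihood(counts, lam, pi):
    """Log-likelihood of a list of cell counts under constant lam and pi."""
    return sum(math.log(zip_pmf(k, lam, pi)) for k in counts)


if __name__ == "__main__":
    # Hypothetical records: three in one cell, one in another, two cells empty.
    records = [(0.2, 0.3), (0.6, 0.9), (0.4, 0.1), (1.5, 1.5)]
    grid = pool_to_counts(records, cell_size=1.0, extent=(2, 2))
    print(grid)
    print(zip_log_likelihood(list(grid.values()), lam=2.0, pi=0.5))
```

Changing `cell_size` reproduces the idea of counting the same records at intermediate and coarse resolutions. In a full analysis both `lam` and `pi` would be modelled as functions of covariates, so that predictors of sampling bias can be separated from predictors of abundance.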
ISSN: 0305-0270, 1365-2699
DOI: 10.1111/jbi.14268