Conservation planning implications of modeling seagrass habitats with sparse absence data: a balanced random forest approach

Bibliographic details
Published in: Journal of coastal conservation 2022-06, Vol.26 (3), Article 22
Main authors: Aydin, Orhun; Osorio-Murillo, Carlos; Butler, Kevin A.; Wright, Dawn
Format: Article
Language: English
Description
Abstract: This paper presents a species distribution model (SDM) that quantifies relationships between environmental variables and habitat suitability from the unbalanced presence-absence data common in ecology. The proposed model applies a stratified sample-balancing scheme to the random forest classifier, so that every classification tree receives a balanced sample of presences and absences. The model is applied to seagrass habitats along the Australian coast, where seagrass populations have been in decline. Seagrass presence-absence data from the Australian Centre for Ecological Analysis and Synthesis (ACEAS) are used to train the model. Seagrasses are observed at 97.6% of the survey locations, and seagrass absence is recorded at only 2.4% of them. The proposed model's accuracy is validated against an independent dataset on seagrass presence from the Coastal and Marine Resources Information System (CAMRIS). The environmental variables used in the analysis are obtained from the Ecological Marine Units (EMU) dataset. Variables describing human-driven stressors on seagrass habitats due to ship traffic are obtained from the World Port Index. The proposed model predicts seagrass absence at a recall rate of 80%, whereas the recall rate of the random forest without balancing is 24%. The model's variable importance profile aligns with the main drivers of seagrass habitats reported in the literature. A case study quantifies the impacts of two proposed ports in the Gulf of Carpentaria on the local seagrass habitats. Results show that balancing improves the explanatory and predictive capabilities of an SDM in defining the conditions that result in a species' absence, aiding conservation planning with realistic species distributions.
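The balancing idea in the abstract, in which every classification tree receives a balanced sample of presences and absences, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the sampling details, so drawing with replacement down to the size of the rarer class is an assumption, and the function names `balanced_bootstrap` and `recall` are hypothetical. The toy labels are synthetic and only mimic the reported 97.6% / 2.4% split.

```python
import random
from collections import Counter


def balanced_bootstrap(labels, rng):
    """Return row indices for one tree, with equal counts per class.

    Assumption: each class is sampled with replacement, and the per-class
    sample size equals the size of the rarest class.
    """
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_minority = min(len(idx) for idx in by_class.values())
    sample = []
    for idx in by_class.values():
        sample.extend(rng.choices(idx, k=n_minority))
    return sample


def recall(y_true, y_pred, positive):
    """Fraction of true `positive` cases that were predicted as `positive`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    total = sum(1 for t in y_true if t == positive)
    return hits / total if total else float("nan")


# Synthetic labels: 1 = seagrass present, 0 = seagrass absent.
labels = [1] * 976 + [0] * 24
rng = random.Random(0)

# One balanced index sample per tree (five trees here, for illustration).
per_tree = [balanced_bootstrap(labels, rng) for _ in range(5)]
counts = [Counter(labels[i] for i in sample) for sample in per_tree]
```

Each entry of `per_tree` would be used to fit one tree, so no tree is dominated by presence records. Recall for the absence class, the metric quoted in the abstract, is then computed as `recall(y_true, y_pred, positive=0)` on held-out data.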
ISSN: 1400-0350, 1874-7841
DOI: 10.1007/s11852-022-00868-1