Using machine learning to identify nontraditional spatial dependence in occupancy data
Occupancy data are spatially referenced, contaminated binary responses used to understand spatial variability in the presence or absence of a species. Spatial models for occupancy data are used to estimate and map the true presence of a species, which may depend on biotic and abiotic factors as well as spatial autocorrelation. Traditionally, researchers have accounted for spatial autocorrelation in occupancy data by using a correlated, normally distributed site-level random effect, which might be incapable of identifying nontraditional spatial dependence such as discontinuities and abrupt transitions. Machine learning approaches have the potential to identify and model nontraditional spatial dependence, but these approaches do not account for contamination in the binary response. By combining the flexibility of Bayesian hierarchical modeling and machine learning approaches, we present a general framework to model occupancy data that accounts for both traditional and nontraditional spatial dependence and for contamination in the binary responses. We demonstrate our approach using synthetic data containing traditional and nontraditional spatial dependence and using data on Thomson's gazelle in Tanzania.
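To make "contaminated binary responses" and "abrupt transitions" concrete, the following is a minimal simulation sketch, not the authors' model or data. It assumes that contamination means false negatives only (a present species can go undetected, but an absent one is never recorded), and it uses a hypothetical one-dimensional transect whose occupancy probability drops abruptly at its midpoint. The function name `simulate_occupancy` and all parameter values are illustrative assumptions.

```python
import random


def simulate_occupancy(n_sites=200, n_visits=4, p_detect=0.5, seed=0):
    """Simulate detection histories along a 1-D transect (illustrative only)."""
    rng = random.Random(seed)
    sites = []
    for i in range(n_sites):
        x = i / (n_sites - 1)  # site location scaled to [0, 1]
        # Assumed nontraditional dependence: an abrupt transition at x = 0.5.
        psi = 0.8 if x < 0.5 else 0.2
        # Latent true presence (1) or absence (0) at this site.
        z = 1 if rng.random() < psi else 0
        # Observed responses: a detection is possible only if the species is
        # present, so every error is a false negative (assumption).
        y = [1 if (z == 1 and rng.random() < p_detect) else 0
             for _ in range(n_visits)]
        sites.append({"x": x, "psi": psi, "z": z, "y": y})
    return sites


sites = simulate_occupancy()
# Naive occupancy: share of sites with at least one detection.
naive_rate = sum(1 for s in sites if any(s["y"])) / len(sites)
# True occupancy: share of sites where the species is actually present.
true_rate = sum(s["z"] for s in sites) / len(sites)
print(f"naive = {naive_rate:.3f}, true = {true_rate:.3f}")
```

Under this assumption the naive rate can never exceed the true rate, which is why treating the raw detections as presence or absence understates occupancy and motivates a model that separates detection from true presence.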