Learning complex spatial dynamics of wildlife diseases with machine learning-guided partial differential equations

Published online by Cambridge University Press:  08 May 2025

Juan Francisco Mandujano Reyes*
Affiliation:
Department of Statistics, University of Wisconsin–Madison, Madison, WI, USA
Gina Oh
Affiliation:
Department of Statistics, University of Wisconsin–Madison, Madison, WI, USA
Ian McGahan
Affiliation:
Department of Statistics, University of Wisconsin–Madison, Madison, WI, USA
Ting Fung Ma
Affiliation:
Department of Statistics, University of South Carolina, Columbia, SC, USA
Robin Russell
Affiliation:
Ecological Services Program, U.S. Fish and Wildlife Service, Fort Collins, CO, USA
Daniel P. Walsh
Affiliation:
U.S. Geological Survey, Montana Cooperative Wildlife Research Unit, Wildlife Biology Program, University of Montana, Missoula, MT, USA
Jun Zhu
Affiliation:
Department of Statistics, University of Wisconsin–Madison, Madison, WI, USA
*Corresponding author: Juan Francisco Mandujano Reyes; Email: mandujanorey@wisc.edu

Abstract

Emerging wildlife pathogens often display geographic variability due to landscape heterogeneity. Modeling approaches capable of learning complex, non-linear spatial dynamics of diseases are needed to rigorously assess and mitigate the effects of pathogens on wildlife health and biodiversity. We propose a novel machine learning (ML)-guided approach that leverages prior physical knowledge of ecological systems, using partial differential equations. We present our approach, taking advantage of the universal function approximation property of neural networks for flexible representation of the underlying dynamics of the geographic spread and growth of wildlife diseases. We demonstrate the benefits of our approach by comparing its forecasting power with commonly used methods and highlighting the obtained insights on disease dynamics. Additionally, we show the theoretical guarantees for the approximation error of our model. We illustrate the implementation of our ML-guided approach using data from white-nose syndrome (WNS) outbreaks in bat populations across the US. WNS is an infectious fungal disease responsible for significant declines in bat populations. Our results on WNS are useful for disease surveillance and bat conservation efforts. Our methods can be broadly used to assess the effects of environmental and anthropogenic drivers impacting wildlife health and biodiversity.

Information

Type
Methods Paper
Creative Commons
Creative Commons License: CC BY
The contribution by USGS authors, Daniel Walsh, is part of their official duties as U.S. government employees and constitutes a work of the United States government, which is in the public domain under Section 105 of the Copyright Act of 1976. Published by Cambridge University Press
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Open Practices
Open data
Copyright
© US Geological Survey (USGS) and the Author(s), 2025

Figure 1. White-nose syndrome (WNS) is an infectious fungal disease in bats caused by Pseudogymnoascus destructans. Presented here are the geographic locations where WNS samples were collected by the USGS WNS surveillance team between 2006 and 2016. There were 1,557 positive tests, represented by the colored dots, with lighter colors corresponding to more recently taken samples and darker colors corresponding to earlier samples. There were 18,515 negative samples, represented with small black dots.

Figure 2. Diagram of neural networks (NNs) informing the partial differential equation (PDE) to model pathogen spread and growth. The covariates, namely tree canopy cover (canopy), linear hydrography (waterways), topographic ruggedness index (TRI), number of coal mines (mines), and karst geomorphology (karst), are used as inputs. The NNs (red and blue boxes) take these covariates and predict the unknown spatially varying log-diffusion $ \left(\log \mu \right) $ and growth $ \left(\lambda \right) $ coefficients, which characterize the PDE. The solution of the PDE, given the predicted coefficients, is post-processed and compared with the observed data using a loss function. The loss guides the training process to refine the NNs’ predictions.
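
The pipeline in Figure 2 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the ecological diffusion form $ \partial u/\partial t = {\nabla}^2\left(\mu u\right) + \lambda u $, the one-hidden-layer networks, the explicit Euler solver, the $ 1 - {e}^{-u} $ post-processing, and the binary cross-entropy loss are all assumed here, and the covariates and detections are random placeholders. Only the forward pass is shown; training would additionally need gradients of the loss with respect to the network weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w1, b1, w2, b2):
    """One-hidden-layer network applied to every grid cell's covariate vector."""
    hidden = np.tanh(x @ w1 + b1)
    return (hidden @ w2 + b2)[..., 0]

def init_params(n_in, n_hidden):
    """Small random weights (hypothetical initialisation)."""
    return (0.5 * rng.normal(size=(n_in, n_hidden)), np.zeros(n_hidden),
            0.5 * rng.normal(size=(n_hidden, 1)), np.zeros(1))

def laplacian(f, dx):
    """Five-point Laplacian; edge padding gives a zero-gradient boundary for f."""
    p = np.pad(f, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * f) / dx**2

def solve_pde(u0, log_mu, lam, dx, dt, n_steps):
    """Explicit Euler steps of du/dt = Laplacian(mu * u) + lam * u (assumed form)."""
    mu = np.exp(log_mu)  # predicting log(mu) keeps the diffusion coefficient positive
    u = u0.copy()
    for _ in range(n_steps):
        u = u + dt * (laplacian(mu * u, dx) + lam * u)
    return u

def bce_loss(prob, y, eps=1e-7):
    """Binary cross-entropy between predicted probabilities and 0/1 detections."""
    prob = np.clip(prob, eps, 1.0 - eps)
    return -np.mean(y * np.log(prob) + (1.0 - y) * np.log(1.0 - prob))

# Placeholder inputs: five covariates in [0, 1] per cell and random 0/1 detections.
ny, nx, n_cov = 20, 30, 5
covariates = rng.uniform(size=(ny, nx, n_cov))
detections = rng.integers(0, 2, size=(ny, nx))

# Two separate networks, one per PDE coefficient.
log_mu = mlp(covariates, *init_params(n_cov, 8))
lam = mlp(covariates, *init_params(n_cov, 8))

# Point-source initial condition and a time step inside the explicit stability limit.
u0 = np.zeros((ny, nx))
u0[ny // 2, nx // 2] = 1.0
dx = 1.0
dt = 0.2 * dx**2 / np.exp(log_mu).max()

u = solve_pde(u0, log_mu, lam, dx, dt, n_steps=50)
prob = 1.0 - np.exp(-np.clip(u, 0.0, None))  # assumed post-processing to a probability
loss = bce_loss(prob, detections)
```

In this sketch the networks output $ \log \mu $ rather than $ \mu $, so the diffusion coefficient stays positive however the weights evolve, while $ \lambda $ is left unconstrained and can represent local decline as well as growth.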

Figure 3. Machine learning-guided partial differential equation approximations of the probability of presence of Pseudogymnoascus destructans (Pd) (upper) vs. observed data (lower) for April 2010 (left) and April 2020 (right). The observed data color represents time (earlier times have a darker color). We observe a heterogeneous diffusive behavior, which is important for differentiating the impact of Pd reaching different zones.

Table 1. Validation metrics on Pseudogymnoascus destructans (Pd) detection for our machine learning-guided partial differential equation (PDE) method (machine learning [ML]-guided) and the PDE with spatial additive linear effects model (linear) for the train and test datasets

Figure 4. Left: loss function value for test dataset in our machine learning-guided partial differential equation (PDE) method (machine learning [ML]-guided) vs. the PDE with spatial additive linear effects model (linear). Right: training computation time (in seconds) for ML-guided vs. linear model.

Figure 5. Maps with approximated log-diffusion coefficient $ \log \mu $ and growth coefficient $ \lambda $ from neural networks (upper) and linear function (lower). Color scales are different to allow easier visualization of details in each figure. Approximated diffusion and growth values are only interpretable within the boundaries of the continental USA. Predicted values are not interpretable for large water bodies (e.g., the ocean and the Great Lakes).

Figure 6. Functional relationships between each covariate and the log-diffusion coefficient $ \log \mu $ (upper) and growth coefficient $ \lambda $ (lower) from neural networks (NNs) (right) and a linear model (left). Each covariate is varied in turn while the remaining covariates are fixed at 0.5. Note the different y-axis ranges for the linear and NN models.
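
The one-at-a-time curves described in the Figure 6 caption can be produced as follows. This is a sketch only: `toy_log_mu` is a hypothetical stand-in for a fitted coefficient model, and the covariate order in `COVARIATES` is an assumption.

```python
import numpy as np

COVARIATES = ["canopy", "waterways", "TRI", "mines", "karst"]

def response_curve(predict, index, n_points=101, fixed_value=0.5):
    """Sweep one covariate over [0, 1] while holding the others at fixed_value."""
    grid = np.linspace(0.0, 1.0, n_points)
    x = np.full((n_points, len(COVARIATES)), fixed_value)
    x[:, index] = grid
    return grid, predict(x)

def toy_log_mu(x):
    """Hypothetical non-linear model; it ignores 'waterways' and 'mines'."""
    return np.sin(3.0 * x[:, 0]) + x[:, 4] ** 2 - 0.5 * x[:, 2]

curves = {name: response_curve(toy_log_mu, i) for i, name in enumerate(COVARIATES)}
```

Each entry of `curves` holds the swept covariate values and the corresponding predictions, ready to plot as one panel. Because the other covariates are pinned at 0.5, such curves show only one slice through the fitted function and do not reveal interactions at other covariate values.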

Figure 7. Bee swarm plots of SHapley Additive exPlanations (SHAP) values from 5,000 randomly sampled locations for log-diffusion (upper) and growth (lower) neural network models. SHAP values explain the covariate contributions to the predictions of each observation. Covariates are ranked from top to bottom by their mean absolute SHAP value (shown in parentheses beside the name). For each covariate, each location has a point distributed along the horizontal axis by its SHAP value. SHAP values with high density are represented by stacking the points vertically. Color represents covariate raw values.
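
With only five covariates, Shapley values can be computed exactly by enumerating all covariate subsets, which makes the ranking by mean absolute value in Figure 7 easy to illustrate. The sketch below is a brute-force computation, not the estimator used to produce the figure (the caption does not say which one was used); `toy_model`, the background sample, and the sampled locations are hypothetical placeholders.

```python
import itertools
import math
import numpy as np

COVARIATES = ["canopy", "waterways", "TRI", "mines", "karst"]

def exact_shapley(predict, x, background):
    """Exact Shapley values for one input x.

    Covariates outside a subset are taken from the background rows, and the
    prediction is averaged over those rows.
    """
    n = x.shape[0]

    def value(subset):
        idx = np.array(subset, dtype=int)
        z = background.copy()
        z[:, idx] = x[idx]
        return predict(z).mean()

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for subsets of size k that exclude covariate i.
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for s in itertools.combinations(others, k):
                phi[i] += weight * (value(s + (i,)) - value(s))
    return phi

rng = np.random.default_rng(1)
weights = np.array([1.0, -2.0, 0.5, 0.0, 3.0])

def toy_model(z):
    """Hypothetical linear model, chosen so the exact answer is known."""
    return z @ weights + 0.1

background = rng.uniform(size=(50, 5))
locations = rng.uniform(size=(20, 5))

shap_values = np.array([exact_shapley(toy_model, x, background) for x in locations])
mean_abs = np.abs(shap_values).mean(axis=0)
ranking = [COVARIATES[i] for i in np.argsort(-mean_abs)]  # most influential first
```

For a linear model the Shapley value of covariate $ i $ reduces to its weight times the difference between the covariate value and its background mean, which gives a convenient check on the enumeration. Enumeration scales as $ {2}^n $ in the number of covariates, so it is practical here only because $ n=5 $.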

Figure A1. Covariates used as explanatory variables for the coefficients in the ecological diffusion equation modeling the probability of the presence of Pseudogymnoascus destructans (Pd). We use the percentage of tree canopy cover (canopy), linear hydrography including streams/rivers, braided streams, canals, ditches, artificial paths, and aqueducts (waterways), topographic ruggedness index, which is related to the magnitude of elevation (TRI), number of coal mines per $ 10\ \mathrm{km}\times 10\ \mathrm{km} $ grid cell (mines), and percentage of karst geomorphology (karst). Covariates are transformed to lie in the interval $ \left[0,1\right] $ and were selected following recommendations from bat biologists on the Strategic Pd Surveillance Advisory Team.
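
The caption does not state which transformation maps the covariates to $ \left[0,1\right] $. One common choice, assumed here purely for illustration, is min-max scaling of each covariate layer; the example values are made up.

```python
import numpy as np

def min_max_scale(values):
    """Rescale an array linearly to [0, 1], ignoring NaNs when finding the range."""
    values = np.asarray(values, dtype=float)
    lo, hi = np.nanmin(values), np.nanmax(values)
    if hi == lo:
        # A constant layer carries no contrast; map it to zeros.
        return np.zeros_like(values)
    return (values - lo) / (hi - lo)

# Hypothetical canopy-cover percentages for three grid cells.
canopy_scaled = min_max_scale([0.0, 25.0, 100.0])
```

Putting all covariates on a common $ \left[0,1\right] $ scale keeps the network inputs comparable in magnitude, which is also what makes a shared reference value such as 0.5 meaningful across covariates.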

Figure A2. Approximated log-diffusion coefficient $ \log \mu $ and growth coefficient $ \lambda $ from neural networks for the Southeastern United States. Black arrows: (1) color pattern for high diffusion and low growth in the karst landscapes of the Appalachian Mountains; and (2) color pattern characterizing relatively high diffusion and low growth in northern Florida, across Georgia, and South Carolina. The region circled (arrow 2) along the Florida/Georgia border exhibits a slightly higher density of red pixels for $ \mu $ than in Florida, indicating a higher average diffusion rate in that region. Additionally, it exhibits a lower $ \lambda $, indicating a lower average growth rate than in Florida. We interpret this to mean that the region acts as a natural barrier to establishment.