# A Generalized Linear Mixed Model Approach to Assess Emerald Ash Borer Diffusion


## Abstract


## 1. Introduction

## 2. Materials and Methods

#### 2.1. Data

#### 2.1.1. Species Data

#### 2.1.2. Risk Predictors

#### 2.1.3. Spatial Autocorrelation among the Risk Predictors

Let the observations $y_1, y_2, \ldots, y_n$ have spatial correlations with mean $\mu$. Moran’s I statistic is given by Equation (1),

$$
I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(y_i - \mu)(y_j - \mu)}{\sum_{i=1}^{n} (y_i - \mu)^2} \tag{1}
$$

where $w_{ij}$ denotes the spatial weight, which can be obtained based on the Euclidean distance between the $i$th and $j$th observations. Moran’s I values were calculated for the 22 counties in Ontario with known presence points, and the results are shown in Table 3. Spatial autocorrelation is statistically significant in half of the 22 counties (p < 0.05) and is relatively high in Algoma, Hamilton, Lambton, and Toronto. The overall Moran’s I for the EAB data was estimated to be approximately 0.109, with a highly significant p-value close to 0. Thus, the overall spatial autocorrelation among the sampled points was statistically significant, and the correlation may be higher than average within some counties.
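A minimal sketch of how a global Moran’s I could be computed for a set of sample points is given below. It assumes inverse-distance weights ($w_{ij} = 1/d_{ij}$, $w_{ii} = 0$), which is only one way of deriving weights from Euclidean distance and is not confirmed as the choice made in this study; the function name `morans_i` and the toy data are hypothetical.

```python
import math


def morans_i(values, coords):
    """Global Moran's I with inverse-distance weights (an assumed weighting)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    weighted_cross = 0.0  # sum over i != j of w_ij * dev_i * dev_j
    weight_total = 0.0    # sum over i != j of w_ij
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.dist(coords[i], coords[j])
            if d == 0.0:
                continue  # coincident points get zero weight in this sketch
            w = 1.0 / d
            weighted_cross += w * dev[i] * dev[j]
            weight_total += w

    sum_sq = sum(x * x for x in dev)
    return (n / weight_total) * (weighted_cross / sum_sq)


# Toy data: six points on a line, coded like presence/absence values.
coords = [(float(x), 0.0) for x in range(6)]
clustered = morans_i([1, 1, 1, 5, 5, 5], coords)    # similar values are neighbours
alternating = morans_i([1, 5, 1, 5, 1, 5], coords)  # dissimilar values are neighbours
print(clustered, alternating)
```

Clustered values give a positive statistic and alternating values a negative one, which is the sense in which the positive county-level values in Table 3 indicate spatial clustering.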

#### 2.2. Methodology

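Several criteria, such as the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), the adjusted R-squared (Adj R^{2}), and the Cp statistic, can be used to compare different candidate models. These criteria lead to similar model selection results in most cases, but the BIC penalizes additional parameters more heavily and is therefore the stricter guard against overfitting in large samples.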
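The stricter behaviour of the BIC follows from the standard definitions of the two information criteria, written out here for reference. The symbols are not taken from the original text: $\hat{L}$ is the maximized likelihood of a candidate model, $k$ its number of estimated parameters, and $n$ the sample size.

$$
\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad \mathrm{BIC} = k\ln n - 2\ln\hat{L}
$$

Each extra parameter costs 2 under the AIC and $\ln n$ under the BIC, so the BIC penalty is the larger one whenever $n > e^{2} \approx 7.4$. For a sample on the order of the 11,479 points in Table 1, $\ln n \approx 9.3$.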

## 3. Results

## 4. Discussion and Conclusions

## Author Contributions

## Acknowledgments

## Conflicts of Interest


**Figure 1.** The Emerald Ash Borer (EAB) distribution in Southern Ontario, Canada from 2006 to 2012. Green points are the EAB absence points and red points are the EAB presence points.

**Figure 2.** 3D plot of the data from three distinct counties and cities: KA: Kawartha Lakes (green dots); LE: Lennox and Addington (orange dots); WT: Waterloo (blue dots) for three predicting variables, wind speed (x-axis), nearest EAB (y-axis) and distance to forest facilities (z-axis).

**Figure 3.** Study area partitioning. Sample locations are grouped into 36 regions to capture the random effects in the second model. Please note that there were no data in Regions R2, R3, R4, R5, R6, R30, R31, R35, and R36 and these regions were excluded from the model.

**Figure 4.** Cross-validation results for the validation data based on the model selection process shown in Table 5. Nine steps from the model with one predictor (average wind speed in June) to the proposed GLMMs with seven predictors. Solid lines are for true negative and dashed lines for true positive. Cyan and red lines represent the random effects based on county and region, respectively.

**Figure 5.** Sampled points in Southern Ontario in 2013. Green points are the EAB absence points and red points are the EAB presence points. Please note that due to the close proximity of many points, it seems that there are fewer presence points visible than there actually are.

**Figure 6.** Projection of the risk map in southern Ontario. Predicted probabilities are based on the GLMM with regional random effect (Model 2 in Table 6).

**Figure 7.** The prediction rate under different spatial units, where the green, red, and blue represent overall, true negative and true positive, respectively.

**Table 1.** Samples collected from field surveys from 2006 to 2012 with 250 presence points and 11,229 absence points.

| Year | Presence (%) | Absence (%) | Total |
|---|---|---|---|
| 2006 | 58 (0.88%) | 6531 (99.12%) | 6589 |
| 2007 | 69 (5.53%) | 1177 (94.46%) | 1246 |
| 2008 | 90 (9.03%) | 906 (90.97%) | 996 |
| 2009 | 16 (2.11%) | 744 (97.89%) | 760 |
| 2010 | 11 (1.35%) | 800 (98.65%) | 811 |
| 2011 | 1 (0.25%) | 392 (99.75%) | 393 |
| 2012 | 5 (0.73%) | 679 (99.27%) | 684 |

| Predictor Title | Unit | Average | Data Range | Data Format |
|---|---|---|---|---|
| **Climatic factors** | | | | |
| Precipitation | mm | 84.3 | (61.0, 109.0) | Raster (TIF) |
| Solar radiation | kJ/m^{2}/day | 21,313 | (20,613, 21,822) | |
| June wind speed | m/s | 3.933 | (2.500, 5.430) | NAD83 |
| Land surface temperature | Kelvin | 338.6 | (321.2, 345.9) | Raster (TIF) |
| **Geographic factors** | | | | |
| Elevation | m | 222.40 | (41.64, 524.71) | Raster |
| Slope | deg | 1.335 | (0, 21.025) | |
| Aspect | deg | 188.013 | (0.104, 359.946) | |
| **Biotic factors** | | | | |
| Normalized difference vegetation index | N/A | 0.634 | (−0.619, 0.965) | Raster (TIF) |
| Nearest EAB positive location from previous years | m | 56,175 | (0, 596,180) | |
| **Anthropogenic factors, distance to** | | | | |
| Population centers | m | 25,226 | (0, 204,697) | Vector (points) |
| Sea ports | m | 38,441 | (241, 212,511) | Coordinates |
| Forest processing facilities | m | 23,069 | (60, 85,454) | Vector (points) |
| Highways | m | 14,071 | (0, 44,905) | |
| Campgrounds | m | 27,196 | (30, 104,972) | |

**Table 3.** The estimated spatial autocorrelation of samples (presence and absence points) in 22 counties based on Moran’s I statistics.

| County Name | Samples | Moran’s I | Std Dev | p-Value |
|---|---|---|---|---|
| ALGOMA | 125 | 0.463 | 0.032 | 0.000 |
| BRANT | 87 | −0.012 | 0.003 | 0.980 |
| BRUCE | 245 | 0.011 | 0.015 | 0.326 |
| CHATHAM-KENT | 1838 | 0.064 | 0.002 | 0.000 |
| DURHAM | 36 | −0.029 | 0.010 | 0.932 |
| FRONTENAC | 108 | −0.009 | 0.003 | 0.960 |
| HALTON | 64 | 0.186 | 0.037 | 0.000 |
| HAMILTON | 52 | 0.582 | 0.068 | 0.000 |
| HURON | 549 | 0.012 | 0.010 | 0.150 |
| LAMBTON | 231 | 0.384 | 0.016 | 0.000 |
| MIDDLESEX | 1841 | 0.097 | 0.002 | 0.000 |
| NIAGARA | 89 | −0.011 | 0.045 | 0.991 |
| NORFOLK | 356 | 0.286 | 0.015 | 0.000 |
| OTTAWA | 190 | 0.185 | 0.014 | 0.000 |
| OXFORD | 265 | −0.004 | 0.012 | 0.966 |
| PEEL | 58 | 0.149 | 0.027 | 0.000 |
| PERTH | 124 | −0.005 | 0.002 | 0.152 |
| PRESCOTT AND RUSSELL | 66 | −0.016 | 0.005 | 0.916 |
| TORONTO | 48 | 0.243 | 0.041 | 0.000 |
| WATERLOO | 142 | −0.012 | 0.013 | 0.738 |
| WELLINGTON | 109 | 0.217 | 0.031 | 0.000 |
| YORK | 50 | −0.014 | 0.006 | 0.314 |

**Table 4.** The significance level and the goodness of fit by each predictor variable based on the univariate GLMM (Generalized Linear Mixed Model) with the regional random effects.

| Predictor Variables | Effect | p-Value | Deviance |
|---|---|---|---|
| Time | 0.057 | 0.0065 | 2214 |
| **Climatic factors** | | | |
| Precipitation | −0.591 | 1.26e-12 | 2169 |
| Solar radiation | −28.504 | 2.62e-23 | 2115 |
| June wind speed | −3.848 | 1.60e-81 | 1375 |
| Land surface temperature | 11.950 | 2.06e-19 | 2130 |
| **Geographic factors** | | | |
| Elevation | −0.520 | 1.31e-12 | 2168 |
| Slope | −0.129 | 2.78e-06 | 2193 |
| Aspect | 0.039 | 0.1446 | 2219 |
| **Biotic factors** | | | |
| Normalized difference vegetation index | −0.091 | 2.33e-11 | 2178 |
| Nearest EAB positive location from previous years | −0.010 | 0.0174 | 2215 |
| **Anthropogenic factors** | | | |
| Population centers | −0.174 | 3.44e-15 | 2149 |
| Sea ports | 0.040 | 0.0078 | 2214 |
| Forest processing facilities | 0.284 | 6.38e-40 | 2022 |
| Highways | −0.171 | 6.12e-08 | 2190 |
| Campgrounds | −0.227 | 1.23e-19 | 2128 |

**Table 5.** The stepwise model selection process. In each step, the predictor with the smallest p-value was added to the model. If the estimated BIC (Bayesian Information Criterion) value decreased, the predictor was kept in the model and the remaining variables were tested in the following steps. If the estimated BIC value increased when the additional variable was included, the predictor was dropped.

Selected Models | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|

Predictor Variables | I | II | III | IV | V | VI | VII | … | ||

Time | ✓ | |||||||||

Precipitation | ✓ | |||||||||

Solar radiation | ✓ | ✓ | ✓ | ✓ | ✓ | |||||

June wind speed | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||

Land surface temperature | ✓ | ✓ | ✓ | |||||||

Elevation | ||||||||||

Slope | ||||||||||

Aspects | ||||||||||

NDVI | ||||||||||

Nearest EAB positive location from previous years | ||||||||||

Population centers | ✓ | ✓ | ✓ | |||||||

Sea ports | ✓ | |||||||||

Forest processing facilities | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||||

Highways | ||||||||||

Campgrounds | ✓ | |||||||||

| Model | Criterion | I | II | III | IV | V | VI | VII | … | |
|---|---|---|---|---|---|---|---|---|---|---|
| Model with County | BIC | 1975 | 1158 | 1158 | 1156 | 1163 | 1135 | 1144 | … | 1065 |
| | AIC | 1960 | 1136 | 1129 | 1119 | 1119 | 1083 | 1085 | … | 1000 |
| | Adj R^{2} | − | 0.424 | 0.429 | 0.436 | 0.437 | 0.457 | 0.457 | … | 0.503 |
| | Cp | 1959 | 1133 | 1124 | 1113 | 1113 | 1077 | 1078 | … | 990 |
| Model with Region | BIC | 2259 | 1443 | 1404 | 1384 | 1391 | 1365 | 1369 | … | 1300 |
| | AIC | 2244 | 1420 | 1374 | 1348 | 1347 | 1313 | 1311 | … | 1234 |
| | Adj R^{2} | − | 0.370 | 0.392 | 0.405 | 0.407 | 0.423 | 0.426 | … | 0.456 |
| | Cp | 2247 | 1418 | 1371 | 1342 | 1342 | 1306 | 1302 | … | 1233 |
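The selection rule described in the caption of Table 5 can be sketched as a simple loop. This is an illustration only, not the authors' implementation: it assumes the candidates are tried once each in a fixed order of ascending p-value, and it takes the model-fitting step as a caller-supplied function. The names `stepwise_by_bic` and `bic_of`, the predictor labels, and the BIC numbers in the toy lookup are all hypothetical.

```python
def stepwise_by_bic(candidates, bic_of):
    """Forward selection: keep a candidate only if it lowers the BIC.

    candidates: predictor names, ordered by ascending p-value (assumed).
    bic_of:     callable mapping a tuple of predictor names to a BIC value,
                e.g. by fitting a GLMM with those fixed effects.
    """
    selected = []
    best_bic = bic_of(tuple(selected))
    history = []  # (candidate, BIC of trial model, kept?)
    for name in candidates:
        trial = selected + [name]
        trial_bic = bic_of(tuple(trial))
        kept = trial_bic < best_bic
        history.append((name, trial_bic, kept))
        if kept:
            selected, best_bic = trial, trial_bic
    return selected, best_bic, history


# Toy stand-in for model fitting: made-up BIC values for made-up predictors.
toy_bic = {
    (): 100.0,
    ("a",): 80.0,
    ("a", "b"): 85.0,  # BIC goes up, so "b" is dropped
    ("a", "c"): 70.0,  # BIC goes down, so "c" is kept
}
selected, best_bic, history = stepwise_by_bic(["a", "b", "c"], toy_bic.__getitem__)
print(selected, best_bic)
```

Passing the fitting step in as a function keeps the loop independent of the modelling library, so the same rule could be applied to either the county or the region random-effect structure.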

**Table 6.** Estimation results for fixed effects and random effects by the generalized linear mixed model with the logistic link function. Model 1 stands for the GLMM with county random effect; model 2 stands for the GLMM with regional random effect.

| Predictor Variables | Model 1 | (Std Error) | Model 2 | (Std Error) |
|---|---|---|---|---|
| **Fixed Effects** | | | | |
| Time | 0.7989 | (0.14) | 0.3330 | (0.09) |
| June wind speed | −10.3668 | (0.62) | −8.1574 | (0.45) |
| Land Surface Temperature | −4.4333 | (6.24) | −27.5184 | (4.55) |
| Solar Radiation | −36.9254 | (13.2) | −51.7792 | (10.7) |
| Distance to Forest Processing Facilities | −0.2170 | (0.13) | 0.3281 | (0.07) |
| Distance to Ports | 0.8553 | (0.12) | 0.5067 | (0.06) |
| Population Centers | −0.8976 | (0.12) | −0.3537 | (0.07) |
| **Random Effects** | | | | |
| Type | County | | Region | |
| Variance | 42.19 | | 42.42 | |
| Standard deviation | 6.495 | | 6.513 | |
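Because the coefficients in Table 6 are on the logit scale, a predicted presence probability is the inverse logit of the fixed-effect linear predictor plus the random intercept of the county or region. The sketch below illustrates that relationship under this standard random-intercept reading; the function names are hypothetical, and since the intercept and predictor scaling are not given here, the fixed-effect value of 0 is a placeholder and not a fitted value.

```python
import math


def inverse_logit(eta):
    """Numerically stable logistic function."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def presence_probability(fixed_part, random_intercept):
    """P(presence) for one location: fixed effects plus group-level intercept."""
    return inverse_logit(fixed_part + random_intercept)


# Random-effect standard deviations implied by the variances in Table 6.
sd_county = math.sqrt(42.19)
sd_region = math.sqrt(42.42)

# Placeholder fixed part of 0 (probability 0.5 for an "average" group).
baseline = presence_probability(0.0, 0.0)
one_sd_above = presence_probability(0.0, sd_region)
one_sd_below = presence_probability(0.0, -sd_region)
print(sd_county, sd_region, baseline, one_sd_above, one_sd_below)
```

With a random-effect standard deviation of about 6.5 on the logit scale, a group one standard deviation above or below the average moves the predicted probability close to 1 or 0, which shows how much of the variation is carried by the spatial grouping.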

**Table 7.** Classification accuracy for the validation and testing datasets by the GLM (Generalized Linear Model) and GLMM, where TNR: true negative rate; TPR: true positive rate.

Testing Data from 2013

| Model | Random effect | TNR | TPR | Overall |
|---|---|---|---|---|
| GLM | N/A | 99.54% | 54.55% | 77.04% |
| GLMM | County | 98.97% | 63.64% | 81.30% |
| GLMM | Region | 98.63% | 95.45% | 97.04% |
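The Overall column in Table 7 appears to be the unweighted mean of TNR and TPR (often called balanced accuracy). This reading is inferred from the tabulated numbers and is not stated in the caption. The sketch below computes the three rates from confusion-matrix counts and checks the inferred relationship against the table; the function name `rates` and the example counts are hypothetical.

```python
def rates(tp, fn, tn, fp):
    """Return (TNR, TPR, mean of the two) from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # share of presence points predicted as presence
    tnr = tn / (tn + fp)  # share of absence points predicted as absence
    return tnr, tpr, (tnr + tpr) / 2


# Made-up counts, only to show the call.
example_tnr, example_tpr, example_overall = rates(tp=21, fn=1, tn=90, fp=10)

# Values copied from Table 7: (TNR %, TPR %, Overall %).
table7 = {
    "GLM": (99.54, 54.55, 77.04),
    "GLMM, county": (98.97, 63.64, 81.30),
    "GLMM, region": (98.63, 95.45, 97.04),
}
# Difference between the mean of TNR and TPR and the reported Overall value.
gaps = {name: abs((tnr + tpr) / 2 - overall)
        for name, (tnr, tpr, overall) in table7.items()}
print(gaps)
```

Averaging the two rates weights presence and absence points equally, which matters here because absence points greatly outnumber presence points in the survey data.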

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Zhong, Y.; Hu, B.; Hall, G.B.; Hoque, F.; Xu, W.; Gao, X. A Generalized Linear Mixed Model Approach to Assess Emerald Ash Borer Diffusion. *ISPRS Int. J. Geo-Inf.* **2020**, *9*, 414.
https://doi.org/10.3390/ijgi9070414
