Article

Modeling Spongy Moth Forest Mortality in Rhode Island Temperate Deciduous Forest

by Liubov Dumarevskaya and Jason R. Parent *
Department of Natural Resources Science, University of Rhode Island, Kingston, RI 02881, USA
* Author to whom correspondence should be addressed.
Forests 2025, 16(1), 93; https://doi.org/10.3390/f16010093
Submission received: 28 November 2024 / Revised: 19 December 2024 / Accepted: 6 January 2025 / Published: 8 January 2025

Abstract

Invasive pests cause major ecological and economic damage to forests around the world, including reduced carbon sequestration and biodiversity and loss of forest revenue. In this study, we used Random Forest to model forest mortality resulting from a 2015–2017 Spongy moth outbreak in the temperate deciduous forests of Rhode Island (northeastern U.S.). Mortality was modeled with a 100 m spatial resolution based on Landsat-derived defoliation maps and geospatial data representing soil characteristics, drought condition, and forest characteristics as well as proximity to coast, development, and water. Random Forest was used to model forest mortality with two classes (low/high) and three classes (low/med/high). The best models had overall accuracies of 82% and 65% for the two-class and three-class models, respectively. The most important predictors of forest mortality were defoliation, distance to coast, and canopy cover. Model performance improved only slightly with the inclusion of more than three variables. The models classified 35% of Rhode Island forests as having canopy mortality ≥5 trees/ha and 21% as having mortality >11 trees/ha. The study shows the benefit of Random Forest models that use both defoliation maps and geospatial environmental data for classifying forest mortality caused by Spongy moth.

1. Introduction

Invasive pests have far-reaching ecological and economic impacts, affecting global forest communities and carbon storage [1]. Globally, Ref. [2] found that forests affected by pest invasion sequester 69% less carbon, on average, compared to unaffected forests. In the U.S. alone, forest pest activity results in annual biomass losses of 5.5 TgC [3], which is roughly equivalent to 11,800 hectares of temperate forest land (based on [4]). One important forest pest of the U.S. is Spongy moth (Lymantria dispar dispar), which can cause severe tree defoliation that leads to mortality. From 1994 to 2010, Spongy moth caused an estimated 898 TgC of biomass loss in the U.S. [3], which is equivalent to roughly 1.9 M hectares of temperate forest [4]. In a single U.S. city (i.e., Baltimore, MD), the damages caused by Spongy moth were estimated to range from USD 5.5 to 63.7 M per year of an outbreak, depending on environmental and management scenarios [5].
Remote sensing is a key tool for monitoring pest-related damages across broad areas [6] and providing guidance for mitigation efforts [7,8,9]. For some pests, defoliation is a sign of tree mortality and many studies have used satellite, aerial, or drone imagery to map mortality directly [10,11,12,13,14,15]. However, for leaf-eating insects such as the Spongy moth, defoliation is a stressor that deciduous trees can survive [16,17,18]. The chances of mortality increase when the defoliation is repeated [19] or coincides with other environmental stressors such as drought or other pests [17]. The inclusion of data on both defoliation and environmental stressors has been shown to significantly improve models of forest mortality caused by certain pests or diseases [20,21].
Many environmental factors can aggravate stress caused by pest-related defoliation and increase tree mortality. Ref. [22] found that soil characteristics and topographic relief together accounted for 20% of the mortality variation in tropical forests. They also found that tree mortality was higher on steep slopes and on sandy soils in valleys, while mortality was lower on plateaus with clay soils. Ref. [23] found correlations between multiple years of drought and tree mortality. Canopy cover has been found to be an important factor in mortality prediction, although the effect was not consistent across different forest types and environments [20,24,25,26]. Other factors such as proximity to roads and urban areas can affect overall tree health [27,28,29,30,31]. Ref. [17] showed that unhealthy trees have a substantially lower chance of surviving Spongy moth outbreaks.
Some previous studies have used defoliation, mapped from moderate-resolution satellite imagery (e.g., Landsat, Sentinel-2), as an indicator of pest-related mortality in coniferous forests [10,11,13,32,33]. Other studies have used high-resolution drone imagery to map tree mortality [12,13]. Ref. [34] mapped defoliation caused by Spongy moth but did not investigate the impacts of defoliation on tree mortality. For Spongy moth and other leaf-eating pests that affect deciduous trees, mortality models should consider environmental stressors in addition to defoliation. Although some studies have included both environmental factors and satellite-based defoliation maps in tree mortality models [20,21], we are not aware of any studies that have applied these methods for Spongy moth or in temperate deciduous forests.
The objective of this paper is to model tree mortality resulting from a 2015–2017 Spongy moth outbreak in the temperate deciduous forest in Rhode Island, located in the northeastern USA. Spongy moth preferentially feeds on the foliage of oaks (Quercus spp.) and other deciduous trees, which make up the dominant forest cover in Rhode Island. During outbreaks, Spongy moth can cause complete and widespread defoliation. We use a Random Forest approach to model mortality based on defoliation and environmental factors that could stress trees and make them more susceptible to mortality. We map defoliation from Landsat imagery and include geospatial data representing topography, climate, soil, and vegetation characteristics in modeling mortality. This research explores the effectiveness of defoliation and environmental factors, represented by geospatial data, in predicting tree mortality from outbreaks of a leaf-eating forest pest (i.e., Spongy moth). To our knowledge, this is the first study that includes environmental predictors, along with satellite-based defoliation mapping, in mortality models for temperate deciduous forests.

2. Methods

2.1. Study Area

Our study area was the state of Rhode Island, which has a total area of 3144 km2 (Figure 1). Rhode Island is largely covered by temperate deciduous forests with species that are common throughout the eastern USA. Rhode Island forests are dominated by oak (Quercus spp.) and hickory (Carya spp.) species [35]. Other common species include eastern white pine (Pinus strobus), sugar maple (Acer saccharum), red maple (Acer rubrum), American beech (Fagus grandifolia), hemlock (Tsuga canadensis), birch (Betula spp.), elm (Ulmus americana), white ash (Fraxinus americana), and linden tree (Tilia americana) [35]. Coniferous forests of Rhode Island are dominated by eastern white pine and occur mostly in the southern part of the state [35]. The topography of Rhode Island mostly consists of gently rolling uplands with elevations up to 60 m and hillier upland terrain in the northwest with elevations of 60 to 200 m. The state includes humid continental, humid subtropical, and oceanic climate types [36]. Mean annual temperatures in Rhode Island range from 8 to 12 °C and annual precipitation ranges from 100 to 150 cm with lower precipitation amounts in coastal areas. Summers are characterized by periods (i.e., >2 weeks) of little to no precipitation.
Rhode Island is the second most densely populated state in the U.S. with an average of 393 persons/km2 [37]. The state contains 1784 km2 of forest land, which generates an estimated USD 408 M in forest products and USD 720 M annually in recreational value [38]. In addition, the forests provide erosion and flood control and other ecosystem services [38].
The study area experienced a severe Spongy moth outbreak in 2015–2017. This outbreak defoliated approximately 4386 km2 of forest in New England, which made it the largest outbreak in 30 years [34]. The outbreak was caused by unusually dry springs in 2014, 2015, and 2016. Those dry conditions suppressed the growth of the fungus Entomophaga maimaiga, which typically controls Spongy moth populations in the region [34].

2.2. Data and Software

To map defoliation, we used atmospherically corrected (level 2) Landsat 5 and 8 satellite imagery acquired in mid-summer (i.e., late June to early August) from 2009 to 2017 [39]. Landsat imagery has a 30 m spatial resolution for the visible and near-infrared wavelengths. Images corresponded as closely as possible to early July after Spongy moth activity reached its peak and before significant refoliation could occur. We acquired cloud-free imagery for 2009, 2011, 2013, and 2016. No cloud-free images were available for the remaining years so we used multiple images to create cloud-free composites. The composite images were created by using a cloud mask layer to identify cloud pixels in the primary images and replacing those pixels with cloud-free image pixels from the secondary images. Secondary images had acquisition dates within 4 weeks of the primary images.
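The compositing step described above can be sketched in a few lines. This is a minimal illustration, assuming images and cloud masks are held as equally sized nested lists; the function name `composite` and the sample values are hypothetical and do not come from the study's workflow.

```python
def composite(primary, primary_cloud, secondary, secondary_cloud):
    """Fill cloudy pixels of the primary image with clear secondary pixels.

    Each argument is a 2-D nested list of identical shape. The *_cloud
    arguments hold True where the cloud mask flags a pixel as cloudy.
    Pixels that are cloudy in both images are returned as None.
    """
    result = []
    for p_row, pc_row, s_row, sc_row in zip(
        primary, primary_cloud, secondary, secondary_cloud
    ):
        out_row = []
        for p, p_cloudy, s, s_cloudy in zip(p_row, pc_row, s_row, sc_row):
            if not p_cloudy:
                out_row.append(p)      # keep the clear primary pixel
            elif not s_cloudy:
                out_row.append(s)      # substitute the clear secondary pixel
            else:
                out_row.append(None)   # no cloud-free observation available
        result.append(out_row)
    return result


# Hypothetical 2 x 2 reflectance values.
primary = [[0.8, 0.1], [0.7, 0.2]]
primary_cloud = [[False, True], [False, True]]
secondary = [[0.5, 0.6], [0.5, 0.9]]
secondary_cloud = [[False, False], [False, True]]

filled = composite(primary, primary_cloud, secondary, secondary_cloud)
```

In this toy case one cloudy primary pixel is filled from the secondary image, and the pixel that is cloudy in both images stays empty.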
We used high-resolution summertime aerial imagery and Sentinel-2 imagery to evaluate the accuracy of the defoliation classification. The aerial imagery corresponded to July 2016, had a spatial resolution of 7.6 cm, and included visible and near-infrared wavelengths. Atmospherically corrected Sentinel-2 imagery [39] was used for 2015 and 2017 when summertime aerial imagery was not available. Sentinel-2 imagery has a spatial resolution of 10 m for the visible and near-infrared wavelengths. To create training and validation for the tree mortality models, we used aerial imagery acquired during the summer (July–August) of 2019; the imagery consisted of visible (RGB) bands with a 7.6 cm spatial resolution [40].
Various datasets were used to create environmental metrics. Most of these datasets were downloaded from the Rhode Island Geographic Information System and included Soils; Glacial Deposits; Ecological Communities of Rhode Island; 2011 Rhode Island Statewide LiDAR; Freshwater Lakes, Ponds, and Reservoirs; and Coastal Waters [40]. We also used the U.S. Drought Monitor data [41] and a 1 m resolution land cover from the National Oceanic and Atmospheric Administration Coastal Change Analysis Program (CCAP) [42]. We used weekly Drought Monitor data for the summertime (1 June–1 September) during the Spongy moth outbreak (2015–2017). The LiDAR data were acquired during leaf-off conditions in March 2011 and had a point density of 2+ pts/m2. The ground-classified points in the LiDAR data were used to create a 1 m resolution bare-earth digital elevation model (DEM). For all the GIS-based data processing and analysis, we used ArcGIS Pro with Python 3.11.3. The Python modules Scikit-Learn, Seaborn, and SHAP were used to run the Random Forest analysis, calculate related metrics, and create figures for tree mortality modeling.

2.3. Predictors of Tree Mortality

We created a 100 m × 100 m (1 ha) polygon grid covering the study area to serve as the analysis units for modeling. Tiles were included in the analysis if they had at least 75% forest cover (i.e., deciduous, coniferous, mixed, wetland) based on the CCAP land cover data. Most of the environmental metrics represent the fraction of a given tile that corresponds to the particular feature of interest. Altogether, we developed 21 environmental metrics, which characterized soil properties, topography, climate, forest characteristics, and proximity to resources and stressors (Table 1).

2.3.1. Defoliation

We used the Normalized Difference Vegetation Index (NDVI) to map defoliation with 30 m spatial resolution based on Landsat imagery [43]. To establish a baseline NDVI for the pre-outbreak forest, we calculated NDVI for the 2009, 2011, and 2013 summer images. We estimated annual pre-outbreak variation in NDVI based on a sample of 200 points. The sample included 100 points in locations with obvious defoliation and 100 points in locations with no apparent defoliation, based on the 2016 summertime aerial imagery. Points were semi-randomly distributed to encompass the extent of forest land in Rhode Island. The NDVI values for the pixels corresponding to the sample point locations were extracted from the pre-outbreak imagery. We calculated the average pre-outbreak NDVI values for each pixel as well as the standard deviations for each year. For each outbreak year (2015–2017), the NDVI rasters were subtracted from the pre-outbreak average NDVI raster. We considered NDVI differences of >0.09 to correspond to significant defoliation. This threshold was double the largest pre-outbreak standard deviation, which helped ensure that the differences exceeded the typical yearly variations.
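The differencing rule above can be illustrated for a single pixel. This is a sketch under stated assumptions: NDVI is computed with the standard (NIR − Red)/(NIR + Red) formula, and the reflectance values, yearly standard deviations, and function names are hypothetical. Only the 0.09 threshold and the "twice the largest standard deviation" rule come from the text.

```python
from statistics import mean


def ndvi(nir, red):
    """Standard NDVI from near-infrared and red surface reflectance."""
    return (nir - red) / (nir + red)


def threshold_from_yearly_sd(yearly_sds):
    """Threshold set to double the largest pre-outbreak standard deviation."""
    return 2 * max(yearly_sds)


def is_defoliated(pre_outbreak_ndvi, outbreak_ndvi, threshold=0.09):
    """Flag a pixel when NDVI drops below the pre-outbreak mean by more than the threshold."""
    return (mean(pre_outbreak_ndvi) - outbreak_ndvi) > threshold


# Hypothetical pixel: NDVI in 2009, 2011, and 2013, then two outbreak-year scenarios.
baseline = [0.86, 0.88, 0.87]
heavy_drop = is_defoliated(baseline, 0.70)   # drop of about 0.17
small_drop = is_defoliated(baseline, 0.82)   # drop of about 0.05

# Hypothetical yearly standard deviations that would yield the 0.09 threshold.
example_threshold = threshold_from_yearly_sd([0.030, 0.045, 0.040])
```

With these made-up values the first scenario is flagged as defoliated and the second is treated as ordinary year-to-year variation.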
We calculated a defoliation index for the 100 m tiles, which weighted multiple years of defoliation more heavily than a single year of defoliation because repeated defoliations have been shown to greatly increase tree mortality [19]. Ref. [19] found that healthy oak trees that are defoliated 2 years in a row have three times the mortality rate of oak trees that are defoliated only once (i.e., 22% vs. 7%). They expect that mortality would drastically increase if defoliation occurred 3 years in a row. Thus, the defoliation index was calculated as
Defoliation Index = fD1 + 2(fD2) + 3(fD3)
where fD1, fD2, and fD3 are the fractions of area within each tile that experienced defoliation for 1, 2, or 3 years, respectively (Table 1).
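As a worked example of the index, the sketch below derives the three fractions from per-pixel counts of defoliation years inside one tile. It assumes every pixel covers an equal share of the tile; the function name and pixel counts are hypothetical.

```python
def defoliation_index(years_defoliated_per_pixel):
    """Compute fD1 + 2*fD2 + 3*fD3 for one tile.

    The input lists, for each pixel in the tile, how many outbreak years
    (0 to 3) that pixel was classified as defoliated.
    """
    n = len(years_defoliated_per_pixel)
    if n == 0:
        raise ValueError("tile contains no pixels")
    f_d1 = years_defoliated_per_pixel.count(1) / n
    f_d2 = years_defoliated_per_pixel.count(2) / n
    f_d3 = years_defoliated_per_pixel.count(3) / n
    return f_d1 + 2 * f_d2 + 3 * f_d3


# Hypothetical tile with 10 pixels: three defoliated once, one twice, one three times.
tile_pixels = [0, 0, 1, 1, 1, 2, 0, 0, 0, 3]
tile_index = defoliation_index(tile_pixels)  # 0.3 + 2*0.1 + 3*0.1
```

The index ranges from 0 (no defoliation) to 3 (the whole tile defoliated in all three years).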

2.3.2. Environmental Metrics

Canopy cover can have positive or negative impacts on tree mortality depending on the ecosystem. We used the LiDAR point cloud to calculate canopy cover based on the first-return cover index (FRCI) [44]. This metric was calculated, based on the LiDAR points within each tile, as the ratio of first returns that intercept the canopy (FirstCanopy) to the total number of first returns (FirstTotal):
FRCI = ∑FirstCanopy/∑FirstTotal
First returns indicate the location where the LiDAR pulse first intercepted an object. We considered first returns with heights >3 m to correspond to the canopy. FRCI values range from 0 (no cover) to 1 (complete cover) (Table 1).
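The FRCI calculation can be sketched for a single tile as follows. The example assumes first-return heights above ground (in meters) have already been extracted for the tile; the function name and the sample heights are hypothetical.

```python
def first_return_cover_index(first_return_heights, canopy_height=3.0):
    """Fraction of first returns whose height exceeds the canopy threshold."""
    total = len(first_return_heights)
    if total == 0:
        raise ValueError("tile contains no first returns")
    canopy = sum(1 for h in first_return_heights if h > canopy_height)
    return canopy / total


# Hypothetical heights (m) of eight first returns in one tile.
heights = [0.1, 12.4, 18.0, 2.9, 3.0, 7.5, 0.0, 15.2]
tile_frci = first_return_cover_index(heights)
```

Four of the eight returns exceed 3 m (a return at exactly 3.0 m does not count), so the tile's FRCI is 0.5.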
The amount of coniferous tree cover can be an important factor because Spongy moth prefers to feed on deciduous trees [17,45]. Also, ruderal and plantation forests have different tolerances for stressors than later successional forests or forests that are not heavily managed. Thus, we calculated the fractions of each tile covered by evergreen, ruderal, and plantation forests, based on the Ecological Communities dataset (Table 1).
Drought conditions can have a major impact on tree survival when they coincide with other stressors such as defoliation. We calculated a drought index during the peak growing season (i.e., June through August) for the 2015–2017 outbreak period (Table 2). The drought data consisted of polygons with attributes ranging from 0 to 6 to indicate conditions ranging from no drought (0) to severe drought (6) [41]. The polygons were intersected with the 100 m grid to join the weekly drought indices to our study tiles. The drought index was the average of the weekly drought ratings over all the summer seasons of the outbreak period for each tile.
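The averaging step can be sketched for one tile as shown below. This assumes the weekly ratings have already been joined to the tile and that all summer weeks are pooled before averaging; the function name and the ratings are hypothetical, and only four weeks per summer are shown for brevity.

```python
def drought_index(weekly_ratings_by_year):
    """Average all weekly drought ratings for a tile across the outbreak summers."""
    ratings = [r for weeks in weekly_ratings_by_year.values() for r in weeks]
    if not ratings:
        raise ValueError("no drought ratings supplied")
    return sum(ratings) / len(ratings)


# Hypothetical weekly ratings for one tile.
tile_ratings = {
    2015: [0, 0, 1, 1],
    2016: [1, 2, 2, 3],
    2017: [0, 0, 0, 0],
}
tile_drought = drought_index(tile_ratings)  # 10 / 12
```

Pooling all weeks gives the same result as averaging the yearly means whenever each summer contributes the same number of weeks.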
Coastal environments have differing climatic conditions (e.g., temperature, humidity) compared to inland environments, which may be relevant for predicting forest mortality. To represent coast proximity, we calculated distances from the coastline as defined by the Rhode Island Coastal Waters dataset. The distances within each tile were averaged to create the metric (Table 2).
Urban proximity can stress trees through exposure to air pollution, road salt, heat island effects, and invasive pests [27,28,29,30]. We calculated urban distance as distance from developed land cover in the CCAP land cover dataset. The distances within each tile were averaged to create the metric (Table 2).
Proximity to waterbodies can influence soil moisture levels and depth to the water table. We used the Freshwater Lakes, Ponds, and Reservoirs dataset to calculate distance from waterbodies. The distances within each tile were averaged to create the lake proximity metric (Table 2).
Soil and glacial deposits can impact tree mortality [22] by limiting root growth or water availability [46]. Soil attributes included tile fractional cover of (1) hydric soils with permanent or temporary oversaturation, (2) overdrained soils, (3) eroded soils, (4) a densic horizon close to the surface, (5) stony soils, and (6) restrictive soils (Table 3). Metrics relating to glacial deposit attributes included tile fractional cover of (1) till, (2) outwash, and (3) bedrock (Table 3).
Topography can influence climate, which may affect tree survival during stressful events [47,48,49]. Hilltops and steep south-facing slopes may be drier than valleys or north-facing slopes. To characterize relevant slope conditions, we calculated slope and aspect based on a 1 m DEM. Slope and aspect corresponded to the steepest gradient within a 3 × 3 window centered on each pixel [50]. We extracted slopes >20° as well as slopes that were both steep (>20°) and south-facing with aspects between 225° and 315°. Metrics were calculated as the fractions of each tile covered by steep slopes and steep south-facing slopes (Table 3). To identify hilltops and valley bottoms, we used a Topographic Position Index (TPI) [51] based on the DEM. We calculated TPI by subtracting the average elevation within 20 m of a given pixel location from the actual elevation at that location. Positive TPI values indicate hilltops whereas negative values indicate valley bottoms. We extracted TPI > 0.2 and TPI < −0.2 to represent hilltops and valley bottoms, respectively. The fractions of each tile covered by hilltops and valley bottoms were used as the metrics (Table 3).
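The TPI step can be illustrated on a tiny grid. This is a sketch only: it assumes a circular neighborhood that includes the center cell and is clipped at the grid edge, which the text does not specify, and the function names and elevations are hypothetical. On a 1 m DEM the 20 m neighborhood would correspond to `radius_cells=20`.

```python
def topographic_position_index(dem, row, col, radius_cells):
    """Elevation at (row, col) minus the mean elevation within radius_cells."""
    n_rows, n_cols = len(dem), len(dem[0])
    total, count = 0.0, 0
    for r in range(max(0, row - radius_cells), min(n_rows, row + radius_cells + 1)):
        for c in range(max(0, col - radius_cells), min(n_cols, col + radius_cells + 1)):
            # Keep only cells inside the circular neighborhood.
            if (r - row) ** 2 + (c - col) ** 2 <= radius_cells ** 2:
                total += dem[r][c]
                count += 1
    return dem[row][col] - total / count


def classify_position(tpi, threshold=0.2):
    """Apply the TPI > 0.2 (hilltop) and TPI < -0.2 (valley bottom) cutoffs."""
    if tpi > threshold:
        return "hilltop"
    if tpi < -threshold:
        return "valley bottom"
    return "neither"


# Hypothetical 3 x 3 elevation grids (m) with a raised and a lowered center cell.
knoll = [[10, 10, 10], [10, 12, 10], [10, 10, 10]]
hollow = [[10, 10, 10], [10, 8, 10], [10, 10, 10]]

knoll_tpi = topographic_position_index(knoll, 1, 1, radius_cells=1)
hollow_tpi = topographic_position_index(hollow, 1, 1, radius_cells=1)
```

With a one-cell radius the neighborhood holds five cells, so the raised center yields a TPI of 1.6 (hilltop) and the lowered center yields −1.6 (valley bottom).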
Table 1. List of forest-related predictors.
Name | Description | Justification | Data
Defoliation index | % defoliated forest cover weighted by number of years of defoliation | Defoliation depletes energy reserves [17] and multiple years of defoliation greatly increase mortality [19]. | Landsat
Evergreen cover | % cover by coniferous trees | Coniferous trees are not a preferred food source but do not tolerate defoliation [17,45]. | Ecological Communities
Canopy cover | % of 1st returns intercepted by canopy | Canopy cover can affect competition and microclimate and has been shown to affect mortality [20,24,25,26]. | 2011 Lidar
Ruderal forests | % cover by early successional forest species | Early successional forest has species adapted to disturbances. Heavily managed forest may have different tolerances for stressors. | Ecological Communities
Plantations | % cover by plantation forest | (shared with Ruderal forests) | Ecological Communities
Table 2. List of proximity- and drought-related predictors.
Name | Description | Justification | Data
Drought index | Average summer drought rating during outbreak | Drought is a major stressor [52,53,54] and prolonged drought has been shown to correlate with tree mortality [23]. | U.S. Drought Monitor
Coast proximity | Average distance to the coast | Proximity to coast affects climate conditions (e.g., temperature, humidity), which may affect tree stress. | RI Coastal Waters
Urban proximity | Average distance to development | Developed areas may be exposed to more pollutants and invasive species [27,28,29,30]. | NOAA CCAP Land Cover
Lake proximity | Average distance to the lakes | Ground water near waterbodies may be shallow and variable. | Lakes, Ponds, and Reservoirs
Table 3. List of soil- and topography-related predictors.
Name | Description | Justification | Data
Hydric soils | % cover by hydric soils | Hydric soils may have waterlogged soils whereas overdrained soils may have low available moisture; restrictive soils, shallow bedrock, tills, and stony soils can limit root depth or lateral distribution, thus reducing access to water and nutrients; eroded and outwash soils may have low nutrient availability [46]. Soils and geology have been shown to affect tree mortality [22]. | NRCS Soils
Restrictive soils | % cover by restrictive soils | (shared with Hydric soils) | NRCS Soils
Stony soils | % cover by stony soils | (shared with Hydric soils) | NRCS Soils
Shallow bedrock | % cover by soils with shallow bedrock | (shared with Hydric soils) | NRCS Soils
Overdrained soils | % cover by overdrained soils | (shared with Hydric soils) | NRCS Soils
Eroded soils | % cover by eroded soils | (shared with Hydric soils) | NRCS Soils
Outwash | % cover by glacial outwash | (shared with Hydric soils) | Glacial Deposits
Till | % cover by glacial till | (shared with Hydric soils) | Glacial Deposits
Valley bottoms | % cover by valleys (TPI < −0.2) | Topographic position and slope can affect sun intensity, soil moisture, nutrient availability, and stability [22,47,48,49]. | 2011 Lidar
Hilltops | % cover by hilltops (TPI > 0.2) | (shared with Valley bottoms) | 2011 Lidar
Steep slopes | % cover by slopes >20° | (shared with Valley bottoms) | 2011 Lidar
Steep south-facing slopes | % steep slopes facing 225°–315° | (shared with Valley bottoms) | 2011 Lidar

2.4. Training/Validation Data

To train and validate our tree mortality models, we used two different “ground-truth” datasets that were based on summertime 2019 aerial imagery. The first set included 1426 tiles that were randomly selected from the 100 m polygon grid. These tiles were visually evaluated using the aerial imagery and rated using a six-category system to characterize mortality of overstory trees in each tile (Table 4). The second training/validation dataset included 787 tiles corresponding to Rhode Island Department of Environmental Management (RIDEM) property that had particularly high rates of tree mortality. For these tiles, the locations of dead overstory trees were digitized as point features and subsequently aggregated by tile to create ratings that were consistent with the first training dataset (Table 4). Altogether, the training/validation dataset included 2213 sample tiles.

2.5. Modeling Tree Mortality Using Random Forest

We used Random Forest to model tree mortality based on the 21 environmental and defoliation metrics (Table 1, Table 2 and Table 3). We created two different models—one predicted two classes of mortality (low/high) and the other predicted three classes of mortality (low/med/high). The two-class model predicted low- and high-mortality classes with <5 and ≥5 dead overstory trees per hectare, respectively. The three-class model predicted low-, medium-, and high-mortality classes with ≤2, 3–11, and >11 dead overstory trees per hectare. We selected these thresholds to ensure that there were relatively similar numbers of tiles representing each class in the training/validation data. Visual assessments showed that these thresholds were adequate for representing the differing intensities of tree mortality in the study area. We were not able to identify ecologically meaningful thresholds from the literature since studies tend to not distinguish dead overstory trees from the total number of dead trees per hectare. We used a split of 80% and 20% for training and validation data, respectively. We used a 5-fold cross-validation to evaluate the consistency of model performance based on random subsets of the training/validation dataset.
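The class thresholds above can be written as two small functions. This sketch assumes a whole-number count of dead overstory trees per 1 ha tile is available; the function names are hypothetical.

```python
def two_class_label(dead_trees_per_ha):
    """Low: fewer than 5 dead overstory trees/ha. High: 5 or more."""
    return "low" if dead_trees_per_ha < 5 else "high"


def three_class_label(dead_trees_per_ha):
    """Low: 2 or fewer. Medium: 3 to 11. High: more than 11 dead overstory trees/ha."""
    if dead_trees_per_ha <= 2:
        return "low"
    if dead_trees_per_ha <= 11:
        return "medium"
    return "high"


# Boundary cases for each scheme.
two_class_examples = [two_class_label(n) for n in (4, 5)]
three_class_examples = [three_class_label(n) for n in (2, 3, 11, 12)]
```

The boundary cases show that a tile with exactly 5 dead trees falls in the two-class "high" group, while a tile with exactly 11 remains "medium" in the three-class scheme.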
We tuned our model by adjusting several hyperparameters through trial and error to find optimal parameters. Hyperparameters are the settings that are not learned by the model during training but set beforehand, and they can significantly impact model performance. We adjusted the following hyperparameters: the number of trees, minimum number of samples required to be at a leaf node, and criterion for splitting a node. We found that the models performed best with 100 trees, a minimum of 3 samples per leaf node, and the Gini criterion for splitting the nodes. We tested variants of the model using all 21 predictors as well as using only the top 3 and 7 most important predictors. Variable importance was determined from the Random Forest feature importance scores.
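A configuration along these lines can be sketched with scikit-learn. This is not the authors' script: the data are synthetic stand-ins generated with `make_classification`, the minimum-of-3 setting is assumed to map to `min_samples_leaf`, and the stratified split and fixed random seeds are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the tile table: 21 predictors, 3 mortality classes.
X, y = make_classification(
    n_samples=600, n_features=21, n_informative=5, n_classes=3, random_state=0
)

# 80% / 20% split for training and validation.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hyperparameters reported in the text (assumed mapping to scikit-learn names).
model = RandomForestClassifier(
    n_estimators=100, min_samples_leaf=3, criterion="gini", random_state=0
)

# 5-fold cross-validation to check the consistency of model performance.
cv_scores = cross_val_score(model, X, y, cv=5)

# Fit on the training split and score on the held-out validation split.
model.fit(X_train, y_train)
val_accuracy = model.score(X_val, y_val)

# Rank predictors by impurity-based importance and keep the top 7.
ranked = sorted(
    range(X.shape[1]), key=lambda i: model.feature_importances_[i], reverse=True
)
top7 = ranked[:7]
```

A reduced model can then be refit on `X_train[:, top7]` to compare its accuracy against the full 21-predictor model.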

2.6. Accuracy Assessment

We assessed the accuracy of our defoliation mapping for 2015, 2016, and 2017 by qualitatively assessing Landsat, aerial, and Sentinel-2 imagery, respectively. These image datasets represented the best available data for each year. For both the defoliation and mortality models, we quantitatively assessed accuracy using the F1 score, precision, and recall. These metrics are based on true positives (TP), false positives (FP), and false negatives (FN) and were calculated as follows:
Precision = TP/(TP + FP)
Recall = TP/(TP + FN)
F1 score = 2 × (Precision × Recall)/(Precision + Recall)
Quantitative assessment of the defoliation mapping was based on the 200 semi-random sample locations. Landsat, aerial, and Sentinel-2 imagery was used for visually classifying the sample points for 2015, 2016, and 2017, respectively. Accuracy assessment for the mortality modeling was based on the validation subsets (i.e., 20%) of the 2213 training/validation tiles. The mortality models were further evaluated using Shapley Additive Explanation (SHAP) plots and feature importance scores. SHAP plots indicate how the model uses a particular predictor and feature importance scores indicate the relative contribution of each predictor in the model classification. SHAP plots were based on a model that used the full dataset (i.e., without cross-validation) and the 7 most important variables.
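The precision, recall, and F1 formulas given above can be computed directly from the three counts. The counts in this sketch are hypothetical and are not taken from the study's tables.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * (precision * recall) / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 80 true positives, 20 false positives, 10 false negatives.
precision, recall, f1 = precision_recall_f1(tp=80, fp=20, fn=10)
```

Here precision is 0.80, recall is about 0.89, and F1 is about 0.84, which equals 2TP/(2TP + FP + FN) = 160/190.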

3. Results

3.1. Defoliation

During the 3-year Spongy moth outbreak (2015–2017), approximately 23% of Rhode Island’s forest (312 km2) experienced significant defoliation for one or more years, based on the 30 m defoliation model. Repeated defoliation for a given area was uncommon. Of the forest land that experienced defoliation during the outbreak, 89.2% of forests (278 km2) were defoliated only once. Only 10.6% (33 km2) and 0.2% (<1 km2) of affected forest land experienced two or three defoliations, respectively (Figure 2). For the approximately 136,000 one-hectare study tiles with >75% forest cover, 33% (44,807 tiles) and 18% (24,050 tiles) of the tiles had defoliation over more than 25% and 75% of the tile area, respectively. Defoliation was distributed predominantly in the western part of Rhode Island, with repeated defoliation occurring sporadically, mostly in the southwestern and northwestern parts of the state. Detectable defoliation rarely occurred in coastal forests (Figure 2).
The defoliation model had good agreement with our visual interpretation of the aerial and satellite imagery for our 200 sample points (Table 5). For accuracy assessment, we considered (1) complete defoliation only and (2) complete and partial defoliation. We had high confidence in our ability to visually detect complete defoliation in both satellite and aerial imagery, but lower confidence in our ability to detect partial defoliation in the relatively coarse Landsat and Sentinel-2 data. For complete defoliation, the model performed consistently throughout the outbreak period with F1 scores ranging from 0.84 to 0.86. When partial defoliation was also considered, the model accuracy was more variable with F1 scores ranging from 0.76 to 0.87; the best accuracy was associated with the 2016 data, for which we were able to use aerial imagery to create more reliable validation data. The lower accuracy when partial defoliation was included for 2015 and 2017 may be due, in part, to an incorrect interpretation of the validation imagery.

3.2. Tree Mortality

Mortality was modeled for 1358 km2 of forest land, which corresponded to tiles with >75% forest cover. Based on the models, tree mortality was highest in the western part of the state (Figure 3). There was low tree mortality in the coastal areas and the eastern part of the state. For the two-class model, 65% (883 km2) and 35% (475 km2) of forest land was classified as low and high mortality, respectively. For the three-class model, 56% (760 km2), 23% (312 km2), and 21% (286 km2) of forest land was classified as low, medium, and high mortality, respectively.
Our best two-class and three-class models of tree mortality had overall accuracies of 82% and 65%, respectively (Table 6 and Table 7). When only defoliation was included as a predictive variable, accuracies were 72% and 56% for the two-class and three-class models, respectively. When the top three predictive variables were included, accuracies improved to 79% and 63% for the two-class and three-class models, respectively. For both two- and three-class models, precision tended to improve slightly with the inclusion of more predictive variables. However, the 21-variable models were not consistently better than the 7-variable model, and both models were only slightly better than the 3-variable model. For the two-class model, recall tended to improve with the number of variables. However, for the three-class models, there was no consistent pattern between the recall and number of variables.
The two-class model was slightly better at predicting the high class at the cost of underpredicting the low class (Table 8). Commission errors were around 20% for both the low and high classes. Omission errors were 28% and 13% for the low and high class, respectively. The three-class model was also slightly better at predicting the high class and worst at predicting the medium class (Table 9). When misclassified, the medium class was primarily confused with either the high or low classes. Commission errors were 34%, 48%, and 26% for the low, medium, and high classes, respectively. Omission errors were 32%, 52%, and 22% for the low, medium, and high classes, respectively.
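Commission and omission errors are derived from a confusion matrix. The sketch below assumes rows hold the reference class and columns the predicted class. The counts are invented for illustration and are not the values in Table 8; the function name is hypothetical.

```python
def error_rates(confusion, labels):
    """Return (overall accuracy, commission errors, omission errors).

    confusion[i][j] counts tiles with reference class i predicted as class j.
    Commission error: share of tiles predicted as a class that belong elsewhere.
    Omission error: share of reference tiles of a class that the model missed.
    """
    n = len(labels)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    commission, omission = {}, {}
    for k, label in enumerate(labels):
        predicted_total = sum(confusion[i][k] for i in range(n))
        reference_total = sum(confusion[k])
        commission[label] = 1 - confusion[k][k] / predicted_total
        omission[label] = 1 - confusion[k][k] / reference_total
    return correct / total, commission, omission


# Invented two-class counts (rows: reference low/high, columns: predicted low/high).
matrix = [[72, 28],
          [18, 120]]
accuracy, commission, omission = error_rates(matrix, ["low", "high"])
```

In this invented example, 28 of 100 low-mortality reference tiles are missed (28% omission), while 18 of the 90 tiles predicted as low are actually high (20% commission).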
The importance of the predictive variables was consistent among all the tree mortality models (Table 10, Table 11 and Table 12). The defoliation index was the most important predictor followed by coast proximity and canopy cover. For the models that included either all 21 variables or only the top 7 variables, the defoliation index was substantially more important than coast proximity, and coast proximity was substantially more important than canopy cover. In models with seven or more variables, canopy cover was only slightly more important than evergreen cover. The evergreen cover, drought index, urban proximity, and lake proximity all had similar but relatively minor importance. Variables relating to soil and topography tended to be the least important predictors.
The SHAP analysis shows how individual variables influence the model results for the high-mortality class of the three-class model (Figure 4 and Figure 5). Similar variable influences were found for the two-class model. Tiles that were classified as “high mortality” tended to have a higher defoliation index, greater distance from the coast, lower canopy cover, a higher drought index, greater distance from urban areas, and less evergreen cover (Figure 4). Tiles were associated with the highest-mortality class if they had defoliation index values >0.4, were >15 km from the coast, had <60% canopy cover, had drought indices >0.8, were >100 m from urban areas, and had <20% evergreen tree cover (Figure 5).

4. Discussion

This study used satellite-based defoliation mapping combined with geospatial environmental predictors to model tree mortality resulting from a Spongy moth outbreak in a mixed temperate deciduous forest. The performances of our models were on par with the 70%–80% accuracy achieved by studies that modeled forest mortality directly from defoliation in coniferous forests [11,12,13] or that modeled mortality using defoliation and environmental factors in a semi-arid woodland [20]. Our models identified predictors that were associated with higher rates of mortality (e.g., defoliation index, coast proximity, canopy cover). The inclusion of environmental predictors along with the defoliation index improved model performance by 9%–10%, which is similar to findings in [20] that GIS-based data explained an additional 17% of mortality beyond defoliation alone. To our knowledge, our study is the first to model forest mortality from leaf-eating insects in temperate deciduous forest using both defoliation and environmental predictors.
The SHAP analysis was generally consistent with our expectations of how various predictors should affect tree mortality. Higher tree mortality was associated with a higher defoliation index, greater distance from the coast, lower canopy cover, less evergreen cover, and greater distance from urban cover. Since defoliation is the primary stressor for trees during a Spongy moth outbreak, the defoliation index was expected to be directly related to increased mortality. Evergreen cover was expected to be inversely related to mortality since coniferous species are not the preferred food source of Spongy moths. The higher mortality associated with distances >15 km from the coast may be due to higher temperatures and lower humidity found in inland areas [36]. Coastal proximity was much more important than the drought index, which suggests that other protective attributes (such as differing forest compositions) may be associated with coastal areas. The much higher resolution of the coastal proximity metric may also help explain its greater importance than the drought index. The higher mortality associated with lower canopy cover may be due to the increased solar heating of the ground and evaporation of soil moisture in more open forests. In our study area, forests with lower canopy cover could also signify previous disturbances (e.g., storm damage) that contributed to overall poor forest health. Our finding regarding canopy cover is consistent with [20], which studied semi-arid piñon–juniper woodlands. The lower mortality within 100 m of urban areas was an unexpected but minor effect, which may be due to the greater care and management (e.g., prompt removal of dead trees) provided to trees in more urban environments. It is likely that many of the dead trees near roads and buildings were removed during the 2–4-year period between the outbreak and the time the aerial imagery was collected in 2019.
We unexpectedly found that soil-based predictors had little importance in our models for predicting tree mortality. Adverse soil conditions constrain maximum tree heights, slow growth rates, and stress trees through limited water or nutrient availability. Ref. [20] found that surface organic matter had moderate importance for predicting tree mortality but found that 29 other soil variables had very little value. The lack of importance for soil data in our study may reflect a limitation of GIS soil datasets. The minimum mapping unit of our dataset was around 1 ha, which could omit much of the soil variation that would be relevant to mortality of individual trees. In addition, unfavorable individual soil characteristics were relatively uncommon in our study area and the majority of the tiles in our training/validation dataset had zeroes or very low values for soil-based predictors. The lack of variation may have made it less likely for Random Forest to find useful partitions of these predictors that were associated with varying levels of tree mortality. Combining the soil characteristics into a single metric may yield a more useful predictor for Random Forest.
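One way to picture the suggestion of combining soil characteristics into a single metric is sketched below. It assumes that each soil predictor is expressed as the fraction of a tile's area (0 to 1) affected by that condition, which is an assumption rather than something stated in this section, and it uses a simple capped sum as the composite. The function name, the predictor keys, and the capped-sum rule are all hypothetical.

```python
from typing import Mapping

# Soil-related predictors named in Table 10; the dictionary keys are illustrative.
SOIL_PREDICTORS = (
    "hydric",
    "stony",
    "restrictive",
    "shallow_bedrock",
    "overdrained",
    "eroded",
)


def composite_soil_stress(fractions: Mapping[str, float]) -> float:
    """Combine per-tile soil fractions into one value between 0 and 1.

    Missing predictors are treated as 0. The sum is capped at 1 because
    adverse soil conditions may overlap spatially within a tile.
    """
    total = 0.0
    for name in SOIL_PREDICTORS:
        value = fractions.get(name, 0.0)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} fraction must be within [0, 1], got {value}")
        total += value
    return min(1.0, total)


# Made-up tile with two uncommon adverse conditions.
print(composite_soil_stress({"hydric": 0.10, "stony": 0.05}))
```

Because a tile with any adverse condition receives a nonzero score, the composite would be less sparse than the individual predictors, which may give Random Forest more usable split points. Whether that improves the mortality models would still need to be tested.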
We found that topographic characteristics also had little importance in our models of tree mortality. Steeper south-facing slopes tend to receive more solar heating than other slope orientations, which results in warmer and drier conditions that are likely to stress trees. However, the topography in our study area was relatively moderate with little area covered by steep south-facing slopes. The relatively infrequent occurrence of steep slopes in the study area may have made the predictor unlikely to be used effectively by Random Forest. However, slope and orientation may be more important factors in areas with rugged topography.
Our models were only slightly improved when we included more than the three top predictors. Models with the seven top predictors performed very similarly to models with the full set of predictors. The ability to use fewer predictors without sacrificing model performance is advantageous because it simplifies model development and improves efficiency.
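The comparison can be made concrete with the overall accuracies reported in Table 6 and Table 7. The sketch below computes the gain, in percentage points, of each multi-variable model over the one-variable model. It assumes that the one-variable models use only the defoliation index; the variable and function names are illustrative.

```python
from typing import Dict

# Overall accuracy (OA) by number of variables, copied from Table 6 and Table 7.
OVERALL_ACCURACY = {
    "2-class": {21: 0.81, 7: 0.82, 3: 0.79, 1: 0.72},
    "3-class": {21: 0.64, 7: 0.65, 3: 0.63, 1: 0.56},
}


def accuracy_gain_points(oa_by_n: Dict[int, float]) -> Dict[int, int]:
    """Gain in OA (percentage points) over the one-variable model."""
    baseline = round(oa_by_n[1] * 100)
    return {
        n: round(oa * 100) - baseline
        for n, oa in oa_by_n.items()
        if n != 1
    }


for model, oa_by_n in OVERALL_ACCURACY.items():
    print(model, accuracy_gain_points(oa_by_n))
```

Under this reading, the gains range from 7 to 10 percentage points across both model types, and moving from three to seven variables adds only 2 to 3 points.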
The fraction of a tree crown that is defoliated is likely to be an important factor in tree mortality; however, this information cannot be derived from relatively coarse Landsat or Sentinel-2 satellite imagery. Ref. [34] mapped defoliation, from a Spongy moth outbreak, with differing levels of severity; however, the severity levels were not validated and cannot be expected to correspond to differing defoliation levels of individual trees. We did not attempt to map differing severity levels because we found it difficult to assess partial defoliation visually based on the 10–30 m resolutions of the satellite imagery that we used for accuracy assessments. We also found significant annual variation in our baseline (i.e., pre-outbreak) NDVI values; thus, we chose a conservative threshold for defoliation to minimize commission error.
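The idea of a conservative threshold can be sketched as follows. A pixel is flagged as defoliated only when its outbreak-year NDVI falls well below the range of its pre-outbreak baseline values. The rule used here (baseline mean minus `k` standard deviations), the default `k`, and the sample values are hypothetical; the thresholding procedure actually used in the study is not reproduced in this section.

```python
from statistics import mean, stdev
from typing import Sequence


def is_defoliated(
    baseline_ndvi: Sequence[float],
    outbreak_ndvi: float,
    k: float = 3.0,
) -> bool:
    """Flag defoliation when NDVI drops more than k baseline standard deviations.

    A larger k is more conservative: fewer pixels are flagged, which lowers
    commission error at the cost of missing some partial defoliation.
    """
    threshold = mean(baseline_ndvi) - k * stdev(baseline_ndvi)
    return outbreak_ndvi < threshold


# Made-up baseline NDVI values for one pixel over five pre-outbreak years.
baseline = [0.82, 0.85, 0.80, 0.84, 0.83]
print(is_defoliated(baseline, 0.55))         # large drop
print(is_defoliated(baseline, 0.79))         # drop within baseline variation
print(is_defoliated(baseline, 0.79, k=1.0))  # same drop, looser threshold
```

The third call shows the trade-off: with a looser threshold, a drop that is comparable to normal year-to-year variation is flagged as defoliation.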
This study was applied to a Spongy moth pest outbreak over a limited geographic area. Thus, the relevant predictors of mortality may be somewhat different for other forest pests or disease outbreaks in different areas. The coastal proximity metric would likely be irrelevant for more inland study areas and the effect of evergreen cover will depend on the feeding preferences of the pest. The literature has shown an inconsistent effect of canopy cover that varies based on the study area. Although predictors may vary by study area, we were able to confirm the benefit of incorporating GIS-based predictors along with satellite-based mapping of defoliation in forest mortality models for leaf-eating insects in temperate deciduous forests. The Random Forest tool was effective for using certain types of predictors in ways that are consistent with expectations. However, it seemed ineffective for using predictors that are relatively uncommon but still likely to be relevant (e.g., soils). Limitations in the spatial resolution of GIS data may also preclude the inclusion of important predictors in mortality models. Future work should explore whether uncommon, but likely relevant, features can be used more effectively in mortality models.

5. Conclusions

Our study used satellite-based defoliation mapping and geospatial environmental data with Random Forest to model tree mortality from a Spongy moth outbreak in Rhode Island’s temperate deciduous forest. The best models achieved accuracies of 82% and 65% when predicting two classes (low/high) or three classes (low/med/high) of mortality, respectively. The inclusion of geospatial data improved model predictions by 7%–10% compared to models based only on defoliation. The most important predictors in the models were the defoliation index, coastal proximity, and canopy cover. Models improved only slightly with the inclusion of more than three top predictors. Soil characteristics had very little contribution to the models, which may be due to the coarse resolution (i.e., minimum mapping unit of 1 ha) and the tendency for individual adverse soil characteristics to be relatively uncommon. Topographic factors also had minimal influence on models, which may be due to the relatively moderate topography of the study area. Although relevant predictors of tree mortality may vary somewhat for different regions and pest species, this study showed the benefit of Random Forest modeling that combines satellite-based monitoring with geospatial environmental data.

Author Contributions

Conceptualization, J.R.P.; Formal analysis, L.D.; Methodology, L.D. and J.R.P.; Supervision, J.R.P.; Writing—original draft, L.D.; Writing—review and editing, J.R.P. All authors have read and agreed to the published version of the manuscript.

Funding

USDA National Institute of Food and Agriculture, McIntire Stennis project 1021643; USGS AmericaView project AV23-RI-01.

Data Availability Statement

The original data presented in the study are openly available online: https://www.rigis.org/ (accessed on 5 January 2025).

Acknowledgments

The authors thank Rockwell Richards and Molly Ahern for their contributions in creating the training/validation datasets used in this study. This work was supported by the USDA National Institute of Food and Agriculture, McIntire Stennis project 1021643, and the United States Geological Survey through the AmericaView program (AV23-RI-01). We thank the reviewers for their valuable feedback on the original manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Seidl, R.; Schelhaas, M.J.; Rammer, W.; Verkerk, P.J. Increasing forest disturbances in Europe and their impact on carbon storage. Nat. Clim. Change 2014, 4, 806–810.
  2. Quirion, B.R.; Domke, G.M.; Walters, B.F.; Lovett, G.M.; Fargione, J.E.; Greenwood, L.; Serbesoff-King, K.; Randall, J.M.; Fei, S. Insect and disease disturbances correlate with reduced carbon sequestration in forests of the contiguous United States. Front. For. Glob. Change 2021, 4, 716582.
  3. Fei, S.; Morin, R.S.; Oswalt, C.M.; Liebhold, A.M. Biomass losses resulting from insect and disease invasions in US forests. Proc. Natl. Acad. Sci. USA 2019, 116, 17371–17376.
  4. Murray, B.C.; Pendleton, L.; Jenkins, W.A.; Sifleet, S. Green Payments for Blue Carbon: Economic Incentives for Protecting Threatened Coastal Habitats; Nicholas Institute for Environmental Policy Solutions, Duke University: Durham, NC, USA, 2011.
  5. Bigsby, K.M.; Ambrose, M.J.; Tobin, P.C.; Sills, E.O. The cost of gypsy moth sex in the city. Urban For. Urban Green. 2014, 13, 459–468.
  6. Verbesselt, J.; Robinson, A.; Stone, C.; Culvenor, D. Forecasting tree mortality using change metrics derived from MODIS satellite data. For. Ecol. Manag. 2009, 258, 1166–1173.
  7. Hudgins, E.J.; Liebhold, A.M.; Leung, B. Predicting the spread of all invasive forest pests in the United States. Ecol. Lett. 2017, 20, 426–435.
  8. Ivantsova, E.D.; Pyzhev, A.I.; Zander, E.V. Economic consequences of insect pests outbreaks in boreal forests: A literature review. J. Sib. Fed. Univ. 2019, 12, 627–642.
  9. Kinahan, I.G.; Grandstaff, G.; Russell, A.; Rigsby, C.M.; Casagrande, R.A.; Preisser, E.L. A four-year, seven-state reforestation trial with eastern hemlocks (Tsuga canadensis) resistant to hemlock woolly adelgid (Adelges tsugae). Forests 2020, 11, 312.
  10. Meddens, A.J.H.; Hicke, J.A. Spatial and temporal patterns of Landsat-based detection of tree mortality caused by a mountain pine beetle outbreak in Colorado, USA. For. Ecol. Manag. 2014, 322, 78–88.
  11. Goodwin, N.R.; Coops, N.C.; Wulder, M.A.; Gillanders, S.; Schroeder, T.A.; Nelson, T. Estimation of insect infestation dynamics using a temporal sequence of Landsat data. Remote Sens. Environ. 2008, 112, 3680–3689.
  12. Long, J.A.; Lawrence, R.L. Mapping Percent Tree Mortality due to Mountain Pine Beetle Damage. For. Sci. 2016, 62, 392–402.
  13. Bergmüller, K.O.; Vanderwel, M.C. Predicting Tree Mortality Using Spectral Indices Derived from Multispectral UAV Imagery. Remote Sens. 2022, 14, 2195.
  14. Meng, R.; Gao, R.; Zhao, F.; Huang, C.; Sun, R.; Lv, Z.; Huang, Z. Landsat-based monitoring of southern pine beetle infestation severity and severity change in a temperate mixed forest. Remote Sens. Environ. 2022, 269, 112847.
  15. Zhan, Z.; Yu, L.; Li, Z.; Ren, L.; Gao, B.; Wang, L.; Luo, Y. Combining GF-2 and Sentinel-2 Images to Detect Tree Mortality Caused by Red Turpentine Beetle during the Early Outbreak Stage in North China. Forests 2020, 11, 172.
  16. Gottschalk, K.W.; Colbert, J.J.; Feicht, D.L. Tree mortality risk of oak due to gypsy moth. Eur. J. For. Pathol. 2007, 28, 121–132.
  17. Davidson, C.B.; Gottschalk, K.W.; Johnson, J.E. Tree mortality following defoliation by the European gypsy moth (Lymantria dispar L.) in the United States: A review. For. Sci. 1999, 45, 74–84.
  18. Baker, W.L. Effect of Gypsy Moth Defoliation on Certain Forest Trees. J. For. 1941, 39, 1017–1022.
  19. Campbell, R.W. Gypsy Moth Influence on Forest; Agriculture Information Bulletin No. 423; U.S. Department of Agriculture Forest Service Pacific Northwest Research Station: Portland, OR, USA, 1979.
  20. Campbell, M.J.; Dennison, P.E.; Tune, J.W.; Kannenberg, S.A.; Kerr, K.L.; Codding, B.F.; Anderegg, W.R.L. A multi-sensor, multi-scale approach to mapping tree mortality in woodland ecosystems. Remote Sens. Environ. 2020, 245, 111853.
  21. Navarro-Cerrillo, R.M.; Varo-Martínez, M.Á.; Acosta, C.; Rodriguez, G.P.; Sánchez-Cuesta, R.; Ruiz Gómez, F.J. Integration of WorldView-2 and airborne laser scanning data to classify defoliation levels in Quercus ilex L. dehesas affected by root rot mortality: Management implications. For. Ecol. Manag. 2019, 451, 117564.
  22. Toledo, J.J.; Magnusson, W.E.; Castilho, C.V.; Nascimento, H.E.M. Tree mode of death in Central Amazonia: Effects of soil and topography on tree mortality associated with storm disturbances. For. Ecol. Manag. 2012, 263, 253–261.
  23. Guarín, A.; Taylor, A.H. Drought triggered tree mortality in mixed conifer forests in Yosemite National Park, California, USA. For. Ecol. Manag. 2005, 218, 229–244.
  24. Gunst, K.; Weisberg, P.G.; Yang, J.; Fan, Y. Do denser forests have greater risk of tree mortality: A remote sensing analysis of density-dependent forest mortality. For. Ecol. Manag. 2016, 359, 19–32.
  25. Dorman, M.; Perevolotsky, A.; Sarris, D.; Svoray, T. The effect of rainfall and competition intensity on forest response to drought: Lessons learned from a dry extreme. Oecologia 2015, 177, 1025–1038.
  26. Das, A.; Battles, J.; Stephenson, N.L.; van Mantgem, P.J. The contribution of competition to tree mortality in old-growth coniferous forests. For. Ecol. Manag. 2011, 261, 1203–1213.
  27. Kayama, M.; Quoreshi, A.M.; Kitaoka, S.; Kitahashi, Y.; Sakamoto, Y.; Maruyama, Y.; Kitao, M.; Koike, T. Effects of deicing salt on the vitality and health of two spruce species, Picea abies Karst., and Picea glehnii Masters planted along roadsides in northern Japan. Environ. Pollut. 2003, 124, 127–137.
  28. Horsley, S.B.; Long, R.P.; Bailey, S.W.; Hallett, R.A.; Wargo, P.M. Health of eastern North American sugar maple forests and factors affecting decline. North. J. Appl. For. 2002, 19, 34–44.
  29. Trumbore, S.; Brando, P.; Hartmann, H. Forest health and global change. Science 2015, 349, 814–818.
  30. Percy, K.E.; Ferretti, M. Air pollution and forest health: Toward new monitoring concepts. Environ. Pollut. 2004, 130, 113–126.
  31. Stravinskienė, V.; Bartkevičius, E.; Abraitienė, J.; Dautartė, A. Assessment of Pinus sylvestris L. tree health in urban forests at highway sides in Lithuania. Glob. Ecol. Conserv. 2018, 16, e00517.
  32. Meddens, A.J.H.; Hicke, J.A.; Vierling, L.A.; Hudak, A.T. Evaluating methods to detect bark beetle-caused tree mortality using single-date and multi-date Landsat imagery. Remote Sens. Environ. 2013, 132, 49–58.
  33. Shearman, T.M.; Varner, J.M.; Hood, S.M.; Cansler, C.A.; Hiers, J.K. Modelling post-fire tree mortality: Can random forest improve discrimination of imbalanced data? Ecol. Model. 2019, 414, 108855.
  34. Pasquarella, V.J.; Elkinton, J.S.; Bradley, B.A. Extensive gypsy moth defoliation in Southern New England characterized using Landsat satellite observations. Biol. Invasions 2018, 20, 3047–3053.
  35. Enser, R.; Gregg, D.; Sparks, C.; August, P.; Jordan, P.; Coit, J.; Raithel, C.; Tefft, B.; Payton, B.; Brown, C.; et al. Rhode Island Ecological Communities Classification; Technical Report; Rhode Island Natural History Survey: Kingston, RI, USA, 2011; 33p.
  36. National Climatic Data Center (NCDC). Climates of the States, Volume 1; U.S. Government Publishing Office (Climatography of the United States): Washington, DC, USA, 1971; Available online: https://books.google.com/books?id=pfHFwgEACAAJ (accessed on 5 January 2025).
  37. United States Census Bureau (USCB). Historical Population Density Data (1910–2020). 2020. Available online: https://www.census.gov/data/tables/time-series/dec/density-data-text.html (accessed on 18 December 2024).
  38. Riely, C.; Sayles, K.; Burr, J. The Value of Rhode Island Forests. 2019. Rhode Island Tree Council. Available online: https://dem.ri.gov/sites/g/files/xkgbur861/files/programs/bnatres/forest/pdf/forest-value.pdf (accessed on 18 December 2024).
  39. United States Geological Survey (USGS). Earth Explorer. Available online: https://earthexplorer.usgs.gov/ (accessed on 5 January 2025).
  40. Rhode Island Geographic Information Systems (RIGIS). Rhode Island Maps and Data Geospatial Hub. 2024. Available online: www.rigis.org (accessed on 5 January 2025).
  41. National Drought Mitigation Center (NDMC). U.S. Drought Monitor (USDM). Available online: https://www.drought.gov/data-maps-tools/us-drought-monitor (accessed on 18 December 2024).
  42. National Oceanographic and Atmospheric Administration (NOAA). Coastal Change Analysis Program (C-CAP). High-Resolution Land Cover Dataset for Rhode Island. 2020. Available online: https://chs.coast.noaa.gov/htdata/raster1/landcover/bulkdownload/hires/ri/ (accessed on 18 December 2024).
  43. Rullan-Silva, C.; Olthoff, A.; Delgado de la Mata, J.; Pajares-Alonso, J. Remote Monitoring of Forest Insect Defoliation—A Review. For. Syst. 2013, 22, 377–391.
  44. Ma, Q.; Su, Y.; Guo, Q. Comparison of Canopy Cover Estimations from Airborne LiDAR, Aerial Imagery, and Satellite Imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 4225–4236.
  45. Barbosa, P.; Capinera, J.L. Population quality, dispersal and numerical change in the gypsy moth, Lymantria dispar (L.). Oecologia 1978, 36, 203–209.
  46. Baughman, M.J.; Russell, M. Woodland Stewardship: A Practical Guide for Midwestern Landowners, 3rd ed.; University of Minnesota Extension: St. Paul, MN, USA, 2019; Available online: https://open.lib.umn.edu/woodlandstewardship/chapter/how-trees-and-woodlands-grow/ (accessed on 14 December 2024).
  47. Zhang, X.; Jiao, J.J.; Guo, W. How Does Topography Control Topography-Driven Groundwater Flow? Geophys. Res. Lett. 2022, 49, e2022GL101005.
  48. Opedal, Ø.H.; Armbruster, W.S.; Graae, B.J. Linking small-scale topography with microclimate, plant species diversity and intra-specific trait variation in an alpine landscape. Plant Ecol. Divers. 2015, 8, 305–315.
  49. Dunn, C.P.; Stearns, F. A comparison of vegetation and soils in floodplain and basin forested wetlands of southeastern Wisconsin. Am. Midl. Nat. 1987, 118, 375–384.
  50. Environmental Systems Research Institute. Slope (Spatial Analyst). 2024. Available online: https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-analyst/slope.htm (accessed on 18 December 2024).
  51. De Reu, J.; Bourgeois, J.; Bats, M.; Zwertvaegher, A.; Gelorini, V.; De Smedt, P.; Chu, W.; Antrop, M.; De Maeyer, P.; Finke, P.; et al. Application of the topographic position index to heterogeneous landscapes. Catena 2013, 103, 31–39.
  52. Bottero, A.; D’Amato, A.W.; Palik, B.J.; Bradford, J.B.; Fraver, S.; Battaglia, M.A.; Asherin, L.A. Density-dependent vulnerability of forest ecosystems to drought. J. Appl. Ecol. 2017, 54, 1605–1614.
  53. Ramsfield, T.D.; Bentz, B.J.; Faccoli, M.; Jactel, H.; Brockerhoff, E.G. Forest health in a changing world: Effects of globalization and climate change on forest insect and pathogen impacts. Forestry 2016, 89, 245–252.
  54. Choat, B.; Brodribb, T.J.; Brodersen, C.R.; Duursma, R.A.; López, R.; Medlyn, B.E. Triggers of tree mortality under drought. Nature 2018, 558, 531–539.
Figure 1. The study area consisted of the state of Rhode Island (red dot) in the northeastern USA.
Figure 2. Defoliation in Rhode Island after Spongy moth invasion in 2015–2017.
Figure 3. Maps of mortality prediction based on 2-class model (left) and 3-class model (right).
Figure 4. A SHAP beeswarm plot of the 6 most important predictors for the high-mortality class of the 3-class model. Positive SHAP values (x-axis) indicate the feature values (high—red, low—blue) that are more strongly associated with the class. The green line indicates where the SHAP values are zero.
Figure 5. SHAP partial dependence scatter plots of the 6 most important predictors for the high-mortality class of the 3-class model. Green lines indicate where the SHAP values are zero.
Table 4. Classes of tree mortality in training/validation dataset.

| Class | Dead Trees/ha | Number of Tiles (Dataset 1) | Number of Tiles (Dataset 2) | Total Number of Tiles |
|---|---|---|---|---|
| 0 | 0 | 258 | 20 | 278 |
| 1 | 1–2 | 299 | 50 | 349 |
| 2 | 3–5 | 239 | 101 | 340 |
| 3 | 6–10 | 222 | 176 | 398 |
| 4 | 11–20 | 152 | 272 | 424 |
| 5 | >20 | 8 | 523 | 531 |
Table 5. Accuracy assessment of defoliation model based on 200 sample locations.

| Year | Validation Imagery | Defoliation Type | F1 Score | Sample Size |
|---|---|---|---|---|
| 2015 | Landsat | Partial + complete | 0.79 | 113 |
| 2015 | Landsat | Complete | 0.84 | 110 |
| 2016 | Aerial | Partial + complete | 0.87 | 167 |
| 2016 | Aerial | Complete | 0.86 | 136 |
| 2017 | Sentinel | Partial + complete | 0.76 | 188 |
| 2017 | Sentinel | Complete | 0.86 | 154 |
Table 6. Accuracy assessment for 2-class models (low and high mortality) with varying numbers of variables. Metrics include overall accuracy (OA), average 5-fold cross-validation accuracy (XV), kappa (kp), F1 score (F1), precision (P), and recall (R).

| Number of Variables | OA | XV | kp | Low F1 | Low P | Low R | High F1 | High P | High R |
|---|---|---|---|---|---|---|---|---|---|
| 21 | 0.81 | 0.78 | 0.61 | 0.77 | 0.82 | 0.73 | 0.84 | 0.81 | 0.87 |
| 7 | 0.82 | 0.77 | 0.63 | 0.77 | 0.78 | 0.76 | 0.86 | 0.85 | 0.86 |
| 3 | 0.79 | 0.77 | 0.56 | 0.73 | 0.77 | 0.70 | 0.83 | 0.80 | 0.85 |
| 1 | 0.72 | 0.74 | 0.42 | 0.67 | 0.66 | 0.67 | 0.75 | 0.76 | 0.75 |
Table 7. Accuracy assessment for 3-class models (low, medium, and high mortality) with varying numbers of variables. Metrics include overall accuracy (OA), average 5-fold cross-validation accuracy (XV), kappa (kp), F1 score (F1), precision (P), and recall (R).

| Number of Variables | OA | XV | kp | Low F1 | Low P | Low R | Medium F1 | Medium P | Medium R | High F1 | High P | High R |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 21 | 0.64 | 0.60 | 0.46 | 0.67 | 0.63 | 0.72 | 0.47 | 0.54 | 0.41 | 0.75 | 0.71 | 0.80 |
| 7 | 0.65 | 0.61 | 0.47 | 0.65 | 0.67 | 0.64 | 0.50 | 0.45 | 0.51 | 0.77 | 0.77 | 0.70 |
| 3 | 0.63 | 0.60 | 0.43 | 0.65 | 0.66 | 0.64 | 0.48 | 0.45 | 0.51 | 0.74 | 0.77 | 0.48 |
| 1 | 0.56 | 0.55 | 0.33 | 0.60 | 0.51 | 0.73 | 0.30 | 0.23 | 0.40 | 0.68 | 0.67 | 0.70 |
Table 8. Confusion matrix for 2-class model with 7 variables.

| Actual \ Predicted | Low | High | Omission Error |
|---|---|---|---|
| Low | 209 | 83 | 28% |
| High | 49 | 323 | 13% |
| Commission error | 19% | 20% | |
Table 9. Confusion matrix for 3-class model (7 variables).

| Actual \ Predicted | Low | Med. | High | Omission Error |
|---|---|---|---|---|
| Low | 127 | 44 | 16 | 32% |
| Med. | 56 | 101 | 55 | 52% |
| High | 10 | 48 | 207 | 22% |
| Commission error | 34% | 48% | 26% | |
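The omission and commission errors in Table 8 and Table 9 follow directly from the tile counts. As a minimal sketch, the helper below (`error_rates`, a name chosen here for illustration) computes both error types and the overall accuracy from a confusion matrix whose rows are actual classes and whose columns are predicted classes.

```python
from typing import List, Tuple


def error_rates(matrix: List[List[int]]) -> Tuple[List[float], List[float], float]:
    """Return (omission errors, commission errors, overall accuracy).

    matrix[i][j] is the number of tiles of actual class i predicted as class j.
    Omission error is the share of an actual class that was missed (row-wise);
    commission error is the share of a predicted class that was wrong (column-wise).
    """
    n = len(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    omission = [1 - matrix[i][i] / row_totals[i] for i in range(n)]
    commission = [1 - matrix[j][j] / col_totals[j] for j in range(n)]
    overall = sum(matrix[i][i] for i in range(n)) / sum(row_totals)
    return omission, commission, overall


# Counts copied from Table 8 (low, high) and Table 9 (low, medium, high).
two_class = [[209, 83], [49, 323]]
three_class = [[127, 44, 16], [56, 101, 55], [10, 48, 207]]

for name, cm in (("2-class", two_class), ("3-class", three_class)):
    omission, commission, overall = error_rates(cm)
    print(name)
    print("  omission (%):  ", [round(100 * e) for e in omission])
    print("  commission (%):", [round(100 * e) for e in commission])
    print("  overall accuracy:", round(overall, 3))
```

For the three-class matrix, for example, the medium class has 212 actual tiles of which 101 were classified correctly, giving the 52% omission error shown in Table 9.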
Table 10. Relative importance of predictors used for the 21-variable models. Higher scores indicate greater importance and the scores sum to 1.

| Predictors | 2-Class Model | 3-Class Model |
|---|---|---|
| Defoliation index | 0.25 | 0.21 |
| Coast proximity | 0.16 | 0.13 |
| Canopy cover | 0.08 | 0.09 |
| Evergreen cover | 0.06 | 0.06 |
| Drought index | 0.06 | 0.06 |
| Urban proximity | 0.06 | 0.06 |
| Lake proximity | 0.05 | 0.05 |
| Valley bottoms | 0.04 | 0.04 |
| Hilltops | 0.04 | 0.04 |
| Steep slopes | 0.03 | 0.04 |
| Hydric soils | 0.02 | 0.03 |
| Stony soils | 0.02 | 0.03 |
| Restrictive soils | 0.02 | 0.02 |
| Shallow bedrock | 0.02 | 0.02 |
| Steep south-facing slopes | 0.01 | 0.02 |
| Plantations | 0.01 | 0.01 |
| Till | <0.01 | <0.01 |
| Outwash | <0.01 | <0.01 |
| Overdrained soils | <0.01 | <0.01 |
| Eroded soils | <0.01 | <0.01 |
| Ruderal forest | <0.01 | <0.01 |
Table 11. Relative importance of predictors used for the 7-variable models. Higher scores indicate greater importance and the scores sum to 1.

| Predictors | 2-Class Model | 3-Class Model |
|---|---|---|
| Defoliation index | 0.30 | 0.27 |
| Coast proximity | 0.23 | 0.20 |
| Canopy cover | 0.11 | 0.13 |
| Drought index | 0.10 | 0.09 |
| Evergreen cover | 0.09 | 0.09 |
| Urban proximity | 0.09 | 0.11 |
| Lake proximity | 0.08 | 0.10 |
Table 12. Relative importance of predictors used for the 3-variable models. Higher scores indicate greater importance and the scores sum to 1.

| Predictors | 2-Class Model | 3-Class Model |
|---|---|---|
| Defoliation index | 0.38 | 0.37 |
| Coast proximity | 0.36 | 0.36 |
| Canopy cover | 0.27 | 0.28 |
