The Difﬁculty of Predicting Eastern Spruce Dwarf Mistletoe in Lowland Black Spruce: Model Benchmarking in Northern Minnesota, USA

: Insects, fungi, and diseases play an important role in forest stand development and subsequently, forest management decisions and treatments. As these disturbance agents commonly occur within and across landscapes, modeling has often been used to inform forest planning and management decisions. However, models are rarely benchmarked, leaving questions about their utility. Here, we assessed the predictive performance of a Bayesian hierarchical model through on– the-ground sampling to explore what features of stand structure or composition may be important factors related to eastern spruce dwarf mistletoe ( Arceuthobium pusillum Peck) presence in lowland black spruce ( Picea mariana (Mill.) B. S. P.). Twenty-ﬁve state-owned stands included in the predictive model were sampled during the 2019 and 2020 growing seasons. Within each stand, data related to the presence of eastern spruce dwarf mistletoe, stand structure, and species composition were collected. The model accurately predicted eastern spruce dwarf mistletoe occurrence for 13 of the 25 stands. The amount of living and dead black spruce basal area differed signiﬁcantly based on model prediction and observed infestation, but trees per hectare, total living basal area, diameter at breast height, stand age, and species richness were not signiﬁcantly different. Our results highlight the beneﬁts of model benchmarking to improve model interpretation as well as to inform our understanding of forest health problems across diverse stand conditions.


Introduction
Insects, fungi, and diseases are important components of forest ecosystems and are frequently at the center of discussions regarding forest health. During the maximum sustained yield era of forestry, these disturbance agents were largely considered threats that needed to be "controlled" to reduce the losses to timber production and quality [1]. However, as goals and objectives have expanded beyond commercial timber production, forest health has shifted to exploring how insects, diseases, and other disturbances agents influence stand structure, composition, and development [2]. Here, we define forest health as the maintenance of resilience, defined as the recurrence and persistence of processes that lead to sustainable ecological conditions as well as timber and non-timber forest products [3].
As biotic disturbance agents interact in novel ways with changing climate conditions, non-native forest health threats continue to be introduced, and data on the subsequent impact on stand development becomes better understood [4][5][6][7][8][9][10], more information about the distribution, prevalence, and effects of these disturbance agents is being sought. This information, in turn, is used to inform forest management decisions that increasingly require adaptability and flexibility [11,12].
Modeling is often used to supplement data collected through forest inventories or more intensive sampling for a specific disease or insect [13]. However, in some forest communities, locations may be difficult to access or prohibitively expensive to sample [14]. Additionally, forest cover types can cover large areas, increasing the utility of modeling. The ability to model systems using previously collected forest inventory data allows researchers and natural resource managers to address many different forest health questions. Modeling has been used to predict the occurrence of forest disturbance agents [15], risk factors [8,16], rates and patterns of spread [17], and their effects on growth [1,2,18], among other objectives.
While models can provide insight into the distribution, potential impact, and other aspects of biotic disturbance agents related to forest management, they are still mathematical representations of systems, not the systems themselves [19]. As such, each model should be carefully evaluated [20]. Evaluation of a model can include, but is not limited to, examining the data used to create the model (what the original purpose was, etc.), asking if model complexity reflects the system or our understanding of the system [21], and, critically, benchmarking to assess model performance [20,22].
A system where modeling of forest health threats has occurred is in lowland black spruce (Picea mariana (Mill.) B. S. P.) forests of Minnesota that are infected with eastern spruce dwarf mistletoe (Arceuthobium pusillum Peck; hereafter ESDM) [15,17]. Lowland black spruce is an important component of the southern boreal forest, providing many ecological and economic benefits [23][24][25][26]. ESDM, the only dwarf mistletoe on Picea spp. in Minnesota, is a native, parasitic plant that has long been present on the landscape and is an important but not well understood component of disturbance regimes in black spruce forests [27]. In the region, black spruce is the main host of ESDM; though, at extremely high levels of infestation, it has been known to also parasitize tamarack (Larix larcina (Du Roi) K. Koch) [28].
Historically, black spruce forests were affected by large, stand-replacing disturbances (e.g., fire) at infrequent intervals with smaller, more frequent disturbances occurring at shorter intervals [29,30]. These fires are hypothesized to have maintained ESDM populations at a lower level than is currently observed, as ESDM requires a living host to survive and reproduce [27]. With fire largely absent from the landscape, clear-cut regeneration harvest is now the most common stand-replacing disturbance, resulting in mostly evenaged stands that often are affected by windthrow or ESDM infestations between rotations. Management of ESDM infestations occurs at the time of regeneration harvest, with best practices recommending cutting everything at least 1.5-m tall [31].
Long-range dispersal of ESDM by birds and other species such as small mammals is hypothesized to result in new infection locations [32]. When a black spruce tree is infected with ESDM, there is an initial reduction in growth accompanied by the development of witches' brooms-bushy, compact masses of branches and twigs that grow at the site of infection [27]. After 15 to 20 years, 75% of black spruce will succumb to mortality [33]. Infestations spread radially within stands and create a unique spatial distribution of ESDM at both the stand and landscape levels.
This spatial distribution of ESDM has been the focus of much of the modeling done in this system. Baker et al. [17] developed a model (DMLOSS) to predict the within-stand spread and resulting black spruce volume loss; their model estimated significant volume loss over the course of a single rotation using previously published rates of spread and mortality [33]. Moving from the stand level to the landscape, Hanks and colleagues [15], developed a model to predict the occurrence of ESDM utilizing a Bayesian hierarchical framework. However, this model has not been benchmarked to assess the model's predictive performance. Our objective is to assess the predictive performance of the model through on-the-ground sampling to explore what features of stand structure or composition may be important factors related to ESDM infection modeling at the landscape scale.

Study Area
Northern Minnesota, USA, falls within the Laurentian Mixed Deciduous forest and is a transition zone between northern hardwood forests to the south and east and boreal forests to the north [34]. The growing season in the region is short, ranging from 98 to 111 days [35]. The average annual temperature is between 1 • C and 4 • C; winter temperatures regularly reach −31 • C while summer temperatures above 30 • C have been recorded. Precipitation ranges between 650 and 800 mm per year, with the majority occurring during the summer months as rain [36].
The region contains mixed hardwood and conifer forests, as well as conifer swamps and bogs. Conifer bogs and swamps include peatland forests, which have deep, acidic, nutrient-poor, and poorly drained soils and are dominated by very few tree species, including black spruce and tamarack [30,35].
Black spruce peatlands are an important component of the region's forests, comprising 648,000 hectares of the 7.04 million hectares of forest in the state [37], occurring in both managed and unmanaged stands. When managed, black spruce is often found in pure stands with an even-aged distribution. However, black spruce can also occur in mixed stands with tamarack and balsam fir (Abies balsamea L. Mill.) with components of birch (Betula spp.), quaking aspen (Populus tremuloides Michx.), and northern white cedar (Thuja occidentalis L.) which are more likely to be multi-aged [35].

ESDM Model
Given that ESDM is one of the major disturbance agents of black spruce forests, there was interest in modeling the probability of its occurrence. Hanks et al. [15] developed a Bayesian hierarchical model to predict the probability of occurrence of ESDM for 25,235 state-owned stands across northeastern Minnesota, USA, using multiple data sets.  (Table 1). This represents the only published predictive model of ESDM occurrence in the study region. However, the predictive accuracy of the model has not been assessed. Aspen cover type Jack pine cover type Spatial autocovariate (ac y ) Notes: The variable y is the presence (y i = 1) or absence (y 1 = 0) of mistletoe in a stand, as identified in the intensive survey; Φ is the probability of the DNR survey finding mistletoe if it is present in the intensive survey; and Ψ is the probability of the DNR survey reporting mistletoe present if it is not present in the intensive survey. "DNR"is the Minnesota Department of Natural Resources.

Model Benchmarking
To assess the accuracy of the ESDM model [15], we applied the predicted probability of ESDM occurrence in black spruce stands to the MN DNR FIM database. Using ArcMap 10.5.1 (ESRI) twenty-five stands for on-the-ground sampling were selected ( Figure 1). Selection criteria included stand size (≥4.04 hectares in area), age (between 60 and 100 years old at the time of the survey), dominant species (black spruce; as denoted by the DNR), surveyed since at least 1995, and within 200 m of a navigable road. This sampling frame was further reduced after selecting an equal number of stands predicted to either have ESDM present or absent. A greater than 80% predicted probability of ESDM occurrence was used to label stands where ESDM was predicted to occur. Stands with less than 20% predicted probability of occurrence were labeled as ESDM free. These percentages were chosen as conservative thresholds indicating presence or absence.
was used to label stands where ESDM was predicted to occur. Stands with less than 20% predicted probability of occurrence were labeled as ESDM free. These percentages were chosen as conservative thresholds indicating presence or absence.
Twenty-five black spruce stands were sampled during the summers of 2019 and 2020; 13 predicted to be ESDM free, and 12 predicted to have ESDM present. Within each stand, sampling occurred across three units, each comprised of six plots, consistent with previous work in this system [28]. The location of the first unit was selected by placing a GPS point within at least 30 m of the stand boundary. The second unit was 100 m away from the sixth plot of the first unit. The third unit was 100 m away from the final plot of the second unit, and all azimuths were chosen to fit within the shape of the stand (given many have highly irregular shapes). Each unit contained six 1/100th hectare overstory plots spaced 30 m × 60 m in a 2 × 3 design; in some irregularly shaped stands, the rectangular configuration could not be met, but the number of plots and spacing remained consistent.
Within each overstory plot, all trees with a diameter at breast height (DBH) of at least 3 cm were measured. For each tree, species, DBH, status (alive or dead), live crown ratio, and presence or absence of ESDM was recorded. On trees with ESDM present, the proportion of crown occupied by witches' brooms was recorded (<50%, ~50%, or >50%) in a simplified version of the dwarf mistletoe rating system [38]. A simplified version was utilized because, unlike other mistletoes, any level of ESDM will result in the eventual mortality of the black spruce. Height was measured on every fourth tree, provided the tree remained upright (leaning less than 45° from vertical) and the crown was unbroken. Height of snags was also recorded as long as these criteria were met.  Twenty-five black spruce stands were sampled during the summers of 2019 and 2020; 13 predicted to be ESDM free, and 12 predicted to have ESDM present. Within each stand, sampling occurred across three units, each comprised of six plots, consistent with previous work in this system [28]. The location of the first unit was selected by placing a GPS point within at least 30 m of the stand boundary. The second unit was 100 m away from the sixth plot of the first unit. The third unit was 100 m away from the final plot of the second unit, and all azimuths were chosen to fit within the shape of the stand (given many have highly irregular shapes). Each unit contained six 1/100th hectare overstory plots spaced 30 m × 60 m in a 2 × 3 design; in some irregularly shaped stands, the rectangular configuration could not be met, but the number of plots and spacing remained consistent.
Within each overstory plot, all trees with a diameter at breast height (DBH) of at least 3 cm were measured. For each tree, species, DBH, status (alive or dead), live crown ratio, and presence or absence of ESDM was recorded. On trees with ESDM present, the proportion of crown occupied by witches' brooms was recorded (<50%,~50%, or >50%) in a simplified version of the dwarf mistletoe rating system [38]. A simplified version was utilized because, unlike other mistletoes, any level of ESDM will result in the eventual mortality of the black spruce. Height was measured on every fourth tree, provided the tree remained upright (leaning less than 45 • from vertical) and the crown was unbroken. Height of snags was also recorded as long as these criteria were met.

Analysis
The accuracy of the model was assessed by creating a confusion matrix and calculating overall accuracy, type I error, and type II error. Additionally, stand structure characteristics were compared across predictive groups (true negative, false negative, false positive, true positive) using an analysis of variance and Tukey's honest significant difference test when significance was found. All analyses were performed in R [39].

Results
Of the 25 stands sampled, the model accurately predicted the presence or absence of ESDM for 13 stands-7 stands without infestation and 6 stands with infestation. We found an overall error rate of 48% and type I and II error rates of 46% (Table 2). A comparison of stands across predicted and observed occurrence of ESDM shows groups diverging in few characteristics, while many remain similar (Table 3, Figure 2). There was no significant difference in trees per hectare across all predictive groups (p = 0.637, F-value = 0.593), nor was there a significant difference in total live basal area (p = 0.389, F-value = 1.056), diameter at breast height (p = 0.764, F-value = 0.386), stand age (p = 0.603, F-value = 0.631), or species richness (p = 0.31, F-value = 1.271) ( Table 3). Table 3. Stand structure characteristics of the 25 sampled stands with or without eastern spruce dwarf mistletoe (ESDM), calculated from field surveys, and classified as true negative, false negative, false positive, and true positive. Unless otherwise specified, all values reflect those of living trees only. Means followed by the same letter are not significantly different by Tukey's HSD test at a 0.05 level of significance for predictive group.

Discussion
The results from this analysis showed the challenges of predicting the occurrence of ESDM, as indicated by the overall accuracy of 52% (Table 2). Model performance did not differ between absence and presence of ESDM (true negative rate = 54%, true positive rate = 50%, Table 2), indicating it is not biased towards either over or under prediction.
Some of the limitations of the model developed by Hanks and colleagues [15] for ESDM are common across many models. Data used to create the Hanks model represents Predictive groups diverged when looking at black spruce individually, with significant differences in both living black spruce basal area (p = 1.97 × 10 −6 , F-value = 10.09) and dead black spruce basal area (p = 2.42 × 10 −15 , F-value = 26.4) ( Table 3). The amount of living black spruce basal area was not significantly different between true negative and false negative groups, between false negative and false positive groups, nor between false positive and true negative groups (Table 3). However, the true negative group had significantly more living black spruce basal area than both the false positive and true positive groups; similarly, the true positive group had significantly less living black spruce basal area than both the true negative and false negative groups ( Table 3). The true positive group also had significantly more dead black spruce basal area than all other predictive groups (Table 3).

Discussion
The results from this analysis showed the challenges of predicting the occurrence of ESDM, as indicated by the overall accuracy of 52% (Table 2). Model performance did not differ between absence and presence of ESDM (true negative rate = 54%, true positive rate = 50%, Table 2), indicating it is not biased towards either over or under prediction.
Some of the limitations of the model developed by Hanks and colleagues [15] for ESDM are common across many models. Data used to create the Hanks model represents a single point in time, creating static predictions of ESDM occurrence. In this case, it is for the inventory year for each stand in the FIM dataset, which ranges from the early 1980s to the current year (all stands used in field verification were sampled no earlier than 1995; inventory year used in original modeling not available). Given the lag between stand inventory, model development, and field verification (up to 25 years in our analysis), some discrepancies would be expected. However, if this were the main reason for the inaccurate predictions, we would expect a lower false-positive rate and a higher false-negative rate (i.e., with the passing of time, more stands would have become infested).
With this temporal lag, it is also important to consider climate change and its potential effects on this disturbance regime as the elapsed time represents a period of shifting climate in the region. While the role of climate in the distribution and prevalence of ESDM has not been studied, the effects of climate change on black spruce are more well known. Overall, climate change is predicted to significantly negatively affect black spruce at the landscape scale [40], which would likely increase stress in black spruce and potentially the amount and rate of mortality due to ESDM infestation. However, the effects of climate change at the stand (and tree) level, may rely more on microsite conditions and water table dynamics [41] which could further increase the spatial complexity of ESDM across the state.
The patterns observed in stand structure and composition among all four predictive groups provide insight into ESDM presence. Most of the included stand structure characteristics did not differ significantly across predictive groups; the only significant differences were found in living and dead black spruce basal areas (Table 3). Live black spruce basal area shows a gradient of significant differences, with the true-negative group having more living black spruce basal area than both the false and true positive groups, the true positive group having significantly less black spruce basal area than the false and true negative groups and the false negative and positive groups having intermediate amounts (Figure 2). This gradient shows that, as stands become infested with ESDM, the amount of living black spruce declines, but that low levels of black spruce basal area do not always indicate infestation. Stands in the true positive group also had significantly more dead black spruce basal area than any other predictive group, perhaps indicating more advanced levels of infestation (and thus mortality due to ESDM infestation) (Figure 2).
The difference in patterns observed between the predicted/observed groups indicates that the Hanks model may be characterizing stand structure trends that could be, but are not necessarily, related to ESDM infestation or are only identifying stands at a specific stage of infestation. The model appears to be well suited to identifying stands with more established or more severe infestations, picking up trends of lower living black spruce basal area and higher dead black spruce basal area ( Figure 2). Our results show, however, that there may simply be greater variation in black spruce stand conditions than originally hypothesized, in both the presence and absence of ESDM. Though ESDM in black spruce has been studied for decades, it has largely been from a plant pathology perspective and focused on spread and mortality rates, rather than effects on stand structure and composition [33,42]. It is only in recent years that the effects of ESDM on stand structure have started to be explored [28].
This new research has shown that stands at different levels of ESDM infestation exhibit differing stand structures. Stands with no ESDM present are predominately composed of living black spruce, while stands with low (<50% of stand infested) levels of ESDM still have low levels of black spruce mortality as well as additional species becoming established, and stands with high (>50% of stand infested) levels of ESDM had both increased black spruce mortality as well as a greater number of species present [28]. These differences indicate that predicting the occurrence of ESDM without considering the severity or extent of the infestation may not be possible, as different levels of infestation appear to have differing effects on stand structure.
Though mortality of the dominant species was included as a variable in this model, all infested stands were included as one group (ESDM presence was coded as a binary variable), which reflects opinions that any ESDM in a stand will result in an undesirable loss of timber volume [17]. A better approach may be to include multiple levels of severity to better reflect how ESDM infestation progresses or to focus on a specific stage of infestation (establishment, for example) that may have a more unique signature for a model to identify.
Our results show some of the difficulty of modeling the presence of disturbance agents that affect forest health-particularly those with strong spatial considerations. In modeling ESDM, mountain pine beetle, spruce budworm, or similar forest health threats, considering not just the presence or absence but the level of infestation could allow for more useful predictions [43,44]. Our results also indicate the value of looking beyond the main effect of the disturbance agent-mortality, in the case of ESDM-and considering broader ecosystem effects, such as species composition shifts or changes to stand development [28,45]. These considerations are important not only for predicting the occurrence of important disturbance agents but also for accurate growth and yield models that are necessary for forest management decision-making [2].
The Hanks et al. 2011 model, as well as our analysis, highlights the benefits of model benchmarking to improve model interpretation as well as the importance of understanding the data used in modeling and basing models on the ecological understanding of a system (here, how stand structure characteristics vary based on the severity of infestation). Our