Accumulated Heating and Chilling Are Important Drivers of Forest Phenology and Productivity in the Algonquin-to-Adirondacks Conservation Corridor of Eastern North America

Research Highlights: Forest phenology and productivity were responsive to seasonal heating and chilling accumulation, but responses differed across the temperature range. Background and Objectives: Temperate forests have responded to recent climate change worldwide, but the pattern and magnitude of response have varied, necessitating additional studies at higher spatial and temporal resolutions. We investigated climatic drivers of inter-annual variation in forest phenology and productivity across the Algonquin-to-Adirondacks (A2A) conservation corridor of eastern North America. Methods: We used remotely sensed indices from the AVHRR sensor series and a suite of gridded climate data from the Daymet database spanning from 1989–2014. We used random forest regression to characterize forest–climate relationships between forest growth indices and climatological variables. Results: A large portion of the annual variation in phenology and productivity was explained by climate (pR > 80%), with variation largely driven by accumulated heating and chilling degree days. Only very minor relationships with precipitation-related variables were evident. Conclusions: Our results indicate that anthropogenic climate change in the A2A has not yet reached the point of triggering widespread changes in forest phenology and productivity, but the sensitivity of forest growth to inter-annual variation in seasonal temperature accumulation suggests that more temperate forest area will be affected by climate change as warming continues.


Introduction
The phenology (i.e., timing of growth events, such as bud burst) and productivity (i.e., photosynthetic production over time) of temperate forests worldwide have responded to recent climate change [1,2]. Climate in eastern North America has generally become warmer and wetter over the past century [3][4][5]. Annual mean temperature has increased by approximately 1 • C since 1901, with most change occurring during winter [6]. The return rate of extreme temperature anomalies, especially unusually high winter temperatures, has increased in recent decades [7], corresponding to the overall increase in temperatures, which has largely been attributed to anthropogenic forcing [8]. Precipitation totals and intensity have also increased, especially in fall and spring, and models predict that this trend will continue [6,9]. A better understanding of the responses of forests to these recent changes, as well as the specific climatic variables which influence their growth, is necessary for an improved understanding of the impacts of continued climate change.
Temperate forest phenology is among the most sensitive indicators of environmental change, and is responsive to temperature cues [10]. In spring, phenological progression from bud burst to leaf development is cued by warm temperatures, accumulated over time [10]. In fall, progression from photosynthetically active foliage to leaf senescence and/or abscission are also largely cued by temperature but also precipitation and photoperiod, which serves as a limiting factor for the end-of-season date [11]. Together, the dates of the start and end of the growing season determine the overall length of the growing season, drivers of forest growth (ex., [29]). By modelling drivers of inter-annual variation, rather than drivers of trends, climatic anomalies can be treated as natural experiments and thus provide greater insight into the role of such extremes, to which forest phenology and productivity are likely to respond [32]. The challenge with this approach, however, is that it is computationally intense, and the potentially large number of interacting variables are not well suited to traditional linear modeling. Climatological variables are often highly inter-correlated, so it is difficult to identify and differentiate responses to specific aspects of climate. Machine learning approaches, such as random forest (RF), have become increasingly popular for modelling the influence of many interrelated independent variables. Rodriguez-Galiano et al. [29] modelled inter-annual variation in European forest phenology using RF regression, and they found that phenology-climate models had higher R 2 (68%-90% vs. 39%-68%, respectively) and lower root mean squared error (0.92-0.43 vs. 1.04-1.40, respectively) than conventional linear models.
Our aim in this study was to examine the influence of climate on temperate forest growth at a regional scale in North America. We examined spring and fall phenology, growing season length, and foliar productivity, and compared these annual metrics to a suite of climatic variables hypothesized to influence them. We focused on the Algonquin-to-Adirondacks conservation corridor (A2A), which contains large diverse areas of temperate forest. Our research questions were: to what extent does climate influence the interannual variation in phenology and productivity of forests in the A2A region and what climate variables are the most important in driving variation? In order to address these questions, we required a method that would support analysis of a large number of inter-correlated climate variables, identify their relative importance, and characterize their individual influence on forest growth. As such, we applied RF regression modeling, which has been shown to be effective for this purpose [29].

Study Area
A2A is a continentally significant conservation corridor, which spans the Canada-United States border at the eastern extent of the Great Lakes in eastern North America (see Figure 1) [33]. It has become the focus of international conservation efforts because of its high concentration of large interconnected natural areas, many of which are protected forests. These forests are highly diverse, even in the context of the already high diversity of eastern North America. The land area of A2A is >135,000 km 2 , of which 61% is forests (calculated using [34]). Most forests are deciduous (32% of total area), dominated by maples (Acer rubrum and Acer saccharum), red oak (Quercus rubra), and American Beech (Fagus grandifolia) [35,36]. Coniferous forests represent a smaller proportion (9% of total area), mainly in high elevation sites in Adirondacks and riparian areas, and are dominated by pines (Pinus strobus and Pinus banksiana), spruces (Picea rubens, Picea glauca, and Picea mariana), balsam fir (Aibes balsamea), eastern hemlock (Thuja canadensis), and white cedar (Thuja occidentalis). Communities containing a mixture of deciduous and coniferous species represent more area than coniferous communities alone (21% of the total area). Human settlement and agricultural lands are mainly in lowlands near the St Lawrence River.

Forest Phenology and Productivity Indices
Forest growth data products based on remotely sensed photosynthetic activity were obtained from the United States Geological Survey (USGS) Earth Resources Observation and Science Center. These data were developed using the Advanced High Resolution Radiometer sensor series (AVHRR), which presented a relatively long temporal window for trend analysis (1989 to 2014) and had a spatial resolution (1 km 2 ) relevant to analysis of forest landscapes. Indices were derived from annually compiled weekly composites of the normalized difference vegetation index, commonly referred to as NDVI [37].
Of the variables available from the USGS, we selected four 'forest growth indices', which were representative of the productivity and phenology of forests in A2A. We selected one productivity index: time integrated NDVI (TIN); and three phenology indices: length of growing season (LOS), start of growing season (SOS), and end of growing season (EOS). Details on each index are found in Table 1. Remotely sensed phenology indices at the spatial resolution of AVHRR do not capture distinct phenophases of individual plants represented within pixels, but rather a relative phenological change (ex., SOS may correspond to 25% leaf expansion on 50% of trees in a given pixel) of all vegetation within a given area (i.e., pixel) [38]. Likewise, TIN is a proxy metric of vegetation foliage productivity integrated across a given area (i.e., pixel) within a given annual growing season. In combination, these indices represent overall annual forest foliage production (which is represented over time as productivity, i.e., TIN), the portion of each year in which forests are growing (i.e., LOS), and the date of the start (i.e., SOS) and end (i.e., EOS) of the growing season, which both affect the LOS.
Only those pixels representing land dominated by forest were retained for analysis. Forested areas were identified by aggregating all forest classes in the 30 m resolution North American Landcover Classification [34] into a single 'forest' class. AVHRR pixels that were comprised of less than 50% forest were removed from further analysis.
Ji and Brown [39] found that the USGS AVHRR phenology data products are biased by the effects of satellite orbital drift, and they recommend removing data for 1992, 1993, 1994, 1999, and 2000. As such, orbital drift effects were removed by setting values in the affected layers as null, which reduced the number of available years from 26 to 21 but did not alter the timespan of the dataset.
We tested for trends in forest growth indices using least squares linear regression as part of exploratory data analysis. After accounting for temporal autocorrelation, we found no statistically significant (α = 0.05) trends in these forest growth indices over the study period (see Appendix A Table A2). Given the lack of forest growth trends we assumed that interannual variability in forest growth indices was not directional during the study period.
Z-scores were calculated for each forest growth index to represent inter-annual variability in forest growth (sensu [29]). The Z-score for a given year is calculated by scaling (subtracting the 1989-2014 mean, excluding the given year, from a given observation each year) and centering (dividing scaled values by the 1989-2014 standard deviation by pixel) the original observations. Z-scores representing little inter-annual variation (values within 1 standard deviation of the mean: −1 to 1) were removed because we were primarily interested in modelling drivers of variation in forest growth and Z-scores representing lesser variation were not as informative as those that we retained in that regard. Data manipulation was performed using the raster package [40] for R [41].

Climatological Variables
Climatological data were obtained from the Daymet database (available from daymet. ornl.gov, accessed on 4 July 2018), which offers gap-free, geospatially explicit, gridded climatological datasets covering North America [42]. Daymet datasets have a 1 km 2 spatial resolution, a daily temporal resolution, and data beginning in 1980, with more recent years being added on an ongoing basis. These contiguous daily data were produced by interpolating daily observations from meteorological stations [43].
Minimum and maximum air temperature (hereafter 'minimum temperature' and 'maximum temperature', respectively), total precipitation (hereafter 'precipitation'), snowwater equivalents (hereafter 'snowpack'), day length, and net incident shortwave radiation (hereafter 'solar radiation') were obtained directly from Daymet. These data were used to calculate several secondary climatic variables. Mean monthly temperature (hereafter 'mean temperature') was derived by calculating the mean daily temperature (mean temperature = (maximum temperature + minimum temperature)/2) and averaging over the course of the month. Three thermal time indices were also calculated: accumulated heating degree days (floor of 4 • C) before the start of season (hereafter 'spring heating'); accumulated heating degree days before the growing season maximum (hereafter 'summer heating'); and accumulated chilling degree days (ceiling of 20 • C) between the growing season maximum and the end of the growing season (hereafter 'fall chilling') [13]. Windows for accumulated spring heating (1 January to 31 May), summer heating (1 January to 31 July), and fall chilling (1 August to 31 October) were defined to accumulate heating from the beginning of heat accumulation in winter until the end of spring, until approximately the typical date of maximum greenness during summer, and following the maximum until after typical EOS. The typical date of maximum greenness and EOS dates were defined by the observed median date within A2A (1989 to 2014, not including years affected by orbital drift). Spring heating and summer heating were derived separately so that temperature accumulation would not be represented beyond the time relevant to SOS (as summer heating would be). Accumulation periods were defined arbitrarily by calendar date to capture thermal time within consistent periods for intuitive inter-annual comparison. The standardized precipitation-evapotranspiration index (hereafter 'SPEI') was calculated to assess climatological drought [44]. SPEI is a useful drought index because it effectively models these conditions using only temperature and precipitation data as inputs, and has comparable results to more elaborate models, such as the Palmer Drought Severity Index [45]. Also calculated was root-freeze-risk, which represents the number of days each year where there is potential for frost damage to the roots of trees (for calculation methods, see Table 2). It has been shown that vegetation is sensitive to frost damage that would otherwise have been avoided by snow cover [10]. To the best of our knowledge, no index such as this has been incorporated into a study of forest growth at this scale. A description of all climatic variables used is provided in Table 2.
Independent temporally and spatially specific accuracy assessment can be performed on Daymet data using the station-level cross-validation dataset from the Oak Ridge National Laboratory [47]. We used these data to assess the accuracy of the minimum air temperature, maximum air temperature, and precipitation amounts within our study period and area of interest (see Appendix A Table A1). The R 2 values of the linear models fit to data from stations within A2A were approximately 0.6 for precipitation and were > 0.9 for maximum air temperature and minimum air temperature. A lower R 2 for precipitation was not unexpected given that Daymet is thought to underestimate total precipitation at higher elevations [48]. Table 2. Climatic variables used as independent variables in forest-climate models, their definitions, and their temporal resolution following aggregation from a base daily resolution.

Day length The amount of time between dawn and disk (s) Monthly Mean
Solar radiation Mean incident radiation flux density (W/m 2 ) Monthly Mean

Modelling Schema
The relationship between climate and forest growth was modelled using RF regression [49]. RF is an ensemble machine learning method wherein the resulting model is the average of a large number of decision trees, which are constructed by recursive partitioning of bootstrap sub-samples of independent variables to describe the variance in the dependent variable [50]. RF maintains spatial and temporal independence through random sampling with replacement and ensures independence between independent variables by selecting the best variable to form the split at each node given a random subset of those independent variables. For a thorough description of RF, see [49,50].
A strength of RF is that the importance of given independent variables can be quantified by assessing changes in model accuracy when using different independent data to split training data within trees. Genuer et al. [51] proposed using RF's built-in variable importance ranking mechanism to identify the most important independent variables within a preliminary RF model before fitting a final more robust RF model using only those important variables. They tested this method on a variety of regression and classification problems using highly dimensional datasets and found that RF variable selection is able to reliably identify the most important variables within a dataset, and that models that used only important variables had greater accuracy than models using all variables, thus making the models more parsimonious. We employed such a two-step RF modelling approach by constructing a preliminary and final model for each forest growth index. The preliminary model was used to efficiently identify climatological variables that were important drivers of inter-annual variation in forest growth, and the final model was to robustly model those forest-climate relationships. A graphical depiction of this method is illustrated in Figure 2. RF models were constructed using the randomForest package for R [52] using sequential processing [53,54]. Flowchart of major steps in the forest-climate modelling workflow. This procedure was performed four times, once for each forest growth index. Inputs were AVHRR forest growth indices (the dependent variable) and pre-processed Daymet climatological data (the independent variables).

Training and Validation Data
Gridded datasets were converted to tabular format for model training and validation. Data from pixels for which the Z-scores of a given forest growth index were <−1 or >1 were included in the data tables along with the corresponding climatological data. Climatological variables included with each forest growth index were determined a priori based on their hypothesized influence. Once compiled, these data were split randomly into training (75%) and validation (25%) subsets using the caret package for R [55]. Genuer et al. [56] used large simulated and real-world datasets to train RF models, and found that models trained on larger subsample proportions-even up to 100% of the data-had lesser prediction error but required substantially greater computational time. We chose a 75% training subsample size to balance the need for low predictive error and training time limitations, while still obtaining a substantial independent validation subsample.

Model Training
RF models are relatively insensitive to changes in model hyperparameters relative to other machine-learning methods but still require appropriate settings to optimize model accuracy and efficiency [50]. The number of trees (ntree), minimum node size (nodesize), and number of variables tested at each node (mtry) hyperparameters can be set for RF models. Hastie et al. [50] showed that modifying mtry affects both the correlation between predictions made by individual regression trees and the mean squared error of the resulting RF model, and emphasize that modifying the mtry value affects the relative benefit of averaging trees in a forest. mtry was tuned by performing a 5-fold cross-validation of model root-mean-squared error (RMSE), and selecting the setting with the lowest RMSE, which was done using the caret package for R [55]. RF models are less sensitive to ntree and nodesize settings, so given computational restrictions we did not train these statistically but instead experimented with various settings to identify a suitable balance of model efficiency and accuracy for preliminary and final models.
Preliminary models were trained with datasets including all climatological variables that we hypothesized to be important drivers of each forest growth index. The ntree was set to 150, nodesize was set to 0.001% of the number of observations, and mtry was trained a priori. Fifteen variables with the highest variable importance ranks were selected from the results of preliminary models for use in final models. We iteratively explored different numbers of variables and found that 15 provided an adequate balance between model error and interpretability.
Final models were trained with datasets including the 15 variables selected in preliminary models. The ntree was set to 1000, nodesize was set to 0.001% of the number of observations, and mtry was trained a priori. Final RF models were intended to be more robust than preliminary models, and to be representative of the relationship between climate and forest growth.
Variable importance was ranked using RF's node purity metric (IncNodePurity), which is measured by taking the cumulative differences between the residual sum of squares before and after all splits using a given variable across all trees in a forest [49,52]. RF's 'out of the box' variable importance metrics are vulnerable to bias from independent data characteristics, namely variable intercorrelation and inconsistent measurement scale (reviewed in [57]). Variable importance bias can persist in RF models regardless of the variable selection metric chosen because of the inherent use of the Gini node purity metric to select appropriate independent variables with which to split data at nodes within trees [58]. Knowing that some bias may be present in a variable importance metric, we interpreted the results in terms of relative ranking patterns over absolute values from variable importance metrics. The IncNodePurity, expressed as a percentage, of the 15 most important independent variables was used to assess the relative importance of variables for each forest growth index.
These forest-climate relationships form final RF models were visualized using variable importance and partial dependence plots. The marginal influence of each important inde-pendent variable was plotted using partial dependence plots [59,60]. Partial dependence plots show the amplitude and shape of the response of the dependent variable to a given independent variable with the effects of other independent variables averaged but not ignored [49].

Model Validation
The strength of the models was assessed using the root mean squared error (RMSE), and the coefficient of determination, reported as the percent of variance explained by a model ( p R 2 ) and representing the predictive accuracy of each model. These metrics were calculated for preliminary and final models using withheld 25% validation datasets with the caret package for R [55]. Genuer et al. [56] tested various applications of RF modelling on big data, and recommended validating RF prediction accuracy using an independent dataset that is smaller than the training dataset instead of using model out-of-bag error, which they found to be an unreliable indication of model error in big data applications.

Model Characteristics
RF models, which used independent climatological data, explained a large proportion of the variation in forest growth indices of phenology and productivity in the A2A region ( p R 2 > 80%, see Table 3). RF models for each forest growth index were trained on different datasets, based on the forest pixels randomly included in the training set and the climatic variables that we hypothesized were important for each forest growth index (see Table 3). SOS was trained on the smallest temporal window (18 months) because months after the start of season were not relevant, while the LOS, EOS, and TIN models were trained on a larger window (23 months). SOS models had the fewest training observations (N = 393,369) following the removal of low-variance Z-scores, and the fewest independent variables (m = 151), whereas TIN had the most (N = 415,881, m = 197). TIN and LOS required less computational effort than SOS or EOS (mtry = 25 vs. 85 and 100, respectively). RMSE decreased and p R 2 increased from the preliminary models to the final models for all four forest growth indices (see Table 3). Table 3. Inputs and outputs from preliminary and final RF (Random Forest) models. Model Inputs are the independent variables considered in each model and the time period considered ( y−1 = lagged). N is the number of observations (pixels) used to train each model and was identical for the preliminary and final models. m is the number of independent variables used in each model. mtry is the number of randomly selected variables tested for best split at each node. RMSE is root-mean-squared error calculated using the validation dataset. p R 2 (pseudo-R 2 ) represents the percent of variance explained by the model. Forest growth indices: TIN = Time-integrated NDVI; LOS = length of growing season; SOS = start of season date; EOS = end of season date.

Forest-Climate Relationships
Spring and summer heating were the most important variables for SOS (spring), LOS, and TIN (summer) (see Table 4). Accumulated heating and chilling were ranked as being more important than monthly climate variables by a wide margin over other climatic variables considered here. Lagged effects of accumulated heating and chilling were generally ranked as being of lesser importance than current effects, except in the case of EOS, where lagged accumulated heating and chilling were ranked as more important than current accumulated heating. Greater accumulated heating resulted in earlier SOS and later EOS, though EOS was driven by greater current heating and lesser lagged heating (see Figure 3: AHS and AHM). LOS had a similar response to accumulated heating as EOS, with a larger response to greater current heat accumulation, which is consistent with the response of SOS. TIN responded negatively to heat accumulation below 1000 • C and responded positively above 1000 • C, with the largest responses in the mid-range of values, though TIN responded only positively to lagged heat accumulation. There was consistency in the inflection points of responses to accumulated heating, which appeared at approximately 900 and 1200 • C, and there was consistency in 0-crossovers at approximately 1000 • C. Table 4. Importance of climatic variables for modelling inter-annual variation in forest growth variables, ranked by scaled IncNodePurity. Variable importance (shown in the Imp. column for each final RF model) was scaled to a percentage (%) for comparison between final RF (Random Forest) models. Monthly climatic variables are labelled with the corresponding month, and lagged variables (i.e., from the previous year) are denoted with ' y−1 '. Forest growth indices: TIN = Timeintegrated NDVI; LOS = length of growing season; SOS = start of season date; EOS = end of season date. Climatic variables: TMX, TME, TMN = maximum, mean and minimum temperature; precipitation = PRT. Fall chilling accumulation was the most important variable for EOS, again by a wide margin over other climatic variables (see Table 4). Greater accumulated chilling resulted in earlier EOS, and lagged effects resulted in earlier SOS (see Figure 3: ACE). LOS responded positively to accumulated chilling, wherein lesser chilling resulted in longer LOS, and greater chilling also resulted in longer LOS to a lesser extent, but mid-range chilling had a neutral effect on LOS. LOS' response to chilling appears to be a rough approximation of the most pronounced effects of chilling on SOS and EOS, wherein greater chilling resulted in a longer LOS at low, and to a lesser extent at high, levels of chilling accumulation. For EOS and LOS, the lagged effects of chilling were similar to current effects but with a lesser amplitude. TIN responded positively to accumulated chilling, with mid to high chilling accumulation resulting in higher TIN, though lagged chilling has a minimal effect at high chilling accumulation. TIN's response to accumulated chilling mirrored the lagged response of SOS, which could indicate a relationship between these responses. Again, there was consistency in the inflection points of responses to accumulated chilling, which appeared at approximately 1200 and 1600 • C.

Rank
Monthly temperature variables, generally the mean, were ranked as being of some importance to forest phenology and productivity indices (see Table 4). The amplitude of their influence was less than that of accumulated heating and chilling (see Figure 3: TMN, TME, and TMX). SOS responded negatively to monthly temperature, whereas LOS and TIN responded positively, and EOS responses were relatively neutral. The relationships between monthly temperature and forest growth indices were generally similar in shape but were shifted along the x-axis depending on the range of temperatures in each month. Responses appeared in similar locations along the x-axis, indicating the range of monthly temperature means, minimums, and maximums that are most important for a given forest growth index.
Some climatic variables that we considered were not ranked among the 15 most important variables in preliminary RF models, and thus were not included in final RF models. Those variables were: drought, snowpack, root-freeze risk, solar radiation, and photoperiod. Precipitation was of some importance for SOS (rank 13/15, see Table 4).
orests 2021, 12, x FOR PEER REVIEW 10 of 18 sponse of SOS. TIN responded negatively to heat accumulation below 1000 °C and re sponded positively above 1000 °C, with the largest responses in the mid-range of values though TIN responded only positively to lagged heat accumulation. There was con sistency in the inflection points of responses to accumulated heating, which appeared a approximately 900 and 1200 °C, and there was consistency in 0-crossovers at approxi mately 1000 °C. Fall chilling accumulation was the most important variable for EOS, again by a wide margin over other climatic variables (see Table 4). Greater accumulated chilling resulted in earlier EOS, and lagged effects resulted in earlier SOS (see Figure 3: ACE). LOS re sponded positively to accumulated chilling, wherein lesser chilling resulted in longer LOS, and greater chilling also resulted in longer LOS to a lesser extent, but mid-range chilling had a neutral effect on LOS. LOS' response to chilling appears to be a rough ap proximation of the most pronounced effects of chilling on SOS and EOS, wherein greater chilling resulted in a longer LOS at low, and to a lesser extent at high, levels of chilling accumulation. For EOS and LOS, the lagged effects of chilling were similar to current ef . Partial dependence plots of important variables used in the final RF (Random Forest) models. Partial dependence (i.e., the marginal influence of a given variable with the effect of all other variables averaged out but not ignored) plots are grouped by variable class, and variables are divided by time (y = current, y − 1 = lagged). For plots with no lines, the corresponding variables were not ranked as important by preliminary RF models and were therefore not included in the final RF models. Forest growth indices: TIN = time-integrated NDVI; LOS = length of growing season; SOS = start of growing season; EOS = end of growing season. Climatic variables: AHS = accumulated heating (SOS); AHM = accumulated heating; ACE = accumulated chilling; TMX, TME, and TMN = maximum, mean and minimum temperature; PRT = precipitation.

Discussion
We assessed the inter-annual response of forest phenology and productivity indices to climatological variables in the forests of A2A from 1989-2014 with RF regression models, which allowed us to rank the importance of climatological variables for forest growth and characterize forest growth responses to climatological variables using partial dependence plots. We found that air temperature variables, in particular seasonal heating and chilling accumulation, explained much of the inter-annual variation in forest phenology and productivity in A2A. Forest growth responses to heating and chilling accumulation were non-linear, and variable across their range, and lagged responses differed from current responses.
A large proportion of inter-annual variation in phenology indices was explained by climatic variables. Accumulated heating and chilling were the most important variables in phenology models by a notable margin, especially for SOS and EOS. Generally, greater heat accumulation was associated with earlier SOS in spring and later EOS in fall, and greater chilling accumulation was associated with earlier SOS in the subsequent spring and earlier EOS in fall. This is broadly consistent with a large body of literature that has developed indicating that accumulated heating and chilling drives phenological development for temperate vegetation (reviewed in [10,61]). This is also consistent with local-scale field-based phenology observations, such as those by Richardson et al. [13], who found that logistic models based on accumulated warming degree-days (>4 • C) explained 90% of the variation in spring and fall phenophase progression timing in sugar maple, yellow birch, and American beech trees. Similarly, Yu et al. [62] trained species-level multiple linear regression predictive models of phenophase progression from field observations of deciduous trees, and found that models using accumulated growing degree days (>3 • C) were effective for simulating phenological progression in spring and fall. Our results demonstrate a similar relationship across A2A, which contains a diverse mixture of deciduous as well as coniferous trees. There has been some experimental work on differentiating species-specific sensitivity of different deciduous tree species' phenology to climatic cues (ex. [26]), and we recommend future work on differentiating the responses of different forest communities at a regional scale, especially for largely coniferous communities. LOS response curves appeared to include the combined responses of SOS and EOS to temperature, which is logical because SOS and EOS dictate the endpoints of LOS. However, variation in LOS could be from either SOS or EOS so we considered them together here for context. Lower levels of accumulated heating were associated with later EOS and longer LOS in the following year, which could indicate a compensatory reaction to early fall senescence or short growing seasons assuming that conditions in the subsequent year are more favorable. EOS was later in years with lower chilling accumulation, indicating that forests responded opportunistically in warm years and continued their growth later into the fall. Interestingly, SOS was earlier in years when chilling was greater in the previous fall, suggesting that in years where EOS is early due to chilling, SOS may also be earlier.
Like phenology, we found that inter-annual variation in forest productivity was strongly linked to climate, and accumulated heating was the most important climatic variable for TIN. We observed productivity responses to be highest at moderate levels of heat accumulation, but that this response was negative at higher levels of heat accumulation. In contrast, productivity was higher with higher levels of chilling accumulation. Together, these could indicate that forest productivity is adapted to a mid-range heating and chilling envelope in this region, and that a potential future heating increase and chilling decrease could shift climatic conditions away from the fundamental niche of A2A's current forest tree species. Wang et al. [63] also observed the responsiveness of forests in eastern North America to temperature. They tested for piecewise trends in NDVI productivity and spring temperature in North America and found that greening trends stalled around 2000 when spring temperature began to decline. However, studies at different scales have found that different factors drive forest productivity. At a very fine scale, studies using carbon flux as a proxy measure of productivity found that net ecosystem productivity is strongly correlated with photosynthetically active radiation in southern Ontario [22] and Massachusetts [23]. Zhu et al. [1] used trends in remotely sensed leaf area index, and found that atmospheric carbon fertilization explains much of the global greening that occurred from 1982-2009, and climate change explained only a small proportion of that greening. The contrast of our results with these finer-and coarser-scale studies suggests that there is inconsistency in the drivers of forest productivity when it is observed at different scales or using different methods. Myers-Smith et al. [64] explored a similar variation in productivity-climate relationships across the Arctic tundra biome when measured at different scales and our results point to the need for a parallel analysis for the temperate forest biome.
Many of the climatic variables that we included were not ranked as being important drivers of inter-annual variation in forest growth. Precipitation was of marginal importance for SOS, while drought, snowpack, root-freeze risk, solar radiation, and photoperiod were not ranked as being among the most important drivers of forest growth indices examined here. Photoperiod did not vary year-to-year, so it would not explain interannual variation. However, Yu et al. [62] found that photoperiod was an important variable in building reliable predictive models of the start and end of season dates, which indicates that the photoperiod can influence the date of a phenological shift but does not have a large influence on inter-annual variation in those dates. The limited importance of precipitation was surprising as precipitation is thought to provide necessary moisture for leaf development in spring and it has been shown to be important for phenological progression in spring and fall in other temperate regions [10,11]. Seyednasrollah et al. [24] examined environmental drivers of deciduous spring and fall phenology using PhenoCam data and a hierarchical multivariate joint model, and found that fall senescence advanced in parts of North America where mean annual precipitation was low, which indicates that forest sensitivity to precipitation depends on local climate. The limited importance of precipitation and drought in this case could indicate that the fall phenology of forests in A2A are insensitive to precipitation because of their climatic niche. Snow pack, or specifically the absence of snow pack, has been suggested as being a signal of the start of the growing season in high elevation or high-latitude environments [10], so it is logical that snowpack was not among the most important drivers of phenology in A2A, which is at low to middle elevations. However, snowpack serves to insulate roots from frost damage [65]. Major 'false spring' events, in which early snowmelt is followed by a cold snap, have resulted in notable root-freezing and damage in eastern North America in recent decades [14,66]. We designed the root-freeze risk index to quantify the potential for such damage and did not expect that it would be absent from the most important variables here because of known false spring events in A2A during the study period. However, these false springs were singular events, rather than continuous over time, so it is possible that they did not influence inter-annual variation over the entirety of the study period and thus were not ranked as being of high importance. It is possible that precipitation, photoperiod, and other variables have influenced forest phenology and productivity, but that these effects were ephemeral and were thus removed in the variable-selection stage following preliminary RF models. Other modelling approaches (ex. [24,62]) are more effective for identifying these subtleties for specific variables, and we recommend that future research examines the interrelated effects of episodic and continuous climatic variables on temperate forest growth.
Rodriguez-Galiano et al. [29] used RF regression to model climatic drivers of interannual variation in SOS and EOS across Europe, and following their success in its application recommended the use of RF to answer similar research questions elsewhere. We also found that RF was well suited to inter-annual forest-climate modelling. The ability to include large numbers of inter-correlated training variables-which climatic variables often are [38]-was especially useful as it allowed us to select the most important climatic variables from the many that could be influential. The ability to show non-linear relationships with partial dependence plots was also revealing because of the complexity of the forest-climate relationships that we found, which were not prototypical relationship forms (ex, linear, exponential). Our two-staged RF modelling approach made the modelled results more intuitive and parsimonious as less important climatic variables were not included in final results (sensu [51]). Recently, spatial cross-validation methods have been suggested to improve the predictive power of RF and other machine learning methods when using spatial data [67][68][69]. We did not apply spatial cross-validation in training, and our model validation metrics may be overly optimistic as a result. However predictive power and applicability outside of the area and time of interest were not as important as variable importance and partial dependence in answering our research questions, so those were the focus of this work. Studies in which greater predictive strength is required should consider applying such methods to avoid overfitting of spatially autocorrelated variables.

Conclusions
We used remotely sensed data and random forest regression to model forest-climate relationships for forest phenology and productivity indices in the Algonquin-to-Adirondack (A2A) conservation corridor from 1989-2014. In this analysis, we were able to compare forest phenology and productivity responses directly. - A large proportion of year-to-year variability in forest phenology and productivity was explained by climatic variables, particularly heating and chilling accumulation; -Accumulated heating was most important for SOS, EOS, and TIN, but accumulated chilling was most important for EOS; -Phenology was most responsive to temperature accumulation at low or high extremes, while productivity was most responsive to more moderate accumulations; -Lagged responses of EOS and LOS to accumulated heating, and of SOS to accumulated chilling had important distinctions from current responses; -Precipitation, photoperiod, and other climate variables were not ranked as being among the most important drivers of phenology or productivity.
The climate of eastern North America has warmed in the last century (+0.08 • C/decade), and more rapidly in recent decades (1970-2000: +0.25 • C/decade), though more of this warming has occurred in winter over summer [5,70]. These trends are projected to continue into future decades [3,4]. Given the relationships between forest growth and temperature that we observed, it seems likely that the growing season will be extended in A2A forests in coming years. These phenological changes have been anticipated using predictive models for forests in New England [71,72], lending support to this conclusion. However, our modeling indicates that warming could have a less predictable effect on total forest productivity in the future; moderate amounts of heating or chilling accumulation had a positive effect on TIN, but outside of that range, the effects of temperature accumulation were increasingly negative. This suggests that the warmer growing season temperatures expected in the future, which will likely include reduced chilling, could result in a decrease in forest productivity in some areas until forest composition changes to include higher proportions of tree species better suited to the future climate conditions. Appendix A Table A1. Results of accuracy assessment of Daymet climatological data for A2A. Tests were performed using observed and predicted daily data for the beginning (1988) and end (2014) of the study period. The data used for validation are available for download [47]. Only the precipitation (prcp), minimum temperature (tmin), and maximum temperature (tmax) variables were available at the time that we performed this test. An ordinary-least squares linear model and a one-sided test of Pearson's product moment correlation coefficient were used to assess the similarity between observed data from climatological stations and spatially corresponding predicted values from interpolated Daymet data.  Table A2. Forest area with trends in growth, expressed as percent forest area within each ecoregion. Areas are presented as area of statistically significant trends (p ≤ 0.05) and statistically significant trends after correction for false-discovery rate (FDR). We used linear regression to identify annual trends in forest growth indices over the course of the 26-year study period. Time-ordered datasets for each forest pixel were first pre-whitened using the Yue-Pilon method [73]; wherein the series was transformed by subtracting the first autoregressive component from the series in order to remove temporal autocorrelation and avoid Type-1 error in trend testing [74]. Linear models were then fit using Sen's slope [75], with the forest growth variable as the dependent variable and time (i.e., year) as the independent variable. The significance of models was assessed using p-values from a two-tailed Mann-Kendall significance test [76]. Testing many spatial pixels simultaneously presents a multiple-testing problem, wherein there was essentially a guarantee that false-positives would be identified without appropriate correction [77]. To address this, we adjusted p-values to account for the false-discovery rate using the Benjamini and Hochberg method [78]. Trend tests were performed using the zyp package for R [74].