1. Introduction
The study of plant responses to climate change has greatly expanded in the recent past with the development and use of remote sensing and the compilation of multidecadal satellite data sets. Satellites allow for near-continuous observation of Earth and monitoring of vegetation dynamics (changes in vegetation cover and greenness over time and space), as well as climate conditions at scales ranging from local to global [
1]. Phenology refers to periodic and seasonal developmental events in biological life cycles. Annual vegetation dynamics are largely caused by vegetative phenological phenomena that are sensitive to climate conditions. Therefore, changes in phenology, such as the timing, rate, duration, and magnitude of annual vegetative growth, can signal important effects of climate change on plants [
2].
Leaf Area Index (LAI), a widely utilized measure of plant growth and activity, is a unit-less measurement of leaf area (m²) per ground area (m²). LAI provides a key measure of plant cover in a given area and is defined as an essential climate variable (ECV) by the Global Climate Observing System (GCOS) due to its critical contribution to the characterization of Earth’s climate [
3]. Satellite-derived LAI products offer multidecadal records of vegetation dynamics around the world, allowing for analysis of intra- and inter-annual variability and providing key insights on the impacts of changing climate conditions on plant biological processes and feedbacks [
4,
5]. This understanding of how climate interacts with and affects plant life is necessary for characterization of global change and for developing and implementing adaptive and sustainable practices worldwide.
Earth-observing satellites have used greening indices to infer phenological changes and vegetation dynamics since the early 1980s and, in recent decades, the quality of these products has continued to improve. These improvements have yielded massive reservoirs of high-resolution spatio-temporal data [
6]. Remotely-sensed vegetation data pose a number of methodological challenges. (1) The regular and dense occurrence of missing values resulting from excessive cloud cover, snow cover, and barren landscapes requires extensive pre-processing of the data. (2) For the analysis of large regions, tens of thousands of sites are available. The analysis of thousands of adjacent sites requires modeling of the inherently complex spatio-temporal correlation structure. (3) The collection of a phenological process over multiple decades yields many replications of a stochastic process at each site. The variability of intra-site replications of an annual process is often characterized by a complex cyclic, or at least year-dependent, structure.
To the best of our knowledge, most literature on the analysis of greening trends relies heavily on the annual maximum or mean greenness, analyzed using time series and temporal site/region correlation [
7,
8,
9,
10]. These approaches have a number of advantages, as maximum/mean greenness is an important phenological attribute and the dimensionality of the data is reduced substantially when analyzing the maximum or mean. Further, the large number of missing values inherently present in temporal vegetation measurements is removed.
However, this simplification neglects the periodicity and continuity of plant phenology and limits the potential for an effective exploratory analysis of vegetation dynamics. In this project, a philosophically different approach is taken to analyze remotely-sensed vegetation data, in which the unit of analysis is a curve (or function) as opposed to a single site measurement. This approach, widely referred to as functional data analysis (FDA), is rooted in the assumption that measurements vary over some continuum such as space or time and that there is an underlying smoothness inherent to the process of interest [
11,
12,
13]. A temporal process, measured discretely on a regular or irregular time grid, is smoothed using an optimized basis function expansion. In almost all circumstances, replications of this process are present at the same spatial location or across several different locations, and as such, a collection of continuous, differentiable curves is obtained for analysis. The retention of an entire smooth curve yields several advantages for analyzing remotely-sensed vegetation and climate data: (1) missing values can be effectively smoothed over, (2) anomalous values can be filtered or smoothed out of the data, (3) the timing and magnitude of maximum greenness are retained naturally and simultaneously, and (4) derivative information is naturally contained in the basis function expansion used to smooth the curves.
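The smoothing step described above can be illustrated with a minimal sketch. It assumes a Fourier basis with a simple roughness penalty rather than the spline basis used in this study, and the function names, the penalty weight `lam`, the number of harmonics, and the synthetic weekly profile are all hypothetical choices made for illustration.

```python
import numpy as np

def fourier_basis(t, n_harmonics, period=52.0):
    """Design matrix [1, sin, cos, sin, cos, ...] evaluated at times t (weeks)."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k / period
        cols.extend([np.sin(w * t), np.cos(w * t)])
    return np.column_stack(cols)

def fit_profile(t, y, n_harmonics=3, lam=1e-2, period=52.0):
    """Penalized least-squares basis coefficients; NaN observations are skipped."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    seen = ~np.isnan(y)
    B = fourier_basis(t[seen], n_harmonics, period)
    # Roughness penalty: the squared second derivative of harmonic k grows like k**4.
    weights = [0.0] + [float(k) ** 4 for k in range(1, n_harmonics + 1) for _ in range(2)]
    penalty = np.diag(weights)
    return np.linalg.solve(B.T @ B + lam * penalty, B.T @ y[seen])

def evaluate(coef, t, period=52.0):
    """Evaluate the fitted curve at arbitrary times t."""
    n_harmonics = (len(coef) - 1) // 2
    return fourier_basis(t, n_harmonics, period) @ coef

# Synthetic one-year weekly profile with about 40 percent missing values (hypothetical data).
rng = np.random.default_rng(0)
weeks = np.arange(52)
truth = 2.0 + 1.5 * np.sin(2.0 * np.pi * (weeks - 10) / 52.0)
lai = truth + rng.normal(0.0, 0.1, size=52)
lai[rng.random(52) < 0.4] = np.nan

coef = fit_profile(weeks, lai)
fitted = evaluate(coef, weeks)       # gap-free curve on the full weekly grid
peak_week = int(np.argmax(fitted))   # timing of maximum greenness
peak_lai = float(np.max(fitted))     # magnitude of maximum greenness
```

Because the fitted curve is stored as basis coefficients, the peak timing, the peak magnitude, and derivatives of the curve can all be read from the same object, which is the practical sense of advantages (3) and (4).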
Literature documenting the use of FDA in remotely-sensed plant phenology is sparse, but recent work in mapping forest plant associations using FDA combined with other machine learning methodology has shown promising results [
14]. The study of inter-annual dynamics of biological processes and their relationship with climate has been investigated in other works while accounting for their continuity [
15,
16,
17]. However, in this work, we argue that the grouping of remotely-sensed sites into increasingly homogeneous groups improves the interpretability of any study of regional change in vegetation dynamics. Once these groups are identified, the FDA paradigm allows for an accessible study of these changes.
In this study, the Columbia River Basin (CRB) region of the Northwestern United States was chosen to explore the use of FDA for analyzing vegetation dynamics over multiple decades and correlating vegetation dynamics to changing climate conditions. The objective of this research is to demonstrate the potential of FDA to analyze remotely-sensed vegetation and climate data and provide insight on the effects of changing climate conditions on plant life across a diverse set of ecoregions in the CRB. To do this, we constructed smoothed-spline estimates of LAI from 1996 to 2017 using the remotely-sensed NOAA Advanced Very High-Resolution Radiometer (AVHRR) time series product. We then developed a functional cluster analysis model to allocate 27,196 sites in the CRB into more uniform groups based on similar trends in LAI. Next, we took regional averages of those groups and investigated intra- and inter-annual variability in vegetation dynamics within each cluster. Finally, we used a climate data product from the European Centre for Medium-Range Weather Forecasts global atmospheric re-analysis (ERA-Interim) to determine correlations between intra- and inter-annual vegetation dynamics and temperature and precipitation in each cluster.
3. Results
This analysis detects widespread greening earlier in the growing season across diverse areas in the CRB from 1996 to 2017. Initial exploration of correlations between annual maximum LAI and time in this region provides evidence of this phenological shift.
Figure 4, Table 1, and Table 2 show calculated correlations between detected annual maximum LAI at each site and time (measured in years). To be explicit, correlations are calculated at each site across 22 years using 22 annual maximum LAI measurements. These are computed from the raw data in the LAI CDR. Significant correlations were determined at
. Sites are referred to as “greening sites” if they have significant positive correlations between LAI and time, whereas sites with significant negative correlations between LAI and time are referred to as “browning sites.” Non-significant correlations are denoted as “Neither” (neither greening nor browning). High frequencies of greening were detected along large sections of the major CRB rivers (the Columbia and Snake) and in the Pacific coastal region of the CRB. Generous significance levels were chosen in this preliminary stage since, with only 22 replications of annual maximum LAI (over the 22 years), the correlation coefficients are sensitive to anomalous measurements and sub-intervals of decreasing annual maximum LAI. A high anomalous reading in early years makes it difficult to detect an increasing trend across the remaining years. Further, a sudden drop in maximum LAI in later years may pull the correlation away from a high positive correlation detected in earlier intervals.
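The site-level classification described above can be sketched as follows. This is an illustrative sketch only: the significance level `alpha=0.10`, the function name, and the three synthetic series are hypothetical, and the standard Pearson correlation test from SciPy is assumed as the significance test.

```python
import numpy as np
from scipy import stats

def classify_trend(years, annual_max, alpha=0.10):
    """Label a site by the sign and significance of its LAI-versus-year correlation."""
    r, p = stats.pearsonr(years, annual_max)
    if p < alpha:
        return "greening" if r > 0 else "browning"
    return "neither"

years = np.arange(1996, 2018)                     # 22 annual maxima per site
wiggle = 0.1 * (-1.0) ** np.arange(years.size)    # deterministic year-to-year noise

# Three hypothetical sites: rising, falling, and trendless annual maximum LAI.
sites = {
    "rising": 3.0 + 0.05 * (years - 1996) + wiggle,
    "falling": 3.0 - 0.05 * (years - 1996) + wiggle,
    "flat": 3.0 + wiggle,
}
labels = {name: classify_trend(years, series) for name, series in sites.items()}
```

With only 22 points per site, a single anomalous year can move the correlation noticeably, which is why the label depends strongly on the chosen significance level.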
The spatial distribution of greening magnitude and timing demanded a more rigorous exploratory analysis that accomplished the following objectives: (1) eliminate/filter anomalies (false high LAI recordings), (2) detect regions that are strongly correlated over time while retaining at least some information regarding spatial proximity, and (3) examine changes/perturbations in the functional structure of annual LAI profiles.
The proposed functional clustering model was implemented on the 22-year smoothed LAI profiles using the dissimilarity matrix outlined in
Section 2.6. Sites allocated into the same cluster are determined to have strong multidecadal relationships to each other as inherited from the dissimilarity matrix.
Figure 5 depicts the results of the 5-cluster k-medoids model geographically. The details of all cluster models considered are provided in Table 3. We selected the 5-cluster model primarily because the resulting cluster sizes were sufficient for regional-scale inferences. The differences in regional average LAI profiles by cluster provided clear separation by the timing, magnitude, and number of annual peaks in LAI. The geographic shape of the clusters also follows many major geographic features in the region, such as river basins and mountain ranges.
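For readers unfamiliar with k-medoids on a precomputed dissimilarity matrix, a minimal sketch follows. It assumes a correlation-based dissimilarity (one minus the Pearson correlation between site profiles) and a simple farthest-point initialization. Both are illustrative assumptions and are not necessarily the choices made in Section 2.6, and the synthetic profiles are hypothetical.

```python
import numpy as np

def init_medoids(D, k):
    """Start from the most central site, then repeatedly add the site farthest from its nearest medoid."""
    medoids = [int(np.argmin(D.sum(axis=1)))]
    while len(medoids) < k:
        nearest = D[:, medoids].min(axis=1)
        medoids.append(int(np.argmax(nearest)))
    return np.array(medoids)

def k_medoids(D, k, n_iter=100):
    """Alternate between assigning sites to medoids and re-centering each cluster."""
    medoids = init_medoids(D, k)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue
            # The new medoid minimizes total dissimilarity to the other cluster members.
            within = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return labels, medoids

# Two hypothetical groups of sites whose seasonal peaks are a quarter year apart.
rng = np.random.default_rng(1)
t = np.arange(52 * 5)
early = np.sin(2.0 * np.pi * t / 52.0)
late = np.sin(2.0 * np.pi * (t - 13) / 52.0)
profiles = np.vstack(
    [early + rng.normal(0.0, 0.1, t.size) for _ in range(8)]
    + [late + rng.normal(0.0, 0.1, t.size) for _ in range(12)]
)
D = 1.0 - np.corrcoef(profiles)   # assumed dissimilarity: one minus correlation
labels, medoids = k_medoids(D, k=2)
```

Because each medoid is an observed site, every cluster has a real representative profile, and the algorithm needs only the dissimilarity matrix rather than the curves themselves.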
It becomes apparent that a simple naming or classifying of clusters is not achievable. Although there is significant separation of the clusters by land cover and elevation, there are clearly other regional or spatially confounding environmental factors separating these clusters. Recent work investigating these clusters has shown that site elevation, water storage potential, and terrain slope are the most important site attributes when predicting cluster allocation of sites [
33].
The 5 clusters intuitively distinguish regions with different land cover, elevation, and climate characteristics based on satellite-derived LAI values.
Table 1 and
Table 2 show that across Clusters 1, 2, 4, and 5, 32–70 percent of sites are identified as greening sites, and extremely low percentages of sites in each cluster are identified as browning sites.
Table 4 shows the land cover classifications for each of the five clusters. Clusters 1, 2, and 3 contain the largest proportion of agricultural sites, and the majority land cover classification for these 3 clusters is scrub/shrub. Clusters 4 and 5 consist predominantly of forested, evergreen sites. All clusters are distinguished by significant differences in elevation distributions, with Cluster 5 containing the highest elevation sites and Cluster 4 containing the lowest elevation sites (
Figure A1).
Figure 6 shows annual profiles for weekly maximum LAI, weekly maximum temperature, and average weekly precipitation for each of the five clusters. Annual maximum LAI is highest in the evergreen forested sites in Clusters 4 and 5 and lowest in Cluster 3, which contains sites with the highest proportion of scrub/shrub land cover in the CRB. Annual temperature profiles are similar throughout the CRB region, with slightly lower annual maximum and higher annual minimum temperatures detected throughout the 22-year time period at the low-elevation coastal sites comprising Cluster 4. Annual precipitation profiles are similar for Clusters 1, 2, 3, and 5, with Cluster 4 receiving much greater cumulative precipitation than the other clusters each year.
A brief summary of the FPCA results is presented in
Table 5. In the FPCA of annual LAI profiles, among all of the clusters, 55–75 percent of the inter-annual variation is explained by the first principal component, which is characterized by earlier and higher peaks in annual maximum LAI. This is shown in
Figure 7. The linear increase in principal component scores over time indicates a trend toward earlier and higher annual maximum LAI throughout the CRB region. This demonstrates that despite differences in land cover, elevation, and annual precipitation profiles between the five clusters, a clear greening trend is detected over the 22-year period throughout the CRB. In
Figure 8, FPCA on temperature shows that 45–48 percent of the inter-annual variability is explained by the first principal component, which is associated with linearly warmer temperatures from the beginning of the year through the annual peak in summer temperatures. In
Figure 9, the second principal component, explaining 20–23 percent of the inter-annual variability among the five clusters, is associated with either significantly warmer temperatures during the first 20 weeks of the year and a lower annual maximum temperature, or cooler temperatures during the first 20 weeks of the year and a higher annual maximum. In
Figure 10, an identical analysis on annual cumulative precipitation shows that 90 percent of the inter-annual variation among all of the clusters is explained by the first principal component, which is characterized by either greater or lesser annual cumulative precipitation. Annual maximum temperature and annual precipitation profiles do not demonstrate an obvious trend over the time period 1996–2017 (
Figure 8 and
Figure 10).
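The logic of the FPCA results above can be sketched on synthetic data. The example below is a discretized approximation (a singular value decomposition of centered curves evaluated on a common weekly grid) and is not the basis-expansion implementation used in this study. The synthetic profiles, in which the annual peak becomes slightly higher and earlier each year, are hypothetical.

```python
import numpy as np

def fpca(curves, n_components=2):
    """Discretized FPCA. curves has shape (n_years, n_grid) on a common regular grid."""
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)          # share of inter-annual variation per component
    components = vt[:n_components]           # principal component curves (rows)
    scores = centered @ components.T         # one score per year and component
    return mean_curve, components, scores, explained[:n_components]

# Hypothetical annual profiles whose peak grows and shifts earlier over 22 years.
rng = np.random.default_rng(2)
years = np.arange(1996, 2018)
weeks = np.arange(52)
curves = np.array([
    (2.0 + 0.05 * i) * np.exp(-0.5 * ((weeks - (26.0 - 0.2 * i)) / 6.0) ** 2)
    + rng.normal(0.0, 0.02, weeks.size)
    for i in range(years.size)
])

mean_curve, components, scores, explained = fpca(curves)
# A linear trend in the first-component scores indicates a sustained shift in profile shape.
slope = np.polyfit(years, scores[:, 0], 1)[0]
```

The sign of a principal component is arbitrary, so the direction of the score trend should be interpreted together with the shape of the corresponding component curve.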
The FCCA reveals correlations between intra-annual variation in temperature and precipitation and the earlier and higher LAI peak detected in each of the clusters. In each cluster, the shift in phenology toward earlier and higher annual maximum LAI values is correlated with warmer temperatures during the first 20 weeks of the year, as shown in
Figure 11. FCCA between LAI and precipitation did not yield a consistent correlation among the clusters. For Clusters 1 and 5, greater and earlier maximum LAI was correlated with large increases in cumulative precipitation in the winter and spring, and later and lower maximum LAI was correlated with greater accumulation in the late summer and early fall. Cluster 2 shows a similar trend, except that the correlation between LAI and cumulative precipitation is most strongly driven by differences in fall and early winter, where large accumulations of precipitation in the fall are more strongly associated with later and lower maximum LAI. For Cluster 3, greater but not earlier maximum LAI was also correlated with greater spring precipitation, as shown in
Figure 12. For Cluster 4, the first annual peak in LAI is associated with drier summers and greater accumulation in the winter. Taken together, the results of the functional canonical correlations indicate that the widespread shift in phenology toward an earlier and higher peak in annual LAI in the CRB is largely associated with warmer temperatures early in the year. However, the differences in inter-annual trends in temperature and LAI over the studied time period suggest there are other drivers of the trend in LAI not captured in this analysis. Greater annual maximum temperatures are not as strongly correlated with this shift in LAI, and the correlation between annual precipitation and the timing and magnitude of the annual LAI peak varies between the 5 clusters.
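As a rough illustration of how canonical correlations between two sets of annual curves can be computed, the sketch below applies classical canonical correlation analysis to a small number of principal component scores per year. This is a simplified stand-in for functional canonical correlation analysis and is not the procedure used in this study. Reducing each set of curves to a few scores is assumed here to play the role of regularization, and the score matrices are synthetic.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Classical CCA via QR and SVD. X and Y have shape (n_years, n_scores)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)   # orthonormal basis for the centered X scores
    Qy, _ = np.linalg.qr(Yc)   # orthonormal basis for the centered Y scores
    U, s, Vt = np.linalg.svd(Qx.T @ Qy)
    rho = np.clip(s, 0.0, 1.0)  # canonical correlations, largest first
    u = Qx @ U[:, 0]            # first canonical variate for X
    v = Qy @ Vt[0]              # first canonical variate for Y
    return rho, u, v

# Hypothetical scores for 22 years: the first LAI score tracks the second temperature score.
rng = np.random.default_rng(3)
temp_scores = rng.normal(size=(22, 2))
lai_scores = np.column_stack([
    0.9 * temp_scores[:, 1] + 0.2 * rng.normal(size=22),
    rng.normal(size=22),
])

rho, u, v = canonical_correlations(temp_scores, lai_scores)
```

With only 22 years of data, canonical correlations computed from many scores would be inflated toward one, so the number of retained scores per variable should be kept small.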
4. Discussion
The results of this work demonstrate the utility of FDA for the detection of annual greening trends in high-dimensional data on phenological processes and, more specifically, for the characterization of within-year and across-year trends in vegetation dynamics. Annual greening of field-measured or remotely-sensed sites is best characterized by multiple parameters, namely the magnitude of the peak “greenness”, the (annual) timing of the peak, the duration of greenness, and the point of maximum change in greenness. Although only the first and second of these parameters are examined in this analysis, all of these features are inherently contained in the sample of 27,196 smooth functions used in this study. Any future work to examine the other features of LAI curves can be performed using the same preprocessing used here. Without spline-smoothed LAI curves, each of these parameters must be analyzed separately, without the theoretical cohesiveness of this approach. Ultimately, annual LAI profiles are simple to smooth, and we demonstrate that variation in annual LAI across years is effectively detected and explained using functional principal components analysis.
We believe that the modeling of processes with underlying continuity should take advantage of this continuity when possible. In this analysis, we demonstrate that the modeling of continuous processes in the presence of high volumes of missing values is achievable. Our results lead us to believe that our choice to consider only sites with less than 72 percent missing values is conservative and that such an analysis would remain effective with upwards of 80 percent missing values. This provides opportunities to use such methods in regions where remotely-sensed greenness indices are recorded with extreme sparseness, such as boreal and Arctic climates [
7]. We must acknowledge disadvantages of our sampling of sites in the region. First, the quality of the LAI AVHRR CDR has improved across its entire domain from 1982 to present. Beginning in 1996 eliminates some of the years with the lowest quality, but it is likely that we have removed sites that had improved satellite coverage in the later years of the domain. Also, we note that the removal of most sites along the Montana–Idaho border and in the mountains of Oregon and Washington does not indicate low greenness but rather reflects poor coverage through a substantial portion of the years in our study.
The clustering approach used in this project is effective at separating regions intuitively across an array of variables (land cover, elevation, temperature, and precipitation) using only satellite-derived LAI profiles. We have conducted a deeper investigation of cluster allocation by a larger array of environmental attributes in a secondary work, where we identify elevation, water storage potential, and slope as strong predictors of cluster allocation [
33]. The clustering approach in the present study provides the theoretical framework to make regional inferences about changes in climate and greenness. The approach is capable of analyzing tens or hundreds of thousands of sampled sites without parallel processing or high-performance computing (HPC). Future implementation of this approach using HPC would allow for investigation of regions far greater than the CRB.
Although our cluster model retains some implicit use of proximity of sites (since closer sites have a tendency to have higher correlations), we believe that there are necessary improvements to such a clustering approach, namely the filtering of noise sites (using methods such as DBSCAN) and inducing a spatial weighting (or penalization) on the dissimilarity matrix used in our work. We argue that the merits of the approach taken justify its presentation here, and we encourage further work in unsupervised learning methodology of phenological processes.
We emphasize that our work here is strictly exploratory, and not predictive. The relationships between climate and LAI discussed here are associative by nature, and further work using functional regression models is required to explore the fascinating predictive relationship between these attributes. Recent literature implementing predictive modeling of FPCA scores of NDVI has yielded promising results, and this approach can be extended to model climatic factors (such as CO₂ concentration, Gross Primary Production, plant respiration, soil moisture) that predict higher FPCA scores for LAI in recent years in this region [
14].
Our work would also be incomplete without recognizing the disadvantages of the LAI CDR derived from AVHRR sensors [
34,
35,
36]. Recent literature has shown that this product is inferior to products from Moderate-Resolution Imaging Spectroradiometer (MODIS) sensors for making valid inferences at field-measurement resolution and in areas with higher annual precipitation (>1 m) [
37,
38]. As shown in
Figure 6, regional cumulative precipitation averages for all clusters are well below this threshold. We add our work to the body of literature on the detection of changes in remotely-sensed greening, and we emphasize that the methods used in this project are directly extendable to any remotely-sensed time series data.
Numerous studies investigating remotely-sensed vegetation dynamics have reported a “greening” of the planet over the last several decades, attributed to multiple factors including the “fertilization effect” of higher atmospheric [CO₂], nitrogen deposition, or increasing temperatures lengthening the growing season in many regions [
39,
40,
41]. In this analysis, we were able to detect such a “greening” across a range of land cover classes and ecosystem types in the CRB of North America from 1996 to 2017. The shift in vegetation dynamics toward earlier and higher annual LAI peaks indicates changes in plant phenology across this region over the studied time period.
Plant phenological events in temperate regions are triggered predominantly by the well-known climatic changes associated with the changing of the seasons. These responses of vegetation to environmental conditions provide a measurable and accurate signature of the impact of climate change on plants [
42]. The importance of understanding how plants respond to changing climate conditions has led to considerable work on the influence of climate variables on plant phenology [
43]. Our analysis investigates the intra-annual relationships between vegetation dynamics and the climate variables temperature and precipitation throughout the year over a multidecadal timescale. Temperature is the dominant driver of the timing of many plant developmental processes and phenological shifts [
44,
45,
46]. Plants synchronize their growth and development with favorable thermal conditions in order to maximize the growing season and minimize the risk of frost damage. Sufficient exposure to cold temperatures in the winter is required for many plants to break dormancy, and a subsequent accumulation of degree days in the spring (time above a given temperature threshold) triggers budburst and the unfolding of leaves [
47].
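The degree-day accumulation mentioned above can be made concrete with a small sketch. The base temperature of 5 °C, the requirement of 150 degree days, and the two synthetic spring temperature series are hypothetical values chosen only to show how a warmer early season advances the date at which a fixed heat requirement is met.

```python
import numpy as np

def first_day_reaching(daily_mean_temp, base_temp=5.0, required_gdd=150.0):
    """Return the first day index at which cumulative degree days meet the requirement."""
    excess = np.maximum(np.asarray(daily_mean_temp, dtype=float) - base_temp, 0.0)
    cumulative = np.cumsum(excess)   # degree days accumulated since day 0
    reached = np.flatnonzero(cumulative >= required_gdd)
    day = int(reached[0]) if reached.size else None
    return day, cumulative

days = np.arange(180)
cool_spring = 2.0 + 0.1 * days       # hypothetical daily mean temperatures (°C)
warm_spring = cool_spring + 2.0      # the same season, uniformly 2 °C warmer

cool_day, _ = first_day_reaching(cool_spring)
warm_day, _ = first_day_reaching(warm_spring)
advance = cool_day - warm_day        # days by which the threshold is reached earlier
```

Real phenological models typically combine such a forcing requirement with chilling and photoperiod requirements, so this sketch captures only the spring forcing component.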
In the present study, earlier and higher annual maximum LAI throughout the CRB was largely correlated with higher temperatures during the first 20 weeks of the year. Phenological responses to environmental conditions can vary significantly among different regions and plant species [
48,
49]; however, similar trends in vegetation dynamics were found across a range of natural ecosystem and land cover types in the CRB, indicating common responses to abiotic environmental factors across the regional scale of this study. On agricultural land, spring planting date could influence the timing and magnitude of the peak in annual LAI. However, each of the 5 clusters contains a variety of land cover types and relatively small proportions of agricultural land. Because all of the clusters are showing coherent trends in LAI over the time period studied, the effect of differences in planting date on the agricultural fields in each cluster likely does not significantly influence the satellite-observed regional vegetation dynamics in the CRB.
The same intra-annual temperature trend was correlated with the earlier and higher maximum LAI values in each of the five clusters in the CRB. This result is consistent with greater accumulation of warm temperatures early in the year leading to an earlier onset of budburst and leaf unfolding, as well as an earlier peak of plant productivity in the summer growing season. Intra-annual trends in cumulative precipitation did not demonstrate a uniform correlation with the observed LAI trend among the five clusters in the CRB. This aligns with previous research that shows differential responses of phenology to precipitation between arid and wet regions and an overall lesser or indirect contribution of precipitation to plant phenological shifts compared to temperature [
50,
51].
A global warming trend of 0.2 °C per decade has been observed since the 1980s [
52], and recent warming of the Northern Hemisphere, particularly in the winter and spring, is well documented [
17,
53,
54]. Results of the functional principal components analysis on temperature in the CRB showed that greater than 60 percent of the inter-annual variability can be explained by warmer temperatures early in the year. Despite this, significant inter-annual variations in temperature trends exist across the CRB region between 1996 and 2017. Although warmer temperatures early in the year are seemingly the most important factor influencing the greening trend over time in this analysis, a clear trend toward early-year warming over the time period studied is lacking as shown by the lack of linearly increasing principal component scores for temperature over the 22 years. This indicates that while warmer spring temperatures are clearly influential over vegetation dynamics in the CRB, there are likely other factors playing important roles in the observed greening trend over time.
Vegetation dynamics in most plant species are mainly governed by temperature, photoperiod, precipitation, and the interactions among these key variables. The sensitivity of phenological shifts to these climate variables can differ among regions and plant species (especially sensitivities to photoperiod and precipitation [
51,
55,
56]), and many of the underlying biological mechanisms that control phenological responses to these climate variables are still unknown. The influences that different climate variables have on plant phenology are entangled, and the combined effects likely promote or constrain observed trends in vegetation dynamics [
57,
58]. For example, in mid and high latitude regions, warming temperatures in the spring are correlated with increasing day length. In photoperiod-sensitive plant species, early warming before a particular day length threshold is reached could constrain the temperature effect on spring phenology. Also, clouds associated with heavy spring precipitation could lower the sunlight intensity and quality and similarly constrain spring phenology. The timing of snowmelt is also an important factor in spring phenological shifts in regions with cold winters [
59,
60]. Warming early spring temperatures in regions near the CRB have been correlated with earlier timing of snowmelt [
17], which is likely also correlated with the observed phenological shifts toward earlier greening.
Further investigation of the interactive and combined effects of climate variables on vegetation dynamics in the CRB is needed to fully understand the relationship between changing environmental conditions and observed trends in LAI.
The effect of globally increasing concentrations of atmospheric CO₂ on vegetative growth should not be overlooked. Increases in anthropogenic emissions and land use change since the industrial revolution have driven the atmospheric CO₂ concentration to over 400 parts per million (ppm), an approximately 40 percent increase since pre-industrial times [61]. Higher concentrations of CO₂ in the atmosphere suppress the oxygenase activity of the main carbon-fixing enzyme in plants, ribulose 1,5-bisphosphate carboxylase-oxygenase (Rubisco). This leads to reduced rates of the carbon and energy dissipative process of photorespiration and increased photosynthetic carbon assimilation. While increasing concentrations of atmospheric CO₂ are not found to change the timing of annual plant phenology [62], greening trends around the world have been attributed to the “fertilization effect” of increasing concentrations of atmospheric CO₂ [39,63,64]. Though not investigated in the present study, increasing concentrations of atmospheric CO₂ could play a role in the increasing magnitude of annual maximum LAI observed in the CRB.