Start of the Green Season and Normalized Difference Vegetation Index in Alaska’s Arctic National Parks

: Daily Normalized Difference Vegetation Index (NDVI) values from the MODIS Aqua and Terra satellites were compared with on-the-ground camera observations at ﬁve locations in northern Alaska. Over half of the spring rise in NDVI was due to the transition from the snow-covered landscape to the snow-free surface prior to the deciduous leaf-out. In the fall after the green season, NDVI ﬂuctuated between an intermediate level representing senesced vegetation and lower values representing clouds and intermittent snow, and then dropped to constant low levels after establishment of the permanent winter snow cover. The NDVI value of snow-free surfaces after fall leaf senescence was estimated from multi-year data using a 90th percentile smoothing spline curve ﬁt to a plot of daily NDVI values vs. ordinal date. This curve typically showed a ﬂat region of intermediate NDVI values in the fall that represent cloud- and snow-free days with senesced vegetation. This “fall plateau” was readily identiﬁed in a large systematic sample of MODIS NDVI values across the study area, in typical tundra, shrub, and boreal forest environments. The NDVI level of the fall plateau can be extrapolated to the spring rising leg of the annual NDVI curve to approximate the true start of green season.


Introduction
The Normalized Difference Vegetation Index (NDVI) has been widely used to study vegetation phenology since the 1980s when satellite NDVI data became available [1,2]. NDVI is a simple ratio computed from reflectance in two widely available wavelength bands (red-R and near infrared-NIR): NDVI = (NIR − R)/((NIR + R) [3]. NDVI in the Arctic and Subarctic exhibits a single annual cycle that has been parameterized to quantify the timing and length of the growing season (e.g., [4,5]). The spring rise in NDVI in environments with a distinct snow season is influenced by both loss of snow and greening of vegetation [5][6][7][8]. This occurs because snow and clouds have NDVI close to zero, while the evergreen plants, soil and leaf litter exposed by snowmelt have a NDVI greater than zero [9,10]. This problem has prompted development of other vegetation indices that would respond more specifically to vegetation greenness alone [11,12]. These proposed vegetation indices are more complex than NDVI and require additional wave bands [11] or multiple assumed model parameters [12] to compute.
The timing of deciduous leaf-out and senescence can be determined from NDVI alone if the NDVI level that represents snow-free vegetation prior to annual leaf-out or after leaf senescence is known. The NDVI of senesced and snow-free vegetation has been estimated using multi-year maximum NDVI values for a fixed time interval after senescence, e.g., 16 Oct to 17 Nov for a study area in northern Scandinavia [13]. This date range is study-area specific and would not be useful, for example, over most of the current study area where snow cover is usually present before 16 Oct [14]. The purpose of this Technical Note is to portray the annual NDVI cycle in a study area in arctic and subarctic Alaska, and suggest a more flexible method for determining the NDVI level of senesced vegetation. In the and suggest a more flexible method for determining the NDVI level of senesced vegetation. In the proposed method, the post-senescence NDVI value is determined on a perpixel basis by analysis of multi-year data. This NDVI value may then be used to locate the start of the green season in individual years.

Study Area
The study area is the five National Park Service (NPS) units in northern Alaska (Figure 1). These parks cover approximately 82,000 km 2 , including lowlands near sea level and rugged mountainous terrain with elevations that reach over 2000 m above sea level. The climate is arctic and subarctic, with long-term mean annual air temperatures ranging from about −5 °C at low elevations in the south and west to about −10 °C in the northern mountains. Most of the study area is shrub-or graminoid-dominated arctic tundra, or alpine barrens, with boreal forests of spruce (Picea mariana and P. glauca), birch (Betula neoalaskana) and aspen and poplar (Populus tremuloides and P. balsamifera) present at low elevations in southern inland locations ( Figure 1).  [15]. Locations of the five remote camera sites are labeled, with further information in Table 1. The systematic sample points, spaced at 20 km intervals, are also shown.

Satellite Data Analysis
NDVI was computed from MODIS satellite data, bands 1 (red, 620-670 nm) and 2 (near infrared, 841-876 nm). The AppEARS point extraction tool [16] was used to extract the daily measurements of red and near infrared surface reflectance for the entire MODIS record through 2020: products MOD09GQ and MYD09GQ, version 6 for the Aqua (2003-2020) and Terra (2000-2020) satellites, respectively [17,18]. These products contain daily observations at 250 m resolution, corrected for atmospheric conditions such as gasses, aerosols and Rayleigh scattering. Data are missing for 2 to 3 months centered around winter solstice when the sun angle is less than 5° above the horizon. Point extractions were made  [15]. Locations of the five remote camera sites are labeled, with further information in Table 1. The systematic sample points, spaced at 20 km intervals, are also shown.

Satellite Data Analysis
NDVI was computed from MODIS satellite data, bands 1 (red, 620-670 nm) and 2 (near infrared, 841-876 nm). The AppEARS point extraction tool [16] was used to extract the daily measurements of red and near infrared surface reflectance for the entire MODIS record through 2020: products MOD09GQ and MYD09GQ, version 6 for the Aqua (2003-2020) and Terra (2000-2020) satellites, respectively [17,18]. These products contain daily observations at 250 m resolution, corrected for atmospheric conditions such as gasses, aerosols and Rayleigh scattering. Data are missing for 2 to 3 months centered around winter solstice when the sun angle is less than 5 • above the horizon. Point extractions were made for five on-the-ground monitoring station locations ( Figure 1 and Table 1) and at 210 systematically arrayed points (20 km spacing, Figure 1) across the study area that have been used in other studies [19,20]. Photo-interpreted vegetation classifications are available for the 210 sample points, based on a 4 ha circle (113 m radius) surrounding each point [19]. While the 4 ha circles do not coincide perfectly with MODIS pixels (which are 250 by 250 m, 6.25 ha and intersected by but not centered on the points), they provide a useful approximation of the vegetation present in the MODIS pixels.
Cloud cover results in low NDVI readings throughout the year. To approximate the yearly cycle of cloud-free NDVI, I plotted all NDVI readings at a sample point against the ordinal date and ran a 90th percentile smoothing spline [21] using the "rqss" function with a "lambda" smoothing parameter of 10 from the R package "quantreg" [22]. This function fits a smooth non-parametric curve to the highest (90th percentile) values; thus, the resulting curve approximates the annual cycle of maximum (i.e., cloud-free) NDVI. The R code snippet below gives the essential steps. The variables x and y are the input satellite data, from all available years, for the ordinal date (i.e., a number between 1 and 365) and NDVI, respectively; pred$x and pred$y are the corresponding curve-predicted (smoothed) values. The "quantreg" package supplies the rqss and qss functions and tau is the chosen quantile: fit <-rqss(y~qss(x, lambda=10), tau = 0.90) pred<-data.frame(x=seq(min(x), max(x))) pred$y <-predict(fit, newdata=pred) As will be shown below in the results, plots of NDVI vs. date at most locations show a distinct plateau region of intermediate NDVI in the fall that corresponds to snow-free, senesced vegetation. The fall NDVI plateau is best identified by fitting a 90th percentile spline curve to all years of data together, because in any one year the senesced vegetation can be obscured by clouds or early snow. I assumed that the NDVI level of senesced vegetation corresponded to the flattest portion of the smoothed NDVI curve in the fall season. The flattest portion of the fall NDVI curve was found by computing two-week running regressions of the above spline-fitted values vs. date and corresponding two-week running means of the daily spline-fitted values. I then defined the "senesced vegetation" NDVI as the two-week running mean of the spline-fitted NDVI values for the date interval with the flattest regression slope, after day 240 (28 August) and before the smoothed NDVI dropped below 0.1 (i.e., snow-covered). The following R code snippet performs the calculation on the output data frame pred from above: pred$sl <-NA #column for 2 week slope pred$runmn <-NA #column for 2 week running mean for (i in 1:(nrow(pred)-13)){ #compute running slope and mean of 2-week window season for each year. The annual NDVI curves were fit using the procedure outlined above, with 90th percentile smoothing splines to approximate cloud-free NDVI.
The steps involved in finding the start of green season by the proposed method are summarized below. All operations are performed on a per-pixel basis.

1.
Fit a 90th percentile smoothing spline curve to NDVI vs. ordinal date (1 to 365), using all available years of data.

2.
Locate the two-week interval with the flattest slope of the curve from step 1 in the fall; "fall" is defined as day number greater than 240 (28 Aug) and smoothed NDVI greater than 0.1 (a level indicating snow cover). Compute the mean NDVI value of this two-week period from the fitted curve (90th percentile) values. This is the assumed NDVI of senesced, snow-free vegetation in the pixel.

3.
Fit 90th percentile smoothing spline curves to NDVI vs. ordinal date for each individual year.

4.
Find the date on each annual curve from step (3) where NDVI first crosses the level of senesced vegetation from step (2). This is the start of the green season.

Ground Verification
Ground verification of satellite phenology observations was obtained from automated cameras at five NPS climate monitoring stations (Table 1, Figure 1 [23]). The three-letter symbols listed in Figure 1 and Table 1 are used below to refer to these stations. Photographs were taken with Campbell Scientific CC5MPX cameras (2592 by 1944 pixels, jpg format), five photos per day at hourly intervals centered at solar noon, beginning in 2013. Periods of missing data were common at some locations due to wind or wildlife damage. A set of visual phenology indicators was chosen at each station. These phenology indicators were observed in a consistently defined portion of the cameras view at each location, as described in the monitoring protocol [24]. They included first date of spring 50% and 0% snow cover, and flowering and leaf-out of conspicuous plants.
Green season metrics derived from quantitative assessment of color photographs are comparable to MODIS phenology metrics [25]. Following [25], I computed the mean green chromic coordinate (G cc ), for a rectangular foreground area at the three stations where a suitable rectangular area of homogenous foreground vegetation was visible in the photographs (MNO, PAM and SRT). G cc is a proportion representing the "greenness" of a pixel value: G cc = G DN /(R DN + G DN + B DN ), where R DN , G DN and B DN are the red, green and blue digital numbers of the pixel. These results were of limited usefulness: due to camera issues only two years of data were available for MNO, and the PAM G cc had a very poor signal-to-noise ratio due to sparse vegetation.

Annual NDVI Cycle at the Climate Monitoring Stations
Comparison of NDVI with on-the-ground camera observations showed that much of the seasonal rise and fall of NDVI was due to the gain and loss of snow cover. The Kugururok Station in 2016 ( Figure 2) is a typical example: there was a rapid rise in NDVI attributable to snow loss (points 1 to 2), followed immediately by a somewhat slower rise due to vegetation leaf-out and growth (points 2 to 3). After peak green-up (point 3), NDVI fell again to the intermediate level corresponding to predominantly senesced vegetation (point 4), after which it fluctuated for a time between zero and the "senesced" level due to clouds and intermittent snow cover (points 4 to 5), before falling to a consistent level near zero when permanent winter snow cover was established (point 6).
to clouds and intermittent snow cover (points 4 to 5), before falling to a consistent level near zero when permanent winter snow cover was established (point 6). In this example, over half of the annual amplitude in NDVI was due to the transition between the snow-covered landscape with NDVI near zero and an NDVI near 0.55 representing litter, soil and leafless or evergreen vegetation in the fall or spring. The contribution of seasonally green vegetation in this example was to raise the NDVI from 0.55 to the midsummer green peak of about 0.8. In this example, over half of the annual amplitude in NDVI was due to the transition between the snow-covered landscape with NDVI near zero and an NDVI near 0.55 representing litter, soil and leafless or evergreen vegetation in the fall or spring. The contribution of seasonally green vegetation in this example was to raise the NDVI from 0.55 to the midsummer green peak of about 0.8.
The basic features of the annual NDVI cycle shown in Figure 2 are visible at all five of our monitoring stations in Figure 3, where all years of data are plotted together. The NDVI level of senesced fall vegetation is visible on the NDVI curves of all five stations as a distinct flat region beginning near 1 September (day 243; Figure 3). During this time of year, low NDVI can result from snow or clouds and the 90th percentile curve approximates the NDVI on days without either. , 13, x FOR PEER REVIEW 6 of 11 The basic features of the annual NDVI cycle shown in Figure 2 are visible at all five of our monitoring stations in Figure 3, where all years of data are plotted together. The NDVI level of senesced fall vegetation is visible on the NDVI curves of all five stations as a distinct flat region beginning near 1 September (day 243; Figure 3). During this time of year, low NDVI can result from snow or clouds and the 90th percentile curve approximates the NDVI on days without either. The NDVI level of snow-free, senesced vegetation is represented in Figure 3 by the dashed lines. The steeply rising spring legs of the red maximum NDVI curves below the dashed lines represent the effect of snow loss and the slightly more gentle rise above the The NDVI level of snow-free, senesced vegetation is represented in Figure 3 by the dashed lines. The steeply rising spring legs of the red maximum NDVI curves below the dashed lines represent the effect of snow loss and the slightly more gentle rise above the dashed lines represents actual green-up. The peak NDVI occurred on days 189-193 (8 to 12 July) at all stations except MNO (day 204, 23 July). The NDVI level of senesced vegetation can be intersected with curves fitted to annual data to approximate the date of start of the green season (SOG) in an individual year (e.g., Figure 2).

Comparison with the Camera Observations of Phenology Events
The SOG, estimated from the intersection of the multi-year senesced vegetation NDVI level with spring rising leg of each annual NDVI curve at the climate monitoring stations, shows a consistent relationship with the cumulative thawing degree-days and with various spring phenology events observed on the cameras (Figure 4). An exact match between SOG derived from satellite NDVI and any given spring camera event is not expected, because the camera views do not coincide exactly with the MODIS pixels and the camera events represent some visible feature (e.g., Dryas flowering) that is related to but not necessarily coincident with the beginning of mass leaf-out. In most cases the MODIS SOG fell after the last snow melted and between the dates of various observed early spring plant phenology events such as Dryas flowering. Exception include station SRW, where a large snowdrift in the camera view persisted after MODIS SOG and the various visual spring phenology events, and station KUG where MODIS SOG occurred a few days after full leaf-out of birch in two years (note that birch was readily observed in the camera view but was not the dominant plant at this site). At SRT (the only station with a densely vegetated foreground area on the photographs suitable for quantitative greenness analysis using green chromic coordinate and more than 2 years of data), MODIS SOG matched closely with 25% greenness as determined by analysis of on-the-ground photographs, with mean square deviation of 1.4 days and average difference of −0.7 days.

NDVI Cycle at 210 Systematic Sample Points
A similar annual NDVI cycle was observed at the 210 systematically located points across the study area, with some interesting exceptions ( Figure 5). These exceptions were for vegetation types not represented by the typical tundra vegetation of our five monitoring stations. The study area contains significant areas of sparse or no vegetation [10]. The points where the 4 ha photo-interpreted vegetation plot at the point was classified as 75% or more barrens [19] clearly had low maximum NDVI, often below 0.3, and a less prominent fall plateau ( Figure 5). These are locations where green vegetation was lacking or nearly so and, thus, the annual NDVI rise and fall was due mainly to gain and loss of snow cover and in some cases to seasonal terrain shadow in these mountain environments. (The one anomalously high "barrens" curve in Figure 5, the brown line with maximum NDVI near 0.6, had dense green vegetation about 80 m from the plot center on one side, which allowed the circular photo-interpreted sample plot to be >75% barren while the larger MODIS pixel clearly contained significant green vegetation.) Exclusive of these points with sparse or no vegetation, a fall plateau suitable for locating the NDVI of senesced and snow-free vegetation was universally present. 021, 13, x FOR PEER REVIEW 8 of 11 Figure 4. Spring start of green season (SOG) as determined by MODIS NDVI curves compared to events observed using remote automated cameras ("Cameras Obs.") and cumulative thawing degree-days (°C-days) at monitoring stations. The camera events are shrub birch first leaves and full leaf-out ("birch_first", "birch_full"), alder half leaf-out, Dryas first flowers and peak flowering ("dryas_first", "dryas_peak") and first day with no snow ("snow_off_all"). In italics below are seasonal maximum snow depths (m) as measured by a single ultrasonic depth sensor near the camera.
Another interesting subset of curves is those with substantial evergreen tree vegetation (spruce trees; defined as plots that were more than one-third occupied by spruce forest [19]). These curves had higher winter NDVI values corresponding to a mix of exposed treetops and snow, relative to the snow-dominated tundra points. The spruce points also had the highest NDVI values in the fall plateau region (days 250 to 300). The peak NDVI of forested points was at the high end of the range of values, reflecting the high overall leaf area of forested communities. . Spring start of green season (SOG) as determined by MODIS NDVI curves compared to events observed using remote automated cameras ("Cameras Obs.") and cumulative thawing degree-days ( • C-days) at monitoring stations. The camera events are shrub birch first leaves and full leaf-out ("birch_first", "birch_full"), alder half leaf-out, Dryas first flowers and peak flowering ("dryas_first", "dryas_peak") and first day with no snow ("snow_off_all"). In italics below are seasonal maximum snow depths (m) as measured by a single ultrasonic depth sensor near the camera.  [19]. The green lines are spruce forest (of Picea glauca or P. mariana) greater than one-third cover, gray lines are miscellaneous tundra and shrub communities, and brown lines are more than 75% cover by barrens (i.e., little or no vegetation).
The NDVI level of senesced vegetation, estimated from the fall plateau as described previously, was generally over half of the maximum NDVI value. Among points with maximum NDVI of 0.4 or greater (i.e., those with substantial green vegetation present, n = 184 out of 210), the NDVI of senesced vegetation averaged 0.70 of the maximum NDVI; the latter was calculated as the maximum NDVI from running 2-week means of daily values for the 90th percentile curves shown in Figure 5. In other words, about 70% of the annual NDVI amplitude was due to the transition between snow and the surface without summer green leaves (e.g., plant litter, above-ground perennial plant parts including evergreen leaves, soil, and rock); just 30% was due to summer greening. The multi-year maximum NDVI and NDVI of senesced vegetation as defined above were well correlated, with r 2 = 0.61. If the NDVI of senesced vegetation is predicted as simply 0.7 times the maximum NDVI, the root mean square error is 0.06 NDVI units.

Discussion and Conclusions
Over half of the annual amplitude in the NDVI cycle in this arctic-subarctic study area was due to the formation and loss of the annual snow cover. Thus, the length of the  [19]. The green lines are spruce forest (of Picea glauca or P. mariana) greater than one-third cover, gray lines are miscellaneous tundra and shrub communities, and brown lines are more than 75% cover by barrens (i.e., little or no vegetation).
Another interesting subset of curves is those with substantial evergreen tree vegetation (spruce trees; defined as plots that were more than one-third occupied by spruce forest [19]). These curves had higher winter NDVI values corresponding to a mix of exposed treetops and snow, relative to the snow-dominated tundra points. The spruce points also had the highest NDVI values in the fall plateau region (days 250 to 300). The peak NDVI of forested points was at the high end of the range of values, reflecting the high overall leaf area of forested communities.
The NDVI level of senesced vegetation, estimated from the fall plateau as described previously, was generally over half of the maximum NDVI value. Among points with maximum NDVI of 0.4 or greater (i.e., those with substantial green vegetation present, n = 184 out of 210), the NDVI of senesced vegetation averaged 0.70 of the maximum NDVI; the latter was calculated as the maximum NDVI from running 2-week means of daily values for the 90th percentile curves shown in Figure 5. In other words, about 70% of the annual NDVI amplitude was due to the transition between snow and the surface without summer green leaves (e.g., plant litter, above-ground perennial plant parts including evergreen leaves, soil, and rock); just 30% was due to summer greening. The multi-year maximum NDVI and NDVI of senesced vegetation as defined above were well correlated, with r 2 = 0.61. If the NDVI of senesced vegetation is predicted as simply 0.7 times the maximum NDVI, the root mean square error is 0.06 NDVI units.

Discussion and Conclusions
Over half of the annual amplitude in the NDVI cycle in this arctic-subarctic study area was due to the formation and loss of the annual snow cover. Thus, the length of the Arctic green season is overestimated by the full length of time that NDVI curve is above its winter values near zero [8]. The fact that the true SOG occurs at a point over halfway up the steep rising portion of the annual NDVI curve, rather than near the concavity at its base, is actually an advantage for precisely locating the SOG date. On the steep rising portion of the curve, any error in selecting the NDVI level representing SOG (the y coordinate) results in less error in the date (the x coordinate) than it would on a lower and flatter portion of the curve. From this perspective, NDVI may be more accurate for locating SOG that other vegetation indices that have values for soil and senesced vegetation closer to zero (e.g., the phenology index [11] or the plant phenology index [12]).
The challenge in the use of NDVI to locate SOG is to determine what NDVI value represents snow-free vegetation lacking deciduous leaves, because snow dominates most of the non-growing season in the Arctic. As demonstrated here and in [13], this value may be extracted from multi-year data for the fall season, when senesced vegetation may persist for a time without snow cover. The fall senesced-vegetation NDVI value may be determined using the maximum NDVI for a fixed date range, if a suitable date range can be established for an entire study area [13]. However, no single date range will be optimal for a study area with a wide range of climates. In my study area, high elevation areas usually have a persistent winter snow cover in early September [14], resulting in maximum NDVI values that drop to winter values near zero by mid-to late-September ( Figure 5). Meanwhile, in warmer locations (represented by the forest curves in Figure 5), maximum NDVI does not decline to levels representing senesced vegetation until mid-September. By the adaptive process outlined here, senesced vegetation's NDVI value is determined for each pixel from the fall plateau of intermediate NDVI values (visible on multi-year data) that represent post-senescence and pre-snow conditions. Once the NDVI value of senesced vegetation has been located, it can be intersected with the spring rising leg of the annual NDVI curves to locate SOG.
The NDVI after vegetation senescence was higher in communities with higher maximum NDVI. This was probably due to both the presence of evergreen plants and the fact that plant litter has a higher NDVI than bare soil [9]. The typical NDVI level of senesced vegetation in my study area was close to 70% of the typical maximum NDVI. A recent study in the high Arctic [26] relating annual plant productivity to time-integrated NDVI also defined SOG as the day when NDVI exceeded 70% of the multi-year average summer value. In a previous study, I used 50% of maximum NDVI to investigate trends in SOG [14]. NDVI remains a very useful measure of vegetation greenness in the Arctic, as long as the contribution to the index of surfaces other than green deciduous leaves are taken into account.