Evaluation of Vegetation Indexes and Green-Up Date Extraction Methods on the Tibetan Plateau

: The vegetation green-up date (GUD) of the Tibetan Plateau (TP) is highly sensitive to climate change. Accurate estimation of GUD is essential for understanding the dynamics and sta-bility of terrestrial ecosystems and their interactions with climate. The GUD is usually determined from a time-series of vegetation indices (VIs). The adoption of different VIs and GUD extraction methods can lead to different GUDs. However, our knowledge of the uncertainty in these GUDs on TP is still limited. In this study, we evaluated the performance of different VIs and GUD extraction methods on TP from 2003 to 2020. The GUDs were determined from six Moderate Resolution Imaging Spectroradiometer (MODIS) derived VIs: normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), normalized difference infrared index (NDII), phenology index (PI), normalized difference phenology index (NDPI), and normalized difference greenness index (NDGI). Four extraction methods ( β max , CCR max , G20, and RC max ) were applied individually to each VI to determine GUD. The GUDs obtained from all VIs showed similar patterns of early green-up in the eastern and late green-up in the western plateau, and similar trend of GUD advancement in the eastern and postponement in the western plateau. The accuracy of the derived GUDs was evaluated by comparison with ground-observed GUDs from 19 agrometeorological stations. Our results show that two snow-free VIs, NDGI and NDPI, had better performance in GUD extraction than the snow-calibrated conventional VIs, NDVI and EVI. Among all the VIs, NDGI gave the highest GUD accuracy when combined with the four extraction methods. Based on NDGI, the GUD extracted by the CCR max method was found to have the highest consistency (r = 0.62, p < 0.01, RMSE = 11 days, bias = − 3.84 days) with ground observations. The NDGI also showed the highest accuracy for preseason snow-covered site-years (r = 0.71, p < 0.01, RMSE = 10.69 days, bias = − 4.05 days), indicating its optimal resistance to snow cover inﬂuence. In comparison, NDII and PI hardly captured GUD. NDII was seriously affected by preseason snow cover, as indicated by the negative correlation coefﬁcient (r = − 0.34, p < 0.1), high RMSE and bias (RMSE = 50.23 days, bias = − 24.25 days).


Introduction
The Tibetan Plateau (TP), located in southwest China and central Asia, is the highest and largest plateau in the world. The special geographical conditions, unique natural environment, and fragile alpine ecosystem of the TP make it a region susceptible to climate change [1]. In the past 50 years, the temperature on the TP has risen by an average of approximately 0.37 • C per decade, twice more than the global average ascending rate during the same period [2]. Climate change often leads to acute changes in plant growth and generates a great instability in ecosystem. A sensitive indicator of terrestrial ecosystem response to climate change is vegetation phenology. In particular, the vegetation greenup date (GUD)-which refers to the time when vegetation photosynthetic activity and plant germination and growth processes under suitable hydrothermal conditions begindescribes the initial increase in land surface greenness in spring [3]. GUD changes can affect the interaction between biosphere and atmosphere, thereby influencing carbon, water, and energy balance [4][5][6]. Therefore, GUD is one of the most important parameters for studying the response and feedback mechanisms of terrestrial ecosystems to climate change (e.g., [7,8]).
Traditionally, GUD detection relies on ground-based observations, such as field visual observations and near-surface measurements. Field visual observations directly observe the seasonality and interannual variation of vegetation phenological growth rhythm at fixed stations. This method is time-consuming and labor-intensive, and is limited by time, place, and species factors. Near-surface measurements record vegetation information by instruments mounted on platforms such as tripod and unmanned aerial vehicle, including optical digital camera [9], multi-band spectroradiometer [10], and CO 2 flux sensor [11]. Digital camera and spectroradiometer can obtain vegetation phenological information at canopy level [12], while CO 2 flux measurements provide gross primary production, from which the detected GUD representing the "physiological phenology" of the vegetation can be found [13]. Despite their ability to observe vegetation information more efficiently, these methods are limited to a small spatial scale [14]. With satellite remote sensing data being available on a long-term global basis, estimates of GUD can be made on a large scale with high spatial and temporal resolution. Time series of vegetation indexes (VIs) are usually used for satellite-based GUD extraction. The normalized difference vegetation index (NDVI) is sensitive to canopy structure and chlorophyll and has been the most widely used VI [15]. However, NDVI can be influenced by the radiance from atmosphere and soil background, and is prone to saturation at a high leaf area index [16]. The enhanced vegetation index (EVI) improves the sensitivity in high biomass regions and reduces the effect from atmosphere and soil [17]. However, it is subject to topography [18]. Additionally, NDVI and EVI can increase with both vegetation growth and snow melt in spring, so the GUD estimation may be influenced by snow melt [19]. In order to account for the snow melt effect, several VIs have been proposed. The normalized difference infrared index (NDII) is expected to separate the two natural processes, as it increases with vegetation growth and decreases with snow melt [20]. However, NDII is ineffective for capturing GUD in places where the two processes occur simultaneously [21] and in some burned areas [22]. The phenology index (PI) combines NDVI and NDII to remove the confounding effects of brightness and wetness [21]. However, it cannot identify sparse vegetation [23,24], and thus it may struggle to recognize the initial growth of vegetation. The normalized difference phenology index (NDPI) is designed for deciduous ecosystems to best distinguish vegetation from soil and snow, while minimizing differences between these background components [24]. However, NDPI ignores the spectral properties of dry grass. By contrast, the normalized difference greenness index (NDGI) not only captures the presence of sparse vegetation well, but also compensates for the effect of dry grass on GUD extraction [23]. Its application is limited to tundra and grassland ecosystems. In large spatial scales, no matter which VI is used, the estimated GUD may have uneven distribution of accuracy due to the spatial heterogeneity of latitudes and vegetation types [25,26]. It is still unclear how these VIs perform in GUD extraction on the TP.
In addition, different GUD extraction methods also play important roles for identifying GUD from time series of VI values. These GUD extraction methods mainly fall into two categories: threshold-based methods and change detection methods [27]. The threshold-based methods include absolute threshold method [28] using a fixed VI value and relative threshold method [29,30] using a certain ratio of the amplitude of the VI curve to determine the GUD. Commonly used relative thresholds include 10% (G10), 20% (G20), and 50% (G50). Despite being simple and convenient, threshold-based methods have little practical biophysical significance. A single threshold is not universally applicable for GUD detection in a large region with different environmental conditions and vegetation types [31]. The absolute threshold method is susceptible to VI fluctuation caused by external factors instead of vegetation, resulting in the bias of GUD detection. In contrast, the relative threshold method excels at adaptability in GUD estimation due to its consideration of the differences between vegetation types and the annual variation in vegetation growth. However, the relative threshold method is sensitive to noise and may be unstable over time [29]. The change detection methods determine GUD by finding the change characteristic points on VI curve, such as the maximum value of the first derivative (β max ) [32], the first local maximum curvature change rate (CCR max ) of the annual VI curve [3], and the fastest change rate (RC max ) of multi-year averaged VI curve [33]. These change detection methods were regarded as effective in GUD extraction for general vegetation types and mixed vegetation pixels [34]. However, these change detection methods cannot accurately determine GUD if there is no abrupt VI increase during the growth stage, especially when the VI data were contaminated by cloud [35], or if the vegetation has no much seasonal spectral variation [36]. Furthermore, the change detection methods are also sensitive to the noise in VI time series [37]. In recent years, more effort has been made to develop the methods for deriving physiological GUD of specific vegetation types, especially for crops. For example, Pan et al. [38] extracted GUDs of corn and wheat based on the threshold with the highest probability of VI ratio. Xu et al. [39] calibrated the program of GUD estimations for barley, rape, wheat, and beet, by optimizing the smoothing method, algorithm of Best Index Slope Extraction, and the threshold setting based on ground-observed GUDs. These methods are actually the extensions of the threshold-based methods using empirical knowledge and statistical analysis. They are, to some extent, influenced by individual experience, and the correct and rational threshold for unbiased GUD estimation is difficult to define. Additionally, the phenology matching method was designed to match the ground-observed GUD to a certain trajectory feature point in the VI time series [37]. However, the GUD accuracy may be impacted by the considerable variability of ground observations across different years and locations [40]. Although these advanced methods have achieved some effectiveness in small field-scale areas and specific species, they have no widespread applicability. In contrast, the general GUD extraction methods (such as β max , CCR max , G20, and RC max ) are more suitable for large geographic areas such as TP.
Some studies have compared the performances of different GUD extraction methods, but most are limited to the analysis of a single conventional VI. White et al. [41] assessed ten GUD extraction methods in North America using NDVI data and found that individual methods differed in GUD estimates by ±60 days. Based on NDVI data, Wang et al. [42] employed five methods (RC max , G20, G50, maximum relative change, and maximum increase) to extract GUD in China's temperate monsoon area. They concluded that the GUD derived from G50 was significantly correlated with ground-observed GUD. Zhang et al. [43] evaluated three relative threshold methods (G10, G30, and G50), five absolute threshold methods (0.1, 0.5, 1, 1.5, and 2), and seasonal trend decomposition by loess in phenology extraction in China using the leaf area index and found that G10 gave a better GUD prediction than the other methods. Although previous studies have confirmed the nonnegligible influences of extraction methods on GUD estimation, further research is needed to quantify the uncertainty of GUD determined from different VIs and by different methods.
This study evaluated the effects of VIs and extraction methods on the determination of GUD over the TP. We calculated GUDs based on the time series (2003-2020) of NDVI, EVI, NDII, PI, NDPI, and NDGI derived from Moderate Resolution Imaging Spectroradiometer (MODIS) surface reflectance data. Four GUD extraction methods (β max , CCR max , G20, and RC max ) were applied individually to each VI. We analyzed the spatiotemporal variation in GUDs derived from six VIs on the TP during the study period. By comparison with ground-observed GUD from 19 agrometeorological stations, we then evaluated the overall accuracy of GUDs derived from six VIs and their accuracy in areas with and without preseason snow cover. Based on the optimal VI, we also assessed the consistency of the GUDs extracted using different extraction methods with the ground observations. Then, we discussed the applicability of snow-free VIs on the TP and the impact of extraction methods on GUD accuracy.

Study Area
The TP lies in southwestern China. Starting from the Pamir Plateau in the west, it reaches the Hengduan Mountains in the east, the southern margin of the Himalayas in the south, and the northern side of the Kunlun and Qilian Mountains in the north (26 • Figure 1). It covers an area of approximately 2.5 × 10 6 km 2 with an average elevation exceeding 4000 m above sea level (asl). The topography of the TP is complex, and its elevation gradually decreases from west to east and from north to south. The climate of the TP is characterized by low temperatures, little rainfall, and strong solar radiation. In general, it varies gradually from dry and cold in the northwest to humid and warm in the southeast [44].
overall accuracy of GUDs derived from six VIs and their accuracy in areas with and without preseason snow cover. Based on the optimal VI, we also assessed the consistency of the GUDs extracted using different extraction methods with the ground observations. Then, we discussed the applicability of snow-free VIs on the TP and the impact of extraction methods on GUD accuracy.

Study Area
The TP lies in southwestern China. Starting from the Pamir Plateau in the west, it reaches the Hengduan Mountains in the east, the southern margin of the Himalayas in the south, and the northern side of the Kunlun and Qilian Mountains in the north (26°00′12″N-39°46′50″N, 73°18′52″E-104°46′59″E; Figure 1). It covers an area of approximately 2.5 × 10 6 km 2 with an average elevation exceeding 4000 m above sea level (asl). The topography of the TP is complex, and its elevation gradually decreases from west to east and from north to south. The climate of the TP is characterized by low temperatures, little rainfall, and strong solar radiation. In general, it varies gradually from dry and cold in the northwest to humid and warm in the southeast [44].
These unique environmental conditions have formed a typical alpine ecosystem on the TP, where the vegetation distribution includes alpine-steppes and meadows over the flat plateau, shrubs in the valleys, and a small percentage of forest in the southeastern plateau ( Figure 1). The vegetation growing season in the TP is mainly from May to September. According to topographical, moisture, and thermal features, the TP is divided into 11 eco-geographical regions [45] (Figure 1, Table A1).
The TP is also rich in snow in the winter season. The duration of snow cover varies across different elevation ranges and generally increases with altitude [46]. Higher elevations on the TP are covered perennially with snow.  These unique environmental conditions have formed a typical alpine ecosystem on the TP, where the vegetation distribution includes alpine-steppes and meadows over the flat plateau, shrubs in the valleys, and a small percentage of forest in the southeastern plateau ( Figure 1). The vegetation growing season in the TP is mainly from May to September. According to topographical, moisture, and thermal features, the TP is divided into 11 ecogeographical regions [45] (Figure 1, Table A1).
The TP is also rich in snow in the winter season. The duration of snow cover varies across different elevation ranges and generally increases with altitude [46]. Higher elevations on the TP are covered perennially with snow. Explorer (https://espa.cr.usgs.gov/, accessed on 13 December 2020). MODIS measured surface spectral reflectance at bands 1 to 7, and corrected for the effects of atmospheric conditions such as thin cirrus clouds, aerosols, Rayleigh scattering, etc. [47]. The following bands, red (band 1), near-infrared (NIR, band 2), blue (band 3), green (band 4), and shortwave infrared (SWIR, band 6), were employed to calculate the six VIs (NDVI, EVI, NDII, PI, NDPI, and NDGI).
For all the time series of VIs over the study area, the maximum value composite (MVC) method [48] was applied to synthesize every two consecutive 8-day composite VIs into 16-day composited VIs. This operation helped to eliminate data noise caused by cloud pollution and harsh atmospheric conditions. The VI time series was constructed using the date of the first day of each 16-day cycle from the MODIS product. The remaining missing data in the time series of the VIs were filled using linear interpolation. The Savitzky-Golay filter [49,50] was used to remove outliers and smoothen the time series.
To concentrate on pixels with seasonal vegetation dynamics, we used the following process, as recommended by Shen et al. [51]. Since the NDVI and EVI values of snow cover are lower than those of vegetation [52], for each pixel, we replaced the snow-polluted NDVI (or EVI) values with the median of the unpolluted winter NDVI or EVI values from November to March [51]. Pixels with snow cover were recognized by using the snow flag in the product quality evaluation data (MOD09A1). Pixels that met the following requirements were reserved: (1) the annual maximum NDVI occurred in the period between July and September, (2) the average NDVI from June to September exceeded 0.1, and (3) the average NDVI from July to September was 1.2 times higher than that from November to March [51]. According to the pixels retained by the above practices, we also selected pixels of EVI, NDII, PI, NDPI, and NDGI in the corresponding positions.

Ground-Observed GUD Data
The ground-observed GUD records for the TP during 2003-2013 were collected from the China Meteorological Data Sharing Service System (http://data.cma.cn/, accessed on 11 July 2021). The dataset contained observations from 19 agrometeorological stations. The dates of phenological phases with names of green-up (or emergence of seeding) in the dataset were used to represent the GUD. If there was more than one GUD record for any ground station in a year, the averaged GUD was used.

Snow Cover Product
The 500 m resolution daily and gapless snow cover product from 1 October 2002 to 30 September 2020 was produced for the TP using the HMRF modeling technique [53]. The HMRF modeling technique combined spectral, spatiotemporal, and environmental information in an optimal manner. This method can not only fill the data gaps in the original snow cover products but also significantly improve the accuracy of the original product. The overall accuracy of the data was verified to be 98.31% according to the snow depth data from ground meteorological stations, and 92.44% according to the snow cover extent extracted from Landsat-8 [54]. This snow cover product was used to determine the snow cover end date (SCED) for the study period, and to distinguish whether there is preseason snow cover at agrometeorological stations.

Land Cover Type Data
The land cover type map for the TP with a spatial resolution of 500 m was obtained from the Science Data Bank (http://www.scidb.cn/doi/10.11922/sciencedb.398, accessed on 15 July 2021). These data delineate the spatial distribution of main vegetation types on the TP, including alpine steppes, meadows, shrubs, and forests.

Methods
Six VIs (NDVI, EVI, NDII, PI, NDPI, and NDGI) were first calculated from MODIS surface reflectance products MOD09A1.The GUDs were then extracted using four methods (β max , CCR max , G20, and RC max ) for each VI. SCED was calculated from daily snow cover data to distinguish the areas with and without preseason snow cover. The Theil-Sen nonparametric regression was used to analyze the trend of GUD temporal variation. Then, the performances of different VIs and GUD extraction methods were assessed using Pearson's correlation, linear regression, RMSE, and bias.

Calculation of VIs
The NDVI is calculated as the normalized form of the red (ρ Red ) and NIR (ρ N IR ) band reflectance [15]:

EVI
The EVI adds a blue band (ρ Blue ) to enhance vegetation signals and correct the effects of soil background and aerosol scattering. It can improve sensitivity in areas with high biomass [17]: NDII is the normalized composition of the NIR (ρ N IR ) and SWIR band reflectance (ρ SW IR ) [55]:

PI
The PI combines NDVI and NDII to solve the mixed effects of brightness and humidity on vegetation information [21]: The NDPI utilizes the red, NIR and SWIR bands to significantly distinguish vegetation from background components (soil and snow) [24]: where α is a weight coefficient determined by the spectral configuration of satellite sensor, and ranges from 0.0 to 1.0. It was parameterized as 0.74 for MODIS bandwidths [24].

NDGI
The semi-analytical index NDGI consists of green, red, and NIR bands, to account for the spectral characteristics of soil, snow, dry grass, and vegetation, and it adopts a linear spectral mixture model to improve the ability of monitoring vegetation phenology [23]. NDGI is calculated as: where α is the weight coefficient that depends on the observation sensor. For MODIS data, α was found to be 0.65 in reference to Yang et al. [23].

Determination of Vegetation GUD
The GUD varies with different extraction methods, and its accuracy may have spatial heterogeneity [26]. Presently, there is no consensus on which method is best suited for the GUD extraction for different ecosystems [56]. In our study, four methods were employed to determine GUD from annual VI time series, including β max , CCR max , G20, and RC max .
For the β max , CCR max , and G20 methods, the four-parameter logistic function was first used to fit the VI time series from the beginning of the year to the annual maximum VI [3]: where t is the time in days of year (DOY), VI(t) is the VI value at time t, a and b are fitting parameters, d is the initial background VI value, and c + d is the annual maximum VI value [3]. For RC max , the VI time series from January to September of each year was fitted using a sixth-order polynomial.
Then, the mutation points on the fitting curves were determined as GUD with the following rules.

Method 1: β max
This method focuses on the rate of increase in the VIs during the growing season, which is represented by the first derivative (β) of the fitted logistic function. The GUD is defined as the time at which β reaches its maximum [32].

Method 2: CCR max
CCR max calculates the curvature change rate (CCR) of the fitted logistic function as follows [3]: where the meanings of a, b, and c are the same as those in Equation (7), z = e a+bt . The time corresponding to the first local maximum of CCR marks the GUD [3].

Method 3: G20
For G20, GUD is defined as the first day when the VI ratio reaches the dynamic threshold of 0.2 in the fitted logistic function [29,30]. A VI ratio of 0.2 states that a site achieved 20% of its maximum greenness regardless of vegetation type. The annual VI ratio is calculated as follows: where VI t is the VI value at time t, and VI max and VI min are the maximum and minimum VI values in the ascending part of the annual VI time series, respectively.

Method 4: RC max
We first calculated the 18-year averaged VI series for each pixel at 16-day intervals. The rate of change (RC) of the averaged VI series was then calculated following Piao et al. [33]: where VI t is the multi-year averaged VI value at a given time t (temporal resolution of 16 day). The VI value corresponding to the maximum VI ratio was considered as the common threshold for the annual VI series of a pixel to determine the GUD. The annual GUD was defined as the first date when the fitted polynomial curve exceeded a predefined common threshold.

Calculation of SCED
In this study, the snow cover season was defined as the period from 1 October to 30 September of the next year. Several ephemeral snow events often occur in early fall prior to persistent snow events or in late spring after persistent snow events, resulting in multiple snowmelt and re-accumulation processes in cold regions throughout the snow season. Thus, we adopted a snow cover duration threshold of 5 days, which is commonly used in snow cover phenology research (e.g., [57,58]) to define SCED. SCED is the last day when the pixel is marked as snow for five consecutive days.

Trend Analysis
The inter-annual variation trend of vegetation GUD were analyzed at each pixel using Theil-Sen non-parametric regression [59,60]. All pairwise slopes of the time series data were calculated using this method, with the median slope representing the overall trend of change throughout the study period. Compared with traditional least-squares linear regressions, this approach does not require the independence and normality of the time series data, and is robust against outliers. The magnitude of the trend was measured using β, which is defined as follows: where β is the slope of the trend, n represents the length of the time series, and x i and x j are the ith and jth data values in the time series, respectively. The significance of the trend was inspected using the Mann-Kendall test [61].

Accuracy Assessment
The accuracy of GUDs derived from different VIs and extraction methods was evaluated by comparison with the ground-observed GUD from 19 agrometeorological stations over the TP. The trends of satellite and ground-observed GUDs have been proven to be consistent over time and space, despite the differences in their differences [14]. Satellite GUDs within a 10 km × 10 km (21 pixels × 21 pixels) window centered on each station were used to calculate the mean value. This mean value and the ground-observed GUD at the station form one pair and 19 × 11 (station × year) pairs of GUDs was obtained for each VI (or each extraction method). All the GUD pairs during 2003-2013 were then analyzed for each VI (or each extraction method) using a simple linear regression model. Four indices, namely the Pearson's correlation coefficient r, regression slope, root mean square error (RMSE), and bias, were used for accuracy assessment. We also explained the impact of preseason snow cover on GUDs extracted from different VIs by comparing the accuracy in pixel-years with and without preseason snow cover. A pixel-year was marked as affected by preseason snow cover if SCED occurred within two months prior to GUD [62]. To explain the differences between GUDs extracted though different methods, we further calculated and compared the threshold percentage of VI curve amplitude corresponding to ground-observed GUD and satellite-observed GUD for different vegetation types at agrometeorological stations. EVI, NDII, PI, NDPI, and NDGI, respectively. Generally, GUDs derived from different VIs mainly occurred from 120 to 160 DOY (May to 9 June), and showed a similar delay from east to west, with a great spatial heterogeneity over the whole plateau. Across all vegetation types, the GUD of forests occurred the earliest, followed by that of shrubs and meadows, and finally the alpine steppes. The earliest GUD (before 130 DOY, 10 May) mainly occurred in sub-humid regions (HIIAB1, HIB1, and HIIC1), and the latest GUD (after 150 DOY, 30 May) mainly occurred in semi-arid regions (HIC2 and HIIC2). Specific spatial differences between the GUDs extracted from different VIs can be observed. GUD NDVI , GUD EVI , GUD NDPI , and GUD NDGI mainly occurred during the period of 120-150 DOY (May) (Figure 2a,b,e,f). These four showed consistently that forests mainly greened from 110 to 130 DOY (20 April to 10 May), shrubs from 110 to 140 DOY (20 April to 20 May), meadows from 120 to 150 DOY (May), and steppes from 130 to 160 DOY (10 May to 9 June). GUD NDII had a relatively longer time span, mostly ranging from 100 to 165 DOY (early April to middle June) in most regions (Figure 2c). The spatial transitions of GUD NDII are less obvious compared to the other GUDs since the steppe was the latest to fully green up-until nearly 230 DOY (mid-August). GUD PI occurred generally later in the entire plateau, mainly from 135 to 160 DOY (15 May to 9 June) (Figure 2d). The differences in GUD PI among the different vegetation types were larger. Figure 3 shows the temporal variation trends of the GUDs from 2003 to 2020. The significance of the trend was marked in the lower left of each subgraph. Generally, the GUDs derived from different VIs were advanced in the eastern, but delayed in the western part of the plateau during the study period. This trend was significant mainly in the northeast part of TP. Nearly half of the study area had advanced GUD from 2003 to 2020. The average advancement determined from GUD NDVI , GUD EVI , GUD NDII , GUD PI , GUD NDPI , and GUD NDGI were −0.92, −0.7, −0.44, −0.90, −0.74, and −0.84 days/year (p < 0.01), respectively. The delayed GUD was mainly distributed in the southwest, but the trend of delay was almost less than 1 day/year, and was not significant. Among the different vegetation types, the meadows had the most significant trend of advancement, while the steppe's variation trend was the mildest and was insignificant. The areas with the most advanced GUD were located in sub-humid and semi-arid regions (HIB1, HIC1, and HIIC1).

Performance of VIs in GUD Extraction
Different VIs and extraction methods lead to different GUD accuracies. The performances of the six VIs for capturing GUD were first evaluated through the comparison of correlation coefficient r, linear regression slope, RMSE, and bias between the ground-

Performance of VIs in GUD Extraction
Different VIs and extraction methods lead to different GUD accuracies. The performances of the six VIs for capturing GUD were first evaluated through the comparison of correlation coefficient r, linear regression slope, RMSE, and bias between the groundobserved GUDs and satellite-observed GUDs between 2003 and 2013 ( Figure 4). Each VI was combined using all four methods. The results showed that NDVI, EVI, NDPI, and NDGI can capture GUD with the r higher than 0.48, the regression slope greater than 0.30, the RMSE of approximately 15 days, and the bias of approximately 10 days. It is worth noting that the NDVI and EVI here have been preprocessed to remove snow influences. Among them, NDGI achieved the largest r (r = 0.53) and regression slope (slope = 0.45), indicating that the GUD extracted from NDGI had the best consistency with ground observations. However, NDII and PI showed a lower correlation and greater difference with ground observations than the other VIs.  (Figure 4). Each VI was combined using all four methods. The results showed that NDVI, EVI, NDPI, and NDGI can capture GUD with the r higher than 0.48, the regression slope greater than 0.30, the RMSE of approximately 15 days, and the bias of approximately 10 days. It is worth noting that the NDVI and EVI here have been preprocessed to remove snow influences. Among them, NDGI achieved the largest r (r = 0.53) and regression slope (slope = 0.45), indicating that the GUD extracted from NDGI had the best consistency with ground observations. However, NDII and PI showed a lower correlation and greater difference with ground observations than the other VIs. Snow cover in early spring can affect GUD identification [21,62]. To understand the influence of preseason snow cover on the accuracy of GUD extracted from the six VIs, we further divided all site-years into categories of with and without preseason snow cover, and tested the correlation coefficient r, regression slope, RMSE, and bias between groundobserved GUD and satellite-observed GUDs under these two cases (Table 1). Here, satellite-observed GUDs were detected from six VIs using CCRmax. For the site-years without preseason snow cover, the comparison between ground-observed GUD and satellite-observed GUDs derived from six VIs showed significant positive correlations. NDGI had the best performance in the snow-free case, with the highest r value, lowest RMSE, and smallest bias (r = 0.60, p < 0.01, RMSE = 11.06 days, bias = −3.80 days). NDVI (r = 0.58, p < 0.01, RMSE = 14.03 days, bias = −8.70 days), NDPI (r = 0.57, p < 0.01, RMSE = 12.78 days, bias = −6.81 days) and EVI (r = 0.52, p < 0.01, RMSE = 13.96 days, bias = −8.30 days) also had relatively good correlation with ground-observed GUD. The GUDs extracted from NDII and PI were weakly correlated with the ground-observation GUD. For the preseasonsnow-covered site-years, NDVI (r = 0.72, p < 0.01), NDGI (r = 0.71, p < 0.01), EVI (r = 0.63, p < 0.01), and NDPI (r = 0.57, p < 0.01) still performed well with a significant correlation with ground-observed GUD. Among them, NDGI had a lower RMSE and bias (RMSE = 10.69 days, bias = −4.05 days) than the other VIs. GUDs extracted from NDII were negatively correlated with and greatly deviated from ground observations, indicating that preseason snow cover caused a great disturbance to GUD extraction from NDII. Snow cover in early spring can affect GUD identification [21,62]. To understand the influence of preseason snow cover on the accuracy of GUD extracted from the six VIs, we further divided all site-years into categories of with and without preseason snow cover, and tested the correlation coefficient r, regression slope, RMSE, and bias between ground-observed GUD and satellite-observed GUDs under these two cases (Table 1). Here, satellite-observed GUDs were detected from six VIs using CCR max . For the siteyears without preseason snow cover, the comparison between ground-observed GUD and satellite-observed GUDs derived from six VIs showed significant positive correlations. NDGI had the best performance in the snow-free case, with the highest r value, lowest RMSE, and smallest bias (r = 0.60, p < 0.01, RMSE = 11.06 days, bias = −3.80 days). NDVI (r = 0.58, p < 0.01, RMSE = 14.03 days, bias = −8.70 days), NDPI (r = 0.57, p < 0.01, RMSE = 12.78 days, bias = −6.81 days) and EVI (r = 0.52, p < 0.01, RMSE = 13.96 days, bias = −8.30 days) also had relatively good correlation with ground-observed GUD. The GUDs extracted from NDII and PI were weakly correlated with the ground-observation GUD. For the preseason-snow-covered site-years, NDVI (r = 0.72, p < 0.01), NDGI (r = 0.71, p < 0.01), EVI (r = 0.63, p < 0.01), and NDPI (r = 0.57, p < 0.01) still performed well with a significant correlation with ground-observed GUD. Among them, NDGI had a lower RMSE and bias (RMSE = 10.69 days, bias = −4.05 days) than the other VIs. GUDs extracted from NDII were negatively correlated with and greatly deviated from ground observations, indicating that preseason snow cover caused a great disturbance to GUD extraction from NDII. The symbols *, **, and *** are significant at levels of 0.1, 0.05, and 0.01, respectively.
For almost all the VIs, the derived GUDs had stronger correlation with ground observation in preseason-snow-covered areas than in preseason-snow-free areas. Similar findings were also concluded in the study of Huang et al. [62]. The possible reason is that the sample number of site-years with preseason snow cover (sample number = 32) was much lower than that of sites without preseason snow cover (sample number = 168), thus resulting in the difference between the correlation results.

Performance of GUD Extraction Methods
The above results suggest that NDGI had the best performance and is suitable for GUD detection. Next, we focused on NDGI to evaluate the performance of the four GUD extraction methods ( Figure 5). The comparison between ground-observed GUD and satellite-observed GUDs derived through the four methods showed significant correlations for all site-years (Figure 5a). CCR max obtained the most accurate GUD with the highest r, lowest RMSE, and smallest bias (r = 0.62, p < 0.01, RMSE = 11 days, bias = −3.84 days). β max had a relatively lower r, the highest RMSE, and the largest bias (r = 0.40, p < 0.01, RMSE = 36.80 days, bias = 34.79 days). When the comparison was conducted between the multi-year averaged satellite-based and ground-based GUDs (in total 19 GUD pairs for each extraction method), the linear correlations became stronger (Figure 5b), with CCR max still having the strongest relationship (r = 0.79, p < 0.01, RMSE = 8.07 days, bias = −3.82 days) compared to the other methods (Figure 5b). We further compared the interannual variations and trends of the spatially averaged GUDs extracted using different methods (Figure 5c). The trend of GUD was 0.11 days/year (p > 0.1) for ground observation, and was −0.49 days/year (p < 0.05) for GUD detected by β max , −0.42 days/year (p > 0.1) for GUD detected by CCR max , −0.45 days/year (p < 0.1) for GUD detected by G20, −0.63 days/year (p < 0.1) for GUD detected by RC max . Although not all the trends were significant, the GUD estimated through CCR max was closest to the ground observation trend, while the other methods were different. In summary, CCR max was the most accurate and suitable method for GUD extraction.

Applicability of Snow-Free VIs on the TP
GUD derived from conventional VIs, such as NDVI and EVI, can be affected by snow cover. To eliminate the influence of snowmelt on VIs, previous studies replaced snowpolluted VI values with unpolluted VI values (e.g., [63,64]), which was the method we used to process NDVI and EVI in this study. By comparing the accuracy of GUDs extracted by the two VIs with and without snow calibration (Table 2), we found that snow calibration can significantly improve the r value (increased by 0.42 for NDVI, 0.43 for EVI) and regression slope (increased by 0.26 for NDVI, 0.25 for EVI), while significantly reducing RMSE (reduced by 9.13 days for NDVI, 9.39 days for EVI) and bias (reduced by 1.93 days for NDVI, 1.84 days for EVI).

Applicability of Snow-Free VIs on the TP
GUD derived from conventional VIs, such as NDVI and EVI, can be affected by snow cover. To eliminate the influence of snowmelt on VIs, previous studies replaced snow-polluted VI values with unpolluted VI values (e.g., [63,64]), which was the method we used to process NDVI and EVI in this study. By comparing the accuracy of GUDs extracted by the two VIs with and without snow calibration (Table 2), we found that snow calibration can significantly improve the r value (increased by 0.42 for NDVI, 0.43 for EVI) and regression slope (increased by 0.26 for NDVI, 0.25 for EVI), while significantly reducing RMSE (reduced by 9.13 days for NDVI, 9.39 days for EVI) and bias (reduced by 1.93 days for NDVI, 1.84 days for EVI). Snow-free VIs have been proposed to account for and minimize the influence of snow cover on GUD monitoring, such as NDII, PI, NDPI, and NDGI. However, they still have some drawbacks in GUD detection. NDII is sensitive to moisture because of the incorporation of SWIR spectral band [20], leading to higher values over snow than for vegetation [21]. Therefore, NDII can increase during vegetation growth and decrease with snow melt [22], both of which happen in the spring season, thus resulting in overestimation of GUD [20]. The PI was designed to reflect a more realistic status of vegetation growth by singling out the impact of wetness from greenness [21]. For ice, snow, and water that result in NDVI < 0, PI is assigned 0; for soil and non-photosynthetic vegetation components that result in NDII < 0, PI is assigned 0 [21]. However, sparse vegetation cannot be recognized by PI [24], usually leading to the overestimated GUD.
Our results indicate that NDPI and NDGI can capture the TP's surface GUD, giving better results than the NDVI and EVI after snow calibration ( Figure 4). In particular, the NDGI achieved the highest accuracy. The GUD estimated by NDPI showed a greater negative bias than that estimated by NDGI, mainly in the southwest TP with arid alpine steppe as the main vegetation type. Similar results were also found in studies at grassland flux tower sites in the middle and high latitudes [23], and NDPI has been proved to have greater uncertainty in GUD extraction than NDGI in grasslands in Kazakhstan, Mongolia, and North Asia [25]. The probable reason is that NDGI accounts for dry grass-whose reflectance varies almost linearly from green to NIR-and the rationale behind NDGI gives dry grass a value of approximately zero [23]. NDPI is based on the assumption that the reflectance of background components (soil and snow) varies monotonously from red to NIR to SWIR bands and is designed to be close to zero on the background component [24]. However, the spectral characteristics of dry grass showed nearly equal reflectance in the NIR and SWIR bands. Therefore, the NDPI value of dry grass will be higher than those of other background components, which may be mistaken as vegetation growth and result in an underestimation of GUD. Nevertheless, it has also been found that NDPI performs better than NDGI in the absence of dry grass due to its better ability to suppress variations in soil background [24], indicating the unique value of NDPI for deciduous ecosystem. These results inspired us to select the most suitable VI for GUD extraction according to the vegetation types and environmental conditions of the study area. In addition, these two snow-free VIs can be easily and expediently calculated from multiband surface reflectance data, giving it an advantage over the practice of eliminating snow cover from traditional VIs with the help of auxiliary data.

Impact of Extraction Methods on GUD Accuracy
In this study, although the GUDs extracted by the four methods were significantly correlated with the ground observation, there were obvious differences in the accuracy of GUDs determined by different extraction methods ( Figure 5). Figure 6 shows the threshold percentage of NDGI curve amplitude corresponding to ground-observed GUD and satellite-observed GUD for the four vegetation types. We found that the NDGI threshold percentage of ground-observed GUD ranged from 8% to 16%, and varied with vegetation type (Figure 6). This implies that a single relative threshold cannot capture a GUD with 100% accuracy. In addition, the NDGI threshold percentage varies with different GUD extraction methods; thus, different GUD extraction methods reflect different stages of plant growth [41]. For example, the NDGI threshold percentages for CCR max and β max were approximately 9.18% and 50%, respectively (Figure 6), which was also found in a previous study [65]. Therefore, CCR max is more likely to detect GUD in the early stages of vegetation growth, whereas β max favors exuberant photosynthesis and fast plant growth. CCRmax showed higher GUD extraction accuracy than the other methods ( Figure 5), which is consistent with previous studies on winter wheat in the Huanghuai region [26] and for grassland on the TP [66]. There are two probable reasons for CCRmax's superiority. First, the NDGI threshold percentage of the GUD extracted by CCRmax was closest to that of ground observation ( Figure 6). Second, CCRmax is less affected by fluctuations in the initial background VI value, which is more susceptible to noise [66].
βmax and G20 overestimate the GUD, and βmax has a larger positive deviation and lower correlation with ground observations than G20 in GUD extraction ( Figure 5). This result is reasonable, since the βmax is mostly equivalent to the 50% relative threshold method (G50). Many studies have found that G20 gained a GUD that is closer to and more correlated with ground observations than G50 [67]. This is probably because GUD at the species scale cannot be reflected by a 50% relative threshold [68].
RCmax acts similarly by setting a fixed NDGI threshold for each pixel, marking the multi-year averaged NDGI with the fastest growth rate [33]. Therefore, RCmax is more sensitive to the shape and interannual variation in the VI curve. The corresponding NDGI threshold percentages of RCmax differ for different vegetation types, mainly 26-52% for steppe and 2-10% for meadows, shrubs, and forests ( Figure 6). Therefore, the accuracy of GUD extracted by RCmax may have greater spatial heterogeneity. Therefore, CCRmax is recommended for GUD extraction in combination with NDGI on the TP, and the uncertainty of GUD extraction using different methods in different study areas still deserves attention.

Conclusions
This study extracted satellite-based GUDs on the TP from six MODIS-derived VIs (NDVI, EVI, NDII, PI, NDPI, and NDGI) by applying four different extraction methods (βmax, CCRmax, G20, and RCmax). We evaluated the spatial distribution and temporal variation trends of GUD on the TP. We then assessed the accuracy of satellite-based GUDs by comparison with ground-observed GUD from 19 agrometeorological stations, using Pearson's correlation, linear regression, RMSE, and bias.
The results show that the GUD on the TP mainly occurred from 120 to 160 DOY (May to 9 June), and it was early in the eastern and late in the western part of the plateau. GUD occurred earliest in forests for all vegetation types, followed by shrubs and meadows, and alpine steppes were the latest. During the study period, the GUD advanced in the eastern region but was delayed in the western part of the plateau. The GUD of meadows in the northeastern TP exhibited and the most significant advancement. The GUD variation trend in the western and southeastern plateau is mild and insignificant. CCR max showed higher GUD extraction accuracy than the other methods ( Figure 5), which is consistent with previous studies on winter wheat in the Huanghuai region [26] and for grassland on the TP [66]. There are two probable reasons for CCR max 's superiority. First, the NDGI threshold percentage of the GUD extracted by CCR max was closest to that of ground observation ( Figure 6). Second, CCR max is less affected by fluctuations in the initial background VI value, which is more susceptible to noise [66].
β max and G20 overestimate the GUD, and β max has a larger positive deviation and lower correlation with ground observations than G20 in GUD extraction ( Figure 5). This result is reasonable, since the β max is mostly equivalent to the 50% relative threshold method (G50). Many studies have found that G20 gained a GUD that is closer to and more correlated with ground observations than G50 [67]. This is probably because GUD at the species scale cannot be reflected by a 50% relative threshold [68]. RC max acts similarly by setting a fixed NDGI threshold for each pixel, marking the multi-year averaged NDGI with the fastest growth rate [33]. Therefore, RC max is more sensitive to the shape and interannual variation in the VI curve. The corresponding NDGI threshold percentages of RC max differ for different vegetation types, mainly 26-52% for steppe and 2-10% for meadows, shrubs, and forests ( Figure 6). Therefore, the accuracy of GUD extracted by RC max may have greater spatial heterogeneity. Therefore, CCR max is recommended for GUD extraction in combination with NDGI on the TP, and the uncertainty of GUD extraction using different methods in different study areas still deserves attention.

Conclusions
This study extracted satellite-based GUDs on the TP from six MODIS-derived VIs (NDVI, EVI, NDII, PI, NDPI, and NDGI) by applying four different extraction methods (β max , CCR max , G20, and RC max ). We evaluated the spatial distribution and temporal variation trends of GUD on the TP. We then assessed the accuracy of satellite-based GUDs by comparison with ground-observed GUD from 19 agrometeorological stations, using Pearson's correlation, linear regression, RMSE, and bias.
The results show that the GUD on the TP mainly occurred from 120 to 160 DOY (May to 9 June), and it was early in the eastern and late in the western part of the plateau. GUD occurred earliest in forests for all vegetation types, followed by shrubs and meadows, and alpine steppes were the latest. During the study period, the GUD advanced in the eastern region but was delayed in the western part of the plateau. The GUD of meadows in the northeastern TP exhibited and the most significant advancement. The GUD variation trend in the western and southeastern plateau is mild and insignificant.
The accuracy assessment of the GUD showed that different VIs and extraction methods produced different levels of uncertainties in GUD detection. NDGI, NDPI, and snowcalibrated NDVI and EVI can capture GUD and maintain favorable accuracy for areas with preseason snow cover, whereas NDII and PI hardly captured GUD. NDGI was superior to the other VIs in GUD extraction, with the highest overall accuracy. We also found that the GUD extracted by CCR max had the best consistency with ground observations. In addition, our experiment proved that snow calibration could significantly improve the accuracy of GUD extracted from NDVI and EVI. In contrast, the snow-free VIs, NDGI and NDPI, were able to achieve similar or even better accuracies than NDVI and EVI after snow calibration. Our research substantiates the applicability of snow-free VIs, particularly NDGI, for GUD detection on the TP. These results provide a significant perspective for analyzing vegetation dynamics on the TP.

Data Availability Statement:
The data that support the findings of this study are available from the website given in the manuscript.