Comparison of Winter Wheat Extraction Methods Based on Different Time Series of Vegetation Indices in the Northeastern Margin of the Qinghai–Tibet Plateau: A Case Study of Minhe, China

: The northeastern margin of the Qinghai–Tibet Plateau (QTP) is an agricultural protection area in China’s new development plan, and the primary region of winter wheat growth within QTP. Winter wheat monitoring is critical for understanding grain self-sufﬁciency, climate change, and sustainable socioeconomic and ecological development in the region. However, due to the complex terrain and high altitude of the region, with discontinuous arable land and the relatively low level of agricultural development, there are no effective localization methodologies for extracting and monitoring the detailed planting distribution information of winter wheat. In this study, Sentinel-2A/B data from 2019 to 2020, obtained through the Google Earth Engine platform, were used to build time series reference curves of vegetation indices in Minhe. Planting distribution information of winter wheat was extracted based on the phenology time-weighted dynamic time warping (PT-DTW) method, and the effects of different vegetation indices’ time series and their corresponding threshold parameters were compared. The results showed that: (1) the three vegetation indices—normalized difference vegetation index (NDVI), normalized differential phenology index (NDPI), and normalized difference greenness index (NDGI)—maintained high mapping potential; (2) under the optimal threshold, >88% accuracy of index identiﬁcation for winter wheat extraction was achieved; (3) due to improved extraction accuracy and resulting boundary range, NDPI and its corresponding optimal parameter ( T = 0.05) performed the best. The process and results of this study have certain reference value for the study of winter wheat planting information change and the formulation of dynamic monitoring schemes in agricultural areas of QTP. combined with the PT-DTW algorithm were compared to obtain the opti- mal predictive indices and corresponding model (i.e., algorithm) parameters for distribu- tion mapping. The findings of the present study could lay a foundation for the develop- ment of methods for the extraction of winter wheat planting information in the northeast- ern margin of QTP, and provide a technical reference for subsequent remote sensing mon- itoring efforts for other important agricultural products. The results indicated veriﬁcation and relatively training sample requirements led to the selection of the PT-DTW method for combining NDVI, NDPI, and NDGI time series data as an alternative identiﬁcation scheme for winter wheat in Minhe.


Introduction
Wheat is a widely grown cereal worldwide, and its caryopsis is a staple for humankind [1][2][3]. Accordingly, any changes in its distribution and yield are of great significance to anthropogenic sustainability and survival; however, due to climate change, extreme weather, and socioeconomic development, the uncertainty surrounding crop planting distribution and yield is increasing, forcing regional food security issues to the front processing, helping to reduce local data storage and computing pressures [26,27]. For example, Xiao et al. [28] evaluated the application potential of Sentinel-1/2 data in rice mapping over the Hangjiahu Plain in China based on the GEE, whereas He et al. [29] extracted winter wheat and winter rape from 119 landscape images by using the GEE. Yang et al. [30] used the GEE to monitor winter wheat in Shandong, China according to Sentinel-1/2 data. Accordingly, remote sensing data processing and crop information extraction can be carried out through relatively simple programming techniques, currently providing one of the most promising remote sensing technology applications available [29].
Indeed, remote sensing extraction of winter wheat planting information has a strong foundation in both methodologies and supporting platforms, as numerous references exist for dynamic monitoring by governments and industries; however, the above discussions and research on monitoring methods and platforms are mainly focused on conventional planting areas or other major grain-producing regions, with low altitudes and relatively mild climates [11,12,25]. Accordingly, in the high-altitude and cold border areas, due to the small scale of production, the contribution of grain output to the global output can be ignored, and few research activities have focused on the grain production, including extraction and monitoring of winter wheat information based on remote sensing technologies. Actually, the agricultural activities in the high-altitude and cold border areas play a major role in regional grain self-sufficiency, detection of climate change, and research on sustainable socioeconomic and ecological development.
The northeastern margin of the Qinghai-Tibet Plateau (QTP) is one of the main agricultural areas in the QTP and primarily refers to the Hehuang Valley, as a whole, and even smaller components. The northeastern margin marks the transition zone from the northeastern QTP to the Loess Plateau. With an average altitude > 3000 m, the plateau continental climate characteristics are significant. Nevertheless, it serves as the northwestern border zone for winter wheat cultivation in China. Planting wheat across river valleys ensures grain self-sufficiency for the QTP population, of about 10 million. Accordingly, changes in winter wheat planting distribution in the region can serve as an indicator of regional climate and agricultural structure change. However, recent field investigations by agricultural administration departments and with farmers have revealed that the winter wheat planting area has been decreasing annually, with frequent occurrences of extreme weather/climate events, seed degradation, pest and disease, intensified human-wildlife conflict as a result of the Chinese government's ecological protection strategy, and low returns compared to other industries, in notable contrast to the predictions of potential expansion under a global warming background. Therefore, the main grain planting information presented here about winter wheat in the northeastern margin of QTP is vital for the prediction of future regional food security, climate change, socioeconomic growth, and environmental sustainability. However, the detailed planting distribution information of winter wheat in the region is not clear, and corresponding operational extraction and monitoring of methodologies are also lacking.
To address the above issue, remote sensing methods for extracting winter wheat in the northeastern margin of QTP are discussed. Specifically, this study selected Minhe (Qinghai, China) in the northeastern margin of QTP as the study area. Based on its socioeconomic development, data support and the cost-effective, widely applied time series vegetation index method was selected as a guideline, and the optimal vegetation index and corresponding threshold parameters were selected for extracting winter wheat distribution information. In detail, a winter wheat sample reference curve was constructed using Sentinel-2 A/B data of the GEE platform, and the identification effects of different vegetation indices combined with the PT-DTW algorithm were compared to obtain the optimal predictive indices and corresponding model (i.e., algorithm) parameters for distribution mapping. The findings of the present study could lay a foundation for the development of methods for the extraction of winter wheat planting information in the northeastern margin of QTP, and provide a technical reference for subsequent remote sensing monitoring efforts for other important agricultural products.

Study Area
Minhe is located in the Hehuang Valley, northeastern QTP, under the jurisdiction of Qinghai Province, China. It comprises a portion of the transition zone from the Loess Plateau to the QTP (Figure 1) and maintains a typical plateau continental climate. The terrain is high in the northwest, low in the southeast, and high mountains dominate the southwest. The maximum and minimum altitudes are 4220 m and 1650 m, respectively, with the average altitude being 2100-2500 m. Ravines and mountainous territory overlap, creating complex terrain. The fertile cultivated land area, suitable for planting wheat, corn, rapeseed, etc., accounts for approximately 24% of the total land area [31]. Yellow and Huangshui River Valleys are suitable for planting winter wheat due to better hydrothermal conditions in these areas [31]. Since 1958, winter wheat has been planted in the region, and large-scale promotion has been gradually realized; however, due to the influence of unstable natural conditions, such as climate change and extreme weather/climatic events, harvest yield fluctuates greatly [32]. Compared with growth in the main producing area of China, the winter wheat growth in the same phenological period in this region is significantly weaker, with relatively longer growth periods (sowing in mid-September, maturity in mid-to-late July of the following year) [12]. At present, regional winter wheat is mainly distributed in the Hehuang Valley, namely, the Yellow and Huangshui River Valleys. However, data from the government statistics department and interviews with farmers revealed that the proportion of winter wheat planting area to grain crops planting area has shrunk sharply in recent years in Minhe. Elsewhere, it has been shown that under the influence of global warming, the suitable planting area of winter wheat in Minhe has expanded [32]. Accordingly, it is necessary to conduct localized research on winter wheat planting information extraction in the region, which could facilitate an examination of change in agricultural structure under the influence of climate change, socioeconomic transition, and ecological protection.
itoring efforts for other important agricultural products.

Study Area
Minhe is located in the Hehuang Valley, northeastern QTP, under the jurisdiction Qinghai Province, China. It comprises a portion of the transition zone from the Loess Pla eau to the QTP ( Figure 1) and maintains a typical plateau continental climate. The terra is high in the northwest, low in the southeast, and high mountains dominate the sout west. The maximum and minimum altitudes are 4220 m and 1650 m, respectively, wi the average altitude being 2100-2500 m. Ravines and mountainous territory overlap, cr ating complex terrain. The fertile cultivated land area, suitable for planting wheat, cor rapeseed, etc., accounts for approximately 24% of the total land area [31]. Yellow an Huangshui River Valleys are suitable for planting winter wheat due to better hydrothe mal conditions in these areas [31]. Since 1958, winter wheat has been planted in the regio and large-scale promotion has been gradually realized; however, due to the influence unstable natural conditions, such as climate change and extreme weather/climatic even harvest yield fluctuates greatly [32]. Compared with growth in the main producing ar of China, the winter wheat growth in the same phenological period in this region is si nificantly weaker, with relatively longer growth periods (sowing in mid-September, m turity in mid-to-late July of the following year) [12]. At present, regional winter wheat mainly distributed in the Hehuang Valley, namely, the Yellow and Huangshui River Va leys. However, data from the government statistics department and interviews with farm ers revealed that the proportion of winter wheat planting area to grain crops planting ar has shrunk sharply in recent years in Minhe. Elsewhere, it has been shown that under t influence of global warming, the suitable planting area of winter wheat in Minhe has e panded [32]. Accordingly, it is necessary to conduct localized research on winter whe planting information extraction in the region, which could facilitate an examination change in agricultural structure under the influence of climate change, socioeconom transition, and ecological protection.

Data
The distribution of cultivated land in Minhe is scattered and discontinuous, under the influence of topographic and climatic factors, and the land plot area is small; therefore, when considering the spatiotemporal resolution of the image and cost-effectiveness of the application, Sentinel 2A/B satellite data (Table 1) were selected for the analysis here, owing to its relatively high resolution and 5-day revisit period, which are conducive to the synthesis of crop growth curves with dense time series. Because GEE has built-in Sentinel 2 datasets that have undergone partial pre-processing steps, it is readily available for extractions of bottom-of-atmosphere corrected reflectance information. Accordingly, this study obtained all 678 pre-processed Sentinel 2A/B data scenes across Minhe from 25 August 2019 to 30 July 2020.  20 20 Administrative and cultivated land spatial distribution data were also used in this study, with the data on administrative divisions coming from the 1:1,000,000 national basic geographic database [33], and cultivated land data were obtained from within the research group of the authors.

Technical Scheme
Based on Sentinel 2A/B data, a comparative study on winter wheat identification methods was conducted here using the PT-DTW method combined with the vegetation indices of time series in GEE. The conceptual diagram of the processes used is shown in Figure 2 and includes three primary components: (1) constructing high-quality time series vegetation indices, (2) expanding the winter wheat sample set and setting an extraction threshold, and (3) evaluating the precision of winter wheat identification from different vegetation indices. The details are as follows.

PEER REVIEW 6 of 19
expand upon the reference curve dataset (4500 presence and 4500 absence samples) using 30 presence and 10 absence samples. Taking cultivated land as the reference, a threshold T value was set between 0 and 0.3 for each index and was tested according to the sample dataset combined with PT-DTW. Finally, the receiver operating characteristic (ROC) curve was created according to the experimental results, and another 30 presence and 30 absence samples.

Accuracy Assessment via ROC and Optimal T
The optimal threshold value for each vegetation index was selected by analyzing ROC. The optimal T for each vegetation index was applied to identify winter wheat and the mapping effects of each vegetation index were assessed.

Vegetation Indices
The vegetation indices composed of the spectral reflectance of vegetation can help control the influence of signal attenuation when electromagnetic waves pass through the

Time Series Construction
Initially, based on the 678 obtained atmosphere-corrected images, the NDVI, normalized differential phenology index (NDPI), and normalized difference greenness index (NDGI) were calculated. Subsequently, the time series of images were mosaicked using the quality mosaic algorithm every 10 days and clipped based on the administrative division of Minhe. Lastly, the vegetation indices were combined into a time sequence, and a Savitzky-Golay (SG) filter was used to obtain a high-quality data time series for each pixel.

Expanding Samples and Extraction
According to the characteristics of winter wheat in the imagery, an SG filtered time series of vegetation indices and sample points of presence/absence were selected based on the distribution of cultivated land. Next, a median value of 30 winter wheat samples was used to construct the reference curve. A linear spectral mixed model (LSM) was used to expand upon the reference curve dataset (4500 presence and 4500 absence samples) using 30 presence and 10 absence samples. Taking cultivated land as the reference, a threshold T value was set between 0 and 0.3 for each index and was tested according to the sample dataset combined with PT-DTW. Finally, the receiver operating characteristic (ROC) curve was created according to the experimental results, and another 30 presence and 30 absence samples.

Accuracy Assessment via ROC and Optimal T
The optimal threshold value for each vegetation index was selected by analyzing ROC. The optimal T for each vegetation index was applied to identify winter wheat and the mapping effects of each vegetation index were assessed.

Vegetation Indices
The vegetation indices composed of the spectral reflectance of vegetation can help control the influence of signal attenuation when electromagnetic waves pass through the atmosphere [34]. Accordingly, they are widely used to evaluate vegetation cover and growth status as well as improve the precision of remote sensing monitoring of vegetation (e.g., crops) growth or yield estimates [35][36][37]. NDVI, NDGI and NDPI are common vegetation indices used to characterize vegetation distribution. NDVI (Equation (1)) is the most widely used vegetative evaluation index and is suitable for dynamic monitoring in early and middle growth stages [34,38]. Furthermore, NDVI has been extensively studied for its extraction potential of winter wheat [17,39,40]. However, NDVI is affected by snow cover and soil background. NDGI and NDPI are two vegetation cover indices that eliminate the influence of these factors, and can better reflect the greenness characteristics of vegetation in the case of snow cover. To be specific, NDPI was designed to enhance the comparison between vegetation and background (i.e., it is soil and snow insensitive), while minimizing the differences between backgrounds (Equation (2)) [41]; thus, it often maintains a higher accuracy for identifying wheat growth information than NDVI and has an enhanced vegetation index (EVI) [12,42]. NDGI is an analysis index based on the linear spectral mixture model proposed by Yang et al. [43] (Equation (3)). It essentially removes the influence of snow on the vegetation index and has shown good response effects on sparsely vegetated or cropland areas. There are many cases in which the two indices have been verified using small samples of phenological observation sites and carbon flux sites, and there are a few cases in which the accuracy of the indices has been evaluated on a large spatial scale, but this also demonstrates their application potential.
where B g , B r , B nir , and B swir correspond to the green, red, near-infrared, and mid-infrared bands, respectively, equating to B3, B4, B8, and B11 of Sentinel 2 data, respectively. As described in Section 2.1, the topography of Minhe is complex, with land cover seasonally dominated by snow. Winter wheat cultivation is discontinuous and sparse in winter. No study has proposed a vegetation index suitable for winter wheat extraction in this region. Accordingly, NDVI, NDPI, and NDGI were selected for the identification of winter wheat to identify the optimal index and corresponding threshold parameters.

Mosaic and Cloud Removal Algorithm
Although Sentinel-2A/B scene data are 290 km wide, the inconsistent coverage range of each image required the mosaicking of multiple scenes for complete study area coverage. In addition, cloud masks for affected images are necessary across all remote sensing applications. Using the quality mosaic function of GEE, it was found that when NDVI was designated as the quality band, image mosaic and cloud removal effects were more precise compared to when NDPI or NDGI were employed. Thus, mosaicking was conducted accordingly, using the quality mosaic function of GEE, cloud removal was carried out through the mosaic cloud removal method directly, and a phase mosaic, cloud free image was synthesized with a step length of 10 days.

Savitzky-Golay Filter
Owing to inherent uncertainties in satellite image acquisition, remote sensing data, including the created time series here, maintain certain levels of temporal noise. Filtering time series data is, thus, a necessary process for constructing reliable crop growth curves based on vegetation indices [44]. The present study directly selected Savitzky-Golay (SG) filtering as the smoothing method for NDVI, NDPI, and NDGI time series, based on similar successful applications for vegetation time series [45][46][47] and winter wheat identification [12,48,49].
In essence, SG filtering relies on the least square fitting of local high order polynomials to obtain weights, and then calculates weighted averages in order to smooth curves. Assuming that there are 2n + 1 terms, T times of fitting, the calculation formula of any SG filtered value is as follows: where Y j is the result after SG filtering. When the given higher order polynomial is determined, the weights of the terms are also determined. When the polynomial is a three-order and five-term fit, the calculation formula of the intermediate term can be simplified to: where V 0 is the result of SG filtering of the middle term, Y −2 and Y −1 are the former 2 and 1 terms of the middle term, respectively, Y 1 and Y 2 are the after 1 and 2 terms of the middle term, respectively.

PT-DTW Algorithm
PT-DTW is an improved algorithm for winter wheat recognition based on the dynamic time warping (DTW) algorithm [12]. DTW is a widely used method for measuring the similarity between two time series of different lengths [50][51][52], although it is notably limited by excessive distortion. To address this issue, Dong et al. [12] added a time-distance penalty (M) [12,53] for calculating the DTW similarity matrix (d) to enhance the phenological weight of winter wheat and, thus, highlight its dual growth peak characteristics ( Figure 3). Thus, the PT-DTW method was implemented for winter wheat extraction in the North China Plain by combining the NDPI time series. The results indicated good verification accuracy [12], and the relatively small training sample size requirements led to the selection of the PT-DTW method for combining NDVI, NDPI, and NDGI time series data as an alternative identification scheme for winter wheat in Minhe.
The minimum cost path from d1,1 to dm,m is solved in the mat the points value on the path multiplied by the weight Wi is the P R. The formula for calculating this value is as follows:

Accuracy Evaluation
ROC is a graphic method for representing the accuracy of bi els [54], creating a two-dimensional graph of the relationship be rate (TPR) (Equation (4)), and the false positive rate (FPR; Equati For the specific calculation process of PT-DTW, it is assumed that there are two curves, Q = (q 1 , q 2 , q 3 , . . . , q m ), R = (r 1 , r 2 , r 3 , . . . , r m ), with the same number of nodes in the event sequence. Q and R are the curves of target event sequence and reference time series, respectively. For each element in the matrix (6) The minimum cost path from d 1,1 to d m,m is solved in the matrix, and then the sum of the points value on the path multiplied by the weight W i is the PT-DTW values of Q and R. The formula for calculating this value is as follows:

Accuracy Evaluation
ROC is a graphic method for representing the accuracy of binary classification models [54], creating a two-dimensional graph of the relationship between the true positive rate (TPR) (Equation (4)), and the false positive rate (FPR; Equation (5)). As early as 1989, Spackman [55] examined the value of utilizing the relatively simple ROC curves for evaluating and comparing classification algorithms. In an ROC diagram, each point on the curve represents the corresponding classification result of a group of parameters, where the closer the point is to the upper left corner, the more accurate the results, (i.e., high TPR and low FPR). Points under different thresholds are connected to form the ROC curves. Notably, if the ROC curve representing one vegetation index is above the curves of other indices, it indicates the superiority of this index within the given application. A binary classification model (i.e., presence, absence) was employed in the present study, and the ROC curve could thus be used to evaluate the performance of different vegetation indices for winter wheat identification and classification results. The formulas for each evaluation index of ROC are presented in Equations (8)-(10): where TP is the number of correct classifications of positive samples, FN is the number of incorrect classifications of positive samples, FP is the number of incorrect classifications of negative samples, TN is the number of correct classifications of negative samples, and ACC is the accuracy rate.

Time Series of Vegetation Indices
According to the conceptual design, the construction of the time series of vegetation indices for winter wheat in Minhe included three main processes: (1) The vegetation indices were first calculated for all images. In GEE, the image expression interface was used to program and carry out the vegetation index algorithms according to Equations (1)-(3) by calling the 678 atmospheric corrected images covering Minhe from the Sentinel 2A/B dataset; afterwards, NDVI, NDPI, and NDGI were obtained. (2) Using the quality mosaic function within GEE, a 10-day time step, and NDVI was designated as the quality band, the multi-temporal images were mosaicked and cloud cover was removed, resulting in a total of 34 time series images. The corresponding bands of the three vegetation indices were then extracted from the mosaics and combined into three sets of time series vegetation index images for constructing a high-quality index curve.

Selection of Reference Curve and Threshold T
Parameter setting of the PT-DTW algorithm is the key to extracting winter whe information via the fusion of the vegetation indices time series. Parameter setting involv three steps: (1) Obtaining the median reference curve of winter wheat and expanding t samples; (2) Setting the key phenology weight of winter wheat. The research results Dong et al. [12] showed that winter wheat recognition effects were superior when equa weight values were assigned to data of the key phenological period, that is, the sum weights of the key phenological period was 1.0, and data of other phenological perio were not involved in the calculation, as was employed here; (3) Selecting the extractio threshold T for winter wheat.

Sample Selection
Winter wheat sample selection is critical to constructing the reference curves of ve etation indices time series. Here, in the field investigation, winter wheat samples were n obtained during growing season of 2019 to 2020; thus, the manual interpretation of a

Selection of Reference Curve and Threshold T
Parameter setting of the PT-DTW algorithm is the key to extracting winter wheat information via the fusion of the vegetation indices time series. Parameter setting involves three steps: (1) Obtaining the median reference curve of winter wheat and expanding the samples; (2) Setting the key phenology weight of winter wheat. The research results of Dong et al. [12] showed that winter wheat recognition effects were superior when equalweight values were assigned to data of the key phenological period, that is, the sum of weights of the key phenological period was 1.0, and data of other phenological periods were not involved in the calculation, as was employed here; (3) Selecting the extraction threshold T for winter wheat.

Sample Selection
Winter wheat sample selection is critical to constructing the reference curves of vegetation indices time series. Here, in the field investigation, winter wheat samples were not obtained during growing season of 2019 to 2020; thus, the manual interpretation of archived remote sensing data was conducted. Cultivated land data were taken as the reference, and the sample points of winter wheat were obtained according to its bright red standard false color spectral characteristics when it overwintered and typical bimodal time series of vegetation indices during growth. The potential distribution range of winter wheat was estimated according to the changing characteristics of accumulated temperature with altitude in the study area and the availability of natural water systems or other sources capable of ensuring irrigation [31]. Lastly, in the Huangshui and Yellow River Valleys, 60 winter wheat sample points and 40 non-winter wheat sample points were selected ( Figure 5, Table A1); among them, 30 presence and 10 absence samples were randomly selected to construct the reference curves of vegetation indices and expand the experimental samples, whereas all remaining samples were used for verification.

Vegetation Indices Reference Curves
The median of the sampled curves was selected as a reference of winter wheat time series that are neither biased towards the winter wheat growing well nor growing poorly. Accordingly, the median of each vegetation index from 30 randomly selected winter wheat samples was taken as the corresponding time series reference (i.e., standard curve; Figure 6).

Vegetation Indices Reference Curves
The median of the sampled curves was selected as a reference of winter wheat time series that are neither biased towards the winter wheat growing well nor growing poorly. Accordingly, the median of each vegetation index from 30 randomly selected winter wheat samples was taken as the corresponding time series reference (i.e., standard curve; Figure 6).

Selection of Threshold T
In essence, the PT-DTW value is used to measure the similarity between a specific time series curve and the reference curve for winter wheat, whereas a smaller value indicates greater similarity. The T value accuracy directly determines the accuracy of extraction as it represents the threshold between winter wheat and non-winter wheat. In principle, threshold T values should be obtained based on sample data, but an insufficient sample size can lead to incongruencies between the determined and optimal thresholds. Accordingly, the linear spectral mixing (LSM) [12] model can develop a large number of sample datasets using a relatively small set of sample data. Therefore, the LSM model used in this study generated mixed vegetation index curves from the median reference curve, and any two others from 30 randomly selected winter wheat samples and 10 randomly selected non-winter wheat samples. A total of 4500 positive and 4500 negative samples were generated (those with winter wheat composition > 50% were considered positive samples). Then, the minimum PT-DTW values of each sample and the 30 winter wheat samples were calculated. In the analysis, the T value of the 30 winter wheat samples were <0.3; therefore, a corresponding range of T value was set from 0.0 to 0.3, with a change in step size of 0.001. Lastly, based on the extended sample dataset and cultivated land, winter wheat across Minhe was extracted based on different threshold T values and time series of vegetation indices.

Vegetation Indices Reference Curves
The median of the sampled curves was selected as a reference of winter series that are neither biased towards the winter wheat growing well nor grow Accordingly, the median of each vegetation index from 30 randomly sele wheat samples was taken as the corresponding time series reference (i.e., stan Figure 6).

Selection of Threshold T
In essence, the PT-DTW value is used to measure the similarity betwee time series curve and the reference curve for winter wheat, whereas a smaller cates greater similarity. The T value accuracy directly determines the accurac  Figure 7 shows the ROC plot based on the results of the above process and Equations (4) and (5). The figure indicates that all three vegetation indices have the potential to accurately identify winter wheat. For FPR ≤ 5%, the TPRs of the three indices were nearly equal, indicative of their similar ability within the low FPR range. When the FPR > 5%, the TPR of NDVI was slightly lower than that of NDPI or NDGI, indicating that the latter two vegetation indices, which successfully minimized the noise associated with background soil and snow cover, is superior to the former during the identification of winter wheat.
To determine the optimal threshold T value, the distance between each point on the ROC plot and the coordinate point (0, 1) was calculated, and the threshold represented by the point with the smallest distance was considered optimal for each of the vegetation indices ( Table 2). The accuracy was verified based on the remaining 30 winter wheat presence and absence samples. From the ROC curve, the FPRs of the three indices were all zero while maintaining a high TPR (>75%), indicating that the selection result of the optimal threshold value was reasonable. The ACC for NDPI was the best, followed by NDGI and NDVI. As smaller optimal threshold values indicate a more accurate identification potential for winter wheat, NDPI (T = 0.050) was identified as the top predictor of winter wheat presence in the analyses here.
Partial distribution results of winter wheat in Minhe were then extracted using the optimal threshold values in Table 2 ( Figure 8). Overall, the spatial distributions of winter wheat based on the three vegetation indices were good; however, notable differences emerged at higher levels of detail. In the discontinuous growing area (Figure 8a), the corresponding winter wheat distribution maps based on the three indices captured this pattern (Figure 8b-d), whereas in the more continuous areas (Figure 8e), the results based on NDVI remained irregular (Figure 8f) and those based on NDPI and NDGI were relatively contiguous (Figure 8g,h). Notably, the resulting boundary range based on NDGI was largest, followed by NDVI. Accordingly, NDPI was also optimal according to the extracted spatial distribution results. curately identify winter wheat. For FPR ≤ 5%, the TPRs of the three indices equal, indicative of their similar ability within the low FPR range. When the F TPR of NDVI was slightly lower than that of NDPI or NDGI, indicating that t vegetation indices, which successfully minimized the noise associated with soil and snow cover, is superior to the former during the identification of wi To determine the optimal threshold T value, the distance between each ROC plot and the coordinate point (0, 1) was calculated, and the threshold re the point with the smallest distance was considered optimal for each of th indices ( Table 2). The accuracy was verified based on the remaining 30 winte ence and absence samples. From the ROC curve, the FPRs of the three ind zero while maintaining a high TPR (>75%), indicating that the selection resu mal threshold value was reasonable. The ACC for NDPI was the best, follow and NDVI. As smaller optimal threshold values indicate a more accurate i potential for winter wheat, NDPI (T = 0.050) was identified as the top predic wheat presence in the analyses here.   Partial distribution results of winter wheat in Minhe were then extracted using the optimal threshold values in Table 2 ( Figure 8). Overall, the spatial distributions of winter wheat based on the three vegetation indices were good; however, notable differences emerged at higher levels of detail. In the discontinuous growing area (Figure 8a), the corresponding winter wheat distribution maps based on the three indices captured this pattern (Figure 8b-d), whereas in the more continuous areas (Figure 8e), the results based on NDVI remained irregular (Figure 8f) and those based on NDPI and NDGI were relatively contiguous (Figure 8g,h). Notably, the resulting boundary range based on NDGI was largest, followed by NDVI. Accordingly, NDPI was also optimal according to the extracted spatial distribution results.

Discussion and Conclusions
To optimize the method of extraction of winter wheat based on the time series of vegetation indices and corresponding threshold parameters for the Hehuang Valley agricultural area in QTP, Minhe (Qinghai, China) was selected as the research area. With the

Discussion and Conclusions
To optimize the method of extraction of winter wheat based on the time series of vegetation indices and corresponding threshold parameters for the Hehuang Valley agricultural area in QTP, Minhe (Qinghai, China) was selected as the research area. With the support of the GEE platform and Sentinel 2A/B data, atmosphere-corrected image data were used to calculate the NDVI, NDPI, and NDGI of winter wheat in the 2019-2020 growing season, and mosaicking was performed based on the mosaic cloud removal method. A high-quality time series of the vegetation indices was obtained by SG filtering after ordering chronologically. Lastly, after selecting points of true winter wheat presence through visual interpretation, extension of samples by the LSM algorithm, and based on the PT-DTW algorithm, analytical optimization of the winter wheat extraction method was carried out using ROC curves.
In the above processes, the calculation of each vegetation index using the atmospherically corrected data helped reduce the atmospheric effects and more accurately reflected the vegetative growth condition on the ground.
When mosaicking and masking cloud cover, the quality mosaic algorithm provided by GEE was adopted to retain the pixel value with the highest quality among all mosaic data. A comparison with the results of another cloud removal algorithm specifically for Sentinel 2A/B data provided by GEE ( Figure 9) showed that although the latter can achieve reasonable cloud removal effects, it simply excludes cloud and shadow pixels, resulting in missing pixels during multi-phase image mosaics when all images involved in mosaicking are missing pixels at the same location (Figure 9b,e). The former technique (adopted in the present research) was applied due to its advantage in the form of the ability to maximize retention of high-quality pixels (Figure 9c,f) while removing cloud cover as much as possible for subsequent processing; however, this method could produce false high-quality pixels due to cloud cover over a long time series, increasing the resulting error. Therefore, mosaicking and cloud removal fusion were carried out in this method to reduce the error transmission caused by the two processes being carried out separately to some extent, and also ensured the integrity of the result data. In addition, the study area was located in middle/high latitudes and in the temperate continental monsoon climate zone, with sunnier days with less cloud cover. The probability of the inability to synthesize an optical fusion image in the short time interval due to cloud cover is small. In order to obtain a good phenological curve and reduce the noise of time series, fusion images were constructed with a time-interval of 10 days in this study. However, in areas with more cloudy and rainy weather, such as the Mediterranean climate region, optical remote sensing images in winter could be highly affected by weather conditions and cloud presence and, thus, it may be difficult to obtain short-time-interval fusion images. Therefore, determination of the maximum acceptable time-interval and how the accuracy changes by increasing/decreasing this time interval requires further study.
In terms of noise removal of the vegetation index time series curve, there is a general consensus that SG filtering is more capable of weakening time series noise compared to asymmetric Gaussian filtering, double-logic filtering, or Fourier filtering [45]. In the present study, the SG filter used five time period units to smoothen the data, improving stability and reliability of the time series of the vegetation indices. Therefore, the data and preprocess methods in this study are somewhat reliable.
In terms of setting the phenological option weight, this study referred to Dong et al. [12] and only selected data of characteristic phenological periods to set the weight and participate in the calculation. Compared with using all phenology data in the calculation, the amount of data and computation were greatly reduced, and the complex weight determination process was also reduced. For sample handling, sample points for winter wheat were selected following visual interpretation, and the reference curve constructed from the sample median ensured that it would neither favor winter wheat with good growth nor with poor growth. Accordingly, referring to previous studies [12,56], the LSM model was used to expand the sample dataset as well as to ensure that the generated positive sample data were always near the median reference curve and were roughly consistent with actual sample data. The identification accuracy of winter wheat based on these extended samples was also consistent with those derived from previous studies [12,56]; however, the experimental results of the sample dataset expanded via the LSM model were used in threshold selection, which may be notably different from those based on a field-derived samples. Therefore, it is necessary to make up for the absence of field investigation samples in subsequent analyses.
sensing images in winter could be highly affected by weather conditions ence and, thus, it may be difficult to obtain short-time-interval fusion im determination of the maximum acceptable time-interval and how the a by increasing/decreasing this time interval requires further study.
In terms of noise removal of the vegetation index time series curve, t consensus that SG filtering is more capable of weakening time series no asymmetric Gaussian filtering, double-logic filtering, or Fourier filtering sent study, the SG filter used five time period units to smoothen the data bility and reliability of the time series of the vegetation indices. Therefo preprocess methods in this study are somewhat reliable.  The winter wheat phenological weight used in the present study was directly referenced from research on the main winter wheat producing areas in central and eastern China [12]. The terrain of the present research area is complex, with discontinuous winter wheat planting, whereas the central and eastern planting regions of China are relatively flat, with contiguous winter wheat planting. Therefore, subsequent research should compare the characteristics and growth differences of winter wheat between the study area and the main production areas in the central and eastern regions of China, optimizing models for local phenological weight to improve the identification accuracy.
The T value directly determines the accuracy of the extraction result. For the identification and extraction of winter wheat, different vegetation indices correspond to different optimal thresholds. Compared with existing studies [12,56], there are slight differences in the threshold of the NDPI index between the plain area and the northeastern part of the QTP. In the results of this study, as the threshold increases, TPR increases at an obvious speed. However, when TPR continues to rise to a certain extent, FPR increases also become obvious, and each adjustment of the threshold is accompanied by great changes in the results. Therefore, the determination of the threshold is an important process of the technical scheme, and the determination of the localization threshold is particularly important in the generalization.
In summary, based on ROC and the spatial distribution results of extracted winter wheat, all three vegetation index time series, combined with the PT-DTW method examined, had the potential to extract spatial information on winter wheat. Under the optimal thresh-old, the accuracy of winter wheat extraction according to the three-index identification was >88%. The classification effects of NDVI were notably poorer than those of NDPI and NDGI, indicating that the indices that have minimized the background effects of soil and snow are more suitable for winter wheat identification. Although NDPI did not perform any better than NDGI in terms of the reliability of extracted winter wheat lands, it is advantageous, owing to its improved extraction accuracy and yielded boundary range. This showed that the vegetation indices that reduced the influence of soil and snow background at the same time were slightly better than those that reduced the influence of snow background alone, in terms of the overall accuracy of winter wheat extraction. Accordingly, NDPI and its corresponding optimal parameter (T = 0.05) performed optimally. However, as mentioned above, there are uncertainties in this study, such as image fusion time interval, sample processing, phenological period selection and weight determination, and threshold determination. For the extraction of winter wheat, if the study area changes, the results of corresponding parameters in this study may also change. The localization parameter determination based on this scheme is necessary in the generalization, although these parameters may vary very little from region to region. Of course, if it is to extract other crops, the specific extraction scheme should be designed separately, given the spectral and phenological differences between them and winter wheat.
Based on the above result, the planting area of winter wheat in Minhe during the 2019-2020 growing season, extracted in the present study, was 431.79 ha. According to recent field investigations by agricultural administration departments and interviews with farmers in the present study, farmer willingness to plant wheat has decreased. In addition, winter wheat planting area exhibits a decreasing trend. Based on the experiences of the interviewees, the winter wheat planting area extracted in this study is potentially similar to the actual planting area. The actual planting area of winter wheat in the 2019-2020 growing season was not obtained in this study; therefore, subsequent studies should consider verifying the results obtained in the present study.
Author Contributions: Conceptualization, F.H., X.X. and X.Z.; methodology, F.H. and X.X.; software, F.H.; formal analysis, F.H. and X.X.; investigation, F.H., X.X. and Q.C.; validation, X.X. and S.L.; resources, Y.P. and Q.C.; data curation, F.H. and X.X.; writing-original draft preparation, F.H. and X.X.; writing-review and editing, X.X., Y.H. and X.Z.; visualization, X.X.; supervision, Y.P. and X.Z.; project administration, Y.P.; funding acquisition Y.P. and Q.C. All authors have read and agreed to the published version of the manuscript. Data Availability Statement: Some data, models, or the code generated or used during the study are available in a repository or online. Data sources and models used during the study appear in the text; the URL of the code is https://github.com/huangfujue/winter-wheat-extraction-methodsbased-on-different-time-series-of-vegetation-indices (accessed on 11 January 2021).