Early-Season

Abstract: Timely and accurate crop identification and mapping are of great significance for crop yield estimation, disaster warning, and food security. Early-season crop identification places higher demands on the quality and mining of time-series information than post-season mapping. In recent years, great strides have been made in the development of deep-learning algorithms, and the emergence of Sentinel-2 data with a higher temporal resolution has provided new opportunities for early-season crop identification. In this study, we aimed to fully exploit the potential of deep-learning algorithms and time-series Sentinel-2 data for early-season crop identification and mapping. Four classifiers, i.e., two deep-learning algorithms (one-dimensional convolutional networks and long short-term memory networks) and two shallow machine-learning algorithms (a random forest algorithm and a support vector machine), were trained using early-season Sentinel-2 images and field samples collected in 2019. Then, these algorithms were applied to images and field samples for 2020 in the Shiyang River Basin. Twelve scenarios with different classifiers and time intervals were compared to determine the optimal combination for the earliest crop identification. The results show that: (1) the two deep-learning algorithms outperformed the two shallow machine-learning algorithms in early-season crop identification; (2) the combination of a one-dimensional convolutional network and 5-day interval time-series Sentinel-2 data outperformed the other schemes in obtaining the early-season crop identification time and achieving early mapping; and (3) the early-season crop mapping time in the Shiyang River Basin was identified as the end of July, with an overall classification accuracy of 0.83. In addition, the early identification time for each crop was as follows: wheat at the flowering stage (mid-late June); alfalfa at the first harvest (mid-late June); corn at the early tassel stage (mid-July); fennel and sunflower at the flowering stage (late July); and melons at the fruiting stage (around late July). This study demonstrates the potential of using Sentinel-2 time-series data and deep-learning algorithms to achieve early-season crop identification, and this method is expected to provide new solutions and ideas for early-season crop monitoring.


Introduction
Accurate, timely, and repeatable crop mapping is essential for food security [1]. Early or near-real-time information on crop distribution can support food security analysis and early warning of famine [2]. Early crop distribution information can also be used by agricultural insurers to assess disaster losses and compensate farmers [3]. In addition, the early identification of crops can help guide agricultural water and fertilizer management and crop transport coordination [4].
Satellite remote sensing is a highly effective technique for extracting spatial distribution information and monitoring crop conditions due to its relatively low labor costs compared with the traditional method of ground investigation, and ability to provide
(1) Are the classification performances of deep-learning algorithms in early-season crop identification better than those of shallow machine-learning algorithms? (2) What is the smallest temporal interval of the image series required for accurate early-season crop identification (i.e., 5, 10, or 15 days)? (3) What is the earliest identification time of the major crops in the Shiyang River Basin?

Overview of the Study Area
The Shiyang River Basin is located in central Gansu Province, north of the Qilian Mountains (37.2-39.5° N, 101.1-103.2° E), in the eastern part of the arid and semi-arid region of northwestern China, and it has a total area of about 40,300 km² (Figure 1). The Shiyang River Basin has a typical continental temperate arid climate. The evapotranspiration in the study area exhibits large spatial and temporal heterogeneity. In the southern mountainous areas, the annual precipitation is 300-600 mm and the annual potential evapotranspiration is 700-1200 mm; while in the downstream areas, the annual precipitation is less than 150 mm and the annual potential evapotranspiration is greater than 2000 mm.
The farmland in the Shiyang River Basin covers about 10% of the total area, and it is mainly located in the middle and lower reaches of the Shiyang River. The local cropland relies heavily on irrigation. The main types of crops in the Shiyang River Basin are wheat, corn, sunflower, alfalfa, fennel, and melons. All of these crops, except for alfalfa, are grown in a single-season system from April to October due to drought and the cumulative temperatures. Wheat is the earliest sown and earliest harvested crop in the study area, and alfalfa is harvested three times between April and October. Corn, fennel, and sunflower have essentially the same growing period. They are sown in late April and harvested in early September, with a longer growing period than wheat. The melons grown in the study area can be divided into early (sown in late April and harvested in late August) and late (sown in mid-May and harvested in late September) melons according to their different growing seasons.

Sentinel-2 Data Products
In this study, the Sentinel-2 L2A product (the bottom-of-atmosphere reflectance product) was used as the primary data source for the crop classification, and the L1C product (the top-of-atmosphere reflectance product) was selected as a secondary data source for cloud detection. Eleven tiles of Sentinel-2 images were required to achieve full coverage of the Shiyang River Basin. The Sentinel-2 L2A and L1C products for the 2019 and 2020 time series were downloaded from the official European Space Agency (ESA) data distribution website (https://scihub.copernicus.eu (accessed on 1 January 2021)). In total, 1606 L2A images and 1606 L1C images acquired from March to October in 2019 and 2020 were compiled and used for early-season crop identification.

Ground Reference Data
In May and August of 2019, a handheld global positioning system (GPS) instrument was used to record the center coordinates of the sampled cropland plots together with their crop types, and the ground sample sets were generated by extracting the corresponding pixels from the Sentinel-2 images. We collected data from 268 crop plots, the boundaries of which were identified using high-spatial-resolution Google Earth images. A total of 16,880 Sentinel-2 pixels within the 268 plots were extracted as the ground sample dataset for 2019. In June 2020, unmanned aerial vehicle (UAV) observations were combined with Google Earth images to obtain the ground truth samples. A total of 654 crop plots and 27,843 Sentinel-2 pixels were extracted as the crop ground samples for 2020. The crop ground sample sets for 2019 and 2020 were used as the training and testing sets for the classification models, respectively. The number of samples for each crop is shown in Table 1, and the locations of the crop samples are shown in Figure 1.
Time-series optical images are susceptible to cloud contamination. To remove the contaminated pixels, we applied the Fmask algorithm to detect clouds, cloud shadows, and snow/ice in each Sentinel-2 image [42]. Figure 2 shows the numbers of high-quality observations of the individual pixels from March to October in the cropland region of the Shiyang River Basin in 2019 and 2020. The total number of observations from March to October was 49 in both 2019 and 2020, and more than 90% of the pixels had more than 20 observations without cloud interference. Compared with the Landsat data, the high temporal resolution of Sentinel-2 resulted in a significant increase in the observation quality and frequency.

Feature Construction
Yi et al. [43] showed that four normalized difference vegetation indices, built upon the reflectance in the green band (G), the first red-edge band (RE1), the red band (R), the near-infrared band (NIR), and the first short-wave infrared band (SWIR1) of the Sentinel-2 satellite, are the most effective features for crop classification in the Shiyang River Basin. The four vegetation indices used in this study are expressed in Equations (1)-(4):
$$\mathrm{NDVI} = \frac{NIR - R}{NIR + R} \tag{1}$$
$$\mathrm{GNDVI} = \frac{NIR - G}{NIR + G} \tag{2}$$
$$\mathrm{NDVI45} = \frac{RE1 - R}{RE1 + R} \tag{3}$$
$$\mathrm{NDWI} = \frac{NIR - SWIR1}{NIR + SWIR1} \tag{4}$$
The normalized difference vegetation index (NDVI) is based on the principle that the reflectance of healthy plants is usually higher in the NIR band than in the visible band, and it is the most commonly used vegetation index. Given the saturation of the NDVI in areas with high vegetation cover, the green normalized difference vegetation index (GNDVI) was chosen to compensate for the drawbacks of the NDVI [44]. In addition, the RE1 band of the Sentinel-2 data, which is more sensitive to changes in vegetation chlorophyll, was also chosen to construct a vegetation index (NDVI45) [45]. The short-wave infrared band is sensitive to leaf moisture and soil moisture and is often used to estimate the vegetation canopy moisture thickness; therefore, we chose the normalized difference water index (NDWI) to reflect the changes in the moisture content [46].
Remote Sens. 2022, 14, x FOR PEER REVIEW 6 of 23
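As an illustration of how the four features could be derived for a single pixel, the following minimal sketch computes them from band reflectances. It assumes the standard normalized-difference forms for NDVI, GNDVI, NDVI45, and NDWI; the function names and the reflectance values are hypothetical and are not taken from the study.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    total = a + b
    if total == 0:
        return None
    return (a - b) / total


def vegetation_indices(green, red, red_edge1, nir, swir1):
    """Compute the four indices from the reflectances of one pixel.

    Assumption: each index uses the standard normalized-difference form.
    """
    return {
        "NDVI": normalized_difference(nir, red),
        "GNDVI": normalized_difference(nir, green),
        "NDVI45": normalized_difference(red_edge1, red),
        "NDWI": normalized_difference(nir, swir1),
    }


# Hypothetical reflectances for a dense green canopy (illustrative values only).
pixel = vegetation_indices(green=0.08, red=0.04, red_edge1=0.12, nir=0.44, swir1=0.20)
```

Applying the same function to every acquisition date yields one time series per index and pixel, which is the input expected by the classifiers.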

Data Interpolation and Smoothing
To obtain continuous time-series data with regular time intervals, we used the Savitzky-Golay filter algorithm to smooth and reconstruct the pixel values affected by cloud interference [47]. The Savitzky-Golay filter is based on local polynomial least-squares fitting in the time domain, which makes it more suitable for filtering data with a limited length. Two basic conditions need to be met for reconstruction based on the Savitzky-Golay filter. First, the satellite vegetation index is a valid proxy for vegetation growth conditions, and second, clouds and harsh atmospheric conditions usually reduce the vegetation index values. Therefore, sudden drops in the vegetation index that are not consistent with the gradual process of vegetation growth are treated as noise and are removed. Based on this, iteratively approaching the upper envelope of the vegetation index series with the Savitzky-Golay filter reconstructs a continuous vegetation index time series well. The hyperparameters of the Savitzky-Golay filter mainly include the length of the fitting window and the order of the polynomial fit. In this study, the window length and the polynomial order were set to 5 and 3, respectively. In addition, as the normalized difference water index (NDWI) varies under different dry and wet conditions, it does not satisfy the requirements of the Savitzky-Golay filter. Thus, we simply used linear interpolation to fill the gaps in the NDWI time series.
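The reconstruction described above can be sketched in plain Python as follows. This is a simplified illustration rather than the study's implementation: the fixed weights are the well-known 5-point Savitzky-Golay coefficients for a cubic fit, while the edge padding by repetition, the number of iterations, and all function names are assumptions made for the example.

```python
# 5-point Savitzky-Golay smoothing weights for a cubic local fit.
SG_WEIGHTS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol_5_3(series):
    """Smooth a series with window length 5 and polynomial order 3.

    Assumption: the edges are padded by repeating the first and last values.
    """
    padded = [series[0]] * 2 + list(series) + [series[-1]] * 2
    return [
        sum(w * padded[i + k] for k, w in enumerate(SG_WEIGHTS))
        for i in range(len(series))
    ]


def upper_envelope(series, iterations=3):
    """Iteratively lift sudden drops toward the upper envelope of the series."""
    current = list(series)
    for _ in range(iterations):
        fitted = savgol_5_3(current)
        # Keep the observation unless the fitted curve lies above it.
        current = [max(obs, fit) for obs, fit in zip(series, fitted)]
    return current


def fill_gaps_linear(values):
    """Fill None entries by linear interpolation between valid neighbours."""
    filled = list(values)
    valid = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(valid, valid[1:]):
        for i in range(left + 1, right):
            frac = (i - left) / (right - left)
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled


# Hypothetical NDVI series with a cloud-induced drop at index 3.
noisy_ndvi = [0.20, 0.30, 0.40, 0.10, 0.60, 0.70, 0.80]
restored_ndvi = upper_envelope(noisy_ndvi)

# Hypothetical NDWI series with two missing acquisitions.
filled_ndwi = fill_gaps_linear([0.1, None, None, 0.4])
```

Because each smoothed value depends on the two acquisitions before and after it, the value at a given date is only available two time steps later, which matters when reporting in-season identification dates.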

Deep Learning Models
In this study, we selected two deep-learning models to classify the crops based on the four time-series vegetation indices: the one-dimensional convolutional (Conv1D) network, which is a special type of convolutional network, and the long short-term memory (LSTM) network, which is a type of recurrent neural network (RNN). The Conv1D network and the LSTM represent two different but effective strategies for representing sequential data. The Conv1D network uses a one-dimensional convolution operator to capture the temporal patterns of the input sequence. Conv1D layers can be stacked so that the lower layers focus on local features and the higher layers summarize more general patterns. The LSTM units are designed to memorize values over arbitrary time intervals (long or short). The LSTM improves the efficiency of describing temporal patterns at different frequencies, which is an ideal feature for analyzing crop growth cycles of different lengths.
Figure 3 shows the architecture of the convolutional network used in this study. The network consisted of two ordinary convolutional layers with a kernel size of three and two inception structures, which were used to extract the high-level temporal features. Then, two fully connected layers and a softmax layer were used to output the classification probabilities and obtain the final crop type. The number of channels in the first convolutional module was 16, and the number of channels was gradually increased in each subsequent layer. The final fully connected layer contained six neurons and was used to output the probabilities of the six crops. The penultimate layer collected information from the previous layer in the form of a flattened array, the size of which was determined by the size of the initial input layer.
Each of the convolutional modules (light blue squares in Figure 3) contained, in turn, the convolution computation, batch normalization (BatchNorm), and the rectified linear unit (ReLU) activation function. The inception structure shown in Figure 3 is a classical parallel structure proposed by a Google research team in 2014. It passed the input in parallel through convolutions with kernel sizes of 1, 3, and 5 and through a max-pooling layer, after which the four outputs were concatenated along the channel dimension to obtain the final output. Dropout is a regularization technique that randomly discards a certain percentage of neurons to prevent overfitting of the model, and we set the activation probability of the neurons to 0.4.
The architecture of the LSTM network used in this study is shown in Figure 4. It contained three layers of LSTM cells for extracting the complex features from the time-series data and two fully connected layers with a softmax layer to output the probabilities of the various crops. To prevent overfitting, a dropout layer was added with the same parameter settings used for the dropout in the Conv1D network. Both the Conv1D and LSTM were constructed based on the PyTorch framework [48].
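To make the parallel inception structure described above concrete, the following framework-free sketch applies kernel sizes 1, 3, and 5 and a max-pooling branch to the same input and concatenates the results along the channel dimension. It is only a toy illustration: the study's networks were built in PyTorch, batch normalization and ReLU are omitted here, and the zero padding, the pooling size of 3, the kernel weights, and all names are assumptions.

```python
def conv1d_same(channels, kernels):
    """1-D convolution with zero padding so the output length equals the input.

    channels: list of input channels, each a list of floats of length T.
    kernels:  kernels[o][c] is the weight list linking input channel c
              to output channel o (odd kernel length).
    """
    length = len(channels[0])
    outputs = []
    for per_input in kernels:
        half = len(per_input[0]) // 2
        out = []
        for t in range(length):
            acc = 0.0
            for c, weights in enumerate(per_input):
                for k, w in enumerate(weights):
                    idx = t + k - half
                    if 0 <= idx < length:
                        acc += w * channels[c][idx]
            out.append(acc)
        outputs.append(out)
    return outputs


def maxpool1d_same(channels, size=3):
    """Max pooling with stride 1 that preserves the sequence length."""
    half = size // 2
    length = len(channels[0])
    return [
        [max(ch[max(0, t - half): min(length, t + half + 1)]) for t in range(length)]
        for ch in channels
    ]


def inception_block(channels, k1, k3, k5):
    """Run four parallel branches and concatenate them along the channel axis."""
    branches = [
        conv1d_same(channels, k1),
        conv1d_same(channels, k3),
        conv1d_same(channels, k5),
        maxpool1d_same(channels),
    ]
    return [ch for branch in branches for ch in branch]


# One hypothetical input channel and simple averaging kernels.
x = [[1.0, 2.0, 3.0, 4.0, 5.0]]
out = inception_block(
    x,
    k1=[[[1.0]]],
    k3=[[[1 / 3, 1 / 3, 1 / 3]]],
    k5=[[[0.2, 0.2, 0.2, 0.2, 0.2]]],
)
```

With one output channel per convolutional branch and one pooled channel, the block turns a single input channel into four channels of unchanged length, which is the property that lets such blocks be stacked.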
Both the Conv1D and LSTM used the cross-entropy loss function with L2 regularization as the optimization criterion, and the Adam optimizer was used for the parameter estimation. The learning rate was set to 0.0001, and the batch size was set to 64. The cross-entropy loss function is given in Equation (5), in which $N$ is the number of training samples, $i$ is the index of the samples, $K$ is the number of target categories ($K = 6$ in this study), $j$ is the index of the categories, $p^{\mathrm{true}}_{i,j}$ and $p^{\mathrm{pred}}_{i,j}$ represent the true and predicted probabilities that the $i$th sample belongs to category $j$, $\lambda$ represents the coefficient of the L2 regularization term, and $w$ represents all the parameters in the model that need to be learned.
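A minimal sketch of a cross-entropy loss with an L2 penalty, written with the symbols defined above, is given below. Averaging the data term over the N samples and using the penalty λ·Σw² without a factor of 1/2 are assumptions of this example and may differ from the exact form of Equation (5); the function name and the toy inputs are hypothetical.

```python
import math


def cross_entropy_l2(p_true, p_pred, weights, lam):
    """Cross-entropy averaged over samples plus an L2 penalty on the weights.

    p_true, p_pred: lists of N rows, each holding K class probabilities.
    weights:        flat list of all learnable parameters w.
    lam:            coefficient of the L2 regularization term.
    """
    n_samples = len(p_true)
    data_term = 0.0
    for true_row, pred_row in zip(p_true, p_pred):
        for t, p in zip(true_row, pred_row):
            if t > 0:  # terms with zero true probability contribute nothing
                data_term -= t * math.log(p)
    penalty = lam * sum(w * w for w in weights)
    return data_term / n_samples + penalty


# Two hypothetical samples with one-hot labels over two classes.
loss = cross_entropy_l2(
    p_true=[[1, 0], [0, 1]],
    p_pred=[[0.5, 0.5], [0.25, 0.75]],
    weights=[1.0, 2.0],
    lam=0.01,
)
```

In a framework such as PyTorch, the same effect is commonly obtained by combining a built-in cross-entropy criterion with a weight-decay setting in the optimizer.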

Shallow Machine Learning Models
In this study, a random forest (RF) model and a support vector machine (SVM) model were used as the baseline models for comparing the classification performances of the different models in different contexts. The scikit-learn package in Python was used to implement the RF and SVM models [49]. The RF algorithm is an ensemble classification algorithm based on a bagging strategy with decision-tree classifiers [50]. The SVM algorithm performs classification by finding separating hyperplanes and handles non-linear classification through widely used kernel functions [21,51]. RF and SVM algorithms have been widely used in remote sensing applications and have achieved great success in complex classification tasks [52-54]. In this study, a grid search strategy was used to select the appropriate hyperparameters for constructing the RF and SVM models. The details of the hyperparameter selection are presented in Table 2.

Experimental Design
Early-season crop identification requires that the model extracts the useful information from remote sensing time-series data with a limited length as early as possible. The process of early crop mapping or early-season crop identification time extraction used in this study is illustrated in Figure 5. Since the crop growing season in the Shiyang River Basin is from April to October, in this study, DOY103 (i.e., the 103rd day of the year) was set as the starting date of the growing season, and the vegetation index data for the subsequent dates were sequentially added to the classifier for the crop classification. The end date of the growing season was set to DOY303. DOY103 was in early April when the crops in the Shiyang River Basin had not yet been sown, and DOY303 was in late October, when the major crops in the basin had already been harvested. In this way, the classification accuracy increased with the length of the time series data over time, and at a particular point in time, the classification accuracy usually reached saturation and stopped increasing. Based on the change in the model's accuracy as the crop growing season progressed, we chose the time when the accuracy reached stability (classification accuracy greater than 90% of the post-season mapping accuracy level) as the time for the early crop mapping and early identification of each crop in the Shiyang River Basin. To distinguish the effects of the different temporal intervals on the early-season crop identification, three temporal intervals, i.e., 5, 10, and 15 days, were applied. In total, we applied 12 combinations of experimental settings for the three temporal intervals (5, 10, and 15 days) and four classifiers. The ground sample set for 2019 was used for the model training and the ground sample set for 2020 was used to validate the crop classification in 2020. In this study, the confusion matrix and F1 score were used to evaluate the accuracy of the classification results.
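The stability criterion described above (accuracy greater than 90% of the post-season level) can be expressed as a short helper. This is a sketch under stated assumptions: the function name is hypothetical and the accuracy curve below is illustrative rather than taken from the experiments.

```python
def earliest_stable_time(doys, accuracies, reference_accuracy, fraction=0.9):
    """Return the first DOY whose accuracy exceeds fraction * reference_accuracy.

    doys and accuracies are parallel lists ordered by time; None is returned
    when the threshold is never exceeded.
    """
    threshold = fraction * reference_accuracy
    for doy, accuracy in zip(doys, accuracies):
        if accuracy > threshold:
            return doy
    return None


# Hypothetical accuracy curve sampled every 10 days.
doys = [178, 188, 198, 208]
accuracies = [0.60, 0.75, 0.82, 0.84]
first_doy = earliest_stable_time(doys, accuracies, reference_accuracy=0.87)
```

With a post-season accuracy of 0.87, the threshold is 0.783, so the example returns the first date on which the curve rises above that value.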
It should be noted that for the Conv1D model, the length of the input temporal data affects the number of neurons in the fully connected layer. In the early-season crop identification experiments, the length of the temporal data changed constantly, thus changing the dimensionality of the input data, which increased the workload and complexity of the model tuning. Therefore, when using the Conv1D network for classification, we filled the time nodes that were not yet used with zeros so that the input length of the time-series data was fixed at 49. In this way, the input carried the same information as the corresponding short time series, while the model architecture remained the same for all experiments. For the other three classifiers, masking operations were not applied.
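The zero-masking step can be illustrated with a small helper that keeps the acquisitions available so far and fills the remaining positions with zeros up to the fixed length of 49. The function name and the example values are hypothetical; only the fixed length comes from the text.

```python
def mask_to_fixed_length(series, n_available, total_length=49):
    """Keep the first n_available time steps and zero-fill up to total_length."""
    if not 0 <= n_available <= total_length:
        raise ValueError("n_available must lie between 0 and total_length")
    visible = list(series[:n_available])
    return visible + [0.0] * (total_length - len(visible))


# Hypothetical index series of which only the first 3 acquisitions are usable.
full_season = [0.1 * step for step in range(49)]
masked = mask_to_fixed_length(full_season, n_available=3)
```

Each of the four vegetation index series would be masked in the same way before being stacked as input channels.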

Accuracy Assessment
In this study, 16,880 crop samples obtained in 2019 were selected as the training sample set and 27,843 crop samples obtained in 2020 were selected as the test sample set. For each classification, we measured the performance by calculating the confusion matrix and the F1 score for the test sample set. The overall accuracy (OA), user accuracy (UA), and producer accuracy (PA) are the three basic evaluation metrics in remote sensing classification. They depict the reliability of the classification results from different perspectives and are the most commonly used evaluation metrics for land cover classification at present. The OA is the probability that the classification label of a classified sample is the same as its actual category label. The UA is a conditional probability that describes the probability that a randomly selected sample from the classification results has a category label that agrees with the actual category label on the ground. The PA is also a conditional probability, and it describes the probability that a sample taken from the test data is assigned its reference category label in the classification results. For each category, the F1 score is the harmonic mean of the PA and UA, which balances the two. The F1 score is calculated as follows:
$$F1_{class} = \frac{2 \times pa_{class} \times ua_{class}}{pa_{class} + ua_{class}}$$
where $F1_{class}$ is the F1 score for a single class, $pa_{class}$ is the PA for a single class, and $ua_{class}$ is the UA for a single class. In this study, Card's correction was used to calibrate the PA considering the proportion bias of the ground sample sets and also to compute the 95% confidence interval in the accuracy evaluation of the thematic map [55].
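The metrics defined above follow directly from a confusion matrix, as the following sketch shows. It assumes that rows hold the reference classes and columns hold the predicted classes, and it does not include Card's correction; the function name and the example matrix are hypothetical.

```python
def accuracy_metrics(matrix):
    """Compute OA and per-class PA, UA, and F1 from a confusion matrix.

    Assumption: matrix[i][j] counts samples of reference class i that were
    predicted as class j.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(n_classes))
    per_class = []
    for k in range(n_classes):
        reference_total = sum(matrix[k])
        predicted_total = sum(matrix[i][k] for i in range(n_classes))
        pa = matrix[k][k] / reference_total if reference_total else 0.0
        ua = matrix[k][k] / predicted_total if predicted_total else 0.0
        f1 = 2 * pa * ua / (pa + ua) if pa + ua else 0.0
        per_class.append({"PA": pa, "UA": ua, "F1": f1})
    return {"OA": correct / total, "classes": per_class}


# Hypothetical two-class confusion matrix.
metrics = accuracy_metrics([[8, 2], [1, 9]])
```

For the first class in this example, 8 of 10 reference samples are recovered (PA = 0.8) and 8 of 9 predictions are correct (UA = 8/9), which gives an F1 score of 16/19.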
McNemar's test was used to evaluate the statistical significance of the different scenarios of classification described in Section 5.2 [56].
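For two classification scenarios evaluated on the same test samples, McNemar's test compares only the samples on which the two scenarios disagree. The sketch below uses the continuity-corrected chi-square form with one degree of freedom; whether the study applied the continuity correction is not stated, so that choice, the function name, and the toy labels are assumptions.

```python
import math


def mcnemar_test(reference, pred_a, pred_b):
    """Return the continuity-corrected McNemar statistic and its p-value."""
    # b: samples that A classifies correctly and B misclassifies; c: the reverse.
    b = sum(1 for r, a, o in zip(reference, pred_a, pred_b) if a == r and o != r)
    c = sum(1 for r, a, o in zip(reference, pred_a, pred_b) if a != r and o == r)
    if b + c == 0:
        return 0.0, 1.0
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical labels: A alone is right on 10 samples, B alone on 2, both on 8.
reference = [0] * 20
pred_a = [0] * 10 + [1] * 2 + [0] * 8
pred_b = [1] * 10 + [0] * 2 + [0] * 8
chi2, p_value = mcnemar_test(reference, pred_a, pred_b)
```

A p-value below the chosen significance level (for example 0.05) indicates that the two scenarios differ significantly in their error patterns.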
The vegetation index of the wheat started to increase on around DOY100, reached saturation on around DOY150, started to decrease on around DOY175, and reached a relatively small and stable value on DOY200. The vegetation index of the alfalfa exhibited fluctuating characteristics, reaching local maxima on around DOY150, DOY200, and DOY250, and reaching local minima on around DOY165, DOY220, and DOY275 due to multi-harvesting. The other four crops (i.e., sunflower, corn, fennel, and melons) exhibited time-series vegetation index curves with similar characteristics due to their similar growth cycles. There was also a large overlap in the buffer sections of the four vegetation indices of these crops, making crop classification more difficult. However, differences still existed. First, the peak stage in the sunflower vegetation index was the narrowest, followed by that of fennel, while corn had the widest peak. This was mainly due to the differences in the harvest dates of these three crops. In addition, corn had a higher NDVI and GNDVI than the other four crops in the mid to late growing season, but this was not reflected in the NDVI45 and NDWI. The fennel had the highest NDWI in the middle of the crop growing season. These differences suggest that extracting the deeper differences hidden in the time-series vegetation indices is the key to identifying the different crops.

Classification Performances of the Different Combinations of Classification Strategies
In this study, four classifiers (Conv1D network, LSTM network, RF algorithm, and SVM algorithm) and three different time-series data intervals (5, 10, and 15 days) were tested for classification using the 2019 samples for training and the 2020 samples for testing. We repeated the procedure for each scenario five times and then averaged the overall accuracy to reduce the uncertainty caused by stochasticity. The mean value and standard error of the overall accuracy of each scenario are shown in Figure 7. The results show that the maximum overall classification accuracy achieved in this study was 0.87, which was obtained using the Conv1D network. Therefore, the time when the overall accuracy first exceeded 0.8 (greater than 90% of the maximum accuracy) was used as the time for the early crop mapping in the Shiyang River Basin.
The different classifiers had different classification performances. The best classifica-

Classification Performances of the Different Combinations of Classification Strategies
In this study, four classifiers (Conv1D network, LSTM network, RF algorithm, and SVM algorithm) and three different time-series data intervals (5, 10, and 15 days) were tested for classification using the 2019 samples for training and the 2020 samples for testing. We repeated the procedure for each scenario five times and then averaged the overall accuracy to reduce the uncertainty of the stochasticity. The mean value and standard error of the overall accuracy of each scenario are shown in Figure 7. The results show that the maximum overall classification accuracy achieved in this study was 0.87 when using a Conv1D network. Therefore, the time when the overall accuracy first exceeded 0.8 (greater than 90% of the maximum accuracy) was used as the time for the early crop mapping in the Shiyang River Basin.

Early Identification Time for Each Crop
By comparing the experimental results for the 12 classification scenarios, we found that the Conv1D network and 5-day interval of Sentinel-2 time-series vegetation index data could identify the crops more efficiently within the season. The results shown in Figure 7 indicate that the earliest time when the crops could actually be effectively identified in the Shiyang River Basin was on around DOY210, which was in the middle of the crop growing season in the Shiyang River Basin. Furthermore, we analyzed the earliest identifiable time for each crop, and the F1 score for each crop was found to vary ( Figure 8). The F1 score for wheat was almost stable on around DOY168, with an F1 score of 0.95. The alfalfa also achieved a stable F1 score on around DOY168, with an F1 score of 0.9. The earliest identifiable time for corn was on around DOY198, with an F1 score of 0.82. The fennel and sunflower had similar F1 scores, with low F1 scores before DOY180 and increasing rapidly after DOY180. The F1 score for melons did not exhibit a clear turning point, reaching 95% of the F1 score on around DOY200. Because the vegetation index at every time point was calculated using two data points before and two data points after the current phase in the interpolation of the temporal data, the actual earliest identifiable times for the wheat and alfalfa were on around DOY180, and those for the remaining four crops were on around DOY210. Figure 9 shows the corresponding phenological stages at the earliest identification times for five of these crops. The earliest identifiable time for wheat in the Shiyang River Basin was in its flowering stage, that of alfalfa was in the first harvest stage, that of corn was in the early heading stage, and those of fennel and sunflower were both in the transition period between the flowering and grouting stages. 
The earliest identification time for the melons was early August, but the varieties of the melons and the complexity of their phenological stages made it impossible to determine which phenological stage corresponded to their early identification time. The different classifiers had different classification performances. The best classification accuracy was 0.87 for the Conv1D when using the full time-series data (i.e., end-ofseason mapping), followed by 0.85 for the LSTM network and the SVM network, while the accuracy was only 0.82 for the RF algorithm. The classification accuracies of the four algorithms exhibited a similar pattern over time when mapping within the growing season. The classification accuracy improved rapidly in the early and middle parts of the growing season, plateaued in the middle of the growing season, and increased slowly in the late part of the growing season. However, the rate of increase of the classification accuracy and the time it took to reach stability were not the same for all four classification algorithms. The overall accuracies of the Conv1D network and the LSTM network increased rapidly between DOY160 and DOY200. The classification accuracy of the Conv1D network exceeded 0.8 for the first time on DOY198, with the actual classification accuracy being 0.82 (Figure 7a). The classification accuracy of the LSTM network reached 0.8 for the first time on DOY208 (Figure 7a). The overall accuracies of the RF and SVM algorithms increased between DOY140 and DOY240 and reached 0.8 for the first time on DOY258 and DOY238, respectively (Figure 7c,d), with significantly lower rates of increase of the classification accuracy than those of the two deep-learning algorithms. 
Considering that the data for the previous two and next two phases need to be used when filtering and smoothing the time-series vegetation index data, the actual early mapping times for the Conv1D, LSTM, RF, and SVM algorithms were DOY208 (late July), DOY218 (early August), DOY268 (late September), and DOY248 (early September), respectively. The earliest mapping times for the SVM algorithm and the RF algorithm were later than DOY238 (end of August), which was already in the harvesting period for the summer crops in the Shiyang River Basin. Thus, it was concluded that the RF and SVM were not effective in early crop identification. Figure 7 also shows that the time-series data for the different temporal intervals had different classification performances. When using the Conv1D network for the classification, the overall accuracy of the crop classification using the 5-, 10-, and 15-day interval data exceeded 0.8 for the first time on DOY198, DOY203, and DOY218, respectively, while for the LSTM network, the classification accuracy when using the 5-, 10-, and 15-day interval data exceeded 0.8 on DOY208, DOY228, and DOY253, respectively. The SVM algorithm and RF algorithm performed the worst in this study, with classification accuracies exceeding 0.8 for all three intervals only after DOY238 and DOY258, respectively, and thus, they could not be used for the early crop identification. In addition, the two shallow machine-learning algorithms (the RF and SVM) had significantly lower classification accuracies for post-season mapping when using 10- and 15-day interval data than when using 5-day interval data, whereas this did not occur when using the Conv1D and the LSTM networks. This may be due to the information redundancy of the multi-temporal features in the 5-day interval data, which reduced the classification accuracy of shallow machine-learning algorithms that use these features directly as the basis for the classification.
In contrast, both the Conv1D and the LSTM networks automatically extracted the hidden features from the temporal data using their respective non-linear arithmetic units, reducing the effect of the information redundancy.
In conclusion, the early crop identification times in the Shiyang River Basin varied with the classifier and the interval of the time-series data. The Conv1D and the LSTM networks completed the classification task earlier and with higher accuracies than the SVM algorithm and the RF algorithm. In addition, the 5-day interval data yielded the earliest early-season crop identification times. In this study, the combination of the Conv1D network and the 5-day interval time-series data achieved the earliest crop identification (DOY198) with a high accuracy (OA = 0.82).
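The step from the DOY at which the accuracy first reaches 0.8 to the actual mapping date can be sketched as follows. This is a minimal illustration, assuming that the delay equals the two later phases required by the smoothing window multiplied by the 5-day step, which reproduces the 10-day offsets reported above. The function names and the toy accuracy curve are hypothetical and are not taken from the study.

```python
def first_doy_reaching(accuracy_by_doy, threshold=0.8):
    """Return the first DOY whose overall accuracy reaches the threshold, or None."""
    for doy in sorted(accuracy_by_doy):
        if accuracy_by_doy[doy] >= threshold:
            return doy
    return None


def actual_mapping_doy(first_doy, step_days=5, phases_after=2):
    """Add the wait for the later phases that the smoothing window needs (assumed rule)."""
    return first_doy + step_days * phases_after


# DOYs at which each classifier first reached 0.8 with the 5-day data (from the text).
first_doys = {"Conv1D": 198, "LSTM": 208, "RF": 258, "SVM": 238}
actual_doys = {name: actual_mapping_doy(doy) for name, doy in first_doys.items()}
print(actual_doys)

# Hypothetical accuracy curve, used only to exercise first_doy_reaching.
toy_curve = {188: 0.74, 193: 0.78, 198: 0.82, 203: 0.83}
print(first_doy_reaching(toy_curve))
```

Under this assumption, the computed dates match the actual mapping times stated in the text (DOY208, DOY218, DOY268, and DOY248).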

Early Identification Time for Each Crop
By comparing the experimental results for the 12 classification scenarios, we found that the combination of the Conv1D network and the 5-day interval Sentinel-2 time-series vegetation index data could identify the crops more efficiently within the season. The results shown in Figure 7 indicate that the earliest time at which the crops could be effectively identified in the Shiyang River Basin was around DOY210, which was in the middle of the crop growing season in the basin. Furthermore, we analyzed the earliest identifiable time for each crop, and the F1 score was found to vary among the crops (Figure 8). The F1 score for wheat was almost stable from around DOY168, at 0.95. The alfalfa also achieved a stable F1 score around DOY168, at 0.9. The earliest identifiable time for corn was around DOY198, with an F1 score of 0.82. The fennel and sunflower had similar F1 score curves, which were low before DOY180 and increased rapidly after DOY180. The F1 score for melons did not exhibit a clear turning point, reaching 95% of the F1 score around DOY200. Because the vegetation index at every time point was calculated using two data points before and two data points after the current phase in the interpolation of the temporal data, the actual earliest identifiable times for the wheat and alfalfa were around DOY180, and those for the remaining four crops were around DOY210. Figure 9 shows the corresponding phenological stages at the earliest identification times for five of these crops. The earliest identifiable time for wheat in the Shiyang River Basin was in its flowering stage, that of alfalfa was in the first harvest stage, that of corn was in the early tassel stage, and those of fennel and sunflower were both in the transition period between the flowering and grain-filling stages.
The earliest identification time for the melons was early August, but the varieties of the melons and the complexity of their phenological stages made it impossible to determine which phenological stage corresponded to their early identification time.
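The per-crop analysis above can be illustrated with a short sketch that computes an F1 score and locates the first date at which an F1 curve reaches a given fraction of its maximum. It assumes that the 95% criterion refers to 95% of the highest value of the curve, which the text does not state explicitly. The function names and the F1 values are hypothetical.

```python
def f1_score(tp, fp, fn):
    """F1 score from true positives, false positives, and false negatives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def earliest_identifiable_doy(f1_by_doy, fraction=0.95):
    """First DOY at which F1 reaches `fraction` of the maximum of the curve."""
    target = fraction * max(f1_by_doy.values())
    for doy in sorted(f1_by_doy):
        if f1_by_doy[doy] >= target:
            return doy
    return None


# Hypothetical F1 curve for one crop (not values from the study).
toy_f1 = {168: 0.40, 178: 0.55, 188: 0.70, 198: 0.80, 208: 0.83}
print(earliest_identifiable_doy(toy_f1))
print(round(f1_score(tp=80, fp=20, fn=20), 2))
```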

Early Crop Mapping in the Shiyang River Basin
We used short time-series data from DOY63 to DOY198 (actual cut-off date DOY208) and the Conv1D network to conduct early crop mapping in the Shiyang River Basin (Figure 10). These maps were created using only early-season images from 2020, and the Conv1D classifier was trained using 2019 data without relying on the field samples collected in 2020. As can be seen from the maps, the Shiyang River Basin was dominated by food crops, with corn being the most widely grown and distributed throughout the basin, followed by wheat, which was mainly grown in the middle reaches of the basin and close to the urban areas of Wuwei City. The cash crops such as sunflower, melons, and fennel were mainly located in the lower reaches of the basin, in the northern part of Minqin County, which is extremely arid and economically underdeveloped.
The fragmentation and small sizes of the local crop plots inevitably led to misclassification of small patches and pixels on the plot boundaries. The confusion matrix after proportion calibration for this early mapping is presented in Table 3. The overall accuracy of the crop classification in the Shiyang River Basin using the short time-series data for DOY103 and DOY198 was 0.81, with a kappa coefficient of 0.79. The wheat had the highest PA and UA and was identified the most accurately, while the fennel had a low UA of 0.64, and the melons were susceptible to misclassification as fennel. The melons and corn were also misclassified, which was mainly due to the similarity of the agricultural calendars of these crops. Since the proportions of the melon and sunflower in the sample set were not consistent with the crop distribution in the basin, the proportion-corrected PA values for the melon and sunflower were low (0.58 and 0.51).
Figure 10. Early-season crop map for 2020 in the Shiyang River Basin obtained using the Conv1D network and the images acquired before DOY198 in 2020. The Conv1D network was trained using the images acquired before DOY198 in 2019 and the samples from 2019.
Table 3. Confusion matrix of the early-season crop map for 2020 in the Shiyang River Basin obtained using the Conv1D network. The proportion calibration and 95% confidence interval calculation were based on the method of Olofsson et al. [55] for the PA, UA, and OA using the map accuracy library in the R language. The proportion was calculated by using the pixel number of each crop in the predicted map.
To calculate the 95% confidence intervals, the sample size was set to 654, which is the total number of independent samples from the different crop plots, to avoid excessively narrow confidence intervals.
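The proportion calibration of the confusion matrix can be sketched as follows. This is a minimal illustration of the stratified point estimators commonly used for this purpose, in which each row of sample counts is rescaled by the mapped area proportion of that class. Whether it matches the exact procedure of [55] and of the R library is an assumption, the confidence intervals are omitted, and the function name and the two-class counts are hypothetical.

```python
def area_adjusted_accuracy(counts, map_proportions):
    """Proportion-adjusted OA, UA, and PA.

    counts[i][j]: number of samples mapped as class i with reference class j.
    map_proportions[i]: share of mapped pixels in class i (the shares sum to 1).
    Assumes every mapped class and every reference class has at least one sample.
    """
    k = len(counts)
    # Estimated area proportions: p_ij = W_i * n_ij / n_i.
    p = []
    for i in range(k):
        row_total = sum(counts[i])
        p.append([map_proportions[i] * counts[i][j] / row_total for j in range(k)])
    overall = sum(p[i][i] for i in range(k))
    users = [p[i][i] / sum(p[i]) for i in range(k)]
    producers = [p[j][j] / sum(p[i][j] for i in range(k)) for j in range(k)]
    return overall, users, producers


# Hypothetical two-class example (not the values of Table 3).
toy_counts = [[45, 5], [10, 40]]
toy_proportions = [0.8, 0.2]
oa, ua, pa = area_adjusted_accuracy(toy_counts, toy_proportions)
print(round(oa, 3), [round(v, 3) for v in ua], [round(v, 3) for v in pa])
```

In this toy case, the second class covers only 20% of the map, so its omission errors weigh heavily and its PA drops well below its UA. This is the same mechanism by which a mismatch between the sample proportions and the mapped proportions lowers the proportion-corrected PA.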

Influence of Crop Spectral and Phenological Characteristics on Early Identification Times
We obtained early crop mapping times and early identification times for each of the six crops in the Shiyang River Basin based on 5-day interval time-series Sentinel-2 vegetation index data and a Conv1D network classifier with interannual transfer (Figures 7-9). The early mapping times for the crops in the Shiyang River Basin were DOY200-DOY210 (end of July), which was in the middle of the growing season of the crops in the basin. This is consistent with previous findings that the use of more images improves the accuracy of crop mapping and that a higher accuracy can be obtained in the middle of the growing season [30,57]. In addition, we further analyzed the phenological stages in which the early identification times of the six crops were located and their spectral performances.
The earliest identification time for wheat was during its flowering period (DOY170), when the wheat had completed its vegetative growth phase and the vegetation cover was at its peak. In this stage, wheat had a significantly higher vegetation index, which effectively increased the identifiability of wheat compared with the other crops. The earliest identification time for alfalfa was during its first harvest stage (DOY170), and the variations in the four vegetation indices from low to high to low (Figure 6) also effectively reflected the fact that the first harvest of alfalfa was the key period for differentiating the alfalfa from the other crops. The early identification times for corn, sunflower, and fennel were during the early tassel, early grain-filling, and flowering stages, respectively. It should be noted that the F1 scores for corn, sunflower, and fennel increased rapidly after the beginning of July (DOY180) and reached a steady state around DOY200 (Figure 8). This was because all three crops were in the vegetative growth stage, with rapid root, stem, and vegetation cover development, before July (DOY180), but the simple increase in greenness did not result in large differences between the four vegetation indices (Figure 6). However, the plants entered the reproductive growth stage after the beginning of July and differentiated into distinctive reproductive organs, providing the classifiers with more information to improve the performance. For example, the fennel exhibited an umbel shape and was bright yellow in color. The sunflower exhibited a head-shaped inflorescence and was orange in color. The corn exhibited a panicle or fleshy spike and was light yellow in color. The different shapes, pigment levels, and moisture contents of the crops cause their spectral properties to differ. This was also reflected by the performances of the four vegetation indices.
The NDVI and GNDVI of corn were higher than those of fennel and sunflower after July, and the NDWI of fennel was relatively high (Figure 6). These differences were crucial in distinguishing between corn, sunflower, and fennel. In addition, previous studies support the reliability of our results, i.e., the use of high-temporal-resolution Sentinel-2 type data allowed for the effective identification of summer crops (e.g., crops similar to corn) in the heading/flowering stage [57]. The earliest identification time for the melons was around DOY200, although it was not possible to determine the phenological stage of the melons corresponding to this period. However, as can be seen in Figure 6, the four vegetation indices of melons peaked around DOY200, indicating that the melons had completed their vegetative growth phase and had entered their reproductive growth phase. In addition, the four vegetation indices for the melons around DOY200 were lower than those for the sunflower, fennel, and corn, all of which had similar growing seasons. These differences suggest that late July to early August is a critical time for identifying melons.
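The indices named above are normalized differences of two reflectance bands, which can be sketched as follows. The NDVI and GNDVI use their standard definitions. The NDWI is written here in its NIR/SWIR form, which is an assumption because the definition used in this study is not restated in this section. The function names and the reflectance values are hypothetical.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def vegetation_indices(green, red, nir, swir):
    """Three normalized-difference indices from surface reflectance values."""
    return {
        "NDVI": normalized_difference(nir, red),
        "GNDVI": normalized_difference(nir, green),
        # Assumed NIR/SWIR form; a green/NIR form of the NDWI also exists.
        "NDWI": normalized_difference(nir, swir),
    }


# Hypothetical reflectances for a dense green canopy (not data from the study).
indices = vegetation_indices(green=0.08, red=0.05, nir=0.45, swir=0.20)
for name, value in indices.items():
    print(name, round(value, 3))
```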

Factors Decreasing the Accuracy of the Early Crop Mapping
Compared with the crop classification using samples and full-season data acquired in the mapping year, we made some trade-offs between the timeliness and accuracy for early-season crop identification. First, we applied our model trained using the 2019 samples to map the crops in 2020, and then, we used short time-series data (28 images) before DOY198 to advance the timing of the within-year mapping. These two strategies inevitably had an impact on the accuracy of the remote sensing classification of the crops. To understand the impacts of these two strategies on the classification accuracy of the early crop mapping, we constructed three classification scenarios: crop classification using historical samples and early-season images (HSES), using historical samples and full-season images (HSFS), and using samples and full-season images for the mapping year (SFSMY). The HSFS scenario refers to transferring the classifier trained using the crop samples and full time-series data (49 images) for 2019 to the 2020 time-series data. The SFSMY scenario refers to training the classifier using the crop samples and full time-series data (49 images) for 2020 to predict the 2020 time-series data. Compared with the HSES scenario, which uses historical samples and short time-series data, the HSFS scenario uses historical samples and full time-series data, and the SFSMY scenario uses samples from the mapping year and the full time-series data. Hence, the differences in the accuracies of the three scenarios can help us understand the impact of the two strategies on the classification. We quantitatively compared the results of the three scenarios (Table 4) and conducted a pairwise significance test (Table 5) using McNemar's test (Foody, 2004). It should be noted that only the original 654 samples from the different plots were used to conduct McNemar's test in order to reduce the uncertainty caused by dependent samples.
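McNemar's test compares two classifications of the same samples using only the samples on which they disagree. The sketch below uses the z form of the statistic without a continuity correction, which is an assumption because the exact variant applied in this study is not stated. The function names, the labels, and the counts are hypothetical.

```python
import math


def discordant_counts(reference, pred_a, pred_b):
    """Count samples that only A or only B classified correctly."""
    only_a = sum(1 for r, a, b in zip(reference, pred_a, pred_b) if a == r and b != r)
    only_b = sum(1 for r, a, b in zip(reference, pred_a, pred_b) if b == r and a != r)
    return only_a, only_b


def mcnemar_z(only_a, only_b):
    """z statistic from the two discordant counts (no continuity correction)."""
    discordant = only_a + only_b
    if discordant == 0:
        return 0.0
    return (only_a - only_b) / math.sqrt(discordant)


# Hypothetical labels for six samples, used only to show the counting step.
reference = ["corn", "corn", "wheat", "melon", "fennel", "wheat"]
pred_a = ["corn", "corn", "wheat", "melon", "melon", "wheat"]
pred_b = ["corn", "melon", "wheat", "fennel", "melon", "wheat"]
print(discordant_counts(reference, pred_a, pred_b))

# Hypothetical discordant counts; |z| > 1.96 indicates a difference at the 5% level.
z = mcnemar_z(60, 30)
print(round(z, 3), abs(z) > 1.96)
```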
As shown in Tables 4 and 5, the HSFS and SFSMY scenarios significantly outperformed the HSES scenario at the 5% significance level, with OA values that were 0.04 and 0.13 higher, respectively. This indicates that the interannual spectral variations and the lack of information from the middle and late stages of the growing season both decreased the accuracy of the early crop mapping. It should be noted that the larger gain of the SFSMY scenario (0.13 versus 0.04) indicates that the spectral variations in the crops themselves between 2019 and 2020 in the Shiyang River Basin were the more important factor limiting the accuracy of the early crop mapping. In addition, we analyzed the classification accuracies for the individual crops obtained using the three mapping strategies (i.e., HSES, HSFS, and SFSMY). The F1 score of the HSES for the melons decreased significantly by 0.09 compared with that of the HSFS, and by 0.18 compared with that of the SFSMY. The F1 score of the HSES for the fennel decreased significantly by 0.08 and 0.16 compared with the HSFS and the SFSMY, respectively. This suggests that the interannual spectral variation and the information deficits caused by the use of short time-series data both had a serious impact on the accuracy of the melon and fennel identifications, and these impacts were at a similar level. This can be explained by Figures 6 and 11. The vegetation index values for the fennel and melons after DOY198 exhibited larger differences compared with those of the other crops (Figure 6), and the vegetation index values of the fennel and melons also exhibited obvious changes between 2019 and 2020 (Figure 11). Compared with the HSFS and SFSMY, the F1 scores of the HSES decreased by 0.03 and 0.12 for sunflower, and by 0.04 and 0.13 for corn. This suggests that the spectral variations in the sunflower and corn between the two years were a major factor reducing their early mapping accuracies.
Compared with the HSFS and SFSMY, the F1 scores for wheat decreased slightly by 0.1 and 0.3; however, these differences were not statistically significant. This indicates that the spectral features before DOY198 were sufficient to identify wheat, and the interannual spectral variations influenced the wheat identification only to a limited extent. The F1 score of the HSES for alfalfa decreased by 0.05 and 0.13 compared with the HSFS and SFSMY, but based on the results of McNemar's test, these differences were not statistically significant, i.e., the influences of the interannual spectral variations and the short time-series data were not significant. In conclusion, the early crop mapping accuracies of the six crops in the Shiyang River Basin decreased to different degrees compared with the mapping of the crops using samples and full-season data acquired in the mapping year, and the reasons for the decreases also differed among the crops. Moreover, identifying fennel and melon in the early growing season was challenging due to their large interannual spectral variations and smaller spectral differences in the early growing season.
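The reasoning behind these comparisons can be made explicit with a small calculation. The sketch below assumes that the two effects are additive, so that the drop relative to the HSFS reflects the short time series and the remaining drop relative to the SFSMY reflects the interannual variation. This decomposition is an illustration and not a procedure described in the study. Only the significant differences reported in the text are included.

```python
# Decreases of the HSES scenario relative to the HSFS and SFSMY (values from the text).
drops = {
    "overall accuracy": (0.04, 0.13),
    "melons (F1)": (0.09, 0.18),
    "fennel (F1)": (0.08, 0.16),
    "sunflower (F1)": (0.03, 0.12),
    "corn (F1)": (0.04, 0.13),
}


def decompose(drop_vs_hsfs, drop_vs_sfsmy):
    """Split the total drop into a short-series part and an interannual part.

    Assumes additivity: interannual part = total drop - short-series part.
    """
    short_series = drop_vs_hsfs
    interannual = round(drop_vs_sfsmy - drop_vs_hsfs, 2)
    return short_series, interannual


for name, (vs_hsfs, vs_sfsmy) in drops.items():
    short_series, interannual = decompose(vs_hsfs, vs_sfsmy)
    print(f"{name}: short series {short_series:.2f}, interannual {interannual:.2f}")
```

Under this assumption, the two parts are equal for the melons and fennel, whereas the interannual part is the larger one for the overall accuracy, sunflower, and corn, which agrees with the interpretation given in the text.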

Limitations and Future Work
The field samples in this study were not randomly distributed in the Shiyang River Basin due to the difficulty of conducting fieldwork in this area. The central and southern parts of the Shiyang River Basin are dominated by corn and spring wheat. The northern part of the basin has a more complex planting structure and is dominated by corn, melon, fennel, sunflower, and alfalfa. In recent years, the wheat planting area has decreased significantly in the northern part of the basin because of the lower water use efficiency. The spring wheat samples were mainly collected in the central and southern parts of the basin. Among the crops in the Shiyang River Basin, spring wheat has a unique growing season that starts much earlier (March to early April) and is therefore easier to identify. For corn, no significant difference in variety and growing season existed in the Shiyang River Basin. Thus, even the non-randomly distributed wheat and corn samples can capture the spectral attributes of these crops (Figure 6). For field sampling in the northern part of the basin, we visited the main production zones of each crop and collected samples from each zone. This sampling strategy based on local knowledge partially compensates for the uncertainties caused by the non-random sample distribution. Future studies can be improved by using random samples.
The minimum mapping unit used for the crop classification and the crop type thematic map developed in this study is the pixel. Future research should consider testing our methodology using objects (fields) as the minimum mapping units, as this will reduce within-field variations, which may improve the classification accuracies.

Conclusions
In this study, we investigated the potential of using deep-learning algorithms and time-series Sentinel-2 data for early crop identification in the Shiyang River Basin. We conducted early-season crop identification experiments using four classifiers (Conv1D, LSTM, RF, and SVM) and time-series Sentinel-2 data with three different interval lengths (5, 10, and 15 days) for six crops in the Shiyang River Basin. By comparing the results of the different scenarios, we found that the two deep-learning algorithms performed better than the two shallow machine-learning algorithms in identifying the early-season crop types. Using time-series Sentinel-2 data with a 5-day interval as the input of the Conv1D network yielded the earliest crop mapping with a high accuracy in the Shiyang River Basin. The influences of both the crop physical properties and the mapping strategy were discussed in detail. We found that the interannual variations in the crops' spectral properties were the main factor that reduced the accuracy of the early crop mapping in the Shiyang River Basin compared with the mapping of the crops using samples and full-season data acquired in the mapping year. By employing frequently available remote sensing data and deep-learning methods, we systematically analyzed the possibility of conducting early-season crop identification in highly heterogeneous regions, which may have applications such as more effective crop yield prediction, improved agricultural disaster prevention, and better planting structure optimization.