Feature Fusion Approach for Temporal Land Use Mapping in Complex Agricultural Areas

: Accurate temporal land use mapping provides important and timely information for decision making for large-scale management of land and crop production. At present, temporal land cover and crop classiﬁcations within a study area have neglected the differences between subregions. In this paper, we propose a classiﬁcation rule by integrating the terrain, time series characteristics, priority, and seasonality (TTPSR) with Sentinel-2 satellite imagery. Based on the time series of Normalized Difference Water Index (NDWI) and Vegetation Index (NDVI), a dynamic decision tree for forests, cultivation, urban, and water was created in Google Earth Engine (GEE) for each subregion to extract cultivated land. Then, with or without this cultivated land mask data, the original classiﬁcation results for each subregion were completed based on composite image acquisition with ﬁve vegetation indices using Random Forest. During the post-reclassiﬁcation process, a 4-bit coding rule based on terrain, type, seasonal rhythm, and priority was generated by analyzing the characteristics of the original results. Finally, statistical results and temporal mapping were processed. The results showed that feature importance was dominated by B2, NDWI, RENDVI, B11, and B12 over winter, and B11, B12, NDBI, B2, and B8A over summer. Meanwhile, the cultivated land mask improved the overall accuracy for multicategories (seven to eight and nine to 13 during winter and summer, respectively) in each subregion, with average ranges in the overall accuracy for winter and summer of 0.857–0.935 and 0.873–0.963, respectively, and kappa coefﬁcients of 0.803–0.902 and 0.835–0.950, respectively. The analysis of the above results and the comparison with resampling plots identiﬁed various sources of error for classiﬁcation accuracy, including spectral differences, degree of ﬁeld fragmentation, and planting complexity. The results demonstrated the capability of the TTPSR rule in temporal land use mapping, especially with regard to complex crops classiﬁcation and automated post-processing, thereby providing a viable option for large-scale land use mapping.


Introduction
The global COVID-19 pandemic, locust plagues, and extreme weather events in 2020 seriously destabilized global food security according to the United Nations Food and Agriculture Organization (UNFAO). Along with the International Panel on Climate Change (IPCC) report on "Climate Change and Land" [1], both agencies have stated that the restoration of agricultural land, adjustments to global agricultural production, and strengthening of global food supplies are essential to managing future food security issues. Cropping structures also have a further significant impact on land use and soil degradation. only basic data for agricultural management at different regional scales, but also provides a method reference for temporal crop planting structure extraction in this study. In addition, remote sensing with high-resolutions, such as Worldview and unmanned aerial vehicles (UAVs), provides alternative techniques for monitoring heterogeneous crops in smallholder agriculture by object-oriented classification method [23,24]. Nevertheless, due to the difficult processing and lack of a red-edge band for complex crop extraction, it is difficult to apply these techniques in a large-area crop planting structure extraction. Therefore, according to the above researches, the medium-resolution image with the red-edge band is more effective in monitoring the complex agricultural crop planting structure by integrating vegetation indices, seasonal rhythm, and crop phenology.
For crop extraction, numerous approaches based on unsupervised and supervised classification techniques have been used to map geographic distributions and planting patterns. Currently, machine learning is defined as a scientific field that provides machines the ability to learn without being strictly programmed. Machine learning algorithms are used to identify different crops through single images [16], multi-temporal imagery [17], or fusion features [18]. The majority of the current research, however, focuses on the extraction of staple crops, and dynamic thresholds under variable terrains are rarely considered when building a decision tree [25]. Furthermore, these studies ignore the extraction of special crops across different regions, and accuracy is evaluated across the entire study area while neglecting the subregional-accuracy. For the regularization and automatic post-processing of classification results, seasonality is also often ignored. Furthermore, traditional methods, such as ERDAS Imagine, ENVI, and eCognition, lead to low efficiency large-area crop extraction owing to time-consuming processes, such as image data acquisition, preprocessing, and classification. All of these limitations highlight the need to develop more efficient and accurate methods to extract cropping structures.
With the advent of accessible satellite data, it is easier to create low-cost crop planting structure maps using features derived from remotely sensed spectral differences in vegetation and other surfaces over time [26]. These efforts have been aided by the success of object-oriented method and machine learning methods, such as support vector machines (SVM), random forests (RF), neural networks (NN) and deep learning (DL) [27], which allow for large-scale automated analysis of satellite imagery. For instance, membership probabilities of maximum likelihood classifier (MLC) [28], segmentation threshold of decision tree (DT) [29], or multilayer neural network [30] have been used on information extraction of multi-source images. The convenient data storage and cloud computing capabilities of the Google Earth Engine (GEE), as well as the freely-shared satellite imagery from the Landsat, MODIS, and Sentinel missions, in addition to other publicly available geospatial datasets, collectively provide the technical and data capabilities for the rapid extraction of agricultural planting structures [31,32].
Machine learning algorithms are an important component of supervised classification research, and similar studies have been carried out on land cover changes [33], cotton production [34], residential mapping [35], climate prediction [36], vegetation identification [37], and aquatic weeds [38]. Recently, non-parametric machine learning algorithms like RF or SVM can overcome the shortcomings of conventional classifiers like MLC since they are not constrained to assumptions like the parametric distributions of the input data [39]. Object-Based Image Analysis (OBIA) is the subdivision of an image into homogeneous regions through the grouping of pixels in accordance with determined criteria of homogeneity and heterogeneity, which considers the analysis of an "object in space" instead of a "pixel in space" [40]. A time series of Landsat TM/ETM+ images was acquired to generate the thematic map of sugarcane in Brazil based on OBIA, and the classification achieved an overall accuracy of 94% and a Kappa coefficient of 0.87 [41]. However, the segmentation scale is difficult to determine in complex crop structure extraction. SVM and RF are well-established machine learning techniques that have given promising accuracy in crop classification and have been shown to perform more accurately than other classifiers in crop classification [20,39]. Grabska et al. [42] and Waske and Benediktsson [43] concluded Remote Sens. 2021, 13, 2517 4 of 29 that SVM in the forest tree species (Sentinel-2 image) and landcover classification of multisensory datasets had higher classification accuracy when compared with MLC and DT. RF from Breiman [44] have successfully been applied for crop classification. Moreover, Pal [45] concluded that RF was more suitable for multivariate classification, and the training time is shorter in the case of nonlinear kernel than that of SVM.
The Random Forest (RF) is a more mature machine learning algorithm for the automatic extraction of features in remote sensing imagery [46]. In addition, it computes a feature importance score that can be used to select and reduce variables, and appears to be robust to high feature space dimensionality [44,45]. Li and Chen [47] concluded that RF had the best performance on multiple indicators out of five different machine learning algorithms, including neural network, decision tree, logistic regression, Naïve Baye and support vector machine; however, this method was based on mathematical theory and financial management activity, which did does not apply to multi-crop classifications in geographic space. Pareeth et al. [48] applied RF to classify land cover based on a time series of Landsat 8 imagery and achieved an overall accuracy (OA) of 87.2%, with a kappa coefficient of 0.85; however, this method included fewer categories (i.e., the number of extraction crops) and did not consider crop classification and the influence of terrain or seasonality. Son et al. [49] used multi-temporal Sentinel-2 imagery and field-based classification to successfully identify rice paddies in a small area of Taiwan. By analyzing the characteristic small size of rice fields and information on crop phenology, an approach for mapping rice-growing areas was developed at the field level using multi-temporal Sentinel-2 data. However, only the rice crop was extracted, along with difficulties in obtaining field level data over large; the small study area, single crop analysis, and excessive data processing times do not satisfy the requirements for extracting crop planting structures over large regions.
Presently, in temporal remote sensing, research studies based on vegetation index (VI) or time-series images are impressive [50,51]. Zhai et al. [52] developed the automated method based on satellite image time series to map the spatial distribution of three major crops (maize, rice, and soybean) in northeastern China. Wen et al. [53] propose the preconstrained method integrating phenology-based vegetation indexes and RF for multi-year (2012-2016) mapping in regions with a complex planting structure, such as areas cultivating maize, wheat, and sunflower, indicating that the phenology-based approach was effective in improving the classification accuracy. Liu et al. [54] and Liu et al. [55] use RapidEye and Landsat images to extract the distribution of single-season (including corn, soybean, etc.) and across-season crops (wheat in winter, and rice, cotton, corn, orchard, water, and bare-land in summer), which proves the effectiveness of red-edge band and timeseries Normalized Difference Vegetation Index (NDVI) in improving classification accuracy, respectively. The above studies completed the classification of complex crops at different regional scales but neglected the particularity of crop planting structure and accuracy among different subregions. However, there are few researches on the extraction of large-scale crop patterns across seasons in complex agricultural areas with 10 m-resolution mapping.
In this study, the Normalized Difference Water Index (NDWI) and Normalized Difference Vegetation Index (NDVI) time series for four land types (i.e., forest, cultivated land, urban areas, and water) were established for the study area using the GEE and Sentinel-2 imagery. In combination with a digital elevation model (DEM), priority for reclassification processing, seasonality, and other information, a decision tree and post-processing reclassification rule were constructed. Based on these results, the land cover and crop planting structures for the entirety study area were extracted. Finally, the feature importance, overall accuracy, and influential factors for the derived land mask and extraction results over time were analyzed in detail. Our method can extract the crop planting structure via the fusion of multiple features, such as the terrain, time series characteristics, priority, and seasonality (TTPSR), when using a 10 m composite Sentinel-2 image covering the study area in Henan Province, China. The purpose of this study is to provide a new set of methods and rules for Remote Sens. 2021, 13, 2517 5 of 29 the automated processing of crop structures across complex large-scale agricultural areas, as well as to establish a reference for regional to national crop extraction.  (Figure 1). Henan Province is classified as a monsoonal climate, extending from the northern subtropical to the warm temperate zone. The average annual temperature is between 10 and 17 • C, annual precipitation is between 407 and 1230 mm, and the frost-free period is >200 days. The corresponding periods of rain and heat are advantageous for cultivation. Crops within this study area are diverse and account for nearly 10% of the total grain production in China [56,57]. The terrain of the study area is highly variable with high elevations in the west and low elevations in the east, surrounded by mountains in the north, west, and south that gradually flatten out into hills, basins, and plains. The distribution of terrain is complex, and the degree of field fragmentation is particularly high in the mountainous and hilly areas. According to the 30 m DEM and terrain classification standard, the study area can be divided into three types: plains (<200 m), hills (≥200 m), and mountains (≥500 m; Figure 1b).

Data and Study
planting structure via the fusion of multiple features, such as the terrain, time series characteristics, priority, and seasonality (TTPSR), when using a 10 m composite Sentinel-2 image covering the study area in Henan Province, China. The purpose of this study is to provide a new set of methods and rules for the automated processing of crop structures across complex large-scale agricultural areas, as well as to establish a reference for regional to national crop extraction.

Study Sites
Henan Province, located in central China (31°23′ N-36°22′ N, 110°21′ E-116°39′ E), has a total area of ~165,000 km 2 ( Figure 1). Henan Province is classified as a monsoonal climate, extending from the northern subtropical to the warm temperate zone. The average annual temperature is between 10 and 17 °C, annual precipitation is between 407 and 1230 mm, and the frost-free period is >200 days. The corresponding periods of rain and heat are advantageous for cultivation. Crops within this study area are diverse and account for nearly 10% of the total grain production in China [56,57]. The terrain of the study area is highly variable with high elevations in the west and low elevations in the east, surrounded by mountains in the north, west, and south that gradually flatten out into hills, basins, and plains. The distribution of terrain is complex, and the degree of field fragmentation is particularly high in the mountainous and hilly areas. According to the 30 m DEM and terrain classification standard, the study area can be divided into three types: plains (<200 m), hills (≥200 m), and mountains (≥500 m; Figure 1b).

Sentinel-2 Data
With the launch of the Sentinel-2A and -2B satellites in 2015 and 2017, respectively, Sentinel-2, developed by the European Space Agency (ESA), comprises a constellation of two polar-orbiting satellites in the same sun-synchronous orbit, providing imagery across 13 spectral bands, with a 10-day revisit period and a maximum spatial resolution of 10 m (Table 1). Sentinel-2 has four red-edge bands (B5, B6, B7, and B8A), which provide important data for complex crop classification. B5 and B6 are particularly useful for red-

Sentinel-2 Data
With the launch of the Sentinel-2A and -2B satellites in 2015 and 2017, respectively, Sentinel-2, developed by the European Space Agency (ESA), comprises a constellation of two polar-orbiting satellites in the same sun-synchronous orbit, providing imagery across 13 spectral bands, with a 10-day revisit period and a maximum spatial resolution of 10 m (Table 1). Sentinel-2 has four red-edge bands (B5, B6, B7, and B8A), which provide important data for complex crop classification. B5 and B6 are particularly useful for red-edge position, B7 for calculating the inversion of the leaf area index (LAI) [58], and B8A is sensitive to LAI, chlorophyll, and biomass [59]. The data have been widely used in agricultural monitoring [60], ecological assessments [61], and land cover change analyses [62]. This study utilized the GEE to obtain Sentinel-2 archived data, as well as a quality band (QA60) that can be used for cloud masking to reduce the influence of noise on feature extraction. The Sentinel 2 satellite utilizes the Universal Transverse Mercator projection and follows the US-MGRS (U.S. Military Grid Reference System) to set the satellite orbit number; the number of images covering the study area ranged from (49SD-50SM) to (50RK-50SL). In addition, the three bands with a 60 m resolution (i.e., B1, B9, and B10) were removed during the classification process. In Henan Province, the main rotation structures are between wheat and maize (two primary crops per year), and the main growth periods are shown in Figure 2. Winter crops (primarily wheat) are generally sown in early October, enter a jointing stage in March of the following year; following a growing stage, these crops mature and are harvested at the end of May. Spring crops, such as maize (MAI) and peanuts (PEA), are usually sown in early May, enter their peak growth period in mid-June, and are harvested in mid-to late-August. Tobacco (TOB) is considered a summer crop because it is grown only once a year and harvested in late-August. Summer crops (primarily maize) are generally sown in early June, enter a growing phase in mid-July, and are harvested in mid-September. When selecting imagery, considering the specific phenological periods of crops across seasons is important, as well as limiting the time intervals between images as much as possible.
ote Sens. 2021, 13, x FOR PEER REVIEW 7 of period of spring and summer, and distinguish whether it is a spring crop from the follo ing two strategies. The first strategy is that if the winter-extraction is non-cultivated lan it can be determined that the crops are planted in spring. The other strategy is that if t image-feature is non-cultivated land in late-August or early-September, which indica that the spring crops have been harvested, then the images from mid-June to late-Ju should be reselected for spring crop extraction. With cloud cover less than 10% in 20 we obtained 346 and 153 images from March 1 to May 18 (for winter analysis) and fro August 1 to September 10 (spring-summer analysis), respectively.

Sample Data
An accurate spectral characteristic of different crops in remote sensing images is sential for crop extraction. We use the ArcMap 10.5 in Getac F110 computer and Geoma per software in handheld GPS named Jisibao to obtain ground data, which can load a In the early stages of crop growth, spectral information is largely uninformative owing to low crop coverage and a large relative influence from the ground surface [63]. Spectral information increases with growth and coverage, enabling crop identification. Accordingly, images from late February to early May were selected for winter crop classification while images from late July to early September were selected for summer crops. The planting area of crops in spring is small and relatively concentrated. To reduce the time and workload of classification, images are selected according to the over-lap time period of spring and summer, and distinguish whether it is a spring crop from the following two strategies. The first strategy is that if the winter-extraction is non-cultivated land, it can be determined that the crops are planted in spring. The other strategy is that if the image-feature is non-cultivated land in late-August or early-September, which indicates that the spring crops have been harvested, then the images from mid-June to late-July should be reselected for spring crop extraction. With cloud cover less than 10% in 2020, we obtained 346 and 153 images from 1 March to 18 May (for winter analysis) and from 1 August to 10 September (spring-summer analysis), respectively.

Sample Data
An accurate spectral characteristic of different crops in remote sensing images is essential for crop extraction. We use the ArcMap 10.5 in Getac F110 computer and Geomapper software in handheld GPS named Jisibao to obtain ground data, which can load and process image and vector data, and completely match the location information between the ground-data point and image. The Getac F110 is a GPS-equipped tablet with positioning accuracy <3 m. The GPS toolbox can provide a spatial-positioning module embedded in ArcMap, with location services for ground surveys. In order to distinguish the spectral characteristics of different ground objects, the difference was recorded in detail with more than ten kinds of bands-based false color composite, such as B8A/B11/B3, B8A/B4/B3, B6/B4/B1, B4/B3/B2 and so on. Based on this, the interpretation marks library is established as a reference for the next manual selection of sample data. The ground survey (Table 2) consisted of identifying precise locations via handheld GPS devices during crop growth phases to pair the crop type with the coordinate positions across different locations. The spatial distribution of the sample plots and points is shown in Figure 1a; notably, the ground survey of the spring and summer crops was carried out simultaneously. When collecting sample points on the ground, the location was ensured to be >5 m from the field boundary, and the total field area was greater than the area of three pixels in the imagery (30 m × 30 m). Based on this, the sample points and image pixels corresponding to ground objects can be matched, and the influence of mixed pixels can be reduced. The crop planting structure in summer is more complex than that in winter: a total of 2254 sample points were collected, 508 in winter and 1746 in summer. Planting structures in the individual fields of 115 sample plots (~500 × 500 m) were recorded in the same locations for winter and summer, with a total area of 2.51 km 2 .
Among all the sample plots, only the urban area remained unchanged (Table 2); the area of water cover changed in the summer due to the planting of lotus and other crops while the forested land area changed owing to interplanting of PEA and soybeans (SOY). Combined with the spectral characteristics of the ground survey data across different seasons and Tobler's First Law of Geography [64], the number of samples, including both the ground survey and visually interpreted data, was increased to 35,592 and 44,828 based on different bands combination in the winter and summer, respectively. During the classification process, samples of each subregional unit were randomly split into training (70%) and testing (30%) subsets. The classification codes in Table 2 were used for crop classification in the GEE; the abbreviations were used to classify the different land type mapping analyses. According to the ground survey results, the major winter crops/coverage types were wheat (WHE), garlic (GAR), rapeseed (RAP), and non-cultivated (WNC) land; summer crops were dominated by maize, peanuts, soybeans, rice (RIC), tobacco, millet (MIL), sesame (SES), rehmannia (REH), and other vegetables or crops (SOTH). The staple and minor crop are Remote Sens. 2021, 13, 2517 8 of 29 relative in different subregions, which is the major crop in one subregion while being minor in another subregion. In addition, different crop planting structures expressed notable regional aggregation characteristics; therefore, this study used the municipal administrative divisions with relatively homogeneous internal planting structures as the subregional unit. Land for construction, such as houses and roads 9.85% -Different bands contain different spectral information, and their combination can reveal a unique spectral signature for certain features [65]. The images with Tile Number T50SKE and land types for the sample plots in the winter (17 April 2020) and summer (4 September 2020) are shown in Figure 3. The optimization selection R/G/B of images in winter and summer were B8A/B4/B3 and B8A/B11/B4, respectively. The true color composite is a widely used Earth observation product for displaying satellite imagery. However, the false color composite of different bands can highlight different features, especially the red edge band for complex crops. Combined with the ground survey data, we attempted to achieve visual interpretation of crops in different seasons with a variety of different false color synthetic bands. The spectral features reflected by the ground survey data in different band combinations (including B8A/B4/B3, B8A/B11/B3, B8/B11/B3, B4/B3/B2, and others) were recorded and utilized, and the sample points were selected by visual interpretation combined with the spatial position and shape of the image.
Remote Sens. 2021, 13, x FOR PEER REVIEW 9 of 30 different false color synthetic bands. The spectral features reflected by the ground survey data in different band combinations (including B8A/B4/B3, B8A/B11/B3, B8/B11/B3, B4/B3/B2, and others) were recorded and utilized, and the sample points were selected by visual interpretation combined with the spatial position and shape of the image.

Overall Workflow
As shown in Figure 4, the overall workflow includes three stages: data processing, decision tree and classification, and reclassification and accuracy analyses. First, data processing included both ground survey data and image preprocessing. Second, NDWI and NDVI time series data were calculated for four landcover types (cultivated land, forest, water, and urban) based on the selected ground sample points, which were used to set the preliminary thresholds of a decision tree [66]. Then, a dynamic decision tree was created, combining the terrain and cultivated land areas in each subregional unit. The extraction results with and without masked cultivated land were classified based on the RF classifier. Finally, the RF extraction results for both seasons were processed using the reclassification rules, which were generated according to the terrain, seasonality, priority, and planting methods, such as interplanting. Feature importance and classification accuracy under different conditions were analyzed, and a single map of the crop planting structures with different fields in an attribute table through time was created for the study area. Overall, the TTPSR was constructed by multiple features, including the terrain, time series characteristics, priority, and seasonality.

Overall Workflow
As shown in Figure 4, the overall workflow includes three stages: data processing, decision tree and classification, and reclassification and accuracy analyses. First, data processing included both ground survey data and image preprocessing. Second, NDWI and NDVI time series data were calculated for four landcover types (cultivated land, forest, water, and urban) based on the selected ground sample points, which were used to set the preliminary thresholds of a decision tree [66]. Then, a dynamic decision tree was created, combining the terrain and cultivated land areas in each subregional unit. The extraction results with and without masked cultivated land were classified based on the RF classifier. Finally, the RF extraction results for both seasons were processed using the reclassification rules, which were generated according to the terrain, seasonality, priority, and planting methods, such as interplanting. Feature importance and classification accuracy under different conditions were analyzed, and a single map of the crop planting structures with different fields in an attribute table through time was created for the study area. Overall, the TTPSR was constructed by multiple features, including the terrain, time series characteristics, priority, and seasonality.

Time Series Feature Analysis and Dynamic Decision Tree Construction
Landcover types (cultivated land, forest land, water, and urban) were extracted and masked to reduce the influence of non-agricultural regions on agricultural structure extraction. The preliminary decision tree was created through the following process.

Time Series Feature Analysis and Dynamic Decision Tree Construction
Landcover types (cultivated land, forest land, water, and urban) were extracted and masked to reduce the influence of non-agricultural regions on agricultural structure extraction. The preliminary decision tree was created through the following process.
First, 50 sample points for each landcover type were selected in the study area. Forest samples were mainly selected in mountainous and hilly regions, areas of perennial crop growth were selected for cultivated land, buildings and roads were selected for urban cover, and rivers and reservoirs were used for water.
Second, the time series of the NDWI and NDVI for selected sample sites in each landcover type were calculated (Figure 5a,b) using Sentinel-2 images (2556 images from 1 October 2019, to 30 September 2020); the doySeriesByRegion in the GEE and initial threshold values incorporated the seasonal variability in the vegetation indices. Discriminating between water and other landcover types was relatively straightforward (Figure 5a,b); an NDWI > 0.1 threshold was used to mask water pixels. Forests and cultivated land can be remotely distinguished from each other during the following three stages (Figure 5b): late September to mid-November (first stage), February to March (second stage), and late May to mid-June (third stage). Combined with the ground survey data, the first stage was marked by a rapid decrease in the NDVI for cultivated land caused by the bare surface following crop harvest; the NDVI for forested land also decreased slightly, but enough vegetation coverage was maintained to display a notable difference between the two landcover types. In the second stage, distinguishing parts of the forest from winter crops was difficult due to the high vegetation index values for evergreen forests in the southwest region. During the third stage, large areas of vegetable cultivation in central and eastern Henan Province during crop rotation are indistinguishable from forested land; therefore, the first stage NDVI was selected to distinguish between forests, cultivated land, urban, and water.
Preliminary decision tree classification rules were established based on the above analyses. In addition to the extraction threshold of water defined above (NDWI > 0.1), forested land was defined as NDVI ≥ 0.45 from 20 September to 1 November, cultivated land was defined as 0.03 ≤ NDVI < 0.45, and urban-water were considered as NDVI < 0.03. Urban pixels were extracted by masking these results with the water NDWI results.
We randomly generated 1000 sample points in the cultivated land mask data, and verified the accuracy of cultivated land classification based on the high-resolution image of Google Earth. The result shows that there are 972 cultivated land sample points and 28 non-cultivated land sample points (most of which are intercropping under forest and garden land). The accuracy of cultivated land extraction is 97.2%, which can meet the requirements of the crop extraction.
According to the terrain and spectral influence between different land types in each subregional unit, we found that forested land had a greater influence on cultivated land in hilly and mountainous regions, and the upper threshold limit of the decision tree was adjusted accordingly. In the plains, the spectra of staple crops had a significant influence on minor crops (such as garlic and other crops, such as onion), while decreased interference was observed between the agriculture and non-cultivated land. Therefore, according to the initially established decision tree and above analyses, the NDVI-based threshold between cultivated and non-cultivated areas was determined to be 0.01. Based on the premise of the maximum area of cultivated land in each basin unit, a threshold for the dynamic decision tree between forests and cultivated land was also established. The terrain area and thresholds for each subregional unit, as well as imagery dates, are listed in Table 3.
Notably, obtaining imagery in the early period of summer cropping was difficult due to the influence of seasonal rain and clouds; hence, a portion of the images was selected in the later stages of crop growth. The threshold between forests and cultivated land in more mountainous areas was higher than those in the plains; therefore, when building the decision tree rules for different landcover types, factors, such as regional terrain and seasonality, were considered. We randomly generated 1000 sample points in the cultivated land mask data, and verified the accuracy of cultivated land classification based on the high-resolution image of Google Earth. The result shows that there are 972 cultivated land sample points and 28 non-cultivated land sample points (most of which are intercropping under forest and garden land). The accuracy of cultivated land extraction is 97.2%, which can meet the requirements of the crop extraction.
According to the terrain and spectral influence between different land types in each subregional unit, we found that forested land had a greater influence on cultivated land in hilly and mountainous regions, and the upper threshold limit of the decision tree was adjusted accordingly. In the plains, the spectra of staple crops had a significant influence on minor crops (such as garlic and other crops, such as onion), while decreased interference was observed between the agriculture and non-cultivated land. Therefore, according to the initially established decision tree and above analyses, the NDVI-based threshold between cultivated and non-cultivated areas was determined to be 0.01. Based on the premise of the maximum area of cultivated land in each basin unit, a threshold for the dynamic decision tree between forests and cultivated land was also established. The terrain area and thresholds for each subregional unit, as well as imagery dates, are listed in Table 3. Notably, obtaining imagery in the early period of summer cropping was difficult due to the influence of seasonal rain and clouds; hence, a portion of the images was selected in the later stages of crop growth. The threshold between forests and cultivated land in more mountainous areas was higher than those in the plains; therefore, when building the decision tree rules for different landcover types, factors, such as regional terrain and seasonality, were considered.

Classification Algorithm and Evaluation Indicators
RF is a machine learning algorithm built upon a classification and regression tree (CART) using the ensemble method developed by Breiman [44,67]. The algorithm utilizes bootstrap aggregation (i.e., bagging) to reduce overfitting and improve the generalization accuracy. The RF algorithm provides the feature importance assessed using the out-of-bag data samples [68], which has been used extensively to accurately classify various types of remote sensing imagery, such as the classification of glacial or non-glacial lakes [69] and for phenology-based crop type (sugar beets and grains) mapping [70].
The parameters of the RF classifier in the GEE were set to: numberOfTrees = 100; minLeafPopulation = 1, and the remaining parameters were left as their default values. The image acquisition dates and times of 15 characteristic bands for each subregional unit are listed in Table 3. The OA derived from the confusion matrix and kappa coefficient has been widely used in the literature for evaluation purposes; a detailed description of the methods can be found in Congalton [71]. As the training and test samples were split randomly, the classification results of each landcover type were repeated five times, and the average value was taken as the final value of the evaluation indicators, including the feature importance, OA, and kappa coefficients. Additionally, the reclassification accuracy was analyzed again with a combination of resampling plots and visual interpretation.
A raster of the resampling plots was used to ensure the consistency of the extraction results during the accuracy assessment while visual evaluations provided a direct method for analyzing the classification results. To assess the differences of seasonality, the feature importance of each classification result was normalized based on the maximum value; then, the average values of the feature importance for all subregional units with and without the cultivated land mask were calculated. The average of any feature, F, was computed as follows: where for any subregional unit, F i is the average feature importance of the ith feature repeated five times, F i-max is the maximum value of all F i , and n is the total number of subregional units.

Feature Band Establishment and Importance Assessment
Satellite imagery can be used to obtain spatially distributed reflectance data on plant growth and development at different stages of the growing season [72]. Vegetation indices are often used as parameters to evaluate the growth of vegetation and improve the classification accuracy of crops; however, the identification of texture features is difficult due to the limitations of an image's spatial resolution, small field area, and complex crop types. Therefore, 15 feature bands, including 10 spectral bands and five vegetation indices, were generated for crop classification in each subregional unit. The five vegetation indices examined were NDVI, NDWI, Normalized Differences Built-up Index (NDBI), Red Edge NDVI (RENDVI), and Enhanced Vegetation Index (EVI), computed with Equations (2)-(6), respectively, as follows: where B2, B3, B4, B5, B8, and B11 correspond to the bands of Sentinel-2, as listed in Table 1. Figure 6 shows how the features are significantly affected by the spectral information for different land types in different seasons. The top five feature importance average values were winter with cultivated land mask (W_mask)-B2, NDWI, RENDVI, B11, and B12; winter without a cultivated land mask (W_nomask)-B2, NDWI, RENDVI, NDBI, and B11; and summer with (S_mask) and without cultivated land mask (S_nomask)-B11, B12, NDBI, B2, and B8A. Ultimately, feature importance was affected by the difference in seasonal land types. B2 and B11 are of high importance in the classification process across both seasons. In winter, NDWI and RENDVI were more important than the other vegetation indices; however, both were calculated using B3, B4 and B5, thus reflecting that among red-edge bands, B5 is of greater importance in winter crop classification (i.e., B5 > B6 > B8A > B7). In summer, NDBI is the most important vegetation index, and the feature importance of B8A was higher than that of the other red-edge bands. In conclusion, both the blue bands and shortwave infrared spectral bands had the highest importance values while the importance of the red-edge bands varied seasonally across land types, consistent with the conclusions of Song et al. [13].

Reclassification by Fusion of Seasonal and Multiple Features
The results of the pixel-based supervised classification almost always produce numerous scattered tiny spots known as the "salt and pepper" phenomenon. To minimize this effect and improve the accuracy, the RF extraction results with the cultivated land mask were calculated according to the reclassification rules, which fused the terrain, land types, priority, and seasonality. All reclassification processing and statistics were programmed in Python 2.7 and Arcpy; the main process was as follows: 1. DEM resampling: The 30 m DEM in the study area was resampled to a resolution consistent with the extraction results (10 × 10 m), and the terrain (plains, hills, and mountains) was reclassified as 1, 2 and 3, respectively (Tresult).
2. Extraction results for each subregional unit: The existence of zero values in the extraction results was caused by the maximum outer boundary of each subregion unit. Therefore, a function named SetNull was used to treat zero values as null and resave the results for winter and summer, termed Wresult and Sresult, respectively.
3. Priority analysis of land types: The difference in the spectral characteristics of similar land types lead to an inconsistency in the classification results for grassland and green areas at highway medians in urban areas. Therefore, the combined dataset for the urban extraction results for both seasons, and the union results for urban and rural residential areas were used (Figure 7). The extraction results for winter-based unplanted areas (WNC) and summer-based tobacco (TOB) had greater accuracy due to the small area and relatively simple planting structure of cultivated land in mountainous areas; hence, the

Reclassification by Fusion of Seasonal and Multiple Features
The results of the pixel-based supervised classification almost always produce numerous scattered tiny spots known as the "salt and pepper" phenomenon. To minimize this effect and improve the accuracy, the RF extraction results with the cultivated land mask were calculated according to the reclassification rules, which fused the terrain, land types, priority, and seasonality. All reclassification processing and statistics were programmed in Python 2.7 and Arcpy; the main process was as follows: 1. DEM resampling: The 30 m DEM in the study area was resampled to a resolution consistent with the extraction results (10 × 10 m), and the terrain (plains, hills, and mountains) was reclassified as 1, 2 and 3, respectively (T result ).
2. Extraction results for each subregional unit: The existence of zero values in the extraction results was caused by the maximum outer boundary of each subregion unit. Therefore, a function named SetNull was used to treat zero values as null and resave the results for winter and summer, termed W result and S result , respectively.
3. Priority analysis of land types: The difference in the spectral characteristics of similar land types lead to an inconsistency in the classification results for grassland and green areas at highway medians in urban areas. Therefore, the combined dataset for the urban extraction results for both seasons, and the union results for urban and rural residential areas were used (Figure 7). The extraction results for winter-based unplanted areas (WNC) and summer-based tobacco (TOB) had greater accuracy due to the small area and relatively simple planting structure of cultivated land in mountainous areas; hence, the original extraction results were largely maintained. There were inconsistencies, however, with mountainous forested land (MFL), especially in deciduous forest regions (e.g., shrublands); hence, the resulting MFL was a union of both seasonal datasets. Finally, in each season, both hilly and plains regions had crops planted under the forest canopy, and water areas (WAT) had different seasonal flood patterns and crop planting structures, such that the original extraction results were maintained. Therefore, the priority for different land types was URB > TOB or WNC > MFL > WAT > Other classification results.
As there were 14 categories in the summer extraction system, Calresult was set to a 4bit code. When Wresult was WNC, MAI and PEA in the summer crops were reclassified as spring MAI and PEA.
5. Recoding and reclassification: The terrain code (Tcode), winter reclassification code (Wcode), summer reclassification code (Scode), and spring result code (Spcode) were automatically added and calculated based on the Calresult of each subregional unit. Then, according to the land type priority from Step (3), a single map of the reclassification results with different fields in the attribute table was generated. When SS = 06, or W = 6 in TSSW, the reclassification code for different seasons was 6-URB. A TSSW value of 3114 indicated that Tcode was 3-mountain, Wcode was 4-WNC, and Scode was 11-MAI; however, as Wcode was 4-WNC, Spcode should be 411-spring MAI; otherwise, Spcode is 11-MAI. All specific codes and land type abbreviations are listed in Table 2. 6. Filtering and result statistics: The nearest neighbor rule and a replacement threshold of "half" in the MajorityFilter tool were adopted to reduce noise by reclassifying tiny spots resulting from Step (5) into an adjacent large category to obtain the final classification result for each subregional unit. Finally, the area results for each were calculated using ZonalStatisticsStable tool.

Overall Accuracy
Accuracy analyses were carried out to test the performance of the RF algorithm under different conditions. Based on the same randomly generated training and testing datasets, Figure 8 shows the results of the five simulations for each subregional unit with and without the cultivated land mask across both seasons. In addition, the range of the Y-axis is set from 0.7 to 1 to compare the assessment indicators in the uniform range, with the following conclusions.
1. The OA and kappa coefficients of the extracted results in both seasons showed strong performances with RF compared with the sample point data. For all basic subregions, nearly 80% maintained an average OA > 0.85, and only two observed kappa coefficients were <0.8, i.e., Puyang City (0.770) and Shangqiu City (0.782), when assessed without the cultivated land mask in winter. With or without the cultivated land mask, the OA ranges in winter were 0.857-0.935 and 0.829-0.923 and the kappa coefficients were 0.803-0.902 and 0.770-0.891, respectively. Summer OA ranges observed with and without the 4. Coding rule establishment: Combining T result , S result , and W result , a new 4-bit coding (TSSW) raster data (Cal result ) was generated using the Raster Calculation in Arcpy and Equation (7): Cal result = 1000 * T result + 10 * S result + W result .
As there were 14 categories in the summer extraction system, Cal result was set to a 4-bit code. When W result was WNC, MAI and PEA in the summer crops were reclassified as spring MAI and PEA.
5. Recoding and reclassification: The terrain code (T code ), winter reclassification code (W code ), summer reclassification code (S code ), and spring result code (S pcode ) were automatically added and calculated based on the Cal result of each subregional unit. Then, according to the land type priority from Step (3), a single map of the reclassification results with different fields in the attribute table was generated. When SS = 06, or W = 6 in TSSW, the reclassification code for different seasons was 6-URB. A TSSW value of 3114 indicated that T code was 3-mountain, W code was 4-WNC, and S code was 11-MAI; however, as W code was 4-WNC, S pcode should be 411-spring MAI; otherwise, S pcode is 11-MAI. All specific codes and land type abbreviations are listed in Table 2. 6. Filtering and result statistics: The nearest neighbor rule and a replacement threshold of "half" in the MajorityFilter tool were adopted to reduce noise by reclassifying tiny spots resulting from Step (5) into an adjacent large category to obtain the final classification result for each subregional unit. Finally, the area results for each were calculated using ZonalStatisticsStable tool.

Overall Accuracy
Accuracy analyses were carried out to test the performance of the RF algorithm under different conditions. Based on the same randomly generated training and testing datasets, Figure 8 shows the results of the five simulations for each subregional unit with and without the cultivated land mask across both seasons. In addition, the range of the Y-axis is set from 0.7 to 1 to compare the assessment indicators in the uniform range, with the following conclusions. 2. The extraction accuracy of each sample pixel with the agricultural land mask was significantly higher than that without the cultivated land mask, consistent with the conclusions of Forkuor et al. [73]. Comparing the average OA for the different seasons, there were eight subregional units with an increase of more than 2%.
3. The accuracy in summer was higher than that in winter, and the number of extracted categories were also greater. When sample points for the winter were selected, there was an apparent phenomenon of foreign bodies with similar spectral characteristics among sapling forests, unplanted cultivated land, construction sites, plastic mulching areas, and greenhouses, thereby making these objects difficult to distinguish. Although the number of extracted categories in summer was greater, the spectral characteristics among the land types were more varied and relatively easier to distinguish. Thus, consistent with the conclusions of Lim et al. [74] and Shao and Lunetta [75], the classification accuracies were directly related to the sample homogeneity/heterogeneity. 4. When examining the estimated spatial distribution of the results (Figure 9a,b), the OA of the staple crops maintaining a large proportion of the area was higher than the accuracy of minor crops, consistent with the findings of Liu et al. [55]. There were some misclassifications and incorrect classifications of minor crops. Therefore, the distribution of minor and other crops with varied spectral characteristics also affected the OA.
5. The accuracy indices for the cities of Sanmenxia and Xinyang both indicated relatively low precision results, which was likely attributable to the terrain and high degree of field fragmentation. Sanmenxia is located on the Loess Plateau, where mountains and hills account for 86.38% and 13.61% of the total area, respectively (i.e., 0% plains; Table 3). In addition, the highly fragmented field plots, complex land types, and difficulty in obtaining ground survey data were the primary drivers of the observed low OA. Xinyang is located in southern Henan Province, where plains account for 86.72%; however, the field boundary structures are irregular and complex (Figure 9c,d), and spectral features among different land types interfere with each other, decreasing the OA. In conclusion, terrain factors and complex field boundary structures had a significant influence on the observed OA.

Comparison Analysis of Resampling Plots
The calculation of accuracy assessments based on field sample data is common practice; however, this approach may be unsuitable due to the complex topographical relationship between points and patches, in addition to the potential subjective bias in the spatial location selection of sample points [76]. Thus, we calculated the number of pixels within each land type, both in the resampling plots and the reclassification results, for winter and summer (Table 4), as well as the spatial distribution of different results for individual samples (Figure 10).  The aim of the analyses of the evaluation indicators was to identify the factors affecting the extraction accuracy of crop structuring under different conditions. We note that the number of classification targets did not significantly contribute to the observed OA. Furthermore, the OA was improved by the cultivated land mask while factors, such as the separation results of different ground objects, minor crops and other crops, interaction of spectral information among different land types, terrain factors, field boundary structures, and the complexity of crop planting methods, all had major impacts on crop identification. Therefore, future should focus on analyzing and discriminating between the seasonality of crop structuring, as well as the seasonal characteristics and differences of ground objects, to further improve the classification accuracy in complex agricultural areas.

Comparison Analysis of Resampling Plots
The calculation of accuracy assessments based on field sample data is common practice; however, this approach may be unsuitable due to the complex topographical relationship between points and patches, in addition to the potential subjective bias in the spatial location selection of sample points [76]. Thus, we calculated the number of pixels within each land type, both in the resampling plots and the reclassification results, for winter and summer (Table 4), as well as the spatial distribution of different results for individual samples (Figure 10).
The reclassification accuracy of staple crops, such as WHE, MAI, PEA, RIC, and RAP, was superior to that of minor and other crops. In winter, WNC had a higher extraction accuracy due to a greater areal extent and distinct spectral characteristics, but it remained difficult to distinguish from URB coverage. With the spectral influence of WHE, the path of wheat fields was incorrectly classified as GAR (Figure 10b), driving the observed increase in GAR. The estimated areas of winter forests (WFL) and URB land were more than that of the resampling plots due to the spectral influence of roadside trees, and the spatial resolution of the images lead to the area of winter WHE in this region largely being classified as WFL. Additionally, greenhouses in the cultivated land, and some unplanted land were mistaken for URB. Except for GAR and RAP, other winter crops (WOTH) had relatively small areas and lacked contrast, such that their accuracy was not evaluated here. These were the main drivers for the observed inaccuracies between GAR and field roads, and WHE and WFL, as well as WNC and URB in winter.  The reclassification accuracy of staple crops, such as WHE, MAI, PEA, RIC, and RAP, was superior to that of minor and other crops. In winter, WNC had a higher extraction accuracy due to a greater areal extent and distinct spectral characteristics, but it remained difficult to distinguish from URB coverage. With the spectral influence of WHE, the path of wheat fields was incorrectly classified as GAR (Figure 10b), driving the observed increase in GAR. The estimated areas of winter forests (WFL) and URB land were more than that of the resampling plots due to the spectral influence of roadside trees, and the spatial resolution of the images lead to the area of winter WHE in this region largely being classified as WFL. Additionally, greenhouses in the cultivated land, and some unplanted land were mistaken for URB. Except for GAR and RAP, other winter crops (WOTH) had relatively small areas and lacked contrast, such that their accuracy was not evaluated here. These were the main drivers for the observed inaccuracies between GAR and field roads, and WHE and WFL, as well as WNC and URB in winter. In summer, crops, such as PEA, MAI, SOY, and RIC, all had a similar spectral influence as WHE. As SOY fields were relatively scattered and small, its boundaries were difficult to extract owing to the spectral interference of other crops. RIC was concentrated in the southern region of the study area, characterized by complex land types, irregular field boundary structures, and strong terrain changes (Figures 9c,d and 11). These factors negatively influenced the classification accuracy of RIC, and were likely responsible for the difficulty in extracting the field boundaries for this region. Meanwhile, the driving factors for the observed overestimation of other crops were the more complex spectral characteristics from complex and fragmented fields, in addition to the variability in the harvest time for vegetables, sorghum, sweet potatoes, and medicinal plants, among others. In addition, an imbalance in the training samples lead to a considerably decreased amount of ground data for some categories, such as REH, SOTH, or summer unplanted areas (SNC), often resulting in the observed poor accuracy for these minority classes. Moreover, ground samples within the mountains were limited due to logistical and safety concerns. According to the visual interpretation of the sample points and reclassification results (Figure 11), SFL was the main landcover type observed in mountainous areas and had a high classification accuracy based on visual interpretation. Through an analysis of the pixel numbers in both sets of results, we found that the causes of the observed inaccuracy in crop classifications were similar to the drivers of the OA; however, the inconsistency between the regular spatial arrangement of the raster data, as well as the directions of the actual planting structures, also limited the spatial overlap between pixel and vector boundaries ( Figure 10), a further driver of the observed classification inaccuracies.
Remote Sens. 2021, 13, x FOR PEER REVIEW 21 of 30 Figure 11. Information of all subregions, including name, total area, categories, sample points, and area proportion. The first subgraph represents the template for displaying the results. The value of each land use type is the percentage of classification-area of the subregional total area. Moreover, (Sum, Spr) represents the area proportion during summer and spring, and the gray-cells indicate that this category is not classified in the subregion.  Figure 11. Information of all subregions, including name, total area, categories, sample points, and area proportion. The first subgraph represents the template for displaying the results. The value of each land use type is the percentage of classification-area of the subregional total area. Moreover, (Sum, Spr) represents the area proportion during summer and spring, and the gray-cells indicate that this category is not classified in the subregion.
Compared with the pixel numbers of the resampling plots, the extraction results of the staple crop areas were underestimated. In addition to variable terrain, field boundaries, planting structures, and mixed pixels, some classification errors were also caused by the difference between ground field boundaries and image arrangements. To improve the classification performance, appropriate resampling techniques or time series data should be the focus of future research.

Reclassification of Multi-Temporal Crop Planting Structure
The TTPSR-based approach was tested in Henan Province, and the multi-season 2020 crop planting structures at 10 m resolution were extracted. The total area represented by all land types was 165,325.18 km 2 , as calculated from the total extraction area of each subregional unit in Figure 11. The proportion of each land type in both seasons is listed in Table 5 and the spatial distribution results are shown in Figure 12i,ii. The proportion of SFL was higher than WFL as deciduous forests dominate the study area, in addition to the incorrect classification of young forest where vegetation fails to cover the ground surface, which can readily be mistaken for WNC from the high ground visibility. In addition, for WHE grown under deciduous forest canopies in the plains, its spectral pattern dominates from April to May, and can readily be classified using medium resolution imagery. Further, summer water areas were slightly higher than in the winter from seasonal changes in the climate and precipitation. Precipitation is mainly concentrated from June to August, flooding rivers and ponds. Compared with the summer reclassification results, the area of WFL and winter WAT decreased by 3.22% and 0.31%, respectively.
When comparing the results to crop areas from 2018 [57] (Table 5), the observed difference in the results was partly due to a lag in response to relevant provincial policies increasing the planting areas of more economically beneficial crops (e.g., peanuts), which has been observed in recent years. Combined with the visual interpretation, ground survey records, and the underestimated reclassification area of staple crops compared to actual planting areas in the resampling plots (see Section 4.1.2), the multi-seasonal reclassification results were largely consistent with the actual spatial distribution of crops in Henan Province, ultimately supporting the processing ability of TTPSR, as well as the extraction ability of complex crops and planting structures.
Based on Figure 12 and Table 5, forests are primarily distributed in the mountainous and hilly areas of northeastern, western, and southern Henan Province while the plains forests are mainly concentrated in southeastern Xuchang (Yanling County) and southern Xinyang (Huangchuan County). WAT coverage is mainly distributed in Nanyang and Xinyang. Urban areas account for 22.95% of the total area. WHE was the most widely planted crop, accounting for 32.22% of the total area, followed by MAI (22.02%). In winter, GAR was mainly planted in eastern Zhengzhou (Zhongmu County) and Kaifeng, and RAP was planted sporadically in southern Zhumadian and Xinyang. There were many unplanted paddies in Xinyang, which are difficult to sow due to standing field water after harvest. Unplanted land was relatively scattered, but primarily concentrated in the transitional mountainous and hilly areas in the cities of Sanmenxia and Luoyang, as well as to the west of Xuchang and the Nanyang Basin, areas mostly used for planting TOB and PEA in the spring. All other crops not analyzed individually were planted in smaller dispersed fields.   In the summer, planting areas for PEA, SOY, and RIC were relatively concentrated: PEA mainly grew in Xinxiang, Zhumadian, and Nanyang Basin; SOY was mainly grown in Xuchang and east of Shangqiu (Yongcheng Country); RIC was mainly distributed in Xinyang and to the south of Zhumadian, except for scattered distributions along the banks of the Yellow River; TOB was consistent with the unplanted land in winter; sesame (SES) was mainly concentrated in Zhumadian and southern Zhoukou; MIL was centered around Sanmenxia and to the west of Luoyang; REH was planted mostly near Jiaozuo; and SOTH, mostly composed of various vegetables, were mainly grown in eastern Zhengzhou, Kaifeng, and Zhoukou.

Discussion
It is well known that GEE as a web-based remote sensing platform with parallel processing capacity provides efficient classification and spatial analysis over image collections [77]. Traditional image-processing software, such as ENVI, requires time-consuming image preprocessing and personal computer memory, which limits the extraction of largescale land use areas. In addition to the ease-of-use of GEE, the cloud computing power and consolidated global remotely sensed data library make the scientific research easier for users, especially nonexperts who may not be proficient in the radiometric calibration and atmospheric correction satellite-image-based and large-scale computing required [78]. Moreover, users are able to share scripts openly, which makes application and analysis reproducible and has more advantages than other frameworks. In Henan Province, there are complex planting structures and various land use types with different seasonality, relatively concentrated planting areas of some crops, such as soybeans, peanuts, and sesame, and high degrees of field fragmentation, particularly in hilly areas. The above conditions indicate that it is necessary to use GEE for temporal land use mapping. Further, the experimental results supported the transferability of our approach for multi-temporal crop planting structure extraction over complex agricultural areas.

Comparison of Classification Results
Sentinel-2 is a widely used satellite remote sensing imaging technology, which emerged in recent years. It provides accessible data for multi-regional crop planting structures [5]. Presently, the agricultural application research of crop planting structure extraction based on Sentinel-2 covers a small number of crop types, mainly focusing on major crops, such as rice, wheat, corn and soybean [49,53]. However, few studies on the simultaneous classification of minor crops, including sorghum, sesame, and peanut, limit the application potential of such data [79]. In this research, crop classification accuracy with cultivated land mask was higher than that without cultivated land mask. These findings were consistent with the conclusions of Veloso et al. [80]. Erinjery et al. [81] assessed the capability of Sentinel-2 to discriminate different vegetation types in the tropical rainforests using MLC and RF with >75% accuracy, which is lower than that in this research (>83%). Yi et al. [82] obtained classification accuracy (>87%) of the multi-growth period for spring crops, including corn, wheat, melon, alfalfa, sunflower, and fennel, with a study area of 41,600 km 2 . Compared with 163,000 km 2 area and more complex crop types in this study, we considered crop type heterogeneity and classified land covers for each subregion. It is well known that using vegetation indices and mask data can improve crop classification accuracy, as different information can eliminate unwanted effects or capture crop characteristics from alternative perspectives [81]. In this research, the RF and TTPSR approaches were employed to classify the across-season crop types, and the classification accuracy and certainty remained stable when cultivated land masks and additional features were used, and crop types became complex.

Dynamics and Rule Synthesis
Further examinations of the extraction results by RF in the GEE suggested that the performances of the cultivated land mask data generated by the dynamic decision tree were Remote Sens. 2021, 13, 2517 23 of 29 significant. Figure 13 shows that the cultivated land as a mask can effectively reduce the misclassification of crops in forest land, although there are still some small misclassification patches. Compared with the high-resolution Google Earth Image, the evaluation results notably revealed the application of rules by fusion of multiple features. In addition, Figure  13 still shows the tiny patches in rural residential areas in different seasons, especially in winter, but this can be solved by the reclassification rules in Section 3.5, as well as the reference result in Figure 7, which also confirmed the validity of the reclassification rule.
tures were used, and crop types became complex.

Dynamics and Rule Synthesis
Further examinations of the extraction results by RF in the GEE suggested that the performances of the cultivated land mask data generated by the dynamic decision tree were significant. Figure 13 shows that the cultivated land as a mask can effectively reduce the misclassification of crops in forest land, although there are still some small misclassification patches. Compared with the high-resolution Google Earth Image, the evaluation results notably revealed the application of rules by fusion of multiple features. In addition, Figure 13 still shows the tiny patches in rural residential areas in different seasons, especially in winter, but this can be solved by the reclassification rules in Section 3.5., as well as the reference result in Figure 7, which also confirmed the validity of the reclassification rule. When extracting the crop planting structures of each subregional unit, the integrity of the cultivated land information directly affected the accuracy of the area; however, for medium resolution remote sensing imagery, as analyzed here, the mixed pixels of ground objects differed greatly under variable terrain conditions. Although different classes were readily extracted by fixed thresholds that were derived in this study, the influence of land types and terrain on the spectral response within each subregional unit was often ignored. The cultivated-land extraction accuracy based on dynamic decision tree is 97.3%, which is higher than that (96.7%) of Xiong and Kang [17]. The dynamic decision tree rules, fused with multi-temporal characteristics, regional terrain, and land types, not only reduced the impact of adjacent forests on cultivated land in the plains, but also maximized cultivated When extracting the crop planting structures of each subregional unit, the integrity of the cultivated land information directly affected the accuracy of the area; however, for medium resolution remote sensing imagery, as analyzed here, the mixed pixels of ground objects differed greatly under variable terrain conditions. Although different classes were readily extracted by fixed thresholds that were derived in this study, the influence of land types and terrain on the spectral response within each subregional unit was often ignored. The cultivated-land extraction accuracy based on dynamic decision tree is 97.3%, which is higher than that (96.7%) of Xiong and Kang [17]. The dynamic decision tree rules, fused with multi-temporal characteristics, regional terrain, and land types, not only reduced the impact of adjacent forests on cultivated land in the plains, but also maximized cultivated land identification in montane and hilly areas, an important component for improving the overall extraction accuracy of crop planting structures.
Furthermore, reclassification rules were constructed to reduce the workload of postprocessing extraction results. The priority of different land types in different seasons was also analyzed, focusing particularly on the spatial structure and relationship of limited urban landcover change over the analyzed period. The 4-bit coding (TSSW) and reclassification rules were established to account for differences in terrain, classification accuracies, and crop phenology. The enhanced reclassification accuracy following the implementation of these rules reflected the importance of integrating multiple features. This approach supported the use of automated processes for complex crop extraction in multi-temporal seasons; however, additional research is necessary to refine the dynamic thresholds and reclassification rules.

Differences in Feature Importance
Previous studies have shown that SWIR and red-edge bands are critical for the classification of water [83], forests [42], and crops [84]. Among the 15 feature bands examined in this study, the importance of SWIR and particularly the red-edge data varied greatly by season. In winter, the greatest importance observed among the red-edge bands was B5 only through its inclusion in the RENDVI index (individual importance was ranked seventh), whereas the highest observed red-edge summer importance was B8A, ranked fifth. Including the red edge bands led to improvements of overall accuracies in all cases, and had the most pronounced effect on crop classes, which was consistent with the conclusions of Patrick et al. [85] and Liu et al. [86]. However, the feature importance of commonly used NDVI and EVI indices were relatively low in this study because both were largely influenced by vegetation coverage and background canopy. Further difficulty derived from crop classification in areas of high vegetation coverage in the late growth period, during which the values of vegetation indices were more similar. The NDVI and NDBI highlighted the spectral difference between crop and non-crop regions, and thus maintained higher importance values. In summary, the NDBI and NDWI were important for defining non-vegetated regions, NDVI and EVI were more valuable for characterizing vegetation coverage and canopy characteristics, and red-edge bands were essential for distinguishing between complex crop types; however, only a fraction of the total vegetation indices were explored in this study, such that additional indices and spatial characteristics should be further examined in future research.

Diversity of Accuracy Evaluation
There were numerous uncertainties in both the verified samples and field verification of the classification results, which may have affected the overall classification accuracy. Therefore, one of the most important decisions in this study was to select the appropriate units of evaluation and types of test data. Most similar studies have utilized the pixel as the evaluation object of the extraction results, but others have suggested that this approach is not suitable to evaluate patch results, such that different methods should be considered according to the specific situation [87].
In this study, 30% of the test data were used to evaluate accuracy. The OA of the classification type was ≥85%, but the OA and kappa coefficients for different classification targets varied greatly within the same season. For the OA and kappa coefficient values of each subregional unit, the maximum observed difference was 7.80% and 9.86% in winter, and 8.98% and 11.57% in summer, respectively. Additionally, when comparing pixel numbers for the reclassification results and resampling plots, GAR, SFL, URB, and SOTH displayed large differences (with SNC as a notable exception), reflecting the classification errors between different land types. We therefore concluded that the use of pixels as the evaluation unit produced a certain degree of thematic error; when applying this approach to the entire >160,000 km 2 study area, it not only ignored the differences in the evaluation indicators between different regions, but the more detailed sources of accuracy error were also difficult to express.
Additionally, the lower the spatial resolution of the analyzed satellite imagery, the more land cover mixture is contained within each pixel, further contributing to the error between the classification values and ground measured data. Therefore, the pixel-based evaluation approach is limited when expressing the true spatial position of land types. Thus, when evaluating the accuracy of remote sensing classifications, one should not only perform assessments at the pixel-level, but should also strive to improve the classification accuracy of mixed pixels, as well as effectively managing the salt and pepper phenomenon. Utilizing a transfer probability matrix and confidence interval to evaluate the accuracy of classification results are more comprehensive.

Limitations and Future Perspectives
This study aimed to map temporal land use in complex agricultural areas by the fusion feature. It is quite common that the classification work will be affected by many constraints, such as spectral characteristics of ground-objects, ground sampling, and climatic conditions. However, during the GEE classification process and post-reclassification it is possible to classify land-use due to the following seasonal and phenological characteristics: (i) The spectral differences of deciduous forests and field roads during winter and summer; (ii) available images during crop growth stage; and (iii) the across-season differences of crop planting patterns. Therefore, in addition to the limitation of climate conditions, these constraints require the study area to be located in temperate area with marked seasons, especially for double-cropping areas. In other areas, such as tropical rainforests covered with evergreen vegetation, the unnecessary priority-feature in post-reclassification can also pose limitations. Although the approach here was effective when reclassifying the seven to 13 classes of multi-temporal crop planting structures across a large and complex region, an improved accuracy can be obtained by addressing the following aspects: tiny spot processing, modeling of influential factors, and time series changes in crop planting structures using transferable deep models.
1. Processing of tiny patches in classification results. Urban cover, such as construction sites, residential areas, and roads within agricultural fields, have been more accurately extracted using a union of multi-temporal extraction results; however, the classification results of minor crops and other crops were insufficient, even after using the MajorityFilter tool.
2. Classification feature and accuracy assessments. In addition to spectral information, Sentinel-2 also provides features, such as texture, shape, and spatial distribution information, which can significantly contribute to distinguishing targets with similar spectral characteristics [82]; thus, significant improvements can be achieved from the inclusion of this data. Additionally, features, such as texture and spatial location, have not been fully considered and utilized when addressing classification issues of crop interplanting; there are still some limitations in the refined classification of regional special crops. Classification accuracy can also be further improved by selecting the optimal time window, including diverse features, and extracting the highly overlapping phenological information of different crops. The factors affecting the classification accuracy that have been analyzed in this study require further research to determine whether the evaluation criteria can be established through mathematical functions by integrating multiple indicators, including the spatial resolution of images, terrain, field fragmentation, plant structure complexity, number of categories, separation result for different ground objects, and image quality.
3. Planting regionalization and spatiotemporal evolution. The reclassification mapping shows that the spatial distribution of PEA, SOY, SES, vegetables, and other crops displayed strong regional aggregation. More attention should therefore be paid to the spatiotemporal evolution of agricultural patterning to provide scientific data for the adjustment of planting structures and food security at the macro-level, including extracting the classification results of a time series of imagery combined with transferable deep models and the construction of clustering rules to form spatial distribution structures for specific crops.

Conclusions
Crop planting structures are closely tied to food security and have significant implications for climate change and sustainable land management. This study provided a feature fusion approach with the premise to clearly distinguish seasonal rhythm and regional characteristics for each subregional unit. The TTPSR approach used here can further improve the OA and mapping quality of crop planting structures by incorporating the terrain, time series characteristics, priority, and seasonality for the complex agricultural region of Henan Province, China. The cultivated land mask data established by dynamic decision tree and reclassification rules built by 4-bit coding (TSSW) not only reduced the spectral influence of other land types on cultivated land, but also effectively improved automatic reclassification of the multi-temporal extraction results, and broken previous limitations associated with a unified number of categories and OA achieved in traditional classification methods. In addition, 15 feature bands were considered to calculate the feature importance for both seasons; B2 and SWIR had the greatest importance for both seasons while the red-edge bands were dominated by B5 in the winter and B8A in the summer. The major factors affected the accuracy of crop extraction include terrain, mixed spectra, field boundary structures, and planting complexity, as well as the characteristic limitation of the raster data itself. It showed significant seasonal differences due to the diverse spectral characteristics caused by complex planting structures. Finally, the total area of the study area was 165,325.18 km 2 and the number of categories in winter and summer was seven to eight and nine to 13, respectively, and OA is higher than 0.85 for each basic subregion in multi-seasons. This approach for multi-temporal crop planting structure extraction can be applied to complex agricultural areas to successfully enhance the reclassification accuracy of crop planting structures in different seasons of each subregional unit.  The sample points and code files presented in this study are shared at https://drive.google.com/ file/d/1cMBlFcldijOM88zRp2mC_jJI37neGa33/view?usp=sharing or available upon request to the corresponding author.