Large-Scale and High-Resolution Crop Mapping in China Using Sentinel-2 Satellite Imagery

Abstract: Large-scale, high-resolution mapping of crop patterns is useful for the assessment of food security and agricultural sustainability but is still limited. This study attempted to establish remote sensing-based crop classification models for specific cropping systems using the decision trees method and monitored the distribution of the major crop species using Sentinel-2 satellites (10 m) in 2017. The results showed that the cropping areas of maize, rice, and soybean on the Northeast China Plain were approximately 12.1, 6.2, and 7.4 million ha, respectively. The cropping areas of winter wheat and summer maize on the North China Plain were 13.4 and 16.9 million ha, respectively. The cropping areas of wheat, rice, and rape on the middle-lower Yangtze River plain were 2.2, 6.4 and 1.3 million ha, respectively. Estimated images agreed well with field survey data (average overall accuracy = 94%) and the national agricultural census data (R² = 0.78). This indicated the applicability of the Sentinel-2 satellite data for large-scale, high-resolution crop mapping in China. We intend to update the crop mapping datasets annually and hope to guide the adjustment and optimization of the national agricultural structure.


Introduction
Ensuring global food security is one of the greatest challenges for scientists around the world [1]. As the country with the largest population in the world, China's food security is not only important to economic development and social stability but also to global food patterns [2,3]. China has enacted a series of measures to ensure food security during recent decades, including food self-sufficiency policies and environmentally friendly agricultural development strategies [4][5][6]. These policies limit the use of water, pesticide, fertilizer, and arable land, which makes a new round of agricultural structure adjustment more difficult [7][8][9]. Updated crop pattern maps can provide significant scientific evidence for estimating agricultural production and food security. Mapping major crop patterns in China has great implications for policymaking in agricultural sustainability [10,11]. Despite the development of global land cover products, specific information on crop types at large scales is still limited [12]. Additional studies should be performed to enrich crop mapping data to assist in the optimization of agricultural distribution.
Earth observation satellites can be effective tools for land use mapping and crop classification due to their ability to quickly and efficiently collect information in the field [13,14]. The mapping regions from 2017 to 2018, and to evaluate the classification accuracy of the crop mapping. The primary research question in this study was as follows: What thematic accuracy can be achieved for large-scale and high-resolution crop pattern classification in different regions with the use of Sentinel-2 images?

Study Area
Considering the first-order agroecological zones in China, which were defined based on the climatic, soil, and landform characteristics [38], we chose the Northeast Plain (NEP), North China Plain (NCP), and middle-lower Yangtze River plain (MYRP) as the study area (Figure 1). This region, which has an area of approximately 2 million km², is the main grain-producing area in China, covers almost 60% of the total farmland in China, and produces more than 60% of the national grain yield. Considering the terrain complexity on the MYRP, we chose the plain area as the study area but monitored the entirety of the NEP and NCP. The cropping system in each zone was totally different. The single-cropping system was widely adopted in the NEP, while the double-cropping system was widely adopted in the NCP. Both the double-cropping system and the triple-cropping system existed in the MYRP.

Figure 1. The spatial distribution of the elevation and survey sites (a), and the cropland and Sentinel-2 images (b) in the main grain-producing area of China, including the Northeast Plain (red frame), North China Plain (blue frame), and middle-lower Yangtze River plain (purple frame). The rectangle in the right-bottom corner represents China's Nansha Islands and the nine-dash line.

Datasets
In this study, we collected a total of 6,266 Sentinel-2 images (almost 2.3 TB of storage) from February to October of 2017 and 2018, which covered the whole study area and the key phenological periods of the target crop species (Figure 1). All images were collected from the open-source data of the European Space Agency (http://scihub.copernicus.eu), and we accepted only images with a cloud cover of no more than 20%. The Sentinel-2 sensor provides 13 spectral bands at spatial resolutions of 10 to 60 m, including three red-edge bands, which can better classify different types of vegetation. All data were acquired under low cloud conditions and were georeferenced to the UTM-WGS84 projection system. In total, we chose more than 100 actual sampling points in the study area (Figure 1a). These sampling points were selected by local agricultural experiment stations and agricultural observation stations. The training dataset was used to determine the general decision trees and to adjust the thresholds in subregions. We also established another dataset for accuracy evaluation. We randomly chose a total of 113 test counties and 34,995 sample points within the study area under the principle of uniform spatial distribution. We also used the 90-m resolution SRTM DEM and SRTM SLOPE data, which were obtained from the Geospatial Data Cloud (http://www.gscloud.cn), as the terrain layer to assist the crop mapping research. We used 91Weitu software to choose 0.5-m resolution Google Maps temporal images from February to October of 2017 and 2018 during crop-specific growing periods (crop jointing stage or mature stage), which were used as the data source for the accuracy test. We chose national statistical data obtained from the National Statistical Bureau of China (NSBC) (http://www.stats.gov.cn) to establish the target crop species and to evaluate the classification accuracy.

Methodology
We chose multiple indicators for the crop mapping and built remote sensing-based models for crop classification in the different cropping systems. The general technical process included the following procedures (Figure 2): selection of the target crop species, data collection and preprocessing, design of the phenology-based indicators, crop area mapping, and accuracy evaluation.

Selecting Target Crop Species
The cropping systems differed completely among the zones. We used national statistical data from 2015 to calculate the ratio of the areas of the major crops that were sown in the different regions (Table 1). We calculated the sown areas of nine major crop types (rice, maize, wheat, soybean, rape, peanut, cotton, potato, and coarse cereals) and selected target crops in each study area. On the Northeast Plain, a single-cropping system is used due to the limited temperature and rainfall resources. Maize, soybean, and paddy rice were the main crops in this area, accounting for 98.11% of the total planting area. On the North China Plain, most farms chose a winter wheat-summer maize double-cropping system. Wheat and maize cover 90.26% of the farmland in this zone. On the middle-lower Yangtze River plain, both double-cropping systems and multiple cropping systems exist because of the sufficient climatic resources. Paddy rice, wheat, and rape were the major crops in this zone, accounting for 93.03% of the total planting area. Thus, we chose maize, soybean, and paddy rice as the target crop species on the NEP; wheat and maize as target crop species on the NCP; and paddy rice, wheat, and rape as target crop species on the MYRP.

Designing Phenology-Based Indicators
been established in the application of agricultural remote sensing [15]. Considering the different cropping systems and the mixed crop density in different regions, we chose a total of seven common indicators and designed a new indicator to better classify the crop patterns by investigating the sampling points. The normalized difference vegetation index (NDVI) is widely used to monitor vegetation coverage [39]. The NDVI time-series (∆NDVI) for the crop growing period can be used in the classification of crops and other land cover types. The normalized difference water index (NDWI) and normalized difference building index (NDBI) can distinguish between water and buildings in remote sensing images [40].
The land surface water index (LSWI) can more accurately classify different crops, especially paddy rice [41]. Other indicators used in our research included the red-edge parameter (REP) and the blue-green (B-G) ratio [42,43]. We also designed a new indicator called the normalized difference rice index (NDRI), which can better identify the paddy rice on the Northeast Plain. The NDRI is a normalized difference of ρ_swir and ρ_red, which represent the shortwave infrared band and the red band, respectively. The indicators used in the different regions are shown in Table 2.
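As an illustration of how the normalized difference indicators above are computed per pixel, the following minimal sketch uses the commonly cited band pairings for NDVI, NDWI, NDBI, and LSWI. These pairings are assumptions, since the text does not state which variants were used, and the reflectance values are hypothetical. The NDRI, REP, and B-G ratio are omitted because their exact formulas are not given here.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def spectral_indices(green, red, nir, swir):
    """Compute four common indices from surface reflectance values.

    The band pairings are the widely used definitions and are assumed here;
    the study may have used different variants.
    """
    return {
        "NDVI": normalized_difference(nir, red),    # vegetation greenness
        "NDWI": normalized_difference(green, nir),  # open water (green/NIR form)
        "NDBI": normalized_difference(swir, nir),   # built-up surfaces
        "LSWI": normalized_difference(nir, swir),   # canopy and surface moisture
    }


# Hypothetical reflectance values for one densely vegetated pixel.
pixel = spectral_indices(green=0.08, red=0.05, nir=0.45, swir=0.20)
for name, value in pixel.items():
    print(f"{name}: {value:.3f}")
```

With these definitions NDBI is simply the negative of LSWI, so in practice the two indices carry the same information with opposite sign.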

Mapping Crop Area
The choice of classification algorithm is usually based upon a series of factors, such as the availability of software, ease of use, performance, and expected overall accuracy. The decision trees (DTs) method has been widely used in land cover classification in recent years because it is computationally fast, makes no statistical assumptions, and can handle data from different measurement scales [15,16]. In addition, software to implement DTs is readily available over the Internet. DT construction involves the recursive partitioning of a set of training data, which is split into increasingly homogeneous subsets on the basis of tests applied to one or more of the feature values. Thus, we chose the decision tree classifier for this task. We chose the univariate decision tree algorithm and built trees in each target study region considering the different cropping systems.
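The idea of splitting training data into increasingly homogeneous subsets on a single feature can be sketched as follows. This is an illustration only: it scores candidate thresholds with the Gini impurity, which is an assumed criterion and not one named in the text, and the sample values and labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure subset)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_threshold(values, labels):
    """Find the univariate split 'value <= threshold' with the lowest
    size-weighted Gini impurity of the two descendant nodes."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no split point between identical values
        threshold = (low + high) / 2
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


# Hypothetical indicator values at labelled training points.
values = [0.62, 0.58, 0.66, 0.41, 0.45, 0.39]
labels = ["maize", "maize", "maize", "other", "other", "other"]
threshold, impurity = best_threshold(values, labels)
print(threshold, impurity)
```

Applying the same search recursively to each descendant node, one indicator at a time, yields a univariate decision tree.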
For each study region, we labelled the training samples, calculated the values of the spectral indicators in specific phenological periods, and compared the differences in spectral characteristics among crop species. We determined the thresholds manually by maximizing the dissimilarity or minimizing the similarity of the descendant nodes, and obtained a preliminary classification decision tree for each of the three agroecological zones. However, according to our investigation, crop phenological features could differ even between provinces, and differences in the acquisition times of the Sentinel-2 images also lead to different threshold values. Building a decision tree based on a single threshold value for each study region could therefore introduce error. Therefore, we set a 10% value difference as the standard of subregion division and divided the Northeast Plain (NEP), North China Plain (NCP), and middle-lower Yangtze River plain (MYRP) into 13, 16, and 10 subregions, respectively, according to the differences in crop phenological features and image times (Figure 3). The goal of subdivision was to better identify the thresholds of the decision trees and improve the overall accuracy. We built different decision trees for the crop classification of different cropping systems and slightly adjusted the threshold values in different subregions because of the varying image times and management practices. Meanwhile, for each pixel of cropland, a time series dataset of the targeted spectral indicator

Accuracy Assessment
In this study, we calculated the whole crop area directly by remote sensing. We evaluated the Sentinel-2-derived crop maps based on ground truth data and agricultural census data. We referenced the 'National specifications of inspection, acceptance and quality assessment of digital surveying and mapping products' (GB-T18316-2001) and selected more than 10% of the total counties as evaluation targets. Data were collected from a total of 113 survey counties, with around 400 sample points in each. We randomly chose these test counties and sample points under the principle of uniform distribution, which covered Northeast, North, and South China. We identified the crop pattern through visual interpretation using 0.5-m resolution Google Maps temporal images. The spatial consistency of the crop pattern maps with the Google Maps image results was assessed for the test counties. We calculated the user's, producer's, and overall accuracies. The kappa index was also computed [44].
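The user's, producer's, and overall accuracies and the kappa index all derive from a confusion matrix of reference versus mapped classes. The sketch below shows the standard calculations; the matrix values are hypothetical and are not results from this study.

```python
def accuracy_metrics(matrix):
    """Compute accuracy measures from a square confusion matrix.

    matrix[i][j] is the number of samples whose reference class is i
    and whose mapped class is j.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]                        # reference totals
    col_sums = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # mapped totals
    diagonal = [matrix[i][i] for i in range(k)]

    overall = sum(diagonal) / total
    producers = [diagonal[i] / row_sums[i] for i in range(k)]  # 1 - omission error
    users = [diagonal[j] / col_sums[j] for j in range(k)]      # 1 - commission error

    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_sums[i] * col_sums[i] for i in range(k)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return overall, producers, users, kappa


# Hypothetical three-class confusion matrix (for example maize, rice, soybean).
matrix = [
    [90, 5, 5],
    [4, 92, 4],
    [6, 3, 91],
]
overall, producers, users, kappa = accuracy_metrics(matrix)
print(f"overall = {overall:.3f}, kappa = {kappa:.3f}")
```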
Due to the limitations of the field survey data and the error of manual visual interpretation, an accuracy assessment was also conducted with the national agricultural census dataset. A dataset of crop areas at the county level was calculated based on the remote sensing-derived crop maps. Estimated sown areas were then compared to those from the national agricultural census datasets. We conducted a regression analysis for each crop and used the r-squared value to evaluate the accuracy of the results. We set the confidence level to 95% and then tested the p-value.
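The regression step can be illustrated with an ordinary least-squares fit of estimated against census areas. This is a minimal sketch with hypothetical county values; it reports the slope, intercept, and r-squared only, and leaves out the significance test, which needs a t-distribution that the Python standard library does not provide.

```python
def linear_fit_r2(x, y):
    """Least-squares fit y = slope * x + intercept, with r-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a single predictor, r-squared equals the squared Pearson correlation.
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2


# Hypothetical county-level sown areas (thousand ha).
census = [10.0, 20.0, 30.0, 40.0]
estimated = [12.0, 18.0, 33.0, 39.0]
slope, intercept, r2 = linear_fit_r2(census, estimated)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r2 = {r2:.3f}")
```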

Classification Model on the Northeast Plain
Considering the limited temperature and precipitation resources, a single-cropping system was widely adopted on the NEP. The crop growing season usually starts in April and ends in October. Similar growth periods for different crops increased the difficulty of the crop classification with the use of a single indicator. We chose the NDVI, REP, and NDRI for this area, which can better distinguish the different crop types. We found that using the value of the NDVI in May and July could easily suppress the interference of buildings, water, and forest. The value of the REP in the maize sampling sites during August was higher than those of other crops, which could be used to identify the maize planting areas. We also found that the NDRI in the rice sampling sites during August was lower than those in the maize and soybean sampling sites, which could be used to identify the paddy rice planting areas (Figure 4). The general decision tree on the NEP is shown in Figure 5.
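The rules described above can be read as a cascade of univariate tests. The sketch below is one possible reading, not the decision tree of Figure 5: the threshold values, the direction of the NDVI tests, the order of the REP and NDRI tests, and the assignment of the remaining cropland to soybean are all assumptions made for illustration.

```python
# All threshold values are hypothetical placeholders, not values from the study.
EXAMPLE_THRESHOLDS = {
    "ndvi_may_max": 0.30,    # assumed: cropland is still sparsely covered in May
    "ndvi_july_min": 0.60,   # assumed: cropland has a dense canopy in July
    "rep_maize_min": 720.0,  # assumed: red-edge position (nm) above which a pixel is maize
    "ndri_rice_max": 0.20,   # assumed: NDRI below which a pixel is paddy rice
}


def classify_nep_pixel(ndvi_may, ndvi_july, rep_august, ndri_august,
                       t=EXAMPLE_THRESHOLDS):
    """Classify one pixel with a cascade of single-indicator tests."""
    # Step 1: mask buildings, water, and forest with the May and July NDVI.
    if ndvi_may > t["ndvi_may_max"] or ndvi_july < t["ndvi_july_min"]:
        return "non-crop"
    # Step 2: maize has the highest August REP among the target crops.
    if rep_august >= t["rep_maize_min"]:
        return "maize"
    # Step 3: paddy rice has a lower August NDRI than maize and soybean.
    if ndri_august <= t["ndri_rice_max"]:
        return "rice"
    # Step 4 (assumed): the remaining cropland is treated as soybean.
    return "soybean"


print(classify_nep_pixel(0.20, 0.80, 725.0, 0.50))
print(classify_nep_pixel(0.20, 0.80, 715.0, 0.10))
```

Keeping the thresholds in a dictionary makes it straightforward to hold one set per subregion while reusing the same tree structure.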

Classification Model for the North China Plain
The NCP is a major winter wheat-summer maize double-cropping region in China. Farmers usually grow winter wheat from October to June and summer maize from June to late September. The value of the NDVI was highest in the crop heading period, which can be used to distinguish the crops from other land cover types, such as buildings and water. To differentiate between the crops and other types of vegetation, such as forest and grass, we used the ΔNDVI between the crop heading and harvest periods, which was significantly higher than that at other times (Figure 6). The general decision tree for the NCP is shown in Figure 7.
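The two tests described here, a high NDVI at heading and a large NDVI drop by harvest, can be sketched as a simple per-pixel rule. The threshold defaults and the sample values are hypothetical and serve only to show the logic.

```python
def is_crop_pixel(ndvi_heading, ndvi_harvest, ndvi_min=0.60, delta_min=0.40):
    """Flag a pixel as cropland from its NDVI at heading and at harvest.

    ndvi_min and delta_min are hypothetical thresholds, not study values.
    """
    delta_ndvi = ndvi_heading - ndvi_harvest  # ΔNDVI between the two periods
    # A high heading NDVI excludes buildings and water; a large drop by
    # harvest excludes forest and grass, which stay green.
    return ndvi_heading >= ndvi_min and delta_ndvi >= delta_min


print(is_crop_pixel(0.80, 0.20))  # crop-like: green at heading, bare at harvest
print(is_crop_pixel(0.80, 0.75))  # forest-like: stays green
print(is_crop_pixel(0.10, 0.10))  # building-like: never green
```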

Classification Model for the Middle-Lower Yangtze River Plain
Both the double-cropping system and the triple-cropping system occurred on the MYRP. Farmers usually plant single rice (from April to October), early rice (from April to July), or late rice (from July to October) in the summer and plant winter wheat or rape in the winter. We chose the Sentinel-2 images for late April or early May that were taken during the rice transplanting period to identify the rice cropping area. We found that the value of the NDVI in April between T1 and T2 could remove the influence of the water, wheat, and forest. The characteristics of the NDBI and NDWI in April were used to distinguish the rice from other types, such as buildings and rape (Figure 8). For the classification of winter wheat and rape, we chose the images from mid- or late March that were taken during the jointing period. The value of the NDVI in March was totally different between the vegetation and the other land cover types (Figure 8). We used this characteristic to extract vegetation. Considering the color variation in the rape flowers and the other crops, we found that the value of the B-G in the rape sampling sites was significantly lower than that in wheat and forest. This feature could be used to identify the rape cropping areas. We also found that the value of the LSWI in the wheat sampling sites was higher than that in forest, which was used to identify the wheat cropping areas. The general decision tree for the MYRP is shown in Figure 9.
Agriculture 2020, 10, 433

Spatial Distribution of the Major Crops in the Main Grain-Producing Regions
We monitored the sown areas of the major crops in the three grain-producing areas from 2017 to 2018. The results showed that the crop planting area on the NEP was mostly concentrated on the Songnen Plain, Liaodong Plain, and Sanjiang Plain. We estimated that the total areas sown with maize, rice, and soybeans on the NEP were 12.1, 6.2, and 7.4 million ha, respectively. The maize cropping area was mainly distributed in the central-western portion of the NEP, while a few areas were located on the Sanjiang Plain. The rice cropping areas were mainly concentrated on the Sanjiang Plain in Heilongjiang Province, while a few areas were located on the Liaodong Plain. The soybean cropping area was primarily distributed in Jilin and Heilongjiang provinces in the northern-central region of the NEP (Figure 10a-c). The crop planting area on the NCP covered almost the whole region except the mountains and urban areas. We estimated that the total areas sown with winter wheat and summer maize on the NCP were 13.4 and 16.9 million ha, respectively. The winter wheat cropping area was mainly distributed in the piedmont region of the Taihang Mountains, the Luxi Plain, and the Huang-Huai Plain. Few areas of winter wheat were planted in the Fenhe Valley in the western part of the NCP. The maize cropping area on the NCP covered almost the whole area, including the Jiaodong Peninsula and some mountain areas (Figure 10g-h).
Rice is a major crop on the MYRP, including early rice, single rice, and late rice. We estimated that the total areas sown with rice, winter wheat, and winter rape on the MYRP were 6.4, 2.2, and 1.3 million ha, respectively. The rice cropping areas were mostly distributed in the northern part of the MYRP and the plains areas in the south. The winter wheat planting area was mostly concentrated in the northern part of the MYRP, while the rape planting area was distributed in the western and southern areas of the MYRP (Figure 10d-f).

Spatial Agreement with the Google Maps Image Results
We evaluated the crop map estimated from the Sentinel-2 imagery based on the 0.5-m resolution Google Maps image results for the test counties. The results showed that the Sentinel-estimated crop map was consistent with the interpreted images from Google Maps, especially on the NCP. The overall accuracy on the NEP was 0.93 with a kappa index of 0.90 (Table 3). The identification accuracy for maize was relatively high in comparison with those of the other crops. The highest accuracy occurred on the NCP, with an overall accuracy of 0.96 and a kappa index of 0.92, because of the low interference from the other crop types, especially in the winter wheat cropping areas (Table 4). We achieved a relatively high accuracy on the MYRP considering the patchy and fragmented cropland in this area. The overall accuracy on the MYRP was 0.93, with a kappa index of 0.90 (Table 5). The results demonstrate the advantage of high-resolution images in crop mapping research in areas with complex land features.

Evaluation with Agricultural Statistical Data
Due to the limitations of the field survey data, we also compared the remote sensing-estimated crop areas with those from the national agricultural census data at the county level. These two datasets agreed reasonably well, with an overall r-squared value of 0.78. All models were significant at the 99% confidence level. The coefficients of determination between the Sentinel-estimated dataset and the agricultural census data at the county level on the NEP, NCP, and MYRP were 0.83, 0.81, and 0.71, respectively (Figure 11). Strong agreement was obtained for maize and rice on the NEP and for wheat on the NCP, with r-squared values higher than 0.85. The accuracy of the estimated crop area on the MYRP was relatively low because of interference from clouds and mountainous terrain. Fragmented cropland on the MYRP also increased the difficulty of crop mapping, especially in the areas sown with rape.
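The county-level comparison can be illustrated with a short sketch. It assumes that the reported r-squared is the coefficient of determination of a simple linear fit between census and estimated areas, which equals the squared Pearson correlation; the paper does not spell out the exact regression setup. The county areas below are hypothetical values in thousand ha.

```python
def r_squared(estimated, census):
    """Squared Pearson correlation, i.e. R^2 of a simple linear fit."""
    n = len(estimated)
    mean_x = sum(census) / n
    mean_y = sum(estimated) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(census, estimated))
    sxx = sum((x - mean_x) ** 2 for x in census)
    syy = sum((y - mean_y) ** 2 for y in estimated)
    return sxy ** 2 / (sxx * syy)


# Hypothetical county-level crop areas in thousand ha (NOT from the paper).
census_area = [10.0, 20.0, 30.0, 40.0, 50.0]
estimated_area = [12.0, 18.0, 33.0, 37.0, 52.0]

print(f"R^2 = {r_squared(estimated_area, census_area):.3f}")
```

Because each county contributes one point, local over- and underestimates are not averaged out as they would be in a provincial comparison, so county-level r-squared values tend to be lower.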

Discussion
This study contributed to large-scale crop mapping in China using high-resolution satellite images and multiple indicators. Previous research has mapped the spatiotemporal dynamics of maize and cropping intensity trends in China at the national scale by using MODIS (500 m) images [12,45]. Moderate-resolution images cover large areas and make large-scale monitoring possible. However, limited resolution and limited indicator systems can cause errors in crop distribution mapping [30]. Multiple indicator systems have been widely used in local-scale studies [46]. With the development of quantitative remote sensing techniques, highly accurate results can be achieved for small areas. However, few studies have focused on the extensive use of multiple indicator systems over large areas. Building on previous studies, we developed a practical approach to crop mapping at the national scale that combines multiple indicator systems with high-resolution satellite images.
High-resolution satellite data have played an important role in remote sensing-based agricultural monitoring, but the cost of using high-resolution imagery at large scales is often prohibitive. Some recent studies have monitored crop distributions in China with data from the GF-1 satellite, which was launched by China and offers high-resolution imagery (i.e., 2 m/8 m/16 m) and a short revisit cycle (i.e., 4 days). However, its limited number of wave bands (four bands) increases the difficulty of crop classification, especially in ferruginous plots [28]. This study is also among the first to show that the Sentinel-2 satellites are useful for large-scale, high-resolution crop mapping in China. We achieved relatively high accuracy on the middle-lower Yangtze River plain, which has complex land cover features and fragmented cropland. The results demonstrate that the Sentinel-2 satellites, with their high resolution, short revisit period, multiple wave bands, and low cost, can readily provide images for specific times and support multiple indicator systems, and could therefore be widely used in large-scale land cover classification.
We achieved relatively high classification accuracy compared with the interpreted Google Maps images. Although the temporal difference between the Google Maps and Sentinel-2 images may introduce some errors, the cropping systems in the study area do not change frequently, so we did not take this kind of error into consideration. Visual interpretation of Google Maps images can also introduce mistakes because standards differ among interpreters, especially in blurry areas. To reduce this human error, we marked the challenging sample points and made the final decision through team discussion. However, the accuracy assessment with the county-level national census data was less satisfactory. Previous studies of large-scale crop mapping have usually conducted accuracy assessments with census data at the provincial level [45]. Provincial data can mask errors in local areas and thus yield relatively high accuracy levels. We conducted the accuracy assessment at the county level, which increases the dispersion of the data to a certain extent. In addition, national census data at the county level are subject to time lags and are easily affected by human factors, which could cause errors in the accuracy assessment. Cloud cover is another source of uncertainty in monitoring studies. A higher cloud cover ratio implies higher uncertainty in the estimated results. The lack of images for specific times during the crop growing period increases the difficulty of crop classification. For example, entire crop cycles might be missed due to persistent cloud cover [47]. In this study, the influence of cloud cover was limited because of the short revisit period of the Sentinel-2 satellites. We could usually obtain more than three usable images per month on the NEP and NCP, and at least one usable image per month on the MYRP.
Nevertheless, although we applied a 20% cloud cover threshold when downloading the images, missing images in some areas invalidated our decision model, especially during the critical crop growing period. We calculated the total cloud-covered area in our study and found that clouds during the specific growing period affected less than 5% of the whole area, which could still introduce some errors into the results.
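The scene selection described above can be sketched as follows. The 20% threshold comes from the text; the scene list, the tuple layout, the function names, and the choice to treat a scene at exactly 20% as usable are assumptions made for illustration and do not reflect the authors' actual download workflow.

```python
from collections import Counter
from datetime import date

CLOUD_THRESHOLD = 20.0  # percent; the 20% standard mentioned in the text


def usable_scenes_per_month(scenes, threshold=CLOUD_THRESHOLD):
    """Count scenes per (year, month) whose cloud cover is within the threshold."""
    counts = Counter()
    for acquired, cloud_pct in scenes:
        if cloud_pct <= threshold:  # assumption: the threshold is inclusive
            counts[(acquired.year, acquired.month)] += 1
    return counts


def months_without_coverage(scenes, months, threshold=CLOUD_THRESHOLD):
    """Return the months in `months` that have no usable scene at all."""
    counts = usable_scenes_per_month(scenes, threshold)
    return [month for month in months if counts[month] == 0]


# Hypothetical acquisitions for one tile: (acquisition date, cloud cover in %).
scenes = [
    (date(2017, 6, 3), 5.0),
    (date(2017, 6, 13), 45.0),
    (date(2017, 6, 23), 12.0),
    (date(2017, 7, 3), 80.0),
    (date(2017, 7, 13), 60.0),
    (date(2017, 8, 2), 18.0),
]
growing_season = [(2017, 6), (2017, 7), (2017, 8)]

counts = usable_scenes_per_month(scenes)
gaps = months_without_coverage(scenes, growing_season)
print(dict(counts))
print("months with no usable image:", gaps)
```

A month that appears in the gap list is one in which a time-specific classification rule could not be evaluated for that tile, which is the situation the text describes as invalidating the decision model.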
Large-scale, high-resolution remote sensing monitoring is important for the adjustment and optimization of agricultural production. Understanding the spatial and temporal variations of crop patterns is important for policymaking, the assessment of food security, and the development of sustainable agricultural practices [8,48]. We found that the sown area of winter wheat on the NCP was much lower than that of summer maize, especially in Hebei Province. It was also much lower than the winter wheat area reported by previous crop mapping research for 2009 [49]. These results suggest that the crop rotation and fallow policy, adopted in response to groundwater overexploitation on the NCP, has been implemented [50]. Building remote sensing-based models for multiple cropping systems in China is also valuable for nationwide crop mapping research [49]. Creating and periodically updating crop distribution datasets is useful for estimating crop yield, analyzing changes in cropping intensity, and adapting agriculture to climate change [45].
We estimated the main crop distribution in the major grain-producing areas of China and achieved relatively high accuracy in the plain areas. However, it is difficult to ensure accuracy for crop monitoring in mountain areas. We used 90-m resolution SRTM DEM and SRTM SLOPE data to remove the mountain areas, assuming that few crops were distributed in the mountains. In practice, some crops are planted in mountain areas, with relatively low yields due to the limitations of the landscape, especially on the middle-lower Yangtze River plain. In the future, we could focus on the classification of crops in mountain areas and build a suitable model to improve the accuracy of crop mapping there. We could also refine the crop types (e.g., spring maize/summer maize) and build classification methods for other field crops, such as cotton and potato. In addition, the decision trees method is a simple but efficient method in crop classification research [15,16]. There are also alternative methods based on machine learning or deep learning in this field. For example, the random forest algorithm has been widely used in recent years. It is an ensemble learning algorithm that consists of multiple decision trees or classification and regression trees and can, to a certain extent, reduce the error caused by subjective human judgment [34,36]. In the future, we will conduct comparative studies of the different classification methods and optimize our algorithm for different cropping systems.
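The mountain masking step mentioned above can be sketched as a simple per-pixel rule. The elevation and slope thresholds below are hypothetical, since this section does not state the values used, and the sketch assumes that the 90-m SRTM layers have already been resampled to the same grid as the crop map.

```python
MAX_ELEVATION_M = 500.0  # hypothetical threshold, NOT from the paper
MAX_SLOPE_DEG = 15.0     # hypothetical threshold, NOT from the paper


def mask_mountains(crop_map, elevation, slope, nodata=0):
    """Set crop labels to `nodata` where a pixel is too high or too steep.

    All three inputs are equally sized 2-D lists (rows of pixels).
    """
    masked = []
    for crop_row, elev_row, slope_row in zip(crop_map, elevation, slope):
        masked.append([
            crop if elev <= MAX_ELEVATION_M and slp <= MAX_SLOPE_DEG else nodata
            for crop, elev, slp in zip(crop_row, elev_row, slope_row)
        ])
    return masked


# Tiny hypothetical 2 x 2 example; crop codes are arbitrary (1, 2, 3).
crop_map = [[1, 2], [3, 1]]
elevation = [[50.0, 800.0], [120.0, 300.0]]  # metres
slope = [[2.0, 5.0], [25.0, 10.0]]           # degrees

print(mask_mountains(crop_map, elevation, slope))
```

A hard mask of this kind is simple and fast, but it discards any crops that are actually grown on high or steep land, which is the limitation acknowledged in the paragraph above.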
We built an open-access online dataset (www.cross.ac.cn) to share the crop monitoring results at the county level and intend to update it annually. We hope our results can guide the adjustment and optimization of the national agricultural structure in the future.

Conclusions
This study built three remote sensing-based models for typical cropping systems using multiple indicators and monitored the distribution of the major crops in the main grain-producing areas of China using Sentinel-2 satellite data. We found that the areas sown with maize, rice, and soybean on the NEP were 12.1, 6.2, and 7.4 million ha, respectively. The maize and wheat planting areas on the NCP were 16.9 and 13.4 million ha, respectively. The total sown areas of rice, wheat, and rape on the MYRP were 6.4, 2.2, and 1.3 million ha, respectively. The estimated images agreed well with field survey data (average overall accuracy = 94%) and the national agricultural census data (R² = 0.78). This demonstrates the applicability of the Sentinel-2 satellite data for large-scale, high-resolution crop mapping in China. We intend to update the crop mapping datasets annually and hope to guide the adjustment and optimization of the national agricultural structure.