1. Introduction
The Northeast Black Soil Region is China’s main commercial grain base [
1,
2,
3]. Timely and accurate access to crop planting information in the early stages of the Black Soil Region is of great significance to improving agricultural management and productivity and ensuring national food security [
4,
5].
Remote sensing has proven to be a practical and efficient way to obtain information for crop mapping [
6]. Depending on the phenological period monitored, crop identification can be divided into pre-season, mid-season, and post-season identification. Early identification has received relatively little research attention because few images and samples are available and the characteristic information of crops is difficult to capture at the early growth stage [
7]. With sufficient image and sample data, satisfactory identification accuracy has been achieved in mid-season and post-season, but the mapping results become available relatively late and cannot meet the demand of relevant departments for timely crop acreage information [
8,
9]. In contrast, early crop identification obtains crop planting information sooner and is therefore more practical: it can provide a timely reference for planting management and food production security, help target relevant government policies, and support healthy and sustainable agricultural development [
10,
11].
Finding distinguishable features of crops in the early growing season is essential to improving the accuracy of early crop identification [
12]. Currently, commonly used remote sensing identification features include spectral, spatial, temporal, and polarization features, as well as auxiliary features such as Digital Elevation Models (DEMs) [
13,
14,
15]. Among them, temporal characteristics can reflect crop growth and development during the growth stage. Many researchers have found that combining temporal characteristics with other features is conducive to improving crop identification accuracy at the early growth stage [
16,
17,
18]. Crop identification methods based on single-temporal remote sensing imagery distinguish ground objects by finding crop features that differ significantly during a “critical period”. For example, rice in the irrigation period, rapeseed in the flowering period, and cotton in the boll-opening period (when it appears white) all have more distinctive characteristics than other crops in these key phenological periods. Such crops can be identified early from a single image date by finding sensitive bands and constructing indices that enhance these characteristics [
19,
20,
21]. However, in areas with complex crop planting structures, the problems of the same crop showing different spectra and different crops showing the same spectrum are more serious, which may lead to low recognition accuracy.
Multi-temporal remote sensing data can reduce the spectral confusion between crops by capturing the differences in their phenological periods, which effectively improves crop discriminability and has been widely applied to remote sensing crop recognition [
22,
23]. Some researchers have collected images of crops over the entire growth period from sowing to pre-harvest, then used multi-source remote sensing data fusion and multiple remote sensing indices to enrich crop information and identify crops earlier. Wei et al. [
24] achieved early identification of corn, rice, and soybeans by collecting images of the entire growth period and integrating them with multiple time-phased information via an incremental design to make up for the lack of a single phase. In addition, some researchers have used different curve shapes in the time-series images of different crops to identify early-stage crops by mining obvious differences at certain periods or time points. They then used relevant decision knowledge or similarity matching to set appropriate thresholds for these differences. For example, Ashourloo et al. [
25] found that the summation of differences between the red and near-infrared reflectance in a time series of Landsat images of alfalfa was significant. Also, the average values of the near-infrared and red bands during the growing season were remarkably higher for alfalfa than for other crops. Based on these findings, a new vegetation index was constructed to achieve efficient automatic mapping of alfalfa. Based on the growth phenological characteristics of different crops, a time-weighted dynamic time warping (TWDTW) similarity matching algorithm was used to calculate the similarity distance between each image element to be classified and the crop standard sequence to achieve early identification of winter wheat [
26]. Zhang et al. [
27] achieved automatic early-season mapping of winter wheat by applying thresholds to phenological indicators, such as the NDVI integral, the NDVI maximum, and the relative rate of change, together with a series of discriminative classification rules for winter wheat. However, these studies integrated image data from the entire growth period, so mapping was not completed until the end of the season. Different crops have unique growth and development rules. Over time, growth rates and spectral characteristics may differ [
28,
29], which brings a good opportunity for the early identification of crops. Qiu et al. [
30] drew on knowledge of crop growth and development to design an index of vegetation index variation between the early and late growth periods of winter wheat and established a winter wheat extraction model. The model efficiently and quickly realized multi-year continuous mapping of winter wheat in ten provinces of North China, the main production area of China, with an overall recognition accuracy of up to 92%. However, it remains unclear which periods show greater growth differences and which features contribute more during these key periods. Therefore, our research objective is to mine differential crop growth information from the early growing season, construct multi-temporal indicators that effectively highlight the differences in crop growth processes, and explore their impact on early crop identification in typical black soil areas in Northeast China. Our contributions are:
(1) Design indicators that can reflect crop growth characteristics and construct a crop growth characteristics dataset composed of a sequence of spectral bands and their derived indices for the early growing season.
(2) Explore the potential ability to use different growth datasets to differentiate crops at early growth stages and analyze their spatial and temporal variations.
(3) Achieve remote sensing recognition of crops in different periods based on different classifier models, and summarize the earliest identifiable date and identifiable accuracy of different crops.
4. Results
4.1. Trend Analysis of Crop Identification Overall Accuracy and Kappa Coefficient
In this study, the Cart decision tree, GBDT, RF and SVM classifiers were applied to remote sensing recognition of crops in each period. This included 1 recognition result on 8 May, 7 recognition results on 13 May, 63 recognition results on 28 May, 1023 recognition results on 12 June, and 32,767 recognition results on 7 July. The OA and kappa coefficients were used to evaluate the overall crop recognition performance of each classifier in each period. The overall accuracy and kappa coefficient trends in different periods were analyzed as follows.
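The number of recognition results per period is consistent with evaluating every non-empty combination of pairwise date differences among the images available up to that date. The short sketch below checks this reading. The acquisition dates are taken from the text (with 28 April as the first image); the function name and the assumption that each pair of dates yields exactly one difference phase are ours, not confirmed by the paper.

```python
from itertools import combinations

# Acquisition dates mentioned in the text, in chronological order.
DATES = ["28 Apr", "8 May", "13 May", "28 May", "12 Jun", "7 Jul"]

def count_combinations(n_dates: int) -> int:
    """Number of non-empty subsets of the pairwise difference phases
    that can be formed from n_dates images (assumed reading)."""
    n_pairs = len(list(combinations(range(n_dates), 2)))  # n * (n - 1) / 2
    return 2 ** n_pairs - 1  # every subset of phases except the empty one

# Count for each period, keyed by the date of the latest available image.
counts = {DATES[k - 1]: count_combinations(k) for k in range(2, len(DATES) + 1)}
for date, n in counts.items():
    print(f"{date}: {n} combinations")
```

Under these assumptions, six images give 15 difference phases, which also matches the letter labels (A, B, ...) used for the phases later in the text.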
Figure 5 shows the maximum overall recognition accuracy of the four classifiers in different periods. It can be seen that the most significant increase occurred from 12 June to 7 July, when the recognition accuracy increased from 75% to 97%, followed by 8 May to 13 May (60% to 74%). From 13 May to 28 May and 28 May to 12 June, the recognition accuracy changes were relatively low, with ranges of 67–77% and 71–81%, respectively. The OA of the four classifiers tended to increase as more feature information was added over time. The GBDT and RF performed better than the SVM and Cart classifiers throughout the period, with the maximum identification accuracies achieved on 12 June (about 81%) and 7 July (about 97%).
Figure 6 shows the maximum kappa coefficients of the four classifiers in different periods. The upward trend is similar to that of OA during the early stage. The most significant increase occurred from 12 June to 7 July, with a kappa coefficient range of 62–95%, followed by 8 May to 13 May (37–68%). From 13 May to 28 May and 28 May to 12 June, the changes were relatively low (63–68% and 62–71%, respectively). The GBDT and RF performed better than the SVM and Cart classifiers throughout the period, with the kappa coefficients peaking at around 71% on 12 June and about 95% on 7 July.
Overall, GBDT and RF outperformed the other classifiers in the early stage. The performance of SVM was comparable to that of Cart in the early stage, but its recognition ability gradually increased in the later stage, leaving Cart worse than the other classifiers.
Figure 7 shows the best recognition results for each period.
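For readers less familiar with the two metrics, the following minimal sketch computes OA and the kappa coefficient from a confusion matrix using their standard definitions. The matrix values are made up for illustration and are not results from this study; the function name is hypothetical.

```python
def overall_accuracy_and_kappa(cm):
    """Return (OA, kappa) for a square confusion matrix.

    cm[i][j] = number of samples of reference class i predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(len(cm))) / total  # OA
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(col) for col in zip(*cm)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Made-up 3-class matrix (rice, corn, soybean), for illustration only.
cm = [[50, 2, 1],
      [3, 40, 7],
      [2, 8, 37]]
oa, kappa = overall_accuracy_and_kappa(cm)
print(f"OA = {oa:.3f}, kappa = {kappa:.3f}")
```

Because kappa discounts chance agreement, it is lower than OA for the same matrix, which is in line with the kappa values reported above being lower than the corresponding OA values.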
4.2. Analysis of Trends in Producer Accuracy and User Accuracy
In this study, the PA and UA were used to evaluate the recognition of each crop. The time at which the PA and UA reached above 85% was considered the earliest that each crop was identifiable. The PA and UA recognition analysis for each crop is as follows.
(1) Analysis of trends in producer accuracy and user accuracy for rice identification
Table 7 shows the maximum PA and UA achieved by each classifier at different periods for rice identification. On 8 May, the maximum PA was 87.9%, but the corresponding UA (80.1%) was lower, so rice could not be effectively identified at that time. The RF classifier performed best on 13 May, with its PA and UA being about 95%, followed by the GBDT (about 93%). The Cart and SVM did not perform as well as these two, but both could still identify rice effectively, with PA and UA above 90%.
(2) Analysis of trends in producer accuracy and user accuracy for corn identification
Table 8 shows the maximum PA and UA that each classifier could achieve at different times for corn identification. Both PA and UA increased with time. On 12 June, the maximum PA and UA, achieved using GBDT and RF, were about 81%. Corn first became identifiable on 7 July, using any of the four classifiers. Their performance was comparable, with the maximum PA and UA both being above 97%.
(3) Analysis of trends in producer accuracy and user accuracy for soybeans identification
Table 9 shows the maximum PA and UA achieved by each classifier for soybeans at different times. As with corn identification, the PA and UA of soybean identification increased over time. The maximum PA and UA on 12 June were also around 81%. On 7 July, all four classifiers could effectively identify soybeans, with GBDT and RF having maximum PA and UA of 97%. The SVM and Cart were somewhat lower, at 96% and 94%, respectively.
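The rule used in this section (a crop is considered identifiable at the first date on which both PA and UA exceed 85%) can be sketched as follows. The function names are hypothetical, and the rice values are the ones quoted in the text above, with the 13 May entry using the approximate RF figure.

```python
def producer_user_accuracy(cm, k):
    """PA and UA of class k; cm[i][j] = reference class i predicted as class j."""
    correct = cm[k][k]
    pa = correct / sum(cm[k])                 # share of reference samples recovered
    ua = correct / sum(row[k] for row in cm)  # share of predictions that are correct
    return pa, ua

def earliest_identifiable(series, threshold=0.85):
    """Return the first date whose PA and UA both exceed the threshold, else None.

    series: list of (date, pa, ua) tuples in chronological order.
    """
    for date, pa, ua in series:
        if pa > threshold and ua > threshold:
            return date
    return None

# Rice values quoted in the text (13 May is the approximate RF figure).
rice = [("8 May", 0.879, 0.801), ("13 May", 0.95, 0.95)]
print(earliest_identifiable(rice))
```

On 8 May the PA alone passes the threshold but the UA does not, so the rule returns 13 May for rice.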
4.3. Different Remote Sensing Indices Temporal Contribution and Variation Characteristics
Crops have unique growth and seasonal phase-change characteristics. Multi-temporal spectral characteristics can provide effective crop identification information and reveal their changes over time [
67]. Usually, the changes in these characteristics are stable and can be used to distinguish between crops over a period of time. This study calculated the importance of all multi-temporal difference features for each period to reflect the relative importance of the features in different periods.
Table A1, Table A2, Table A3, Table A4, Table A5, Table A6, Table A7, Table A8, Table A9, Table A10, Table A11 and Table A12 summarize the top 10 optimal feature sets of each best combination type of the three crops in each period. Features with a higher proportion and ranking are better for crop identification [
54]. The key spectral and temporal characteristics of the three crops are analyzed as follows.
(1) Early identifying characteristics and temporal changes characteristics of rice
Figure 8 shows the proportions of feature types in the top 10 optimal rice feature sets of all best combination types in each period. Rice is a typical paddy crop, and in the early stage paddy fields appear as a mixture of water and rice plants. The LSWI, SR, B11, and NDTI can effectively reflect the water changes in the canopy and canopy background during the growth and development of rice in this period, so that rice can be effectively identified.
Table 10 shows the top 10 optimal feature sets of the best combination types for rice in each period. On 8 May, the single available difference phase (A) achieved 81% recognition accuracy. On 13 May, the two-phase combination BF achieved the maximum classification accuracy (94.6%). In addition, the maximum recognition accuracy of each combination in that period met the requirement for early recognition of rice (≥91%).
Figure 9 shows the frequency of key temporal changes in the top 10 feature sets of all the best combination types for rice on 13 May. The frequency of F was significantly greater than those of the other difference phases, indicating that the difference in remote sensing information between 13 May and 28 April is key information for early rice identification.
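As an illustration of what such a difference feature might look like, the sketch below computes the LSWI at two dates and takes their difference. The LSWI formula, (NIR - SWIR) / (NIR + SWIR), is the standard definition; the choice of Sentinel-2 bands, the sign convention (later minus earlier), the function names, and the reflectance values are assumptions made for this example and are not taken from the paper.

```python
def lswi(nir: float, swir: float) -> float:
    """Land Surface Water Index from NIR and short-wave infrared reflectance."""
    return (nir - swir) / (nir + swir)

def difference_feature(later: float, earlier: float) -> float:
    """Change of an index between two acquisition dates (later minus earlier)."""
    return later - earlier

# Made-up surface reflectances for a single pixel, for illustration only.
apr28 = {"nir": 0.22, "swir": 0.25}  # bare, fairly dry soil before flooding
may13 = {"nir": 0.15, "swir": 0.08}  # flooded field: SWIR drops sharply

# Hypothetical "phase F" feature: LSWI on 13 May minus LSWI on 28 April.
f_lswi = difference_feature(lswi(**may13), lswi(**apr28))
print(f"LSWI difference: {f_lswi:.3f}")
```

In this toy case the flooded field produces a large positive change in LSWI, which is the kind of signal that would separate rice from dryland crops at this stage.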
(2) Early identifying characteristics and temporal changes characteristics of corn
Figure 10 shows the proportion of different feature types in the top 10 optimal corn feature sets of all best combination types at each period. By early June, corn had undergone the sowing and emergence stages, and the images acquired at that time showed low vegetation cover. The NDTI, SR, LSWI, and B11 accounted for a relatively large proportion during this period. As the crop grows, it gradually enters its peak growth period. In early July, when corn is at the seventh leaf stage, the vegetation cover is greater than in the previous months. Features such as short-wave infrared bands, some vegetation red-edge bands, and indices of vegetation cover (B11, NDRE3, LSWI, VIgreen, RESI, NDRE2) become more prominent and can effectively capture corn growth information.
Table 11 shows the top 10 optimal feature sets of the best combination types for corn identification in each period. On 8 May, a corn recognition accuracy of 67.1% was achieved with the single difference phase A. For the period to 13 May, the maximum recognition accuracy (72%) was achieved using the two-phase combination AB. For the period to 28 May, the maximum recognition accuracy (76.8%) was achieved using the three-phase combination ABF. On 12 June, the maximum recognition accuracy (82%) was achieved using the five-phase combination CDHIJ. On 7 July, the maximum recognition accuracy (98.2%) was achieved using the eight- and nine-phase combinations CEFGHIKM and ACDEFGHMN. In addition, the best combination of each of the 15 combination types for the period achieved satisfactory early identification accuracy for corn (≥94%).
Figure 11 shows the frequency of key temporal changes in the top 10 feature sets of all the best combination types for corn in this period. E, M, and L had the highest frequencies, indicating that the differences between 7 July and 12 June, 13 May, and 28 May were pronounced enough to identify corn effectively. In addition, the growth difference between 28 May and 28 April (H) was also pronounced.
(3) Early identifying characteristics and temporal changes characteristics of soybeans
Figure 12 shows the proportion of different feature types in the top 10 optimal soybean feature sets of all best combination types at each period. Corn and soybean have similar phenology in their early growth stage, and their spectral behavior is similar. By early June, the soybeans had gone through the seeding, emergence, and three-leaf stages and showed low vegetation cover. The NDTI, SR, LSWI, and B11 accounted for a relatively large proportion in this period. In early July, soybean enters the flowering stage and exhibits lavender-colored petals. During this period, its vegetative and reproductive development is concurrent and vigorous. The B11, NDRE3, LSWI, VIgreen, RESI, EVI, B12, and NDRE2 play a key role in the identification of soybeans in this period, which is slightly different from corn.
Table 12 shows the top 10 optimal feature sets of the best combination types for soybean identification in each period. The best temporal combinations for soybean on 8 May, 12 June, and 7 July were the same as those for corn. The maximum identification accuracies achieved using A, CDHIJ, and BEFHIM were 63%, 81.8%, and 98.5%, respectively. On 13 May, the maximum soybean identification accuracy (70%) was achieved using the three-phase combination ABF, while on 28 May, the maximum identification accuracy (76.2%) was achieved using the three-phase combination AFG. On 7 July, the maximum identification accuracy for each combination was above 94%, meeting the requirement for early soybean identification.
Figure 13 shows the frequency of key temporal changes in the top 10 feature sets of all the best combination types for soybeans in this period. The frequency of E is the highest, followed by M, N, and H. Adding these difference phases can effectively capture soybean growth information and enable soybeans to be identified effectively.
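The frequency analysis behind Figures 9, 11, and 13 amounts to counting how often each difference phase occurs across the top-ranked feature sets. A minimal sketch is given below; the "INDEX_PHASE" naming of features and the example feature sets are invented for illustration and do not reproduce the paper's data.

```python
from collections import Counter

def phase_frequency(feature_sets):
    """Count how often each difference phase appears across feature sets.

    Each feature is assumed to be named "INDEX_PHASE", e.g. "LSWI_E"
    (a hypothetical convention chosen for this sketch).
    """
    counts = Counter()
    for features in feature_sets:
        for name in features:
            _, phase = name.rsplit("_", 1)  # split off the phase letter
            counts[phase] += 1
    return counts

# Invented feature sets, for illustration only.
top_sets = [
    ["LSWI_E", "B11_E", "NDRE3_M"],
    ["LSWI_E", "RESI_H", "B11_M"],
]
freq = phase_frequency(top_sets)
print(freq.most_common())
```

The same counting applied to the index part of each name would give the feature-type proportions shown in Figures 8, 10, and 12.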
5. Discussion
(1) Potential of using crop growth difference information for early identification
Currently, remote sensing recognition of crops is mainly based on all available images acquired throughout the growth period [
68,
69,
70]. However, the limited imagery available in the early growth stages poses a great challenge to early crop identification [
71]. This paper investigated the impact of using differential temporal information for early crop identification. Specifically, a dataset of early-stage growth characteristics was constructed from the differences in spectral features among all temporal phases available at each period, and all possible combinations at each period were evaluated. The results show that the difference between mid-May and late April (F) is most critical for the early identification of rice. For corn, the differences between early July and mid-June, late May, and mid-May, and between late May and late April (E, M, L, H) play a key role. For soybeans, the differences between early July and mid-June, mid-May, and early May, and between late May and late April (E, M, N, H) contribute most. In addition, we compared the F1 accuracy of each crop obtained by this method with that in the study [
24]. That study used Sentinel-2 remote sensing images to construct spectral and vegetation index feature sets of various crops in each growth period in an incremental manner, and investigated the early recognition of corn, rice, and soybean in Northeast China based on common classifiers. As can be seen from
Table 13,
Table 14 and
Table 15, compared with [
24], the recognition accuracy of crops at each stage obtained by our method is higher than that reported in that study. This is mainly because the indices we designed jointly consider crop growth difference information and temporal phase information, and therefore amplify the differences between crops and distinguish them effectively. Therefore, combining the spectral and vegetation index features of crops with differential temporal features can effectively improve crop recognition accuracy at early growth stages.
(2) Effective identification features of crops at different periods
The use of suitable classification features can effectively improve the accuracy of remote sensing crop recognition [
72,
73,
74,
75]. Many studies have shown that spectral and vegetation indices derived from Sentinel-2 multispectral remote sensing images play a more important role in crop identification than spatial texture information in Northeast China [
34,
76]. Accordingly, this study selected ten spectral bands and 22 vegetation indices as image features and explored their differences for use in crop identification in different growth periods. We found that rice moisture information was more prominent in May and June, and that some of the vegetation indices constructed with short-wave infrared as input were more sensitive to vegetation canopy moisture [
77], which could effectively capture the moisture information of rice and thus identify rice as early as possible. Corn and soybean are dryland crops with the same phenological period [
78], and some indicators of low vegetation cover play a role in this period, but their recognition ability is limited. In July, when various crops enter their peak growth period, some indicators related to vegetation canopy cover and the red edge index can be fully utilized [
72,
79] to effectively identify corn and soybeans.
(3) Comparison of the performance of different classifiers in early crop identification
In this study, four common classifiers were selected to evaluate their recognition effectiveness in the early stage of crops. These classifiers were selected based on their wide application in land cover [
80,
81].
Table 16,
Table 17 and
Table 18 summarize the F1 recognition accuracy of each crop at different periods. RF can achieve a maximum rice identification accuracy of 96.8% as early as 13 May, while the accuracy of GBDT is about 2% lower and SVM and Cart are about 5.8% lower. Both GBDT and RF reached 98% accuracy for corn and soybeans as early as 7 July, while Cart and SVM were lower, at around 97% for corn and 96.7% and 94.8% for soybeans, respectively. Overall, with the addition of more temporal features, the recognition accuracy of the four classifiers tended to increase, with GBDT and RF achieving better results in identifying the three crops in the early stage.
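For reference, the F1 score of a crop class is the harmonic mean of its PA (recall) and UA (precision). The sketch below uses this standard definition with arbitrary example values; the function name is hypothetical, and the numbers are not results from this study.

```python
def f1_score(pa: float, ua: float) -> float:
    """Harmonic mean of producer's accuracy (recall) and user's accuracy (precision)."""
    if pa + ua == 0:
        return 0.0  # avoid division by zero when the class is never matched
    return 2 * pa * ua / (pa + ua)

# Arbitrary example values, for illustration only.
print(f"F1 = {f1_score(0.90, 0.80):.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, a class with a high PA but a low UA (or vice versa) still receives a modest F1.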
6. Conclusions
This study constructed a dataset of early crop growth characteristics based on temporal phase difference feature information and explored its potential for use in early crop recognition in typical black soil areas of Northeast China. Firstly, a multi-temporal crop growth characteristics dataset was constructed using the different information on crops in different periods of the early stages. Then, the feature optimization method was used to select the best feature set for all possible combinations in each period, and the early key identification characteristics of different crops and their stage change characteristics were explored. Finally, the performance differences of four classifiers in early crop recognition and the recognition accuracy levels of crops in different periods were analyzed. The conclusions are as follows:
(1) The early crop growth method proposed in this study is intuitive and easy to understand. It can effectively amplify the differences between early-stage crops and improve the accuracy of crop identification, so it has great potential for the early identification of crops. It can also map crops quickly and accurately in their early stages, providing a reference for relevant agricultural departments, and thus has solid practical application value.
(2) The difference time phase features can distinguish between crops and improve their identification accuracy in the early stage. Rice changed most obviously between mid-May and late April (F). Corn changed most obviously between early July and mid-June, late May, and mid-May, and between late May and late April (E, M, L, H). Soybean changed most obviously between early July and mid-June, mid-May, and early May, and between late May and late April (E, M, N, H).
(3) Short-wave infrared bands and vegetation indices sensitive to water information and low vegetation coverage, such as LSWI, SR, B11, and NDTI, contributed more to the early identification of rice. For corn and soybean, short-wave infrared bands, red-edge indices, and vegetation canopy cover indicators, such as B11, NDRE3, LSWI, VIgreen, RESI, and NDRE2, were key in identifying both.
(4) Corn can be identified as early as 7 July, with both PA and UA above 97%; soybean can be identified as early as 7 July, with both above 94%; and rice can be identified as early as 13 May, with both above 90%.
(5) GBDT and RF performed comparably in crop recognition, followed by SVM, while the Cart classifier was poorer.