Prediction of Winter Wheat Maturity Dates through Assimilating Remotely Sensed Leaf Area Index into Crop Growth Model

Predicting crop maturity dates is important for improving crop harvest planning and grain quality. The prediction of crop maturity dates by assimilating remote sensing information into crop growth model has not been fully explored. In this study, a data assimilation framework incorporating the leaf area index (LAI) product from Moderate Resolution Imaging Spectroradiometer (MODIS) into a World Food Studies (WOFOST) model was proposed to predict the maturity dates of winter wheat in Henan province, China. Minimization of normalized cost function was used to obtain the input parameters of the WOFOST model. The WOFOST model was run with the re-initialized parameter to forecast the maturity dates of winter wheat grid by grid, and THORPEX Interactive Grand Global Ensemble (TIGGE) was used as forecasting period weather input in the future 15 days (d) for the WOFOST model. The results demonstrated a promising regional maturity date prediction with determination coefficient (R2) of 0.94 and the root mean square error (RMSE) of 1.86 d. The outcomes also showed that the optimal forecasting starting time for Henan was 30 April, corresponding to a stage from anthesis to grain filling. Our study indicated great potential of using data assimilation approaches in winter wheat maturity date prediction.


Introduction
Food security is always an important issue; high price and scarcity of food increased agriculture's capacity to feed burgeoning global population [1,2]. Ever-increasing pressures are being put on the sector while trying to tackle the associated environmental issues. Therefore, predictions of crop status, development, maturity and yield prior to harvest are critical to design regional and national agricultural policies, for example planning and deployment of agricultural machinery and optimal crop harvesting. Therefore, the objectives of this study are twofold: (1) to introduce a method to predict the winter wheat maturity date that combines a process-based model and remote sensing data, which could be feasible to provide a more comprehensive and dynamic description of crop growth; and (2) to evaluate the accuracy of this method for forecasting winter wheat maturity dates and find earlier prediction nodes based on the premise of ensuring the accuracy. The results of this work may significantly contribute to winter wheat maturity date prediction study and be beneficial for adjustment of harvest policy and increase in harvest yield.

Study Area
This study was conducted in major winter wheat planting areas of Henan province of China, which extended from 31.38 • N to 36.37 • N and 110.35 • E to 116.64 • E. The winter wheat planting region in this province was selected as the study area, which was about 5.425 × 10 6 ha and covered approximately one quarter of the entire winter wheat planting area of China (Data Source: National Bureau of Statistics of China, http://www.stats.gov.cn/). This region located in North China Plain as well as the warm temperate monsoon climate region, characterized by average annual precipitation of about 407.7-1295.8 mm, the average annual temperature of about 10.5-16.7°C from south to north and annual average sunshine duration of about 1285.7-2292.9 h, which is favorable for winter wheat cultivation (Data Source: National Bureau of Statistics of China, http://www.gov.cn/guoqing/2018-01/ 17/content_5257444.htm). In this study, 30-m winter wheat planting map is extracted from Landsat8 OLI and Sentinel 2 data by means of random forest method [23,24]. Winter wheat in the study area usually sows from October to November and harvest during May and June in the following year. Due to different weather conditions and sowing dates, the maturity dates show large spatial variations in the study area. Figure 1 shows locations of agrometeorological stations and field distribution of the study area.
Remote Sens. 2020, 12, x FOR PEER REVIEW 3 of 20 evaluate the accuracy of this method for forecasting winter wheat maturity dates and find earlier prediction nodes based on the premise of ensuring the accuracy. The results of this work may significantly contribute to winter wheat maturity date prediction study and be beneficial for adjustment of harvest policy and increase in harvest yield.

Study Area
This study was conducted in major winter wheat planting areas of Henan province of China, which extended from 31.38° N to 36.37° N and 110.35° E to 116.64° E. The winter wheat planting region in this province was selected as the study area, which was about 5.425 × 10 6 ha and covered approximately one quarter of the entire winter wheat planting area of China (Data Source: National Bureau of Statistics of China, http://www.stats.gov.cn/). This region located in North China Plain as well as the warm temperate monsoon climate region, characterized by average annual precipitation of about 407.7-1295.8 mm, the average annual temperature of about 10.5-16.7 ℃ from south to north and annual average sunshine duration of about 1285.7-2292.9 h, which is favorable for winter wheat cultivation (Data Source: National Bureau of Statistics of China, http://www.gov.cn/guoqing/2018-01/17/content_5257444.htm). In this study, 30-m winter wheat planting map is extracted from Landsat8 OLI and Sentinel 2 data by means of random forest method [23,24]. Winter wheat in the study area usually sows from October to November and harvest during May and June in the following year. Due to different weather conditions and sowing dates, the maturity dates show large spatial variations in the study area. Figure 1 shows locations of agrometeorological stations and field distribution of the study area.    //ladsweb.modaps.eosdis.nasa.gov/). The LAI time series were generated by stacking the images in temporal order and then reconstructed by the upper-envelope-based Savitzky-Golay (S-G) filter [25,26] to suppress the noise caused by clouds, water vapor, and aerosols.
S-G filter was proposed by Abraham Savitzky and MarcelJ. E. Golay in 1964. It is a weighted average algorithm of moving window, which is obtained by least square fitting of a given high-order polynomial in a sliding window. S-G filter can be expressed by Equation (1) where LAI i is initial LAI value, LAI sg i is S-G filtered LAI value, C i is the weight of LAI i , m is the window radius, N is the number of all data in the window, which equals to 2m + 1.
However, the traditional S-G filter algorithm always causes the filtered points to be obviously close to the maximum or minimum value of the initial curve. So, we reconstructed the MODIS LAI time series by upper-envelope-based S-G filter. The upper-envelope-based S-G filter saves the original and S-G filtered LAI time series and generates the upper envelope of LAI time series as Equation (2): where LAI o t is initial LAI at time node t obtained from MODIS LAI images, LAI sg t is S-G filtered MODIS LAI, LAI l t is the upper envelope of the two time series. The absolute difference of upper envelope and S-G filtered LAI time series is used as the determination for exiting iteration, and is calculated as Equation (3): where dif is the difference described as above. Then, the S-G filtered LAI time series will be iterated as the initial LAI time series by Equations (2) and (3) until dif reaches the threshold. In this study, the threshold was set as 0.5, which is an empirical value. If the threshold is greater than 0.5, the upper-enveloped-based S-G filter (UE-SG) filter curve is not smooth enough; if the threshold is less than 0.5, it will cost more time to finish UE-SG filter process. Figure 2 shows the comparison of MODIS LAI time series curves before and after upper-envelope-based S-G filtering.
Remote Sens. 2020, 12, x FOR PEER REVIEW 4 of 20 images in temporal order and then reconstructed by the upper-envelope-based Savitzky-Golay (S-G) filter [25,26] to suppress the noise caused by clouds, water vapor, and aerosols. S-G filter was proposed by Abraham Savitzky and MarcelJ. E. Golay in 1964. It is a weighted average algorithm of moving window, which is obtained by least square fitting of a given high-order polynomial in a sliding window. S-G filter can be expressed by Equation (1) where is initial LAI value, is S-G filtered LAI value, is the weight of , m is the window radius, N is the number of all data in the window, which equals to 2m + 1.
However, the traditional S-G filter algorithm always causes the filtered points to be obviously close to the maximum or minimum value of the initial curve. So, we reconstructed the MODIS LAI time series by upper-envelope-based S-G filter. The upper-envelope-based S-G filter saves the original and S-G filtered LAI time series and generates the upper envelope of LAI time series as Equation (2): where is initial LAI at time node t obtained from MODIS LAI images, is S-G filtered MODIS LAI, is the upper envelope of the two time series. The absolute difference of upper envelope and S-G filtered LAI time series is used as the determination for exiting iteration, and is calculated as Equation (3): where dif is the difference described as above. Then, the S-G filtered LAI time series will be iterated as the initial LAI time series by Equations (2) and (3) until dif reaches the threshold. In this study, the threshold was set as 0.5, which is an empirical value. If the threshold is greater than 0.5, the upperenveloped-based S-G filter (UE-SG) filter curve is not smooth enough; if the threshold is less than 0.5, it will cost more time to finish UE-SG filter process. Figure 2 shows the comparison of MODIS LAI time series curves before and after upper-envelope-based S-G filtering.

Agrometeorology Data
Winter wheat data (including emergence, anthesis and maturity date, yield, biomass data of leaves, stems and ears at different stages, LAI, field water capacity, soil moisture) are collected from 33 agrometeorological stations in the study area from 2014 to 2018, and are conducive in recalibrating the World Food Studies (WOFOST) model. Observations from 2014 to 2016 are used for calibrating WOFOST, other data are used for verifying the accuracy of maturity prediction. Table 1 shows the general growth stages of winter wheat in Henan province. Moreover, the observation date of each growth stage in Zhengzhou and Fanqu stations is also listed. Note: 1, 2, and 3 represent the 1st, 2nd, and 3rd 10-day periods in each month.

Meteorology Data and Weather Forecast
Meteorological data were obtained from China Regional Surface Meteorological Elements Dataset produced by Institute of Tibetan Plateau Research Chinese Academy of Sciences (http: //www.tpedatabase.cn/portal/MetaDataInfo.jsp?MetaDataId=249369). We preprocessed meteorological data (including irradiation, minimum temperature, maximum temperature, early morning vapor pressure, wind speed at 2 meters and precipitation, with spatial and temporal resolution of 0.1 • and 1 day, respectively, in a time resolution of 1 day and spatial resolution of 0.1 • ) to WOFOST weather input format [27,28]. This meteorology dataset has been increasingly used in agricultural research [29][30][31].
Weather forecast data in this study came from THORPEX Interactive Grand Global Ensemble (TIGGE) daily ensemble forecast data released by European Centre for Medium-Range Weather Forecasts (ECMWF, https://apps.ecmwf.int/datasets/data/tigge/levtype=sfc/type=cf/), which contains 6 elements (irradiation, minimum temperature, maximum temperature, early morning vapor pressure, wind speed at 2 meters and precipitation) required by WOFOST with spatial resolution of 0.25 • and provides meteorological forecast data for the next 15 days. The TIGGE data were resampled from 0.25 • to 500 m using nearest neighbor assignment method to match the assimilation unit size (500 m) of this study.
Since the TIGGE dataset used in this study can only provide weather forecast data for the following 15 days, the meteorological observation data and TIGGE forecast data cannot cover the entire growth period of winter wheat in the study area, when the forecasting node is set in March or April. Therefore, we used the most similar annual meteorological data at the same location from 1979 to 2016 to follow the weather forecast data exceeding the time limit of TIGGE by the methods proposed in the previous research [31,32]. The specific methods are as follows: The six meteorological elements required by WOFOST from 1979 to 2016 are collected in grid format. The TIGGE ensemble prediction data from the selected forecasting start time to the next 15 d is acquired. Then, the highest temperature data of this time period arranged in chronological order make up a vector. Calculating the vectorial angle formed by this vector and vector of the same period in every year from 1979 to 2016, the year with the smallest angle is determined as the year with the highest similarity, and weather data of the determined year is used for the period that TIGGE data cannot cover. We use the Equation (4) to calculate the angle of two vectors: In Equation (4), a = (a 1 , a 2 , . . . , a n ), b = (b 1 , b 2 , . . . , b n ), thus the Equation (4) can be described as Equation (5):

Parameterization of WOFOST Model
The WOFOST model was developed by the Wageningen University and Research Center of Wageningen, which aims to simulate crop growing process and reaction towards moisture, temperature, sunshine and other factors during the crop growth period. It provides quantification of developmental stages, estimates of biomass and grain yield at a daily time step for different crop types [33,34]. As a state variable, LAI can reflect the intensity of crucial processes such as photosynthesis, and it presents remarkable variation at different development stages. Therefore, it is a suitable state variable to adjust phenological growth by coupling remote sensing data and WOFOST model and optimizing relative parameters. The WOFOST model could run in different modes, e.g., potential mode, water-limited mode and nutrient-limited mode. Due to the good irrigation in the study area, we set the model to potential mode for this study.
The preprocessing of the WOFOST model is meant to re-calibrate and localize it for the target study area. In this study, we recalibrate the model manually based on observation data from Zhengzhou and Fanqu Stations, and our previous research [32,35,36]. The EFAST (Extend Fourier Amplitude Sensitivity Test) [37] algorithm was used to conduct a global sensitivity analysis for the WOFOST model. As for some wide-ranging and sensitive parameters like TSUM1 (effective heat sum from emergence to anthesis), TSUM2 (effective heat sum from anthesis to maturity) and IDEM (emergence date), we used the recorded date of emergence, anthesis, maturity and corresponding temperature data at all agrometeorological stations for the period 2014~2016 to calculate or determine the values of these parameters in these years, and then used the average values in these years to assign these parameters. Besides, we used the distance inverse weight (IDW) method to calculate the values of other pixels in the study area. For other parameters, we assigned one value to each parameter throughout the study area. As for the relatively stable parameters on spatial region, we determined the parameters by field measurements calculation. For example, we used the ratio of measured leaf area to leaf dry weight at various stages to determine the values of SLATB (specific leaf area as a function of development stage); the measured total dry weight increase in the crop and the increase in different organs (such as leaves, stems and ears) were used for calculating the values of parameters like FOTB (fraction of above ground dry matter increase partitioned to storage organs as a function of development stage), FSTB (fraction of above ground dry matter increase partitioned to stems as a function of development stage) and FLTB (fraction of above ground dry matter increase partitioned to leaves as a function of development stage). Furthermore, other parameters are calibrated according to previous research [32,35,36] or set as default value of WOFOST model.

Optimization Parameters Selection
The WOFOST model can simulate crop development stages by considering both temperature and photoperiod, or considering temperature only. In this study, the effect of photoperiod was not considered. Therefore, according to the principle of the WOFOST model, there are three dominating parameters in simulating the development stages despite the impact of photoperiod, which are emergence date (IDEM), effective heat sum from emergence to anthesis (TSUM1) and effective heat sum from anthesis to maturity (TSUM2). These parameters have significant spatial and temporal variations among years; therefore, we selected them as parameters to be optimized. Calibration of these parameters is based on historical data. Upper and lower values for the parameters are shown in Table 2.
Initial value − 10 Initial value + 10 Note: TSUM1 means effective heat sum from emergence to anthesis, TSUM2 means effective heat sum from anthesis to maturity, IDEM means emergence date.

Cost Function and Shuffled Complex Evolution-University of Arizona (SCE-UA) Algorithm
In this study, the parameters were optimized by means of shuffled complex evolution-university of Arizona (SCE-UA). It is a global optimization algorithm developed by Duan in 1993 [38], which introduces complex shape segmentation and blending ideas. It has outstanding performance in the search space efficiency, computational speed, and global search optimization ability. In addition, the SCE-UA algorithm's insensitivity to initial values of optimized parameters [39] and its advantages in optimization of nonlinear problems is manifest, thus it can reduce the requirement for prior knowledge when applying SCE-UA in coupling remote sensing and crop growth models [40][41][42]. It is worth mentioning that the SCE-UA algorithm was implemented by embedding it into the WOFOST model (Fortran Version).
According to previous research, the SCE-UA algorithm is more susceptible to the error of observation data than sequential assimilating algorithms [43]. Due to the mixed pixels and the impact of cloud and rain in 500-m resolution images, the value of MODIS LAI is often much lower than that of WOFOST model simulated LAI [44]. Therefore, we normalized LAI to solve this problem. That is, LAI from both the MODIS and WOFOST simulation are mapped to the range of 0~1, then the range of the two sources of LAI data are made the same while retaining the MODIS LAI trend information. The cost function is constructed as follows: LAI t obs and LAI t sim are MODIS LAI and simulated LAI on t-day. LAI max obs and LAI max sim are maximum of MODIS LAI and simulated LAI. LAI min obs and LAI min sim are minimum of MODIS LAI and simulated LAI. J is the value of cost function. During five consecutive iterations, if all of J's reach the threshold, SCE-UA stops and returns the optimal parameters set. Otherwise, if loops of iterations exceed the selected, the output parameter set will also be used as the optimal set. Then, we put this set into WOFOST model and run the whole framework.

Validation of Calibrated WOFOST Model
In this section, we collected field measured LAI from Zhengzhou and Fanqu stations from 2014 to 2016 to verify the calibration result. Figure 3 shows the LAI time series simulated by calibrated WOFOST and the LAI field measurement.  WOFOST-simulated LAI of winter wheat varies during the whole growth period: LAI increases slowly at the emergence stage, keeps constant before the overwintering stage, increases rapidly since the greening stage, reaches its maximum before the anthesis stage, gradually decreases from the anthesis stage to the maturity stage and decreases rapidly after the maturity. As can be seen in Figure  3, the trend and characteristics of simulated LAI time series at the two stations are basically consistent with the measured LAI. After calculating the difference between these two sources of LAI, the root mean square error (RMSE) of the two stations in the period 2014~2016 is 0.94 m 2 /m 2 , which indicates that the calibrated WOFOST model could simulate the LAI of the whole growth period well.
In addition, it can be seen from Figure 3 that there are uncertainties in some stations. There are two possible reasons: first, the inter-annual uncertainty exists in winter wheat field management, WOFOST-simulated LAI of winter wheat varies during the whole growth period: LAI increases slowly at the emergence stage, keeps constant before the overwintering stage, increases rapidly since the greening stage, reaches its maximum before the anthesis stage, gradually decreases from the anthesis stage to the maturity stage and decreases rapidly after the maturity. As can be seen in Figure 3, the trend and characteristics of simulated LAI time series at the two stations are basically consistent with the measured LAI. After calculating the difference between these two sources of LAI, the root mean square error (RMSE) of the two stations in the period 2014~2016 is 0.94 m 2 /m 2 , which indicates that the calibrated WOFOST model could simulate the LAI of the whole growth period well.
In addition, it can be seen from Figure 3 that there are uncertainties in some stations. There are two possible reasons: first, the inter-annual uncertainty exists in winter wheat field management, such as changes of the sowing date. These changes could cause some errors if we use the same set of crop parameters to simulate the development stages of winter wheat of various years, thus affect the accuracy of LAI simulation. Second, because the model calibration is based on the historical observation data of Zhengzhou Station, it is not necessarily suitable for winter wheat in other regions or other varieties.
Comparing the simulated LAI time series of Zhengzhou and Fanqu Stations from 2014 to 2016, it can be seen that the growth rate of winter wheat planted in Fanqu station is higher than that of Zhengzhou Station before wintering. A possible explanation is that the sowing date of Fanqu is earlier than that of Zhengzhou, because by observing the 2014 LAI time series, we can see the length from emergence to wintering of winter wheat in Zhengzhou Station is significantly shorter than that in the Fanqu, while the lower threshold temperature for ageing of leaves (TBASE) values in the two stations are set the same, and the climatic conditions of the two stations are relatively similar. Besides, it can also be seen that winter wheat in Zhengzhou grows faster than that in Fanqu from the green-up stages. Therefore, the change in sowing date has a great impact on the growth of winter wheat.

Validation at Agrometeorological Stations Scale
In this section, we used LAI as the assimilating variable, MODIS LAI as the remote sensing data source, observed and forecast meteorological data as the weather input of WOFOST model, and 15 May 2018 as the forecasting start time. We constructed a normalized cost function (J) based on MODIS LAI and WOFOST-simulated LAI. The SCE-UA algorithm is used to minimize the J to optimize the parameters for WOFOST crop model. This optimized parameter set is used to drive the WOFOST model to simulate the winter wheat growth process of each pixel in the study area, and retrieve the maturity prediction.
The anthesis stage is an important demarcation period for the transition from vegetative growth to reproductive growth of a crop. It is a crucial indicator for the evaluation of the framework in phenological development simulation well. Therefore, in this study, we use the observed dates of the anthesis and maturity recorded by the agrometeorological stations in the study area to test the prediction results jointly. Figure 4 shows MODIS LAI, WOFOST-simulated LAI at Huaxian Station without and with data assimilation, respectively. Figure 5 is the scatter plot between the predicted and observed dates of the anthesis and maturity.
Remote Sens. 2020, 12, x FOR PEER REVIEW 10 of 20 In Figure 4, the maturity date exhibited on the non-assimilated curve is DAE (days after emergence) 232, which is 10 days later than the observed date, while the assimilated maturity date is very close to the observed maturity date, which greatly improved the prediction accuracy of the maturity date. Besides, the variation trend of the assimilated LAI curve is closer to that of the MODIS LAI, which demonstrates that the optimization of parameters such as TSUM1 can actually optimize output LAI curve trend and avoid being affected by the difference between MODIS LAI and WOFOST simulated LAI. Figure 4 can provide an explanation for this: the non-assimilated anthesis date is DAE 177, which is also later than the observed date. After assimilation, we can see that the  Figure 5c,d show that the anthesis date and maturity date after assimilation basically coincide with the observation, and the slope of the trend line of the scattered points is around one, indicating that the assimilated anthesis date and maturity date are consistent with the observed date, with RMSEs of 2.12 d and 1.86 d, respectively. Therefore, this framework can predict the maturity of winter wheat accurately. Figure 6 shows the spatial distribution comparison of WOFOST-predicted maturity date without and with data assimilation. The spatial difference in Figure 6a is not obvious, which illustrated that maturity dates were mostly influenced by meteorological data and the spatial distribution of IDEM, TSUM1 and TSUM2 after IDW method (Section 2.3) when crop model is not assimilated with remotely sensed LAI. Due to the coarse spatial resolution of the weather input data and the setting of the IDW calibration method, the predicted maturity dates without assimilation had obvious patch distribution, and the maturity date of one pixel is often very close to its neighbors. However, the spatial variability of the maturity dates become obvious after data assimilation with MODIS-LAI In Figure 4, the maturity date exhibited on the non-assimilated curve is DAE (days after emergence) 232, which is 10 days later than the observed date, while the assimilated maturity date is very close to the observed maturity date, which greatly improved the prediction accuracy of the maturity date. Besides, the variation trend of the assimilated LAI curve is closer to that of the MODIS LAI, which demonstrates that the optimization of parameters such as TSUM1 can actually optimize output LAI curve trend and avoid being affected by the difference between MODIS LAI and WOFOST simulated LAI. Figure 4 can provide an explanation for this: the non-assimilated anthesis date is DAE 177, which is also later than the observed date. After assimilation, we can see that the curve's trend around the anthesis stage is much closer to the observed data and remotely sensed data, especially the anthesis period is pushed earlier. The assimilated anthesis date is DAE 173 and more in line with the observed date. From the results we can see that the assimilation framework significantly improves the prediction accuracy of the anthesis date, and further improves the prediction accuracy of the maturity date.

Spatial Distribution Comparison of WOFOST-Predicted Maturity Date without and with Data Assimilation
In addition, we can also find another effect brought by the accuracy improvement of maturity date, which is to adjust the range of the WOFOST simulated LAI. By analyzing the meteorological data of Zhengzhou Station and the daily activities simulation of winter wheat in the WOFOST model, we found if the simulated anthesis date is later than the observed date, it would lead to the winter wheat growing in a relatively warm environment in the greening and anthesis stage simulation, in which the temperature is higher, and the TSUM1 becomes bigger. It would result in the higher rate of growth and the higher peak of LAI output, which is bigger than six (Figure 4 green line); after assimilation, the simulated anthesis period is earlier, which is DAE 173 and this means the TSUM1 is lower than that without assimilation, thus reduced the LAI peak which is smaller than six (Figure 4 blue line). Therefore, the errors in the phenological growing stages will directly affect the range of LAI, and have a great impact on the simulation accuracy of the WOFOST model in terms of biomass and yield. Figure 5c,d show that the anthesis date and maturity date after assimilation basically coincide with the observation, and the slope of the trend line of the scattered points is around one, indicating that the assimilated anthesis date and maturity date are consistent with the observed date, with RMSEs of 2.12 d and 1.86 d, respectively. Therefore, this framework can predict the maturity of winter wheat accurately. Figure 6 shows the spatial distribution comparison of WOFOST-predicted maturity date without and with data assimilation. The spatial difference in Figure 6a is not obvious, which illustrated that maturity dates were mostly influenced by meteorological data and the spatial distribution of IDEM, TSUM1 and TSUM2 after IDW method (Section 2.3) when crop model is not assimilated with remotely sensed LAI. Due to the coarse spatial resolution of the weather input data and the setting of the IDW calibration method, the predicted maturity dates without assimilation had obvious patch distribution, and the maturity date of one pixel is often very close to its neighbors. However, the spatial variability of the maturity dates become obvious after data assimilation with MODIS-LAI (Figure 6b) which demonstrated the fact that using MODIS-LAI data enables us to account for local conditions (e.g., soil, farming practices, and weather conditions) and reduce their uncertainty effects on the maturity date simulations. In the southern part of the study area, due to the earlier sowing time of winter wheat and suitable climate, the maturity date is the earliest, followed by the central and eastern regions. However, there are still some mismatches. For example, some places in the southeast region appear to harvest late, which is related to the local planting habits because planting later naturally results in a later harvest. The northeast region is a bit limited in terms of light and heat due to higher latitude and has the latest maturity date. The overall trend of maturity of wheat from both without and with data assimilation is gradually pushed back from the southwest to the northeast, however, after assimilation, the maturity dates of both the west and northeast regions are the latest.

Spatial Distribution Comparison of WOFOST-Predicted Maturity Date without and with Data Assimilation
Remote Sens. 2020, 12, x FOR PEER REVIEW 12 of 20 ( Figure 6b) which demonstrated the fact that using MODIS-LAI data enables us to account for local conditions (e.g., soil, farming practices, and weather conditions) and reduce their uncertainty effects on the maturity date simulations. In the southern part of the study area, due to the earlier sowing time of winter wheat and suitable climate, the maturity date is the earliest, followed by the central and eastern regions. However, there are still some mismatches. For example, some places in the southeast region appear to harvest late, which is related to the local planting habits because planting later naturally results in a later harvest. The northeast region is a bit limited in terms of light and heat due to higher latitude and has the latest maturity date. The overall trend of maturity of wheat from both without and with data assimilation is gradually pushed back from the southwest to the northeast, however, after assimilation, the maturity dates of both the west and northeast regions are the latest.
The changes in the spatial distribution of the maturity date without and with data assimilation are mainly manifested in two aspects: (1) the assimilated maturity dates of the study area are generally postponed, especially in the high-altitude regions at the western part. The reasons for the analysis are the optimized IDEM and TSUM1. The optimized value has changed greatly from the initial calibrated value, leading to a remarkable trend change in the mature dates of the corresponding region after assimilation; (2) the maturity dates distribution after assimilation shows more detailed spatial variability. We can see that there are some certain differences in the maturity dates of adjacent winter wheat pixels. It demonstrates that the optimized parameters of WOFOST model by assimilating MODIS LAI can reflect the planting varieties and seeding time on a finer spatial scale to a certain extent.
(a) Maturity date without assimilation.
(b) Maturity date with assimilation.

Evaluation of Maturity Date Prediction Performance
In order to find out the prediction start time with higher accuracy as early as possible, we designed a set of predicting experiments. In this section, we selected five forecasting times to predict The changes in the spatial distribution of the maturity date without and with data assimilation are mainly manifested in two aspects: (1) the assimilated maturity dates of the study area are generally postponed, especially in the high-altitude regions at the western part. The reasons for the analysis are the optimized IDEM and TSUM1. The optimized value has changed greatly from the initial calibrated value, leading to a remarkable trend change in the mature dates of the corresponding region after assimilation; (2) the maturity dates distribution after assimilation shows more detailed spatial variability. We can see that there are some certain differences in the maturity dates of adjacent winter wheat pixels. It demonstrates that the optimized parameters of WOFOST model by assimilating MODIS LAI can reflect the planting varieties and seeding time on a finer spatial scale to a certain extent.

Evaluation of Maturity Date Prediction Performance
In order to find out the prediction start time with higher accuracy as early as possible, we designed a set of predicting experiments. In this section, we selected five forecasting times to predict the maturity dates of winter wheat, which were 15 March, 30 March, 15 April, 30 April and 15 May (same as the experiment in Section 3.2). Table 3 shows the time period of MODIS product (MCD15A3H) and meteorological data used in this study. Figure 7 exhibits the scatter plot of all the forecast starting times between observed and predicted dates. Table 4 shows the errors in those predictions.  It can be seen from the Figure 7 that both the two forecasting start times in March present large errors, the correlation between the predicted date and the observed dates is low, the correlation coefficient for the maturity dates is somewhat lower than that of the anthesis period, the scatter distribution is more dispersed, and the deviation of some points reaches 10 d. However, we found that the slope of the trend line between the predicted date and the observed date is about 1, which indicates that although there is a large error in the predicted growth period, there is no obvious systematic error (the intercept is between −3.51 and 33.27), and the overall deviation of the predicted value does not appear because the points are distributed around the 1:1 line in general.  Comparing the prediction results of 15 March with 30 March, it can be found that the errors of the two maturity periods are similar, but the prediction accuracy and consistency of the latter one in predicting anthesis date are improved. A feasible explanation is that most of the winter wheat in the study area enters the greening stage in mid-to-late February. After returning to green, the growth rate gradually increases. Therefore, winter wheat still grows slowly in the March 15 forecast experiment. Remote sensing LAI can only provide little trend variation, so the parameter optimization of the assimilation algorithm is not effective under this circumstance. On 30 March, as winter wheat enters the jointing stage, the trend of remote sensing LAI changes became more obvious, and the assimilation framework can improve the predicting accuracy of the anthesis stage to a certain extent.
The increase in RMSE is most obvious from 30 March to 15 April. The possible reason for the increase is that most of the winter wheat in the study area enters the heading stage in early and mid-April. The heading stage is the period when the LAI peaks during the whole growth period of the winter wheat, so heading stage is generally the period when the remote sensing and model-simulated LAI has the largest difference. Then, the parameter optimization by the assimilation method can adjust the peak of the output LAI of the model, so that the error of the predicted maturity can be reduced.
From the prediction results on 30 April and 15 May, we have found that the error in predicting the anthesis dates is even higher than that of the maturity dates, which points out an advantage of the assimilation framework in this study. That is, this framework is less affected by error transmission. Errors of state variables will gradually accumulate during the simulating process, resulting in simulation errors at maturity often larger than anthesis. But this framework assimilates all available LAI data for the whole growth period, that is, in this framework, the anthesis and maturity stages are optimized at the same time. Therefore, it is very likely that although there are errors in the anthesis period, these errors will be compensated in the maturity stage.
It can be seen from Figure 7 that with the postponement of the forecast time, the accuracy of the prediction (simulation) of the anthesis and maturity dates of winter wheat in the study area has gradually improved. The RMSE on 15 April is within 4 days: this increases to 2.5 on 30 April. On 15 May, the maturity prediction error is within 2 d, which shows high accuracy. In addition, the correlation between the predicted and the observed dates is also increasing, R 2 is increasing from about 0.7 to more than 0.9. The range of predicted and observed dates is closer to each other, and the difference in spatial variability is small. Therefore, the error and correlation coefficient on 30 April are remarkably close to 15 May, which is an ideal forecasting start time.

Prediction Results of Maturity Date for Different Forecasting Nodes
In Figures 6 and 8, we have found the winter wheat in the study area matures from the southwest region to the northeast, basically resulting from meteorological conditions such as solar radiation and temperature. However, there are some regions disobeying the overall trend in the study area. For example, winter wheat of the southeastern area matures later mainly due to the local field management such as the sowing date. Figure 8 shows that since 15 April, the spatial pattern of the predicted maturity dates begins to gradually conform to the actual situation, that is, the maturity dates gradually delay from the southwest to the northeast in the study area. It is obvious that the predicted maturity dates of most stations fluctuate greatly during the two forecasts in March, and the direction of fluctuation may be opposite. The fluctuation range has decreased since 15 April. After 30 April, the fluctuation gradually converges and stabilizes. This indicates that forecasting the maturity date on 30 April can generate better prediction results in the study areas.

Assimilation of Remotely Sensed LAI into WOFOST Model
LAI is the most commonly used state variable that connects remotely sensed observations and crop models, because it is involved in many important processes of crop growth, such as light interception, CO2 assimilation and vegetation transpiration. The remotely sensed LAI observation used in this study is MODIS LAI product (MCD15A3H) with 500-m spatial resolution. In reality, the field parcel size in China is always smaller than 500 m × 500 m, and the coarse pixels usually result in LAI values obtained from highly heterogeneous surfaces and leading to large scaling effects. Therefore, LAI observations at a finer spatial resolution can effectively reduce the spatial mismatch between the crop model and LAI observations, and it is an efficient way to improve the performance of data assimilation results [45]. Besides, the cost function used in this study is built by least square method, which takes both remote sensing observations and crop model simulations into account. However, this method does not consider the uncertainty of both of them. Therefore, a more

Assimilation of Remotely Sensed LAI into WOFOST Model
LAI is the most commonly used state variable that connects remotely sensed observations and crop models, because it is involved in many important processes of crop growth, such as light interception, CO 2 assimilation and vegetation transpiration. The remotely sensed LAI observation used in this study is MODIS LAI product (MCD15A3H) with 500-m spatial resolution. In reality, the field parcel size in China is always smaller than 500 m × 500 m, and the coarse pixels usually result in LAI values obtained from highly heterogeneous surfaces and leading to large scaling effects. Therefore, LAI observations at a finer spatial resolution can effectively reduce the spatial mismatch between the crop model and LAI observations, and it is an efficient way to improve the performance of data assimilation results [45]. Besides, the cost function used in this study is built by least square method, which takes both remote sensing observations and crop model simulations into account. However, this method does not consider the uncertainty of both of them. Therefore, a more thoughtful cost function that considers the uncertainty of observation and crop model could be investigated in the future work, for example, the four-dimensional variational data assimilation cost function (4DVar) [46,47], it can integrate remote sensing LAI (observations), crop model-simulated LAI (model simulations) and their uncertainties [48].

Maturity Date Prediction Using Crop Model Data Assimilation Scheme
Crop maturity is essential for evaluating crop productivity and crop management. The proposed maturity dates prediction method based on crop model data assimilation scheme successfully predicts winter wheat maturity date in 2018 of Henan province, and it has the best prediction accuracy since 30 April (RMSE = 2.42 d). This method shows significant applicability and has the potential to be applied to other winter wheat planting regions. Besides, it could be better if the approximate dates of growth stages and the general trend characteristics of LAI in the target study area are obtained prior to the prediction of maturity date, because the premise of this method is that the WOFOST model is localized well. If that basic information is scarce, it is difficult to determine the initial value of the parameters to be optimized (e.g., TSUM1, TSUM2, IDEM). In that case, the selected ranges of parameters for the optimization have to be enlarged in order to find the optimal values in those ranges. Meanwhile, this will increase the workload and greatly reduce the calculation efficiency.
Although this method has general applicability, there are also some limitations to the use of this approach. First, crop growth phenology in the WOFOST model is determined by temperature and photoperiod, but we only optimized temperature-related parameters (TSUM1 and TSUM2) in the present study. Therefore, the photoperiod-related parameters should be considered in the data assimilation scheme in future work. Second, there are some uncertainties in the meteorological data with regard to exceeding the prediction time limit of TIGGE. In this study, the meteorological data after the prediction time of TIGGE were followed by the similar year's data, and there is a certain deviation from the actual meteorological data. In future study, we can try some medium-and long-term weather forecasting methods, such as regional climate model and weather generator [49], to couple with crop model for crop growth phenology prediction.

Uncertainty of Maturity Prediction
Some errors of the assimilated results were caused by the uncertainty in this framework. First, MODIS LAI product may be affected by cloud or rain, and the reconstructing process could introduce new bias to LAI time series. Moreover, the spatial resolution of remote sensing data is relatively low, thereby some buildings or other vegetation may be mixed in the planting pixels of winter wheat, resulting in limited purity of winter wheat pixels in the study area, affecting the quality of remote sensing LAI data, and consequently introducing errors to the maturity prediction. Second, some primary parameters of WOFOST, such as AMAXTB or FLTB, were not recalibrated by pixels, so the unsuitability could also bring uncertainties. Third, in the process of minimizing the cost function by SCE-UA algorithm, it may be difficult to accurately find the global optimal parameter set that is closest to the actual situation, so there may be some errors in the optimized parameter set.
From the agricultural perspective, the maturity of winter wheat is mainly determined by both the species and the field management. As for the species, because of the differences in winter wheat varieties and field management in the study area, the spatial values of these parameters ordered by WOFOST cannot be obtained. Therefore, the same values of other parameters will also cause certain errors when running WOFOST model. This study explains the phenological difference between TSUM1 and TSUM2 parameterization. However, other factors could also change the growing process, such as the vernalization before flowering, and the drought during the reproductive period [50,51]. As for the field management, the sowing date is determined by farmers, which could directly define the phenological stage start. Besides, the irrigation is also important for winter wheat in the study area because most of this region needs irrigating, but some of them may not well irrigated. If winter wheat suffers from drought in the reproductive period, it could lead to an earlier maturity. Finally, because the measured data from agrometeorological sites were used for verifying the prediction accuracy of the growth stages, these data can only reflect the development stages of the field observed by the sites, and the scale may be much smaller than the MODIS LAI data. The difference in scale between the two data sources will also cause prediction errors.

Conclusions
In this study, a crop model data assimilation framework was constructed for retrieving the predicted maturity dates at the regional scale. We selected LAI as the assimilating variable, MODIS LAI as the remote sensing data source, meteorological observation and weather numeric forecasting data as the weather input of WOFOST model, and 15 May 2018 as the starting time node of forecasting. The parameters of WOFOST crop model was optimized by SCE-UA to minimize the normalized cost function. The conclusions of this research are as follows: (1) the predicted anthesis dates and maturity dates after assimilation basically coincide with observation, with RMSE of 2.12 d and 1.86 d, respectively. Therefore, this framework can predict the maturity of winter wheat accurately. Besides, this framework can optimize the parameters of WOFOST such as TSUM1 and actually optimize the output LAI curve trend and avoid being affected by the difference of the two sources of LAI. (2) Considering both the earlier forecasting time and the higher accuracy, predicting maturity on 30 April can obtain ideal results. In this study, we have found that with the postponement of the forecasting start time, the accuracy of the prediction (simulation) of the anthesis and maturity dates was gradually improved. The RMSE on 15 April is within 4 days, and decreases to about 2.5 days on 30 April. On 15 May, the maturity prediction error is within 2 days, showing a high accuracy. In addition, the correlation between the predicted and the observed dates is also increasing, with R 2 is increasing from about 0.7 to more than 0.9.