Hidden Markov Models for Real-Time Estimation of Corn Progress Stages Using MODIS and Meteorological Data

Real-time estimation of crop progress stages is critical to the US agricultural economy and decision making. In this paper, a Hidden Markov Model (HMM) based method combining multisource features has been presented. The multisource features include mean Normalized Difference Vegetation Index (NDVI), fractal dimension, and Accumulated Growing Degree Days (AGDDs). In our case, these features are global variable, and measured in the state-level. Moreover, global feature in each Day of Year (DOY) would be impacted by multiple progress stages. Therefore, a mixture model is employed to model the observation probability distribution with all possible stage components. Then, a filtering based algorithm is utilized to estimate the proportion of each progress stage in the real-time. Experiments are conducted in the states of Iowa, Illinois and Nebraska in the USA, and our results are assessed and validated by the Crop Progress Reports (CPRs) of the National Agricultural Statistics Service (NASS). Finally, a quantitative comparison and analysis between our method and spectral pixel-wise based methods is presented. The results demonstrate the feasibility of the proposed method for the estimation of corn progress stages. The proposed method could be used as a supplementary tool in aid of field survey. Moreover, it also can be used to establish the progress stage estimation model for different types of crops.


Introduction
Crop phenological phases, which reflect the timing of recurring biological events, play an important role in US agricultural production, management, planning and decision-making.Real-time accurate estimation of crop progress stages is desired during growing season.The National Agricultural Statistics Service (NASS) of the United States Department of Agriculture (USDA) issues Crop Progress Reports (CPRs) [1] weekly during the growing season, listing primary progresses of selected crops in major producing states and Agricultural Statistics Districts (ASDs) through field survey.Despite its accuracy, only site-specific information in limited geographical extent can be provided.Especially, field survey is labor intensive and time-consuming.Therefore, it is very necessary to find a supplementary way for efficient, quantitative, and accurate estimating of crop progress stages.
Many algorithms and techniques have been developed to detect specific crop phenological stages, e.g., greenup, maturity, senescence and dormancy [2].A comparison of several existing methods for analyzing remotely sensed time series is provided in Hird and McDermid [3], and Atkinson et al. [4].Most of methods focus on the timing, duration and intensity of the growing season [5].Growing degree days (thermal time or heat units) are generally applied for measuring different developmental events of crop [6,7].Most of the process oriented crop growth models, e.g., WOrld FOod STudies (WOFOST) [8], Cropping Systems Simulator (CropSyst) [9], Decision Support System for Agrotechnology Transfer (DSSAT) series [10], Soil-Plant-Air-Water (SPAW) [11], and Soil-Water-Atmosphere-Plant (SWAP) [12], mainly use thermal index to define different phonological stages as well as to quantify the developmental rates.For example, the developmental rate of the WOFOST model is defined as a crop/cultivar specific function of ambient temperature, possibly modified by photoperiod [8].However, in the latter portion of the temperate growing season, day-length and water stress becomes important factors impacting crop growth and the onset of senescence, which make the models mathematically complex [13].
Remote sensing techniques, which provide consistent measurements at broad-scale and frequent time intervals, have been increasingly adopted to detect crop progress stages.High temporal frequencies products are collected from coarse-to-moderate spatial resolution platforms, e.g., Advanced Very High Resolution Radiometer (AVHRR), Moderate Resolution Imaging Spectroradiometer (MODIS), Satellite Pour l'Observation de la Terre (SPOT) VEGETATION (VGT), and Sea-Viewing Wide Field-of-View Sensor (SeaWiFS).A rigorous review of the sensors characteristics led to the hypothesis that MODIS is most likely to achieve the best results followed by SPOT-VGT and lastly by AVHRR in agricultural monitoring [14].Although crop progress metrics derived from satellite data may not necessarily correspond directly to conventional terrestrial phenological events [15], they implicitly link to the specific crop growth status.Vegetation Indices (VIs), especially the Normalized Difference Vegetation Index (NDVI), which reflects terrestrial crop cover and growth condition [16], are frequently utilized in crop progress studies.The methods of crop progress stage detection using VIs time series can be broadly grouped into four categories [13]: thresholds, derivatives, smoothing functions, and fitted models [17].However, most of these methods have been designed only on spectral pixel-wise analyses, ignoring that remote sensing image provides information of crop on both spectral responses and spatial pattern.
Generally, image texture changes during the crop life cycle, due to regional differences on crop sowing and growth, density of green leaves, soil background effects, etc.It provides a new perspective to analyze the growth of crops from satellite data.Culbert et al. [18] stated how image texture measures are affected by surface phenology.Shen et al. [19] explored the links between fractal dimension and corn progress stages from MODIS-NDVI time series, and shown that fractal dimension can be used as an index of heterogeneity for measuring corn progress stage changes.
Therefore, we will inherit the advantages of multisource features from thermal index, as well as spectral responses and spatial pattern of remote sensing, and incorporate Accumulated Growth Degree Days (AGDDs), mean NDVI, fractal dimension to estimate corn progress stages.According to the NASS's CPRs, crop progress stages in the state-level represent as progress percentages [20].To directly estimate corn progress percentages at the state-level, a Hidden Markov Model (HMM) based method is proposed, which utilizes the corresponding state-level global features.The HMM method can be regarded as a dynamic version of the Bayesian approach to model uncertainty [21].It has been successfully applied in speech recognition [22] and computational molecular biology [23].In the field of agriculture, numerous researches concern the use of HMMs and multi-temporal remote sensing images for automatic land cover classification incorporating knowledge of phenology into the classification process [24,25].A rare example of phenology detection is presented in [26], where HMMs are brought into vegetation dynamics analysis from time series of satellite remote sensing.The main difference between the model proposed in this paper and the one presented in [26] is that our model is directly constructed in the state-level, which can be trained and verified by existing field survey data, e.g., NASS's CPRs.More specifically, the Gaussian mixture model has been employed to define the probability distribution of all probable stage components.Our method is designed for real-time data processing, and uncertainties of data have been considered and processed to ensure the raw feature inputs can be handled directly.Moreover, this method takes growth-related features covering spectral responses, spatial pattern and environmental factor into account, which provides a supplemental way for the information collection of US corn progress.
The rest of this paper is organized as follows.In Section 2, we give a brief description on the study area and date sets.In Section 3, multisource features and the related processing of extraction are introduced.Mechanisms and schedules of proposed methods are clarified and discussed in Section 4. In Section 5, the performance of the proposed method is evaluated in terms of qualitative and quantitative measures.A conclusion is drawn in Section 6.

Study Area and Data Sets
The study area locates in the US Corn Belt, the most intensively cultivated region of the Midwest United States.It comprises of three major corn-producing states, including Iowa (ranges from 40°36'N to 43°30'N, and 89°5'W to 96°31'W), Illinois (ranges from 36°58'N to 42°30'N, and87°30'W to 91°30'W), and Nebraska (ranges from 40°N to 43°N, and 95°25'W to 104°W) (Figure 1).Corn, an annual crop, is the predominant crop in these regions.According to the definition of corn progress stages by USDA/NASS, two important corn progress categories are usually surveyed in the whole growing cycle: the progress of farming activities (i.e., planted and harvested) and phenological stages (i.e., emerged, silking, dough, dent, and mature) [20].Historical Climatology Network (USHCN) [27].USHCN is a high-quality network of US Cooperative Observer Network stations, specially selected for analyzing long-term variability and change in the whole contiguous United States [27].In this study, 23, 33, and 37 meteorological stations are chosen for the states of Iowa, Illinois, and Nebraska, respectively (Figure 1 and Appendix: Table A1).The meteorological stations were selected with a number of criteria including length of period of record, and spatial coverage.

Feature Extraction
Multisource features are used as the input of the HMM model.Global features, which are measured in the state-level, include mean NDVI, fractal dimension, and AGDDs.Along the corn life cycle, these features are in different forms of distribution (Figure 2).The mean NDVI and fractal dimension curves are unimodal and bimodal, respectively.The AGDDs is in a monotonic curve.Mean NDVI and fractal dimension are extracted from remote sensing image.AGDDs data are extracted from the site-specific observations of meteorological stations.The implementation of the feature extraction is introduced in the following subsections.

Mean NDVI
NDVI measures the greenness of crop on spectral response of remote sensing image.To extract the mean NDVI in the state-level, data pre-processing has been conducted over the original daily MODIS-NDVI time series.The data pre-processing mainly includes: image compositing, which composites daily NDVI images into weekly composite products by Maximum Value Composite (MVC) [28], and image masking, which eliminates non-corn pixels from weekly NDVI image with the mask of NASS's CDL.Relevant agreements of the data pre-processing procedures can be found in [19].Mean NDVI is extracted directly by statistics on masked weekly NDVI image.It is worth mentioning that no additional pre-process has been conducted on the time series, e.g., smoothing, and filtering.

Fractal Dimension
Fractal dimension measures the roughness of corn NDVI image, which can be used as an index of heterogeneity to reflect corn growth status.As stated in [19], roughness varies in the growth season due to corn fields are at different stages, i.e., fractal dimension time series reflects spatiotemporal changes of corn in the life cycle.The fractal dimension is estimated by Dimensionality-Reduction based Differential Box-Counting (DR-DBC) algorithm [19].The estimation process is also conducted on the masked weekly NDVI image.

AGDDs
Photoperiod and temperature are two common environmental factors that significantly affect crop growth and development [29].Modern corn hybrids are less vulnerable to photoperiod, but are greatly affected by temperature [29].In the studies of crop growth, temperature is often presented as growing degree days.AGDDs, defined as an explanatory temperature variable from some consistent start date until a specific subsequent, are usually related to crop growth in the corn life cycle [6,7].
Ambient minimum and maximum temperatures are observed by meteorological station, i.e., the raw data are in the site-specific form.To extract the global AGDDs, two steps should be taken: (1) calculate the mean minimum/maximum temperature of state-level from site-specific observations; and (2) calculate AGDDs from adjusted mean minimum and maximum temperatures.We define daily minimum and maximum temperatures of the th meteorological station at the Day of Year (DOY) as and , respectively.The state-level mean minimum and maximum temperatures at the DOY are defined as and , respectively.In addition, to calculate the AGDDs, adjustment should be performed on and by the rules of corn response to temperature stress.The adjusted and are defined as and , respectively.and are generated by weighted average of all available meteorological stations within the corresponding administrative border.The Thiessen polygon approach [30], a geospatial technique, is applied to graphically weight meteorological station data.In this approach, each station is weighted in direct proportion based on its area of influence in the total area of specified administrative unit.It assumes that any point of temperature condition is equal to that of the nearest station.We take the calculation of as example, and is the same.The over a state is calculated by where is the number of available meteorological stations over the given state, e.g., state of Iowa, 23; is the weight of station , which can be determined by its corresponding influence area of station , i.e., / , where is the administrative area of the given state, and represents the influence area of station that is divided by Thiessen polygon.The methods we used for calculating in this paper are especially applicable and useful to avoid the ambient temperature data, which may be missing from the time series of records in actual practice.
Growth and development in crops is temperature dependent.Development does not occur unless temperatures exceed a lower base temperature , and ceases as temperatures exceed an upper threshold [31].In the United States, the usual low temperature stress of corn or base temperature is 10° [32].In addition, previous studies have shown that corn growth slows at temperatures above 30° [32].Therefore, we use 10° and 30° to adjust and accordingly.That is, if the lowest temperature for a day is below the 10°, then 10° is used as the , and if the highest temperature is over the 30°, then 30° is used as .Start date is set as 1 April.Given the and , and according to the rule of adjustment, the AGDDs at the DOY can be calculated by

Corn Progress Percentages Estimation
As a doubly embedded stochastic process, HMM involves at least two levels of uncertainty: a hidden stochastic process that is not directly observable, but can be observed only through another set of stochastic processes that generate the sequence of observations [21].In our model, the observable variables (multisource features) include mean NDVI, fractal dimension, and AGDDs, while the unobservable (hidden) variables are corn progress stages.In the following sections, we will introduce the HMM briefly, then focus on the estimation of corn progress percentages for any specified time, as soon as the remote sensing and meteorological data are available.

Specifying an HMM
Corn progress can be assumed as a Markov process with hidden stages , … , , and observation sequence , , .In this study, the hidden stages consist of pre-season ( ), planted ( ), emerged ( ), silking ( ), dough ( ), dent ( ), mature ( ), and harvested ( ).The pre-season stage, which represents the period when corn hasn't been planted, is added as the first time interval to facilitate the design of the model.Let where, , 1, … , .
In this study, the sequence , … , is assumed to be a typical Markov chain with a first-order Markov assumption, i.e., stage at can only be decided by stage of previous latent variable and independent of all other stages.We abbreviate as , .In addition, the observation at time can only be determined by its corresponding stage .is abbreviated as .Thus, the probability that mentioned in Equation ( 3) is also equal to

Mixture Model in HMMs
HMM could be understood as combining a Markov chain model with a mixture model [33].Two embedded stochastic processes in the HMM related to two chains: the external chain of observations and the internal chain of hidden stages (Figure 3).It is capable to represent uncertainties on stage determination and on observation [26].In addition, "HMMs viewed as mixture" [34] represents that observation at each time node might be impacted by multiple hidden stages, assuming observation that forms clusters can be modeled as a mixture of components.A single time node corresponds to a mixture distribution with component densities , i.e., each stage of discrete variable represents a different component.The probability of observation is given by where, can be regarded as the weight of the th component, and ∑ 1 .Mixture model provides flexibility and precision in modeling the underlying statistics of corn progress stages.HMM uses discrete hidden stage representations.It is applicable to combine the hidden stage of continuous probability space models and the discrete stage of HMMs to model time series with continuous but nonlinear dynamics.In our case, the continuous observation HMM, the entry of is given by continuous probability density functions, i.e., Gaussian distribution.

NASS's CPRs Normalization
The NASS's CPRs record the progress percentages of each growth stage by the percent complete (area ratio) in the ASD-level or state-level, e.g., percent complete of stage at time noted as , 1, … , ).Ratios represent stages complete, rather than the proportion of each stage occupancy over an administrative unit in current time (Figure 4(a)).Corn phenological stages are unimodal in the life cycle.For a single corn plant, the arrival of , 1 means has already completed.That is, is nested within , e.g., 19% of dough ( ) has completed means at least 19% of silking ( ) had completed already.In our model, the normalized CPRs or stage prior can be straightforward to signify the area ratio of stage occupancy at time for a specific administrative unit.Theoretically, can be calculated from the original recording of NASS's CPRs.It should be noted that some stages have no records of NASS's CPRs data, because stage has not arrived or even passed by.Thus, before the calculation, we need a data filling process for CPRs data.If the data recorded has not reached a certain stage, the value is set to 0, and if the developmental stage has passed, then the value is set to 1.
We assume that each stage can only transform up to itself or its next stage within a week, For example, in Figure 4, emerged ( ) takes at least 9.5 days (bigger than a week) delays after the planted ( ).Then, is calculated by , , For example, in the 31th week (Iowa, 2011) shown in Figure 4(a), stages of dent ( ), dough ( ) and silking ( ) have completed 1%, 19% and 96%, respectively.Based on Equation ( 6), we know that this region has 1%, 18%, 77% and 4% of corn plants at dent, dough silking and emerged ( ) stages, respectively (Figure 4(b)).

HMM Parameters Determination
As mentioned above, an HMM consists of three probabilities: initial probability distribution, stage transition probabilities, and observation probabilities.They can be estimated from archive data with the following processes.

Initial Probability Distribution
The initial probability distribution or stage prior probabilities, which specifies the onset time, characterizes the stage of model if observations are not taken into account [26].To estimate the probability of each stage at the onset of growth season, we need a prior knowledge about preferential months for corn sowing [25].Generally, the 13th week of our study area, is assumed at the pre-season stage for most corn plants.For practical applications, the initial probability can be calculated from the statistics on historical records at the same time slice, e.g., the initial stage probabilities of the 13th week are estimated by averaging of normalized records of NASS's CPRs on all available years at the same week.

Stage Transition Probability Matrix
In conventional HMM, the stage transition probability matrix with is a global parameter, i.e., all weeks along the life cycle share the same transition probability matrix.However, this is not adapted to model corn growth.Transition probabilities should be allowed to vary, similarly to time inhomogeneous Markov chain [35].The time-dependent transition probabilities also can be found in tumor expression profiles [36] and financial time-series data analysis [35].In our case, the matrix varies along corn growth.For example, at the start of the life cycle, corn is likely to be at the initial progress stage, i.e., the transition from current stage to itself is strong, and to its next stage is weak.However, with the time passes by, transition to initial progress stage becomes weak gradually, and then vanished.Generally, the transition variation depends on the biophysical mechanisms and external factors driving corn plant growth.The former depends greatly on characteristics of a particular crop, e.g., breeding [37].The latter conditioned mostly by soil characteristics, elevation, irradiation, temperature, precipitation, and human disturbances as well [26].We consider stage transition probability matrix as a local HMM parameter (time-dependent).The probability , varies when time changes.We assume a life cycle is unimodal, i.e., stage only can transform to itself or its next stage (Figure 5)., can be calculated directly from normalized NASS's CPRs data.Thus, , is calculated by where, ∑ , 1. and determine the position of stages with respect to the time variable .
There are four restrictions in Equation (7).The first three relate to self-transition probability at the end, self-transition probability in the chain, and forward stage change, respectively.For example, if , then all transitions except , and , are zero element., and , respectively correspond to the second and third restrictions, which sum up to 1.
In practical applications, the stage probability transform matrix is determined through a two-step strategy as follows: (1) mean progress percentages are calculated by averaging of normalized recordings on all available years at each time slice; and (2) stage transform probability matrix of each time slice is calculated by Equation (7).The first averaging step will result in a certain degree of errors especially progress stages that are produced by unexpected factors, including climate change, farming practices, and natural disasters.However, the hidden stages in our HMM, which controlled by stage transition probabilities, are regarded as stochastic process and it is able to incorporate uncertainties.

Observation Probability Matrix
The probability of observation being generated in a certain stage is called the observation probability.In this paper, the feature vector is comprised of mean NDVI, fractal dimension, and AGDDs.Observation values continuously change with the effect of phenological alternation.Moreover, there would be multiple progress stages occupied in the same time period of an administrative unit.Thus, we model observation to be a mixture of stages (Figure 3).The mixing weights are determined by the area ratio of each stage occupation, which coincide with the prior probabilities of each stage .Probability density function associated with observations for each administrative unit can be modeled by a multivariate Gaussian distribution.Thus, in Equation ( 5), is a linear superposition of Gaussian distribution, and is parameterized on mean vector and covariance matrix . is given by where, refers to the dimensionality of the observation space, and 3 causing three kinds of features were selected in this study.
In our model, and are global HMM parameters, i.e., they are independent to time.The th component weight (or mixing coefficient) knowns from ground surveying, i.e., only and are unknown.Given an observation sequence , , , we can determine and using maximum likelihood.The log-likelihood function with parameter space , is given by and it can be estimated by Expectation Maximization (EM) [38] iterative algorithm with E-step (expectation) and M-step (maximization).EM starts with initial values for the parameters and , and iteratively performs these two steps until convergence to a local maximum of the likelihood function.In the 1 th iteration, the and are calculated by where In our case, the observation probabilities are composed by eight Gaussian distribution components.We approximately assume / ∑ to initialize the with Equation ( 10) and with Equation ( 11), then iteratively perform EM algorithm until convergence to a local maximum.Meanwhile, we record the corresponding mean vector and covariance matrix during the last iteration.

Progress Percents Estimation
The goal of this paper is to determine the area proportion of each progress stage.This problem can be regarded as computing the posterior over the hidden stages at each time , given HMM parameter , and all available observations up to the current time, e.g., | , … , .This is an online process (real-time), and can be solved by filtering based algorithms.We should emphasize that filtering, smoothing (offline), and prediction problems all compute the probability of hidden stages for given observations, e.g., | , … , .More specifically, the difference is the smoothing problem compute by , the filtering , and prediction .We note , , … , as , which represents the probability of all the observation up to time and the stage at time is , then Generally, the forward algorithm is directly adapted to calculate for the observation sequence of increasing interval .Then, can be obtained recursively according to with the initial forward probabilities as the joint probability of state and initial observation After estimating | , … , which also represents area ratio of stage occupancy at time for a specific administrative unit, an inverse normalize transfer process should be deployed.We note as the progress percent of stage at time .Then it can be calculated by where, 2, … , .

Results and Discussions
By constructing a general HMM framework for corn progress stages estimation with multisource features, seven key corn progress stages and its percentages can be estimated in the real-time.Experiments have been conducted on the states of Iowa, Illinois, and Nebraska of the United States.A decadal data (2002 throughout 2011) during the corn growing seasons is selected for this study.The 13th week is set as the start date.

RMSE Results
The accuracy of the experiments is evaluated by Root Mean Squared Error (RMSE) [39], which measures the difference between estimated and observed values.The values in the following are expressed as a percentage, because the estimation results in corn progress percentages.Lower values indicate less residual variance.The evaluation covers the whole corn life cycle.The pre-season stage is not included in the error evaluation, because it is only defined to facilitate our model.Figure 6 shows the RMSE of all seven NASS/USDA defined corn progress stages in states of Iowa, Illinois, and Nebraska, respectively.All the results are the average of 100 runs.In each run, 7 year data are randomly selected for training, and remaining 3 year are used for testing.Results reported in error bar are significantly better, with confidence level 95%.
By analyzing errors in the processing of stage percentages determination (Figure 6), we find that the RMSE increases gradually, and reaches the first greater maximum around the 20th week.The corresponding RMSE is 18.29% (Iowa), 23.71% (Illinois), and 16.82% (Nebraska).This is likely caused by the uncontrollable planting practices, e.g., different planting speed in different year.Then, the RMSE decreases gradually until the 25th week.By referring to Figure 4, we find that in this period the proportion of emerged stage increases gradually, and only emerged left around the 25th week.In the week, results are less affected by overlaps of stages.After the emerged stage, the RMSE reaches another maximum around the 28th week, and then fluctuates around 18.0% (Iowa), 18.5% (Illinois), and 16.5% (Nebraska) until the 43th week.This is likely caused by progress stage overlaps, e.g., harvest stage is overlapped with dented and mature stages.During this period, results will be more affected by model errors, which are generated in the parameters estimation of state transition probabilities and observation probabilities.In addition, the errors inherited from original data have also impacted the accuracy of estimation.

Accuracy Comparison
Although it is not the same case on the estimation of corn progress percentages, we try to find a comparable case to compare our method with spectral pixel-wise based method.Yu et al. [40] developed kernel-based methods to estimate the corn phenological stages that defined by NASS/USDA.The base kernel, which is determined from modeling annual NDVI profiles of previous years in the pixel-wise, is tolerant to noisy data and missing data.Comparison in different combinations on threshold (global or local), masking (percentage of pure pixels, e.g., 90% and 100%), and filter algorithms (e.g., quintic polynomial, double Sigmoid, Savitzky-Golay, and Spline) have been conducted.The test is conducted in the state of Iowa in year 2006.The study also gives the RMSE of modeled results against NASS's CPRs for the whole year.The lowest RMSE is 24.6%, corresponding to the combination of local threshold, pure pixels and Spline-based smoothing method.
To perform better comparison with the results that presented in [40], we convert our weekly RMSE into whole year RMSE.Our result shows that the RMSE is 13.27% (Iowa), 16.14% (Illinois), and 12.91% (Nebraska).The comparison results indicate that our method is better than spectral pixel-wise based methods on the estimation of corn progress to some extent.

Performance and Analysis
One of the advantages of this method is that we can estimate the proportion of each corn progress stage in a real-time through the established model and the currently acquired data.The estimated results can correspond to NASS's CPRs directly.The following factors can significantly impact the results.
(1) The accuracy of NASS's CPRs.The NASS's CPRs are surveyed data, and mainly depended on the subjective assessment of investigators.Thus, a bias error is inevitably introduced in the NASS's CPRs data [41]; (2) The quality of MODIS NDVI.Noise has inevitably disturbed the daily MODIS-NDVI images, e.g., cloud cover, missing data, mixed pixels, or some of the systematic errors that reduce the index value of daily MODIS-NDVI images; (3) The reliability of meteorological data, regarding to the observation data of weather stations, data missing, instrumentation, or observation station location change may affect the data homogeneity and spatial coverage; (4) Irregularities in raining and temperature pattern in different years, e.g., extensive drought occurs in a particular year, can significantly affect the stability of results.It would specially impacted on HMM parameters training, e.g., the stage transition probability matrix.(5) The insufficiency of temporal resolution.The temporal resolution of data is an important factor that affects the accuracy of corn progress stages estimation.As shown in Figure 4, the emerged stage just 9.5 days delays to planted stage, and dent stage approximate 15.4 days delays to dough stage.Accurate distinction between these growth stages requires a higher temporal resolution.It is really intractable that we have to trade off temporal resolution and data quality.
There are many suggested ways, which would improve the accuracy of the results.One believes that although an image compositing process has been conducted and contributed to eliminate noises in this paper, more reliable quality control for original remote sensing data will suppress noises and improve the accuracy of the corn progress percentage estimation.Another potential solution is to use high-order HMMs.High-order stage transition dependency would result in good modeling of stage duration [42].Mari et al. [43] carried out a comparative study between first-and second-order HMMs on automatic word recognition.Seifert et al. [44,45] utilized high-order HMM to improve modeling of spatial dependencies between chromosomal regions.Derrode et al. [46] introduced a high-order hidden Markov chain for unsupervised SAR image segmentation, which allows one to take into account more complex and correlated noise.Study on the applicability of high-order HMM for estimating corn progress stages is needed to further determine.

Conclusion
Remote sensing and meteorological data have been separately employed for detecting crop progress stages in most recent studies.In this paper, we have performed the integration of multisource data for retrieving corn progress metrics.Three features in the state-level have been chosen, including mean NDVI, fractal dimension, and AGDDs.The mean NDVI and fractal dimension are extracted from MODIS-NDVI, while the AGDDs derived from meteorological data.It is worth mentioning that the fractal dimension, which indicates the spatial pattern of remote sensing image, is used to measure the changes of corn crop along the life cycle.In order to estimate corn progress stages in the real-time, and directly relate to ground survey data, e.g., NASS's CPRs, an HMM-based method has been proposed.The multisource features are considered as the input to the model, and no additional pre-process is conducted, e.g., smoothing, and filtering.It is also worth mentioning that the developed model is different from conventional HMM models in several aspects: (1) The stage transition probability matrix has been considered as a local HMM parameter, which is reasonable for modeling the growth of corn; (2) Because several stages may jointly affect the observation in the state-level at each time node, the observation probability matrix has been constructed with a mixture model, i.e., observation at each time node is viewed as the mixture of stages with Gaussian distribution.The modified HMM is suitable for estimating the corn progress percentages in the real-time.Experimental studies have been conducted in the states of Iowa, Illinois, and Nebraska of the United States.Comparisons between our method and a series of VIs time series based methods also have been implemented.The results demonstrate that the proposed method performed well on the real-time estimation of corn progress stages.The corn progress percentages can be estimated with accuracies of ±12.91%-16.14%,which is better than the results of spectral pixel-wise based methods (±24.6%).Although the described examples were performed on corn crop and the state-level data sets, the proposed method is also applicable for the real-time estimation of progress stages on other types of crop in multiple county-level.

Figure 1 .
Figure 1.Illustration of study area and selected meteorological stations.The study area covers three states of the United States: Iowa, Illinois and Nebraska.Stations are marked as circle dots, and colors are labeled for different states.The number of meteorological stations of Iowa (blue dots), Illinois (green dots), and Nebraska (red dots) is 23, 33, and 37, respectively.

( 1 )
Daily NDVI time series, which is derived from the atmospherically corrected MODIS MOD09GQ (MODIS Surface Reflectance Daily L2G Global 250 m) dataset with 250 m spatial resolution.This data set is publicly available through the "Vegetation Condition Explorer" (http://dss.csiss.gmu.edu/NDVIDownload/),maintained by the Center for Spatial Information Science and Systems (CSISS), George Mason University.(2) NASS's Cropland Data Layer (CDL), which is a raster, geo-referenced crop-specific land use data layer.The spatial resolution of years 2006-2009 is 56 m, and the rest is 30 m.The data set is publicly available via "CropScape" (http://nassgeodata.gmu.edu/),produced operationally by USDA/NASS.(3) NASS's CPRs, which record the percent complete (area ratio) of crop fields that has either reached or completed a specific progress stage over a specific administrative unit.It is publicly available via NASS's "Quick Stats 2.0" service (http://www.nass.usda.gov/Quick_Stats/).More details of the first three data sets can refer to [19].(4) Daily minimum and maximum temperatures, which are derived from the United States

Figure 5 .
Figure 5. Illustration of corn progress stage transition along a life cycle.