Determining Rice Growth Stage with X-Band SAR: A Metamodel Based Inversion

Rice crops are important in the global food economy, and new techniques are being implemented for their effective management. These techniques rely mainly on the changes in the phenological cycle, which can be investigated by remote sensing systems. High frequency and high spatial resolution Synthetic Aperture Radar (SAR) sensors have great potential in all-weather conditions for detecting temporal phenological changes. This study focuses on a novel approach for growth stage determination of rice fields from SAR data using a parameter space search algorithm. The method employs an inversion scheme for a morphology-based electromagnetic backscattering model. Since such a morphology-based model is complicated and computationally expensive, a surrogate metamodel-based inversion algorithm is proposed for the growth stage estimation. The approach is designed to provide estimates of crop morphology and corresponding growth stage from a continuous growth scale. The accuracy of the proposed method is tested with ground measurements from Turkey and Spain using the images acquired by the TerraSAR-X (TSX) sensor during a full growth cycle of rice crops. The analysis shows good agreement for both datasets. The results of the proposed method emphasize the effectiveness of X-band PolSAR data for morphology-based growth stage determination of rice crops.


Introduction
Temperate climatic conditions with easy access to water sources provide optimum conditions for rice cultivation.In the history of agricultural practices, rice farming goes back to almost 8000 BC.Currently, rice is a major source of income for the rural communities all around the world.According to the International Rice Research Institute (IRRI), worldwide rice production totaled 969 million tons in 2010 [1].Frequent and efficient monitoring strategies are necessary for the optimization of economic competitiveness and the estimation of the associated environmental impacts (e.g., methane emissions).Concerning these issues, researchers have focused on finding sustainable monitoring methods for agricultural fields.
In the last decade, remote sensing techniques have frequently been used as a viable method to monitor agricultural areas [2,3].To this end, two main data sources are presented: optical and Synthetic Aperture Radar (SAR) systems.Optical systems measure reflected sunlight and provide spectral properties of their targets.Additionally, due to short wavelengths (λ < 2500 nm), they are mainly susceptible to atmospheric factors.On the other hand, SAR systems are superior compared to optical systems concerning their temporal coverage with their "all-weather" day and night imaging capability.Furthermore, they play a significant role in environmental monitoring with their sensitivity to the physical alterations in monitored objects.However, to provide a clear explanation of such changes, one has to understand the relation between the object and the backscattered energy in SAR systems.This relation can be a complex interaction between the object and the sensor parameters including frequency, polarization setting and incidence angle.
The selection of appropriate sensor parameters is crucial for agricultural monitoring.For instance, with SAR systems, one should match the size of the structural parts of the crops with the available wavelength (frequency) of the system to identify the effects of morphological changes.Furthermore, the use of different frequencies and incidence angles changes the scattering behavior and the attenuation of the waves inside the canopy [4].Thus, in crop monitoring, high-frequency ( f > 5 GHz), in other words low wavelength (λ < 6 cm), systems are expected to be more sensitive to morphological changes than low-frequency systems.
Several studies monitoring paddy rice fields show that plant morphological structures have a high correlation with the backscattering behavior of high frequency electromagnetic (EM) waves [5][6][7][8][9][10][11].There are currently two mainstream approaches to monitor crops, namely backward and forward.The first class of approaches is based on the statistics of the polarimetric parameters [8,[12][13][14][15][16][17][18][19][20][21].Among these, Inoue, Y. et al. [8,15] have completed the most comprehensive work to date by observing the full phenological cycle with different frequencies, polarizations and incidence angles.Such methods also depend on several factors including temporal variations in the structural density, type of crop or the use of seeds with different genotypes.In the literature, similar temporal trends have been observed for polarimetric descriptors during the phenological cycle, such as intensity, entropy, alpha and phase differences [14,15].The statistical classification methods in the literature are cost-effective and easy to implement.However, the high dynamic range of the polarimetric parameters makes it not trivial to develop widely applicable approaches that cover the full growth period in different locations.Consequently, current thresholding-based statistical methods need to be adjusted for each new monitoring campaign depending on the region of the world in which they take place.Additionally, they usually require a full-time series for each new campaign.Therefore, most of the current statistical methods are not capable of explaining the growth cycle of rice crops in different areas while avoiding high training costs.
Unlike the statistical ones, the second class of approaches is based on the scattering behavior of an EM wave inside the canopy.This behavior can be explained by analytical relations between the crop physical structure (i.e., morphology) and the backscattering coefficients [22][23][24][25][26].Such backscattering models take the plant morphology as input and provide the scattering properties of the EM wave (i.e., backscattering coefficients) as output [27].Since they consider the geometrically-simplified crop morphology, they usually have sophisticated mathematical algorithms.With such a level of complexity, knowing the sensitivity of the model outputs on the input parameters is essential to understand the dynamics of the model.Global Sensitivity Analysis (GSA) provides the necessary quantitative tools to assess it properly [28,29].However, the implementation of GSA can result in high computational costs due to the large number of model evaluations needed in standard sampling-based approaches [28].An effective strategy to significantly reduce this computational burden is by using Polynomial Chaos Expansion (PCE), a surrogate modeling technique that has exceptional convergence properties for the estimation of Sobol indices for complex models [29].It reduces the total computational cost to that of its training set, which is typically relatively small.Therefore, PCE makes GSA computationally efficient.Finally, after the assessment of the importance of model parameters, it is possible to develop an inversion scheme for the crop morphology from polarimetric observations.Despite their higher complexity (mathematical and computational), inverse methods have the potential to provide a much deeper insight into the actual growth stage of a crop field, as they take into account the quantitative interaction between the plants and the EM field.Additionally, they can provide a continuous estimate of the growth stage, because they are directly sensitive to the underlying plant morphology.
An incoherent EM backscattering model [22] is chosen for this study.The model considers a simplified plant morphology with a higher number of unknowns than the number of measurables.For such an ill-posed problem, an analytical inversion approach is infeasible.Furthermore, the presence of speckle noise in the measured intensity data makes pixel-sized inversion algorithms less efficient.In this paper, a parameter search space algorithm combined with a PCE surrogate of the full EM model is considered as a powerful option to handle these issues for the model inversion scheme.
This article proposes a new and effective method to determine the growth stage of flooded and broadcast-sown rice fields in large-scale cultivation areas using Polarimetric SAR (PolSAR) data.Apart from the previously-mentioned growth phase determination methods for the rice fields, the proposed method focuses on the effect of the changes in crop morphology on the polarimetric backscattering intensities during the phenological cycle.It extends the a priori growth phase information with an EM backscattering model in a computationally-efficient framework.The EM model inversion consists of a PCE-based parameter space search algorithm.Finally, the results are combined to determine the growth stage of the field from its morphology.
The paper provides the theory behind the proposed inversion approach in detail along with concise information about PCE metamodels and GSA in Section 2. Section 3 covers the test areas with the ground measurements and TerraSAR-X (TSX) campaigns.Section 4 presents the main results of GSA and growth stage estimation.Section 5 summarizes the work with an overall view on the proposed growth stage estimation approach in the context of precise agriculture.

Growth Stage Determination
Understanding the growth phases of the rice growth cycle is crucial in explaining their effect on the SAR system responses.The most common rice cultivation practice begins by flooding the fields several weeks before sowing.There are two main planting methods: transplanting and broadcasting.In transplanting, the seedlings are prepared and then transplanted to the fields to provide regular spacing between the plants.On the contrary, with broadcast sowing, the seeds are thrown on the flooded fields, resulting in spatial morphological heterogeneity [1].
Three significant growth periods can be identified in rice cultivation: vegetative, reproductive and maturative.The full cycle takes between 120-150 days depending on agricultural and environmental factors.In the literature, the rice growth cycle is defined by two distinct scales: International Rice Research Institute (IRRI) [1] and Biologische Bundesanstalt, Bundessortenamt und CHemische Industrie (BBCH) [30].The IRRI scale divides the growth cycle into five phases, while the BBCH scale uses 100 stages between 0 and 99.A general overview of the growth cycle is presented in Figure 1 with sample morphologies.In this research, the IRRI scale is chosen as the a priori growth phase information.

•
Vegetative period: The crops increase in height and structural density, depending on several factors such as soil properties, temperature and seeding density.The stalk orientation stays mostly vertical.Since the plants are structurally weak, the duration of this period strongly depends on the environmental conditions and the genotype of the crops.

•
Reproductive period: As the plants become stronger, they become less sensitive to the environmental stresses.Plant height and density continue to increase heterogeneously together with increasing wet biomass, which leads to varying orientation in leaves and stalks due to increasing weight.The flag leaf forms through the end of the period.

•
Maturative period: Excess water in the fields is drained, leading to a reduction in plant total biomass due to lower moisture content.Grains become more mature and heavier.
In this study, the growth stages of the fields are determined using the approach shown in Figure 2.This method uses three different inputs: PolSAR data, growth information and the backscattering (EM) model.PolSAR data are used in two steps of the algorithm: feature clustering of polarimetric parameters for BBCH assignment and the parameter space search algorithm.Phenological data with a priori growth phase information are used for determining the growth boundaries and trends and training a first PCE metamodel that predicts the BBCH scale based on the available morphological parameter (PCE BBCH ).The backscattering model is then surrogated by an additional PCE metamodel for each of the various growth phases (PCE EM ).Later, the outputs of the feature clustering, PCE EM and growth trends as natural limitations are integrated into the parameter search space algorithm.Finally, the growth stages are estimated using clustered PolSAR data and the PCE BBCH .

Backscattering Model
In this study, the canopy is modeled as uniformly-distributed individual plants over a half space that represents the flooded ground used in broadcast seeding.The backscattering coefficients are estimated using the first order solution of the radiative transfer equation.The chosen model provides a structural description of the plants including their simplified crop morphology (e.g., stalks, tillers, leaves and panicles).The model also includes the backscattering enhancements and resulting wave clustering effects from the scatterers [22].The resulting backscattering intensities are estimated for different polarimetric channels by performing Monte Carlo (MC) simulations using Foldy-Lax multiple scattering equations [31].
In the simulations, rice plants with vertically-oriented stalks are placed randomly in a unit area A, as in broadcast seeding.The locations of the plants are randomized automatically in each iteration of the MC simulation to provide spatial heterogeneity.Inside A, there are n s plants with n t tillers with average height h s and diameter d s .Each tiller has n l leaves with length l l and width w l and n p panicles with length l p and width w p .The complex dielectric constants are s,l for all plant structures (e.g., tillers, leaves and panicles) and g for the underlying ground.
The current first order solution of the electromagnetic scattering problem considers four major scattering mechanisms (P n ), visualized in Figure 3: 1. Direct scattering from the scatterers 2. Scattering from the canopy followed by reflection from the ground 3. Reflection from the ground followed by scattering from the canopy 4. Reflection from the ground followed by scattering from the canopy, again followed by reflection from the ground: The top surface of the canopy is indicated as z = 0 and the underlying surface as z = −h.It is possible to model the behavior of an incident wave Ēi in the direction (θ i , φ i ), of the incidence and look angle, using (1).The model follows the finite cylinder approximation [32,33] for stems, tillers and panicles and the physical optics approximation [34] for leaves.Additionally, the model variables are listed as: type of the morphological structure, t; scattering matrix element where q and p are polarization channels (q, p for H,V) for scattered and incident waves, f t qp ; propagation vector of the incident and scattered wave, ki,s p ; Fresnel reflection coefficients, R p (θ) and R q (θ).Lastly, the effect of the attenuation due to mixed structural scatterers inside the canopy is considered by the M qp term: where the angular brackets represent configurational average, h is the height of the canopy and k 0 is the free space wave number.Structural location vectors are given as: The backscattering coefficients for the polarimetric channel, qp, are estimated from the ratio between the amplitudes of the scattered and incident electrical waves (3).
where A i is the illuminated area and r is the distance between the sensor and the target.For the MC simulation, the backscattering coefficients are averaged over 200 realizations.

Polynomial Chaos Expansion and Global Sensitivity Analysis
Sparse polynomial chaos expansions (PCE) are a well-known technique in the uncertainty quantification literature, and they are well suited to inversion problems.Compared to other surrogate modeling techniques such as Gaussian process modeling (also known as kriging, [35]) or support vector regression [36], they are particularly well suited for the solution of inverse problems.Indeed, their global approximation character combined with the strict relation they share with Sobol variance decomposition [29,37], as well as and their built-in error estimators can be directly applied to assess the identifiability of parameters in inverse problems.

Polynomial Chaos Expansion
To reduce the high costs associated with the MC simulation of morphology-based scattering, the computational model can be substituted with a metamodel, a computationally inexpensive analytical approximation of the full computational model.Due to its versatility and relatively low training costs, sparse PCE [38] is an ideal candidate.PCE is a spectral decomposition technique that allows one to represent a finite-variance scalar-output function Y = M(ξ) as: where ξ ∈ R M is the random vector of morphological parameters, a j ∈ R is a set of scalar coefficients and the Ψ j (ξ) ∈ R form a polynomial orthonormal basis with respect to the functional scalar product (expectation value): where D ξ is the support of ξ and f ξ (ξ) the Probability Density Function (PDF) of the input random vector ξ.Due to the linearity of Equation ( 4), the a j coefficients can be non-intrusively and efficiently calculated using compressive-sensing-based least-square minimization techniques (e.g., least angle regression-based selection [38]) from a training set of full model evaluations of M(ξ).The size of the training set determines the maximal complexity and the accuracy of the resulting metamodel.PCE was implemented in MATLAB ® within the UQLab framework [37,39].

Global Sensitivity Analysis: Sobol Indices
GSA allows one to quantify the effect of the variability of each of the input parameters in ξ on the variability of the model response M(ξ).A widely-accepted global sensitivity measure in the uncertainty quantification literature is given by the variance-decomposition-based Sobol indices [28].
The basic form of variance decomposition consists in representing a computational model M(ξ) as a sum of functions depending only on increasingly larger subsets of the input vector ξ as follows: where the M ij...s are scalar functions depending on the subset of input variables {ξ i , ξ j , ..., ξ s }.
In Sobol [28], it is demonstrated that such a decomposition exists for every finite-variance functional and that it is orthonormal, hence yielding unique coefficients.Sobol indices are defined as the ratio of the variance of each term D ij...s in Equation ( 6) to the total variance D: It is demonstrated that a close relation exists between variance decomposition and PCE coefficients, which allows for the calculation of Sobol indices directly from the PCE coefficients a j without the need for additional sampling [29].Therefore, the total costs of the GSA reduce to the calculation of the PCE training set.

Feature Clustering for BBCH Assignment
Agricultural fields are known to have spatial morphological heterogeneity due to growth competition.This condition is observed mostly in fields with broadcast seeding practices.Therefore, this structural heterogeneity is also expected for all growth stages at any time t in a field.The definition of the BBCH scale takes this heterogeneity into account and states that BBCH growth stage assignment must be done on the dominant morphology within the field [30].In other words, the assigned BBCH value of a field has to represent at least 50% of all crops.Due to the same growth stage and similar morphology, crops are expected to have similar polarimetric scattering behaviors.Thus, to provide this requirement for the BBCH value assignment, the PolSAR data are clustered to obtain the smallest group with 50% of the samples using the well-known K-means algorithm in the space of statistically independent PolSAR parameters (i.e., σ o HH , σ o VV and ρ).Details about the clustering methodology are given in [40].

Parameter Space Search Algorithm
The proposed solution is designed as a constrained optimization problem by considering the ill-posed condition of the scattering model.In the literature, there are two different ways to handle similar optimization problems: deterministic and distribution-based approaches.The former approach converges to a single optimum value [41], whereas the latter converges to a distribution of values [42].In this case, because SAR data have high variance, mainly due to speckle noise, a method that converges to a single intensity value would be ineffective.Especially for deterministic methods, the presence of variance reduces the rate of convergence and increases the degree of classification error.On the other hand, distribution-based approaches are capable of handling problems with different levels of variance.The proposed parameter search space algorithm links the a priori IRRI growth phase (i.e., phase S) and the backscattering model [22].The proposed parameter space search approach follows the flow scheme given in Figure 4.The method starts with the simulation of the parameter space using the growth-phase-specific PCE EM .At this step, the growth phase, S, also determines the parametric range (min-max) of the morphological descriptors.The corresponding parameter space, P S , can be visualized as a hyper-grid Equation (8).For any S, the coordinate of a single point in the grid has the information to define a rice canopy with morphology and structural density.However, the biologically-impossible structures present in the data cloud of the ground measurement database need to be eliminated.For this purpose, the P S is constrained using the convex-hull method based on the morphological growth information, which was collected from the literature and ground campaigns.
After preparing the P S , the PCE EM is used to simulate P S and obtain the corresponding observable spaces for each polarimetric channel O S qp , including the backscattering intensities.The fitness function of the distribution-based optimization problem in a constrained parameter space is given by: In Equation ( 9), the σ qp and E(•) represent the measured backscattering intensity and the expected value of the difference between measured and observable SAR intensities, respectively.In the proposed method, there are two constraints on the P S , which explain the relation between P S and O S qp .
1. Backscattering intensity: Each O S qp covers a wide range of intensities based on the corresponding morphologies in the P S .However, the intensity values obtained from the SAR data only cover a small range of O S qp .In order to consider the spatial heterogeneity of the field, the mean (µ σ qp ) and standard deviation (ν σ qp ) of the measured backscattering intensities are calculated for each data cluster of polarimetric channels.Each O S qp is then bounded according to the intensity constraints given byEquation (10).
The confined observable space C qp has the same dimensionality as O S qp , but fewer samples.The link between the O S qp and P S is used to select the corresponding morphologies from the P S for each polarimetric channel to obtain the constrained parameter spaces, B qp .Thus, the morphological structures that are included in each B qp have a similar σ qp with respect to the measured SAR intensity values.2. Morphological consistency: In PolSAR data, a particular intensity value may correspond to different physical structures.This constraint resolves the ambiguity by taking the intersection of all B qp sets of each N polarimetric channels as seen in Equation (11).
The resulting set I includes the multidimensional parameter distributions for the nine inputs in Equation (8).The morphology vectors are kept intact to preserve the plant morphologies for the last step of the analysis.

Assignment of Growth Stages by PCE BBCH
The last step of the proposed approach considers the sample distribution of resulting morphologies that are included in set I. Since the BBCH stage is not physically measurable and strongly subjective, there is a need for a relation between morphological parameters and the BBCH stage.This link has a complex and non-linear behavior.Moreover, subjective decision criteria of the BBCH scale lead to a high degree of variation due to the variation in biophysical parameters.Therefore, PCE BBCH is trained by taking samples from the morphological measurements as input, X, on the corresponding BBCH stages as output, Y.This metamodel is then used to estimate the BBCH stages (BBCH est ) for the set of I. Finally, the growth stage of the field is determined by calculating the mode of the distribution of BBCH est .

Test Area and Ground Measurements
This study was carried out in two independent rice cultivation sites located in Spain and Turkey.Figure 5 shows the location of the fields.Both sites are sowed by the broadcast technique over the flooded ground.Figure 6 summarizes the timeline of the ground measurements and SAR acquisitions for both datasets with IRRI phases.

Isla Major, Spain
The site is located in the Isla Major region, South of Seville, centered at 37°7 53 N and 6°19 32 E. The region has a flat topography and covers an area with an average radius of 20 km.The ground campaign was conducted in 2009 to measure the phenological stage, canopy height, plant and tiller density for the whole cultivation period (May-October).Figure 5a presents the location of the test area and the fields that were chosen for the ground measurements.

Ipsala, Turkey
The site is located in the Thrace region, North West of Istanbul, centered at 40°47 59 N and 26°13 14 E. The region has a flat topography and covers an area with an average radius of 15 km.The ground campaign was conducted in 2014 to measure the phenological stage, morphological parameters (stalk diameter and length, leaf width and length), plant, tiller and leaf density for the whole cultivation period (May-September).Figure 5b presents the location of the test area and the fields that were chosen for ground measurements.

SAR Dataset
In this study, data from the TerraSAR-X (TSX) mission are used.It operates at a central frequency of 9.65 GHz with a wavelength of 31 mm.As an advantage to the other systems, TSX allows frequent monitoring of environmental changes with a temporal resolution of 11 days.Therefore, it is one of the best options on the market for agriculture monitoring purposes.
All data were acquired in descending strip map mode and processed by the German Aerospace Center to the standard product Level 1b, i.e., single look complex (SLC) data (16-bit) with a n~2-m pixel-size.Later, the data were co-registered by using bi-linear interpolation with an average root mean squared (RMS) accuracy of 0.1 pixels.Before the analysis, multi-looking was applied on the data with a boxcar of 11 × 11 pixels to reduce the speckle noise.Figure 6 presents the acquisition plan of the dual polarization (HH and VV) SAR data with a central incidence angle of 31°for both test sites.

Results and Discussion
In this paper, a stack of HH/VV dual-polarization descending TSX images over rice fields located in Spain and Turkey was employed to check the effectiveness of the proposed methodology.This section presents the GSA analysis of the EM backscattering model and the accuracy assessments of the growth stage determination algorithm.Since the algorithm has a stepwise scheme, the accuracy analysis is provided separately for each step.The discussions about the analysis outcomes are given in their specific sub-sections.
Figure 7 presents the boundaries of six crop morphology parameters from the Ipsala 2014 campaign with their quantiles and max-min values for each growth phase, S. For the P S , boundaries are further extended by 5% as a safety factor to consider morphological anomalies, such as over-or under-growth conditions.

Accuracy Assessment: Backscattering Model
In this study, the theoretical backscattering model estimates the HH and VV backscattering intensities of the rice canopies, at a central frequency of 9.65 GHz and at an average incidence angle of 31°to be consistent with the TSX beam.The effect of the variation in the incidence and look angle during the acquisitions is assumed as constant along the scene.
Before applying the proposed inversion-based classification, GSA was used to identify the parameters that most affect the backscattering coefficients; this step is known as model-reduction.During this step, the sensitivity of the backscattering coefficients to environmental parameters such as plant and ground dielectric values has been evaluated.The results are reported in Table 1.The variability in both the real and imaginary parts of the dielectric constants within the given boundaries ((22.0 + 6.0i∼30.0+ 10i) for the canopy and (60.0 + 15i∼80.0+ 25.0i) for the ground) has a significantly low impact on the model response.Indeed, the corresponding Sobol indices are much lower than, e.g., those of the stalk height parameter.Therefore, the dielectric constants were kept constant during the analysis by setting them to values based on [26].Table 2 summarizes the values of the parameters assumed to be constant during the simulations of the ground measured data for the estimation of backscattering intensities.The reported values of the parameters are determined either from the SAR settings or the existing literature [22].Constant parameters can represent the average properties of a rice canopy compared to real environmental conditions.Figure 8 shows the correlation between the results of the theoretical morphology-based backscattering model from the simulation of the Ipsala ground measurements from 2014 and the acquired TSX data in dB.The results as the mean and standard deviation of the estimated values are grouped into three available growth phases in two polarimetric channels, i.e., HH and VV.In the corresponding figure, each growth phase is represented by a different color and symbol.Good agreement is obtained between the SAR measurements and the model simulations for both polarimetric channels.There is clearly a strong correlation between measured and estimated backscattering intensity values for both polarimetric channels.For the full dataset, the 2D coefficient of determination (R 2 ) and the root mean square error (RMSE) are calculated to be 87.1% and 2.02 dB for the HH channel and 84.6% and 1.91 dB for the VV channel.Additionally, when the growth phases are considered separately, for the HH channel, the RMSE values of the first three stages are calculated as 2.69, 1.83 and 1.96 dB, respectively.For the VV channel, they are calculated as 2.21, 1.91 and 1.58.The implementation of the backscattering model does not include the underlying surface nor the morphological 3D orientation information.The underlying surface is assumed to be water all through the cultivation cycle with high dielectric constant.The 3D orientation of the morphological components is neglected because the model has been shown to provide reasonable accuracy even without their inclusion (see Figure 8).Including the rotational parameters for each plant would result in additional scatter of the EM response, comparable to the effect of the stochastic placement of the plants in the area described in Section 2.1.

PCE EM and Global Sensitivity Analysis
A PCE EM is generated by preparing a sample of size 2000 of the full model for each of the five growth stages identified previously, hence at a total training cost of 10,000 model evaluations.Note that this is a one-time cost: after the PCE EM is trained, no new full model evaluations are necessary to evaluate the PCE EM on new sets of morphological parameters.The PCE EM surrogates the mean and the variance of the forward backscattering model of the MC simulations in all polarimetric channels.
Regarding computational costs, the evaluation of the PCE EM is comparatively inexpensive.In other words, while the original implementation of the backscattering model required approximately 22 h to calculate the response to 2000 simulations on a computer with 24 GB RAM and 8 cores, the PCE EM needed only 0.04 s with a single core on the same hardware.Therefore, such an improvement allows increasing the size of the parameter and the observable spaces significantly, which in turn enhances the variation in the crop morphology input vectors.
Figure 9 shows the results of the accuracy and GSA of the PCE EM .The figure is structured as an array with the first two rows visualizing the accuracy analysis and the last row visualizing the GSA results.Each column corresponds to a growth phase with all chosen crop morphological parameters.Accuracy analysis of the PCE EM is given with a corresponding polynomial degree (P.Deg.),R 2 , RMSE and Leave-One-Out (LOO) error [38].The results are discussed below in detail for each growth phase.Early vegetative: For both polarimetric channels, the stage-specific PCE EM can approximate the backscattering coefficients perfectly.For HH and VV channels, the R 2 values are calculated to be 99.4% and 98.3%, respectively.The estimated RMSE values are 0.25 dB and 0.34 dB for HH and VV channels, respectively.The GSA of the theoretical model shows that stalk height is the primary source of the variance in the model output.Besides, the sensitivity to the variation in stalk diameter is observed to be stronger in the HH channel.Late vegetative: During this stage, the significant growth in the plants increases the dynamic range of the intensity values in both polarimetric channels.This variance is also detected in the stage-specific PCE EM outputs.The results of the accuracy analysis show that R 2 and RMSE values are calculated to be 89.2% and 1.78 dB for HH and 83.5% and 1.93 dB for the VV channel.GSA shows that the major source of the variance in the model output originates from the stalk height, stalk diameters and the number of tillers.In addition, the HH channel is slightly more sensitive to stalk density compared to the VV channel.Early reproductive: As the plant enters this phase, head leaves and panicles are observed.The accuracy assessment of the PCE EM reports the R 2 and RMSE for the HH channel as 89.1% and 0.94 dB and the VV channel as 80.0% and 0.97 dB.Concerning GSA, the model is observed to be sensitive to stalk height in both polarimetric channels.Additionally, the HH channel is sensitive to the changes in the number of tillers.On the other hand, the VV channel is found to be sensitive to the variation in panicle width and number of panicles.Late reproductive: For each polarimetric channel, the accuracy assessment of the stage-specific PCE EM provides R 2 and RMSE values as 81.9% and 0.95 dB for the HH channel and 89.9% and 0.56 dB for the VV channel.For the GSA, the source of the model variability is related to the changes in stalk height for both polarimetric channels.Furthermore, the number of tillers and the number of panicles are other sources of variability for the HH and VV channels, respectively.Maturative: During the last stage of the growth cycle, the accuracy of the stage-specific PCE EM is estimated for R 2 and RMSE values as 84.4% and 0.96 dB for HH and 89.8% and 0.58 dB for the VV channel.Moreover, the sources of the variation in the model outputs are found to be stalk height for HH and VV and the number of tillers for HH.
To sum up, growth-phase-specific PCE EM can estimate the outputs of the theoretical backscattering model with high accuracy.Minimum R 2 and maximum RMSE values for the full cycle are calculated to be 80.0% and 1.98 dB.Therefore, the replacement of the backscattering model with the surrogate PCE EM is acceptable.While the highest accuracy is observed in the early vegetative stage, the lowest accuracy is in the late vegetative stage due to ranges of corresponding parameter spaces as shown in Figure 7. GSA shows that throughout the growth cycle, model outputs in both polarimetric channels (HH and VV) are most sensitive to the stalk height.However, for the HH channel, the number of tillers and stalks become important starting from the late vegetative phase due to their effect in increasing the attenuation inside the canopy.

Structures of the Parameter and Observable Spaces
The proposed approach follows a search algorithm that depends on two multi-dimensional spaces, parameter P and observable O qp .The first space is built up based on the morphological parameters as a regular grid.In the current case, the increments of the grid were chosen as 1 cm for stalk height and leaf length, 1 mm for stalk diameter and leaf width, 1 unit for tiller and leaf number and, finally, 10 units for plant number.For each growth phase P, a different number of possible samples is obtained.Table 3 summarizes the remaining sample sizes throughout the analysis.The reason behind the different number of samples is due to the varying ranges (maximum and minimum values) of parameters as seen in Figure 7. Later, when the biologically unrealistic morphologies are eliminated in accordance to the crop morphology database, the sample sizes reduce to the values shown in Row 3 of Table 3.Here, it is observed that several morphologies in the parameter space are not biologically favored.In the next step, intensity and matching morphology constraints are implemented.Table 3 provides the average values from this study for the minimum and the maximum number of samples remaining after each constraint is applied.These values can change based on the variance of the data, which is either due to the structural heterogeneity of the region or due to the size of the smoothing window.Finally, the remaining samples are used as an input to the PCE BBCH .A random growth stage can correspond to several different plant morphologies.Besides, the variance of the crop morphology can affect the BBCH results.Therefore, this relation is achieved using a PCE metamodel that relates ground measurements to their corresponding BBCH stages by PCE BBCH .
Figure 10 shows the results of the accuracy analysis of the PCE BBCH .The plot is given for the BBCH stages measured in the field versus the ones estimated by the PCE BBCH .For the training, 200 randomly chosen ground measurements are taken into consideration from the 2014 Ipsala ground campaign.The coefficient of determination is calculated to be 94.0%.The overall RMSE value is found to be 5.80 stages, with the lowest variance in the early vegetative stage and the highest in the maturative stage.This can be explained by the degree of the morphological complexity.In other words, as the structure gets more complicated, a lower accuracy for the PCE BBCH is observed.Lastly, the proposed scheme can provide a continuous growth trend in the BBCH scale.

Accuracy Assessment: BBCH Assignment
The growth stage determination method proposed in this study makes the BBCH scale directly available for broadcast seeded rice monitoring using PolSAR.In other words, the phenological stage of a rice field of interest can be estimated by observing its polarimetric response and that of the surrounding area.Therefore, to prove the consistency of the approach, the PCE metamodel-based inversion method is independently applied to each test field present in each TSX acquisition from Isla Major and Ipsala.
The accuracy analysis through correlation plots of the proposed algorithm applied to all test fields of the available TSX data is shown in Figure 11a,b for the Isla Major and the Ipsala sites, respectively.The value of R 2 between ground measured and estimated BBCH is 94.1% for the Isla Major and 84.1% for the Ipsala test sites.Additionally, the RMSE deviation from the measured value is found to be 7.66 BBCH stages for Isla Major and 5.24 BBCH stages for Ipsala data.Unfortunately, the Ipsala data are not available for the full cycle.The analysis shows that, while the proposed algorithm tends to overestimate the earlier stages, it underestimates the later stages.This can be explained by crop morphology and subjective assignment of the BBCH stages.The inclusion of the PCE BBCH -based growth stage assignment improves the overall accuracy.Field-and full-scale growth maps are visualized in Figures 12 and 13, respectively.

Conclusions
This paper has demonstrated that X-band HH/VV dual-polarization SAR data are suitable for the estimation of flooded and broadcast-sowed rice field growth stages on a continuous scale, in terms of BBCH.This is due to the sensitivity of the X-band polarimetric descriptors to small-scale morphological changes.The validation of the proposed approach carried out at the field level provided an error of less than 10 BBCH stages.Additionally, the R 2 between the ground measurements and the algorithm estimation is found to be consistently higher than 80.0%.
Since the proposed methodology gives promising results, it may encourage agriculturists and local authorities to use products based on SAR data for their monitoring purposes.The main strengths, limitations and opportunities of the proposed methodology are:

•
The algorithm depends on the rice crop morphology.The proposed approach extends the usage of existing classification algorithms.The results of the current classification algorithms can be used instead of the a priori growth phase information as a coarse classifier.The proposed method introduces the PCE EM -based parameter search space approach, resulting in an estimate of the BBCH based on crop morphology.• Several genotype variations are available for rice crops.The range of admissible morphological parameters (e.g., crop height vs. leave size) may, therefore, need to be extended should data on new/additional crop morphologies become available.The proposed method can easily be updated automatically with each new crop morphology dataset by appropriately extending the allowed morphological parameter space.Therefore, each new dataset will contribute to the preservation of the plant morphological growth principles for different genotypes.The possibility to extend the base morphological datasets allows the proposed approach to be extended to include new morphologies.

•
The proposed method can make detection of in-field heterogeneities possible for observing growth abnormalities.The included feature clustering approach handles polarimetrically similar regions of the field separately, and therefore, spatially-localized problems (e.g., sickness or overgrowth) can be handled, unless they have statistically a representative number of samples.

Limitations
• Even though the results are promising, some aspects were omitted in the chosen backscattering model such as the 3D orientation of the scatterers, the curvature of the leaves and panicles and the agronomical exceptions as extreme water loss from the plants.Besides, according to the Directorate of Trakya Agricultural Research Institute, the rice fields located in Turkey are kept flooded until 10-15 days before harvesting.Therefore, the current implementation of the model only considers the flooded conditions and misses the non-flooded periods.

•
The performance of the morphology estimation strongly depends on the performance of the backscattering model and the environmental conditions.The slight bias in the EM model predictions that can be observed in Figure 8 may be related to the slight bias in the reconstruction response w.r.t. the ground truth in Figure 11a,b.A quantitative study of the effects of model bias on the inversion results would require additional high-quality ground truth measurements.Nevertheless, it is expected that improvements in the model predictivity, especially when effective at reducing model bias, could similarly improve the accuracy of the inversion results.• Since the proposed approach was developed for fields with flooded or strongly moist underlying surfaces, further studies are needed to assess its applicability for fields with dry or slightly moist soil.

•
The chosen theoretical backscattering model can be replaced by any other morphology-based EM backscattering model.The alternative models may lead to higher accuracies with a higher number of parameters.However, the uncertainties of the inputs should also be taken into account.Therefore, it is possible to state that, for an improvement in the inversion accuracy, the alternative models should have lower variance in their outputs, which can be achieved by inclusion of the cross-polarimetric channels (HV and VH).Additionally, the proposed approach is also applicable to the monitoring of different crop types by simulating their morphology and the underlying ground information with the theoretical EM backscattering model.

•
With the inclusion of the metamodels, the computational cost of the inversion algorithms decreases significantly.This may lead to the development and integration of new backscattering models with realistic crop morphology.
Future work will focus on the evaluation of the proposed methodology using different frequencies, crop types, as well as incidence angles.Ongoing and future missions such as Tandem-L and Sentinel-1 will be excellent opportunities for these evaluations.

Figure 1 .
Figure 1.Growth cycle of a rice plant with the corresponding International Rice Research Institute (IRRI), Biologische Bundesanstalt, Bundessortenamt und CHemische Industrie (BBCH) scale and sample structure.

Figure 2 .
Figure 2. Block diagram of the proposed approach.

Figure 3 .
Figure 3.The scattering mechanisms involved in the chosen EM model.

Figure 4 .
Figure 4. Block diagram of the proposed parameter space search algorithm.

Figure 6 .
Figure 6.Accordance plot of SAR acquisition and ground measurement dates (day of year) given with color-coded IRRI stages for both test sites.

Figure 8 .
Figure 8.The TSX measured versus the theoretical backscattering model in Equation (1) predicted HH and VV channel backscattering intensities from the Ipsala 2014 campaign.

Figure 9 .
Figure 9. [Row 1&2] Relationship between backscattering model and PCE EM simulated σ o (in dB) values given for five growth stages with corresponding polynomial degree (P.Deg.),R 2 , RMSE and LOO values.True value indicates the scattering model simulated values.[Row 3] GSA results given for five growth stages with corresponding Total Sobol' indices for each input parameter.

Figure 10 .
Figure 10.Relationship between ground measured and PCE BBCH simulated BBCH values with corresponding R 2 for the training data.

Figure 12 .
Figure 12.Phenological stage estimation results of the proposed algorithm obtained over two different Region Of Interest (ROI) located in the Ipsala 2014 dataset.The growth stages are given as estimated/measured BBCH stage.

Figure 13 .
Figure 13.Growth maps with the phenological stage estimation with the BBCH scale in two different areas exploiting their temporal behavior.The date of the images is given as Day of Year (DoY).

Table 2 .
Input parameters that are kept constant for the backscattering model evaluations.

Table 3 .
An average sample size of parameter space during each step of the search algorithm procedure.