Retrieval of Soil Moisture

Passive microwave remote sensing is one of the most promising techniques for soil moisture retrieval. However, the inversion of soil moisture from brightness temperature observations is not straightforward, as it is influenced by numerous factors such as surface roughness, vegetation cover, and soil texture. Moreover, the relationship between brightness temperature, soil moisture and these factors is highly non-linear and ill-posed. Consequently, Artificial Neural Networks (ANNs) have been used to retrieve soil moisture from microwave data, but with limited success when dealing with data that differ from those of the training period. In this study, an ANN is tested for its ability to predict soil moisture at 1 km resolution on different dates following training at the same site for a specific date. A novel approach that utilizes information on the variability of soil moisture, in terms of its mean and standard deviation for a (sub) region of spatial dimension up to 40 km, is used to improve the current retrieval accuracy of the ANN method. A comparison between the ANN with and without the variability information showed that this enhancement enables the ANN to achieve an average Root Mean Square Error (RMSE) of around 5.1% v/v, as compared to around 7.5% v/v without it. The accuracy of the soil moisture retrieval was


Introduction
Soil moisture controls several processes at or near the land surface. These processes include: the partitioning of rainfall into infiltration and run-off [1], the partitioning of available energy into latent and sensible heat [2], the drainage to ground water and/or surface water [3], as well as the growth of vegetation [4]. All these processes are highly nonlinear functions of soil moisture [5], and so accurate soil moisture observations are crucial in order to forecast and predict these processes accurately.
Various approaches have been developed over the past two decades to infer near-surface soil moisture from remotely sensed measurements of surface temperature, radar backscatter, and microwave brightness temperature [6]. Among these, microwave radiometry has been the most successful of the remote sensing approaches for soil moisture estimation [7,8], due to its ability to penetrate cloud, its direct relationship with soil moisture through the soil's dielectric constant, and a reduced sensitivity to land surface roughness and vegetation cover [9][10][11][12]. Consequently, the Soil Moisture and Ocean Salinity (SMOS) mission [13], launched in November 2009, is the first satellite dedicated to soil moisture, using a 1.4 GHz 2-D interferometric polar orbiting L-band passive microwave sensor. This satellite will provide data on the 0-5 cm deep soil moisture content with a repeat cycle ranging from one to three days depending on latitude, and with an expected soil moisture accuracy of 4% v/v. Soil moisture retrieval will be from the L-band Microwave Emission of the Biosphere (L-MEB) model that simulates the microwave emission from a soil-vegetation layer [14]. L-MEB requires a great deal of ancillary information, including vegetation and surface type dependent parameters such as the vegetation opacity, vegetation scattering albedo and roughness, making the global application of this method difficult. While generic parameters have been published for a variety of land cover types [11], they have not been widely tested [14].
To overcome this difficulty, researchers in soil moisture problems have utilized data driven approaches. Artificial Neural Networks (ANNs) are one of the possible alternative approaches that have been used for soil moisture retrieval [15]. Specific examples include the work of Liou et al. [16] and Liu et al. [7], who used an Error Propagation Learning Backpropagation (EPLBP) network to retrieve soil moisture (SM) from 1.4 GHz, 6.9 GHz and 10.65 GHz brightness temperature observations. Liou et al. [16] used simulated data from the Land Surface Process/Radiobrightness (LSP/R) model during a two-month dry-down of a prairie grassland having vegetation wet biomass of 3.7 kg/m² [17], with 5% of 8,640 simulated Tb-SM pairs randomly chosen to train the EPLBP. Another 5% of the data were randomly chosen for testing. Experiments were then conducted for different combinations of brightness temperature (Tb) frequencies with different viewing angles. The retrieval result was a Root Mean Square Error (RMSE) of less than 1% v/v for all cases tested, and a correlation coefficient of better than 0.9. Conversely, Liu et al. [7] used Tb data obtained by the PORTOS radiometer over a wheat field during its three-month growth cycle in 1993 (PORTOS-93) and 1996 (PORTOS-96). The EPLBP was trained with a subset of data from PORTOS-93 and tested on: (i) the remaining subset of PORTOS-93, and (ii) data from PORTOS-96. For both test cases, nearly half of the testing data were used for the validation of the EPLBP. Fewer than 10 data points were used for training and testing in each test case. From their research, the RMSE of retrieval achieved was about 5% v/v for all cases in 1993 and about 4% v/v in 1996. Although the retrieval result was encouraging, providing some confidence for the use of ANNs for soil moisture retrieval, the small number of data used for training and testing was questionable. Moreover, the characteristics of the training and testing data in terms of their statistical centrality were not presented. If the data used for testing are similar to the training data, or the testing data are a subset of the training data, the ANN would be expected to produce overly optimistic results.
For the case of using field data, the potential of ANN application within the context of ESA's SMOS mission [18] over land has been studied by Angiuli et al. [19]. The standard backpropagation algorithm was trained with simulated data from the land emission model and tested with field data, using L-band radiometric observations of bare soils obtained during two field experiments: T-REX and MOUSE. A total of 2,000 data were simulated, with 1,400 used for training and 600 for testing. The input vectors included Tb at different incidence angles, surface temperature and surface roughness, while the output vector was the soil moisture value. It was reported that the maximum RMSE was 7% v/v, with the best RMSE obtained being 5% v/v, when applied to the data from one field experiment. For the other field experiment data, it was reported that the ANN tended to underestimate the soil moisture for water content larger than 15%, because a large percentage of the simulation data used to train the ANN corresponded to water contents lower than this value. The ANN managed to obtain an RMSE of 4% v/v for soil moisture values lower than 15%. Moreover, more recent research by Lakhankar et al. [20] using active microwave data showed that when the "trained" ANN is tested on the same study areas but for a different date, the RMSE obtained was around 7-8% v/v, compared with 4.4% v/v when it is tested with an independent hold-out sample on the same date and 3.6% v/v on the hold-out sample from the training data. This is a typical problem of ANN models, which cannot generalize to "out-of-range" data. However, it is crucial to solve this problem in order for ANNs to be applicable to soil moisture retrieval. Clearly, the normal approach of ANNs for soil moisture mapping fails to capture the natural variability of land surface conditions, leading to unacceptable results in soil moisture retrieval when applied to field data.
Consequently, this paper presents a novel ANN methodology that makes the ANN applicable to a wider range of conditions than those used for training. The methodology utilizes the variability of soil moisture in terms of the mean and standard deviation to capture the land surface variability, and is demonstrated for soil moisture retrieval at 1 km spatial resolution using single incidence angle dual polarized passive microwave data at 1 km resolution. When applied to a study area of 40 km × 40 km, the retrieval process is done over smaller regions (termed "sub-grid" hereafter) to capture the spatial variability. The main assumption is that the variability between adjacent 1 km pixels is more similar than for pixels that are further apart. The mean and standard deviation values are calculated based on the 1 km data which fall inside the predetermined sub-grid size. The results show that using a combination of spatial statistics and sub-region division, the ANN can predict soil moisture evolution over time to a suitable accuracy (less than or equal to 4% v/v, corresponding to that expected from the SMOS mission, termed the "target error" in this paper).
While the requirement for information on the mean and standard deviation of soil moisture at coarse scales is a limitation of the method, one application of this approach may be in soil moisture downscaling. The mean and standard deviation values at a resolution coarser than the desired downscaled resolution can be retrieved from radar or optical sensors, and the proposed methodology can then be combined with passive microwave data for downscaling purposes.

Study Area and Data Sets
The study area is situated in the northern part of the Goulburn River catchment, located in a semiarid area of south-eastern Australia. This catchment extends from 31°46'S to 32°51'S and 149°40'E to 150°36'E with elevations ranging from 106 m in the floodplains to 1,257 m in the northern and southern mountain ranges (Figure 1). The area monitored during NAFE'05 was an area of approximately 40 km × 40 km, centered in the northern part of the catchment. Much of the original vegetation has been cleared to the north of the Goulburn River, where the dominant land uses are grazing and cropping. The southern part of the catchment is largely uncleared with extensive areas covered by eucalypt forest. Consequently the study area was chosen for its moderate-to-low vegetation cover condition. The area was logistically divided into two sub-areas, the "Merriwa" area in the east and the "Krui" area in the west. The brightness temperature observations used in this study were collected during the month-long NAFE field campaign held in November 2005. The campaign included extensive airborne passive microwave observations together with spatially distributed and in situ ground monitoring of soil moisture. For the purpose of this analysis, only pertinent details of the data are presented. A more detailed description of the data can be found in [21].

Microwave Data
Flights were conducted between October 31 and November 25 2005 using a small aircraft from the Airborne Research Australia national facility carrying the Polarimetric L-band Multi-beam Radiometer (PLMR). The PLMR measured both H and V polarized brightness temperatures (Tb) using a single receiver with polarization switch at incidence angles of ±7°, ±21° and ±38.5° across track.
For the purpose of this study, only the "regional" L-band microwave data is used, providing full coverage across the 40 km area with a ground resolution of 1 km, resulting in 1,600 data points on each of 4 dates. All flights were centered on the 6 a.m. overpass time of SMOS, and therefore closely replicate the soil moisture retrieval conditions of the SMOS mission [22]. Because the aircraft flew at a constant altitude above the median elevation of the study area, the rough terrain caused the actual pixel size to vary between approximately 860 and 1,070 m.
In order to effectively use the push-broom radiometer data for soil moisture mapping, it was first necessary to account for the effect of varying beam angles and for the effects of varying soil temperature during the acquisition through a normalization procedure. The soil temperature was corrected by using the daily average of the soil temperatures at the monitoring stations recorded during the regional observation window. Previous studies [23][24][25] showed that angle normalization procedures can be developed over mixed land covers, by assuming that the deviation between measurements at different beam positions is due to the Fresnel effect, and that for a given day the Fresnel effect for a particular beam is constant for the range of soil moisture and vegetation present. From these studies, it is assumed that by using a daily average for all data in an area at a common angle, errors due to anomalies in a particular beam that are not present in the others (e.g., a small water body) would be minimized. With this assumption, the normalization is applied as follows: (i) the daily average Tb over land targets is computed for each beam, (ii) a correction factor is computed by taking the ratio between the average of the reference beam and the average of each beam, (iii) all the data for each beam on each day are then corrected using:

$$ Tb_{norm} = Tb_i \cdot \frac{\overline{Tb}_{ref}}{\overline{Tb}_i} \qquad (1) $$

where $Tb_i$ is the individual Tb acquisition to be normalized, $Tb_{norm}$ is the normalized value, and $\overline{Tb}_i$ and $\overline{Tb}_{ref}$ are the daily average Tb values of the beam to be normalized and the reference beam respectively. The regional Tb observations were normalized to the incidence angle of the radiometer's outermost beams (±38.5°) using the procedure described above. This choice of angle was motivated by the fact that at close-to-nadir incidence angles, H- and V-polarized Tb values are very similar, while off nadir, the V-polarized Tb data at higher incidence angles are generally higher than the H-polarized values (the amount of difference varies depending on the land surface conditions). The polarization difference yields information on the polarizing effect of the vegetation canopy when using wider incidence angles [18]. After normalization the Tb data are gridded onto a reference grid with uniform 1 km resolution. By averaging several individual Tb acquisitions into one Tb value for each cell, the effect of anomalies in individual readings and the signal noise are reduced.
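The three normalization steps described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `normalize_beams`, the dictionary layout and the sample values are assumptions made here, and the correction factor is taken to be the reference-beam daily mean divided by the daily mean of the beam being corrected.

```python
from statistics import mean

def normalize_beams(tb_by_beam, ref_beam):
    """Scale each beam's Tb so that its daily mean matches the reference beam's."""
    # Step (i): daily average Tb for each beam.
    daily_mean = {beam: mean(values) for beam, values in tb_by_beam.items()}
    ref_mean = daily_mean[ref_beam]
    normalized = {}
    for beam, values in tb_by_beam.items():
        # Step (ii): correction factor (assumed: reference mean / beam mean).
        factor = ref_mean / daily_mean[beam]
        # Step (iii): apply the factor to every acquisition of this beam.
        normalized[beam] = [tb * factor for tb in values]
    return normalized

# Hypothetical one-day sample (kelvin), keyed by incidence angle in degrees.
sample = {7.0: [250.0, 260.0], 21.0: [245.0, 255.0], 38.5: [230.0, 250.0]}
result = normalize_beams(sample, ref_beam=38.5)
```

After the call, every beam has the same daily mean as the 38.5° reference beam, while the relative spread within each beam is preserved.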

Soil Moisture Data
The soil moisture data for the 40 km × 40 km study area at 1 km nominal resolution was derived using the L-MEB model. A detailed description of this retrieval can be obtained from [22]. For the purpose of this paper, only pertinent details from [22] are presented.
The soil moisture maps derived from the 1 km airborne data have two major advantages with respect to ground point measurements: (i) they have a larger extent, covering the entire study area and therefore characterize the soil moisture variability within all the coarse-scale pixels, and (ii) each soil moisture observation represents an integrated value over a 1 km area, therefore overcoming the limitation of point data which only provides information for the domain sensed by a ground probe (a few centimeters).
The L-MEB model [26] is based on a simplified zero-order radiative transfer model, called the "tau-omega" approach [27]. The model takes into account the effect of a vegetation cover on soil emission, using ancillary data on land cover, near surface soil moisture and canopy temperature, and soil textural properties. For this study the ancillary data were obtained from either existing databases or derived from satellite imagery. In principle, ground collected data were given priority where possible. In the case where satellite imagery has been used, the dataset with the finest available resolution was chosen. This choice aims at minimizing any errors associated with the ancillary data so that the effects of land surface heterogeneity can be isolated. A summary of the ancillary data is shown in Table 1.
Table 1. Summary of the ancillary data used for the L-MEB model [22].

Soil moisture was retrieved by [22] for each cell of the 1 km Tb grid using the L-MEB model together with the ancillary data described in Table 1. The soil moisture output of the L-MEB model was limited to a maximum soil moisture value of 58% v/v, derived from the analysis of the maximum soil moisture achieved at the monitoring stations. Conversely, no lower limit was imposed on the retrieved soil moisture. The average accuracy of soil moisture retrieval at 1 km resolution using L-MEB was 3.8% v/v and in all cases better than 6% v/v over a variety of land surface conditions in the study area [22].

Training, Validation, Testing and Verification Data
To find the optimum neural network architecture, the data was divided into training, validation and testing sets through random sampling. The fully trained ANN is evaluated using the testing set. At this stage, the "trained" ANN is considered to be able to obtain an acceptable error when it is used for other similar situations. To verify that the "trained" ANN is able to maintain a similar result, "verification sets" are used.
The regional data measured on 31 October 2005, 7 November 2005, 14 November 2005 and 21 November 2005 was the target dataset. As MODIS scenes were available for only three of the four days during NAFE'05, only data on 7, 14 and 21 November 2005 will be considered and discussed in this study. Data was binned into a 1 km reference grid for the whole 40 km × 40 km area. Where data were missing, i.e., where the MODIS data were not totally cloud free, the Inverse Distance Weighted (IDW) scheme was used to interpolate the values based on the surrounding values. As a result, a total of 1,600 data were obtained for each date. The data on 7 November 2005 were divided into training, validation and testing sets, while data of 14 and 21 November 2005 were used for verification purposes.
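The IDW gap filling mentioned above can be illustrated with a minimal sketch. The distance exponent (`power=2.0`) and the search window (`radius=1`, i.e., the eight surrounding cells) are assumptions chosen for illustration; the text does not state the values that were actually used, and `idw_fill` is a hypothetical helper name.

```python
import math

def idw_fill(grid, power=2.0, radius=1):
    """Fill None cells with an inverse-distance-weighted mean of valid neighbours."""
    rows, cols = len(grid), len(grid[0])
    filled = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is not None:
                continue  # only missing cells are interpolated
            num = den = 0.0
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    value = grid[rr][cc]
                    if value is None:
                        continue  # skips the target cell and other gaps
                    weight = 1.0 / math.hypot(rr - r, cc - c) ** power
                    num += weight * value
                    den += weight
            if den > 0.0:
                filled[r][c] = num / den  # stays None if no valid neighbour
    return filled

# Hypothetical 3 x 3 NDVI patch with one cloud-contaminated cell.
patch = [[0.30, 0.32, 0.31], [0.29, None, 0.33], [0.28, 0.30, 0.34]]
filled_patch = idw_fill(patch)
```

Nearer cells receive larger weights, so the filled value is dominated by the four edge-adjacent neighbours rather than the diagonal ones.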

Training Algorithm
In earlier work [28], a comparison of different training algorithms of the Backpropagation Neural Network was undertaken using data from the same field experiment. It was found that the Broyden-Fletcher-Goldfarb-Shanno (BFGS) optimization method derived from the Newton method obtained the best retrieval results when different training algorithms were trained and tested with the same data set. For this reason, the BFGS method has been chosen for this study. The details of the neural network parameters used are given in Table 2.

Pre- and Post-Processing Data for ANN: Standardization
The preprocessing aims to transform the data into a better form for the network to use [29]. This process, normally known as normalization or standardization, speeds up the training process of the ANN and reduces the likelihood of the ANN getting stuck in local minima. The normalization is mainly used to transform the input features to the same range of values in order to minimize the bias within the ANNs for one feature over another [30]. The training time is reduced by starting the training process for each feature within the same scale. This process is especially useful when the inputs of an application are on widely different scales. One advantage of using the statistical normalization is to reduce the effects of outliers in the data [30]. There are different ways to normalize the data, and one of these is the statistical or Z-score normalization technique, which uses the mean and standard deviation for each feature across a set of training data to normalize each input feature vector. The input data will be normalized so that they have zero mean and unit standard deviation using:

$$ z_{norm} = \frac{x - \bar{x}}{\sigma_x}\,\sigma_y + \bar{y} \qquad (2) $$

where $z_{norm}$ is the normalized data, $x$ is the input, $\bar{x}$ is the mean and $\sigma_x$ is the standard deviation of the input data, the mean of the output, $\bar{y}$, is equal to 0, and the standard deviation of the output, $\sigma_y$, is equal to 1. Equation 2 can therefore be simplified and written as:

$$ z_{norm} = \frac{x - \bar{x}}{\sigma_x} \qquad (3) $$

Using Equation 3, the input data is transformed into normalized data. Besides obtaining the mean and standard deviation of the input of the training data, these statistical values are also obtained for the target of the training data in order to normalize the target data. The data are normalized before the training is commenced. All values leaving the ANNs will be in the standardized format. These values must subsequently be "destandardized" to provide meaningful results. This can be achieved by reversing the standardization algorithm used on the input modes [31]. The de-normalization of the normalized soil moisture is calculated based on Equation 4:

$$ y'' = z''_{norm,y}\,\sigma_y + \bar{y} \qquad (4) $$

where $y''$ is the de-normalized soil moisture value, $z''_{norm,y}$ is the predicted soil moisture value in normalized format, and $\sigma_y$ and $\bar{y}$ are the standard deviation and mean of the soil moisture.
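The Z-score standardization and its inverse can be written compactly as follows. This is a minimal sketch: the helper names `standardize` and `destandardize` and the sample values are hypothetical, and the population standard deviation (`pstdev`) is assumed because the text does not say whether the population or sample form was used.

```python
from statistics import mean, pstdev

def standardize(values, mu, sigma):
    """Z-score: subtract the mean and divide by the standard deviation."""
    return [(v - mu) / sigma for v in values]

def destandardize(z_values, mu, sigma):
    """Inverse transform: multiply by the standard deviation and add the mean."""
    return [z * sigma + mu for z in z_values]

# Hypothetical soil moisture values in % v/v.
soil_moisture = [10.0, 20.0, 30.0]
mu, sigma = mean(soil_moisture), pstdev(soil_moisture)

z = standardize(soil_moisture, mu, sigma)   # zero mean, unit standard deviation
restored = destandardize(z, mu, sigma)      # back in % v/v
```

Note that the mean and standard deviation passed to `destandardize` need not be those of the training data; supplying statistics from a different date or region shifts and rescales the network output accordingly.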
Figure 2 shows the normalization and de-normalization steps used to pre- and post-process the data. The usual way of data preprocessing for ANNs is to obtain the standard deviation and mean from the training data [32]. Once the means and standard deviations are computed for each feature over the set of training data, they must be retained for use as weights in the final system design [30], otherwise the performance of the ANN will vary significantly, since the ANN is trained using a different data representation than the unnormalized data. The study by Minns and Hall [33] points out the importance of standardizing data, particularly when the ANN fails to extrapolate correctly to out-of-range values. Their research concluded that, in practice, the ANN can only be used in the recall mode with data that it has "seen" before. In their rainfall-runoff model, different standardization factors were applied to the training and verification cases. For the case where they applied the same standardization factor from the training data to the verification data, the result was notably poorer. The results obtained from their research emphasized the care required in choosing the standardization factors. Moreover, previous work by the author of this paper [34] has shown that the performance of the ANN can be improved if the ANN is presented with data having similar statistical mean and standard deviation values.
In this research, the statistical normalization steps are taken to preprocess the data. In contrast to the usual way of using only the mean and standard deviation of the training data, the mean and standard deviation of the soil moisture of the validation, testing and evaluation data are also calculated to normalize the data. The rationale behind this is:
(i) The training is done using data from a single date. The condition of wetness/dryness for each date will almost certainly differ from the training date. From the literature, electromagnetic models that require a lot of ancillary data are often used to cover the variety of conditions required for training of the ANN. However, the spatial and temporal variability captured using the mean and standard deviation of the testing data can help to avoid this.
(ii) Even within a square meter, the surface soil moisture variance can be as large as over a whole field [35]. Therefore, it is crucial to incorporate the variability information in the soil moisture retrieval problem. Statistical approaches such as calculating the central tendency (mean or median) and variability (variance or standard deviation) have been used to deal with field-scale variability [36]. In this study, this information is incorporated by standardizing the output with the mean and standard deviation values.
Within the same date, the spatial difference in soil moisture within the 40 km × 40 km target area is large. This is mainly due to the fact that soil moisture is influenced by many factors, making the estimation of its value using remote sensing a challenge. Consequently, the retrieval process of the whole 40 km × 40 km area involves dividing it into smaller regions. The spatial difference of soil moisture within a smaller region is smaller compared to a larger area, so when the spatial area is smaller, the prediction of the soil moisture value is expected to be easier. Selection of the size of this smaller region is explained later in this paper. For the purpose of this research, the mean and standard deviation of the soil moisture which are used for the normalization purpose were calculated based on the soil moisture for the grids that fall within the predetermined smaller region.
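The calculation of the per-region statistics can be sketched as below. The function name `subgrid_stats`, the use of non-overlapping square blocks, the population standard deviation and the synthetic field are all assumptions made for illustration only.

```python
from statistics import mean, pstdev

def subgrid_stats(grid, size):
    """Return {(block_row, block_col): (mean, std)} for size x size blocks."""
    rows, cols = len(grid), len(grid[0])
    stats = {}
    for r0 in range(0, rows, size):
        for c0 in range(0, cols, size):
            # Collect the 1 km cells that fall inside this sub-grid.
            cells = [grid[r][c]
                     for r in range(r0, min(r0 + size, rows))
                     for c in range(c0, min(c0 + size, cols))]
            stats[(r0 // size, c0 // size)] = (mean(cells), pstdev(cells))
    return stats

# Hypothetical 40 x 40 soil moisture field (% v/v), one value per 1 km cell.
field = [[float((r + c) % 7) for c in range(40)] for r in range(40)]
stats = subgrid_stats(field, size=4)   # 10 x 10 = 100 sub-grids of 4 km x 4 km
```

Each entry of `stats` supplies the mean and standard deviation used to standardize and de-standardize the soil moisture of the cells in that sub-grid.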

Input Data Selection
Besides the dual polarized brightness temperature, other ancillary data that potentially affect soil moisture retrieval are also assessed. Such ancillary data include the Normalized Difference Vegetation Index (NDVI) and Land Surface Temperature (LST) from the Moderate Resolution Imaging Spectroradiometer (MODIS). NDVI data were calculated from Band 1 and Band 2 of MODIS/Aqua Surface Reflectance Daily L2G Global 250 m data, while the LST values are obtained from the MODIS/Aqua Land Surface Temperature and Emissivity Daily L3 Global 1 km data. Both of these ancillary data are gridded to the same 1 km reference grid as used in the processing of the brightness temperature and soil moisture data. The sensitivity of the ANN towards the inputs is measured by the change of Root Mean Square Error (RMSE) when an input is added to the model. The ANN is initialized and trained repeatedly until the global error between the referenced and computed values is driven down to an acceptable level. The ANN is then tested using the testing data which were randomly selected from the training set.
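For reference, the NDVI calculation from the two MODIS bands can be sketched as follows, using the standard definition with Band 1 as the red reflectance and Band 2 as the near-infrared reflectance. The function name and the sample reflectances are hypothetical, and the simple averaging of 250 m values into a 1 km cell is an assumption; the text does not describe the gridding method used.

```python
from statistics import mean

def ndvi(red, nir):
    """NDVI = (NIR - Red) / (NIR + Red); returns None when undefined."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total

# Hypothetical 250 m reflectance pairs (Band 1 = red, Band 2 = NIR)
# falling inside a single 1 km reference cell.
pixels = [(0.10, 0.50), (0.08, 0.40), (0.12, 0.36), (0.10, 0.30)]
values = [ndvi(red, nir) for red, nir in pixels]
cell_ndvi = mean(v for v in values if v is not None)  # assumed: simple average
```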

Selection of Sub-grid Size
The evaluation of the ANN was undertaken using data from two different dates, 14 and 21 November 2005, with the training data from 7 November 2005. Consequently, the land surface conditions for the evaluation dates were different from the training date. To account for both the spatial and temporal variability, the method of sub-region division was applied, and hence the optimum sub-region size needed to be determined. To avoid the use of any data from 14 and 21 November 2005 before the verification process, the selection of sub-grid size and ANN architecture used data only from 7 November 2005. An area of size 20 km × 20 km in the north-west of the study area was arbitrarily selected for this purpose. Among these 400 data, 5% (20 data) were randomly selected for each of validation and testing. Training of the ANN was repeated until the ANN produced an acceptable accuracy. At this stage, the weights and biases of the ANN were retained. This ANN was next used for the selection of the different sub-grid sizes and ANN architecture. Sub-grid sizes of 2 km × 2 km, 3 km × 3 km, 4 km × 4 km, 5 km × 5 km, 6 km × 6 km, 7 km × 7 km and 8 km × 8 km were arbitrarily selected from the north-east corner, with each sub-grid a sub-set of the next (Figure 3).

Neural Network Architecture
The number of input and output nodes is directly linked to the application. The different input combinations assessed include: (i) dual-polarized (H and V) brightness temperatures (TbH, TbV), (ii) TbH, TbV and NDVI, (iii) TbH, TbV and LST, and (iv) TbH, TbV with NDVI and LST. The output node is the soil moisture content.
Besides input and output layers, a decision needs to be made regarding the number of hidden layers and neurons. ANNs with two hidden layers can represent functions of any shape [37], and there is currently no theoretical reason to use neural networks with more than two hidden layers [38]. For this reason, the ANN architecture assessed here uses either one or two hidden layers. Using too few or too many hidden neurons may undermine any application. Too few hidden neurons will cause underfitting, whereby complicated signals within the data cannot be detected accurately by the ANN. On the other hand, using too many hidden neurons will cause overfitting, whereby the neural network has so much information processing capacity that the limited amount of information contained in the training set is not enough to train all of the neurons in the hidden layers. Moreover, if too many hidden neurons are used, the length of training time will increase. The best way to optimize the number of hidden layers and hidden neurons is through trial and error [39].

Results and Discussion
During the selection of the sub-grid size, the use of NDVI and LST as ancillary data for the soil moisture retrieval is also tested. Moreover, experiments were also conducted to determine the optimum ANN architecture. The results are summarized in terms of RMSE (% v/v) and correlation coefficient between the actual and predicted soil moisture values. To evaluate this methodology, the trained ANN was evaluated using the data from 14 and 21 November 2005. Verification of the importance of combining both the standardization factor and the sub-grid method proposed in this paper is also presented. The dependency of the proposed method on the accuracy of the mean and standard deviation values is also shown.

Sub-grid Size and ANN Architecture
Tables 3 to 6 show that the retrieval results deteriorate as the sub-grid size increases. For the purpose of sub-grid size selection, the optimum sub-grid size, i.e., the largest sub-grid size where the ANN can obtain the target error of 4% v/v, is first identified. For the combination of TbH and TbV (Table 3), the optimum sub-grid size where the ANN can obtain an error equal to or less than 4% v/v (target error) is 4 km × 4 km, with the number of hidden neurons being 20, 50 and 100 in the single and two hidden layers (equal number of hidden neurons in each layer). When the input consisted of TbH, TbV and NDVI (Table 4), with a single hidden layer of 20 neurons and two hidden layers of 10 hidden neurons in each layer (10:10), the ANN again achieved the target error at a sub-grid size of 4 km × 4 km. The same optimum sub-grid size was obtained for the case of TbH, TbV and LST (Table 5) and TbH, TbV, NDVI and LST (Table 6). As a result, the optimum sub-grid size is taken to be 4 km × 4 km. The optimal number of inputs, hidden layers, and neurons was selected from the column of 4 km × 4 km sub-grid size. The lowest RMSE obtained at the sub-grid size of 4 km × 4 km is 2.79% v/v (R² = 0.84) with two hidden layers of 50 neurons in each layer when TbH, TbV and LST were used as the input (Table 5). For the same sub-grid size, the use of only TbH and TbV as input managed to obtain an RMSE of 2.86% v/v (R² = 0.88) using a single hidden layer of 20 neurons. Although this RMSE is slightly higher than when LST was included, fewer resources are needed. Moreover, according to Vonk et al. [40], simple neural networks are preferred over complex ones, provided they exhibit similar performances. Therefore a single hidden layer of 20 hidden neurons is preferred compared to two hidden layers of 50 neurons in each layer. Figure 4 shows the graph of RMSE and R² obtained when different combinations of inputs were used with this architecture. From this graph, it is clearly seen that the use of TbH and TbV yields the lowest RMSE value. Therefore the ANN architecture chosen was two inputs (TbH and TbV), a single hidden layer of 20 neurons, and 1 output.
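To give a rough sense of the difference in resources between the two candidate architectures, the number of trainable parameters can be counted. The sketch below assumes fully connected feed-forward layers with one bias per neuron; the exact network configuration in Table 2 may differ, and `count_parameters` is a hypothetical helper.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected feed-forward network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# TbH, TbV -> one hidden layer of 20 neurons -> soil moisture
small = count_parameters([2, 20, 1])       # 81 parameters

# TbH, TbV, LST -> two hidden layers of 50 neurons each -> soil moisture
large = count_parameters([3, 50, 50, 1])   # 2801 parameters
```

Under these assumptions the simpler network has roughly 35 times fewer parameters to train, for an RMSE that differs by less than 0.1% v/v.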

Testing and Evaluation
Using the determined optimal ANN architecture (two inputs, one hidden layer of 20 neurons and one output), training was undertaken using data from 7 November 2005. From the 1,600 data of 7 November 2005, 1,440 data (90%) were randomly selected for training, with the remaining data divided equally (80 data or 5% each) for the validation and testing of the ANN. The ANN was then trained to minimize the RMSE between the referenced and retrieved soil moisture values. At this stage, the weights and biases of the ANN, termed the "trained ANN", were stored for verification using all the data of 14 and 21 November 2005. For each of the verification dates, the retrieval was carried out for each of the 1 km cells within the 4 km × 4 km sub-grid from the top-left corner to the right and then down until the lower-right corner of the target area. The statistical mean and standard deviation for each of the 4 km × 4 km sub-grids were calculated to standardize and de-standardize the data.
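The 90/5/5 random partition of the 1,600 cells can be sketched as follows. The function name, the fixed seed and the use of Python's `random` module are assumptions for illustration; the text only states that the selection was random.

```python
import random

def split_indices(n, train_frac=0.90, val_frac=0.05, seed=0):
    """Randomly partition range(n) into training, validation and testing indices."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)   # seeded for reproducibility
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    train_idx = indices[:n_train]
    val_idx = indices[n_train:n_train + n_val]
    test_idx = indices[n_train + n_val:]
    return train_idx, val_idx, test_idx

# 1,600 one-kilometre cells from the training date.
train_idx, val_idx, test_idx = split_indices(1600)
```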
The statistical mean and standard deviation for each date are given in Table 7, showing that the training data from 7 November 2005 is wetter compared to the verification cases of 14 and 21 November 2005. This shows that the spatial condition of the training data is different from the data used for the evaluation cases. Figure 4 clearly shows that, at the sub-grid size of 4 km × 4 km, the ANN with TbH and TbV as inputs produced the lowest RMSE and highest R². For the evaluation phase, the target area of 40 km × 40 km is divided into sub-grids of 4 km × 4 km. In other words, the retrieval using the trained ANN is not done for the entire 40 km × 40 km at once, but rather, the retrieval process is divided into 4 km × 4 km sub-grids. The soil moisture values at 1 km resolution are retrieved during each retrieval step at the sub-grids. This process is similar to a "window" that starts from the north-west corner of the target area, moving to the right and down.
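The moving-window retrieval can be sketched as below. Several details are assumptions made for illustration only: `predict_z` is a hypothetical stand-in for the trained ANN operating on standardized values, the brightness temperatures are standardized with the statistics of the same sub-grid, and `sm_stats` holds the soil moisture mean and standard deviation of each sub-grid keyed by block position.

```python
from statistics import mean, pstdev

def _zscore_params(values):
    """Mean and standard deviation, guarding against a zero spread."""
    sd = pstdev(values)
    return mean(values), (sd if sd > 0 else 1.0)

def retrieve_by_subgrid(tbh, tbv, sm_stats, predict_z, size=4):
    """Retrieve soil moisture window by window, left to right, then down."""
    rows, cols = len(tbh), len(tbh[0])
    retrieved = [[None] * cols for _ in range(rows)]
    for r0 in range(0, rows, size):            # move down
        for c0 in range(0, cols, size):        # move right
            cells = [(r, c)
                     for r in range(r0, min(r0 + size, rows))
                     for c in range(c0, min(c0 + size, cols))]
            h_mu, h_sd = _zscore_params([tbh[r][c] for r, c in cells])
            v_mu, v_sd = _zscore_params([tbv[r][c] for r, c in cells])
            sm_mu, sm_sd = sm_stats[(r0 // size, c0 // size)]
            for r, c in cells:
                # Standardize inputs, predict, then de-standardize the output
                # with the soil moisture statistics of this sub-grid.
                z = predict_z((tbh[r][c] - h_mu) / h_sd,
                              (tbv[r][c] - v_mu) / v_sd)
                retrieved[r][c] = z * sm_sd + sm_mu
    return retrieved
```

Because the de-standardization uses the statistics of each window, the same trained network produces outputs centred on the local soil moisture conditions of that window and date.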
Using this methodology, the RMSE between the actual and predicted values was 3.9% v/v with R² = 0.80 for 14 November 2005, and 3.4% v/v with R² = 0.88 for 21 November 2005. The actual and predicted soil moisture maps are shown in Figure 5, while the correlations are shown as scatter plots in Figure 6. The correlation coefficient chart shows that the predicted and actual soil moisture values are linearly correlated.
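For reference, the two accuracy measures quoted above can be computed as in the following sketch. It assumes that R² denotes the squared Pearson correlation between actual and predicted values; the text does not state which definition of R² was used.

```python
from math import sqrt
from statistics import mean

def rmse(actual, predicted):
    """Root mean square error, in the units of the inputs (e.g. % v/v)."""
    return sqrt(mean([(a - p) ** 2 for a, p in zip(actual, predicted)]))

def r_squared(actual, predicted):
    """Squared Pearson correlation between actual and predicted values."""
    ma, mp = mean(actual), mean(predicted)
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    var_a = sum((a - ma) ** 2 for a in actual)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov ** 2 / (var_a * var_p)
```

Note that under this definition a constant offset between predictions and observations raises the RMSE but leaves R² unchanged, which is why the two measures are reported together.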

Verification: Standardization Factors and Sub-Grid
To verify the use of different standardization factors and sub-grid sizes, a series of experiments was conducted. The ANN with two inputs, one hidden layer of 20 neurons and one output was trained using data from 7 November 2005. In contrast to Section 4.2, the standardization factors for the inputs and output of the training data set were retained for use in the evaluation cases, as shown in Figure 7. The trained ANN was then evaluated using the data from 14 and 21 November 2005, with the evaluation divided into two categories: (i) without sub-grid, and (ii) with sub-grid. The difference between these two categories was that the soil moisture was retrieved either at each of the 1 km cells for the whole target area of 40 km × 40 km (category (i)), or at each of the 1 km cells within a 4 km × 4 km sub-grid (category (ii)). For category (i), the mean and standard deviation values were calculated for the whole 40 km × 40 km area, while for category (ii) these statistics were calculated for each of the 4 km × 4 km sub-grids.
From Table 8, it is clear that with the same standardization factors across the dates, the retrieval accuracies were around 6-9% v/v, which is beyond the acceptable error. Moreover, Table 8 also shows that the use of sub-grid areas does not improve the retrieval accuracy when the standardization factors are the same across the dates. The sub-grids mainly address spatial variation; when the same standardization factors were used, the temporal variation across the dates was not captured. With the same standardization factors, i.e., those obtained from the training data, the retrieval results were identical with and without the use of sub-grids. This is as expected, since the de-standardization was done using the same "scaling factor" from the training data in both categories. A further analysis was carried out using different standardization factors for each evaluation date, but with the retrieval of the 1 km resolution soil moisture undertaken for the whole 40 km × 40 km area at once, i.e., without sub-grid division. In this experiment, the temporal variation was captured by the different standardization factors but the spatial variation was neglected. The results, also given in Table 8, show that the retrieval accuracies improved (4.6% v/v and 5.5% v/v) but remained beyond the acceptable retrieval error. The best retrieval results were obtained with the proposed methodology, which combines sub-grid division with different standardization factors.
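One way to see why sub-grid division makes no difference when the training factors are reused is to write out the de-standardization step. The notation below is introduced here for illustration and assumes z-score standardization, with f denoting the trained ANN, d the evaluation date and w(i) the sub-grid containing cell i:

```latex
\begin{align}
% (a) Same factors, taken from the training date, for every cell i
\widehat{SM}_i &= \mu_{SM}^{\mathrm{train}} + \sigma_{SM}^{\mathrm{train}}\,
  f\!\left(\frac{Tb_{H,i}-\mu_{H}^{\mathrm{train}}}{\sigma_{H}^{\mathrm{train}}},\;
           \frac{Tb_{V,i}-\mu_{V}^{\mathrm{train}}}{\sigma_{V}^{\mathrm{train}}}\right) \\
% (b) Factors specific to the date d and to the sub-grid w(i)
\widehat{SM}_i &= \mu_{SM}^{d,w(i)} + \sigma_{SM}^{d,w(i)}\,
  f\!\left(\frac{Tb_{H,i}-\mu_{H}^{d,w(i)}}{\sigma_{H}^{d,w(i)}},\;
           \frac{Tb_{V,i}-\mu_{V}^{d,w(i)}}{\sigma_{V}^{d,w(i)}}\right)
\end{align}
```

In expression (a) nothing depends on which sub-grid cell i belongs to, so partitioning the target area cannot change the result. In expression (b) the statistics of each window carry both the temporal (d) and the spatial (w) variation.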

Sensitivity: Accuracy of Mean and Standard Deviation Values
The a priori information required by this methodology is the mean and standard deviation of the soil moisture within each sub-grid. While this paper assumed such information to be equal to the values calculated from the soil moisture within the sub-grid, such data will not be available in practice, and the mean and standard deviation of sub-grid soil moisture will need to be estimated by alternative means. Consequently, the sensitivity of the results to the accuracy of these values was assessed. A multiple regression method was applied to regress the soil moisture values from the TbH, TbV, NDVI and LST values. For each of the dates, 18 data points (around 1% of the data) were randomly selected for the regression. The rationale behind the small number of data points selected is to simulate a situation in which these are ground-truth data. With more data selected, the regression formula would be more accurate but, at the same time, more sample points would need to be taken if ground sampling were taking place. The RMSE and R² between the actual and regressed soil moisture values are shown in Table 9. The trained ANN from Section 4.2 was then evaluated using the regressed soil moisture values. The results, also given in Table 9, show that the accuracy of the soil moisture predicted with this methodology depends greatly on the mean and standard deviation values. The soil moisture values predicted by the ANN have errors comparable to the error between the regressed and actual soil moisture values. From this experiment, it is clear that the application of this method depends on the accuracy of the mean and standard deviation values of soil moisture used. Large errors in these spatial statistics will result in large errors in the soil moisture retrieved with the proposed methodology.
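The dependence on the sub-grid statistics can be illustrated with a small numerical sketch. It assumes z-score de-standardization and, purely for illustration, a perfect standardized ANN output, so that any retrieval error comes only from the supplied mean and standard deviation. All numbers below are made up and are not data from the study.

```python
from math import sqrt
from statistics import mean

def destandardize(z_values, mu, sigma):
    """Map standardized outputs back to soil moisture (% v/v)."""
    return [mu + sigma * z for z in z_values]

def rmse(actual, predicted):
    return sqrt(mean([(a - p) ** 2 for a, p in zip(actual, predicted)]))

# Hypothetical standardized outputs for one sub-grid (illustrative only)
z = [-1.5, -0.5, 0.0, 0.5, 1.5]
true_mu, true_sigma = 20.0, 4.0          # made-up sub-grid statistics, % v/v
actual = destandardize(z, true_mu, true_sigma)

def retrieval_rmse(mu_error, sigma_error):
    """RMSE caused solely by errors in the supplied mean and standard deviation."""
    estimate = destandardize(z, true_mu + mu_error, true_sigma + sigma_error)
    return rmse(actual, estimate)

bias_only = retrieval_rmse(3.0, 0.0)     # error in the mean only
spread_only = retrieval_rmse(0.0, 1.0)   # error in the standard deviation only
```

Under these assumptions, an error in the mean passes through to the RMSE one for one, while an error in the standard deviation is scaled by the root mean square of the standardized outputs. This is consistent with the observation that the retrieval errors are comparable to the errors of the regressed statistics.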

Conclusions
This paper has presented a methodology that combines the use of variability as the standardization factors with a sub-regional division method, yielding soil moisture retrieval with an acceptable error when tested with field data. The effects of different standardization factors, both with and without sub-regional division, were also shown and discussed. Compared to the usual ANN application to soil moisture retrieval, the combination of these two methods has solved the problem of "out-of-range" conditions when the trained ANN is used for data that are totally new and not previously "seen" by the ANN during the training process. The typical "out-of-range" conditions are mainly due to the inability of the ANN to capture the spatial and temporal variability of the soil moisture. Temporal variability is a common condition in soil moisture retrieval problems when retrieval is needed for dates different from those used in the training process, as it is not easy to cover all the possible conditions during training. The surface soil moisture variance observed within a square meter (i.e., the spatial variability) can be as large as that for a whole field, whereas in this study the target area is 40 km × 40 km in size. The methodology proposed in this paper has solved both of these problems. Despite the encouraging results, the main challenge of this method is the accurate estimation of the spatial variability, in terms of the mean and standard deviation of the soil moisture, for use as the standardization factors for the ANN. Although the mean and standard deviation within a pre-determined sub-grid size were calculated using the actual soil moisture values in this paper, a sensitivity analysis showed that the methodology depends strongly on the accuracy of these two values. Therefore, this method may be more appropriate for soil moisture downscaling.

Figure 1. Overview of NAFE'05 focus farms within the Krui and Merriwa areas.

Figure 2. The pre- and post-processing of the data.

Figure 3. Sub-grid size and ANN determination using data from 7 November 2005. The red-filled data were used for training. The sub-grid sizes used are shown with unfilled boxes.

Figure 4. The results obtained for the 4 km × 4 km sub-grid when using the trained ANN of pre-determined ANN architecture with different input combinations.

Figure 7. The standardization factors from the training data that were used for the verification cases.

Table 2. The training parameters for the BFGS training algorithm.

Table 3. The impact on RMSE and R² of different numbers of hidden layers, hidden neurons and square grid subdivisions when using only TbH and TbV as input.

Table 4. As for Table 3 but using TbH, TbV and NDVI as input.

Table 5. As for Table 3 but using TbH, TbV and LST as input.

Table 6. As for Table 3 but using TbH, TbV, NDVI and LST as input.

Table 7. Statistical means and standard deviations for the dual polarized brightness temperature and soil moisture values for training (7 November) and verification (14 November and 21 November).

Table 8. Results of using the same and different standardization factors from the training data, for cases with and without region division.

Table 9. RMSE and R² between the regressed and actual soil moisture values.