Estimating Soil Moisture with Landsat Data and Its Application in Extracting the Spatial Distribution of Winter Flooded Paddies

Dynamic monitoring of the spatial pattern of winter continuously flooded paddies (WFP) at regional scales is a challenging but highly necessary process in analyzing trace greenhouse gas emissions, water resource management, and food security. The present study was carried out to demonstrate the feasibility of extracting the spatial distribution of WFP through time series imagery of volumetric surface soil moisture content (θv) at the field scale (30 m). A trade-off approach based on the synergistic use of tasseled cap transformation wetness and temperature vegetation dryness index was utilized to obtain paddy θv. The results showed that the modeled θv was in good agreement with in situ measurements. The overall correlation coefficient (R) was 0.78, with root-mean-square ranging from 1.96% to 9.96% in terms of different vegetation cover and surface water status. The lowest error of θv estimates was found to be restricted at the flooded paddy surface with moderate or high fractional vegetation cover. The flooded paddy was then successfully identified using the θv image with saturated moisture content thresholding, with an overall accuracy of 83.33%. This indicated that the derived geospatial dataset of WFP could be reliably applied to fill gaps in census statistics.


Introduction
China is an important rice-producing country, containing 22% of the world's harvested area of rice and 34% of the world's rice production [1].Winter continuously flooded paddies (WFP) are widely distributed in the mountainous area of Southwest and South China, and remain flooded and fallow after harvest.This rice ecosystem type is formed as part of the process of long-term rice cultivation for the purpose of easing the problem of water shortage for the next rice growing period [2].Therefore, timely and accurate information in the spatiotemporal distribution of WFP, at fine resolution (ď30 m), can help guide irrigation scheduling and water resource management in precision agriculture.Furthermore, because they are a recognized source of greenhouse gases (GHGs), there is increasing interest in the role of rice fields in global warming.Cai et al. [3] reported that flooding of rice paddies in the non-rice growing season can not only trigger continuous methane (CH 4 ) emissions during that season, but also emit large amounts of CH 4 in the subsequent rice growing season.As a result, the water regime represented by surface soil moisture in the season prior to rice cultivation is a primary concern when estimating CH 4 emissions from anthropogenic sources [4].However, no statistical data are available for this parameter in China, meaning it is usually judged subjectively when forming estimations at the global scale.This has inevitably contributed to estimation error and a large degree of uncertainty with respect to such results.
Over the past decade, satellite-based techniques, including optical-thermal and microwave remote sensing methods, have been used in many studies for the purpose of regular and regional determination of surface soil moisture [5][6][7][8][9][10][11][12][13][14].Based on these data, the seasonal water status can then potentially be derived as a reliable substitute for expert judgment in estimating regional CH 4 emissions.Microwave sensors are able to sense the land surface under all-weather conditions without the impact of cloud contamination [15].However, approaches of this kind carry with them several well-known limitations.Passive microwave sensors have a coarse spatial resolution (ě25 km), which is far from adequate for field-scale monitoring owing to the small units of farmland in China.Despite achievements with fairly high spatially resolved data (10-100 m), the application of active microwave sensors (e.g., SAR) usually falls prey to data availability and high computational cost at large scales.Besides, the disturbance effects of soil roughness and vegetation coverage on the SAR backscatter data used in surface soil moisture estimates have also been identified [5].These weaknesses could be largely overcome using optical-thermal remote sensing data, as they have adequate spatial and temporal resolution to capture the seasonal variation in surface soil moisture at large scales and most of the data are widely available free of charge (e.g., Landsat series data).Numerous attempts have been made to monitor surface soil moisture content using methods combining information from visible and near infrared (VNIR) bands and land surface temperature (LST) [7,12,[16][17][18][19].The temperature vegetation dryness index (TVDI) is a commonly used optical-thermal infrared remote sensing method that has proven to be feasible in retrieving surface soil moisture content over partially vegetated areas [10,12,20].However, estimation errors have been found to be higher with low fractional vegetation cover surface when using the soil wetness index (SWI, similar to TVDI), which may constrain the application of TVDI [7].A complementary solution suitable for sparsely vegetated surfaces (e.g., bare soil) is thus needed to improve the retrieval of soil moisture solely using TVDI.
Tasseled cap transformation (TCT) is a linear transformation method for various remote sensing data.Not only can it conduct data volume compression, but it can also provide parameters associated with physical characteristics, such as brightness, greenness, and wetness indices [21][22][23][24][25][26][27][28].Several studies have pointed out that soil moisture status is the primary characteristic expressed in wetness (TCTW) [22,23,25,28], and better water mapping results have been achieved using TCTW compared with other indices [29].However, very little literature exists on the topic of quantifying the soil moisture content using TCTW.On the other hand, although TCT coefficients based on Landsat 8 Operational Land Imager (OLI) at-satellite reflectance were derived by Baig et al. [21], the method for transformation involved potential estimation error.First, the spectral coverage of Thematic Mapper (TM) differs somewhat from that of the OLI sensor, and the TM transform coefficients were developed based on digital number (DN) values [22].Thereby, estimation error may have arisen if the targeted feature space was created by directly applying the DN-based TM transformation coefficients to OLI at-satellite reflectance data.Second, the six-band TCT coefficients may ignore the contribution of the new deep blue band to the TC feature derivation.Third, the transformed results were not validated using other reliable datasets.
The main aims of the present study were: (1) To generate a updated set of Landsat 8 OLI TCT coefficients with an improved method; (2) To access the capability of the synergistic application of soil wetness-related indices involving TCTW and TVDI in combination with an advanced machine learning method for estimating soil moisture content; (3) To present a procedure for the extraction of WFP using multi-temporal surface soil moisture content imagery.The proposed method utilizes, for the first time, the TCTW based on Landsat 8 OLI to quantitatively estimate the soil moisture content, and can be regarded as a preparatory study for the use of future optical-thermal infrared sensors, such as Sentinel-2 and Worldview-3.

Study Area
Paddy rice agriculture in China can be roughly classified into four major rotation systems: (1) Single rice: The major cropping system in Northwest China, Northeast China, and the mountainous and hilly parts of Southwest China.(2) Double rice: Spans from the north of Nanling to the middle and lower reaches of Yangtze River.(3) Rice-upland crop: Adopted across the southern Liao River, Huai River and Hai River valley, Yellow River area and the south of Huai River and Qinling.(4) Double rice-upland crop: Mainly distributed from the tropical and subtropical area to the south of the middle and lower reaches of Yangtze River.
Three major rice-producing provinces, namely Jiangsu, Sichuan and Hunan, were chosen for soil moisture field sampling (Figure 1).Rice-upland crop rotation is the dominant cropping system across Jiangsu province, where the landscape is typically flat and the soil surface is relatively homogenous.In Hunan province, the subtropical monsoon climate brings abundant precipitation and a long warm season, and double rice and double rice-upland crop rotation systems mainly occur.In Sichuan province, rice-upland crop rotation is the major cropping system in rice paddies of Chengdu plain, while single rice cropping is commonly adopted in the mountainous and hilly regions, where a large fraction of the rice paddies are continuously flooded during winter.

Study Area
Paddy rice agriculture in China can be roughly classified into four major rotation systems: (1) Single rice: The major cropping system in Northwest China, Northeast China, and the mountainous and hilly parts of Southwest China.(2) Double rice: Spans from the north of Nanling to the middle and lower reaches of Yangtze River.
(3) Rice-upland crop: Adopted across the southern Liao River, Huai River and Hai River valley, Yellow River area and the south of Huai River and Qinling.(4) Double rice-upland crop: Mainly distributed from the tropical and subtropical area to the south of the middle and lower reaches of Yangtze River.
Three major rice-producing provinces, namely Jiangsu, Sichuan and Hunan, were chosen for soil moisture field sampling (Figure 1).Rice-upland crop rotation is the dominant cropping system across Jiangsu province, where the landscape is typically flat and the soil surface is relatively homogenous.In Hunan province, the subtropical monsoon climate brings abundant precipitation and a long warm season, and double rice and double rice-upland crop rotation systems mainly occur.In Sichuan province, rice-upland crop rotation is the major cropping system in rice paddies of Chengdu plain, while single rice cropping is commonly adopted in the mountainous and hilly regions, where a large fraction of the rice paddies are continuously flooded during winter.A case study approach was applied across the main rice planting regions (e.g., the Dongting Lake region and the Xiangjiang River region) in Hunan and Jiangxi provinces (Figure 2) to provide detailed illustrations for the identification of WFP from the time series of volumetric surface soil moisture content (θv) images.The chosen area was representative owing to the presence of abundant WFP and the major rice crop rotation systems in China.A case study approach was applied across the main rice planting regions (e.g., the Dongting Lake region and the Xiangjiang River region) in Hunan and Jiangxi provinces (Figure 2) to provide detailed illustrations for the identification of WFP from the time series of volumetric surface soil moisture content (θ v ) images.The chosen area was representative owing to the presence of abundant WFP and the major rice crop rotation systems in China.

Field Sampling
From 21 October 2013 to 19 March 2014, three field surveys were conducted in the study area to obtain the θv in rice paddy areas using a mobile Delta-T WET-2 sensor [30] (Table 1).The soil moisture was directly measured between 9:00 a.m. and 4:00 p.m. for each sampling day.A wide range of soil moisture conditions were identified by the θv observations to enable sampling of rice paddy areas, from 22.7% in dry rice-oilseed rape cropping areas, to 72.07% in flooded fallow paddies.A total of 64 field sites (located with GPS), randomly distributed across the rice planting area with a transection of about 5 km, were sampled in Jiangsu, Sichuan, and Hunan provinces separately, in the context of the θv retrieval algorithm (Figure 1).Each sample was the average of three observations taken within a sampling area of about 30 × 30 m 2 at 0-6 cm depth.Before the θv measurements were made, a soil-specific calibration for paddy soil measurements was performed using these samples.Permittivity and gravimetric measurements of soil moisture were obtained at several representative sampling sites as a reference dataset.The custom parameters for paddy soil were then determined by following the calibration procedure mentioned in the product user manual [31].

Field Sampling
From 21 October 2013 to 19 March 2014, three field surveys were conducted in the study area to obtain the θ v in rice paddy areas using a mobile Delta-T WET-2 sensor [30] (Table 1).The soil moisture was directly measured between 9:00 a.m. and 4:00 p.m. for each sampling day.A wide range of soil moisture conditions were identified by the θ v observations to enable sampling of rice paddy areas, from 22.7% in dry rice-oilseed rape cropping areas, to 72.07% in flooded fallow paddies.A total of 64 field sites (located with GPS), randomly distributed across the rice planting area with a transection of about 5 km, were sampled in Jiangsu, Sichuan, and Hunan provinces separately, in the context of the θ v retrieval algorithm (Figure 1).Each sample was the average of three observations taken within a sampling area of about 30 ˆ30 m 2 at 0-6 cm depth.Before the θ v measurements were made, a soil-specific calibration for paddy soil measurements was performed using these samples.Permittivity and gravimetric measurements of soil moisture were obtained at several representative sampling sites as a reference dataset.The custom parameters for paddy soil were then determined by following the calibration procedure mentioned in the product user manual [31].

Tasseled Cap Transformation
A set of 10 consecutive pairs of clear/near cloud-free Landsat Enhanced Thematic Mapper Plus (ETM+) and OLI images that captured both leaf-on and leaf-off conditions under various geographical distributions were selected in this study (Table 2).Each pair of Landsat ETM+/OLI scenes shared the same spatial coverage with an 8-day interval between the OLI and the ETM+ images.The detailed spectral characteristics of Landsat ETM+ and OLI sensors were given in Table 3. Landsat 8 image has narrower visible red (R) and near infrared (NIR) bands than ETM+ image [32].All of these L1T products, which have undergone radiometric and terrain correction processes, were freely provided by the Geospatial Data Cloud, Computer Network Information Center, Chinese Academy of Sciences [33].Raw DN values were then calibrated to at-satellite reflectance prior to using them for image processing in order to eliminate the impact of sun illumination geometry and atmospheric scattering [25,34].The estimates of soil moisture in paddy rice areas were made during the winter periods of 2013 and 2014.The satellite dataset considered in the soil moisture estimation comprised two sets of Landsat images (Table 1).One set was obtained contemporaneously with the field surveys (#1-5) over the study area and were used to develop the procedure for soil moisture retrieval.The other set was gap-filled product and acquired for the extraction of WFP (#6-11).In order to convert the DN values into land surface reflectance, atmospheric correction was applied to each Landsat image using the Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) module integrated in the software of Environment for Visualizing Images (ENVI, Version 5.1).Then, an automated cloud and cloud shadow detection algorithm-Fmask-was adopted to remove the cloud-contaminated pixels in all the images before further analysis [35].

Ancillary Data
First, because this study focused on soil moisture estimation at the paddy surface, a MODIS-based rice paddy map of Hunan and Jiangxi provinces produced from Li et al. [36] was used as a prerequisite to exclude the other land use types (Figure 2).The rice paddy map was georeferenced and resampled to the same resolution and coverage as the Landsat 8 imagery.
Second, a 2-km water holding capacity (WHC) dataset for China was acquired from the Soil Information System of China (SISChina) [37].The relationship between modeled WHC and in situ saturated moisture content (SMC) data in the study area is shown in Figure 3.The SMC data were all obtained from the saturated soil just beneath the flooded paddy.The result of linear regression analysis showed that WHC is highly correlated with SMC (R 2 = 0.867) at saturated soil sample sites.The slope and intercept were 0.828 ˘0.086 and 0.252 ˘0.033, respectively.The soil SMC dataset was then derived from the WHC dataset using an empirical law (Equation ( 1)).These data were used to set the threshold for extracting the WFP.To eliminate these underlying influences, an ensemble average of rotation matrices across the input images was calculated as the TCT for the Landsat 8 OLI imagery.The approach described below was designed to accomplish the derivation of the transformation.
Firstly, a total of 5404 random pseudo-invariant samples, including deep water, dense vegetation and manmade surface, were evenly selected by visual interpolation from the first nine pairs of Landsat OLI/ETM+ scenes given in Table 2 to guide the rotations.The selected objects were considered to be invariant during an 8-day interval between the paired OLI and these ETM+ images  To eliminate these underlying influences, an ensemble average of rotation matrices across the input images was calculated as the TCT for the Landsat 8 OLI imagery.The approach described below was designed to accomplish the derivation of the transformation.
Firstly, a total of 5404 random pseudo-invariant samples, including deep water, dense vegetation and manmade surface, were evenly selected by visual interpolation from the first nine pairs of Landsat OLI/ETM+ scenes given in Table 2 to guide the rotations.The selected objects were considered to be invariant during an 8-day interval between the paired OLI and these ETM+ images [25].An Ordinary Procrustes analysis was then performed on all these nine pairs of Landsat scenes to align the Landsat 8 OLI principal component analysis (PCA) transformed space represented by the selected samples with the pair-wise linked Landsat 7 ETM+ tasseled cap space.The nine initial rotation matrices were subsequently defined through multiplication between the transformation coefficients developed by Procrustes and PCA.
Secondly, a Generalized Procrustes analysis was adopted to obtain the consensus orthogonal rotation matrices from the nine input initial rotation matrices.The transformation was derived through a calculation of the ensemble average rotation matrix of all scenes.
Finally, the Gram-Schmidt procedure was used to preserve the orthogonality of all seven axes, enabling the identification of a new set of orthogonal axes that highlighted the tasseled cap features the most for Landsat 8.A detailed description of the Procrustes analysis algorithms is provided in Supplementary Material 1.

Methods for Extracting the Spatial Distribution of Winter Flooded Paddies
Miscellaneous rice cropping systems in the three representative provinces may pose challenges to the accurate acquisition of the soil moisture.To overcome this problem, an integrated approach was used to sense different fractional vegetation cover and soil moisture conditions from different rice cropping systems (Figure 4).With pairs of feature values (TCTW and TVDI) derived from the Landsat images and θ v measurements were obtained, the TCTW and TVDI indices were introduced as the regressors with θ v as the output in an advanced machine learning algorithm, the general regression neural network (GRNN) [38,39].Over the past decade, probabilistic neural networks like GRNN have been widely used for solving nonlinear problems and preforming predictions because of their flexible network structures, high fault tolerances, and robustness [39][40][41][42][43][44][45][46].Additionally, the primary benefit of GRNN over a traditional back propagation neural network is its ability to obtain a satisfying fit of the data rapidly, with only a few training samples available, and without additional parameter inputs [39,44].These features make GRNN a very efficient tool for constructing predictors for the desired variables. of their flexible network structures, high fault tolerances, and robustness [39][40][41][42][43][44][45][46].Additionally, the primary benefit of GRNN over a traditional back propagation neural network is its ability to obtain a satisfying fit of the data rapidly, with only a few training samples available, and without additional parameter inputs [39,44].These features make GRNN a very efficient tool for constructing predictors for the desired variables.After attaining the monthly paddy θv images during a winter season (December to February), a threshold was needed to differentiate the flooded paddies from non-flooded paddies in each time series of θv images.The overlapping areas of flooded rice paddies of all three tiles could then be regarded as continuously flooded paddies in winter (Figure 4).To this end, the derived SMC distribution data of paddy soil within the case study area were taken as the threshold for After attaining the monthly paddy θ v images during a winter season (December to February), a threshold was needed to differentiate the flooded paddies from non-flooded paddies in each time series of θ v images.The overlapping areas of flooded rice paddies of all three tiles could then be regarded as continuously flooded paddies in winter (Figure 4).To this end, the derived SMC distribution data of paddy soil within the case study area were taken as the threshold for identifying flooded paddies.If the pixel value in the θ v image is larger than the pixel value in the according SMC image, it will then be classified as a flooded pixel.

Tasseled Cap Transformation Coefficients
The Landsat 8 tasseled cap coefficients are listed in Table 4 and compared with the coefficients derived by Baig et al. [21].Except for the deep blue band, the two sets of TCT coefficients are generally similar.There are, however, still some notable differences between them, and the addition of the OLI deep blue visible band may be the cause.The distribution of band weights were somewhat altered due to the added band.On the other hand, the modified methodology is another potential impact factor contributing to the differences between the two sets of coefficients.The quasi-synchronous ETM+ tasseled cap space was taken as the reference target instead of the TM transformed OLI data.These differences highlight the importance of a standardized rotation procedure of the tasseled cap transformation.

Confirmation of Tasseled Cap Features
To confirm the tasseled cap features after transformation, nearly 7000 pseudo-invariant samples, including deep water, dense vegetation, sparse vegetation, manmade objects, and bare soil were selected randomly from the validation image (#10 image given in Table 2).
The brightness, greenness and wetness axes are presented by the "tasseled cap" concept for the two transformations (Figure 5).The shapes of the selected samples from the validation scene in the two sets of tasseled cap spaces are similar, whereas their orientation and relative location differ moderately from each other (Figure 5).The greenness and wetness values of our transformation (hereafter referred to as ETM+-based TCT) were generally lower than those of Baig et al. [21] (hereafter referred to as TM-based TCT).In the brightness-greenness space, the greenness values of bare soil samples derived from the TM-based TCT are all greater than 0, and remain stable as the brightness value increases (Figure 5a).On the contrary, the ETM+-based TC transformed space reveals that the greenness values of bare soil samples are all less than 0, and there is a slight decline in its greenness value as the brightness value rises (Figure 5b).Considering that soil has the highest spectral reflectance in the short-wave infrared band (SWIR1, with a central wavelength of 1.609 µm) within the spectral range of Landsat 8 [47], a smaller loading in the SWIR1 band may lead to lower greenness values of bare soil samples obtained after transformation, making it easier to discriminate bare soil pixels from sparse vegetation pixels using the greenness alone (Figure 5).Larger disparities are found in brightness-wetness/wetness-greenness space.Besides deep water, the wetness value of dense vegetation as well as a few sparse vegetation and manmade objects samples derived from TM-based TCT are also greater than 0, which shows it is not possible to distinguish water pixels from dense vegetation pixels effectively by using wetness index only.It is suggested that the ETM+-based TC transformed space could better highlight the discrepancy between water bodies and other land cover types due to the wetness values of all land cover types derived from ETM+-based TCT being negative, except water pixels, and the wetness values increasing as the brightness values generally decrease.Taken together, the vegetation cover could be identified from the land surface using a threshold of Greenness index > 0, while the water body could be easily extracted from a TCTW image using a threshold of TCTW > 0. Therefore, it would be very efficient if the TCT-derived indices were introduced in a classification process.

Validation of Tasseled Cap Transformation
A total of 2667 pseudo-invariant samples, including dense vegetation, deep water and manmade surfaces, were selected randomly as an independent validation dataset.Mean bias error (MBE), mean absolute error (MAE), and root-mean-square error (RMSE) were used to assess the goodness of fit between the OLI-based TC value and the quasi real-time ETM+-based TC value from the two transformations.

Validation of Tasseled Cap Transformation
A total of 2667 pseudo-invariant samples, including dense vegetation, deep water and manmade surfaces, were selected randomly as an independent validation dataset.Mean bias error (MBE), mean absolute error (MAE), and root-mean-square error (RMSE) were used to assess the goodness of fit between the OLI-based TC value and the quasi real-time ETM+-based TC value from the two transformations.
MBE " 1 n where n is the number of provinces, and E i and O i are the estimated and observed θ v , respectively.The MBE corresponds to the metrics of overestimation (´) and underestimation (+) [48].The method is perfect when MBE = 0, MAE = 0 and RMSE = 0.The results achieved in terms of the three TC features are summarized in Table 5.There appeared to be an obvious overestimation of the TM-based TCT greenness (MBE = 0.0588) and wetness values (MBE = 0.0739), though the correlation coefficients of both approaches were significant at the 0.01 level.Although the estimations of brightness from the two methods were comparable, better performance for greenness and wetness was achieved by our method; the R 2 increased by 0.0133 and 0.4428, the MAE decreased by 0.0312 and 0.0643, and the RMSE decreased by 0.0414 and 0.0811, respectively.Overall, it can be seen that the transform coefficients of this study can better maintain the tasseled cap features.The estimation accuracy, for the wetness index in particular, improved greatly.
There are several possible explanations for the improvement in our estimation.First, there is continuity between the band settings of Landsat 8 and Landsat 7, as well as some new spectral characteristics including the addition of a deep blue band, a shortwave-infrared band, and refinement of the spectral range of several bands corresponding to those of Landsat 7 ETM+ [49].These may bring estimation error if the TM TCT coefficients are directly applied to OLI data for creating the target feature space.Second, the DN-based TM TCT coefficients are suitable for raw DN images rather than radiometrically calibrated data.Therefore, an increase in the estimation error is likely when taking the at-satellite OLI images as input data.In addition, many studies have proven the effectiveness of at-satellite reflectance-based TCT coefficients [25,26]; however, the impacts of atmospheric scattering and sun illumination geometry still exist.The direct application of all samples to the images used to calculate the tasseled cap transformation coefficients may also lower the estimation accuracy.Our method has the capability to overcome the deficiencies mentioned above and can thereby obtain relatively accurate tasseled cap transformation indices.

Soil Moisture Content Retrieval
The relationship between the TCTW, TVDI and θ v could be complicated by the presence of various vegetated conditions in rice paddies during winter [50].The rice paddy may be partially or fully covered by rice straw, grass, or crop canopy (e.g., rape).This may have a negative effect on the θ v retrieval algorithm.It has been reported that the Normalized Difference Tillage Index (NDTI) is a good indicator for differentiating vegetation/crop residue from soil [51,52].NDTI can be calculated by surface reflectance values from the two shortwave-infrared (SWIR) bands: where ρ SW IR1 and ρ SW IR2 refer to the surface reflectance values from Band 5 and Band 7 in the Landsat-7 ETM+ sensor and Band 6 and Band 7 in the Landsat-8 OLI sensor, respectively.Therefore, as summarized from all the NDTI values at the in situ field samples, the error statistics were calculated for two NDTI classes of <0.2 and ě0.2.NDTI < 0.2 represented a low fractional crop residue/vegetation cover condition, whereas NDTI ě 0.2 indicated a moderate or high fractional crop residue/vegetation cover condition.Moreover, the ground water condition could also be a potential factor of influence for θ v estimation.Additionally, it is infeasible to separate the flooded or non-flooded status in paddies from different provinces using a fixed θ v threshold, because the saturated volumetric moisture content of paddy soil may differ in terms of variations in soil texture.Thus, we roughly classified the flooded or non-flooded status by visual interpretation.Taken together, during the non-rice growing period (usually from December to February), the rice paddy surface conditions can mainly be classified into four categories: The 64 field samples were randomly separated into training (46/64) and validation (18/64) samples.Different regressor combinations were inputted to the GRNN model in order to investigate the best performance for θ v retrieval.Assessing the accuracy was achieved by comparing the estimation results with the in situ measured data in terms of different input configurations (Figure 6).The error statistics (bias, MAE and RMSE) were calculated based on the validation dataset for different scenarios (Table 6).The correlation coefficients of all the input configurations were significant at the 0.01 level.The TCTW+TVDI configuration gave the best performance as an input into the model, with the highest correlation coefficient (R = 0.78, P = 0.000) and lowest estimation errors (MAE = 7.19 and RMSE = 7.83).It seemed to be less effective in estimating θ v (for the retrieval process) using either the TCTW or TVDI configuration as the input into the model.The TVDI configuration yielded the largest estimation errors (MAE = 8.94 and RMSE = 10.39), with the lowest correlation coefficient (R = 0.63, P = 0.005).
There was an underestimation of 3.69% in the TVDI configuration but an overestimation of 3.54% in the TCTW configuration.More specifically, the TCTW had a problem in estimating the process of θ v in non-flooded and vegetated areas, where considerably larger errors (MAE = 15.72,RMSE = 15.90) were produced compared to the samples with respect to other land cover types (Table 6).
where ρSWIR1 and ρSWIR2 refer to the surface reflectance values from Band 5 and Band 7 in the Landsat-7 ETM+ sensor and Band 6 and Band 7 in the Landsat-8 OLI sensor, respectively.Therefore, as summarized from all the NDTI values at the in situ field samples, the error statistics were calculated for two NDTI classes of <0.2 and ≥0.2.NDTI < 0.2 represented a low fractional crop residue/vegetation cover condition, whereas NDTI ≥ 0.2 indicated a moderate or high fractional crop residue/vegetation cover condition.Moreover, the ground water condition could also be a potential factor of influence for θv estimation.Additionally, it is infeasible to separate the flooded or non-flooded status in paddies from different provinces using a fixed θv threshold, because the saturated volumetric moisture content of paddy soil may differ in terms of variations in soil texture.Thus, we roughly classified the flooded or non-flooded status by visual interpretation.Taken together, during the non-rice growing period (usually from December to February), the rice paddy surface conditions can mainly be classified into four categories: (1) Flooded (NDTI < 0.2); ( 2) Flooded (NDTI ≥ 0.2); (3) Non-flooded (NDTI < 0.2); and (4) Non-flooded (NDTI ≥ 0.2).The 64 field samples were randomly separated into training (46/64) and validation (18/64) samples.Different regressor combinations were inputted to the GRNN model in order to investigate the best performance for θv retrieval.Assessing the accuracy was achieved by comparing the estimation results with the in situ measured data in terms of different input configurations (Figure 6).The error statistics (bias, MAE and RMSE) were calculated based on the validation dataset for different scenarios (Table 6).The correlation coefficients of all the input configurations were significant at the 0.01 level.The TCTW+TVDI configuration gave the best performance as an input into the model, with the highest correlation coefficient (R = 0.78, P = 0.000) and lowest estimation errors (MAE = 7.19 and RMSE = 7.83).It seemed to be less effective in estimating v (for the retrieval process) using either the TCTW or TVDI configuration as the input into the model.The TVDI configuration yielded the largest estimation errors (MAE = 8.94 and RMSE = 10.39), with the lowest correlation coefficient (R = 0.63, P = 0.005).There was an underestimation of 3.69% in the TVDI configuration but an overestimation of 3.54% in the TCTW configuration.More specifically, the TCTW had a problem in estimating the process of θv in non-flooded and vegetated areas, where considerably larger errors (MAE = 15.72,RMSE = 15.90) were produced compared to the samples with respect to other land cover types (Table 6).The presence of non-paddy rice vegetation, during the non-rice growing period, will increase the greenness signal and reduce the soil signatures as the canopies of green plants or straws partially or fully cover the paddy surface area [53].It has been suggested that the soil moisture status is the primary characteristic expressed in TCTW, while only a limited amount of plant moisture variation is represented in the tasseled cap space [22].As a consequence, the TCTW would work well on the condition that the land surface consisted of merely bare soil without any other materials.However, a much higher accuracy was achieved under flooded and vegetated surface conditions than under non-flooded and non-vegetated surface conditions.The cause of this may be that the water signature often dominates over the flooded rice paddy surface, as it is by a mixture of rice residue, soil and water bodies, and it then conceals the weakness of TCTW in estimating the soil moisture over vegetated areas.Most of the soil and vegetated surfaces are obscured under flooded conditions.
In contrast to the TCTW configuration, the TVDI configuration failed to estimate θ v over the bare soil areas, with the largest range of RMSE (13.02%-14.15%)(Table 6), especially for the samples with a high moisture status (θ v > 50%) (Figure 6).Furthermore, there was a high probability of θ v underestimation over flooded and bare soil areas (MBE = ´13.60%).Mallick et al. [7] pointed out that the estimation results from an LST-VI index method like TVDI or SWI may be less reliable with low fractional vegetation cover because the wet edge is set as a horizontal base of the LST-VI triangle.Despite this, the same situation was observed in our experiment based on the parameters regressed from the sloping wet edge.Hence, we hypothesized that the definition of the wet edge was not a major concern affecting the prediction ability of TVDI.There are several likely causes for the considerably higher estimation error produced by the TVDI method.First, the differences in the optical-thermal infrared spectral bands and the performances of the instruments between Landsat 7 and Landsat 8 can give rise to slight variations in the derived Normalized Difference Vegetation Index (NDVI) and LST data used during the model training and prediction phase.Second, the non-linear stretching effect of NDVI can enhance the low-vegetation cover part within the full fractional vegetation cover range [54], and the NDVI value over the bare soil area may be overestimated to some extent.Third, the land surface emissivity of the bare soil area was simplified to the soil emissivity as a substitution.This may lower the accuracy of the LST retrieval over the bare soil area.Because of this, the model parameters were not perfectly regressed, the prediction ability of TVDI decreased over the bare area.It can therefore be assumed that the combination of TCTW and TVDI can provide high sensitivity to changes in soil moisture over both sparsely and densely vegetated areas during the fallow period.

Extraction of the Winter Continuously Flooded Paddy
After the time series θ v images were collected using the TCTW+TVDI configuration modeling, we generated flooded paddy tiles for each θ v image with SMC thresholding over the case study area.Finally, the spatial distribution of WFP represented by the overlapping areas of flooded rice paddies of all three tiles in winter in Hunan and Jiangxi provinces was extracted based on the rice paddy map (Figure 7).Winter flooded paddies were found to be sparsely distributed throughout the rice planting areas of Hunan and Jiangxi provinces.The amount of WFP around the Dongting Lake and Poyang Lake regions was larger than in the other scenes probably due to the presence of significantly larger amounts of rice paddies.The ratio of the area of WFP to the total area of rice paddy under each Landsat scene coverage was then calculated and summarized in Table 7.The average slope of the map of the case study area was used to classify the scene coverage into flat and hilly areas.The fraction of WFP in flat areas was comparable to that of hilly areas in Hunan, whereas there was an obvious increase in the fraction in flat areas than that of hilly areas in Jiangxi.

Extraction of the Winter Continuously Flooded Paddy
After the time series θv images were collected using the TCTW+TVDI configuration modeling, we generated flooded paddy tiles for each θv image with SMC thresholding over the case study area.Finally, the spatial distribution of WFP represented by the overlapping areas of flooded rice paddies of all three tiles in winter in Hunan and Jiangxi provinces was extracted based on the rice paddy map (Figure 7).Winter flooded paddies were found to be sparsely distributed throughout the main rice planting areas of Hunan and Jiangxi provinces.The amount of WFP around the Dongting Lake and Poyang Lake regions was larger than in the other scenes probably due to the presence of significantly larger amounts of rice paddies.The ratio of the area of WFP to the total area of rice paddy under each Landsat scene coverage was then calculated and summarized in Table 7.The average slope of the map of the case study area was used to classify the scene coverage into flat and hilly areas.The fraction of WFP in flat areas was comparable to that of hilly areas in Hunan, whereas there was an obvious increase in the fraction in flat areas than that of hilly areas in Jiangxi.Although Southwest China boasts a well-developed river system and relatively abundant water resources, the complex terrain and large fraction of mountainous areas always result in there being a long distance between the cropland and water sources, increasing the difficulty involved in water resource utilization.Most regions are unable to irrigate during the dry season because they do not possess the necessary water storage facilities, infrastructure and technology.Storing water in winter is an effective method for some areas to ease the pressure on water resource shortages in the rice growing period.However, long-term flooding is one of the main factors causing high CH 4 emissions during the rice growing period in southern China.The average CH 4 flux from a rice paddy that was flooded in the winter season (about 36.2 g¨CH 4 ¨m´2 ) is approximately 1.9 times that from a rice paddy that experienced a drained winter season [3,4,55].Under these conditions, other rice varieties (e.g., water-saving and drought-resistant rice) and single rice-upland winter crop rotation could be effective options for mitigating CH 4 emissions without reducing grain yields under water-saving irrigation in the traditional WFP [55,56].Therefore, the spatial distribution data of WFP could not only be useful for GHG emissions estimation, but also for guiding the selection of rice varieties and cropping patterns, as well as water resource management in local agricultural production.On the contrary, the plain areas have much better irrigation conditions, and thus there is no need to maintain WFP for supplementing water during the rice growing season.
However, corroboration of a continuous status of flooding remains difficult to fulfill at present.As an alternative solution, confusion matrices were calculated based on the ground-truth dataset to assess the accuracy of flooded paddy identification from the five single-date images in the study area (Table 8).The 18 validation field sites mentioned in Section 3.4, consisting of flooded (6/18) and non-flooded (12/18) samples, were used as the ground-truth dataset, and there was good agreement between the identification result and ground-truth data [57].Furthermore, we compared our estimate with the result from Yan et al. [58], who suggested 10% as the fractional amount of WFP in the CH 4 emissions of both Hunan and Jiangxi.Our results were 15.48% and 18.61%, for the fractional coverage of WFP in the total rice paddy area.This was evident in the case of the comparison between various modeled results and satellite observations, and regional discrepancies were subsequently found (e.g., Sichuan basin) [59].Moreover, the large dependence on the surface water status data in determining the CH 4 emissions has been exemplified in a report by Jonai and Takeuchi [60].Significant underestimation of the fraction of WFP may result in inconsistencies between modeled results and satellite observations.The validation of the WFP maps derived from remotely sensed data has a long way to go owing to a lack of census statistics.Further efforts should be made to create a validation database consisting of dynamic data collected from regional moisture monitoring networks and surveys from a considerable number of local people.

Conclusions
A methodology that combines Landsat-based TCTW and TVDI for regional mapping of flooded paddies over both sparsely and densely vegetated areas has been presented.The implementation and innovative aspects of this methodology are as follows: First, the new 7-band Landsat 8 OLI TCT coefficients were updated.The result showed that a good separation among different land cover types was apparent after our transformation, with substantially smaller prediction error than the existing transformation coefficients.The improved procedure for TCT derivation can also be extended to other forthcoming multi-spectral remotely sensed data (e.g., Sentinel-2 and Worldview-3), if the target sensor in question has a close similarity with the reference sensor (with known TCT coefficients).Second, this is the first time that a synergistic TCTW/TVDI approach has been performed to estimate surface soil moisture content with Landsat images, and satisfying results with various vegetation cover types and ground water conditions have been achieved.Finally, the WFP was extracted from the overlapping area of the time series imagery of flooded paddies with an accuracy of 83.33%.
The methodology described in this paper can be applied to generate a regional geospatial database of WFP, which can then be used to support various studies of agricultural water use, GHG emissions estimates, and food security in China and Asia.This work can also be considered as an exploratory study for the use of other optical-thermal sensors.To obtain a more accurate WFP map, future efforts may focus on improving the retrieval algorithm in the soil moisture estimation process and using remotely sensed time series imagery with high resolution in both time and space.

Figure 1 .
Figure 1.Locations of the 64 field sample sites in Jiangsu, Hunan and Sichuan provinces.

Figure 1 .
Figure 1.Locations of the 64 field sample sites in Jiangsu, Hunan and Sichuan provinces.

Figure 2 .
Figure 2. Map of the Hunan and Jiangxi provinces study area showing the distributions of rice paddies, water bodies, and the six remotely sensed image scenes used the WFP extraction method.

Figure 2 .
Figure 2. Map of the Hunan and Jiangxi provinces study area showing the distributions of rice paddies, water bodies, and the six remotely sensed image scenes used the WFP extraction method.

Figure 3 .
Figure 3. Relationship between WHC and in situ SMC at saturated soil sample sites.

2. 5 .
Data Processing 2.5.1.Methods for Tasseled Cap Transformation Different sun illumination and atmospheric conditions of the images may disturb the derived tasseled cap value, and thus each image has a somewhat different set of transformation coefficients.

Figure 3 .
Figure 3. Relationship between WHC and in situ SMC at saturated soil sample sites.

2. 5 .
Data Processing 2.5.1.Methods for Tasseled Cap Transformation Different sun illumination and atmospheric conditions of the images may disturb the derived tasseled cap value, and thus each image has a somewhat different set of transformation coefficients.

Figure 4 .
Figure 4. Flowchart of the algorithm for extracting WFP from Landsat images.

Figure 4 .
Figure 4. Flowchart of the algorithm for extracting WFP from Landsat images.

Figure 5 .
Figure 5.Comparison of tasseled cap spaces derived by (a) Baig et al. (2014) and (b) this study.

Figure 5 .
Figure 5.Comparison of tasseled cap spaces derived by (a) Baig et al. (2014) and (b) this study.

Figure 7 .
Figure 7. Spatial distribution of WFP derived from 2013-2014 Landsat data in selected areas with a zoom of scene #8.

Figure 7 .
Figure 7. Spatial distribution of WFP derived from 2013-2014 Landsat data in selected areas with a zoom of scene #8.

Table 1 .
List of the Landsat images acquired over the study area in 2013-2014.

Table 1 .
List of the Landsat images acquired over the study area in 2013-2014.

Table 2 .
Landsat images used for TCT derivation.
Note: The Coastal aerosol band is a newly added band in Landsat 8 OLI sensor used for coastal and aerosol studies.

Table 5 .
Assessment of the two tasseled cap transformations.

Table 6 .
Assessment of the model performance under different scenarios.

Table 7 .
Fraction of WFP within Hunan and Jiangxi provinces.The fraction of WFP represents the ratio of the area of WFP to the total area of rice paddy under the corresponding Landsat scene coverage. *

Table 8 .
Assessment of the accuracy of flooded paddy identification.