Enhancing Noah Land Surface Model Prediction Skill over Indian Subcontinent by Assimilating SMOPS Blended Soil Moisture

Abstract: In the present study, soil moisture assimilation is conducted over the Indian subcontinent using the Noah Land Surface Model (LSM), the Soil Moisture Operational Products System (SMOPS) observations, and the Ensemble Kalman Filter. The study is conducted in two stages: assimilation of soil moisture, and simulation of brightness temperature (Tb) using a radiative transfer scheme. The results of data assimilation, in the form of simulated Surface Soil Moisture (SSM) maps, are evaluated for the Indian summer monsoon months of June, July, August, and September (JJAS) using the Land Parameter Retrieval Model (LPRM) AMSR-E soil moisture as reference. Results of a comparative analysis using the Global Land Data Assimilation System (GLDAS) SSM over India are also discussed. Data assimilation using SMOPS soil moisture shows improved prediction over the Indian subcontinent, with an average correlation of 0.96 and an average root mean square difference (RMSD) of 0.0303 m³/m³. The results are promising in comparison with the GLDAS SSM, which has an average correlation of 0.93 and an average RMSD of 0.0481 m³/m³. In the second stage of the study, the assimilated soil moisture is used to simulate X-band brightness temperature (Tb) at an incidence angle of 55° using the Community Microwave Emission Model (CMEM) Radiative Transfer Model (RTM). The aim is to study the sensitivity of the Tb simulation to the parameterization scheme over the Indian subcontinent. The Tb simulation shows that the CMEM parameterization scheme strongly influences the simulated top of atmosphere (TOA) brightness temperature. Furthermore, the Tb simulations from the Wang dielectric model and the Kirdyashev vegetation model show better similarity with the actual AMSR-E Tb over the study region.


Introduction
Land surface states are among the crucial factors that govern global and regional climate through the exchange of moisture and energy between different land classes and the overlying atmosphere. Soil moisture is a critical land surface state with considerable influence on land-atmosphere exchanges. It in turn controls the hydro-meteorological parameters responsible for partitioning the surface incident radiation into latent and sensible heat fluxes. Several studies have examined the influence of soil moisture on hydro-meteorological events; for example, soil moisture data have been used for drought and flood risk estimation [1]. Koster et al. [2] studied the influence of soil moisture on precipitation and found three hotspot regions: the Great Plains of North America, the Sahel, and India. While in situ measurements provide accurate estimates of soil moisture, a real-time, continuous spatio-temporal representation of soil moisture at the global scale is difficult to obtain from them. Land Surface Model (LSM) simulations can provide a continuous estimate of soil moisture over space and time. Since LSMs are physics-based models, they can reliably be used globally without calibration. Some of the widely used global LSMs include the Noah model [3,4], the Community Land Model (CLM) [5], the Variable Infiltration Capacity (VIC) model [6], and the Mosaic model [7]. Land surface state estimates provide much-needed information to related fields such as weather and climate studies, hazard mitigation of hydrological extreme events, and water resource management. The prediction accuracy of LSM land surface states depends on various factors, such as the model's physics, the accuracy of the forcing data, the input land surface parameters, and the land surface initialization. The LSM prediction skill can be improved by constraining the model's prediction with observations using data assimilation (DA) [8].
Data assimilation is a novel data merging technique established by meteorologists [9] that has been used successfully to improve operational weather forecasts and in the field of oceanography [10]. In recent years, many researchers have utilized data assimilation techniques to exploit the increased availability of remotely sensed land surface variables to improve model predictions in the field of hydrology [11][12][13][14]. Different studies have evaluated the effect of soil moisture assimilation on LSM prediction. Rolf et al. [8] evaluated the performance of the ensemble Kalman filter (EnKF) for soil moisture estimation using a synthetic experiment over the Southern Great Plains (SGP) in the United States (US). Dunne & Entekhabi [15] studied the reanalysis approach of soil moisture assimilation over the Little Washita and El Reno stations in the SGP. Yang et al. [16] studied the effect of microwave data assimilation on soil moisture estimation over the Tibetan Plateau and the Mongolian Plateau. Rodell et al. [17] utilized data assimilation techniques on a global scale to assimilate multiple observation data in the four different LSMs of the Global Land Data Assimilation System (GLDAS). GLDAS provides land surface variables on a global scale from four LSMs (NOAH, VIC, CLM, and CATCHMENT) at three-hourly intervals at two spatial resolutions, 0.25° × 0.25° and 1° × 1°. Studies using DA often involve complexities associated with data handling and computational burden when dealing with a large study area and a large amount of remotely sensed observations. Soil moisture plays a key role in controlling the land surface energy balance. Hence, a realistic initialization of soil moisture in hydro-meteorological models will improve the model prediction skill. To date, near-surface soil moisture products are available from a number of satellite-borne sensors operating in the passive and active microwave regions.
The active microwave sensors provide near-surface soil moisture (SSM) at global scales by measuring the backscatter from the surface, unlike the passive sensors, which estimate soil moisture from the brightness temperature emitted by the surface. Remote sensing based soil moisture products include those from the Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E) [18], WindSat [19], and the Soil Moisture and Ocean Salinity (SMOS) mission of the European Space Agency (ESA). Different evaluation studies have been conducted on these products (e.g., [20][21][22][23]).
The Advanced Scatterometer (ASCAT) on board the Meteorological Operational (MetOp) satellite is a real-aperture radar that measures the microwave backscatter at C-band in VV polarization. Soil moisture products from different satellite sensors vary significantly from one another depending on the sensor frequency, the retrieval algorithm employed, and the spatial and temporal resolution. The temporal resolution of the observation data in particular can have an adverse effect on soil moisture assimilation [24]. In this regard, the use of a combined soil moisture product from different sources can considerably improve the soil moisture prediction obtained with assimilation techniques.
Scant literature exists reporting the study of the effect of DA on the LSM prediction skill over the Indian subcontinent. An improved soil moisture prediction over India can be helpful in many applications, such as hydrological forecasts, weather prediction and water resource management.
Keeping in mind the above factors, the main purpose of the present study was to assimilate the combined soil moisture product over the Indian subcontinent and to check the sensitivity of the simulated brightness temperature (Tb) to different parameterization schemes of the Radiative Transfer Model (RTM). For the present study, the Soil Moisture Operational Products System (SMOPS) [25] was used for assimilation. SMOPS is a multi-satellite blended product released by the NOAA (National Oceanic and Atmospheric Administration) NESDIS (National Environmental Satellite, Data, and Information Service) for assimilating soil moisture from all available satellite soil moisture products into the operational Global Forecast System (GFS). Multi-satellite products like SMOPS can provide soil moisture at high temporal resolution, which can improve the effect of assimilation on LSM prediction skill. Further details regarding the SMOPS data used for the study are given in Section 2.3. To the best knowledge of the authors, no study exists in the literature that has examined the effects of DA using SMOPS over the Indian subcontinent.
This study is organized as follows. Section 2 briefly describes the study area, datasets, and models used for the present work. It also briefly explains the LSM spin-up and the RTM. The methodology is outlined in Section 3, along with the comparison method adopted for the study. Section 4 contains the results of the study. The diagnostics of the assimilation performance are discussed in Section 5. The conclusions of the study are given in Section 6.

Study Area
This study is conducted on the Indian subcontinent, spanning from a lower left corner at 8.125°N, 68.125°E to an upper right corner at 37.375°N, 97.375°E, with a total of 13,924 pixels at a resolution of 0.25° × 0.25°. Figure 1 shows the study region, which has a varied topography including deserts in the northwest (Thar Desert), snow-clad mountains in the north (Himalaya), and a high plateau region (Tibetan Plateau) in the east. The study region also includes the largest delta (Ganges Delta). The region is mainly subject to four types of weather conditions: winter (December-February), summer (March-May), monsoon (June-September), and retreating southwest monsoon (October-November). The major portion of rainfall over India occurs during the Indian summer monsoon months. For this reason, the monsoon season is selected for the evaluation of results in the present study.

The Land Parameter Retrieval Model (LPRM) AMSR-E Soil Moisture
The LPRM soil moisture is a product of NASA's Aqua satellite, which carries the Advanced Microwave Scanning Radiometer-Earth Observing System (AMSR-E). The satellite operates in a polar Sun-synchronous orbit, with equator crossings at 0130 local time (LT) in the descending pass and 1330 LT in the ascending pass. AMSR-E measures the Earth's upwelling brightness temperature at six frequencies (6.9, 10.7, 18.7, 23.8, 36.5, and 89.0 GHz) and two polarizations at an incidence angle of 55°.
The NASA-VUA (Vrije Universiteit Amsterdam) algorithm [26] is used to derive land surface parameters, namely surface soil moisture (SSM), land surface temperature, and vegetation water content, from AMSR-E brightness temperature using LPRM. It is based on a three-parameter retrieval model for deriving SSM and vegetation water content (VWC) from one dual-polarized channel (either 6.9 or 10.7 GHz). The forward radiative transfer model in LPRM is a one vegetation layer (τ-ω) model. The advantage of LPRM over the NSIDC (National Snow and Ice Data Center) soil moisture is the use of high-frequency channels to retrieve land surface temperature and the parameterization of vegetation optical depth [27], which optimize the soil moisture retrieval. The level 3 LPRM gridded datasets provide SSM at a spatial resolution of 0.25 degree, with two files per day for the ascending and descending passes. The LPRM data are available for the period from June 2002 to October 2011. For the present study, the LPRM data for the monsoon months of June, July, August, and September of 2010 and 2011 are used for comparison with the results of data assimilation.

SMOPS Blended Products
The Soil Moisture Operational Products System (SMOPS) is a multi-satellite global soil moisture product developed by NOAA and NESDIS as a pre-processor for assimilating all available satellite soil moisture data products into the Global Forecast System (GFS). It retrieves soil moisture from low-frequency microwave satellite sensors. The retrieval strategy of SMOPS is to first retrieve soil moisture from the microwave brightness temperature of a baseline sensor, either AMSR-E (primary) or WindSat (secondary, used as the baseline sensor after the failure of AMSR-E), using the single channel retrieval (SCR) algorithm, and then to merge soil moisture retrievals from the ASCAT of the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) and the SMOS mission of the European Space Agency (ESA) onto the baseline retrieval. The SCR algorithm uses the 10.7 GHz horizontally polarized level-1b brightness temperature from AMSR-E or WindSat to extract soil moisture on a global grid. To increase the spatial and temporal coverage of the product, the ASCAT and SMOS soil moisture products are merged after being scaled to the AMSR-E climatology using the cumulative distribution function (CDF) matching approach. The preliminary step in SMOPS blending is to obtain the soil moisture data products; once the retrievals are obtained, they are scaled to the climatology of the AMSR-E retrievals, and the latest retrieval for a grid cell is selected to represent the soil moisture observation for that cell. When multiple retrievals are available for a grid cell, equal weights are applied to obtain the merged soil moisture product. The SMOPS global soil moisture maps are generated on 0.25° × 0.25° grids at 6-hourly and daily intervals, and include volumetric soil moisture values for the surface (1-5 cm) soil layer along with associated quality information and metadata.
Further details regarding the SMOPS algorithm can be obtained from [25]. For the present study, the SMOPS daily product was assimilated into the Noah LSM on a daily basis for 2010-2011. Before DA, the SMOPS observations are scaled to the model's climatology using the cumulative distribution function matching approach. The SMOPS data were obtained from NASA's LIS Data Portal.
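The CDF matching step mentioned above can be illustrated with a minimal sketch. It is not the SMOPS or LIS implementation; it assumes a simple empirical CDF built from a reference record of observations and a reference record of model values for the same grid cell, and the function name `cdf_match` and its arguments are hypothetical.

```python
from bisect import bisect_left


def cdf_match(obs, obs_ref, model_ref):
    """Rescale observations so that their empirical CDF matches the model's.

    obs       : values to rescale
    obs_ref   : reference record of observations (at least two values)
    model_ref : reference record of model soil moisture for the same cell
    """
    obs_sorted = sorted(obs_ref)
    model_sorted = sorted(model_ref)
    n_obs, n_mod = len(obs_sorted), len(model_sorted)
    matched = []
    for value in obs:
        # Empirical non-exceedance probability of the value in the observation record
        rank = bisect_left(obs_sorted, value)
        prob = min(max(rank / (n_obs - 1), 0.0), 1.0)
        # Model value at the same probability, interpolated between order statistics
        pos = prob * (n_mod - 1)
        lower = int(pos)
        upper = min(lower + 1, n_mod - 1)
        frac = pos - lower
        matched.append(
            model_sorted[lower] + frac * (model_sorted[upper] - model_sorted[lower])
        )
    return matched
```

With this sketch, an observation at the median of its own record is mapped to the median of the model record, which removes systematic differences in mean and dynamic range before the filter update.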

Global Data Assimilation System (GDAS)
The Global Data Assimilation System (GDAS) is an atmospheric assimilation system by NCEP (National Centers for Environmental Prediction) [28]. It assimilates observations such as surface observations, balloon data, wind profiler data, aircraft reports, buoy observations, radar observations, and satellite observations onto a gridded model space using a four-dimensional multivariate approach [17]. The GDAS operational analysis is performed for four synoptic hours: 00, 06, 12, and 18 UTC. In addition, 3-h and 6-h background forecasts are produced for each analysis. The GDAS data are stored on a Gaussian grid with 768 grid points in longitude and 364 grid points in latitude, with 64 model atmospheric levels. For the present study, the GDAS forcing data enhanced by the GLDAS team at NASA Goddard Space Flight Center (GSFC) have been used in conjunction with the Noah LSM.

Ecoclimap
Ecoclimap is an open source global land surface parameter database available at 1 km resolution, developed by Météo-France and the Centre national de la recherche scientifique (CNRS), France. In the Ecoclimap database, surface parameters are retrieved from different sources. Parameters including soil type, clay fraction, sand fraction, and silt fraction are obtained from FAO datasets. The land cover parameters are obtained by combining several datasets (such as the coastline, lakes, and river valleys) at 1 km resolution. These are obtained from the Digital Chart of the World (DCW) database, with the main land cover map adapted from the University of Maryland (UMD), and the permanent snow and wetlands obtained from the Global Gridded Surfaces of Selected Soil Characteristics (IGBP-DIS) map [29]. For the present study, Ecoclimap data were used to obtain land surface parameters for the CMEM RTM.

ISMN Ground Data for Tibetan Plateau
The International Soil Moisture Network (ISMN) is a global initiative coordinated by the Global Energy and Water Exchanges Project (GEWEX) along with the Group on Earth Observations (GEO) and the Committee on Earth Observation Satellites (CEOS). This study utilizes 31 ISMN soil moisture stations in the central Tibetan Plateau (CTP) for validation, as shown in Figure 2. The CTP soil moisture network is set up over an area of 100 × 100 km² within a cold and semi-arid region at an average elevation of 4650 m above sea level. The CTP network was installed in three stages during 2010-2012. In the first stage, a network of 30 stations (large network) was installed covering the complete CTP; it started measurements in August 2010. In the second stage, in July 2011, 20 more stations were deployed to increase the spatial density of observation stations within an area of 25 × 25 km² (medium network). In the third stage, in June 2012, six more stations were deployed to increase the spatial density within an area of 10 × 10 km² (small network) [30].

Land Surface Model Framework
The present study utilizes the Land Information System (LIS) [31,32] software framework, developed by the Hydrological Sciences Laboratory at NASA's Goddard Space Flight Center, for the assimilation of SMOPS soil moisture into the Noah LSM. LIS provides a framework for sequential assimilation of various observations into different land surface models. Version 3.6 of the Noah LSM in the LIS framework was used for the present study. The Noah LSM is adapted from the Oregon State University (OSU) LSM [33], which consists of two soil layers with thermal conduction equations for soil temperature and a form of Richards' equation for soil moisture. A major effort was undertaken by NCAR (National Center for Atmospheric Research), NCEP, the U.S. Air Force Weather Agency (AFWA), and the university community to develop and implement a unified Noah LSM, an enhanced version of the OSU/Noah LSM with four soil layers of increasing thickness (10, 30, 60, and 100 cm) [34]. The Noah LSM is based on the coupling of the diurnally dependent Penman potential evaporation approach of [35], the multilayer soil model, and the primitive canopy model of [36].

LSM Spin-up
Improper initialization of land surface models can result in erroneous outputs for earth system processes. LSM initial conditions are the spatially varying land surface variables that describe the surface water and energy states at the commencement of the simulation. These land surface states include the soil moisture and temperature of each layer, the canopy water content, and other vegetation properties. The initial conditions vary between models because each model has its own climatology, which is affected by the model physics, the input forcing data, and the vegetation, soil, and topographic parameters. Since the model climatology differs from that observed in nature, the initial conditions are taken as the set of states from a long-term simulation of a stable LSM with consistent forcing [37]. According to the Project for Intercomparison of Land Surface Parameterization Schemes (PILPS) Phase 1 [38], the spin-up time is the number of yearly repeated loops required until the model states of the latest loop are identical to those of the previous loop. The Global Soil Wetness Project (GSWP) adopted a convergence criterion to define the spin-up of an LSM: the number of yearly loops after which the difference between the total soil moisture content of the present and the previous simulation is within ±5%. The Noah LSM for the present study is spun up by cycling seven times (21 years) through the period from 1 January 2008 to 31 December 2010 using the meteorological forcing from GDAS, with details as given in Table 1. After seven loops (21 years), the monthly mean difference in total soil moisture content from the previous simulation was found to be within ±5% for the complete study region.
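The ±5% convergence check described for the spin-up can be sketched as follows. This is an illustrative reading of the criterion, not code from LIS; the function names and the representation of each loop as a list of monthly mean total soil moisture values are assumptions.

```python
def spinup_converged(previous_loop, current_loop, tolerance_pct=5.0):
    """Check the convergence criterion between two consecutive spin-up loops.

    Each loop is a sequence of monthly mean total soil moisture values
    (same units, same months). Returns True when every monthly value
    changes by less than tolerance_pct percent.
    """
    for prev, curr in zip(previous_loop, current_loop):
        change_pct = 100.0 * abs(curr - prev) / prev
        if change_pct >= tolerance_pct:
            return False
    return True


def loops_to_converge(loops, tolerance_pct=5.0):
    """Return the 1-based number of the first loop meeting the criterion, or None."""
    for index in range(1, len(loops)):
        if spinup_converged(loops[index - 1], loops[index], tolerance_pct):
            return index + 1
    return None
```

In this form the check is applied to domain-averaged monthly means; applying it per grid cell would be stricter and is a design choice the text above does not settle.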

Data Assimilation Framework
Sequential data assimilation algorithms proceed recursively through time, alternating between a model forecast stage and an observation assimilation update stage. For the present study, the Ensemble Kalman Filter (EnKF) was used as the data assimilation algorithm.

Ensemble Kalman Filter
The EnKF data assimilation algorithm provides an optimal way to merge the model states and the corresponding observations considering their error characteristics. It has been widely used for hydrological assimilation [8], especially for soil moisture assimilation [39][40][41]. The EnKF algorithm is a Bayesian filtering process, which alternates between an ensemble forecast step, where an ensemble of model states is propagated forward in time using the model equations, and a state variable update step, where the simulated state is updated with the observation states when the observations are available.
The updated land surface states, also known as the analyzed estimate, are given as

$$U^{t}_{i,j} = F^{t}_{i,j} + K^{t}_{i}\left(O^{t}_{i,j} - H\,F^{t}_{i,j}\right) \qquad (1)$$

where i is the grid number, j is the ensemble member index from 1 to 30, and $U^{t}_{i,j}$, $F^{t}_{i,j}$, and $O^{t}_{i,j}$ are the updated state, forecast state, and observation state vectors, respectively. H is the observation operator, which relates the model states to the observation states. K is the Kalman gain matrix, given by

$$K^{t}_{i} = C^{t}_{\psi\psi i} H^{T}\left[H\,C^{t}_{\psi\psi i} H^{T} + C^{t}_{\omega\omega i}\right]^{-1} \qquad (2)$$

where $C^{t}_{\psi\psi i}$ and $C^{t}_{\omega\omega i}$ are the error variances of the forecast and measurement estimates, respectively. The LIS framework adopts the one-dimensional EnKF approach of [42]. The ensemble size adopted for this study was thirty, based on the study by [43], which showed no considerable improvement in the results when the ensemble size exceeds twelve.
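To make the update step concrete, the following is a minimal sketch of a one-dimensional EnKF analysis for a single grid cell, in the standard form where each member is updated by the gain times its innovation (perturbed observation minus forecast). It assumes a scalar state (surface soil moisture), a scalar observation operator `h` (1.0 when the observation is already in model space), and a forecast error variance estimated from the ensemble spread. The function name and arguments are hypothetical, and this is not the LIS implementation.

```python
import random
from statistics import variance


def enkf_update(forecast, observation, obs_error_std, h=1.0, rng=None):
    """Scalar EnKF analysis step for one grid cell.

    forecast      : list of forecast states, one per ensemble member
    observation   : observed value for this cell and time
    obs_error_std : assumed observation error standard deviation
    Returns (updated ensemble, Kalman gain).
    """
    rng = rng or random.Random(0)
    forecast_var = variance(forecast)   # forecast error variance from the ensemble
    obs_var = obs_error_std ** 2        # measurement error variance
    # Scalar Kalman gain: C H^T (H C H^T + R)^-1
    gain = forecast_var * h / (h * forecast_var * h + obs_var)
    updated = []
    for member in forecast:
        # Each member is updated against its own perturbed observation
        perturbed_obs = observation + rng.gauss(0.0, obs_error_std)
        updated.append(member + gain * (perturbed_obs - h * member))
    return updated, gain
```

The gain tends to 1 when the observation error is small relative to the ensemble spread, so the analysis follows the observation, and tends to 0 when the observation error dominates, so the analysis stays close to the forecast.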

Perturbation Attributes
The error covariance matrix from the ensemble of land surface conditions is used to calculate the Kalman gain matrix using Equation (2). For the present study, perturbations are applied to the meteorological forcing inputs, the model-estimated state variables, and the observations to maintain an ensemble of thirty land surface conditions that represents the uncertainty in the soil moisture estimates, with details as shown in Table 2. A normally distributed, zero-mean additive perturbation was applied to downward longwave radiation, and log-normally distributed (mean 1) multiplicative perturbations were applied to precipitation and downward shortwave radiation. Additionally, the Noah LSM estimated soil moisture for all four layers is perturbed with zero-mean, normally distributed additive noise, based on the study by [41].
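A minimal sketch of how such forcing perturbations could be drawn is given below. The standard deviations are placeholder values chosen only for illustration (the values actually used are those of Table 2, which are not reproduced here), and the dictionary keys and function names are hypothetical.

```python
import math
import random


def additive_normal(value, std, rng):
    """Add zero-mean Gaussian noise with the given standard deviation."""
    return value + rng.gauss(0.0, std)


def multiplicative_lognormal(value, std, rng):
    """Multiply by a log-normal factor with mean 1 and standard deviation std."""
    sigma_sq = math.log(1.0 + std ** 2)   # variance of the underlying normal
    mu = -0.5 * sigma_sq                  # makes the expected factor equal to 1
    return value * rng.lognormvariate(mu, math.sqrt(sigma_sq))


def perturb_forcing(forcing, rng, lw_std=30.0, sw_std=0.3, precip_std=0.5):
    """Perturb one forcing record; the std defaults are illustrative placeholders."""
    return {
        "longwave": additive_normal(forcing["longwave"], lw_std, rng),
        "shortwave": multiplicative_lognormal(forcing["shortwave"], sw_std, rng),
        "precip": multiplicative_lognormal(forcing["precip"], precip_std, rng),
    }


def build_ensemble(forcing, size=30, seed=0):
    """Create an ensemble of independently perturbed forcing records."""
    rng = random.Random(seed)
    return [perturb_forcing(forcing, rng) for _ in range(size)]
```

Multiplicative log-normal factors keep precipitation and shortwave radiation non-negative and leave zero values at zero, which is one reason they are commonly preferred over additive noise for these variables.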

Radiative Transfer Model
The Community Microwave Emission Model (CMEM) is a forward model developed by ECMWF to simulate low-frequency (1-20 GHz) top of the atmosphere (TOA) microwave brightness temperatures (Tb) from the surface of the earth [44]. The CMEM radiative transfer model adopts a three dielectric layer (soil, vegetation, and atmosphere) model to calculate the TOA microwave brightness temperature. The TOA brightness temperature for a snow-free area with polarization p can be expressed as:

$$T_{Btoa,p} = T_{Bau,p} + \exp(-\tau_{atm,p})\,T_{Btov,p} \qquad (3)$$

$$T_{Btov,p} = T_{Bsoil,p}\exp(-\tau_{veg,p}) + T_{Bveg,p}\left(1 + r_{r,p}\exp(-\tau_{veg,p})\right) + T_{Bad,p}\,r_{r,p}\exp(-2\tau_{veg,p}) \qquad (4)$$

where $T_{Bau,p}$ (K) is the up-welling atmospheric emission; $\tau_{atm,p}$ is the atmospheric optical depth; $T_{Btov,p}$ (K) is the top of vegetation brightness temperature, where vegetation is represented as a single-scattering layer; $T_{Bsoil,p}$ (K) is the soil Tb contribution; $T_{Bveg,p}$ (K) is the vegetation Tb contribution; $T_{Bad,p}$ (K) is the downward atmospheric Tb contribution; $r_{r,p}$ is the soil reflectivity of the rough surface; and $\tau_{veg,p}$ is the vegetation optical depth along the viewing path of the satellite.
The CMEM physics offers different parameterization schemes for the soil, vegetation, and atmospheric dielectric layers, such as those adopted in the L-Band Microwave Emission of the Biosphere (LMEB, [45]). Based on the studies conducted by [46,47], the present study adopted the Kirdyashev opacity model and the Wang and Schmugge dielectric model as parameterization scheme one, and the CMEM default parameters as scheme two, with further parameters as given in Table 3. The land surface parameters were obtained from Ecoclimap data.
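The layer composition can be written as two small functions. This sketch assumes the usual CMEM formulation in which the TOA brightness temperature is the up-welling atmospheric emission plus the top of vegetation brightness temperature attenuated by the atmosphere; the soil, vegetation, and atmospheric contributions themselves are taken as given inputs rather than computed from a dielectric or opacity model. Function and argument names are hypothetical.

```python
import math


def tb_top_of_vegetation(tb_soil, tb_veg, tb_ad, r_rough, tau_veg):
    """Top of vegetation brightness temperature (K) for one polarization.

    tb_soil : soil contribution (K)
    tb_veg  : vegetation contribution (K)
    tb_ad   : downward atmospheric contribution (K)
    r_rough : rough-surface soil reflectivity (0 to 1)
    tau_veg : vegetation optical depth along the viewing path
    """
    attenuation = math.exp(-tau_veg)
    return (
        tb_soil * attenuation                       # soil emission through the canopy
        + tb_veg * (1.0 + r_rough * attenuation)    # direct and soil-reflected canopy emission
        + tb_ad * r_rough * attenuation ** 2        # reflected sky emission, two canopy passes
    )


def tb_top_of_atmosphere(tb_au, tau_atm, tb_tov):
    """TOA brightness temperature (K) from the top of vegetation value."""
    return tb_au + math.exp(-tau_atm) * tb_tov
```

Because the parameterization schemes only change how `tb_soil`, `tb_veg`, `r_rough`, and `tau_veg` are computed, a sensitivity study of the schemes amounts to feeding different values into the same composition.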

Study Overview
The present study investigates the performance of the LSM after assimilating SMOPS blended soil moisture products over the Indian subcontinent. This is done by comparing the results with different soil moisture measurements and simulations. The methodology involved the following three stages:
(a) Data assimilation, where the Noah LSM was initialized with spin-up runs (explained in Section 2.7) and SMOPS soil moisture was then assimilated into the initialized model using NASA's LIS framework at 0.25° × 0.25° spatial resolution, following the procedure shown in Figure 3. A similar procedure was used by Kumar et al. [43].
(b) The monthly mean results for the JJAS months from data assimilation were compared with the GLDAS soil moisture data and the LPRM AMSR-E soil moisture. Two spatial domains within the study region were studied using daily mean time series plots. Domain 1 was over central India, while domain 2 was over the Central Tibetan Plateau (CTP). In this stage, the results over the CTP were also compared with the ground soil moisture observations from ISMN data.
(c) The land surface variables from the data assimilation were utilized to simulate the TOA brightness temperature at 10.7 GHz and an incidence angle of 55°, which was compared with the actual AMSR-E Tb.
The comparison of soil moisture from the data assimilation and the GLDAS Noah soil moisture was performed using the soil moisture product from Vrije Universiteit Amsterdam (VUA) in collaboration with NASA (LPRM-AMSR-E). This is discussed in Section 3.2. The LPRM soil moisture dataset was selected for validation over the NSIDC soil moisture based on the study by Rudiger et al. [27], in which the LPRM based soil moisture outperformed the NSIDC soil moisture over France. Furthermore, Cho et al. [48] showed that the LPRM based AMSR-E soil moisture outperformed the NSIDC soil moisture for some regions in Northeast Asia. The second stage of comparison was conducted using the AMSR-E 10.7 GHz Tb as the reference data, as explained in Section 3.4.

Comparing Simulated Soil Moisture with LPRM AMSR-E Soil Moisture
In this study, the monthly mean soil moisture for layer 1 (0-10 cm) of the Noah LSM (after SMOPS soil moisture assimilation) is compared with the GLDAS layer 1 soil moisture from the Noah LSM and with the LPRM AMSR-E soil moisture. The mean of the ascending and descending passes of the LPRM soil moisture is used as reference. The comparison is conducted for the monsoon season (JJAS) of 2010-2011. Furthermore, the daily mean soil moisture from layer 1 was studied by creating monthly time series plots for two domains. For the second domain (CTP), ISMN ground observation data were used in addition to the above data. To compare the soil moisture indices (ranging from 0 to 1) with the model-based soil moisture, the AMSR-E soil moisture indices were converted to physical units of m³/m³ following the approach of [49], which uses the 90% confidence interval of the Noah simulations under the assumption that the distribution is Gaussian:

$$
sm_{AMSR\text{-}E} = sm_{01} \times \left[ int^{+}_{90\%}(sm_{Noah}) - int^{-}_{90\%}(sm_{Noah}) \right] + int^{-}_{90\%}(sm_{Noah}) \qquad (5)
$$

with

$$
int^{+}_{90\%}(sm_{Noah}) = \mu(sm_{Noah}) + 1.64 \times \sigma(sm_{Noah}), \qquad
int^{-}_{90\%}(sm_{Noah}) = \mu(sm_{Noah}) - 1.64 \times \sigma(sm_{Noah})
$$

where μ(sm_Noah) is the mean for each pixel, σ(sm_Noah) is the standard deviation for each pixel in the simulated surface soil moisture, and sm_01 is the AMSR-E soil moisture index (ranging from 0 to 1).
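The rescaling in Equation (5) can be illustrated with a minimal sketch for a single pixel. The function names, the sample values, and the use of the population standard deviation are assumptions made for illustration; they are not taken from the study's processing chain.

```python
from statistics import fmean, pstdev

Z_90 = 1.64  # multiplier used in Equation (5) for the 90% interval


def noah_interval(noah_series):
    """Return the (lower, upper) 90% bounds from one pixel's Noah SSM series."""
    mu = fmean(noah_series)
    sigma = pstdev(noah_series)  # assumption: population standard deviation
    return mu - Z_90 * sigma, mu + Z_90 * sigma


def rescale_index(sm01, lower, upper):
    """Map a soil moisture index in [0, 1] onto [lower, upper] in m3/m3."""
    if not 0.0 <= sm01 <= 1.0:
        raise ValueError("soil moisture index must lie in [0, 1]")
    return sm01 * (upper - lower) + lower


# Hypothetical Noah SSM values (m3/m3) for one pixel; not data from the study.
noah_pixel = [0.18, 0.22, 0.25, 0.28, 0.32]
lower, upper = noah_interval(noah_pixel)
rescaled = [rescale_index(i, lower, upper) for i in (0.0, 0.5, 1.0)]
```

By construction, an index of 0 maps to the lower bound, an index of 1 to the upper bound, and an index of 0.5 to the pixel mean of the Noah simulation.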
ISMN ground soil moisture data were available for the Tibetan Plateau and were used in the validation of the second domain. The difficulty in using field-probe-based soil moisture is the mismatch in spatial scale between the point-based observations and the grid-based LSM simulations. Because soil moisture has high spatiotemporal variation, the measurement of a single station generally cannot represent the simulation of an LSM grid cell. To overcome this issue, a spatial averaging method was adopted. Domain 2 includes 31 ISMN stations, which were spatially averaged and compared with the average of the grid cells from the LSM simulation. The spatial averaging can reduce the noise and make the comparison more reliable. The same method was adopted by [50] for comparing the GLDAS simulations over the Tibetan Plateau.

Statistical comparison between Assimilated, GLDAS and LPRM Soil Moisture
To quantify the soil moisture prediction skill of the assimilated and GLDAS products, they are compared with the LPRM AMSR-E soil moisture as reference, using second-order statistics.
The correlation coefficient (R) is computed to verify the closeness in the spatial patterns of the simulated and observed soil moisture fields. For a reference soil moisture (r) with standard deviation σ_r and a predicted soil moisture (p) with standard deviation σ_p, over a study area with N grid cells, it is given by:

$$
R = \frac{\frac{1}{N}\sum_{n=1}^{N}\left(p_n - \bar{p}\right)\left(r_n - \bar{r}\right)}{\sigma_p \, \sigma_r}
$$

The centered RMS difference (E') is calculated as:

$$
E'^2 = \frac{1}{N}\sum_{n=1}^{N}\left[\left(p_n - \bar{p}\right) - \left(r_n - \bar{r}\right)\right]^2
$$

They are related by the following formula:

$$
E'^2 = \sigma_p^2 + \sigma_r^2 - 2\,\sigma_p\,\sigma_r\,R
$$

Utilizing this relation, the Taylor diagram represents these second-order statistics in a two-dimensional plot [51]. In the Taylor diagram, the standard deviation is denoted by the radial distance of the point from the origin, the correlation with the observation is represented by the angle in the polar plot, and the centered RMS difference is represented by the distance of the point from the observation point on the X-axis.
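The second-order statistics summarized by the Taylor diagram can be sketched as follows, using the standard definitions (population statistics with a 1/N normalization). The function name and the sample values are hypothetical and are not drawn from the study's data.

```python
import math


def taylor_stats(pred, ref):
    """Return (sigma_p, sigma_r, R, centered RMS difference) for two fields."""
    n = len(ref)
    if n == 0 or len(pred) != n:
        raise ValueError("pred and ref must be non-empty and of equal length")
    p_mean = sum(pred) / n
    r_mean = sum(ref) / n
    p_anom = [p - p_mean for p in pred]
    r_anom = [r - r_mean for r in ref]
    sigma_p = math.sqrt(sum(a * a for a in p_anom) / n)
    sigma_r = math.sqrt(sum(a * a for a in r_anom) / n)
    if sigma_p == 0.0 or sigma_r == 0.0:
        raise ValueError("correlation is undefined for a constant field")
    covariance = sum(a * b for a, b in zip(p_anom, r_anom)) / n
    corr = covariance / (sigma_p * sigma_r)
    crmsd = math.sqrt(sum((a - b) ** 2 for a, b in zip(p_anom, r_anom)) / n)
    return sigma_p, sigma_r, corr, crmsd


# Hypothetical soil moisture values (m3/m3) on five grid cells.
ref = [0.10, 0.15, 0.22, 0.30, 0.27]
pred = [0.12, 0.14, 0.25, 0.28, 0.30]
sigma_p, sigma_r, corr, crmsd = taylor_stats(pred, ref)

# Law-of-cosines relation that underlies the geometry of the Taylor diagram.
law_of_cosines = sigma_p**2 + sigma_r**2 - 2.0 * sigma_p * sigma_r * corr
```

The squared centered RMS difference equals `law_of_cosines` up to rounding, which is why the three statistics can be shown together on a single two-dimensional plot.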
Recent studies have adopted triple collocation (TC) [52-54] to evaluate soil moisture measurements from different sources. In this method, three collocated datasets jointly provide sufficient constraints for determining the error variance of each dataset. For the present study, the error variances of the DA results and of GLDAS are obtained by adopting the Extended Triple Collocation (ETC) [53], using the temporally and spatially collocated absolute values of soil moisture for the JJAS months of 2010-2011.
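The covariance-based form of triple collocation can be sketched on synthetic data as below. The three series `x`, `y` and `z` are placeholders for any three collocated products with mutually independent errors; the synthetic truth, gains, offsets and noise levels are assumptions chosen only to show that the estimator recovers known error variances, and they do not represent the datasets used in the study.

```python
import random


def covariance(a, b):
    """Population covariance of two equally long sequences."""
    n = len(a)
    a_mean = sum(a) / n
    b_mean = sum(b) / n
    return sum((u - a_mean) * (v - b_mean) for u, v in zip(a, b)) / n


def etc_estimates(x, y, z):
    """Return error variances and squared correlations with the unknown truth."""
    qxx, qyy, qzz = covariance(x, x), covariance(y, y), covariance(z, z)
    qxy, qxz, qyz = covariance(x, y), covariance(x, z), covariance(y, z)
    err_var = (
        qxx - qxy * qxz / qyz,
        qyy - qxy * qyz / qxz,
        qzz - qxz * qyz / qxy,
    )
    rho_sq = (
        qxy * qxz / (qxx * qyz),
        qxy * qyz / (qyy * qxz),
        qxz * qyz / (qzz * qxy),
    )
    return err_var, rho_sq


# Synthetic experiment: a common "truth" seen through three linear,
# independently noisy measurement systems (all numbers are made up).
rng = random.Random(0)
truth = [rng.gauss(0.0, 1.0) for _ in range(50000)]
x = [t + 0.2 * rng.gauss(0.0, 1.0) for t in truth]
y = [0.5 + 0.8 * t + 0.3 * rng.gauss(0.0, 1.0) for t in truth]
z = [-0.1 + 1.2 * t + 0.4 * rng.gauss(0.0, 1.0) for t in truth]

err_var, rho_sq = etc_estimates(x, y, z)
```

Each error variance is expressed in the units of its own dataset, so products with different gains are not directly comparable through `err_var` alone; the squared correlations in `rho_sq` provide a scale-free measure.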

Brightness Temperature Simulation Experiment
Several studies have tried to improve soil moisture prediction by assimilating low-frequency microwave Tb into the LSM [55,56], owing to its high correlation with soil moisture. This requires the LSM to be coupled to a forward radiative transfer model (RTM) that converts the forecast state to the observation state during the assimilation stage. In this method, one of the crucial factors affecting the prediction skill of the LSM is the forward RTM used as the observation operator. Few studies have addressed the issue of global calibration of the RTM [57].
Very few studies have examined the feasibility of using the CMEM RTM to simulate brightness temperature (Tb) over India. In the present study, the sensitivity of CMEM to two parameterization schemes (explained in Section 3.4) is examined by using the soil moisture from the assimilation experiment to simulate the 10.7 GHz brightness temperature at an incidence angle of 55°, which is compared with the actual AMSR-E Tb. As the study is conducted to check the variation in Tb simulation with a change in parameterization rather than to simulate accurate AMSR-E Tb, calibration was not essential for the CMEM model. This study will help to understand the behavior of the RTM when it is used as a forward model in the assimilation of Tb into an LSM for the study region.

Results
In the present study, the SMOPS soil moisture is assimilated into the Noah LSM using NASA's LIS data assimilation framework for a period of two years, from January 2010 to December 2011. The improvement in LSM prediction over the study area is evaluated using the LPRM AMSR-E soil moisture as reference data for the monsoon season JJAS (June, July, August, and September) of the study period.
The effect of DA on soil moisture simulation was evaluated by computing the cumulative bias between the open loop (OL, without DA) simulation and the DA run over JJAS for 2010-2011, as shown in Figure 4. Figure 4 shows that the assimilation has its maximum effect during July and August of both years, when the complete study region receives the peak monsoonal rain, while in June and August the effect is more prominent over the western and eastern coasts.
The results were compared on two subdomains within the study area. In the first stage of validation, the simulated soil moisture was compared with the LPRM AMSR-E soil moisture. In the second stage of the study, the predicted soil moisture was used to simulate brightness temperature, which was compared with the actual brightness temperature.

Evaluation of Simulated Soil Moisture
The assimilated and the GLDAS soil moisture for the study region are evaluated for the JJAS months of 2010-2011. Figures 5 and 6 show the monthly mean surface soil moisture for 2010 and 2011, respectively. The assimilation product shows closer similarity with the LPRM AMSR-E soil moisture than the GLDAS soil moisture does. The results show a large variation in soil moisture, particularly for the two regions highlighted in Figure 1, and the assimilated surface soil moisture depicts a better spatial distribution than GLDAS. Figure 4 shows that for June, GLDAS underestimates the soil moisture over western India. Similarly, in July, August and September, GLDAS underestimates the soil moisture over the northwestern region. All products capture the high SSM over Maharashtra. The comparison also shows that the assimilation experiment underestimates the soil moisture over the eastern tip of India compared to the GLDAS and LPRM AMSR-E products. The possible causes for the variation in results are discussed in Section 5.
From the Taylor diagram (Figure 7), it is evident that the assimilation results show strong correlation (r > 0.9), especially for August and September of both years. The RMSD values of the assimilated soil moisture and GLDAS were 0.0248 and 0.0422 m³/m³, respectively, for September 2010. The values of the Taylor diagram are summarized in Table 4.
The two regions showing maximum variation in monthly surface soil moisture (as highlighted in Figure 1) were examined further. Daily mean time series plots were generated for these two regions. In addition to the LPRM AMSR-E soil moisture, the Central Tibetan Plateau region was compared with the ISMN ground soil moisture data by spatially averaging the 31 ground stations within the 16 grid cells and comparing the result with the average value of those 16 cells. Figure 8 shows the time series plots for domain 1 and Figure 9 shows the time series plots for domain 2.

Time series plot for Domain 1 (Central India)
For the central India region, the GLDAS data underestimate the SSM for the JJAS months of both years. For all months except September, the LPRM AMSR-E soil moisture follows a trend similar to the soil moisture from the assimilation. For the month of June, the soil moisture increases suddenly in the assimilation, while in GLDAS it decreases.

Time series plots for Domain 2 (CTP)
The Central Tibetan Plateau is a high-altitude region that has been monitored for soil moisture by ISMN since August 2010. Figure 9 shows that for June, July and August 2010 the GLDAS product underestimates SSM, while the assimilated SSM shows a better correlation with ground data. Similarly, for September 2010, GLDAS underestimates the SSM, as also reported in the previous studies discussed in Section 5. For June and September 2011, the assimilated SSM shows better correlation with ground data, while for July and August 2011 it underestimates the SSM.
The extended triple collocation analysis indicates that the DA simulation has a lower error variance than GLDAS over the complete study region.
Though the main theme of this study was the effect of assimilation on the soil moisture prediction skill of the LSM for India, it is equally worthwhile to understand and compare the resulting Tb from CMEM. During daytime, dry vegetation is comparatively more transparent to microwave signals, and, therefore, the quality of descending passes might be better in capturing the microwave signals over vegetation, while during nighttime (ascending pass), the vegetation water content can influence the soil moisture signals. Hence, it was decided to study the ascending and descending passes separately to select the orbit that best represents soil moisture. A previous study [58] found that the nighttime AMSR-E soil moisture produced better correlation with field-based measurements. An accurately simulated low-frequency microwave Tb can be used in synthetic experiments to check the feasibility of a satellite project. The simulated Tb can also be used to create emissivity maps for precipitation retrievals.
Figure 11 illustrates the observed (AMSR-E) and simulated brightness temperature at horizontal and vertical polarization for the ascending orbit with parameterization scheme one from Table 3. Figure 12 shows the simulation with the same parameterization for the descending orbit. Figure 13 shows the observed and simulated Tb with parameterization scheme two from Table 3 for the ascending pass. Similarly, Figure 14 shows the same configuration for the descending pass. In each figure, the first column represents results for 16 June 2010, the second column for 11 July 2010, the third column for 12 August 2010, and the fourth column for 13 September 2010. The results show a large variation in simulated Tb because of the parameterization scheme.

Discussions
The present study simulated the spatio-temporal variability of soil moisture over the Indian subcontinent using satellite-derived soil moisture from SMOPS (Soil Moisture Operational Products System). The Indian summer monsoon months of JJAS during 2010-2011 were chosen for the study.
The results show that assimilation of SMOPS soil moisture successfully simulates the soil moisture variability in comparison with the LPRM AMSR-E soil moisture product. The results were examined over India, and in particular over the two domains of central India and the Tibetan Plateau highlighted in Figure 1. Graphical comparisons in Figures 5 and 6 show good similarity of the assimilated SSM with the LPRM SSM, which is in accordance with the average correlation (assimilated SSM = 0.9645 and GLDAS SSM = 0.9322) and average RMSD (assimilated SSM = 0.0303 m³/m³ and GLDAS SSM = 0.0481 m³/m³) values from the Taylor plots, as summarized in Table 4. The error variance from the triple collocation analysis, as shown in Figure 4, also supports the previous analysis, as the error variance for GLDAS is larger than that for the SMOPS DA. One of the crucial factors for the observed variation may be the difference in the depths represented by the surface soil moisture. The Noah LSM considers the surface soil moisture (SSM) as the average of the first layer (10 cm thickness), while the AMSR-E soil moisture can only represent surface soil moisture at shallower depths (~2-5 cm). As the results are evaluated for the monsoon season, the primary factor affecting the soil moisture is precipitation. The main factor causing the difference between the SSM from GLDAS and the SSM from the present study is the assimilation of the SMOPS soil moisture. The biases may be caused by the forcing data, the LSM parameters and bias in the LPRM AMSR-E product.
The time series plots for the JJAS months of 2010-2011 over the Central Tibetan Plateau show that GLDAS underestimates the SSM. This is in accordance with earlier studies [59,60]. The primary cause can be attributed to the stratification of soil properties induced by the high soil organic carbon content in the Tibetan Plateau. The organic carbon content of the soil is not represented in the Noah LSM physics. Furthermore, the mismatch in the soil depths of the LSM simulations and the in situ measurements can contribute to the uncertainty. The LSM-simulated soil moisture represents an average over the layer (for layer one of the Noah LSM, the average over the first 10 cm), while the ground measurements are the soil moisture at a specific depth.
Among the factors affecting the prediction skill of LSM simulation (such as soil properties, vegetation properties, and topographic information), the soil properties are the most important. This is because the hydraulic conductivity depends on the soil type and directly impacts the soil moisture simulation. For the present study, the soil properties are taken from the Food and Agriculture Organization (FAO) [61] and mainly include the porosity, clay fraction, sand fraction, silt fraction and soil texture. Uncertainties can stem from the representation of soil properties [62,63]. Future work will be directed towards understanding the uncertainties affecting the LSM prediction.
The Tb simulation case study shows that the CMEM RTM is highly sensitive to the parameterization scheme adopted. The results from the two parameterization schemes show large variation in the ascending and descending passes for both polarizations. The assimilation of Tb into an LSM as an observation to improve soil moisture estimation accuracy requires a forward RTM as the observation operator. The prediction skill of this assimilation method therefore depends crucially on how accurately the RTM simulates Tb from the simulated soil moisture. The results show that a further detailed parameterization study is required before the RTM can be adopted as a forward operator in a data assimilation system for soil moisture improvement, as no literature is available in this regard.

Summary and Conclusions
In this study, a data assimilation method was adopted for the Indian subcontinent to study the effect of assimilation on Land Surface Model (LSM) soil moisture prediction accuracy. Unlike other global assimilation products such as the Global Land Data Assimilation System (GLDAS), this study assimilates Soil Moisture Operational Products System (SMOPS) soil moisture as the observation state. The simulated soil moisture after data assimilation was evaluated against the Land Parameter Retrieval Model (LPRM) AMSR-E soil moisture. Furthermore, a detailed time series analysis of soil moisture was conducted for two small regions to understand the variation of soil moisture on a daily scale. The Central Tibetan Plateau (CTP) region was selected because it offers a dense in situ measurement network for comparison with the simulated results.
The results show considerable improvement in the soil moisture simulation compared to GLDAS. The source of bias may be attributed to the precipitation forcing data (GDAS), which has a strong influence on the soil moisture. The accuracy of GDAS precipitation requires further study, which will be part of future work. The error in simulation can also be caused by bias in the parameters, the two most important parameters for soil moisture simulation being the soil type and the land cover. Further detailed study is necessary in this regard to check the accuracy of the Food and Agriculture Organization (FAO) soil properties and the University of Maryland Department of Geography (UMD) land cover classification.
The Tb simulation result shows that the Community Microwave Emission Model (CMEM) Radiative Transfer Model (RTM) is highly sensitive to the parameterization scheme adopted to simulate brightness temperature (Tb), with parameterization scheme one (the Kirdyashev opacity model and the Wang dielectric model) outperforming scheme two. Hence, a further parameterization study is required to increase the accuracy of Tb simulation, as no literature is available in this regard for the study region.
This study helps to understand the effect of soil moisture assimilation on the prediction skill of the LSM over the Indian subcontinent. The improved soil moisture estimates can be used in different hydrological and climate studies. The Tb simulation result helps to understand the behavior of the CMEM RTM under different parameterization schemes in the study region. A well-calibrated RTM can help in the accurate estimation of Tb, which can be used in different studies, mainly as a forward operator in a data assimilation system. The study is a step towards realistic simulation of soil wetness on a subcontinental scale in the near future.