Scale Impact of Soil Moisture Observations to Noah-MP Land Surface Model Simulations

Due to the limitations of satellite antenna technology, current operational microwave soil moisture (SM) data products typically have spatial resolutions of tens of kilometers. Many approaches have thus been proposed to generate finer resolution SM data using ancillary information, but it is still unknown whether assimilating finer spatial resolution SM data benefits model skill. In this paper, a synthetic experiment is conducted to identify the benefits of finer spatial resolution SM observations for the Noah-MP land surface model. The results show that Noah-MP performance improves significantly when 1 km SM observations are assimilated, in comparison with the assimilation of SM data at coarser resolutions. Downscaling satellite microwave SM observations from coarse spatial resolution to 1 km resolution is therefore recommended, and the assimilation of 1 km remotely sensed SM retrievals is suggested for the NOAA National Weather Service and National Water Center.


Introduction
Soil moisture (SM) is an important variable in coupled climate models and numerical weather prediction systems due to its impacts on land-atmosphere water, energy and carbon exchanges [1][2][3]. In situ observations track the SM status reasonably well, but they are limited to local or even site scales [3]. The constraints of traditional in situ observations can be compensated for by remote sensing technology, which has shown unique value in providing quantitative SM estimates at larger scales.
Optical and thermal infrared satellite SM sensing started in 1970, and several approaches were developed to exploit the relationships between surface reflectance and SM [4][5][6]. However, SM observations based on these empirical relationships are significantly affected by the soil spectral characteristics and cannot be obtained on cloudy days [7]. Microwave satellite technologies were thus developed to achieve accurate SM retrievals [8][9][10]. However, active microwave radars are typically affected by surface roughness and vegetation structure [11], and passive microwave radiometer-based SM estimates are generally at resolutions of tens of kilometers due to the limitations of satellite antenna technology [12][13][14].
Aiming to advance the use of microwave SM retrievals at local and regional scales, many downscaling approaches have been proposed to produce finer resolution satellite SM data [15][16][17][18][19]. In particular, observations from other satellite sensors at finer spatial resolution are used as ancillary inputs to obtain accurate fine spatial resolution SM [20,21]. The feasibility and notable advantages of the developed downscaling approaches have been evaluated with in situ observations. However, comprehensive assessments of the advantages and disadvantages of the finer resolution SM observations are hampered by the limited spatial distribution of in situ sites, and by the uncertainties arising from the scale discrepancy and the quality of the ancillary observations. Considering the spatial inhomogeneity [22,23], the downscaled satellite soil moisture data may benefit from direct inter-comparisons with the original coarse spatial resolution observations. There are also many open scientific questions related to understanding the impacts of assimilating finer spatial resolution SM observations on model performance, the model requirements for finer spatial resolution SM observations and the operational application of finer spatial resolution SM observations.
To address these questions, SM estimations at different spatial resolutions are synthetically generated in this paper, and the impacts of assimilating finer resolution SM observations on Noah-MP model skills are then examined. The Noah-MP [24] land surface model (LSM) is a component of the National Water Model (NWM) that provides 1-km spatial resolution streamflow predictions over the entire continental United States (CONUS). The goal of this study is to identify the needs of finer spatial resolution SM data in the sequential SM data assimilation system, and in turn to investigate the potential application of higher spatial resolution SM observations in the operational models.

Noah-MP Land Surface Model
The Noah-MP LSM was developed to improve the Noah model that has been widely used in operational numerical weather prediction (NWP) and climate models. The Noah-MP model uses a separate vegetation canopy and multiple options for land-atmosphere interaction processes to accommodate numerous combinations of parameterization schemes for an ensemble representation of processes in nature [24]. It has been used in the WRF-Hydro model that is the core of the National Water Model (NWM) system. Similar to the Noah model, the Noah-MP also has four soil layers with thicknesses of 10, 30, 60, and 100 cm.
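As a small worked example, the four layer thicknesses quoted above can be turned into depth intervals, which makes it easier to see which model layer a depth range such as 40-100 cm refers to. The variable names are illustrative only and are not taken from the Noah-MP or LIS code.

```python
from itertools import accumulate

# Layer thicknesses stated in the text, in centimeters.
thicknesses_cm = [10, 30, 60, 100]

# Cumulative sums give the bottom depth of each layer.
bottoms_cm = list(accumulate(thicknesses_cm))
# Each layer starts where the previous one ends; the first starts at the surface.
tops_cm = [0] + bottoms_cm[:-1]

layers_cm = list(zip(tops_cm, bottoms_cm))
for number, (top, bottom) in enumerate(layers_cm, start=1):
    print(f"layer {number}: {top}-{bottom} cm")
```

Under this arithmetic the 0-10 cm layer is the first model layer and the 40-100 cm layer is the third.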
The Land Information System (LIS) is a software framework developed by the National Aeronautics and Space Administration (NASA). The LIS integrates the use of ground and satellite observations, along with advanced LSMs and computing tools, to accurately characterize land surface states and fluxes [25]. LIS version 7.2 integrates Noah-MP version 3.6, which has the same dynamic core as the Noah-MP model used in the operational WRF-Hydro model. Based on the LIS platform, Noah-MP version 3.6 was employed to conduct the synthetic experiment in this paper.

Synthetic Experiment
A synthetic experiment is designed to evaluate the impacts of assimilating SM data at different spatial resolutions. Based on the Noah-MP model, the basic structure is as follows [26]: (1) A control run (CTR) is conducted as a single realization to represent the "true" state of the Noah-MP model, using the optimal meteorological forcing data. (2) According to Table 1, the Noah-MP is driven by perturbed meteorological forcing data and state variables, referred to as the open loop run (OLP). In this run the Noah-MP model operates without the benefits of data assimilation under suboptimal forcing and initialization conditions, with the assumption that adding unbiased uncertainties in the ensemble Kalman filter (EnKF) data assimilation system should not cause a systematic error in model output between perturbed and unperturbed forcing and state conditions [27]. (3) In the data assimilation (DA) cases, synthetic observations at 1, 5, 12.5, 25 and 100 km spatial resolutions are assimilated into the OLP run using the EnKF. To inter-compare the Noah-MP model skill obtained with synthetic observations of different resolutions, the DA01km, DA05km, DA12km, DA25km and DA100km cases assimilate the 1, 5, 12.5, 25 and 100 km synthetic observations, respectively. As the DA cases are forced by the same suboptimal forcing inputs and state variables as those used in the OLP run, the differences between the DA cases and the OLP run are good metrics for evaluating the impacts of data assimilation. Given the same assimilation strategy and the same forcing data and state variables, the differences among the five DA cases should only come from assimilating synthetic observations of different spatial resolutions. Before data assimilation, the 1, 5, 12.5 and 100 km synthetic observations were all reprocessed to 25 km spatial resolution.
Specifically, synthetic observations from 625 pixels at 1 km resolution, 25 pixels at 5 km resolution and four pixels at 12.5 km resolution were simply averaged into one 25 km resolution pixel. However, the synthetic observation of one pixel at 100 km resolution was used to fill the corresponding four pixels at 25 km resolution. These reprocessed 25 km synthetic observations were then bias-corrected to the CTR run-based 0-10 cm SM climatology using the CDF-matching method, with cumulative distribution functions (CDFs) built for each land grid cell over the study domain during the study period [28,29].
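The two preprocessing steps described above, simple averaging of finer pixels into one 25 km pixel and CDF matching against the CTR climatology, can be sketched as follows. This is a minimal illustration and not the LIS implementation: the function names `block_average` and `cdf_match` are hypothetical, the CDF matching uses a plain empirical rank mapping (one of several common variants), and the numbers are made up. The 100 km case is not covered by the sketch.

```python
from bisect import bisect_right


def block_average(fine_grid, factor):
    """Average non-overlapping factor x factor blocks of a 2-D grid (list of rows)."""
    n_rows, n_cols = len(fine_grid), len(fine_grid[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be multiples of the block factor")
    coarse = []
    for r0 in range(0, n_rows, factor):
        coarse_row = []
        for c0 in range(0, n_cols, factor):
            block = [fine_grid[r][c]
                     for r in range(r0, r0 + factor)
                     for c in range(c0, c0 + factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


def cdf_match(obs_series, ref_series):
    """Map each observation to the reference value with the same empirical rank."""
    obs_sorted = sorted(obs_series)
    ref_sorted = sorted(ref_series)
    n_obs, n_ref = len(obs_sorted), len(ref_sorted)
    matched = []
    for value in obs_series:
        # Rank of the value within the observation record (1..n_obs).
        rank = bisect_right(obs_sorted, value)
        # Integer form of ceil(rank * n_ref / n_obs) - 1: same quantile in the reference.
        idx = (rank * n_ref + n_obs - 1) // n_obs - 1
        matched.append(ref_sorted[idx])
    return matched


# 25 x 25 pixels at 1 km (625 values) collapse into one 25 km pixel.
one_km = [[0.20 + 0.001 * (r + c) for c in range(25)] for r in range(25)]
sm_25km = block_average(one_km, 25)

# Made-up time series at one grid cell: observations vs. CTR 0-10 cm SM.
obs = [0.30, 0.10, 0.20, 0.40]
ctr = [0.15, 0.45, 0.25, 0.35]
corrected = cdf_match(obs, ctr)
```

The same `block_average` call with a factor of 5 or 2 would cover the 5 km and 12.5 km cases. In practice the CDFs would be built per grid cell from the full study period rather than from four values.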
In this paper, the ensemble size for both the OLP run and each of the five DA cases was set to 12, as that is the optimal ensemble size for a sequential SM data assimilation system [30]. The CTR run and the Noah-MP ensemble runs were all spun up by cycling five times through the period from January 1st, 2015 to December 31st, 2018. All of them were then run over the same period with one-hour time step inputs and daily outputs. The daily bias-corrected synthetic observations were assimilated into the Noah-MP model at 00:00Z, updating the 0-10 cm SM initial state. All simulations in this paper were forced by precipitation, near-surface air temperature, near-surface wind, downward shortwave/longwave radiation and surface pressure from the Global Data Assimilation System product [31]. Both the CTR and OLP runs were conducted at 25 km spatial resolution, and all simulations were conducted over a study area from 25° N, 125° W to 50° N, 75° W, which is essentially a gridded CONUS domain.

Table 1. Column headers: Perturbation Type; SD; Cross Correlation for Forcing Variable Perturbations (Precipitation, SW, LW). [Table values not recovered.]

Ensemble Kalman Filter
The EnKF has been widely applied in sequential SM data assimilation [30,33]. Given an ensemble of model state vectors, the EnKF updates the ensemble forecast using a Monte Carlo approximation [33]. Based on the perturbed forcing data, state variables and model parameters, the model states (Y) of each ensemble member are propagated forward in the forecast step and then updated with the observations, weighted by the Kalman gain matrix K. The matrix M is the observation vector, and the matrix H relies on the observations. The error variance µ_t^Y was set to a constant value of 3%, following the LIS examples [26].
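To make the update step concrete, the following is a minimal sketch of a textbook perturbed-observation EnKF update for a single state variable (for example, 0-10 cm SM at one grid cell). It is not the LIS code and may differ from the exact formulation used in this paper. It assumes an identity observation operator (H = 1), so the gain reduces to K = P / (P + R), where P is the ensemble forecast variance and R is the observation error variance. The function names and all numbers are hypothetical, and reading the 3% setting as a standard deviation of 0.03 m³/m³ is an assumption made only for this example.

```python
import random
import statistics


def kalman_gain(forecast, obs_err_var):
    """Scalar Kalman gain K = P / (P + R), assuming H = 1."""
    forecast_var = statistics.variance(forecast)  # sample variance of the ensemble
    return forecast_var / (forecast_var + obs_err_var)


def enkf_update(forecast, obs, obs_err_var, rng):
    """Update every member toward its own randomly perturbed copy of the observation."""
    gain = kalman_gain(forecast, obs_err_var)
    obs_sd = obs_err_var ** 0.5
    analysis = []
    for member in forecast:
        perturbed_obs = obs + rng.gauss(0.0, obs_sd)
        analysis.append(member + gain * (perturbed_obs - member))
    return analysis


rng = random.Random(0)  # fixed seed so the sketch is reproducible
forecast = [0.20 + 0.01 * i for i in range(12)]  # 12 members, as in the experiment
analysis = enkf_update(forecast, obs=0.30, obs_err_var=0.03 ** 2, rng=rng)
```

A gain near 1 pulls the ensemble strongly toward the observation, and a gain near 0 leaves the forecast almost unchanged. With zero observation error the gain is exactly 1 and every member collapses onto the observation.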

Results
Assuming the CTR run represents the "true" state of the Noah-MP land surface model, Figure 1 shows the differences in root mean square difference (RMSD) for SM simulations in the 0-10 cm soil layer during the 2015 to 2018 period. Red shading indicates that DA01km performs better than the coarser resolution DA case it is compared with (DA05km, DA12km, DA25km or DA100km), whereas blue shading indicates where DA01km performs more modestly. The inter-comparisons present similar patterns, with the DA01km case showing significant improvements in 0-10 cm SM estimates in the mid-west CONUS. Specifically, relative to the DA100km case, the DA01km case exhibits improvements larger than 0.02 m³/m³ in the western CONUS (Figure 1d), whereas insignificant differences, shown in grey, cover the rest of the study area.
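The metric behind these comparisons can be sketched as follows: the RMSD of each DA case is computed against the CTR run at a grid cell, and the two RMSDs are then differenced. The function names, the sign convention (positive means DA01km is closer to CTR) and the short time series are assumptions for illustration. The source does not state which sign the figure shading uses.

```python
import math


def rmsd(simulated, reference):
    """Root mean square difference between two equally long time series."""
    if len(simulated) != len(reference):
        raise ValueError("series must have the same length")
    squared = [(s - r) ** 2 for s, r in zip(simulated, reference)]
    return math.sqrt(sum(squared) / len(squared))


def rmsd_gain(ctr, coarse_da, da01km):
    """RMSD of a coarser DA case minus RMSD of DA01km, both measured against CTR."""
    return rmsd(coarse_da, ctr) - rmsd(da01km, ctr)


# Made-up 0-10 cm SM values (m3/m3) at one grid cell.
ctr = [0.20, 0.25, 0.30]
da01km = [0.21, 0.24, 0.30]
da100km = [0.24, 0.21, 0.34]

gain = rmsd_gain(ctr, da100km, da01km)
```

Applying `rmsd_gain` at every grid cell over the study period yields a map of the kind described for Figure 1.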

Propagating surface information to deeper soil layers primarily relies on the inherent surface-deeper connection of the LSM. The behaviors of the SM simulations for the surface soil layer in Figure 1 are well mirrored in the 40-100 cm SM simulations (Figure 2). Specifically, more remarkable RMSD differences can be seen in Figure 2, again assuming that the CTR run simulations are the "true" state. Over the mid-west CONUS, DA01km demonstrates a more robust agreement with the CTR run simulations, with significantly reduced RMSD values relative to each of the four coarser resolution DA cases. With the benefits of assimilating 1 km SM data, improvements in the Noah-MP SM estimates for the 40-100 cm soil layer reach 0.05 m³/m³. However, slight degradations caused by the DA01km case are scattered across the southwestern and southern CONUS.
Compared to the OLP run, however, the study domain-averaged RMSD value for soil temperature (ST) is increased by 0.16 K (12.5%) in the DA01km case. The DA01km case thus performs modestly relative to the OLP run, but it performs best among the five DA cases. Relative to the DA05km, DA12km, DA25km and DA100km cases, the study domain-averaged RMSDs for 0-10 cm ST simulations are significantly reduced by 0.15 K (19.1% reduction), 0.14 K (17.9% reduction), 0.10 K (12.8% reduction) and 0.14 K (17.9% reduction), respectively. More remarkable improvements from 1 km SM data assimilation are found for the 40-100 cm ST simulations. With respect to the CTR run, the study domain-averaged RMSDs are dramatically decreased by 0.62 K (82.7%), 0.69 K (92.1%), 0.66 K (88.8%), 0.47 K (82.6%) and 0.61 K (82.7%) in comparison with the OLP run, DA05km, DA12km, DA25km and DA100km, respectively. The strong consistency of the results in Figure 3 over the four-year period (2015-2018) indicates that the inter-comparisons in this paper are qualitatively stable and thus likely representative of a longer analysis period.

Discussions and Summary
A synthetic experiment was conducted in this paper to investigate the potential impacts of assimilating finer spatial resolution SM data on Noah-MP land surface model performance. The results demonstrate the following: (1) With the benefits of assimilating 1 km SM data, Noah-MP model-based SM and ST simulations are significantly improved in comparison with the assimilation of coarser spatial resolution SM data. (2) The LSM used in this paper is the Noah-MP3.6 model, but similar results may be obtained in other LSM-based SM data assimilation systems, because the results here rest on a general synthetic experiment design. (3) LSMs are components of most numerical weather prediction (NWP) and climate models. Given the better performance of the SM and ST simulations, assimilating 1 km SM observations is expected to improve NWP and climate model skill through positive impacts on the estimated exchanges of water and energy between the land surface and the atmosphere. (4) LSM outputs such as SM are also critical variables in drought and flood monitoring. Given the significant improvements in LSM skill, drought and flood monitoring capabilities could be enhanced by 1 km SM data assimilation.
Rather than simply asserting that assimilating finer spatial resolution SM data leads to better performance, it should be noted that the differences among the DA05km, DA12km, DA25km and DA100km cases are relatively small in comparison with the dramatic improvements of the DA01km case. This means that downscaling SM observations from coarser spatial resolution to 5 km may not yield the significant improvements in LSM performance that might be expected. Yet, downscaling satellite SM retrievals from coarser spatial resolution to 1 km can enhance LSM skill (Figures 1-3).
This synthetic experiment was designed based on a 25 km SM data assimilation system. The land surface variables (for instance, land cover, surface albedo and vegetation index) used in the CTR run, OLP run and DA25km case were all regridded from 1 to 25 km spatial resolution to satisfy the requirements of the data assimilation system. Although the Noah-MP3.6 model in each case is forced by the same meteorological forcing data, the OLP run and the DA25km case use land surface variables at the same spatial resolution (25 km) as the CTR run that is assumed to represent the "truth", which may result in the performance of the OLP run and the DA25km case being overestimated in this paper.
In summary, the assimilation of 1 km spatial resolution SM data has a better capacity to improve Noah-MP model performance than the assimilation of any other spatial resolution SM data tested. With respect to the CTR run simulations, the Noah-MP model with the benefits of assimilating 1 km SM data sets is more successful in estimating SM and soil temperature (ST) for both 0-10 cm and 40-100 cm soil layers, reducing the probability of greater RMSD values in comparison with the coarse SM DA cases. Based on this result, downscaling microwave satellite SM observations from coarser spatial resolution to 1 km resolution is recommended, and the assimilation of 1 km remotely sensed SM retrievals is suggested for NOAA National Weather Service and National Water Center.
Author Contributions: J.Y. and X.Z. designed the research and wrote the paper. Both authors reviewed and edited the manuscript. All authors have read and agreed to the published version of the manuscript.