Regionalization of Root Zone Moisture Estimations from Downscaled Surface Moisture and Environmental Data with the Soil Moisture Analytical Relationship Model

Abstract: Root-zone soil moisture (RZSM) plays a key role in the hydrologic cycle and regulates water–heat exchange. Although site observations can provide soil profile moisture measurements, their spatial representativeness is restricted. Satellites can determine soil moisture on a large scale, yet the depth of detection is limited. RZSM can be estimated on a large scale using the soil moisture analytical relationship (SMAR) and surface soil moisture (SSM). However, the applicability of the SMAR to root zones of different depths and to different covariate sources is unclear. This paper investigates the applicability of the SMAR in the Shandian River Basin, upstream of the Luan River in China, by combining site and regional soil moisture, soil properties, and meteorological data. In particular, we first compared the estimation results of the SMAR at different depths (10–20 cm; 10–50 cm) and using covariates from different sources (dataset, SMAR-P1; literature, SMAR-P2) at the site in order to generate SMAR calibration parameters. The parameters were then regionalized based on multiple linear regression by combining the SMAR-P1, SMAR-P2, and SMAR calibration parameters in the 10–50 cm root zone. Finally, the Shandian River RZSM was estimated using regional surface soil moisture and the aforementioned regionalized parameters. At the site scale, the diffusion coefficient b obtained for the 10–20 cm root zone, whose depth equals that of the surface layer, exceeded the SMAR upper limit of one. This does not fit the site environment, and thus the SMAR is not applicable at this particular depth. The opposite is observed for the 10–50 cm root zone. In addition, SMAR-P1 (RMSE = 0.02) outperformed SMAR-P2 (RMSE = 0.04) in the estimation of the RZSM at 10–50 cm. Parameter regionalization analysis revealed that SMAR-P2 failed the significance test (p > 0.05) for building a multivariate linear model, while SMAR-P1 passed the significance test (p < 0.05) and completed the parameter regionalization process. The median RMSE and median R²adj of the regional RZSM results were 0.12 and 0.3, respectively. The regional RZSM agrees with the spatial trend of the Shandian River. This study examines the suitability of the SMAR model for root zones of varying depth and for diverse covariate sources. The results provide a crucial basis for future utilization of the SMAR.


Introduction
Root-zone soil moisture (RZSM) is the soil moisture below the surface layer that reaches the root layer of vegetation. It plays a key role in the hydrologic cycle and regulates the terrestrial-atmosphere water-heat exchange [1]. It also plays an important role in water resource management, crop yield estimation, evapotranspiration estimation, drought assessment, and flood variability [2][3][4][5][6]. With the advancement of remote sensing technology, numerous multi-scale surface soil moisture products have been developed, including those from the European Space Agency Climate Change Initiative (ESA-CCI) [7], the Soil Moisture and Ocean Salinity (SMOS) [8] and Soil Moisture Active and Passive (SMAP) [9] satellites. In contrast, research on RZSM is relatively lacking due to the corresponding technological restrictions and the complexity of the environment and deep layers. Vegetation root systems can absorb water from varying depths of the soil and facilitate the acquisition of nutrients [10]. Estimating RZSM can enhance our understanding of water and nutrient movement within the soil profile.
Water 2023, 15, 4133
Numerous studies have employed surface soil moisture (SSM) products to estimate RZSM, the majority of which have developed models based on the coupling of SSM and RZSM, making great research advancements [11,12]. Commonly utilized methods include data assimilation [13,14], machine learning [15][16][17], exponential filtering (ET method) [18,19], and the soil moisture analytical relationship (SMAR method) [20][21][22]. The assimilation of SSM products such as SMOS and SMAP during model simulation can effectively improve the estimation of RZSM, outperforming model simulations alone [13,14]. However, data assimilation presents challenges due to its high computational demands and limited applicability for nonlinear models [23][24][25]. Machine learning can identify trends and characteristics by mining data for information. For example, artificial neural networks (ANNs) can accurately estimate RZSM by detecting nonlinear variations in soil profile moisture. ANNs can compensate for the limitations of data assimilation, yet their interpretation of water movement is limited [16,17]. The ET method, developed by Wagner using surface and deep-water balance modeling, has since been improved by Albergel [19]. The model uses the characteristic time T as its sole parameter. Compared to machine learning, the ET method is computationally simple and more interpretable, yet the physical meaning of parameter T is unclear [26,27]. In contrast, the SMAR infiltration model, which is inspired by the ET method and features physically consistent parameters, presents an objective approach for the analysis of soil water movement. Through its ability to overcome the limitations of previous models, it provides a more efficient estimate of RZSM by decoupling the water balance [20].
The SMAR model has been validated and applied at the point scale in North America, the Middle East, East Asia, and Africa. Faridani et al. [21] investigated the nonlinear movement of soil moisture and enhanced the SMAR moisture loss parameter to improve the simulation accuracy of RZSM (5-135 cm) at the African site and RZSM (10-90 cm) at the North American site. Farokhi et al. [28] estimated RZSM (10-100 cm) by combining AMSR2 and downscaled AMSR2 soil moisture data using SMAR, with the latter reducing RMSE by 20%. Gheybi et al. [29] investigated the influence of input data on the estimation of RZSM (10-50 cm) in SMAR. The authors compared the performance of different SSM inputs, including site observations and image pixels, and found that the spatial difference between the two inputs caused notable discrepancies in the estimation results. The aforementioned SMAR studies focus on a single root layer, while root layers at different depths have received less attention. Soil moisture transport varies with the root zone depth. Therefore, examining soil moisture in root zones of multiple depths can offer insights into the spatial and temporal variability of RZSM and its response to vegetation across different root depths [30].
The SMAR model has been adopted in several regional-scale studies. Du [23] and Zhuang [31] estimated RZSM in the Laohahe River Basin and Tibetan Plateau using the SMAR model. They combined several data types for parameter estimation to achieve results that closely match the actual environment. Baldwin et al. [32] employed satellite data and the SMAR in the Eastern United States for parameter calibration and to generate regional parameters, estimating regional RZSM with an average RMSE of less than 0.06 m³·m⁻³. The uncertainty in SMAR parameter regionalization presents a challenge for the generalization of the SMAR model from the point to the regional scale. Parameter regionalization requires soil data as covariates. Differences in the covariates can affect the estimation of regional RZSM. Covariates are primarily sourced from two domains, namely, existing products such as the Harmonized World Soil Database (HWSD) [33] and soil texture data [34]. However, current studies typically adopt a single covariate source for the estimation of regional RZSM and do not consider variations in the applicability of different sources in the SMAR.
The main objective of this study was to regionalize the SMAR model in the Shandian River Basin and to explore, for the first time, the effects of different root zone depths and covariates on the model. The Shandian River is situated in the upper reaches of the Luan River, a region designated for water conservation that is vital for the water supply of Beijing, Tianjin, and Hebei. It covers extensive, interspersed agricultural, pastoral, and forest areas [35]. Integrated remote sensing experiments conducted in the basin can provide improved data support for the water cycle and energy balance [35][36][37]. The rest of this paper is structured as follows: Section 2 outlines the data obtained for the study area and the methodology used to analyze the data. Sections 3 and 4 present and discuss the results, respectively, and Section 5 concludes the paper.

Datasets
The Shandian River spans Hebei Province and the Inner Mongolia Autonomous Region, extending a total of 877 km. The river is situated in a temperate continental climate with a seasonal permafrost zone (Figure 1). Rainfall occurs mainly during summer, with the typical yearly precipitation averaging between 300 and 500 mm across most regions [38]. The study area generally exhibits mountainous topography, along with pre-mountain and river plains. Land cover types include grasslands in the north, croplands in the south, and woodland in the east, in addition to some scrublands.

In Situ Data
The datasets were obtained from the 2019-2020 Soil Temperature and Moisture Measurement Dataset of the Integrated Remote Sensing Experiment on Water Cycle and Energy Balance in the Shandian River Basin [35]. The study area includes 34 stations at different scales (large, 100 km; medium, 50 km; and small, 10 km) to enable a multi-scale investigation. Soil moisture and temperature were measured at five depths (3 cm, 5 cm, 10 cm, 20 cm, and 50 cm) at every station. Soil moisture was measured with a 5TM sensor at an accuracy of ±3% m³·m⁻³ and a data logging interval of 10-15 min. The systematic errors contained in the raw data were corrected using the calibration formula in [35] (Equation (1)), where SMC_v (m³·m⁻³) and SMC_5TM (m³·m⁻³) are the calibrated and original volumetric soil moisture, respectively. The stratified data were combined into the RZSM at 10-20 cm and 10-50 cm, as required for this study (Equation (2)), where i is the soil layer; RZSM_10−i is the RZSM of the 10−i cm layer; Z_ri is the soil layer depth; and θ_i is the soil moisture in the i-th layer. Following the preprocessing of the measured data, 22 stations with a relatively complete time series were selected and the corresponding hourly data were converted to a daily scale (Table S1).
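The layer aggregation and daily averaging described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes that the combination is a thickness-weighted mean in which each sensor below the top of the root zone represents the slab between the previous sensor depth and its own depth, and the function names `rzsm` and `daily_means` and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def rzsm(profile, top_cm, bottom_cm):
    """Thickness-weighted mean soil moisture between top_cm and bottom_cm.

    profile maps sensor depth (cm) to volumetric soil moisture (m3/m3).
    Assumption: each sensor represents the slab between the previous
    sensor depth (or top_cm) and its own depth, so profile should contain
    a sensor at bottom_cm and at least one sensor below top_cm.
    """
    depths = sorted(d for d in profile if top_cm < d <= bottom_cm)
    upper, weighted = top_cm, 0.0
    for d in depths:
        weighted += profile[d] * (d - upper)  # moisture times slab thickness
        upper = d
    return weighted / (upper - top_cm)


def daily_means(records):
    """Average (date, value) records that share the same date string."""
    by_day = defaultdict(list)
    for day, value in records:
        by_day[day].append(value)
    return {day: mean(values) for day, values in by_day.items()}


# Hypothetical profile at the five sensor depths used in the experiment.
profile = {3: 0.10, 5: 0.12, 10: 0.15, 20: 0.20, 50: 0.30}
rzsm_10_20 = rzsm(profile, 10, 20)
rzsm_10_50 = rzsm(profile, 10, 50)
```

Under this assumption the 10-20 cm value is simply the 20 cm reading, while the 10-50 cm value weights the 20 cm and 50 cm readings by 10 cm and 30 cm, respectively.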

Auxiliary Data
The SMAR parameters were obtained from the China Soil Hydraulic Parameters Dataset, the China Soil Characterization Dataset, the China Depth to Bedrock (DtB) map from the Land-Air Interaction Research Group at Sun Yat-sen University, China (http://globalchange.bnu.edu.cn/research/data, accessed on 22 December 2022) [39,40], and potential evapotranspiration data. Hydraulic and characterization parameters were provided at various depth intervals (0-0.045 m, 0.045-0.091 m, 0.091-0.166 m, 0.166-0.289 m, 0.289-0.493 m, 0.493-0.829 m, and 0.829-1.383 m), with a spatial resolution of approximately 1 km. Soil characterization parameters included porosity, bulk density (BD), and soil texture (sand, clay, silt). Soil hydraulic parameters included the field capacity (θ33) and wilting point (θ1500). The China DtB (m) Map was derived from the random forest and gradient boosting tree algorithms, as well as soil, climate, and vegetation factors [41]. The dataset is accessible in three versions, with resolutions of 100 m, 1 km, and 10 km, respectively. This study selected the 1 km resolution version. Potential evapotranspiration (PET) was obtained from the China 1 km monthly dataset on potential evapotranspiration (2019-2020) with a spatial resolution of 0.00833° [42]. The soil attribute, hydraulic parameter, and potential evapotranspiration data were extracted using ArcMap 10.8 (Esri). Auxiliary data were resampled temporally and spatially according to the measured data stratification. The formats of all data were unified.

SSM Dataset
The 1 km global daily surface soil moisture dataset (GD-SSM) is the first seamless global surface soil moisture and humidity dataset at a 1 km resolution for the period 2000-2020 [43]. It combines the current highly accurate remote sensing product ESA-CCI with reanalyzed data from ERA5, resulting in a continuous dataset. The downscaling model was constructed using ISMN (International Soil Moisture Network) site data and machine learning techniques, leading to a high-resolution and spatio-temporally continuous dataset of surface soil moisture [44]. The product was validated using ISMN site data, yielding an R of 0.89 and an unbiased root mean square error (ubRMSE) of 0.045 m³·m⁻³. For this study, we selected the regional SSM for 2020 as the input. The ISMN site dataset included soil moisture observations from the Soil Moisture Temperature Wireless Sensor Network in the Shandian River Basin (SMN-SDR), established from 18 July 2018 to 28 September 2018 [35]. The accuracy of this product has been verified. The mean simulated and observed values are 0.165 m³·m⁻³ and 0.166 m³·m⁻³, respectively, with a bias of −0.001 m³·m⁻³, a ubRMSE of 0.032 m³·m⁻³, and an R value of 0.91. The site data and ancillary parameter data were combined and used as the input for the SMAR regional application.

Methods
In this study, the surface layer was set at a depth of 0-10 cm, while the depths of 10-20 cm and 10-50 cm were designated as root zones. The 10-20 cm root layer enabled investigation into the viability of the SMAR model at equivalent depths for both surface and root layers. The 10-20 cm and 10-50 cm layers were used to analyze the differences in the SMAR model across root zones. First, we optimized the model parameters of 10-20 cm and 10-50 cm using the SMAR model and a genetic algorithm (GA) to assess soil moisture estimations in the root zone. The SMAR model inputs include SSM and model parameters, as well as two covariates, namely, porosity and root zone field capacity (Section 2.2.1). The covariates were obtained from: (i) the soil property and soil hydrology datasets (Section 2.1.2), denoted as P1; and (ii) the soil texture dataset and previous research [34] (Table S2), denoted as P2. By combining P1 and P2 with their equivalent SMAR parameters as SMAR-P1 and SMAR-P2, respectively, we compared the applicability of the two parameter sets. We then analyzed the significance of the model, soil, and climate parameters. Based on the findings, we evaluated the constructed multiple regression model. Finally, the developed multivariate model was adopted to scale the SMAR model parameters from the point to the regional scale and subsequently compute soil moisture in the root zone (Figure 2).

SMAR Model
The SMAR model divides the soil into surface and subsurface layers, with the water between the two layers connected by infiltration. The subscripts 1 and 2 in the following equations denote the first and second layers, respectively. During a rainfall event, according to the Green-Ampt method [45], any excess water from the surface layer above the field capacity will flow into deeper layers. Rapid penetration occurs from the surface to deeper levels within a short timeframe. Following [20], this can be described as follows:

$$ n_1 Z_{r1}\, y(t) = \begin{cases} n_1 Z_{r1}\,[s_1(t) - S_{c1}], & s_1(t) \ge S_{c1} \\ 0, & s_1(t) < S_{c1} \end{cases} \tag{3} $$

where $y(t)$ [-] represents the fraction of soil saturation infiltrating into the root-zone soil; $n_1$ is the soil porosity of the surface layer; $Z_{r1}$ is the depth of the topsoil layer; $s_1 = \theta_1/n_1$ [-] is the relative saturation of the surface soil; and $S_{c1}$ [-] is the relative saturation of the surface field capacity. The soil water balance in the deep layer is controlled by two main factors, namely, infiltration and soil water loss. Loss of water and diffusion can be expressed in terms of the normalized coefficients a and b, defined as:

$$ a = \frac{V_2}{(1 - S_{w2})\, n_2 Z_{r2}}, \qquad b = \frac{n_1 Z_{r1}}{(1 - S_{w2})\, n_2 Z_{r2}} \tag{4} $$

where $S_{w2}$ [-] is the relative saturation of the root zone wilting point; $n_2$ is the soil porosity of the root zone layer; $Z_{r2}$ is the depth of the root zone layer; and $V_2$ [LT$^{-1}$] is the root-zone soil water loss coefficient (ET and percolation losses). By combining Equations (3) and (4), the soil water balance equation becomes:

$$ \frac{dx_2(t)}{dt} = b\, y(t) - a\, x_2(t) \tag{5} $$

where $x_2(t) = [s_2(t) - S_{w2}]/(1 - S_{w2})$ is the effective relative saturation of the root zone and $s_2$ is the relative saturation of the root zone. The initial condition assumes that the relative saturation $x_2(t)$ is equal to zero, and thus the analytical solution of this equation can be derived as:

$$ x_2(t) = \int_0^t b\, e^{a(t' - t)}\, y(t')\, dt' \tag{6} $$

or in discrete form as:

$$ x_2(t_j) = \sum_{k=0}^{j} b\, e^{a(t_k - t_j)}\, y(t_k)\, \Delta t \tag{7} $$

Expanding Equation (7) and setting $\Delta t = t_j - t_i$, where $t_i$ is the time step preceding $t_j$, the expression for deep soil moisture is:

$$ x_2(t_j) = x_2(t_i)\, e^{-a \Delta t} + b\, y(t_j)\, \Delta t \tag{8} $$

which can be written in terms of $s_2$ as:

$$ s_2(t_j) = S_{w2} + [s_2(t_i) - S_{w2}]\, e^{-a \Delta t} + (1 - S_{w2})\, b\, y(t_j)\, \Delta t \tag{9} $$

The parameters $S_{w2}$, $S_{c1}$, a, and b can be calculated with reference to real data (soil depth, field capacity, and soil water loss). Due to the wide parameter range setting, the RZSM may exceed 1 at times. For such cases, we set RZSM to 1.
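The recursive form of the model lends itself to a very short implementation. The sketch below assumes the standard SMAR update from [20], s2(tj) = Sw2 + [s2(ti) − Sw2]·exp(−a·Δt) + (1 − Sw2)·b·y(tj)·Δt with y = max(s1 − Sc1, 0), together with the cap at 1 described above. The function name `smar`, its argument names, and the choice of a user-supplied initial state are illustrative assumptions, not the authors' code.

```python
import math


def smar(s1_series, sc1, sw2, a, b, s2_init, dt=1.0):
    """Estimate root-zone relative saturation from surface relative saturation.

    s1_series: surface relative saturation per time step (theta1 / n1)
    sc1:       relative saturation at surface field capacity
    sw2:       relative saturation at root-zone wilting point
    a, b:      normalized loss and diffusion coefficients
    s2_init:   assumed initial root-zone relative saturation
    dt:        time step (days)
    """
    s2 = s2_init
    estimates = []
    for s1 in s1_series:
        # Infiltration occurs only when the surface exceeds field capacity.
        y = max(s1 - sc1, 0.0)
        # Exponential decay toward the wilting point plus infiltration input.
        s2 = sw2 + (s2 - sw2) * math.exp(-a * dt) + (1.0 - sw2) * b * y * dt
        # Relative saturation cannot exceed 1, so cap it as in the text.
        s2 = min(s2, 1.0)
        estimates.append(s2)
    return estimates


# Hypothetical daily surface series: a dry spell followed by a wet day.
series = smar([0.2, 0.2, 0.7, 0.3], sc1=0.5, sw2=0.1, a=0.1, b=0.5, s2_init=0.4)
```

Because the state at each step depends only on the previous state and the current surface value, the model can be run pixel by pixel over a gridded SSM product once a, b, Sw2, and Sc1 are available for each pixel.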

SMAR Parameters Optimization
MATLAB R2022b (MathWorks) was used as the GA modeling platform, with the initial population size and the number of iterations both set to 100 and the variance rate set to 0.9. GA models were constructed for various parameter (SMAR-P1 and SMAR-P2) and root zone (10-20 and 10-50 cm) combinations. A warm-up period was established to correct the error resulting from the discrepancy between the model's initial state (Sc2) and the actual state. The warm-up periods for the 10-20 cm and 10-50 cm layers were 5 and 20 days, respectively, and the results of the warm-up periods were removed. The calibration period spanned from 6 January 2019 to 6 January 2020 and from 21 January 2019 to 21 January 2020, while the validation period ranged from 7 January 2020 to 31 December 2020 and from 22 January 2020 to 31 December 2020, for the 10-20 cm and 10-50 cm layers, respectively. Following this, we selected the RMSE between the measured and estimated RZSM as the objective function for the parameter optimization. We then established the model parameter constraints. More specifically, the constraints for Sw2, Sc1, a, and b were [0,1], while that of V2 was set to [0,2.5] based on previous research and practical scenarios [20]. The constraints had to be relaxed for the 10-20 cm root zone: optimization could not be performed when b was restricted to [0,1], and therefore no constraint was imposed on b at this depth. Please refer to Section 4.1 for more details on the analysis.
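The calibration setup can be illustrated with a small real-coded genetic algorithm. This is a hedged stand-in written in plain Python, not MATLAB's optimizer and not the authors' configuration: the selection, crossover, and mutation operators are illustrative choices, the 0.9 "variance rate" is interpreted here as a mutation probability (an assumption), and the names `genetic_minimize` and `rmse_after_warmup` are hypothetical.

```python
import math
import random


def rmse_after_warmup(est, obs, warmup_days):
    """RMSE between estimates and observations, ignoring the warm-up steps."""
    pairs = list(zip(est, obs))[warmup_days:]
    return math.sqrt(sum((e - o) ** 2 for e, o in pairs) / len(pairs))


def genetic_minimize(objective, bounds, pop_size=100, generations=100,
                     mutation_rate=0.9, seed=0):
    """Minimize objective(params) subject to box bounds [(lo, hi), ...]."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = []  # best objective value at the start of each generation
    for _ in range(generations):
        ranked = sorted(pop, key=objective)
        history.append(objective(ranked[0]))
        parents = ranked[:max(2, pop_size // 5)]  # keep the fittest fifth
        children = [list(ranked[0])]              # elitism: best survives
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            w = rng.random()                      # blend crossover weight
            child = [w * g1 + (1 - w) * g2 for g1, g2 in zip(p1, p2)]
            if rng.random() < mutation_rate:      # Gaussian mutation of one gene
                k = rng.randrange(len(child))
                lo, hi = bounds[k]
                child[k] += rng.gauss(0.0, 0.1 * (hi - lo))
            # Enforce the parameter constraints on every gene.
            child = [max(lo, min(hi, g)) for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        pop = children
    best = min(pop, key=objective)
    return best, objective(best), history


# Constraints from the text: Sw2, Sc1, a, b in [0, 1]; V2 in [0, 2.5].
smar_bounds = [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0), (0.0, 1.0), (0.0, 2.5)]
```

In a real calibration the objective would run the SMAR model with the candidate parameters and return `rmse_after_warmup` against the measured RZSM, using a warm-up of 5 or 20 days depending on the root zone.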

Multiple Linear Regression
Multiple linear regression analysis explores the relationship between one dependent variable and several independent variables. The regression is expressed as follows:

$$ y = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n + c \tag{10} $$

where $a_1, a_2, \cdots, a_n$ are the regression coefficients, $c$ is a constant, $x_1, x_2, \cdots, x_n$ are the independent variables, and $y$ is the dependent variable.
In this study, we developed a multiple linear regression model that integrates the SMAR parameters optimized by GA with soil, vegetation, and climate factors, to enable the regionalization of the SMAR parameters. The relationships between the soil property data and soil hydraulic parameters, as well as the SMAR model parameters, lacked clarity. Thus, we employed sampling with replacement to select soil and climatic factors and, together with the SMAR model parameters, established two-, three-, and four-variable linear regression models. The multiple regression modeling process involves weighing two essential indicators: the coefficient of determination R² and the p-value (p-v). The p-v determines whether the model is successfully constructed, while R² indicates how much of the variation in the SMAR model parameters is explained by the auxiliary variables. Therefore, we identified the combination with the highest R² value while ensuring that the significance level (p-v < 0.05) is met. The SMAR model parameters, which were optimized by the GA, were selected from population values. To minimize parameter uncertainty, we generated 20 sets of SMAR parameters using MATLAB loops and calculated the mean values to analyze the variation of the p-v. The final model was created using the parameter means with reference to the p-value test.
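The search over covariate combinations can be sketched as an exhaustive subset search that fits an ordinary least squares model to each combination and keeps the one with the highest R². This is only an illustration under stated assumptions: it enumerates all combinations rather than sampling them, it reports the overall F statistic instead of a p-value (converting F to a p-value needs the F distribution, which is omitted here), and the names `fit_ols` and `best_subset` and the covariate labels are hypothetical.

```python
from itertools import combinations


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Assumes A is non-singular, i.e. the covariates are not collinear.
    """
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit_ols(X, y):
    """Fit y = a1*x1 + ... + an*xn + c; return ([a1..an, c], R2, F)."""
    rows = [list(r) + [1.0] for r in X]          # append intercept column
    m = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(m)]
    beta = solve(XtX, Xty)                       # normal equations
    pred = [sum(bi * ri for bi, ri in zip(beta, r)) for r in rows]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    p, n = m - 1, len(y)
    f_stat = float("inf") if r2 >= 1.0 else (r2 / p) / ((1.0 - r2) / (n - p - 1))
    return beta, r2, f_stat


def best_subset(covariates, y, sizes=(2, 3, 4)):
    """Return (names, beta, R2, F) for the covariate subset with the highest R2."""
    best = None
    for k in sizes:
        for names in combinations(sorted(covariates), k):
            X = list(zip(*(covariates[name] for name in names)))
            beta, r2, f_stat = fit_ols(X, y)
            if best is None or r2 > best[2]:
                best = (names, beta, r2, f_stat)
    return best
```

Because R² never decreases when a predictor is added, ranking by R² alone favours the largest models; the significance screen (p-v < 0.05) described above is what keeps uninformative combinations out.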

Evaluation Metrics
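The RMSE, Nash-Sutcliffe efficiency (NSE), and adjusted coefficient of determination ($R^2_{adj}$) were employed to evaluate the results. These indicators are defined as follows:

$$ \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left( SM_{est,k} - SM_{obs,k} \right)^2} \tag{11} $$

$$ \mathrm{NSE} = 1 - \frac{\sum_{k=1}^{N} \left( SM_{est,k} - SM_{obs,k} \right)^2}{\sum_{k=1}^{N} \left( SM_{obs,k} - \overline{SM}_{obs} \right)^2} \tag{12} $$

$$ R^2_{adj} = 1 - \frac{(1 - R^2)(N - 1)}{N - p - 1} \tag{13} $$

where $SM_{est}$ and $SM_{obs}$ denote the relative soil moisture estimates and relative soil moisture measurements, respectively; $\overline{SM}_{obs}$ is the mean of the measurements; $R^2$ is the coefficient of determination; $N$ is the number of measurements; and $p$ is the number of predictors.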
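The three metrics can be computed with a few lines of standard-library Python. The sketch below uses the textbook definitions; taking R² as the squared Pearson correlation between estimates and observations is an assumption, and the function names are illustrative.

```python
import math


def rmse(est, obs):
    """Root mean square error between estimates and observations."""
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(est, obs)) / len(obs))


def nse(est, obs):
    """Nash-Sutcliffe efficiency: 1 is perfect, below 0 is worse than the mean."""
    mean_obs = sum(obs) / len(obs)
    ss_err = sum((e - o) ** 2 for e, o in zip(est, obs))
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_err / ss_obs


def r2_adj(est, obs, p):
    """Adjusted R2, with R2 taken as the squared Pearson correlation (assumed)."""
    n = len(obs)
    mean_e, mean_o = sum(est) / n, sum(obs) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(est, obs))
    var_e = sum((e - mean_e) ** 2 for e in est)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    r2 = cov * cov / (var_e * var_o)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical series: the estimates track the observations with a constant bias.
observed = [0.1, 0.2, 0.3, 0.4]
estimated = [0.2, 0.3, 0.4, 0.5]
scores = (rmse(estimated, observed), nse(estimated, observed),
          r2_adj(estimated, observed, p=1))
```

With a constant bias the correlation-based R² stays high while NSE drops, which is one way a simulation can show a reasonable R²adj together with a low or negative NSE.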

In-Site Simulation Results
The SMAR was implemented at a daily temporal resolution for all stations. Figure 3 presents the SMAR parameter optimization results for the study site using the collected data. Sw2, Sc1, a, and b all decreased with the increasing soil depth, yet Sw2 and Sc1 were much less sensitive to soil depth compared to a and b. The soil depth in the root zone did not directly impact the Sc1 and Sw2 calculations, resulting in negligible effects on both variables. The root zone played a key role in the calculation of a and b, directly impacting the obtained values. A considerable disparity was observed in the depth between the root zones of 10-20 cm and 10-50 cm, resulting in marked differences. SMAR-P1 and SMAR-P2 exhibited inconsistent a and b values at the 10-20 cm depth. The median values of a for both SMAR-P1 and SMAR-P2 exceeded 0.4, with a high water loss. This is inconsistent with results from similar environments [20,23]. Moreover, the most suitable value of b ranged within [0,1]. SMAR-P1 and SMAR-P2 were significantly anomalous, with b values exceeding one. The SMAR model parameters were inconsistent for the case of the root zone depth equal to the surface layer depth, which reduced the accuracy of the water movement model results. Therefore, the 10-20 cm root zone depth was not examined in subsequent analyses.
Figure 4a-c presents the simulation and evaluation of RZSM based on site-measured data. The accuracy of the soil moisture estimates in the validation period for SMAR-P1 and SMAR-P2 (R²adj = 0.74 and 0.67, respectively) was lower than that in the calibration period (R²adj = 0.89 and 0.89, respectively) (Figure 4b,c). However, the overall accuracy was similar, indicating that the calibrated model parameters were able to accurately estimate future root-zone soil moisture. The SMAR-P1 (R²adj = 0.82) estimates were more accurate than those of SMAR-P2 (R²adj = 0.79). This indicates that the soil information used in SMAR-P1 can model root-zone soil moisture better than that of SMAR-P2 (Figure 4a).
Figure 4d-f displays the NSE, R²adj, and RMSE results for each individual site, respectively. The results indicate that the estimation accuracy of SMAR-P1 is higher than that of SMAR-P2 (Figure 4a-c). The median NSE values for SMAR-P1 and SMAR-P2 were determined as 0.53 and −0.65, while the median RMSE values were 0.02 and 0.04, and the median R²adj values were 0.58 and 0.63, respectively. The spatially averaged results were consistent with the single site results. In particular, SMAR-P2 could reflect the RZSM trend in the point-scale environment, yet its simulation accuracy and trend response were lower than those of SMAR-P1.

Results of SMAR Parameters Regionalization
The complete validation of the SMAR from the point to the regional scale requires regionalization of the model parameters. Figure 5 presents the p-v results for the 20 optimization parameter sets used for SMAR-P1 and SMAR-P2, as well as a set of mean parameters in the multivariate model construction process. SMAR-P1 exhibits superior p-v results across all three parameter sets, with the exception of Sw2. In particular, SMAR-P1 p-v values exceed 0.05 for just one set of optimization parameters, while the SMAR-P2 p-v values are greater than 0.05 for all parameter sets. This indicates that the water loss coefficient under SMAR-P2 is not significantly correlated with soil and climatic factors; therefore, it cannot be used to construct a regression equation.

The complete validation of the SMAR from the point scale to the regional scale requires regionalization of the model parameters. Figure 5 presents the p-values (p-v) for the 20 optimization parameter sets used for SMAR-P1 and SMAR-P2, as well as for a set of mean parameters, in the multivariate model construction process. SMAR-P1 exhibits superior p-v results across all three parameter sets, with the exception of Sw2. In particular, the SMAR-P1 p-v values exceed 0.05 for just one set of optimization parameters, while the SMAR-P2 p-v values are greater than 0.05 for all parameter sets. This indicates that the water loss coefficient under SMAR-P2 is not significantly correlated with soil and climatic factors; therefore, it cannot be used to construct a regression equation.
The multiple regression exhibited satisfactory overall interpretability, with a minimum R² of 0.4. The p-v of the four parameters were close to the highly significant level, indicating that soil and climate factors explain these parameters to a high degree (Table 1). The DtB was employed in the regression models for all four SMAR parameters, demonstrating that this factor is strongly associated with all of them. PET, the only climate factor, participates in the regression of b, indicating that evapotranspiration is correlated with b, which is consistent with existing studies [23,46]. A significant positive correlation was observed between BD and Sw2, with higher BD increasing soil water retention in the root zone. Silt influenced porosity and played a significant role in the Sc1 regression, revealing a close relationship between porosity and soil moisture diffusion. In summary, the set of multiple linear regression equations identifies the relationship between the SMAR parameters and soil and climate factors, allowing for the regionalization of the SMAR model parameters.
Figure 7 presents the regionalized parameters. High Sw2 values are observed at medium elevations and low values at both high and low elevations. This difference can be attributed to the presence of river plains and floodplains with high water tables at middle elevations, while high and low elevations typically have shallow bedrock with thin soil layers and poor water-holding capacity. The topsoil in the midstream area has a high sand content, a high capacity for water infiltration, and low Sc1 values. The spatial distribution of b is generally consistent with potential evapotranspiration; the greater the evapotranspiration, the faster the soil moisture spreads. Moreover, there is a correlation between the diffusion coefficient and DtB. A decreasing trend in the loss coefficient (a) is observed from the southeast to the northwest of the study area. This is attributed to the combined effect of soil texture and bedrock depth.
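The regionalization step described above can be sketched in a few lines. This is a minimal illustration only, assuming ordinary least squares fitted through the normal equations; the function names (`fit_ols`, `predict`), the covariate values, and the coefficients used to generate the toy data are hypothetical and are not taken from Table 1. The significance test applied in the study is not reproduced here.

```python
def solve_linear_system(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: covariates may be collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_ols(X, y):
    """Fit y = c0 + c1*x1 + ... and return [c0, c1, ...]."""
    design = [[1.0] + list(row) for row in X]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coef, row):
    """Evaluate the fitted regression at one set of covariates."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


# Hypothetical site covariates: (DtB, PET), in arbitrary scaled units.
sites = [(1.0, 8.0), (2.0, 8.5), (1.5, 9.0), (3.0, 8.2), (2.5, 9.5), (0.5, 8.7)]
# Synthetic "calibrated" b values generated from a known linear relation.
calibrated_b = [0.10 + 0.05 * dtb + 0.02 * pet for dtb, pet in sites]

coef = fit_ols(sites, calibrated_b)
# Regionalization: apply the fitted equation to the covariates of a grid cell.
b_pixel = predict(coef, (2.0, 9.0))
```

In practice, the same fit would be repeated for each SMAR parameter (a, b, Sw2, Sc1) with its own covariates, and the fitted equation would then be evaluated on the gridded soil and climate layers.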

Regional Estimation of RZSM
The 2020 RZSM results for the Shandian River Basin at 1 km resolution were obtained based on the GD-SSM (1-day temporal resolution) and the SMAR model domain parameters. Figure 8b depicts the RMSE and R²adj of the regional SMAR-estimated RZSM results versus the corresponding measured RZSM. The median RMSE and R²adj were determined as 0.1 and 0.29, respectively. The regional RZSM simulation exhibits a better accuracy than the site-specific simulations, yet the overall performance is weaker for the former. The reason for this is two-fold: (i) the GD-SSM still presents errors despite the high site accuracy, and (ii) despite the successful SMAR parameter regionalization, the selected variables did not fully explain the parameters. Figure 8a presents the average relative RZSM for 2020. The results demonstrate substantial spatial variation in RZSM in the Shandian River Basin, with high values occurring in the northern region and at lower elevations, and low values occurring at higher elevations.
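The per-pixel estimation can be sketched as a simple time-stepping loop. This is a minimal illustration, assuming the commonly used discrete form of the SMAR update (the model description in Section 2.2.1 is not reproduced in this section); the function name `smar_series`, the parameter values, and the clamping to the range [0, 1] are assumptions added for illustration.

```python
import math


def smar_series(ssm, a, b, sw2, sc1, s2_init, dt=1.0):
    """Propagate relative surface saturation to root-zone saturation.

    Assumed discrete SMAR update:
        s2[j] = sw2 + (s2[j-1] - sw2) * exp(-a * dt) + (1 - sw2) * b * y[j] * dt
    where y[j] = max(s1[j] - sc1, 0) is the surface moisture above field capacity.
    """
    s2 = [s2_init]
    for s1 in ssm[1:]:
        y = max(s1 - sc1, 0.0)  # infiltrating fraction from the surface layer
        nxt = sw2 + (s2[-1] - sw2) * math.exp(-a * dt) + (1.0 - sw2) * b * y * dt
        s2.append(min(max(nxt, 0.0), 1.0))  # added safeguard, not part of the SMAR
    return s2


# Hypothetical parameters for one pixel and a short daily surface series.
dry_down = smar_series([0.2, 0.2, 0.2, 0.2], a=0.1, b=0.3, sw2=0.1, sc1=0.5, s2_init=0.5)
wetting = smar_series([0.9, 0.9], a=0.1, b=0.3, sw2=0.1, sc1=0.5, s2_init=0.1)
```

With a dry surface, the root zone relaxes toward Sw2 at the rate set by a; with a surface wetter than Sc1, the root zone gains moisture in proportion to b.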

Applicability of the SMAR Model at Different Root Depths
Soil moisture correlates significantly with precipitation, evapotranspiration, soil depth, topography, and the type of vegetation present [47-49]. The depth of a plant's root system reflects the soil moisture and groundwater changes. Plant species with deep roots are frequently present in well-drained regions, such as mountains. In contrast, species with shallow roots are mostly found in lowland areas with shallow water tables, such as river plains [50]. The Shandian River Watershed exhibits high elevation, rolling topography, diverse land cover, and significant variations in the rooting depth of vegetation. The applicability of the SMAR model varies across different depths of rooting layers. In this study, two groups of root layers were identified based on their depth: 10-20 cm and 10-50 cm. The delineation was performed in conjunction with site moisture stratification and root depth to investigate the suitability of the SMAR model at varying root layer depths. The results indicate that the SMAR model is effective at a root zone depth of 10-50 cm, yet this is not the case for the 10-20 cm depth. This is due to abnormalities in the calibration parameters (a and b) within the 10-20 cm range, which do not reflect the true rate of soil moisture diffusion and loss. The thickness of the 10-20 cm root zone equals that of the surface layer, giving a depth ratio of one between the two layers. The effective water content of the surface layer is 1, whereas the effective water content of the root zone is 1 - Sw2. Moreover, the residual water in the saturated surface soil surpasses the water capacity of the root zone, which contradicts the assumptions made in the SMAR model (Section 2.2.1). This may explain the lack of research on soil root-zone moisture at varying depths, with soil moisture in the root zone generally estimated at a single depth.
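The depth argument can be checked numerically. The following is a minimal sketch, assuming the standard SMAR definition of the normalized diffusion coefficient, b = n1·Zr1 / ((1 - Sw2)·n2·Zr2), and a 10 cm surface layer; the porosity and Sw2 values are hypothetical and are not the calibrated values of this study.

```python
def diffusion_coefficient(n1, zr1, n2, zr2, sw2):
    """Normalized SMAR diffusion coefficient b (assumed standard definition).

    n1, n2   : porosity of the surface layer and of the root zone
    zr1, zr2 : thickness of the surface layer and of the root zone (same units)
    sw2      : relative saturation at wilting point in the root zone
    """
    return (n1 * zr1) / ((1.0 - sw2) * n2 * zr2)


# Hypothetical values: equal porosity in both layers and Sw2 = 0.2.
n1 = n2 = 0.45
sw2 = 0.2
surface_thickness = 10.0  # cm, assumed surface layer thickness

# 10-20 cm root zone: 10 cm thick, so the depth ratio is one.
b_shallow = diffusion_coefficient(n1, surface_thickness, n2, 10.0, sw2)
# 10-50 cm root zone: 40 cm thick.
b_deep = diffusion_coefficient(n1, surface_thickness, n2, 40.0, sw2)
```

Under these assumptions, equal layer thicknesses and equal porosities give b = 1 / (1 - Sw2), which exceeds one for any Sw2 above zero, whereas the thicker 10-50 cm root zone keeps b below one.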

Multivariate Modeling Analysis
SMAR-P1 and SMAR-P2 were employed to estimate and parameterize the regional distribution of soil moisture in the root zone at depths ranging from 10 to 50 cm. The spatial analysis method effectively captures the spatial variation characteristics of environmental factors [51]. This paper discusses the parameter regionalization results of SMAR-P1 and SMAR-P2 in relation to this methodology. Figure 5 reveals that parameter a in SMAR-P2 could not complete the parameter regionalization because a regression equation with a significant correlation could not be constructed. This may be attributed to the complex spatial distribution of parameter a, which depends on both the soil properties and evapotranspiration [21]. In contrast, SMAR-P2 solely considers the soil property parameters, while evapotranspiration is ignored.
There was a notable disparity in the spatial arrangement of the two sets of parameter values (Figure 9a-d). SMAR-P1 was obtained based on physicochemical data and soil functions, and the spatial distribution of the data is relatively dispersed. This is consistent with the geospatial complexity of the Shandian River Basin. SMAR-P2 is an empirical value derived from soil texture. It generally corresponds to the spatial characteristics of the Shandian River, but yields unsatisfactory results at the local scale. Moreover, the data used for both SMAR-P1 and SMAR-P2 were collected following similar methods, resulting in the same parameter distributions. The graphs of the distributions in the upper left corner of Figure 9 are based on the overall spatial statistics of the corresponding parameters. The SMAR-P2 root zone field capacity (0.17-0.78) and porosity (0.4-0.48) were highly concentrated and exhibited more homogeneity compared to the SMAR-P1 root zone field capacity (0.13-0.39) and porosity (0.43-0.6), which were relatively dispersed and more heterogeneous. At the watershed scale, the spatial distribution of soil moisture can be influenced by multiple factors, including elevation, topography, vegetation, soil type, and climate. Thus, SMAR-P2, which is based on soil texture, may respond more weakly to soil moisture, potentially hindering its ability to create accurate regression equations in the multivariate model compared to SMAR-P1. Consequently, regionalizing the SMAR-P2 parameters was not possible. Furthermore, the insufficient number of measured stations may prevent the full characterization of the SMAR-P2 parameter set, thereby rendering the construction of the model unfeasible.

Regional RZSM Error Analysis
In this study, the applicability of the SMAR model is discussed based on the data for root zones with different depths and covariates from different sources, and the optimal combination is selected to estimate the regional RZSM (Figure 8). The high values are mainly distributed near the river channel and in the northern floodplain, while the low values are concentrated in the upland area. This trend in the simulation results is consistent with the spatial variation of the watershed. The soil moisture within the root zone is evaluated using the collaborative computation of the surface soil moisture, SMAR parameters, and the model. The errors in the estimation results originate from three sources. First, the downscaled soil moisture utilized in this study has been validated in the Shandian River Basin for 2018-2019, but not for the 2020 study period [44]. Consequently, unknown errors in surface soil moisture are superimposed on the RZSM estimates of the model. Second, although the SMAR model parameters are calibrated, they still exhibit errors (Figure 3). In addition, the calibrated model parameters and covariates can be used to construct a multiple regression model, yet it is difficult to obtain a high-precision parameter regionalization map from the multiple regression model. The error in the parameter regionalization process accumulates in the final RZSM result. Third, the SMAR model is a simplified physical model that does not consider lateral soil moisture transport and capillary movement, which may result in systematic errors in the model [20]. Future work will compare and select different methods and data to reduce the uncertainty of RZSM estimations.

Conclusions
In this study, we utilized site data to estimate and assess soil moisture within the root zone at various depths and for different parameter groups. First, the applicability of the SMAR model at various root zone depths was discussed. Second, based on the root-zone soil moisture estimations and the calibration parameters obtained, the results of the different parameter groups in parameter regionalization were compared. Finally, we determined the 2020 RZSM for the Shandian River at a resolution of 1 km and a depth of 10-50 cm. According to the results, the following conclusions were obtained:
At the site scale, diffusion coefficient b at the 10-20 cm root zone, coinciding with the surface layer depth, exceeded the upper limit of the SMAR by one. This is not consistent with the site context, and the SMAR is not applicable at this particular depth. The opposite was observed for the 10-50 cm root zone. In addition, SMAR-P1 (RMSE = 0.02) outperformed SMAR-P2 (RMSE = 0.04) in the 10-50 cm RZSM estimations.
SMAR-P1 combines soil and climate factors to create multiple regression models that are statistically significant, thereby regionalizing the parameters. In contrast, SMAR-P2, which also combines soil and climate factors, did not pass the significance test.
The RZSM results obtained from the regional SMAR model in the Shandian River watershed are consistent with spatial trends, and the spatial distribution of RZSM was significantly influenced by elevation and river discharge.
In summary, the SMAR model can effectively estimate soil moisture in the root zone where the root zone depth exceeds the surface soil depth. In addition, the process of parameter regionalization should determine whether the parameters exhibit heterogeneity similar to that of soil moisture. The generated high-resolution RZSM can provide data support for agricultural production and drought assessments. Note that the regional RZSM in this study was estimated using downscaled surface soil moisture data and regionalized parameters; thus, the RZSM inherits the uncertainties in both. The estimation results are consistent with the overall spatial characteristics of the watershed, but the quality of the estimates varies when considering site-specific factors. Therefore, our future work will compare different parameter regionalization techniques and surface soil moisture data. We will also select the most suitable options for the estimation of regional root-zone soil moisture to reduce uncertainties.

Figure 1. Study area overview. (a) Measuring sites in the study area and climate zoning. (b) Land use categories in the study area. (c) Elevation of the study area.

Figure 2. Flow chart of the methodology used in this study.

Figure 3. Calibration results for SMAR parameters across various parameter groups. The upper boundary condition of the SMAR parameter is indicated by the red dashed line, while the median is represented by the white solid line and labeled values. The white dots represent the mean.

Figure 4a-c presents the simulation and evaluation of RZSM based on site-measured data. The soil moisture results from the validation period of SMAR-P1 and SMAR-P2 (R²adj = 0.74 and 0.67, respectively) were lower than those from the calibration period (R²adj = 0.89 and 0.89, respectively) (Figure 4b,c). However, the overall accuracy was similar, indicating that the model calibration parameters were able to accurately estimate future root-zone soil moisture. The SMAR-P1 (R²adj = 0.82) estimates were more accurate than those of SMAR-P2 (R²adj = 0.79). This indicates that the soil information acquired using SMAR-P1 can model root-zone soil moisture better than that of SMAR-P2 (Figure 4a). Figure 4d-f displays the NSE, R²adj, and RMSE results for each individual site, respectively. The results indicate that the estimation accuracy of SMAR-P1 is higher than that of SMAR-P2 (Figure 4a-c). The median NSE values for SMAR-P1 and SMAR-P2 were determined as 0.53 and −0.65, while the median RMSE values were 0.02 and 0.04, and the median R²adj values were 0.58 and 0.63, respectively. The spatially averaged results were consistent with the single site results. In particular, SMAR-P2 could reflect the RZSM trend in the point-scale environment, yet its simulation accuracy and trend response were lower than those of SMAR-P1.
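The three evaluation metrics reported above can be computed as follows. This is a minimal sketch under stated assumptions: the NSE and RMSE follow their standard definitions, while the adjusted R² is assumed here to be the squared Pearson correlation corrected for a single predictor, which may differ from the exact definition used in the study. The sample values are hypothetical.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, values below 0 mean the
    simulation performs worse than the mean of the observations."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def adjusted_r2(obs, sim, n_predictors=1):
    """Squared Pearson correlation adjusted for sample size (assumed definition)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    r2 = cov ** 2 / (var_obs * var_sim)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Hypothetical observed and simulated RZSM with a constant bias of 0.05.
observed = [0.10, 0.20, 0.30, 0.40]
simulated = [0.15, 0.25, 0.35, 0.45]

site_rmse = rmse(observed, simulated)
site_nse = nse(observed, simulated)
site_r2adj = adjusted_r2(observed, simulated)
```

The toy series shows why the metrics can disagree: a constant bias leaves the correlation-based R²adj at its maximum while lowering the NSE and raising the RMSE, so a parameter group can have a higher R²adj and still a lower NSE.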

Figure 4. (a-c) Show the entire, calibration, and validation periods for soil moisture in the 10-50 cm root zone, respectively. (d-f) Show the NSE, R²adj, and RMSE for individual sites, respectively. Black dots are outliers.

Figure 5. p-v of the correlations between SMAR parameters and soil and climate factors. Dark blue, light blue, and red represent highly significant, significant, and non-significant correlations, respectively.

Water 2023 ,
15, x FOR PEER REVIEW 9 of 16

Figure 6 illustrates the difference between the estimated and calibrated parameters of the multivariate model. Greater accuracies were determined for parameters a (RMSE = 0.06) and b (RMSE = 0.07) compared to Sw2 (RMSE = 0.13) and Sc1 (RMSE = 0.18). Both a and b are directly impacted by porosity, and there is a correlation between the variables chosen

Figure 8. Estimated annual average of RZSM in (a) 2020 and (b) the assessment results.

Table 1. Multiple regression models used in the analysis.