Validation of Downscaled SoilMERGE with NDVI and Storm-Event Analysis in Oklahoma and Kansas

Tobin, Kenneth; Sanchez, Aaron; Alaniz, Alejandro X.; Hernandez, Stephanie; Perez, Adriana; Ganta, Deepak; Bennett, Marvin

doi:10.3390/rs17244058

Open Access Article

by Kenneth Tobin ¹,*, Aaron Sanchez ¹, Alejandro X. Alaniz ¹, Stephanie Hernandez ¹, Adriana Perez ¹, Deepak Ganta ² and Marvin Bennett ²

¹ Center for Earth and Environmental Studies, Texas A&M International University, Laredo, TX 78041, USA
² School of Engineering, Texas A&M International University, Laredo, TX 78041, USA
* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(24), 4058; https://doi.org/10.3390/rs17244058

Submission received: 29 October 2025 / Revised: 11 December 2025 / Accepted: 15 December 2025 / Published: 18 December 2025

(This article belongs to the Special Issue Satellite Soil Moisture Estimation, Assessment, and Applications (Second Edition))

Highlights

What are the main findings?

SoilMERGE (SMERGE) can be successfully downscaled to 500 m via machine learning.
Extreme Gradient Boosting generally outperforms Gradient Boosting and Random Forest.

What are the implications of the main findings?

SMERGE can support historical analysis for diverse applications at a field scale.
This study provides a proof of concept for state-based SMERGE products to be developed across the US Great Plains.

Abstract

SoilMERGE (SMERGE) is a 0.125-degree root zone soil moisture (RZSM) product (0 to 40 cm depth) covering the contiguous United States. The study area included most of Oklahoma and Kansas, a region where SMERGE exhibited superior performance. The time frame examined was the warm season from 2008 to 2019. In this study, evaluation of a prototype downscaled (500 m) version of SMERGE was made using (1) Ranked correlation (R²) benchmarking against Normalized Difference Vegetation Index (NDVI) datasets and (2) Ranked correlation (R²) analysis of antecedent RZSM with storm-event streamflow across a range of precipitation intensities (5 to >35 mm/day) at a watershed scale. In the NDVI benchmarking, all three downscaled products outperformed (0.52 to 0.59) default SMERGE (0.44). EXtreme Gradient Boosting (XGB) and Gradient Boost recorded a higher ranked correlation (0.59) than Random Forest (0.52). Within the study area, ranked correlation analysis of antecedent RZSM with storm-event United States Geological Survey streamflow was examined in five watersheds. For the most intense storm events (>35 mm), antecedent XGB downscaled SMERGE (0.64) outperformed antecedent streamflow (0.43) and all other versions of SMERGE (0.52 to 0.56) as a predictor of storm event response. The results of this study demonstrated broad-scale benefits of Machine Learning-assisted downscaling, providing proof of concept for the development of state-based SMERGE products across the US Great Plains.

Keywords:

SoilMERGE (SMERGE); root zone soil moisture; random forest; gradient boost; EXtreme Gradient Boosting

1. Introduction

For historical studies of soil moisture over a large spatial domain (i.e., state-to-continental scale) viable options include land surface models and satellite soil moisture retrievals. Land surface models offer the advantage of providing continuous, multi-layer soil moisture values that extend into the root zones, from the surface down to a one-meter depth or greater. These models incorporate decades of data, making them suitable for retrospective historical analyses. However, land surface models are prone to forcing errors caused by inaccuracies in input data such as precipitation, temperature, and solar radiation. Errors in land cover characterization can result in biases and inaccuracies that impact model outputs. Commonly used land surface models include the Community Land Model [1], Interaction Soil–Biosphere–Atmosphere [2], and the model suite associated with the North American Land Data Assimilation System [3], which supports several modeling platforms including Mosaic, Noah, and Variable Infiltration Capacity. Despite their benefits, all land surface model data are simulated and untethered from direct observations.

Satellite soil moisture retrievals provide direct observational data, unlike land surface models. Recent satellite platforms, launched since 2011, include the Soil Moisture and Ocean Salinity (SMOS) [4] and Soil Moisture Active Passive (SMAP) [5] missions. SMOS and SMAP are notable because they utilize highly penetrating L-band microwave retrievals, which are more accurate than earlier soil moisture sensors based on the C-, X-, and Ka-bands. Another significant effort is the European Space Agency (ESA) Climate Change Initiative (CCI) merged product [6], which combines all satellite soil moisture products into a single harmonized estimate. This global product provides coverage from 1979 onward with a 0.25-degree spatial resolution. Despite this effort, the ESA CCI has discontinuities in time and space, with more gaps present during earlier decades (pre-2003). Additionally, all satellite microwave retrievals in areas with high biomass have limited penetrative power, resulting in inadequate surface (0 to 5 cm) soil moisture retrievals. Techniques exist that can extend surface soil moisture retrievals into the root zone, such as the Ensemble Kalman Filter [7] and Exponential Filter [8,9], resulting in a depth coverage that is comparable to land surface models.

Another fundamental issue in land surface modeling and satellite-based soil moisture retrievals is their inherently coarse spatial resolution. These products have resolutions typically ranging from 0.125° to 0.25°, which limits the accurate representation of sub-grid heterogeneity and local-scale hydrological processes. There is a pressing need from diverse stakeholders (agricultural, ecological, and hydrological) for higher resolution datasets at a sub-kilometer resolution, i.e., field scale. Machine learning (ML) algorithms such as Random Forest (RF), Gradient Boosting (GB), and EXtreme Gradient Boosting (XGB) are rapidly becoming common approaches to downscale these coarse spatial resolution datasets. This study uses ML to downscale SoilMERGE (SMERGE), which is a product that combines a land surface model (NLDAS/Noah) and satellite soil moisture retrievals from ESA CCI. Tobin et al. [10] examined a similar spatial extent but used sparse in situ data and NASA’s Airborne Microwave Observatory of Subcanopy and Subsurface (AirMOSS) for evaluation. A criticism of [10] was the use of in situ data that may not be representative for comparisons at the larger spatial resolutions typical of satellite-based soil moisture products [11]. This study addresses this concern by providing an evaluation over the entire study area using continuous NDVI data. In addition, examination of antecedent RZSM in association with storm-event streamflow at a watershed scale provided additional insights. As such, this paper fills a vital research gap as the first study in the central U.S. to simultaneously use continuous NDVI fields and basin storm-runoff responses as independent validation for a downscaled soil moisture product. Specifically, downscaled versions of SMERGE at a 500 m spatial resolution were generated. SMERGE is a historical root zone soil moisture product generated with an exponential filter that is gaining acceptance in the community [12,13].

SMERGE validation in this study was two-fold: (1) benchmarking SMERGE against NDVI over the entire study area and (2) examining how antecedent SMERGE acted as a predictor of storm event streamflow response at a watershed scale compared against antecedent streamflow.

2. Materials and Methods

2.1. Data Sets

All datasets used in this study are summarized in Table 1 and come from several sources. Root zone soil moisture (RZSM) downscaling of SMERGE version 2.0 was based on both static and dynamic variables. Static variables, which are constant over the evaluation period, included soil texture (sand, silt, clay), elevation, aspect, and slope. Dynamic variables included SMERGE, Normalized Difference Vegetation Index (NDVI) based on the Moderate Resolution Imaging Spectroradiometer (MODIS), albedo, Leaf Area Index (LAI), and surface temperature.

For pre-processing and results evaluation, three other datasets were required. Broad area NDVI evaluation was facilitated by using a second, independent NDVI product (Advanced Very-High-Resolution Radiometer, AVHRR NDVI Version 5). For watershed-based analyses, Parameter-elevation Regressions on Independent Slopes Model (PRISM) precipitation data was instrumental in selecting storm events. Subsequent analysis was aided by using land cover type from the National Land Cover Database (NLCD). Since landcover is not static, different datasets were used to cover the study period, which include: 2008 to 2009, 2008 NLCD; 2010 to 2012, 2011 NLCD; 2013 to 2014, 2013 NLCD; 2015 to 2016, 2016 NLCD; and 2017 to 2019, 2019 NLCD.

Preprocessing of all datasets was facilitated by the Zonal Statistics as Table function in ArcPy at a 500 m resolution. The data tables generated by this process were stitched together to create a spatially uniform dataset. Preprocessing removed all rows with negative numbers, which are generally associated with flags and edge artifacts, from the LAI, Albedo, NDVI, Clay, Sand, Silt, Slope, Elevation, and Aspect columns. Rows with LAI values greater than 249 or Albedo values equal to 32,767 were also removed. Lastly, dates were used to generate separate Year (YYYY) and Month (MM) variables. This approach enabled ML algorithms to recognize and utilize the temporal proximity of satellite-derived data points to enhance the temporal coherence of the downscaled product.
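The row-filtering rules described above can be illustrated with a short Python sketch. It is not the ArcPy workflow used in the study: the column names, the ISO date format, and the function name `clean_rows` are assumptions made only for illustration.

```python
from datetime import date

# Columns screened for negative flag / edge-artifact values (names assumed).
CHECKED_COLUMNS = ["LAI", "Albedo", "NDVI", "Clay", "Sand", "Silt",
                   "Slope", "Elevation", "Aspect"]


def clean_rows(rows):
    """Drop flagged rows and add Year/Month variables.

    `rows` is a list of dicts holding one numeric value per checked column
    and a 'Date' string in YYYY-MM-DD form (an assumed format).
    """
    cleaned = []
    for row in rows:
        # Negative values are treated as flags or edge artifacts.
        if any(row[col] < 0 for col in CHECKED_COLUMNS):
            continue
        # Fill-value thresholds quoted in the text.
        if row["LAI"] > 249 or row["Albedo"] == 32767:
            continue
        d = date.fromisoformat(row["Date"])
        cleaned.append({**row, "Year": d.year, "Month": d.month})
    return cleaned
```

Keeping Year and Month as separate integer columns, rather than a single date string, gives tree-based regressors two ordered numeric inputs to split on.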

2.2. Study Area

This study focuses on north-central Oklahoma and south-central Kansas (Figure 1). It is in this area where SMERGE exhibits optimal performance in CONUS [14].

For the overall study area, the average soil texture is best classified as a loam with near-equal proportions of sand, silt, and clay (Table 2). Like in [10], land cover was aggregated into three groups to facilitate ease of comparison that include: A—Herbaceous, Cropland, Developed; B—Scrubland, Barren; C—Forest, Wetlands (Table 3). Land cover group A dominates the study area, comprising 92% of the surface. Land cover group C is secondary in abundance and is more abundant to the east. This matches the trend in precipitation, in which there is an increasing trend going from west to east (Figure 1, Table 4).

Five small-to-moderate-sized watersheds within the study area were examined and are described in Table 4. Soils within these basins varied significantly. The Bird Creek, Chikaskia River, and Little Arkansas River watersheds have a loam texture matching the study area’s average composition. The North Fork Ninnescah basin has higher sand content indicative of a sandy loam. Conversely, the Walnut River watershed has a higher content of silt, reflective of a silty clay loam. Four of the five watersheds are dominated by landcover group A, which covers over 90% of each basin. The Bird Creek basin has significant forests and wetlands (33%, landcover group C). Bird Creek and Walnut watersheds to the east (Figure 1) also have the highest annual precipitation and average storm event discharge ratio (Table 4). The Chikaskia River and North Fork Ninnescah, situated in the northwestern portion of the study area, recorded lower annual precipitation and reduced average storm-event discharge ratios.

2.3. Methodology

The methodologies in this study include: (1) date and precipitation event selection; (2) executing ML downscaling; (3) ranked correlation calculation of RZSM against NDVI for the entire study area; and (4) watershed streamflow comparison against antecedent SMERGE soil moisture. The workflow for the procedures used in this study is shown in Figure 2.

2.3.1. Date and Precipitation Event Selection

Selected dates fell within the warm season (March to November) spanning 2008 to 2019. Missing streamflow and input variable data prevented analysis before 2008, and SMERGE is a historical product that ended in 2019. During this period, 200 days were selected for analysis (Table 5). Date selection focused on storm events, each defined as the six-day period following a significant (>5 mm) daily precipitation accumulation within any of the five target watersheds. Most storm events were followed by additional rainfall; events in which rainfall during the following six-day period exceeded 50% of the event precipitation were excluded. Events in which rainfall during the preceding two days exceeded 50% of the event precipitation were also omitted. Additionally, highly localized storms, in which precipitation occurred over less than 25% of the watershed, were not examined. Finally, any event with a daily minimum temperature below 4 °C during the storm event or any of the following six days was omitted.
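The screening rules for storm events can be written as a single predicate. The sketch below is one reading of the criteria in the text, not the authors' script. It assumes that the 50% thresholds apply to rainfall summed over the preceding two days and over the following six days, and that the watershed-mean precipitation, minimum temperature, and wetted-area fraction are already available as daily values. All names are hypothetical.

```python
def is_valid_storm_event(precip_mm, tmin_c, wet_fraction, i):
    """Return True if day `i` passes the storm-event screening rules.

    precip_mm    : daily watershed-mean precipitation (mm), one value per day
    tmin_c       : daily minimum temperature (deg C), one value per day
    wet_fraction : fraction of the watershed receiving rain on day `i`
    i            : index of the candidate event day
    """
    # Need two antecedent days and six following days in the record.
    if i < 2 or i + 6 >= len(precip_mm):
        return False
    event = precip_mm[i]
    if event <= 5.0:                          # significant event: > 5 mm/day
        return False
    if sum(precip_mm[i + 1:i + 7]) > 0.5 * event:   # post-event rainfall
        return False
    if sum(precip_mm[i - 2:i]) > 0.5 * event:       # preceding two days
        return False
    if wet_fraction < 0.25:                   # highly localized storm
        return False
    if min(tmin_c[i:i + 7]) < 4.0:            # cold conditions
        return False
    return True
```

If the thresholds were instead applied to any single day rather than to the multi-day sum, only the two `sum(...)` lines would change to `max(...)`.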

2.3.2. Machine Learning Downscaling

Three ML algorithms were used to create three downscaled versions of SMERGE: Random Forest (RF), a widely used technique for downscaling soil moisture estimates; EXtreme Gradient Boosting (XGB); and Gradient Boost (GB). The spatial resolution of downscaled SMERGE was 500 m. For the NDVI benchmarking, the dataset was randomly divided by date across the entire study area, allocating 70% for training and 30% for testing. A second, watershed-scale approach was also employed, in which the entire dataset was first split between training (80%) and testing (20%). After splitting, the watershed data were moved from the training dataset into the testing dataset. This resulted in an overall data split of 63% training and 37% testing.
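The two splitting strategies can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the fixed random seed, and the `is_watershed_pixel` callback are hypothetical, and the study's own splitting code is not shown in the text.

```python
import random


def split_by_date(dates, test_fraction=0.30, seed=0):
    """Randomly assign whole dates to training or testing.

    Splitting on dates rather than rows keeps every pixel from one day
    in the same partition.
    """
    unique = sorted(set(dates))
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    rng.shuffle(unique)
    n_test = round(len(unique) * test_fraction)
    test_dates = set(unique[:n_test])
    train_dates = set(unique[n_test:])
    return train_dates, test_dates


def hold_out_watersheds(train_rows, test_rows, is_watershed_pixel):
    """Move watershed rows out of training and into testing."""
    kept = [r for r in train_rows if not is_watershed_pixel(r)]
    moved = [r for r in train_rows if is_watershed_pixel(r)]
    return kept, test_rows + moved
```

With 200 selected dates and `test_fraction=0.30`, `split_by_date` returns 140 training dates and 60 testing dates. Applying `hold_out_watersheds` after an 80/20 split shifts the balance toward testing, which is the mechanism behind the 63/37 split reported above.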

Yggdrasil Decision Forests’ RF implementation, the Distributed (Deep) Machine Learning Community (DMLC) XGB implementation, and scikit-learn’s Gradient Boost (GB) were used to downscale SMERGE. The three models ran as regressors on nodes at the Texas Advanced Computing Center (TACC). Each model ran with 40 nodes, each with 50 CPU cores (Xeon 8280) and 192 GB DDR4. Without the support of TACC, hyperparameter tuning would have taken months instead of hours. Table 6 specifies the hyperparameter settings used in this study. An iterative approach to hyperparameter tuning was implemented, in which tuning values were constrained within a physically realistic range, and optimal parameter values were converged on after hundreds of model runs.

Sensitivity of independent variables was determined using a unified metric for all machine learning models, the SHapley Additive exPlanations (SHAP) framework. SHAP attributes each model prediction to the contributions of individual variables; higher SHAP values indicate greater importance for a given variable, whereas lower SHAP values indicate that the variable has a smaller impact on the model's predictions. Values range from as high as 0.01 to as low as 0.0004. A direct numerical comparison for RF, XGB, and GB is shown in Table 7.
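To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a toy three-variable model by enumerating every coalition of variables. It is a simplification and not the SHAP library: absent variables are replaced by a single baseline vector (an assumption), and the model, inputs, and function names are hypothetical. Averaging the absolute values over many samples gives the kind of per-variable importance that can be compared across models.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x` relative to `baseline`."""
    n = len(x)

    def value(subset):
        # Variables outside the coalition are set to their baseline value.
        z = [x[j] if j in subset else baseline[j] for j in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def mean_abs_importance(phi_rows):
    """Mean absolute Shapley value per variable across samples."""
    n = len(phi_rows[0])
    return [sum(abs(row[j]) for row in phi_rows) / len(phi_rows)
            for j in range(n)]


def toy_model(z):
    """Hypothetical linear stand-in for a trained regressor."""
    return 2.0 * z[0] + 0.5 * z[1] - 1.0 * z[2]
```

For a linear model each Shapley value reduces to the coefficient times the departure from the baseline, and the values sum to the difference between the prediction and the baseline prediction. Exact enumeration grows exponentially with the number of variables, which is why practical tools rely on tree-specific or sampling approximations.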

2.3.3. Ranked Correlation Comparisons Between NDVI and SMERGE

Bolten and Crow [15] developed an approach that used NDVI to benchmark the performance of remotely sensed surface soil moisture. In this study, a Python (version 3.7.0) script was developed to extract data from a specific time frame, ensuring that only information from the years 1999 to 2019 was included in the analysis. This filtering allowed for the calculation of SMERGE 2.0 monthly soil moisture anomalies over a two-decade period. The strength and significance of the relationships between variables were assessed with Spearman’s rank correlation, which uses ranked data to measure the monotonic relationship between two variables. Spearman’s correlation coefficient, symbolized as ρ, measures the strength and direction of the association between two ranked variables and ranges from −1 to +1. If ρ = 0, there is no association between ranks. The closer ρ is to zero, the weaker the association between ranks, and the closer its absolute value is to one, the stronger the association. An R² value was used to evaluate the strength of the relationship to avoid consideration of the sign. Exceptionally low AVHRR NDVI values (<0.1), associated with rocky or urban areas, were removed from the analysis described below.

The rank correlation compared the average soil moisture anomaly in the summer months (June to August) between years. Historical monthly averages for AVHRR NDVI and default SMERGE were calculated from 1999 to 2019 by grouping and averaging, for every grid square, all values that belonged to the same month. The downscaled SMERGE averages were calculated in the same way, except that limited input data availability before 2008, most notably for LAI and Albedo, constrained the downscaled period to 2008 to 2019. Anomalies were calculated as the difference between the raw AVHRR NDVI or SMERGE value for a given day and the respective historical monthly mean, and were then grouped and averaged by month and year. Spearman’s rank correlation coefficient was calculated using the average anomaly values of AVHRR NDVI and SMERGE, for both default and downscaled versions. For the ranked correlation analysis, we used the spearmanr function from the SciPy stats module in Python (version 3.7.0). To provide additional metrics to validate the described approach, root mean square error (RMSE) and significance testing were included. For our NDVI benchmarking and streamflow analyses, p-values < 0.05 indicate that the observed correlations between SMERGE (default or downscaled) and the AVHRR NDVI represent statistically meaningful relationships instead of spurious associations from random sampling variability. Lower p-values strengthen confidence that improvements in R² from default to downscaled SMERGE reflect genuine enhancements in product performance.

RMSE quantifies the average magnitude of differences between predicted and observed values (or between two products), expressed in the same units as the variable of interest (m³/m³ for soil moisture). Unlike correlation, which measures the strength of association, RMSE captures the absolute magnitude of deviations. A low RMSE (0.02–0.04) shows that the downscaled SMERGE output maintains the same general patterns as default SMERGE, while potentially adding spatial detail. When both metrics show improvement (higher R² with significant p-values and lower RMSE), this provides robust evidence that downscaled SMERGE better captures the temporal/spatial patterns in the AVHRR data.
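The anomaly, ranked-correlation, and RMSE steps can be sketched for a single grid square as below. The study itself used `scipy.stats.spearmanr`; the pure-Python `spearman_rho` here is a dependency-free stand-in that returns the coefficient only, with no p-value. The record layout `(year, month, value)` and all function names are assumptions.

```python
from collections import defaultdict
from math import sqrt


def monthly_anomalies(records):
    """Mean anomaly per (year, month) from (year, month, value) records."""
    by_month = defaultdict(list)
    for _, month, value in records:
        by_month[month].append(value)
    # Historical mean for each calendar month.
    climatology = {m: sum(v) / len(v) for m, v in by_month.items()}
    grouped = defaultdict(list)
    for year, month, value in records:
        grouped[(year, month)].append(value - climatology[month])
    return {key: sum(v) / len(v) for key, v in grouped.items()}


def summer_mean_by_year(anomalies, months=(6, 7, 8)):
    """Average the June-to-August anomalies within each year."""
    per_year = defaultdict(list)
    for (year, month), value in anomalies.items():
        if month in months:
            per_year[year].append(value)
    return {y: sum(v) / len(v) for y, v in sorted(per_year.items())}


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranked data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def rmse(predicted, observed):
    """Root mean square error, in the units of the inputs."""
    n = len(predicted)
    return sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)
```

Squaring the value returned by `spearman_rho` gives the sign-free R² used for benchmarking. Passing the yearly summer anomalies of NDVI and of SMERGE, aligned by year, reproduces the comparison described in the text for one grid square.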

2.3.4. Watershed Streamflow Comparison Against Antecedent Soil Moisture

A second, independent metric used to evaluate downscaled SMERGE involved assessing its effectiveness as a predictor of storm-event streamflow across five watersheds. Storm events were split into four groups: 5 to 15, 15 to 25, 25 to 35, and >35 mm/day (Table 8). Daily PRISM precipitation was defined as beginning at 00:00 Coordinated Universal Time (UTC) and United States Geological Survey (USGS) daily streamflow started at −06:00 UTC. Travel-time lags between precipitation and streamflow roughly correspond to this time discrepancy, reflecting the streamflow response at a basin’s gauging station after a rainfall event. USGS streamflow measurements were converted into water depths normalized by basin area, which allowed for the derivation of a storm runoff ratio that represents the proportion of streamflow against total rainfall for each storm event. For each storm event intensity group, Spearman’s rank coefficient of determination (R²) was calculated for the storm runoff ratio versus antecedent soil moisture (the average soil moisture over the two days before the event) and versus antecedent streamflow (the average streamflow over the two days before the rainfall event). This approach has been applied to evaluate soil moisture performance previously [14,16]. Higher R² values reflect a more robust predictive measure for both antecedent soil moisture and streamflow in terms of the conversion of storm-event rainfall into streamflow.
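The conversion from discharge to a basin-normalized depth, and from there to a storm runoff ratio, is simple arithmetic that can be sketched as follows. The units (daily mean discharge in m³/s, basin area in km²), the handling of group boundaries, and the function names are assumptions. The text does not state whether baseflow was removed, so none is subtracted here.

```python
CFS_TO_M3S = 0.028316846592   # one cubic foot per second in m^3/s
SECONDS_PER_DAY = 86400.0


def runoff_depth_mm(daily_flow_m3s, area_km2):
    """Depth of water (mm) over the basin for a series of daily mean flows."""
    volume_m3 = sum(q * SECONDS_PER_DAY for q in daily_flow_m3s)
    area_m2 = area_km2 * 1.0e6
    return volume_m3 / area_m2 * 1000.0


def storm_runoff_ratio(daily_flow_m3s, area_km2, event_precip_mm):
    """Proportion of event rainfall that left the basin as streamflow."""
    return runoff_depth_mm(daily_flow_m3s, area_km2) / event_precip_mm


def intensity_group(event_precip_mm):
    """Assign an event to one of the four intensity groups (mm/day).

    Boundary values are placed in the lower group, which is an assumption.
    """
    if event_precip_mm > 35:
        return ">35"
    if event_precip_mm > 25:
        return "25 to 35"
    if event_precip_mm > 15:
        return "15 to 25"
    if event_precip_mm > 5:
        return "5 to 15"
    return None
```

As a check on the units, a mean flow of 10 m³/s for one day over an 864 km² basin is a depth of 1 mm, so a 20 mm storm would give a runoff ratio of 0.05. Flows reported in cubic feet per second can be multiplied by `CFS_TO_M3S` first.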

3. Results

Downscaling model sensitivity results are presented in Table 7. All three models share some similarities. Because different tools were used to evaluate sensitivity for the three models, we opted to evaluate sensitivity on a relative basis (low, moderate, high). Sand was consistently identified as a highly sensitive variable across all models, whereas NDVI, LAI, albedo, aspect and slope consistently exhibited low sensitivity. Likewise, silt had a moderate sensitivity for all models. The remaining variables exhibited divergent behavior in the RF model compared to those using boosting methodologies (XGB and GB). For instance, clay showed higher sensitivity in the RF model than in the XGB and GB models. Year, Month, silt, elevation, and temperature had a lower sensitivity for RF versus XGB and GB models.

NDVI benchmarking across the entire study area was used to evaluate the performance of both default and downscaled SMERGE; Table 9 and Figure 3 present these results. Downscaling depicts landscape features with greater fidelity than default SMERGE (Figure 4 and Figure 5). All three downscaling models outperformed default SMERGE in this comparison. The boosting-based models (XGB, GB) had higher-ranked correlations than RF. XGB models had the lowest RMSE, and all downscaled models had highly significant p-values. These results were also examined based on landcover type. For landcover type A (Herbaceous, Cropland, Developed), which defines 92% of the study area, the boosting-based models (XGB, GB) outperformed RF, mirroring the overall results. For landcover type C (Forest, Wetlands), which occupies 8% of the study area, all three models had similar ranked correlations, with GB slightly outperforming RF and XGB.

Watershed results were organized based on precipitation intensity grouping (5 to 15, 15 to 25, 25 to 35, and >35 mm/day; Table 10 and Figure 6), and the number of storm events in each watershed is indicated in Table 8. For 5 to 35 mm storm events, USGS streamflow had a high ranked correlation (R² = 0.7), which is higher than both antecedent default and downscaled SMERGE. For the most intense storm events (>35 mm/day), average streamflow for the five examined watersheds dropped to R² = 0.431. For these most intense storm events, both antecedent default and downscaled SMERGE exhibited a higher ranked correlation than streamflow, but only XGB downscaled SMERGE outperformed the default SMERGE. Examining the watersheds individually for the most intense storm events (>35 mm/day; Table 11) revealed further detail. In all watersheds, XGB downscaled SMERGE outperformed other downscaling models. However, in the North Fork Ninnescah River basin, all downscaling models underperformed compared with the default version of SMERGE. The North Fork Ninnescah River basin had a much higher proportion of sand than the four other watersheds (Table 2). Albergel et al. [8] noted that the Exponential Filter approach yielded poor performance with highly sandy soils, perhaps accounting for the underperformance of ML in this watershed. A major assumption of the Exponential Filter approach is that the entire soil profile is in hydrologic equilibrium. Albergel et al. [8] noted that, for sandy soils, decoupling between the surface and root zone was more likely.

4. Discussion

4.1. NDVI Benchmarking Performance Based on Geography

An east–west performance comparison was made between two grids located on the eastern and western edges of the study area, along the same latitude defining the northernmost extent of Oklahoma (Figure 1). The western grid consisted of a more open landcover (Table 12) and all versions of SMERGE outperformed their counterparts in the eastern grid. All downscaled versions greatly outperformed default SMERGE in both eastern and western grids (Table 12). In the eastern grid, the downscaled GB model achieved the highest-ranked correlation, while in the western grid, the downscaled XGB model performed slightly better. However, across both grids, all downscaled versions of SMERGE showed similar performance, with ranked correlation values differing by no more than 0.02.

4.2. NDVI Benchmarking Performance Compared Against Previous Studies

In the last decade, there has been a rapidly growing literature focused on downscaling coarse spatial resolution soil moisture products. Typically, comparisons are made using in situ soil moisture data from sparse sensor networks, which can present challenges when evaluated against larger-scale, grid-based soil moisture measurements [14]. Nevertheless, numerous examples exist in the literature where downscaling has conferred added value compared to coarser resolution satellite products. For example, Zhao et al. [17] focused on the Iberian Peninsula and, using RF, generated a downscaled product at 1 km that recorded an improvement in correlation of 0.31 compared with SMAP. In China, Abowarda et al. [18] used RF and data fusion on ESA CCI and SMAP, and the downscaled product had an improved correlation of 0.14. Xu et al. [19] downscaled SMAP using XGB and GB to a 1 km resolution, which exhibited superior performance. Downscaling has also been demonstrated to improve the performance of root zone soil moisture (RZSM). Tobin et al. [10] examined the same study area as this paper, and reported a 0.15 increase in correlation for downscaled (400 to 700 m) SMERGE compared against SMERGE at its default 12.5 km resolution, which unlike this study was evaluated using in situ data. Other downscaling studies include Mahmood et al. [20], who downscaled RZSM in northeast Germany to 100 m resolution using RF. Sahaar and Niemann [21] downscaled SMAP. Francis and Bryson [22] focused on in situ sites in Kentucky, comparing a 1 km downscaled product generated by RF and Soil Evaporative Efficiency.

Despite these notable efforts, concerns lingered about the validity of using in situ data to validate satellite soil moisture products. For example, Tobin et al. [14] used NDVI benchmarking to validate the SMERGE product across CONUS. In the study area, default SMERGE had a ranked correlation R² = 0.44 compared with the ML downscaled versions of SMERGE (R² = 0.52 to 0.59; Table 9). Another broad area comparison by Tobin et al. [10] is also relevant to establishing the utility of the downscaling process, focusing on the results from NASA’s Airborne Microwave Observatory of Subcanopy and Subsurface (AirMOSS) campaign at the Marena Oklahoma Soil Moisture Active Passive In Situ Testbed (MOISST) site in central Oklahoma. At a spatial resolution of 400 m, all three ML algorithms examined (RF, XGB, GB) yielded an increase in correlation (r) of around 0.4, compared to SMERGE at its native resolution. These results bolster the conclusion that the downscaling process confers added value over a broad area. In addition, for comparisons made at the watershed scale, antecedent soil moisture from SMERGE had a greater predictive value than USGS streamflow only for the most intense storm events (>35 mm/day). Tobin et al. [14] also noted this behavior. Significantly, the XGB downscaled version of SMERGE demonstrated superior performance compared to both the default and other downscaled versions. Crow et al. [16] found that for storm events exceeding 35 mm/day, SMAP L4 exhibited a ranked correlation more than 0.1 higher than USGS streamflow data.

4.3. Temporal Analysis of SMERGE NDVI Performance over Time

Figure 7 illustrates the temporal progression of correlation performance between AVHRR and SMERGE products using an aggregate time series approach, where each year incorporates all data from 2007 through that year. All downscaled products show clear improvement after 2012, with correlations stabilizing around 0.5–0.7 compared to the notably poor performance in 2009 (R ≈ −0.7 to −0.5). This marked improvement after 2012 coincides with the advent of L-band satellite missions (SMOS and SMAP), which provided more accurate soil moisture retrievals that were incorporated into the SMERGE product. The machine learning downscaled versions (XGB, GB, RF) generally achieved higher correlations than default SMERGE from 2012 onwards, with XGB and GB demonstrating particularly robust performance through 2017, maintaining correlations near 0.6–0.7.

4.4. Model Comparison and Sensitivity Discussion

The following discussion compares variable sensitivity reported in previous downscaling efforts with that found in this study. Kovačević et al. [23] developed a data engineering tool coupled with RF to downscale CCI in California. That study found that NDVI and Day of Year had moderate importance, whereas climate, precipitation, and LST variables had low importance. Similarly, Zappa et al. [24] utilized RF to downscale satellite-based soil moisture to 30 m in Austria. In that study, topography had a higher predictive power than soil texture. Mahmood et al. [20] also utilized RF to downscale soil water index values from northeast Germany derived from the Copernicus Global Land services. NDWI (normalized difference water index) was the most significant variable, whereas topographic variables were not as significant.

Xu et al. [19] downscaled SMAP using gradient boosting (XGBoost), light gradient boosting machine, and categorical boosting to generate a 1 km resolution product. In that study, the effect of elevation on soil moisture was complex, since it was influenced not only by elevation itself but also by how elevation interacted with other variables. LAI played a minor role in the downscaling models, while clay, elevation, and land surface temperature (LST) were significant variables. Vegetation had a complex relationship with downscaled soil moisture. Liu et al. [25] applied classification and regression tree (CART), RF, gradient boost decision tree, and XGB downscaling approaches to generate SMAP soil moisture with a 1 km spatial resolution in southwest France. That study noted that vegetation inputs (NDVI, LAI) were less important for RF than in the boosting-based models. Conversely, albedo was more important for RF than for boosting models. LST and elevation were about the same importance for all models. Sahaar and Niemann [21] downscaled SMAP using five ML models: artificial neural network, RF, XGBoost, Categorical Boosting, and Light Gradient Boosting Machine. That study noted that vegetation was less significant than soil and elevation variables.

The results from this study are comparable to those of the previous studies. Vegetation inputs (NDVI, LAI) had a lower sensitivity and albedo had a higher sensitivity (Table 7), matching the results of Liu et al. [25]. Furthermore, temperature and elevation had similar sensitivities, as observed by [25]. Xu et al. [19] documented that LAI played a minor role in the downscaling models. Elevation was identified as a major variable in the boosting models examined by [19], which aligns with the findings of this study (Table 7). Focusing only on RF results, this study matches the findings of Kovačević et al. [23], who noted that NDVI had a moderate influence on downscaled results. The high predictive power of elevation noted by Zappa et al. [24] matched the results of this study but was contradicted by the findings of Mahmood et al. [20], who indicated that topographic variables were not significant. Finally, in this study, for boosting models, vegetation variables were less significant than soil and elevation variables, as noted by Sahaar and Niemann [21].

4.5. Implications for the Development of State-Wide Versions of SMERGE

While downscaling SMERGE, spatial artifacts that aligned with state boundaries were identified. These artifacts manifested as sharp discontinuities in the soil moisture field, where features that would be expected to continue smoothly across state lines instead terminated abruptly. The same artifacts were detected in the soil texture data, specifically the proportions of sand (Figure 8), silt, and clay, which have a moderate-to-high sensitivity in the downscaling models (Table 7).

Mann–Whitney U tests were used to compare soil texture between Oklahoma and Kansas samples (Figure 8) of the study area at the pixel level (500 m resolution). All three soil texture components exhibited statistically significant differences across the state boundary (Table 13). The Mann–Whitney U statistic represents the number of times a value from one group exceeds a value from the other group when all possible pairs are compared. Larger U values indicate greater separation between the two distributions. The rank-biserial correlation (r) quantifies effect size, ranging from −1 to +1, where values closer to ±1 indicate greater differences between groups. Effect sizes are interpreted as small (|r| < 0.3), medium (0.3 ≤ |r| < 0.5), or large (|r| ≥ 0.5).
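The two statistics described above can be illustrated with a minimal Python sketch. It assumes the sign convention r = 1 − 2U/(n₁n₂), with U counted for the first group; the text does not state which convention was used, although this one appears consistent with the signs reported in Table 13. The pixel values are hypothetical and are not data from this study.

```python
from itertools import product


def mann_whitney_u(first, second):
    """Count pairs in which a value from `first` exceeds a value from `second`.

    Tied pairs contribute 0.5, the usual convention for the U statistic.
    """
    u = 0.0
    for a, b in product(first, second):
        if a > b:
            u += 1.0
        elif a == b:
            u += 0.5
    return u


def rank_biserial(u, n_first, n_second):
    """Effect size r = 1 - 2U / (n1 * n2); the sign convention is an assumption."""
    return 1.0 - 2.0 * u / (n_first * n_second)


def effect_label(r):
    """Thresholds quoted in the text: small < 0.3 <= medium < 0.5 <= large."""
    size = abs(r)
    if size < 0.3:
        return "small"
    if size < 0.5:
        return "medium"
    return "large"


# Hypothetical sand percentages for a handful of pixels (not study data).
oklahoma_sand = [20.0, 25.0, 30.0]
kansas_sand = [5.0, 10.0, 22.0]

u_stat = mann_whitney_u(oklahoma_sand, kansas_sand)
r_effect = rank_biserial(u_stat, len(oklahoma_sand), len(kansas_sand))
print(u_stat, round(r_effect, 3), effect_label(r_effect))
```

Explicit pair counting is used here only because it mirrors the definition; for pixel-level samples of the size analyzed in this study, a rank-based computation of U would be needed.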

These U statistics and r values can be interpreted with high confidence, since all p-values are well below 0.0001. Sand content was significantly higher in Oklahoma (Median = 19.8%) than in Kansas (Median = 7.8%), U = 1.04 × 10¹³, with a large effect size (r = −0.808). Conversely, clay content was higher in Kansas (Median = 39.3%) than in Oklahoma (Median = 29.5%), U = 1.84 × 10¹², r = 0.679 (large effect). Silt content was also significantly higher in Kansas (Median = 53.2%) than in Oklahoma (Median = 47.9%), U = 3.20 × 10¹², r = 0.442 (medium effect).

These abrupt discontinuities in soil texture at the state boundary cannot be attributed to natural soil texture gradients, which typically exhibit gradual spatial transitions. Rather, these artifacts appear to stem from differences in state-level soil survey methodologies, classification schemes, and mapping standards within the gNATSGO database. Soil texture variables exhibit moderate-to-high sensitivity across all three machine learning models (Table 7), with sand showing high sensitivity in every model. RF in particular is highly sensitive to soil texture: SHAP values place sand, silt, and clay among the four most sensitive variables. As a result, these discontinuities propagate through the downscaling process and contribute to the observed state-line artifacts in the downscaled SMERGE product. This finding supports our recommendation for state-by-state downscaling approaches to maintain internal consistency of soil texture characterization.

A grid comparison examines state-line artifacts in soil texture present between Oklahoma and Kansas (Figure 8). All versions of SMERGE have higher ranked correlations in Kansas compared to the adjacent grid in Oklahoma (Table 12). Kansas has much lower sand and higher silt and clay compared to Oklahoma (Figure 8, Table 14). The landcover between these two grids is roughly similar (Table 14), albeit landcovers A and C are somewhat higher and lower, respectively, in Kansas. In both grids, downscaled SMERGE had markedly higher ranked correlations than default SMERGE and downscaled XGB yielded superior results in both grids.
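For readers who wish to reproduce a grid-level comparison of this kind, the sketch below shows one way a ranked correlation between soil moisture and NDVI could be computed. It assumes that "ranked correlation" refers to the Spearman rank correlation (the Pearson correlation of the ranks, with tied values sharing their mean rank); the function names and the sample series are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def ranked_correlation(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks.

    Raises ZeroDivisionError if either series is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical soil moisture (m3/m3) and NDVI values for five dates in one grid.
soil_moisture = [0.21, 0.25, 0.19, 0.30, 0.27]
ndvi = [0.40, 0.48, 0.35, 0.52, 0.55]
print(round(ranked_correlation(soil_moisture, ndvi), 3))
```

Because only the ordering of the values matters, this measure is insensitive to any monotonic rescaling of soil moisture or NDVI, which is convenient when comparing products with different dynamic ranges.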

Performance also differed somewhat between Kansas and Oklahoma when SMERGE was evaluated across the entire study area. Downscaled SMERGE outperformed default SMERGE in both states (Table 12); however, the best downscaled performer differed between states. For Kansas, downscaled XGB yielded the best performance, but for Oklahoma downscaled RF edged out the other two downscaled versions of SMERGE. While land cover for the two state-wide comparisons was similar, there were some differences in soil texture (Table 14). The proportion of sand was lower, and the proportions of silt and clay were higher, in Kansas than in Oklahoma.

Soil texture had a moderate-to-high sensitivity in all three models and had a strong influence on the downscaled soil moisture value. As such, inconsistencies in survey methods, classification schemes, or spatial resolution across state lines may have introduced artificial discontinuities at administrative boundaries. These discontinuities could propagate through the downscaling process, leading to model artifacts and degraded performance in regions near state borders. To address this, future work will focus on downscaled versions of SMERGE on a state-by-state basis. This approach removes the impact of discontinuities at state lines and maintains the internal consistency of soil texture data, enhancing the utility of these products to potential stakeholders.

5. Conclusions

This paper demonstrates that downscaling SMERGE yields benefits across the entire study area. The main conclusions are as follows:

(1): NDVI benchmarking facilitated performance evaluation over the entire study area and was not confined to in situ sites. Overall, all downscaled versions of SMERGE outperformed the default version of SMERGE. XGB and GB generally had higher ranked correlations than RF.
(2): Improvements in downscaled performance based on NDVI benchmarking are comparable to those observed in previous studies based on in situ comparisons.
(3): XGB downscaled SMERGE was a superior predictor of storm event response at a watershed scale for the most intense storm events examined, >35 mm/day. Antecedent XGB downscaled SMERGE outperformed other downscaled versions of SMERGE, default SMERGE, and USGS streamflow for the most intense storm events.
(4): State line discrepancies in soil texture characterization between Kansas and Oklahoma were propagated into the downscaled version of SMERGE. Based on this finding, the SMERGE team will focus on the development of state versions of SMERGE where soil properties were uniformly characterized.

Author Contributions

Conceptualization, K.T.; methodology, K.T. and A.S.; software, A.S.; validation, K.T., A.S. and A.P.; formal analysis, A.S.; investigation, K.T. and A.S.; resources, K.T.; data curation, A.S., A.X.A., A.P. and S.H.; writing—original draft preparation, K.T. and A.S.; writing—review and editing, D.G. and M.B.; visualization, A.X.A. and A.S.; supervision, K.T. and A.S.; project administration, K.T. and A.S.; funding acquisition, K.T. All authors have read and agreed to the published version of the manuscript.

Funding

The authors acknowledge support from the United States Department of Energy Research Development and Partnership Pilot (RDPP, award number DE-SC0023067). Support from NASA Climate Indicator and Data Products for Future National Climate Assessments program through award # NNX16AH30G and NSF Geoscience Equipment (Award Number 1636769) is also gratefully acknowledged.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.

Acknowledgments

The Texas Advanced Computing Center at The University of Texas at Austin (TACC; http://www.tacc.utexas.edu) provided computational resources that have contributed to the research results reported within this paper. The assistance of Pablo Rangel (TAMIU ARC Writing Consultant) is also greatly appreciated.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

SMERGE	SoilMERGE
RZSM	Root zone soil moisture
NDVI	Normalized Difference Vegetation Index
XGB	Extreme Gradient Boosting
SMOS	Soil Moisture Ocean Salinity
SMAP	Soil Moisture Active Passive
ESA	European Space Agency
CCI	Climate Change Initiative
RF	Random Forest
GB	Gradient Boosting
LAI	Leaf Area Index
MODIS	Moderate Resolution Imaging Spectroradiometer
AVHRR	Advanced Very-High-Resolution Radiometer
PRISM	Parameter-elevation Regressions on Independent Slopes Model
NLCD	National Land Cover Database
AIRMOSS	Airborne Microwave Observatory of Subcanopy and Surface
DMLC	Distributed (Deep) Machine Learning Community
TACC	Texas Advanced Computing Center
IMMD	Inverse Mean Minimum Depth
SHAP	SHapley Additive exPlanations
UTC	Coordinated Universal Time
USGS	United States Geological Survey
MOISST	Marena Oklahoma Soil Moisture Active Passive In Situ Testbed

References

1. Lawrence, D.M.; Fisher, R.A.; Koven, C.D.; Oleson, K.W.; Swenson, S.C.; Bonan, G.; Collier, N.; Ghimire, B.; van Kampenhout, L.; Kennedy, D.; et al. The Community Land Model version 5: Description of new features, benchmarking, and impact of forcing uncertainty. J. Adv. Model. Earth Syst. 2019, 11, 4245–4287.
2. Mahfouf, J.-F.; Noilhan, J. Inclusion of gravitational drainage in a land surface scheme based on the force-restore method. J. Appl. Meteorol. Clim. 1996, 35, 987–992.
3. Mitchell, K.E.; Lohmann, D.; Houser, P.R.; Wood, E.F.; Schaake, J.C.; Robock, A.; Cosgrove, B.A.; Sheffield, J.; Duan, Q.; Luo, L.; et al. The multi-institution North American Land Data Assimilation System (NLDAS): Utilizing multiple GCIP products and partners in a continental distributed hydrological modeling system. J. Geophys. Res. Atmos. 2004, 109, D07S90.
4. Kerr, Y.; Al-Yaari, A.; Rodriguez-Fernandez, N.; Parrens, M.; Molero, B.; Leroux, D.; Bircher, S.; Mahmoodi, A.; Mialon, A.; Richaume, P.; et al. Overview of SMOS performance in terms of global soil moisture monitoring after six years in operation. Remote Sens. Environ. 2016, 180, 40–63.
5. Reichle, R.H.; De Lannoy, G.J.M.; Liu, Q.; Ardizzone, J.V.; Colliander, A.; Conaty, A.; Crow, W.; Jackson, T.J.; Jones, L.A.; Kimball, J.S.; et al. Assessment of the SMAP Level-4 surface and root-zone soil moisture product using in situ measurements. J. Hydrometeorol. 2017, 18, 2621–2645.
6. Dorigo, W.; Wagner, W.; Albergel, C.; Albrecht, F.; Balsamo, G.; Brocca, L.; Chung, D.; Ertl, M.; Forkel, M.; Gruber, A.; et al. ESA CCI Soil Moisture for improved Earth system understanding: State-of-the art and future directions. Remote Sens. Environ. 2017, 203, 185–215.
7. Reichle, R.H.; Crow, W.T.; Koster, R.D.; Sharif, H.O.; Mahanama, S.P.P. Contribution of soil moisture retrievals to land data assimilation products. Geophys. Res. Lett. 2008, 35, L01404.
8. Albergel, C.; Ruediger, C.; Pellarin, T.; Calvet, J.-C.; Fritz, N.; Froissard, F.; Suquia, D.; Petitpa, A.; Piguet, B.; Martin, E. From near-surface to root-zone soil moisture using an exponential filter: An assessment of the method based on in-situ observations and model simulations. Hydrol. Earth Syst. Sci. 2008, 12, 1323–1337.
9. Wagner, W.; Lemoine, G.; Rott, H. A method for estimating soil moisture from ERS scatterometer and soil data. Remote Sens. Environ. 1999, 70, 191–207.
10. Tobin, K.; Sanchez, A.; Esparza, D.; Garcia, M.; Ganta, D.; Bennett, M. Machine Learning Downscaling of SoilMERGE in the United States Southern Great Plains. Remote Sens. 2023, 15, 5120.
11. Crow, W.T.; Berg, A.A.; Cosh, M.H.; Loew, A.; Mohanty, B.P.; Panciera, R.; de Rosnay, P.; Ryu, D.; Walker, J.P. Upscaling sparse ground-based soil moisture observations for the validation of coarse-resolution satellite soil moisture products. Rev. Geophys. 2012, 50, RG2002.
12. Dangol, S.; Zhang, X.; Liang, X.; Blanc-Betes, E. Advancing the SWAT model to simulate perennial bioenergy crops: A case study on switchgrass growth. Environ. Model. Softw. 2023, 170, 105834.
13. Wang, M.; Wyatt, B.M.; Ochsner, T.E. Accurate statistical seasonal streamflow forecasts developed by incorporating remote sensing soil moisture and terrestrial water storage anomaly information. J. Hydrol. 2023, 626, 130154.
14. Tobin, K.J.; Crow, W.T.; Dong, J.; Bennett, M.E. Validation of a new soil moisture product Soil MERGE. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 3351–3365.
15. Bolten, J.D.; Crow, W.T. Improved prediction of quasi-global vegetation conditions using remotely-sensed surface soil moisture. Geophys. Res. Lett. 2012, 39, L19406.
16. Crow, W.T.; Chen, F.; Reichle, R.H.; Liu, Q. L band microwave remote sensing and land data assimilation improve the representation of prestorm soil moisture conditions for hydrologic forecasting. Geophys. Res. Lett. 2017, 44, 5495–5503.
17. Zhao, W.; Sánchez, N.; Lu, H.; Li, A. A spatial downscaling approach for the SMAP passive surface soil moisture product using random forest regression. J. Hydrol. 2018, 563, 1009–1024.
18. Abowarda, A.S.; Bai, L.; Zhang, C.; Long, D.; Li, X.; Huang, Q.; Sun, Z. Generating surface soil moisture at 30 m spatial resolution using both data fusion and machine learning toward better water resources management at the field scale. Remote Sens. Environ. 2021, 255, 112301.
19. Xu, J.; Su, Q.; Li, X.; Ma, J.; Song, W.; Zhang, L.; Su, X. A Spatial Downscaling Framework for SMAP Soil Moisture Based on Stacking Strategy. Remote Sens. 2024, 16, 200.
20. Mahmood, T.; Löw, J.; Pöhlitz, J.; Wenzel, J.L.; Conrad, C. Estimation of 100 m root zone soil moisture by downscaling 1 km soil water index with machine learning and multiple geodata. Environ. Monit. Assess. 2024, 196, 823.
21. Sahaar, S.A.; Niemann, J.D. Estimating Rootzone Soil Moisture by Fusing Multiple Remote Sensing Products with Machine Learning. Remote Sens. 2024, 16, 3699.
22. Francis, D.M.; Bryson, L.S. Proposed methodology for site-specific soil moisture obtainment utilizing coarse satellite-based data. Environ. Earth Sci. 2023, 82, 377.
23. Kovačević, J.; Cvijetinović, Z.; Stančić, N.; Brodić, N.; Mihajlović, D. New Downscaling Approach Using ESA CCI SM Products for Obtaining High Resolution Surface Soil Moisture. Remote Sens. 2020, 12, 1119.
24. Zappa, L.; Forkel, M.; Xaver, A.; Dorigo, W. Deriving field scale soil moisture from satellite observations and ground measurements in a hilly agricultural region. Remote Sens. 2019, 11, 2596.
25. Liu, Y.; Xia, X.; Yao, L.; Jing, W.; Zhou, C.; Huang, W.; Li, Y.; Yang, J. Downscaling satellite retrieved soil moisture using Regression tree-based machine learning algorithms over Southwest France. Earth Space Sci. 2020, 7, e2020EA001267.

Figure 1. Study area map (large dashed box) including location of examined watersheds (dashed lines) and grids (small solid boxes) used for analysis.

Figure 2. Flowchart displaying data iteration process and validation.

Figure 3. Comparison of overall correlation of SMERGE versions against AVHRR NDVI.

Figure 4. Rasterized example of default SMERGE. Sample date is 30 July 2014.

Figure 5. Rasterized example of eXtreme Gradient Boosting (XGB). Sample date is 30 July 2014.

Figure 6. Comparison of streamflow and SMERGE ranked correlation against storm runoff ratio for different storm-event intensities.

Figure 7. Comparison of SMERGE correlation on a year-by-year basis.

Figure 8. Proportion of sand focusing on state line artifact between Kansas and Oklahoma.

Table 1. Data sets used to support machine learning downscaling, pre-processing, and results evaluation. All links were accessed on 10 June 2024.

Data Sets	Description	Download URL	Spatial Resolution
ML Static
Elevation	USGS Elevation Products (3DEP), 1/3 arc-sec DEM	TNM Download v2 (https://www.nationalmap.gov)	10 m
Soil Texture	Gridded National Soil Survey Geographic Database (gNATSGO) from which sand, silt, and clay values were derived	https://www.nrcs.usda.gov/resources/data-and-reports/gridded-national-soil-survey-geographic-database-gnatsgo	30 m
ML Dynamic
RZSM	SMERGE-Noah-CCI root zone soil moisture 0–40 cm L4 daily V2.0 (SMERGE_RZSM0_40CM)	https://www.tamiu.edu/cees/smerge/data.shtml	12.5 km
Albedo	MCD43A3 v061 MODIS/Terra + Aqua BRDF/Albedo Daily L3 Global 500 m	https://www.earthdata.nasa.gov/data/catalog/lpcloud-mcd43a3-061	500 m
LAI	MCD15A3H v061 MODIS/Terra + Aqua Leaf Area Index/FPAR 4-Day L4 Global 500 m SIN	https://www.earthdata.nasa.gov/data/catalog/lpcloud-mcd15a3h-061	500 m
NDVI-1	Temporally Smoothed Weekly Aqua Collection 6 (C6) Moderate Resolution Imaging Spectroradiometer (MODIS) Normalized Difference Vegetation Index (NDVI)	Remote Sensing Phenology CONUS 250 m Smoothed NDVI (https://phenology.cr.usgs.gov/get_data_smNDVI.php)	250 m
Daily mean temperature	PRISM daily mean temperature, calculated as (tmax + tmin)/2	https://data.prism.oregonstate.edu/time_series/us/an/4km/tmean/daily/	4 km
Other Data
Daily mean precipitation	PRISM daily mean precipitation	https://data.prism.oregonstate.edu/time_series/us/an/4km/ppt/daily/	4 km
Land Cover	National Land Cover Database	https://www.usgs.gov/centers/eros/science/national-land-cover-database	30 m
NDVI-2	NOAA Climate Data Record (CDR) of AVHRR Normalized Difference Vegetation Index (NDVI) Version 5	https://www.ncei.noaa.gov/products/climate-data-records/normalized-difference-vegetation-index	5 km

Table 2. Soil texture in the entire study area and the five watersheds.

	USGS Outlet Number	Sand	Silt	Clay
Watershed—Bird Creek	7,177,500	38.8%	39.1%	22.2%
Watershed—Chikaskia River	7,152,000	35.5%	42.9%	21.7%
Watershed—Little Arkansas River	7,144,200	21.7%	48.8%	29.5%
Watershed—North Fork Ninnescah	7,144,750	65.9%	20.5%	13.6%
Watershed—Walnut River	7,147,800	7.8%	55.2%	37.0%
Overall Study Area	-	33.9%	42.2%	23.9%

Table 3. Land cover for the entire study area and the five watersheds.

	USGS Outlet Number	A	B	C
Watershed—Bird Creek	7,177,500	65.9%	1.5%	32.6%
Watershed—Chikaskia River	7,152,000	97.1%	0.2%	2.9%
Watershed—Little Arkansas River	7,144,200	96.2%	0.6%	3.8%
Watershed—North Fork Ninnescah	7,144,750	96.4%	0.4%	3.5%
Watershed—Walnut River	7,147,800	93.6%	0.6%	6.3%
Overall Study Area	-	91.8%	0.2%	8.0%

Table 4. Characteristics of the five examined watersheds.

Basin Name	USGS Outlet Number	Basin Size (km²)	Annual Avg. P (mm)	Avg. Storm Event Q Ratio
Bird Creek near Sperry, OK, USA	7,177,500	2349	953	0.0789
Chikaskia River near Blackwell, OK, USA	7,152,000	4851	847	0.0105
Little Arkansas River at Valley Center, KS, USA	7,144,200	3237	827	0.0167
North Fork Ninnescah River above Cheney Reservoir, KS, USA	7,144,750	1424	752	0.0112
Walnut River at Winfield, KS, USA	7,147,800	4869	976	0.0309

Table 5. Storm-event dates selected for machine learning downscaling.

Year	Dates
2008	24 April, 8 May, 27 May, 9 June, 27 June, 13 July, 18 July, 19 July, 29 July, 30 July, 10 August, 25 August, 12 September, 7 October, 15 October, 23 October
2009	19 April, 27 April, 8 May, 16 May, 27 May, 3 June, 13 June, 21 June, 9 July, 21 July, 1 August, 18 August, 9 September, 26 September, 9 October, 22 October, 29 October, 30 October
2010	23 April, 30 April, 13 May, 20 May, 25 May, 31 May, 13 June, 14 June, 5 July, 15 July, 16 July, 25 July, 17 August, 18 August, 24 August, 1 September, 16 September, 24 September
2011	25 April, 12 May, 31 May, 17 June, 13 July, 4 August, 10 August, 13 August, 18 September, 22 September, 10 October
2012	20 March, 14 April, 15 April, 1 May, 31 May, 3 June, 4 June, 15 June, 21 June, 15 August, 25 August, 26 August, 14 September, 27 September, 13 October, 14 October
2013	2 May, 8 May, 9 May, 20 May, 30 May, 17 June, 28 June, 14 July, 30 July, 16 August, 30 August, 20 September, 28 September, 5 October, 29 October
2014	12 May, 24 May, 26 May, 10 June, 15 June, 29 June, 1 July, 10 July, 11 July, 17 July, 18 July, 31 July, 10 August, 29 August, 1 September, 2 September, 6 September, 24 September, 25 September, 13 October, 4 November
2015	26 March, 17 April, 27 April, 24 May, 27 May, 29 May, 12 June, 15 June, 18 June, 7 July, 10 July, 21 July, 31 July, 5 August, 10 August, 18 August, 23 August, 29 August, 9 September, 11 September, 26 September, 9 October, 31 October
2016	11 April, 27 April, 30 April, 17 May, 31 May, 1 June, 24 June, 3 July, 6 July, 15 July, 29 July, 12 August, 26 August, 1 September, 9 September, 10 September, 17 September, 25 September, 7 October, 26 October
2017	29 March, 5 April, 20 April, 22 April, 3 May, 11 May, 20 May, 18 June, 30 June, 9 July, 14 July, 23 July, 6 August, 10 August, 17 August, 18 September, 26 September, 27 September, 5 October, 7 October, 15 October
2018	22 April, 26 April, 3 May, 10 May, 19 May, 20 May, 30 May, 31 May, 12 June, 25 June, 1 July, 14 July, 18 July, 29 July, 7 August, 3 September, 21 September, 22 September
2019	14 April, 18 April, 24 April

Table 6. Hyperparameter settings selected for the machine learning downscaling models.

Random Forest (Version 0.13.0)

Num_trees = 500
Max_depth = 20
Min_examples = 5
Split_axis = “SPARSE_OBLIQUE”
Sparse_oblique_projection_density_factor = 1.0
Sparse_oblique_normalization = “MIN_MAX”
Sparse_oblique_weights = “CONTINUOUS”
Categorical_algorithm = “RANDOM”
Winner_take_all = True
Num_threads = 50

XGBoost (Version 1.6.2)

Colsample_bytree = 1.0
Device = “gpu”
Learning_rate = 0.005
Max_depth = 13
N_estimators = 1500
Subsample = 0.8
Tree_method = “hist”

Gradient Boost (Version 1.0.2)

N_estimators = 1000
Subsample = 0.817
Min_samples_split = 9
Min_samples_leaf = 4
Max_features = 4
Max_depth = 7
Learning_rate = 0.092
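As an illustration of how the Gradient Boost settings in Table 6 might be passed to a model, the sketch below stores them as keyword arguments. It assumes that the Gradient Boost model is scikit-learn's GradientBoostingRegressor, which the table does not state explicitly; the lowercase parameter names follow that library, and `random_state` is a hypothetical addition that is not part of Table 6.

```python
# Gradient Boost settings from Table 6, with names lowercased to keyword form.
GB_SETTINGS = {
    "n_estimators": 1000,
    "subsample": 0.817,
    "min_samples_split": 9,
    "min_samples_leaf": 4,
    "max_features": 4,
    "max_depth": 7,
    "learning_rate": 0.092,
}


def build_gradient_boost(settings=GB_SETTINGS, seed=0):
    """Create an untrained regressor; scikit-learn is an assumed dependency."""
    # Imported lazily so the settings can be inspected without scikit-learn.
    from sklearn.ensemble import GradientBoostingRegressor

    # random_state is an added, hypothetical choice so that row subsampling
    # (subsample < 1) gives repeatable results.
    return GradientBoostingRegressor(random_state=seed, **settings)
```

With `max_features = 4`, each split would consider four candidate predictors; if the model inputs are the twelve variables listed in Table 7, that is one third of the predictors per split.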

Table 7. Model sensitivity results.

Variable	RF Downscaled SMERGE	XGB Downscaled SMERGE	GB Downscaled SMERGE
Year	0.002703	0.01158	0.010724
Month	0.002316	0.010745	0.010632
Sand	0.013603	0.015659	0.011035
Silt	0.004526	0.008927	0.009021
Clay	0.00807	0.006518	0.004605
Elevation	0.005095	0.011947	0.009767
Aspect	0.000472	0.002215	0.001136
Slope	0.001254	0.003808	0.002615
LAI	0.001639	0.001195	0.001408
Temperature	0.003359	0.00959	0.009289
Albedo	0.004283	0.004545	0.005548
NDVI	0.001283	0.001067	0.00128
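The relative importance of the inputs can be read from Table 7 by sorting each column. The short sketch below does this with the tabulated values; the function and variable names are illustrative only.

```python
# Sensitivity values copied from Table 7: (RF, XGB, GB) per variable.
SENSITIVITY = {
    "Year": (0.002703, 0.01158, 0.010724),
    "Month": (0.002316, 0.010745, 0.010632),
    "Sand": (0.013603, 0.015659, 0.011035),
    "Silt": (0.004526, 0.008927, 0.009021),
    "Clay": (0.00807, 0.006518, 0.004605),
    "Elevation": (0.005095, 0.011947, 0.009767),
    "Aspect": (0.000472, 0.002215, 0.001136),
    "Slope": (0.001254, 0.003808, 0.002615),
    "LAI": (0.001639, 0.001195, 0.001408),
    "Temperature": (0.003359, 0.00959, 0.009289),
    "Albedo": (0.004283, 0.004545, 0.005548),
    "NDVI": (0.001283, 0.001067, 0.00128),
}
MODELS = ("RF", "XGB", "GB")


def top_variables(model, count=4):
    """Return the `count` variables with the highest sensitivity for one model."""
    column = MODELS.index(model)
    ranked = sorted(SENSITIVITY, key=lambda name: SENSITIVITY[name][column], reverse=True)
    return ranked[:count]


for model in MODELS:
    print(model, top_variables(model))
```

Sorting in this way places sand first for all three models and puts sand, clay, elevation, and silt in the top four for RF.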

Table 8. Number of storm events in each watershed for each precipitation threshold (2008 to 2019).

Watershed	5–15 mm/Day	15–25 mm/Day	25–35 mm/Day	>35 mm/Day
Bird Creek	31	33	10	22
Chikaskia River	38	17	10	17
Little Arkansas River	39	24	9	15
North Fork Ninnescah	35	26	11	20
Walnut River	25	35	12	15
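Counts such as those in Table 8 result from sorting storm events into precipitation intensity classes. A minimal sketch of that step follows. The treatment of values falling exactly on a class boundary is an assumption (each class excludes its lower bound and includes its upper bound), and the event totals are hypothetical.

```python
from collections import Counter

# (label, lower bound, upper bound) in mm/day. Lower bounds are exclusive and
# upper bounds inclusive; this boundary handling is an assumption.
INTENSITY_CLASSES = (
    ("5-15", 5.0, 15.0),
    ("15-25", 15.0, 25.0),
    ("25-35", 25.0, 35.0),
    (">35", 35.0, float("inf")),
)


def intensity_class(precip_mm_per_day):
    """Return the class label for one storm event, or None if it falls below the classes."""
    for label, lower, upper in INTENSITY_CLASSES:
        if lower < precip_mm_per_day <= upper:
            return label
    return None


def count_events(precip_values):
    """Tally storm events per intensity class, skipping events outside all classes."""
    labels = (intensity_class(p) for p in precip_values)
    return Counter(label for label in labels if label is not None)


# Hypothetical watershed-average storm totals (mm/day), not data from the study.
events = [4.0, 6.5, 12.0, 18.3, 35.0, 40.2, 51.7]
print(count_events(events))
```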

Table 9. Overall and by landcover ranked correlation, root mean square error (RMSE), and p-values.

Product	Metric	Overall	Landcover A	Landcover B	Landcover C
Default SMERGE	Correlation	0.444	0.433	0.620	0.318
Default SMERGE	p-value	0.021	0.024	0.001	0.106
Downscaled RF SMERGE	Correlation	0.524	0.541	0.683	0.490
Downscaled RF SMERGE	p-value	0.005	0.004	<0.001	0.009
Downscaled RF SMERGE	RMSE	-	0.035	0.036	0.035
Downscaled XGB SMERGE	Correlation	0.588	0.596	0.701	0.486
Downscaled XGB SMERGE	p-value	0.001	0.001	<0.001	0.011
Downscaled XGB SMERGE	RMSE	-	0.032	0.031	0.032
Downscaled GB SMERGE	Correlation	0.591	0.587	0.698	0.504
Downscaled GB SMERGE	p-value	0.001	0.001	<0.001	0.007
Downscaled GB SMERGE	RMSE	-	0.033	0.035	0.034

Table 10. Watershed average ranked correlation based on precipitation intensity.

Precipitation Intensity (mm/Day)	Streamflow	Default SMERGE	RF Downscaled	XGB Downscaled	GB Downscaled
5 to 15	0.716	0.504	0.457	0.479	0.467
15 to 25	0.686	0.546	0.535	0.573	0.499
25 to 35	0.718	0.716	0.604	0.639	0.614
>35	0.431	0.562	0.521	0.616	0.519

Table 11. Ranked correlation for most intense storm events (>35 mm/day) by basin.

Basin	Streamflow	Default SMERGE	RF Downscaled	XGB Downscaled	GB Downscaled
Bird Creek	0.442	0.844	0.840	0.880	0.863
Chikaskia River	0.422	0.471	0.581	0.699	0.530
Little Arkansas River	0.226	0.209	0.190	0.232	0.190
North Fork Ninnescah	0.333	0.545	0.412	0.476	0.317
Walnut River	0.735	0.741	0.584	0.791	0.698
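The basin values in Table 11 can be related to the >35 mm/day row of Table 10 with a simple check. The sketch below assumes that the watershed average is an unweighted mean of the five basins, which is not stated explicitly; under that assumption the Table 11 columns reproduce the Table 10 values to within rounding.

```python
COLUMNS = ("Streamflow", "Default SMERGE", "RF Downscaled", "XGB Downscaled", "GB Downscaled")

# Ranked correlations for >35 mm/day events, copied from Table 11.
TABLE_11 = {
    "Bird Creek": (0.442, 0.844, 0.840, 0.880, 0.863),
    "Chikaskia River": (0.422, 0.471, 0.581, 0.699, 0.530),
    "Little Arkansas River": (0.226, 0.209, 0.190, 0.232, 0.190),
    "North Fork Ninnescah": (0.333, 0.545, 0.412, 0.476, 0.317),
    "Walnut River": (0.735, 0.741, 0.584, 0.791, 0.698),
}

# The >35 mm/day row of Table 10, in the same column order.
TABLE_10_OVER_35 = (0.431, 0.562, 0.521, 0.616, 0.519)


def column_means(table):
    """Unweighted mean of each column across basins (an assumed averaging rule)."""
    rows = list(table.values())
    return [sum(row[i] for row in rows) / len(rows) for i in range(len(COLUMNS))]


for name, mean, reported in zip(COLUMNS, column_means(TABLE_11), TABLE_10_OVER_35):
    print(f"{name}: mean of basins = {mean:.4f}, Table 10 = {reported:.3f}")
```

Small differences in the third decimal place are expected, since the basin values in Table 11 are themselves rounded.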

Table 12. Specific area NDVI comparisons.

Comparison	Default SMERGE	RF Downscaled	XGB Downscaled	GB Downscaled
East–West Performance
Eastern Grid	0.2418	0.4768	0.4664	0.4860
Western Grid	0.5092	0.6624	0.6782	0.6716
State-Line Performance
Kansas Grid	0.3199	0.5183	0.6313	0.5403
Oklahoma Grid	0.2259	0.4951	0.5463	0.5305
Overall State Performance
Kansas Overall	0.4637	0.5042	0.6136	0.5598
Oklahoma Overall	0.4463	0.5995	0.5952	0.5958

Table 13. Mann–Whitney U test results.

Variable	Oklahoma Median	Kansas Median	U-Statistic	p-Value	Effect Size (r)	Interpretation
Sand (%)	19.8	7.8	1.04 × 10¹³	<0.0001	−0.808	Large
Clay (%)	29.5	39.3	1.84 × 10¹²	<0.0001	0.679	Large
Silt (%)	47.9	53.2	3.20 × 10¹²	<0.0001	0.442	Medium

Table 14. Soil texture and landcover of specific areas.

Comparison	Sand	Silt	Clay	Landcover A	Landcover B	Landcover C
East–West Performance
Eastern Grid	23.4%	50.0%	26.6%	75.3%	1.5%	23.2%
Western Grid	59.3%	26.0%	14.7%	93.5%	2.9%	3.7%
State-Line Performance
Kansas Grid	10.3%	53.2%	36.5%	89.5%	0.4%	10.1%
Oklahoma Grid	29.2%	45.3%	25.6%	82.3%	1.0%	16.4%
Overall State Performance
Kansas Overall	27.5%	45.4%	27.0%	80.7%	2.2%	17.1%
Oklahoma Overall	39.6%	39.2%	21.2%	83.5%	0.7%	15.8%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
