3.1. Effect of Input Predictors on SM Nowcasting/Forecasting Accuracy
Table 5 summarizes the ConvLSTM model performance for short-term (1-day lead time) forecasts at surface and subsurface depths in terms of the median and interquartile range of R and ubRMSE for each of the three input scenarios. Among these, scenario 2 (S2), which incorporated SMAP ancillary data and soil texture for the surface layer and added upper-layer SM estimates to improve forecasts for the deeper layers (10–40 cm and 40–100 cm), outperformed S1 and S3, confirming it as the best predictor combination. These results indicate that the inclusion of soil texture information, as a major soil physical property, can improve the ability of the model to capture nonlinear spatiotemporal patterns of SM dynamics. This is in accordance with previous studies [39,48], which pointed out the importance of soil texture in near-surface SM modeling. For example, at the surface layer (0–10 cm), the median R over SCAN sites improved from 0.58 (S1 and S3) to 0.61 under S2, while the median ubRMSE decreased from 0.07 (S3) to 0.045 cm3 cm−3 under S1 and S2.
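For readers less familiar with the evaluation metrics, the sketch below shows how R, bias, and ubRMSE are commonly computed for a single site from paired forecast and in situ series. It is a minimal illustration that assumes the standard definitions (Pearson correlation, mean difference, and RMSE after removing the mean difference); this section does not list the formulas, and the function name `forecast_metrics` is hypothetical.

```python
import math


def forecast_metrics(forecast, observed):
    """Return (R, bias, ubRMSE) for paired forecast and observed SM series.

    Assumes the standard definitions; the paper's exact implementation
    is not given in this section.
    """
    n = len(forecast)
    if n != len(observed) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")

    mean_f = sum(forecast) / n
    mean_o = sum(observed) / n

    # Bias: mean forecast minus mean observation.
    bias = mean_f - mean_o

    # Anomalies of each series about its own mean.
    anom_f = [f - mean_f for f in forecast]
    anom_o = [o - mean_o for o in observed]

    # Unbiased RMSE: RMSE computed on the anomalies, so a constant
    # offset between forecast and observation does not contribute.
    ubrmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(anom_f, anom_o)) / n)

    # Pearson correlation coefficient (NaN if either series is constant).
    denom = math.sqrt(sum(a * a for a in anom_f) * sum(b * b for b in anom_o))
    r = sum(a * b for a, b in zip(anom_f, anom_o)) / denom if denom > 0 else float("nan")

    return r, bias, ubrmse
```

Under these definitions ubRMSE² = RMSE² − bias², which is why a site can show a noticeable bias and still have a low ubRMSE. The per-scenario medians and interquartile ranges would then be taken over the per-site values.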
A similar trend was observed for the two subsurface layers, with lower ubRMSE values (0.03 cm3 cm−3) and lower R values than those of the surface layer (Table 5). Supporting this, Tahmouresi et al. [49] also reported a strong positive correlation between SMAP surface SM and the clay-to-sand content ratio, which indicates the value of soil texture in SM prediction. In the subsurface layer (10–40 cm), the inclusion of upper-layer SM estimates (i.e., 0–10 cm) in S2 notably improved the accuracy of SM nowcasts (i.e., 1-day lead time): the median R increased from 0.39 (S1) and 0.50 (S3) to 0.59 (S2), the median ubRMSE decreased from 0.042 (S1) to 0.04 cm3 cm−3 (S2 and S3), and the median bias improved from −0.008 (S3) to −0.003 cm3 cm−3 (S1 and S2). These results indicate the advantage of including upper-layer SM data to improve SM forecasts in deeper layers. Similar improvements were noted in the 40–100 cm layer. Comparable results were also obtained using USCRN data across all three soil layers and scenarios (Table 5). The availability of SOLUS soil products, which provide high-resolution (100 m) maps of basic soil properties at multiple depths, is advantageous for improving SM forecasts and downscaling SM products [50]. In this study, although the SOLUS-based soil texture maps (sand, silt, and clay) were resampled from a 100 m to a 9 km grid, they still played an important role in improving SM forecasting at various depths.
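To make the resampling step concrete, the sketch below aggregates a fine grid to a coarser one by averaging non-overlapping blocks. It is an illustration under a stated assumption: the text does not say which resampling method was applied to the SOLUS maps, so block averaging is used here only as one plausible choice, and `block_mean` is a hypothetical helper. Going from 100 m to 9 km corresponds to a factor of 90 along each axis.

```python
def block_mean(grid, factor):
    """Aggregate a 2-D grid (list of equal-length rows) by averaging
    non-overlapping factor x factor blocks.

    Block averaging is an assumed resampling method, not necessarily
    the one used in the study.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")

    coarse = []
    for r0 in range(0, rows, factor):
        out_row = []
        for c0 in range(0, cols, factor):
            # Collect every fine-resolution cell inside the coarse cell.
            block = [
                grid[r][c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
            ]
            out_row.append(sum(block) / len(block))
        coarse.append(out_row)
    return coarse


# Toy example: a 4 x 4 sand-fraction grid aggregated by a factor of 2.
sand_fine = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 2, 6, 6],
    [2, 2, 6, 6],
]
sand_coarse = block_mean(sand_fine, 2)
```

Averaging removes sub-grid texture variability, so each 9 km cell carries only the mean sand, silt, or clay content of the 8100 fine cells it covers.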
Comparing the results from S2 and S3 indicates that including additional predictors does not necessarily improve model performance. This aligns with findings from prior studies, where reducing the number of input features through selection techniques such as Lasso regularization improved model accuracy and generalization [51,52]. Interestingly, the inclusion of precipitation data in S3 did not improve the model performance. Although precipitation is the primary source of water input to the soil, its short-term signal does not always translate directly into measurable changes in soil moisture, particularly in deeper layers. The lag introduced by infiltration and percolation processes often delays the response of SM to precipitation events [53]. As a potential improvement, future work could explore incorporating precipitation lag effects more explicitly, for example, by replacing daily precipitation inputs with aggregated 7-day or 14-day averages, as suggested by Heuvelink et al. [54]. Moreover, other dominant controls such as evapotranspiration, soil hydraulic properties, vegetation cover, and antecedent moisture conditions may exert stronger influences on SM dynamics than same-day precipitation. Another contributing factor may be the scale mismatch, as precipitation datasets typically have coarser spatial resolution and higher uncertainty compared to SM and other predictors. Consistent with this observation, Karthikeyan and Mishra [55] reported that precipitation contributed least to SM prediction across all depths in feature importance analysis. Given that Scenario 2 (S2) produced more accurate short-term forecasts for both surface and subsurface layers, we relied on S2 to present the ConvLSTM model results in the following sections.
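The aggregated-precipitation idea mentioned above can be sketched as a trailing moving average, in which the predictor for day t is the mean precipitation over the preceding window ending at t. This is an illustration only: the window handling (trailing, with no value until a full window is available) is an assumption, and `trailing_mean` is a hypothetical helper rather than part of the study's pipeline.

```python
def trailing_mean(values, window):
    """Trailing moving average of a daily series.

    Entry t is the mean of values[t - window + 1 .. t]; entries before a
    full window is available are None (an assumed convention).
    """
    if window < 1:
        raise ValueError("window must be at least 1")

    out = []
    running = 0.0
    for t, v in enumerate(values):
        running += v
        if t >= window:
            # Drop the value that just left the window.
            running -= values[t - window]
        out.append(running / window if t >= window - 1 else None)
    return out


# Toy daily precipitation (mm): two wet days separated by a dry spell.
precip = [0, 7, 0, 0, 0, 0, 0, 14]
precip_7d = trailing_mean(precip, 7)
```

A trailing window uses only past and same-day precipitation, so it does not leak future information into a forecast input, and it smooths the daily pulses into a signal closer to the delayed soil response.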
3.2. Short- and Mid-Term Surface and Subsurface SM Forecasts
Building on the finding that the S2 scenario provided the highest accuracy for 1-day forecasts at the three soil layers for both SCAN and USCRN, we further evaluated its performance for short- (1 and 7 days) and mid-term (14 and 30 days) forecast lead times.
Figure 3 depicts the evaluation of S2 scenario forecasts using in situ observations from SCAN sites for the year 2022. Because the model performance using USCRN data yielded error metrics comparable to those from SCAN sites, the detailed results from USCRN are provided separately in Figure A1 (Appendix A). Results indicated that the accuracy of SM forecasts decreased with increasing lead time across all depths. This decline was most pronounced in the surface (0–10 cm) and subsurface (10–40 cm) layers, where SM is highly dynamic due to rapid responses to rainfall inputs and evapotranspiration losses. These processes introduce large short-term fluctuations, making predictions at longer lead times more uncertain. The scatterplots show that the ConvLSTM model maintains relatively high accuracy across all three soil layers, with the best accuracy at the 1-day lead time. For example, at the surface layer (0–10 cm), the model achieves a ubRMSE of 0.086 cm3 cm−3 and an R of 0.73 for the 1-day lead time. The accuracy declined slightly at the 7-day lead time, with a ubRMSE of 0.090 cm3 cm−3 and an R of 0.71. Similar trends were observed for the subsurface layers. At 10–40 cm, the ubRMSE increased from 0.089 (1-day) to 0.093 cm3 cm−3 (30-day) and R decreased from 0.70 (1-day) to 0.66 (30-day).
A systematic bias was observed across depths and lead times, where the model tended to over-forecast SM under dry conditions (below ~0.10 cm3 cm−3) and under-forecast under wet conditions (above ~0.35 cm3 cm−3). This pattern can be attributed to several factors. First, spatial mismatches arise when comparing gridded forecasts (9 km pixel size) to point-scale in situ values [1]. Second, vertical mismatches occur because the model provides average SM over thick layers (e.g., 0–10 cm), whereas in situ sensors provide data at discrete depths (e.g., 5, 20, and 50 cm), which may not fully represent the entire modeled layer. Third, the ConvLSTM architecture uses a hyperbolic tangent (tanh) activation function for the memory cells and a sigmoid function for gate operations. While this setup ensures physically consistent outputs (e.g., preventing negative values), it may also compress extreme values toward the mean, which can lead to under-forecasting of SM in high moisture conditions [56]. Moreover, considering the 9 km spatial resolution of the forecasts, extremely dry or wet conditions are unlikely to be uniformly present across the entire grid cell. Therefore, the model’s relatively smooth responses reflect not only a limitation of the architecture but also the inherent physical averaging effects at coarser resolutions.
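For reference, the role of the two activation functions can be seen in the commonly used ConvLSTM cell equations, written here without peephole connections. This is the standard textbook form and is given only as an assumed illustration; the exact variant and notation used in this study are not specified in this section. Here $*$ denotes convolution, $\circ$ the element-wise product, $\sigma$ the sigmoid function, $X_t$ the input, $H_t$ the hidden state, and $C_t$ the cell state.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + b_f\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + b_o\right) \\
\tilde{C}_t &= \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tilde{C}_t \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```

Because the gates lie in $(0, 1)$ and $\tanh$ lies in $(-1, 1)$, every element of $H_t$ is bounded in magnitude by 1, and both functions saturate for large arguments. Inputs near the extremes of the SM range therefore map to a flattened part of the response curve, which is consistent with the compression toward the mean described above.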
The performance of the ConvLSTM model on the test set for short- (1- and 7-day) and mid-term (30-day) SM forecasts across the three layers under the S2 scenario is depicted in Figure 4, Figure 5, and Figure 6 for all SCAN and USCRN sites. As shown in Figure 4, for the 0–10 cm layer, both short- and mid-term forecasts showed median ubRMSE and bias values of 0.05 cm3 cm−3 and 0.01 cm3 cm−3, respectively. Unsurprisingly, the R values declined with increasing forecast lead time, ranging from 0.27 (30-day) to 0.61 (1-day). The largest ubRMSE and bias values were observed in the Mideast and Southeast regions, although the corresponding R values remained relatively high. This finding is consistent with Karthikeyan and Mishra ([55], Figure 5), who reported that at 5 cm and 10 cm depths, ubRMSE was high (~0.06 cm3 cm−3) while correlation was also strong (~0.8). Such behavior can be attributed to pronounced soil moisture fluctuations in these regions (see Figure 6 in [57]). As a result, the model was able to capture the overall temporal pattern of soil moisture dynamics (e.g., wet and dry seasons) but estimated the exact magnitudes less accurately. In contrast, the Northeast showed lower R values and higher ubRMSE values, likely due to the effect of dense vegetation, which interferes with SMAP brightness temperature observations, and the narrower range of SM variations, which weakens the R values. In addition, frequent freeze–thaw cycles and seasonal snowpack in the Northeast reduce temporal data quality and increase ubRMSE.
Similar spatial patterns in SM accuracy have been reported by Fang et al. ([58], Figure 1), who used an LSTM model to assimilate SMAP level 3 data for SM mapping. Another example is the work by Karthikeyan and Mishra ([55], Figure 5), who generated 1 km SM maps across the U.S. In terms of ubRMSE and R values, the best SM forecasts were obtained in the Southwest and parts of the West, where sparse vegetation and lower vegetation water content reduce signal interference. Notably, SM forecast errors were particularly low at stations located in Utah. Unlike SMAP SM products, which often show low accuracy in this region due to the influence of snowpack and complex mountainous terrain [59], the ConvLSTM model demonstrated high accuracy in both R and ubRMSE values. This improved performance may be attributed to integrating the strengths of two different SM products: the physics-based SMAP L3 (DCA), which uses surface observations, and a process-based model (NLDAS-2), which considers hydrological processes. In addition, in the northeastern CONUS, the forecast performance at the surface layer (0–10 cm) was relatively poor, with correlation values (R) ranging from −0.3 to 0.2 and ubRMSE values ranging from 0.07 to 0.11 cm3 cm−3 (Figure 4). This reduced accuracy can be attributed to the prevailing cold and humid continental climate (Dfa/Dfb classes in the Köppen classification, Figure A2), where soil freezing is common during winter months. Frozen soils alter the dielectric properties sensed by satellites and strongly affect the dynamics of soil moisture, leading to discrepancies between modeled forecasts and in situ measurements. These freeze–thaw processes introduce nonlinear and abrupt changes that are challenging for data-driven models to capture, thereby reducing forecast skill in this region. For the surface layer (0–10 cm), nearly 50% of the evaluated SM stations (90 sites) achieved ubRMSE values below the SMAP accuracy threshold of 0.04 cm3 cm−3 [47] for both short- and mid-term SM forecasts.
Results for the subsurface layers (10–40 cm and 40–100 cm) followed similar trends to the surface layer. Overall, R values decreased with increasing forecast lead time, while ubRMSE values slightly increased. For the 10–40 cm layer, mean R values ranged from 0.22 (30-day lead time) to 0.59 (1-day lead time), while the ubRMSE values ranged from 0.04 (1-day lead time) to 0.05 cm3 cm−3 (30-day lead time). The bias values were similar to those obtained for the 0–10 cm layer. For the 40–100 cm layer, R values ranged from 0.19 (30-day lead time) to 0.42 (1-day lead time), while ubRMSE values remained at 0.04 cm3 cm−3 across all three lead times. The lower R values at greater depths may be due to the limited sensitivity of the input predictors (except soil texture data), especially satellite-derived features, to subsurface conditions (depths greater than 5 cm). Still, the model achieved the SMAP accuracy target of 0.04 cm3 cm−3 at over 50% of the sites for both the 10–40 cm and 40–100 cm layers.
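The share of sites meeting the SMAP accuracy target can be computed from the per-site ubRMSE values as sketched below. This is a minimal illustration: the function name `fraction_meeting_target` is hypothetical, sites without a valid ubRMSE are assumed to be skipped, and the comparison is taken as "less than or equal to the target", which the text does not specify.

```python
def fraction_meeting_target(site_ubrmse, target=0.04):
    """Fraction of sites whose ubRMSE (cm3 cm-3) is at or below the target.

    None entries (sites without a valid ubRMSE) are ignored; treating the
    target as inclusive is an assumption.
    """
    valid = [v for v in site_ubrmse if v is not None]
    if not valid:
        raise ValueError("no valid ubRMSE values")
    return sum(1 for v in valid if v <= target) / len(valid)


# Toy per-site ubRMSE values for one layer and one lead time.
share = fraction_meeting_target([0.03, 0.04, 0.05, 0.07])
```

Applying the function separately to each layer and lead time would give the kind of percentages quoted above.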
3.3. SM Forecasts Under Different Land Cover Types
Figure 7 depicts the model performance (test set) for the S2 scenario at three soil layers at SCAN sites under six dominant land cover types. Overall, the accuracy of SM forecasts, shown by R, ubRMSE, and bias, decreases with increasing forecast lead time, which is in accordance with previous studies [16]. When examined across land cover types, the ConvLSTM model generally performed best in grasslands and savannas, where SM variability is more directly linked to precipitation and evapotranspiration cycles, resulting in higher correlation coefficients (R), but sometimes greater bias and ubRMSE. In contrast, forests and shrublands exhibited lower accuracy, likely due to the buffering effect of dense vegetation and complex canopy-soil interactions, which obscure direct precipitation-soil moisture relationships. Croplands and permanent water bodies showed intermediate errors, with higher variability across SCAN sites. For the 0–10 cm layer, the model achieved median ubRMSE values below 0.06 cm3 cm−3 and median R values ranging approximately from 0.05 to 0.65 for all lead times and land cover types. Generally, in terms of R, the accuracy of SM forecasts decreased with increasing soil depth, whereas ubRMSE values improved with depth. In the subsurface layers, across all lead times and land cover types, the median R values ranged from 0.10 to 0.55, while the median ubRMSE values were below 0.05 and 0.04 cm3 cm−3 for the 10–40 cm and 40–100 cm layers, respectively. These results indicate a notable improvement compared to previous studies. For example, Tavakol et al. [12] reported daily ubRMSE values greater than 0.12 cm3 cm−3 for SMAP L3 across the U.S. between April 2015 and November 2017, with values of 0.11, 0.12, and 0.09 cm3 cm−3 for grasslands, croplands, and shrublands, respectively (see Table 2 in [12]). Similarly, Xing et al. [60] found Spearman correlation values of approximately 0.50 over grasslands, 0.55 over croplands, and 0.45 over forested areas. Yi et al. [61] reported Pearson correlation values of 0.41, 0.60, and 0.45 for the same land cover types, respectively. Comparable performance trends were observed using USCRN SM observations, as shown in Figure A3 (Appendix A).
3.4. SM Forecasts Under Different Soil Textures
Figure 8 displays the performance of the S2 scenario for SM forecasts across three broad soil textural classes: coarse-textured (sand, loamy sand, and sandy loam), medium-textured (loam, clay loam, silt loam, sandy clay loam, silty clay loam, and silt), and fine-textured (sandy clay, silty clay, and clay) soils for three soil layers (0–10 cm, 10–40 cm, and 40–100 cm) and five lead times (1-, 3-, 7-, 14-, and 30-day) across SCAN sites (test set). Results showed that forecast accuracy was highest in coarse-textured soils, which allow rapid infiltration and clearer SM responses to precipitation, leading to relatively higher R values and lower errors. Medium-textured soils showed intermediate performance, while fine-textured soils exhibited the greatest challenges, particularly at deeper layers (40–100 cm), where slower infiltration and stronger retention effects delay the SM responses, reducing forecast skill over longer lead times. In terms of ubRMSE values at the surface layer (0–10 cm), coarse-textured soils showed the lowest errors for all lead times, followed by medium- and fine-textured soils. However, the ranking was different for R, where medium-textured soils outperformed fine- and coarse-textured soils. These differences may be partly explained by soil hydraulic properties: sandy soils have high infiltration capacity and rapid percolation rates, leading to rapid SM responses to rainfall but also faster drying, which can reduce temporal correlation. Conversely, fine-textured soils with low infiltration capacity can generate surface runoff during high precipitation rates, delaying water movement into deeper layers and dampening SM variability, thereby affecting R and ubRMSE. Additionally, the rapid SM fluctuations in sandy soils may increase the effective sensing depth of SMAP [62,63], affecting forecast performance patterns in various textures. The median bias was generally near zero across all soil textures and soil layers; however, slightly larger deviations were noted in the deeper layers for medium- and fine-textured soils. These biases likely reflect challenges in capturing infiltration and redistribution processes in these two major soil textural classes, where water moves slowly within the soil profile and wetting-drying cycles are correspondingly slower. Similar performance patterns were observed using USCRN SM observations, as shown in Figure A4 (Appendix A).
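The grouping of the twelve textural classes into the three broad categories described above can be written as a simple lookup, sketched below. The class-to-group assignments follow the text; the names `TEXTURE_GROUPS` and `texture_group` and the case-insensitive matching are illustrative assumptions.

```python
# Broad textural groups as listed in the text.
TEXTURE_GROUPS = {
    "coarse": ["sand", "loamy sand", "sandy loam"],
    "medium": ["loam", "clay loam", "silt loam", "sandy clay loam",
               "silty clay loam", "silt"],
    "fine": ["sandy clay", "silty clay", "clay"],
}

# Invert to a class -> group lookup for quick site labeling.
CLASS_TO_GROUP = {
    cls: group for group, classes in TEXTURE_GROUPS.items() for cls in classes
}


def texture_group(texture_class):
    """Return 'coarse', 'medium', or 'fine' for a textural class name."""
    key = texture_class.strip().lower()
    if key not in CLASS_TO_GROUP:
        raise ValueError(f"unknown textural class: {texture_class!r}")
    return CLASS_TO_GROUP[key]
```

For example, a site classified as "Sandy Loam" would be assigned to the coarse group before the per-group medians of R, ubRMSE, and bias are computed.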