Article

Validating Data Interpolation Empirical Orthogonal Functions Interpolated Soil Moisture Data in the Contiguous United States

1
School of Earth and Atmospheric Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
2
Hydrology and Remote Sensing Laboratory, Agricultural Research Service, U.S. Department of Agriculture, Beltsville, MD 20705, USA
3
Center for Spatial Information Science and Systems, George Mason University, Fairfax, VA 22030, USA
*
Authors to whom correspondence should be addressed.
Agriculture 2025, 15(11), 1212; https://doi.org/10.3390/agriculture15111212
Submission received: 28 April 2025 / Revised: 26 May 2025 / Accepted: 30 May 2025 / Published: 1 June 2025
(This article belongs to the Special Issue Applications of Remote Sensing in Agricultural Soil and Crop Mapping)

Abstract

Accurate and spatially detailed soil moisture (SM) data are essential for hydrological research, precision agriculture, and ecosystem monitoring. NASA’s Soil Moisture Active Passive (SMAP) product offers unprecedented information on global soil moisture. To provide more detailed information about cropland SM for the Contiguous United States (CONUS), a 1-km SMAP product has been produced using the THySM model in support of USDA NASS operations. However, the current 1-km product contains substantial data gaps, which pose challenges for applications that require continuous daily data. Data Interpolation Empirical Orthogonal Functions (DINEOF+) is an interpolation technique that uses singular value decomposition (SVD) to address missing data problems. Previous studies have applied DINEOF+ to reconstruct the 1-km daily SM dataset but without further analysis of the reconstruction errors. In this study, we perform a comprehensive validation of DINEOF+ reconstructed SM by using both the original SMAP data and in situ measurements across the CONUS. Our results show that the reconstructed SM closely aligns with the original SM with R2 > 0.65 and bias ranging from 0.01 to 0.02 m3/m3. When compared to in situ SM, the mean absolute error (MAE) ranges between 0.01 and 0.04 m3/m3 and the time series correlation coefficient ranges from 0.6 to 0.8. Our findings suggest that DINEOF+ effectively recovers missing data and improves the temporal resolution of SM time series. However, we also note that the accuracy of the reconstructed SM is dependent on the quality of the original SMAP data, emphasizing the need for continued improvements in SM retrievals by satellite.

1. Introduction

Soil moisture (SM), or soil water content, is a key variable in hydrological models and plays a critical role in irrigation scheduling, crop health assessment, water resource management, weather prediction, and climate research [1,2,3,4,5,6]. Several satellite missions have contributed to global SM monitoring over the past decades, including Soil Moisture and Ocean Salinity (SMOS, 2009) [7], Advanced Microwave Scanning Radiometer 2 (AMSR2, 2012) [8], the Cyclone Global Navigation Satellite System (CYGNSS, 2016) [9], and Soil Moisture Active Passive (SMAP, 2015) [10]. However, passive microwave SM products typically have coarse spatial resolutions ranging from 10 to 40 km, which is insufficient to meet the requirements of many regional applications [11,12]. To address this issue, Fang et al. [13] and Liu et al. [14] applied downscaling techniques based on thermal inertia principles to derive 1 km SM estimates from SMAP’s 9-km Enhanced L2 radiometer product. The first version of this high-resolution dataset has been publicly released by the National Snow and Ice Data Center (NSIDC) [15], and it is also distributed and visually presented in the Crop Condition and Soil Moisture Analytic (Crop-CASMA) system [16].
The 1-km SM product provides unprecedented detail on the spatial distribution of SM conditions. However, its application in time-series studies is hindered by substantial data gaps. These gaps arise from cloud cover and swath gaps between the satellite instruments used in the downscaling process [17]. For instance, in 2020, the proportion of missing daily SM data across the Contiguous United States (CONUS) ranged from 41.3% to as high as 98.9% (Figure 1), with a highly uneven spatial distribution of data availability. Notably, regions such as the South and Midwest exhibited particularly high percentages of missing data, averaging 71.0% and 69.1%, respectively. Such extensive and spatially variable missing data pose a significant limitation for applications that require SM data of both high spatial and temporal resolution. For example, daily SM observations are essential for tasks such as irrigation scheduling, drought monitoring, and crop health assessment. While temporal aggregation (e.g., to 8-day or monthly averages) can mitigate the impacts of missing data, it also reduces temporal resolution, limiting the ability to capture short-term dynamics and rapid changes in soil moisture.
Interpolation methods offer an effective way to address data gaps by reconstructing missing values, thereby restoring the daily temporal resolution that many applications require. Many studies choose to employ simple statistical methods such as nearest-neighbor, linear regression, and inverse distance weighting (IDW) because these methods are computationally efficient and demonstrate acceptable interpolation accuracy in previous evaluations [18,19,20]. More advanced techniques, such as classic Kriging and its variants, are commonly applied when a large sample size is available. Kriging methods leverage covariance (variogram) analysis to incorporate spatial relationships between sample points, thus outperforming simple interpolation methods in many cases [21]. However, their performance tends to degrade in areas with sparse sampling or highly heterogeneous terrain [22] and depends on the accuracy of the variogram model, which often requires auxiliary data or a priori knowledge [23,24,25,26].
Data Interpolating Empirical Orthogonal Functions (DINEOF) is a variant of a learning algorithm that uses matrix factorization to deal with missing data problems [27]. Compared to Kriging methods, it has the advantage of leveraging both spatial and temporal correlation between data points and requires no parameters or a priori information. It has proven to be an effective tool for reconstructing missing values in geophysical datasets [28,29,30,31,32]. In a recent study, Zhao et al. [33] applied an enhanced version of DINEOF (DINEOF+) to recover missing data in the SM product. Their results demonstrate that DINEOF+ can effectively fill gaps in the daily SM data. However, they also found significant biases between interpolated and in situ measurements, without providing further analysis. Evaluating and validating soil moisture products is essential before utilizing them in supporting agricultural decision making and climate research [34]. In this study, we aim to provide a more comprehensive validation of DINEOF+ interpolated SM using both the original SMAP-derived SM and in situ measurements. Our analysis is based on the 1-km THySM-based SMAP dataset over the CONUS, which includes the 48 adjoining U.S. states and the District of Columbia, excluding Alaska and Hawaii.
The paper is organized as follows: Section 2 describes the study area and the datasets used in this study. Section 3 describes the main steps of DINEOF+ and the error metrics used for validation. Section 4 presents the reconstruction results and validation using the original SMAP-derived SM and in situ measurements. In Section 5, we discuss the performance of using DINEOF+ for the SM data interpolation. Section 6 concludes the paper.

2. Materials

2.1. Study Area

This study focuses on the Contiguous United States (24° N–50° N, 125° W–66.5° W) (Figure 1a). The area spans arid deserts in the Southwest, humid subtropical regions in the Southeast, temperate forests in the Northeast, and agricultural plains in the Midwest. Elevation ranges from sea level to over 4000 m in the Rocky Mountains, contributing to pronounced gradients in precipitation, temperature, and soil properties. Soil types vary from fertile Mollisols in the Midwest to Aridisols in the Southwest, influencing water retention and hydrological response [35]. Land cover includes croplands, grasslands, shrublands, forests, and urban areas, each affecting soil moisture through differing evapotranspiration and infiltration rates [36,37]. This environmental heterogeneity makes the CONUS an ideal setting for evaluating soil moisture reconstruction methods under varying physical-geographical conditions.

2.2. Dataset

SMAP-derived 1 km daily surface SM in 2020 is obtained from the National Snow and Ice Data Center (NSIDC) [15] (https://nsidc.org/data/nsidc-0779/versions/1, accessed on 10 April 2025). The dataset includes GeoTIFF files for each day with two soil moisture layers. Here, we use the dataset measured on the ascending overpass (6 PM).
In situ SM at a depth of 5 cm is obtained from the International Soil Moisture Network (ISMN) [38] (https://ismn.earth/en/dataviewer/, accessed on 10 April 2025). The networks that account for 99% of in situ data used for validation here include the Soil Climate Analysis Network (SCAN), the network of Snow Telemetry (SNOTEL), the U.S. Climate Reference Network (USCRN), Atmospheric Radiation Measurement (ARM), the Texas Soil Observation Network (TxSON), and the interactive Roaring Fork Observation Network (iRON) (Table 1). SCAN and USCRN consist of stations located across the country, while SNOTEL stations are mainly distributed in the West. Stations from the other networks are spatially clustered within local areas (Figure 2).

3. Methodology

The main steps of DINEOF+ are described as follows [32,39]. First, we transform a time series of images into a 2-D matrix $A$ whose entry $(i, j) \in [m] \times [n]$ corresponds to the observation of the variable $f(r_i, t_j)$ at location $r_i$ and moment $t_j$:
$$A_{ij} = f(r_i, t_j) \tag{1}$$
We subtract the average of all observed values from each element in $A$ and assign 0 to entries without observations. Then, we apply SVD to decompose the resulting matrix $X_o$ and select the $r$-truncated singular matrices and vector such that:
$$F(X_o) = \sum_{i=1}^{r} \sigma_i u_i v_i^{T} \tag{2}$$
where $u_i$ and $v_i$ are referred to as the spatial and temporal empirical orthogonal functions (EOFs), respectively, and $\sigma_i$ are the singular values.
As a result, we can obtain a complete matrix that approximates $A$ and use these elements to fill in missing values in $A$. The accuracy of the approximation can be further improved by iteratively applying SVD to the reconstructed matrix. In practice, we start with $k = 1$ and increase $k$ by 1 during each iteration. The optimal number of iterations is determined by using a cross-validation technique [28]. In the final step, we remove those interpolated values that are flagged as invalid due to insufficient observations along both spatial and temporal dimensions [39].
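As a rough illustration of the iterative procedure described above, the following Python sketch fills gaps in a small synthetic matrix with repeated truncated SVD. It is a minimal version of the classic DINEOF idea under simplifying assumptions, not the DINEOF+ code used in this study: the function name `dineof_fill` and the parameters `max_modes`, `n_inner`, and `tol` are hypothetical, and a fixed upper limit on the number of modes stands in for the cross-validation step and the final invalid-value mask.

```python
import numpy as np

def dineof_fill(A, max_modes=3, n_inner=100, tol=1e-8):
    """Fill NaN gaps in a (location x time) matrix by iterated truncated SVD.

    Hypothetical helper: max_modes is fixed by the caller here, whereas the
    paper selects the stopping point by cross-validation.
    """
    A = np.asarray(A, dtype=float)
    missing = np.isnan(A)
    mean = A[~missing].mean()                 # average of all observed values
    X = np.where(missing, 0.0, A - mean)      # anomalies; gaps start at 0

    for k in range(1, max_modes + 1):         # k = 1, 2, ... retained modes
        for _ in range(n_inner):
            U, s, Vt = np.linalg.svd(X, full_matrices=False)
            recon = (U[:, :k] * s[:k]) @ Vt[:k, :]   # sum of sigma_i u_i v_i^T
            change = np.max(np.abs(recon[missing] - X[missing]), initial=0.0)
            X[missing] = recon[missing]       # only gap entries are updated
            if change < tol:
                break
    return X + mean


# Toy example (synthetic data): a smooth rank-1 field with regular gaps.
space = np.linspace(1.0, 2.0, 12)
time = 0.2 + 0.1 * np.sin(np.linspace(0.0, 6.0, 16))
truth = np.outer(space, time)
gappy = truth.copy()
gappy[::3, ::4] = np.nan
filled = dineof_fill(gappy, max_modes=2)
```

Only the gap entries are overwritten at each pass, so observed values stay as they were; that is what lets the same loop serve both as interpolator and, on withheld values, as a validation tool.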
It is noted that the classic DINEOF algorithm requires no user-defined parameter. To mitigate the impact of missing data, DINEOF+ introduces a parameter for setting the maximum proportion of missing data (PMD). Zhao et al. [39] suggest using 75% as a safe PMD threshold, as they observed that interpolation errors nearly double when PMD reaches 76%. Here, we follow this suggestion and choose 75% as the threshold.
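To make the PMD threshold concrete, the sketch below computes the proportion of missing data along each dimension of a small matrix and flags the rows and columns that stay within a 75% limit. How DINEOF+ applies the threshold internally is not detailed here, so the per-row and per-column screening, and the names `pmd` and `eligible_mask`, are assumptions made for illustration only.

```python
import numpy as np

def pmd(A, axis):
    """Proportion of missing data (NaN) along the given axis, in [0, 1]."""
    return np.isnan(A).mean(axis=axis)

def eligible_mask(A, max_pmd=0.75):
    """Flag locations (rows) and days (columns) whose PMD is within the limit.

    Assumption: the threshold is checked separately per row and per column.
    """
    rows_ok = pmd(A, axis=1) <= max_pmd   # per location, across time
    cols_ok = pmd(A, axis=0) <= max_pmd   # per day, across space
    return rows_ok, cols_ok


nan = np.nan
sample = np.array([
    [1.0, nan, nan, nan],   # 75% missing: still within the threshold
    [1.0, 2.0, 3.0, nan],   # 25% missing
    [nan, nan, nan, nan],   # 100% missing: excluded
])
rows_ok, cols_ok = eligible_mask(sample)
```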
Satellite images were read and processed using MATLAB R2023b, with each day’s data stored as a single-precision matrix (~59.5 MB). Each daily raster contains 2637 × 5640 grid cells (~14.9 million values per day), resulting in over 5.4 billion data points for the full year. The full annual dataset occupies approximately 21.8 GB. All computations were performed on a workstation equipped with an Intel(R) Core(TM) i7-14700F CPU @ 2.10 GHz, 64 GB of RAM, and a 64-bit Windows operating system. To reduce the requirement of computer memory, we divide the study area into 4 regions (Midwest, Northeast, South, and West [33]) and apply DINEOF+ to SM data in each region.
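The storage figures quoted above can be checked with a few lines of arithmetic. The sketch assumes 4 bytes per single-precision value, 366 daily rasters for 2020, and decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes); the variable names are illustrative.

```python
rows, cols, days = 2637, 5640, 366
bytes_per_value = 4                      # single precision

cells_per_day = rows * cols              # grid cells in one daily raster
values_per_year = cells_per_day * days   # data points for the full year
mb_per_day = cells_per_day * bytes_per_value / 1e6
gb_per_year = values_per_year * bytes_per_value / 1e9

print(f"{cells_per_day:,} cells/day, {values_per_year:,} values/year")
print(f"{mb_per_day:.1f} MB/day, {gb_per_year:.1f} GB/year")
```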
In this study, selected error metrics for validation include bias, median absolute error (MAE) (Equations (3) and (4)), root mean square error (RMSE), regression slope, and R-squared (R2).
$$\mathrm{bias} = \mathrm{median}(M_i - O_i) \tag{3}$$
$$\mathrm{MAE} = \mathrm{median}(|M_i - O_i|) \tag{4}$$
where $O_i$ is the observation used as the reference, $M_i$ is the estimated value, and $i = 1, 2, \ldots, n$.
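A minimal sketch of how these metrics might be computed for a set of matchups is given below. Bias and MAE follow the median-based definitions above; the RMSE, slope, and R2 formulas are the standard ones, and regressing the estimate on the observation is an assumption, since the regression direction is not stated. The function name `validation_metrics` and the sample values are made up for illustration.

```python
import math
import statistics

def validation_metrics(model, obs):
    """Return bias, MAE (both median-based), RMSE, slope, and R2."""
    diffs = [m - o for m, o in zip(model, obs)]
    bias = statistics.median(diffs)
    mae = statistics.median([abs(d) for d in diffs])
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    # Ordinary least squares of the estimate on the observation (assumed).
    mean_o = statistics.fmean(obs)
    mean_m = statistics.fmean(model)
    sxx = sum((o - mean_o) ** 2 for o in obs)
    syy = sum((m - mean_m) ** 2 for m in model)
    sxy = sum((o - mean_o) * (m - mean_m) for o, m in zip(obs, model))
    slope = sxy / sxx
    r2 = sxy ** 2 / (sxx * syy)
    return {"bias": bias, "mae": mae, "rmse": rmse, "slope": slope, "r2": r2}


# Synthetic matchups in m3/m3, not values from this study.
obs = [0.10, 0.20, 0.30, 0.40]
model = [0.12, 0.21, 0.33, 0.38]
metrics = validation_metrics(model, obs)
```

Using the median rather than the mean makes both bias and MAE less sensitive to a few large outliers, which matters when some stations report sensor artifacts.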

4. Results

4.1. Reconstructed SMAP Daily SM Product

The SMAP-derived SM in 2020 consists of 366 images at 1 km resolution over the CONUS. In this study, we divide the test area into four regions (i.e., Northeast, South, Midwest, and West) based on the cartographic boundary established by the Census Bureau. Data availability is 30.90%, 29.00%, 35.02%, and 36.83% in the Northeast, South, Midwest, and West, respectively. To demonstrate the behavior of DINEOF+ in reconstructing the dataset, we randomly selected one day for each month from April to July. As seen from the four sequenced images (Figure 3a,c,e,g), extensive data gaps are found intermittently in the middle, western, and eastern regions of the country. Consequently, these missing data hinder the monitoring of SM changes from the daily to weekly timescales. By applying DINEOF+ to the dataset, the reconstructed images show clear spatial patterns of SM distribution across the country (Figure 3b,d,f,h). SM peaks (~0.6 m3/m3) are found in the South and Northwest during April and May, followed by a decline in June and July. In contrast, the SM is lower in the Midwest and West (~0.2 m3/m3), while moderate SM peaks (~0.4 m3/m3) are found near the Northwest coast. The availability of daily data in 2020 increases to 87.20%, 89.75%, 92.99%, and 93.60% in the Northeast, South, Midwest, and West, respectively.

4.2. Validation: Comparison with Original 1-km SMAP SM

In this section, we validate the DINEOF+ interpolated SM via comparisons to the original 1 km SMAP-derived SM. To achieve this objective, we first mask all SMAP-ISMN matchups from the original SM dataset and then apply DINEOF+ to recover those masked values. The total numbers of SMAP-derived observations used for validation here are 108, 5523, 13,945, 53, 3180, and 237 from the ARM, SCAN, SNOTEL, TxSON, USCRN, and iRON networks, respectively (Figure 4). Measured by bias, MAE, R2, and slope, the DINEOF+ interpolated SM is quantitatively close to the original SMAP-derived SM in the SCAN, SNOTEL, and USCRN networks (Figure 4b,c,e). However, the interpolated SM shows poor agreement with the original SM in the ARM, TxSON, and iRON networks, as indicated by low R2 values (i.e., 0.07, 0.07, and 0.01) and slopes (i.e., 0.24, 0.39, and 0.17), respectively (Figure 4a,d,f). DINEOF+ tends to overestimate SM, particularly in the ARM and TxSON networks. In addition, large biases between interpolated and original SM are found in the ARM and TxSON networks, with differences of 0.05 and 0.09 m3/m3, respectively.

4.3. Validation: Comparison with In Situ SM

In this section, we use in situ SM as a reference. Our first objective is validating the SMAP-derived SM and DINEOF+ interpolated SM by comparing each with in situ measurements from six ISMN networks. The original and interpolated SM correspond to the matchups used in the previous section. The original SM shows strong agreement with in situ data in the ARM and SCAN networks (R2 = 0.53 and 0.56, respectively), but poor agreement in the other networks (R2 = 0.11, 0.28, 0.25, and 0.00). The accuracy of DINEOF+ interpolated SM is comparable to the original SM based on bias and MAE in all six networks. It is noted that DINEOF+ improves the bias from −0.11 to −0.06 and reduces the MAE from 0.11 to 0.07 in the ARM (Table 2). In addition, DINEOF+ enhances the agreement between SMAP and in situ SM in the USCRN, increasing R2 from 0.25 to 0.31 and the slope from 0.51 to 0.59.
We next use all matchups between the DINEOF+ interpolated SM and in situ SM. The recovery rate of missing data by DINEOF+ depends on the distribution of observations. To increase the number of recovered missing data, we select 15% of the ISMN SM from each station and then assign these ground values to SMAP images where observations are missing. The final number of matchups used for validation in each network is as follows: ARM (2060), FLUXNET-AMERIFLUX (70), SCAN (18,247), SNOTEL (21,472), TxSON (5109), USCRN (12,277), and iRON (945). The results show that the interpolated SM has good agreement with in situ observations across all networks, with R2 values ranging from 0.47 to 0.96 and a slope between 0.74 and 1.05 (Figure 5a–g). Additionally, MAE and bias range from 0.02 to 0.04 m3/m3 and −0.01 to 0.00 m3/m3, respectively, indicating that both the average deviation and overall error magnitude are low. However, there is clear underestimation of some points in the iRON network (Figure 5g).
Finally, we compare the time series of DINEOF+ reconstructed SM with in situ SM from ISMN. In order to demonstrate the performance of DINEOF+ under scenarios with different data availability, we select stations with more than 300 observations per year, as well as those with fewer than 100, from four networks: ARM, SCAN, SNOTEL, and TxSON. At SCAN, the stations Beasley_Lake and Wedowee have 92 and 58 observations, respectively. At SNOTEL, the stations BUTTE and Sawtooth have 83 and 59 observations. For ARM and TxSON, because all stations have more than 100 observations, we selected those with the fewest data points—Pawnee with 151 observations and LCRA_5 with 300 observations, respectively (Figure 6a–d). Among stations with more than 300 observations, the correlation coefficients (r) between reconstructed and in situ SM range from 0.80 to 0.84 at ARM, 0.75 to 0.97 at SCAN, 0.88 to 0.94 at SNOTEL, and 0.73 to 0.82 at TxSON. In contrast, a significant decline in r is observed at stations with fewer than 100 observations: Beasley_Lake (0.60), Wedowee (0.16), and Sawtooth (0.61). While the reconstructed data exhibit a higher level of noise compared to in situ SM, they generally capture the temporal patterns observed in ground measurements. It is noted that the in situ SM from ISMN often displays abrupt jumps or discontinuities rather than smooth transitions over time. In such cases, interpolated SM tends to remain continuous and fails to match the sudden shifts present in the in situ data.
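The time-series comparison above relies on the correlation coefficient r between paired daily values. A minimal sketch of that calculation, assuming Pearson's r and skipping days on which either series is missing, is shown below; the function name `pearson_r`, the use of `None` for missing days, and the sample values are illustrative assumptions.

```python
import math

def pearson_r(x, y):
    """Pearson correlation over days where both series have a value.

    Returns NaN when either series has zero variance (e.g., a flat record),
    because the coefficient is undefined in that case.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mean_x = math.fsum(a for a, _ in pairs) / n
    mean_y = math.fsum(b for _, b in pairs) / n
    sxx = math.fsum((a - mean_x) ** 2 for a, _ in pairs)
    syy = math.fsum((b - mean_y) ** 2 for _, b in pairs)
    sxy = math.fsum((a - mean_x) * (b - mean_y) for a, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


# Synthetic series in m3/m3; the third day is missing in the first series.
reconstructed = [0.10, 0.20, None, 0.40]
in_situ = [0.12, 0.22, 0.30, 0.42]
r = pearson_r(reconstructed, in_situ)
r_flat = pearson_r([0.25, 0.25, 0.25, 0.25], in_situ)
```

A nearly constant in situ record leaves almost no variance to correlate with, so low or undefined r at such stations says more about the sensor record than about the reconstruction.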

5. Discussion

Compared to the original 9 km SM product, the 1 km SM maps provide more detailed information on the spatial distribution of soil moisture. However, extensive gaps in the 1-km product hinder the daily monitoring of local soil conditions and human activities such as irrigation. These gaps are mainly caused by swath gaps between different instruments and missing data in the original products used for downscaling SM [40]. In this study, we employ the DINEOF+ algorithm to reconstruct the SMAP-derived SM product and perform a comprehensive validation of the reconstructed SM by comparison to both original SMAP-derived SM and in situ measurements.

5.1. Existing Flaws in Results

Our results first demonstrate that DINEOF+ can recover most missing data in the original daily 1 km SM product across the country, reducing PMD from 62.2% to 8.1%. However, on certain days, PMD still exceeds 50%, such as on day 26 (60.0%), day 181 (56.8%), and day 215 (64.8%). This occurs because DINEOF+ attempts reconstruction only where PMD does not exceed the 75% threshold. On the days mentioned, large regions such as the Midwest and the South lacked data and were excluded from reconstruction.
By comparison with the original SM, DINEOF+ demonstrates good performance in recovering missing data, particularly at SCAN, SNOTEL, and USCRN, where the number of matchups used for validation ranges from approximately 3000 to 14,000 and the stations are geographically distributed across the country. In contrast, poor agreement is observed in the ARM and TxSON networks, where DINEOF+ tends to overestimate SM. Both networks have far fewer matchups available for validation (~50–100), and their stations are clustered within local areas (Figure 2). Due to the small sample size, we cannot identify the errors as systematic bias. A likely contributing factor is the lack of sufficient observations surrounding the locations of missing data. Although DINEOF+ employs a mask to mitigate this issue, the accuracy of reconstruction tends to decrease in areas with sparse observations [39].
The application of DINEOF+ in recovering SM is further validated by using in situ data from ISMN. Consistent with the results obtained from the comparison with the original SM, relatively lower agreement with ISMN SM is also observed at the ARM, TxSON, and iRON networks, where R2 values range from ~0.5 to 0.6. Notably, a greater number of matchups become available after incorporating additional observations from ISMN. The increased spatial and temporal coverage of input data contributes to the improved accuracy of the reconstruction. However, outliers in the reconstructed results remain present across all networks, with particularly pronounced examples observed at iRON (Figure 5).
The agreement between reconstructed and ISMN SM is demonstrated through time series comparisons (Figure 6). In addition to the four representative stations shown for each network, we analyzed stations with more than 340 valid measurements. This includes 14 stations in the ARM network (correlation ~0.7–0.8), 32 stations at SCAN (~0.6–0.9), and 34 stations at TxSON. SNOTEL contains fewer long-duration records, with 13 stations having 220–300 observations. The median correlation for SNOTEL is 0.90, with one notable outlier: SIERRA_BLANCA, which shows an unusually low correlation (0.17) due to a flat time series (SM = 0.058 m3/m3) sustained over an entire year. A similar case is observed at TxSON at station CR200_1 (correlation < 0.1), where SM remains constant at 0.182 m3/m3. These anomalies are likely due to sensor errors or uncorrected biases and are excluded from further analysis in our study. Although some point-level discrepancies remain, particularly at sharp discontinuities, the reconstructed time series effectively captures the overall trends observed at ISMN stations. This provides evidence that the application of DINEOF+ enables improved monitoring of soil moisture dynamics at high temporal resolution.

5.2. Denoising Effect of DINEOF+

Apart from filling in missing data, we find that DINEOF+ can contribute to reducing noise in the original SMAP-derived SM dataset, particularly in regions where large discrepancies exist between SMAP and in situ observations (Table 2). This denoising effect may be attributed to the fact that DINEOF+ retains only the first several EOF modes, which filters out high-frequency noise and reduces variance. However, the strong agreement between the original and reconstructed values also indicates that DINEOF+ is sensitive to the accuracy of the input data. Errors present in the original SM can propagate into the reconstructed results. This explains the large bias in the reconstructed SM found in previous studies [33,41]. Therefore, the continuous improvement of the downscaled product is essential to ensure the accuracy of the interpolated results produced by DINEOF+.
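The mechanism suggested above, that keeping only the leading modes suppresses high-frequency noise, can be illustrated on synthetic data. The sketch below adds random noise to a smooth rank-1 field and compares the error before and after a rank-1 truncated SVD. The field, the noise level, and the helper name `truncate` are assumptions chosen for illustration and do not reproduce the SMAP analysis.

```python
import numpy as np

def truncate(X, k):
    """Rebuild X from its k leading singular modes."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


rng = np.random.default_rng(0)
space = np.linspace(0.1, 0.4, 40)                       # spatial pattern
time = 1.0 + 0.5 * np.sin(np.linspace(0.0, 2.0 * np.pi, 60))
clean = np.outer(space, time)                           # smooth synthetic field
noisy = clean + rng.normal(scale=0.02, size=clean.shape)

smoothed = truncate(noisy, k=1)
rmse_noisy = np.sqrt(np.mean((noisy - clean) ** 2))
rmse_smoothed = np.sqrt(np.mean((smoothed - clean) ** 2))
print(f"RMSE before: {rmse_noisy:.4f}, after truncation: {rmse_smoothed:.4f}")
```

The same property cuts both ways: a systematic error that is coherent in space and time sits in the leading modes and passes through the truncation untouched.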

5.3. Limitation of DINEOF+ in Interpolating Soil Moisture

There are several limitations to applying DINEOF+ for interpolating the SM products. First, the reconstructed dataset is not completely gap-free. In particular, reconstruction becomes challenging on days or in regions where observations are almost entirely missing. DINEOF+ leaves those areas uninterpolated because the lack of observations leads to significant bias in the estimated values [39]. In addition, we notice that the data gaps in the 1 km SMAP SM product are often strip-shaped. As a result, the accuracy of the interpolated results strongly depends on temporal information from other days. This dependence may lead to discontinuities or artifacts near the boundaries of these gap areas, as reported in a previous study [33]. Lastly, because the core process of DINEOF+ involves iterative SVD of the data matrix, it is computationally intensive and demands substantial memory. Dividing the study area into smaller subregions is a practical solution. However, optimized strategies for spatial partitioning, along with the integration of auxiliary data sources, may help mitigate boundary artifacts and further enhance reconstruction accuracy.

6. Conclusions

This study aims to validate the performance of the DINEOF+ algorithm in reconstructing the daily 1 km SMAP soil moisture product across the Contiguous United States (CONUS). We first identify the spatial and temporal characteristics of missing data in the SMAP 1-km product and apply DINEOF+ to fill these gaps. The reconstructed results successfully reduce the average proportion of missing data (PMD) from 62.2% to 8.1%, with improvements in data availability exceeding 85%.
Next, we validate the DINEOF+ interpolated soil moisture (SM) using both the original SMAP-derived SM and in situ observations from the ISMN. The interpolated SM showed strong agreement with the original SM (R2 > 0.65, bias 0.01–0.02 m3/m3) and comparable accuracy to in situ measurements (MAE 0.01–0.04 m3/m3, correlation coefficient 0.6–0.8). These results demonstrate DINEOF+’s capability to effectively recover soil moisture data along both spatial and temporal dimensions.
Our analysis also reveals limitations in the performance of DINEOF+ under conditions of sparse observations or extremely high PMD, where reconstruction becomes unreliable. Additionally, artifacts may emerge near the edges of large gaps, and the algorithm’s computational demand poses challenges for large-scale implementation.
Overall, our validation demonstrates that DINEOF+ is an effective technique for reconstructing missing values in the 1 km SMAP soil moisture product. Temporal resolution is an important factor in identifying hydrological dynamics, such as the timing of soil drying or wetting events, which are essential for applications in agriculture and drought monitoring. By improving the temporal resolution of SM time series over the annual period, DINEOF+ helps mitigate uncertainties caused by data gaps in the original product, thereby improving its utility for time-sensitive analyses. Future work may focus on automating the generation of this interpolated 1-km soil moisture product and making it publicly available via an online data hub or web GIS system.

Author Contributions

H.Z. (Haipeng Zhao) designed the study, conducted the analysis, and wrote the paper. H.Z. (Haoteng Zhao) and C.Z. took part in discussions about and revisions of the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The SMAP derived 1 km surface soil moisture dataset is available at https://nsidc.org/data/nsidc-0779/versions/1 (accessed on 10 April 2025). In situ SM from the International Soil Moisture Network is available at https://ismn.earth/en/dataviewer/ (accessed on 10 April 2025). The boundaries of the US regions are available at the US Census Bureau (https://www.census.gov/geographies/mapping-files/time-series/geo/cartographic-boundary.html, accessed on 10 April 2025). The DINEOF+ code can be found at the GitHub repository https://github.com/zhprm1992/DINEOF-plus.git (accessed on 10 April 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Li, G.; Yang, H.; Zhang, Y.; Huang, C.; Pan, X.; Ma, M.; Song, M.; Zhao, H. More Extreme Precipitation in Chinese Deserts from 1960 to 2018. Earth Space Sci. 2019, 6, 1196–1204.
  2. Berg, A.; Sheffield, J. Climate Change and Drought: The Soil Moisture Perspective. Curr. Clim. Change Rep. 2018, 4, 180–191.
  3. Abolafia-Rosenzweig, R.; Livneh, B.; Small, E.; Kumar, S. Soil Moisture Data Assimilation to Estimate Irrigation Water Use. J. Adv. Model. Earth Syst. 2019, 11, 3670–3690.
  4. Zhao, H.; Gao, F.; Anderson, M.; Cirone, R.; Chang, G.J. Improving crop condition monitoring using phenologically aligned vegetation index anomalies–A case study in central Iowa. Int. J. Appl. Earth Obs. Geoinf. 2025, 139, 104526.
  5. Zhao, H.; Di, L.; Guo, L.; Zhang, C.; Lin, L. An Automated Data-Driven Irrigation Scheduling Approach Using Model Simulated Soil Moisture and Evapotranspiration. Sustainability 2023, 15, 12908.
  6. Dobriyal, P.; Qureshi, A.; Badola, R.; Hussain, S.A. A Review of the Methods Available for Estimating Soil Moisture and Its Implications for Water Resource Management. J. Hydrol. 2012, 458–459, 110–117.
  7. Kerr, Y.H.; Waldteufel, P.; Wigneron, J.P.; Martinuzzi, J.; Font, J.; Berger, M. Soil Moisture Retrieval from Space: The Soil Moisture and Ocean Salinity (Smos) Mission. IEEE Trans. Geosci. Remote Sens. 2001, 39, 1729–1735.
  8. Jackson, T.J.; Cosh, M.H.; Bindlish, R.; Starks, P.J.; Bosch, D.D.; Seyfried, M.; Goodrich, D.C.; Moran, M.S.; Du, J.Y. Validation of Advanced Microwave Scanning Radiometer Soil Moisture Products. IEEE Trans. Geosci. Remote Sens. 2010, 48, 4256–4272.
  9. Kim, H.; Lakshmi, V. Use of Cyclone Global Navigation Satellite System (Cygnss) Observations for Estimation of Soil Moisture. Geophys. Res. Lett. 2018, 45, 8272–8282.
  10. Entekhabi, D.; Njoku, E.G.; O’Neill, P.E.; Kellogg, K.H.; Crow, W.T.; Edelstein, W.N.; Entin, J.K.; Goodman, S.D.; Jackson, T.J.; Johnson, J.; et al. The Soil Moisture Active Passive (Smap) Mission. Proc. IEEE 2010, 98, 704–716.
  11. Zaussinger, F.; Dorigo, W.; Gruber, A.; Tarpanelli, A.; Filippucci, P.; Brocca, L. Estimating Irrigation Water Use over the Contiguous United States by Combining Satellite and Reanalysis Soil Moisture Data. Hydrol. Earth Syst. Sci. 2019, 23, 897–923.
  12. Walker, J.P.; Houser, P.R. Requirements of a Global near-Surface Soil Moisture Satellite Mission: Accuracy, Repeat Time, and Spatial Resolution. Adv. Water Resour. 2004, 27, 785–801.
  13. Fang, B.; Lakshmi, V.; Cosh, M.; Liu, P.; Bindlish, R.; Jackson, T.J. A Global 1-Km Downscaled Smap Soil Moisture Product Based on Thermal Inertia Theory. Vadose Zone J. 2022, 21, e20182.
  14. Liu, P.-W.; Bindlish, R.; O’Neill, P.; Fang, B.; Lakshmi, V.; Yang, Z.; Cosh, M.H.; Bongiovanni, T.; Collins, C.H.; Starks, P.J.; et al. Thermal Hydraulic Disaggregation of Smap Soil Moisture over the Continental United States. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 4072–4092.
  15. Lakshmi, V.; Fang, B. Smap-Derived 1-Km Downscaled Surface Soil Moisture Product, Version 1; NASA National Snow and Ice Data Center Distributed Active Archive Center: Boulder, CO, USA, 2023.
  16. Zhang, C.; Yang, Z.; Zhao, H.; Sun, Z.; Di, L.; Bindlish, R.; Liu, P.-W.; Colliander, A.; Mueller, R.; Crow, W.; et al. Crop-Casma: A Web Geoprocessing and Map Service Based Architecture and Implementation for Serving Soil Moisture and Crop Vegetation Condition Data over U.S. Cropland. Int. J. Appl. Earth Obs. Geoinf. 2022, 112, 102902.
  17. Fang, B.; Lakshmi, V.; Bindlish, R.; Jackson, T.J.; Liu, P.-W. Evaluation and Validation of a High Spatial Resolution Satellite Soil Moisture Product over the Continental United States. J. Hydrol. 2020, 588, 125043.
  18. Eva, E.A.; Leasor, Z.; Dobreva, I.; Quiring, S.M. Identifying Important Features for Downscaling Soil Moisture to 1-Km in the Contiguous United States. EGUsphere 2025, 2025, 1–38.
  19. Gemitzi, A.; Kofidou, M.; Falalakis, G.; Fang, B.; Lakshmi, V. Estimating High-Resolution Soil Moisture by Combining Data from a Sparse Network of Soil Moisture Sensors and Remotely Sensed Modis Lst Information. Hydrol. Res. 2024, 55, 905–920.
  20. Senyurek, V.; Gurbuz, A.; Kurum, M.; Lei, F.; Boyd, D.; Moorhead, R. Spatial and Temporal Interpolation of Cygnss Soil Moisture Estimations. In Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium, 11–16 July 2021.
  21. Lu, G.Y.; Wong, D.W. An Adaptive Inverse-Distance Weighting Spatial Interpolation Technique. Comput. Geosci. 2008, 34, 1044–1055.
  22. Yao, X.; Fu, B.; Lü, Y.; Sun, F.; Wang, S.; Liu, M. Comparison of Four Spatial Interpolation Methods for Estimating Soil Moisture in a Complex Terrain Catchment. PLoS ONE 2013, 8, e54660.
  23. Karamouz, M.; Alipour, R.S.; Roohinia, M.; Fereshtehpour, M. A Remote Sensing Driven Soil Moisture Estimator: Uncertain Downscaling with Geostatistically Based Use of Ancillary Data. Water Resour. Res. 2022, 58, e2022WR031946.
  24. Chen, H.; Fan, L.; Wu, W.; Liu, H.-B. Comparison of Spatial Interpolation Methods for Soil Moisture and Its Application for Monitoring Drought. Environ. Monit. Assess. 2017, 189, 525.
  25. Zhang, J.; Li, X.; Yang, R.; Liu, Q.; Zhao, L.; Dou, B. An Extended Kriging Method to Interpolate near-Surface Soil Moisture Data Measured by Wireless Sensor Networks. Sensors 2017, 17, 1390.
  26. Srivastava, P.K.; Pandey, P.C.; Petropoulos, G.P.; Kourgialas, N.N.; Pandey, V.; Singh, U. GIS and Remote Sensing Aided Information for Soil Moisture Estimation: A Comparative Study of Interpolation Techniques. Resources 2019, 8, 70. [Google Scholar] [CrossRef]
  27. Ghahramani, Z.; Jordan, M.I. Supervised Learning from Incomplete Data Via an EM Approach. Adv. Neural Inf. Process. Syst. 1993, 6, 120–127.
  28. Beckers, J.M.; Rixen, M. EOF Calculations and Data Filling from Incomplete Oceanographic Datasets. J. Atmos. Ocean. Technol. 2003, 20, 1839–1856.
  29. Zhao, H.; Matsuoka, A.; Manizza, M.; Winter, A. Recent Changes of Phytoplankton Bloom Phenology in the Northern High-Latitude Oceans (2003–2020). J. Geophys. Res. Ocean. 2022, 127, e2021JC018346.
  30. Marchese, C.; Colella, S.; Brando, V.E.; Zoffoli, M.L.; Volpe, G. Towards Accurate L4 Ocean Colour Products: Interpolating Remote Sensing Reflectance Via DINEOF. Int. J. Appl. Earth Obs. Geoinf. 2024, 135, 104270.
  31. Alvera-Azcárate, A.; Barth, A.; Parard, G.; Beckers, J.-M. Analysis of SMOS Sea Surface Salinity Data Using DINEOF. Remote Sens. Environ. 2016, 180, 137–145.
  32. Alvera-Azcárate, A.; Barth, A.; Beckers, J.-M.; Weisberg, R.H. Multivariate Reconstruction of Missing Data in Sea Surface Temperature, Chlorophyll, and Wind Satellite Fields. J. Geophys. Res. Ocean. 2007, 112, C3.
  33. Zhao, H.; Zhao, H.; Zhang, C. Reconstructing SMAP-Derived 1-Km Soil Moisture Dataset Using DINEOF+ Algorithm. In Proceedings of the 2024 12th International Conference on Agro-Geoinformatics (Agro-Geoinformatics), Novi Sad, Serbia, 15–18 July 2024.
  34. Zhao, H.; Di, L.; Sun, Z.; Yu, E.; Zhang, C.; Lin, L. Validation and Calibration of HRLDAS Soil Moisture Products in Nebraska. In Proceedings of the 2022 10th International Conference on Agro-Geoinformatics (Agro-Geoinformatics), Boulder, CO, USA, 11–14 July 2022.
  35. Chaney, N.W.; Minasny, B.; Herman, J.D.; Nauman, T.W.; Brungard, C.W.; Morgan, C.L.S.; McBratney, A.B.; Wood, E.F.; Yimam, Y.T. POLARIS Soil Properties: 30-M Probabilistic Maps of Soil Properties over the Contiguous United States. Water Resour. Res. 2019, 55, 2916–2938.
  36. Kern, J.S. Geographic Patterns of Soil Water-Holding Capacity in the Contiguous United States. Soil Sci. Soc. Am. J. 1995, 59, 1126–1133.
  37. Kern, J.S. Spatial Patterns of Soil Organic Carbon in the Contiguous United States. Soil Sci. Soc. Am. J. 1994, 58, 439–455.
  38. Dorigo, W.; Himmelbauer, I.; Aberer, D.; Schremmer, L.; Petrakovic, I.; Zappa, L.; Preimesberger, W.; Xaver, A.; Annor, F.; Ardö, J.; et al. The International Soil Moisture Network: Serving Earth System Science for over a Decade. Hydrol. Earth Syst. Sci. 2021, 25, 5749–5804.
  39. Zhao, H.; Matsuoka, A.; Manizza, M.; Winter, A. DINEOF Interpolation of Global Ocean Color Data: Error Analysis and Masking. J. Atmos. Ocean. Technol. 2024, 41, 953–968.
  40. Fang, B.; Lakshmi, V.; Bindlish, R.; Jackson, T.J. Downscaling of SMAP Soil Moisture Using Land Surface Temperature and Vegetation Data. Vadose Zone J. 2018, 17, 170198.
  41. Boyd, J.D.; Kennelly, E.P.; Pistek, P. Estimation of EOF Expansion Coefficients from Incomplete Data. Deep Sea Res. Part I Oceanogr. Res. Pap. 1994, 41, 1479–1488.
Figure 1. Percentage of missing data (PMD) of the 1 km SMAP daily SM product across the CONUS in 2020. (a) Spatial distribution of PMD. Markers indicate the locations of the stations selected for time series comparison between in situ and reconstructed SM in Section 4.3. (b) Daily variation of PMD throughout the year.
Figure 2. Geographic distribution of ISMN networks and stations of in situ SM used for validation in this study. Dots indicate station locations, with colors representing different networks. Labels are station names.
Figure 3. SMAP-derived daily 6 pm SM at 1 km resolution (left panels) and DINEOF+ reconstructed results (right panels) in the contiguous US in 2020 on (a,b) April 20, (c,d) May 2, (e,f) June 26, and (g,h) July 29. Black indicates no data.
Figure 4. Scatterplot of DINEOF+ reconstructed SM vs. original SMAP-derived SM at networks (a) ARM, (b) SCAN, (c) SNOTEL, (d) TxSON, (e) USCRN, and (f) iRON. The color represents data density. Red line = 1:1 ratio.
Figure 5. Scatterplot of DINEOF+ reconstructed SM vs. in situ SM from ISMN at (a) ARM, (b) FLUXNET-AMERIFLUX, (c) SCAN, (d) SNOTEL, (e) TxSON, (f) USCRN, and (g) iRON. The color represents data density. Red line = 1:1 ratio.
Figure 6. Time-series plots of DINEOF+ reconstructed 1 km SMAP-derived SM estimates and corresponding ISMN in situ SM measurements for the descending overpass at 16 stations from 4 SM networks: (a) ARM, (b) SCAN, (c) SNOTEL, and (d) TxSON. Blue dots represent ISMN SM; red dots represent DINEOF+ reconstructed SM.
Table 1. Information on the ISMN networks.

| Network | Number of Stations | Number of Observations | Range of SM (m³/m³) |
|---|---|---|---|
| SCAN | 167 | 42,507 | 0.001–0.521 |
| SNOTEL | 354 | 41,818 | 0.001–0.460 |
| USCRN | 107 | 27,216 | 0.001–0.515 |
| ARM | 15 | 14,724 | 0.002–0.456 |
| TxSON | 37 | 13,053 | 0.051–0.493 |
| iRON | 10 | 1954 | 0.010–0.430 |
| FLUXNET-AMERIFLUX | 1 | 191 | 0.227–0.411 |
Table 2. Error statistics of original SMAP-derived SM and DINEOF+ reconstructed SM compared to ISMN SM in 2020. N is the number of matchups. Bold indicates the lower error for each metric.

| Network | N | Original bias | Original MAE | Original RMS | Original R² | Original slope | Reconstructed bias | Reconstructed MAE | Reconstructed RMS | Reconstructed R² | Reconstructed slope |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ARM | 313 | −0.11 | 0.11 | 0.06 | 0.53 | 1.33 | −0.06 | 0.07 | 0.06 | 0.40 | 1.04 |
| SCAN | 5884 | 0.02 | 0.05 | 0.08 | 0.56 | 0.80 | 0.03 | 0.06 | 0.08 | 0.56 | 0.81 |
| SNOTEL | 13,945 | 0.01 | 0.07 | 0.10 | 0.11 | 0.52 | 0.02 | 0.08 | 0.10 | 0.10 | 0.50 |
| TxSON | 53 | −0.03 | 0.04 | 0.08 | 0.28 | 1.21 | 0.08 | 0.10 | 0.09 | 0.06 | 0.83 |
| USCRN | 3180 | 0.04 | 0.07 | 0.10 | 0.25 | 0.51 | 0.05 | 0.08 | 0.10 | 0.31 | 0.59 |
| iRON | 237 | −0.02 | 0.05 | 0.08 | 0.00 | −0.12 | −0.01 | 0.05 | 0.07 | 0.02 | 0.39 |
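For readers who want to reproduce statistics of the kind reported in Table 2, the following is a minimal sketch rather than the authors' code. It assumes that bias is the mean of (estimate − in situ), that RMS is the root mean square of those differences, that R² is the squared Pearson correlation, and that slope comes from an ordinary least-squares fit of the estimate against the in situ values. The function name `error_stats` and the sample values are hypothetical.

```python
import numpy as np


def error_stats(reference, estimate):
    """Error statistics for paired soil moisture values (m3/m3).

    Assumed definitions (not taken from the article):
    bias = mean(estimate - reference), RMS = root mean square difference,
    R2 = squared Pearson correlation, slope = least-squares fit of
    estimate on reference.
    """
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)

    # A matchup needs a valid value in both series.
    valid = np.isfinite(ref) & np.isfinite(est)
    ref, est = ref[valid], est[valid]

    diff = est - ref
    slope, _intercept = np.polyfit(ref, est, 1)
    r = np.corrcoef(ref, est)[0, 1]

    return {
        "N": int(ref.size),
        "bias": float(diff.mean()),
        "MAE": float(np.abs(diff).mean()),
        "RMS": float(np.sqrt((diff ** 2).mean())),
        "R2": float(r ** 2),
        "slope": float(slope),
    }


# Hypothetical matchups: one in situ value is missing and is skipped.
in_situ = [0.10, 0.20, np.nan, 0.30, 0.40]
reconstructed = [0.12, 0.22, 0.25, 0.32, 0.42]
print(error_stats(in_situ, reconstructed))
```

Applying the same function once to the original SMAP-derived SM and once to the reconstructed SM, each paired with the in situ series of a network, would give the two halves of a table row under these assumed definitions.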
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhao, H.; Zhao, H.; Zhang, C. Validating Data Interpolation Empirical Orthogonal Functions Interpolated Soil Moisture Data in the Contiguous United States. Agriculture 2025, 15, 1212. https://doi.org/10.3390/agriculture15111212