Overview

This document describes the overall process to model Soil Moisture in Oklahoma and surrounding areas performing geospatial interpolation by means of Ordinary Kriging, Regression Kriging and Generalized Linear Model regarding the relationship between monthly mean soil moisture and, monthly mean minimum air temperature and total precipitation. Soil Moisture product was acquired from the European Space Agency Climate Change Initiative (CCI version 4.5).

Objectives

This workflow performs three approaches to model soil moisture over those pixels where soil moisture was not remotely retrieved and processed by the ESA CCI product (version 4.5). The region of interest is an area of 465,777 km2, covering the state of Oklahoma and some portions of the surrounding states.

According to the availability of ground-truth data aimed to validate the modeled soil moisture values, as well as define the base correlation between satellite soil moisture estimates and ground-truth data, a period from January 2000 to September 2012 was established.

This workflow shows three approaches to model soil moisture for one single month, which can be any of the months of the established study period. Regarding the availability of spatial gaps to perform the analysis, different percentages of valid data (25% and 50%) were artificially removed to test the modeling methods under different scenarios of gap presence; with 100% of available data, 75 % and 50% respectively.

First approach aims to soil moisture modeling by means of Ordinary Kriging interpolation (Hengl, Heuvelink, and Stein 2004; Stein 1999; Cressie 1990), regarding the pixel values nearby No Data (invalid) pixels, as well as the spatial distribution of both valid and invalid pixels.

The second one performs Regression Kriging, which combines the principles of kriging interpolation and linear regression with covariates (Hengl, Heuvelink, and Stein 2004; Kang, Jin, and Li 2015) that are used to solve kriging weights (Hengl, Heuvelink, and Rossiter 2007). In this work, RK relies on the relation between soil moisture (response variable) with precipitation and minimum air temperature (explanatory variables).

The third approach is based on Generalized Linear Model regression, regarding the relationship between soil moisture as response variable and two or more predictors (Gareth et al. 2013). As result of previous analysis, independent linear correlation between soil moisture and meteorological parameters such as precipitation and minimum air temperature was found, this way, the conceptual assumption for using linear models is fulfilled.

All proposed techniques for soil moisture modeling are evaluated and validated using cross validation approach, as well as Pearson correlation with soil moisture field data derived from the North America Soil Moisture Database Network (Quiring et al. 2016) over the region of interest.

Workflow for soil moisture modeling and the gap-filling over the region of interest, regarding 100%, 75% and 50% of available valid pixels in each monthly layer. Cross-validation as well as ground-truth validation is also described.

Workflow for soil moisture modeling and the gap-filling over the region of interest, regarding 100%, 75% and 50% of available valid pixels in each monthly layer. Cross-validation as well as ground-truth validation is also described.

Data sources

The CCI product is a soil moisture database developed by the European Space Agency in Collaboration with the Vienna University of Technology (TU Wien) (Dorigo et al. 2015). This gathers historical records from a set of sensors, providing global soil moisture measurements daily since November 1978, at 0.25 degrees of spatial resolution.

Meteorological data used in this workflow was acquired from Daymet (Daily Surface Weather and Climatological Data), an initiative supported by NASA through the Earth Science Data and Information System and the Terrestrial Ecosystem program (Thornton et al. 2018). Daymet provides gridded estimates of seven surface parameters at 1-km spatial resolution over North America. Original data was cropped to the region of interest, then projected to WGS 84 Lat-Long coordinate system and resampled to 0.25 degrees by means of nearest neighbor method (ngb) (J. a Parker, Kenyon, and Troxel 1983).

Validation data was acquired from the North American Soil Moisture Database (Quiring et al. 2016). The NASMD is a collection of field soil moisture measurements from 33 observation networks, comprising over 1,800 sites in the United States and Canada. For the sake of comparison between satellite soil moisture and modeled soil moisture outputs, mean monthly soil moisture values from the NASMD were calculated for each station.

Data preparation

Import of one soil moisture monthly layer (February 2011), as well as predefined monthly covariates (precipitation and min air temperature) for the same month as soil moisture.

Transformation of soil moisture layer into Spatial Points Data Frame, the format needed to run kriging interpolation method using 100% of available valid pixels in the selected month.

Data needed to reproduce the process along the study period (153 months)

153 Soil Moisture layers (January 2000 - September 2012) 153 Total monthly precipitation layers (January 2000 - September 2012) 153 Average monthly minimum temperature layers (January 2000 - September 2012) 153 Monthly field soil moisture records over the region of interest, from NASMD (January 2000 - September 2012)

#Set working directory where the described files below must be located
setwd("E:/Dropbox/UDEL/Gap_Filling_paper_definitive/Soil_moisture_modeling_v45")

#This seed needs to be set to generate the same sampling outputs in some parts of the process. 
#This parameter must not be modified
set.seed(3456)

#This is the projection that will assign metric coordinates to the data to be interpolated based on Ordinary Kriging and Regression Kriging
albers_proj <- "+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96 +x_0=0+y_0=0 +ellps=GRS80 +datum=NAD83 +units=m +no_defs"

#This is the original projection in which the original soil moisture data is provided.
wgs84 <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"

### Reference raster layer of the region of interest (wgs84)
raster_reference <- raster('E:/Dropbox/UDEL/Gap_Filling_paper_definitive/Boundaries/Region_Interest_Frame_.tif')

#These are the lists of files including all input data needed to reproduce the entire workflow.
sm_files <- list.files(pattern = "metrics_version_45.tif")
prcp_files <- list.files(pattern = "daymet_prcp")
tmin_files <- list.files(pattern = "daymet_tmin")
nasmd_files <- list.files(pattern = "validation_nasmd")

raster_reference_pixels <- as(raster_reference, 'SpatialPolygonsDataFrame')
raster_reference_pixels <- spTransform(raster_reference_pixels, CRSobj=CRS(albers_proj))

#This is the number of layer that defines the month to be processed (e.g. 134 = Febraruay 2011)

i=134

sm_monthly_ref <- raster(sm_files[i], band =1)
sm_monthly_ref <- as.data.frame(sm_monthly_ref, xy=TRUE)
names(sm_monthly_ref)[3] <- 'original_soil_moist'

sm_raster_base <- raster(sm_files[i], band =1)
sm_raster_krig <- as(sm_raster_base, 'SpatialPointsDataFrame')
sm_raster_krig <- spTransform(sm_raster_krig, albers_proj)

sm_raster <- na.omit(as.data.frame(sm_raster_base[[1]], xy=TRUE))

names(sm_raster)[3] <- 'soil_moist'
names(sm_raster_base)[1] <- 'soil_moist'
names(sm_raster_krig)[1] <- 'soil_moist'

prcp_raster <- raster(prcp_files[i], band = 1)
prcp_montlhy_ref <- as.data.frame(prcp_raster, xy=TRUE)
names(prcp_montlhy_ref)[3] <- 'prcp'

tmin_raster <- raster(tmin_files[i], band = 1)
tmin_montlhy_ref <- as.data.frame(tmin_raster, xy=TRUE)
names(tmin_montlhy_ref)[3] <- 'tmin'

covariates <- stack(prcp_raster, tmin_raster)
names(covariates)[1] <- 'prcp'
names(covariates)[2] <- 'tmin'
#Generation of the table to store correlation and rmse outputs from interpoaltion methods
df <- data.frame(YYMM=character(), correlation_OK=numeric(), rmse_OK=numeric(), 
                 correlation_RK=numeric(), rmse_RK=numeric(), 
                 correlation_GLM=numeric(), rmse_GLM=numeric(), 
                 nfold_OK=numeric(), nfold_RK=numeric(), nfold_GLM=numeric())

df$YYMM <- as.character()
layer_name_sm <- substr(sm_files[i], 20,26)
layer_name_prcp <- substr(prcp_files[i], 29,35)
layer_name_tmin <- substr(tmin_files[i], 29,35)
layer_name_nasmd <- substr(sm_files[i], 20,26)
df[1,1] <- layer_name_sm

print(layer_name_nasmd)

accuracy_report_100 <- df
accuracy_report_75 <- df
accuracy_report_50 <- df

Soil Moisture Modeling

Ordinary Kriging spatial interpolation of available valid pixels

This section runs Ordinary Kriging Spatial Interpolation method using 100%, 75% and 50% of available valid pixels in the soil moisture monthly layer. The number of valid pixels varies according to every monthly layer.

Ordinary Kriging 100% of available valid pixels

#generation of a model to fit variogram (using automap library)
ordkrig_model_100 <- autofitVariogram(soil_moist~1, input_data = sm_raster_krig, verbose= FALSE, 
                              GLS.model=TRUE)

#application of ordinary kriging spatial interpolation using the previous fittef model
sm_ordkriging_100 <- autoKrige(soil_moist~1, input_data = sm_raster_krig, GLS.model = TRUE, new_data = raster_reference_pixels)

#save the R file, which contains the ordinary kriging variogram model
saveRDS(ordkrig_model_100, file = paste0('ordkrig_variogram_model_100_', layer_name_sm))
write.csv(ordkrig_model_100$exp_var, file = paste0('ordkrig_variogram_model_100_exp_var_', layer_name_sm, '.csv'))
write.csv(ordkrig_model_100$var_model, file = paste0('ordkrig_variogram_model_100_var_model_', layer_name_sm, '.csv'))

#definition of predicted values and associated standard error
sm_ordkriging_100_output <- sm_ordkriging_100$krige_output
sm_ordkriging_100_output <- spTransform(sm_ordkriging_100_output, wgs84)

ordkrig_pred_100 <- rasterize(sm_ordkriging_100_output, raster_reference, 'var1.pred')
ordkrig_std_err_100 <- rasterize(sm_ordkriging_100_output, raster_reference, 'var1.stdev', na.rm = FALSE)

#Ordinary Kriging Model plot
plot(sm_ordkriging_100, col=colorRampPalette(c("tan4", "tan2", "yellow1", "yellowgreen", 
    "green", "springgreen2", "steelblue1", "steelblue4"))(20))

#export of outputs to raster files
writeRaster(stack(ordkrig_pred_100, ordkrig_std_err_100), 
            file=paste0('ordKriging_', layer_name_sm, '_pred_error_sm_100.tif'), overwrite = TRUE)

Ordinary Kriging 75% of available valid pixels

#selection of training data, 75% of valid available pixels
trainIndex75 <- createDataPartition(sm_raster[,3], p = .75, list = FALSE, times = 1)
Train75 <- sm_raster[ trainIndex75,] 
Test75 <- sm_raster[-trainIndex75,]

coordinates(Train75) =~ x+y
projection(Train75) <- wgs84

Train75 <- spTransform(Train75, CRSobj = albers_proj)

#generation of  a model to fit variogram
ordkrig_model_75 <- autofitVariogram(soil_moist~1, input_data = Train75, verbose= FALSE, 
                                      GLS.model=TRUE)

#application of ordinary kriging spatial interpolation using the previous fittef model
sm_ordkriging_75 <- autoKrige(soil_moist~1, input_data = Train75, GLS.model = TRUE, new_data = raster_reference_pixels)

#save the R file, which contains the ordinary kriging variogram model
saveRDS(ordkrig_model_75, file = paste0('ordkrig_variogram_model_75_', layer_name_sm))
write.csv(ordkrig_model_75$exp_var, file = paste0('ordkrig_variogram_model_75_exp_var_', layer_name_sm, '.csv'))
write.csv(ordkrig_model_75$var_model, file = paste0('ordkrig_variogram_model_75_var_model_', layer_name_sm, '.csv'))

#definition of predicted values and associated standard error
sm_ordkriging_75_output <- sm_ordkriging_75$krige_output
sm_ordkriging_75_output <- spTransform(sm_ordkriging_75_output, wgs84)

ordkrig_pred_75 <- rasterize(sm_ordkriging_75_output, raster_reference, 'var1.pred')
ordkrig_std_err_75 <- rasterize(sm_ordkriging_75_output, raster_reference, 'var1.stdev', na.rm = FALSE)

#Ordinary Kriging Model plot
plot(sm_ordkriging_75, col=colorRampPalette(c("tan4", "tan2", "yellow1", "yellowgreen", 
    "green", "springgreen2", "steelblue1", "steelblue4"))(20))

#export of outputs to raster files
writeRaster(stack(ordkrig_pred_75, ordkrig_std_err_75), 
            file=paste0('ordKriging_', layer_name_sm, '_pred_error_sm_75.tif'), overwrite = TRUE)

Ordinary Kriging 50% of available valid pixels

#selection of training data, 50% of valid available pixels
trainIndex50 <- createDataPartition(sm_raster[,3], p = .5, list = FALSE, times = 1)
Train50 <- sm_raster[ trainIndex50,]
Test50 <- sm_raster[-trainIndex50,]

coordinates(Train50) =~ x+y
projection(Train50) <- wgs84

Train50 <- spTransform(Train50, CRSobj = albers_proj)

#generation of  a model to fit variogram
ordkrig_model_50 <- autofitVariogram(soil_moist~1, input_data = Train50, verbose= FALSE, 
                                     GLS.model=TRUE)

#application of ordinary kriging spatial interpolation using the previous fittef model
sm_ordkriging_50 <- autoKrige(soil_moist~1, input_data = Train50, GLS.model = TRUE, new_data = raster_reference_pixels)

#save the R file, which contains the ordinary kriging variogram model
saveRDS(ordkrig_model_50, file = paste0('ordkrig_variogram_model_50_', layer_name_sm))
write.csv(ordkrig_model_50$exp_var, file = paste0('ordkrig_variogram_model_50_exp_var_', layer_name_sm, '.csv'))
write.csv(ordkrig_model_50$var_model, file = paste0('ordkrig_variogram_model_50_var_model_', layer_name_sm, '.csv'))

#definition of predicted values and associated standard error
sm_ordkriging_50_output <- sm_ordkriging_50$krige_output
sm_ordkriging_50_output <- spTransform(sm_ordkriging_50_output, wgs84)

ordkrig_pred_50 <- rasterize(sm_ordkriging_50_output, raster_reference, 'var1.pred')
ordkrig_std_err_50 <- rasterize(sm_ordkriging_50_output, raster_reference, 'var1.stdev', na.rm = FALSE)

#Ordinary Kriging Model plot
plot(sm_ordkriging_50, col=colorRampPalette(c("tan4", "tan2", "yellow1", "yellowgreen", 
    "green", "springgreen2", "steelblue1", "steelblue4"))(20))

#export of outputs to raster files
writeRaster(stack(ordkrig_pred_50, ordkrig_std_err_50), 
            file=paste0('ordKriging_', layer_name_sm, '_pred_error_sm_50.tif'), overwrite = TRUE)

Report of semivariograms from kriging models

#Semivariogram model using 100% of available data
kable(ordkrig_model_100$exp_var, caption = 'Explained Variance', digits = 5)

kable(ordkrig_model_100$var_model, caption = 'Variance Model', digits = 5)
#Semivariogram model using 100% of available data
kable(ordkrig_model_75$exp_var, caption = 'Explained Variance', digits = 5)

kable(ordkrig_model_75$var_model, caption = 'Variance Model', digits = 5)
#Semivariogram model using 100% of available data
kable(ordkrig_model_50$exp_var, caption = 'Explained Variance', digits = 5)

kable(ordkrig_model_50$var_model, caption = 'Variance Model', digits = 5)

Cross validation for ordinary kriging spatial interpolation models

This function takes the same model used in the autoKrige function to predict over different subsets based on an iterative process.

Cross validation is applied to the outputs of ordinary kriging interpolation using 100%, 75% and 50% of available valid pixels. autoKrige.cv removes a predefined percent of the available valid data and then predicts new values over them, this process is performed 10 times in this section, ensuring every valid pixel is removed once and then predicted.

Only the model is validated in this process, as no predicted values over areas with original invalid pixels can be validated.

100% of valid pixels

sm_ordkriging_cv_100_10fold <- autoKrige.cv(soil_moist~1, 
                                            input_data = sm_raster_krig, nfold = 10)

acc_report_100_ordkrig <- accuracy_report_100[, -c(4,5,6,7,9,10)]

acc_report_100_ordkrig[1,4] <- 10
acc_report_100_ordkrig[1,2] <- round(cor(sm_ordkriging_cv_100_10fold$krige.cv_output$observed, 
                                      sm_ordkriging_cv_100_10fold$krige.cv_output$var1.pred) ^2, 3)
acc_report_100_ordkrig[1,3] <- round(rmse(sm_ordkriging_cv_100_10fold$krige.cv_output$observed, 
                                       sm_ordkriging_cv_100_10fold$krige.cv_output$var1.pred) ,3)

write.csv(acc_report_100_ordkrig, file = paste0('cv_ordkriging_100_', layer_name_sm, '.csv'))

sm_ordkriging_cv_100_10fold_output <- as.data.frame(sm_ordkriging_cv_100_10fold$krige.cv_output)
sm_ordkriging_cv_100_10fold_output <- transform(sm_ordkriging_cv_100_10fold_output, 
                                                YY = substr(layer_name_sm, 1, 4), 
                                                MM = substr(layer_name_sm, 6, 7))

write.csv(sm_ordkriging_cv_100_10fold_output, file = paste0('cross_validation_ordkriging_100_', 
                                                            layer_name_nasmd, '.csv'))

75% of valid pixels

sm_ordkriging_cv_75_10fold <- autoKrige.cv(soil_moist~1, 
                                            input_data = sm_raster_krig, nfold = 10)

acc_report_75_ordkrig <- accuracy_report_75[, -c(4,5,6,7,9,10)]

acc_report_75_ordkrig[1,4] <- 10
acc_report_75_ordkrig[1,2] <- round(cor(sm_ordkriging_cv_75_10fold$krige.cv_output$observed, 
                                         sm_ordkriging_cv_75_10fold$krige.cv_output$var1.pred) ^2, 3)
acc_report_75_ordkrig[1,3] <- round(rmse(sm_ordkriging_cv_75_10fold$krige.cv_output$observed, 
                                          sm_ordkriging_cv_75_10fold$krige.cv_output$var1.pred) ,3)

write.csv(acc_report_75_ordkrig, file = paste0('cv_ordkriging_75_', layer_name_sm, '.csv'))

sm_ordkriging_cv_75_10fold_output <- as.data.frame(sm_ordkriging_cv_75_10fold$krige.cv_output)
sm_ordkriging_cv_75_10fold_output <- transform(sm_ordkriging_cv_75_10fold_output, 
                                                YY = substr(layer_name_sm, 1, 4), 
                                                MM = substr(layer_name_sm, 6, 7))

write.csv(sm_ordkriging_cv_75_10fold_output, file = paste0('cross_validation_ordkriging_75_', 
                                                            layer_name_nasmd, '.csv'))

50% of valid pixels

sm_ordkriging_cv_50_10fold <- autoKrige.cv(soil_moist~1, 
                                           input_data = sm_raster_krig, nfold = 10)

acc_report_50_ordkrig <- accuracy_report_50[, -c(4,5,6,7,9,10)]

acc_report_50_ordkrig[1,4] <- 10
acc_report_50_ordkrig[1,2] <- round(cor(sm_ordkriging_cv_50_10fold$krige.cv_output$observed, 
                                        sm_ordkriging_cv_50_10fold$krige.cv_output$var1.pred) ^2, 3)
acc_report_50_ordkrig[1,3] <- round(rmse(sm_ordkriging_cv_50_10fold$krige.cv_output$observed, 
                                         sm_ordkriging_cv_50_10fold$krige.cv_output$var1.pred) ,3)

write.csv(acc_report_50_ordkrig, file = paste0('cv_ordkriging_50_', layer_name_sm, '.csv'))

sm_ordkriging_cv_50_10fold_output <- as.data.frame(sm_ordkriging_cv_50_10fold$krige.cv_output)
sm_ordkriging_cv_50_10fold_output <- transform(sm_ordkriging_cv_50_10fold_output, 
                                               YY = substr(layer_name_sm, 1, 4), 
                                               MM = substr(layer_name_sm, 6, 7))

write.csv(sm_ordkriging_cv_50_10fold_output, file = paste0('cross_validation_ordkriging_50_', 
                                                           layer_name_nasmd, '.csv'))

Regression Kriging spatial interpolation of available valid pixels

This section runs Regression Kriging Spatial Interpolation method using 100%, 75% and 50% of available valid pixels in the soil moisture monthly layer. The number of valid pixels varies according to every monthly layer.

Regression Kriging 100% of available valid pixels

#matrix preparation for regression kirging, soil moisture and covariates
regkrig_stack_100 <- stack(sm_raster_base, prcp_raster, tmin_raster)
regkrig_matrix_100 <- as(regkrig_stack_100, 'SpatialPointsDataFrame')
regkrig_matrix_100 <- spTransform(regkrig_matrix_100, albers_proj)  
names(regkrig_matrix_100) <- c("soil_moist", "prcp", "tmin")

#generation of matrix data frame as input for regression kriging model  
regkrig_matrix_100 <- as.data.frame(regkrig_matrix_100)
regkrig_matrix_100 <- data.frame(soil_moist=regkrig_matrix_100$soil_moist, x=regkrig_matrix_100$x, 
                                 y=regkrig_matrix_100$y, prcp=regkrig_matrix_100$prcp, 
                                 tmin=regkrig_matrix_100$tmin)

regkrig_matrix_100 <- na.omit(regkrig_matrix_100)

#spatial pixels data frame layer with the covariates and the target locations to predict soil moisture values
regkrig_covariates_grid <- stack(prcp_raster, tmin_raster)  
regkrig_covariates_grid <- projectRaster(regkrig_covariates_grid, crs = albers_proj)
regkrig_covariates_grid[is.na(regkrig_covariates_grid)==TRUE] <- -9999
regkrig_covariates_grid <- as(regkrig_covariates_grid, 'SpatialPixelsDataFrame')  
names(regkrig_covariates_grid) <- c("prcp", "tmin")

#generation of a model to fit variogram (using GSIF library)
regkrig_model_100 <- fit.regModel(soil_moist~prcp+tmin+prcp*tmin, rmatrix = regkrig_matrix_100,
                                  regkrig_covariates_grid, method="GLM")

#application of regression kriging spatial interpolation using the previous fitted model
regkrig_100_predict <- predict(regkrig_model_100, regkrig_covariates_grid, nfold = 10)

regkrig_100_predict_map <- stack(regkrig_100_predict@predicted)
regkrig_100_predict_map[regkrig_100_predict_map < 0] <- NA
regkrig_100_predict_map[regkrig_100_predict_map > 1] <- NA

plot(regkrig_100_predict_map)

#save the R file, which contains the regression kriging variogram model
saveRDS(regkrig_model_100, file = paste0('regkrig_variogram_model_100_', layer_name_sm))

regkrig_svgmModel_100 <- as.data.frame(regkrig_model_100@svgmModel)
write.csv(regkrig_svgmModel_100, file = paste0('regkrig_variogram_model_100_exp_var_', layer_name_sm, '.csv'))

regkrig_vgmModel_100 <- as.data.frame(regkrig_model_100@vgmModel)
write.csv(regkrig_vgmModel_100, file = paste0('regkrig_variogram_model_100_var_model_', layer_name_sm, '.csv'))

#definition of predicted values and associated standard error
regkrig_pred_100 <- stack(regkrig_100_predict_map[[4]])
regkrig_pred_100 <- projectRaster(regkrig_pred_100, raster_reference)

regkrig_std_err_100 <- stack(regkrig_100_predict_map[[3]])
regkrig_std_err_100 <- projectRaster(regkrig_std_err_100, raster_reference)

#export of outputs to raster files
writeRaster(stack(regkrig_pred_100, regkrig_std_err_100), 
            file=paste0('regKriging_', layer_name_sm, '_pred_error_sm_100.tif'), overwrite = TRUE)

Regression Kriging 75% of available valid pixels

Train75_regkrig <- regkrig_matrix_100[ trainIndex75,] 
Test75_regkrig <- regkrig_matrix_100[-trainIndex75,]

coordinates(Train75_regkrig) =~ x+y
projection(Train75_regkrig) <- albers_proj

#generation of matrix data frame as input for regression kriging model  
regkrig_matrix_75 <- as.data.frame(Train75_regkrig)
regkrig_matrix_75 <- data.frame(regkrig_matrix_75$soil_moist, soil_moist=regkrig_matrix_75$x, 
                                regkrig_matrix_75$y, regkrig_matrix_75$prcp, regkrig_matrix_75$tmin)
names(regkrig_matrix_75) <- c("soil_moist", "x", "y", "prcp", "tmin")

#generation of a model to fit variogram (using GSIF library)
regkrig_model_75 <- fit.regModel(soil_moist~prcp+tmin+prcp*tmin, rmatrix = regkrig_matrix_75,
                                  regkrig_covariates_grid, method="GLM")

#application of regression kriging spatial interpolation using the previous fitted model
regkrig_75_predict <- predict(regkrig_model_75, regkrig_covariates_grid, nfold = 10)

regkrig_75_predict_map <- stack(regkrig_100_predict@predicted)
regkrig_75_predict_map[regkrig_75_predict_map < 0] <- NA
regkrig_75_predict_map[regkrig_75_predict_map > 1] <- NA

plot(regkrig_75_predict_map)

#save the R file, which contains the regression kriging variogram model
saveRDS(regkrig_model_75, file = paste0('regkrig_variogram_model_75_', layer_name_sm))

regkrig_svgmModel_75 <- as.data.frame(regkrig_model_75@svgmModel)
write.csv(regkrig_svgmModel_75, file = paste0('regkrig_variogram_model_75_exp_var_', layer_name_sm, '.csv'))

regkrig_vgmModel_75 <- as.data.frame(regkrig_model_75@vgmModel)
write.csv(regkrig_vgmModel_75, file = paste0('regkrig_variogram_model_75_var_model_', layer_name_sm, '.csv'))

#definition of predicted values and associated standard error
regkrig_pred_75 <- stack(regkrig_75_predict_map[[4]])
regkrig_pred_75 <- projectRaster(regkrig_pred_75, raster_reference)

regkrig_std_err_75 <- stack(regkrig_75_predict_map[[3]])
regkrig_std_err_75 <- projectRaster(regkrig_std_err_75, raster_reference)

#export of outputs to raster files
writeRaster(stack(regkrig_pred_75, regkrig_std_err_75), 
            file=paste0('regKriging_', layer_name_sm, '_pred_error_sm_75.tif'), overwrite = TRUE)

Regression Kriging 50% of available valid pixels

Train50_regkrig <- regkrig_matrix_100[ trainIndex50,] 
Test50_regkrig <- regkrig_matrix_100[-trainIndex50,]

coordinates(Train50_regkrig) =~ x+y
projection(Train50_regkrig) <- albers_proj

#generation of matrix data frame as input for regression kriging model  
regkrig_matrix_50 <- as.data.frame(Train50_regkrig)
regkrig_matrix_50 <- data.frame(regkrig_matrix_50$soil_moist, soil_moist=regkrig_matrix_50$x, 
                                regkrig_matrix_50$y, regkrig_matrix_50$prcp, regkrig_matrix_50$tmin)
names(regkrig_matrix_50) <- c("soil_moist", "x", "y", "prcp", "tmin")

#generation of a model to fit variogram (using GSIF library)
regkrig_model_50 <- fit.regModel(soil_moist~prcp+tmin+prcp*tmin, rmatrix = regkrig_matrix_50,
                                 regkrig_covariates_grid, method="GLM")

#application of regression kriging spatial interpolation using the previous fitted model
  if(regkrig_model_50@vgmModel$range[[2]] > 0){
    
      regkrig_50_predict <- predict(regkrig_model_50, regkrig_covariates_grid, nfold = 10)

    }

regkrig_50_predict_map <- stack(regkrig_50_predict@predicted)
regkrig_50_predict_map[regkrig_50_predict_map < 0] <- NA
regkrig_50_predict_map[regkrig_50_predict_map > 1] <- NA

plot(regkrig_50_predict_map)

#save the R file, which contains the regression kriging variogram model
saveRDS(regkrig_model_50, file = paste0('regkrig_variogram_model_50_', layer_name_sm))

regkrig_svgmModel_50 <- as.data.frame(regkrig_model_50@svgmModel)
write.csv(regkrig_svgmModel_50, file = paste0('regkrig_variogram_model_50_exp_var_', layer_name_sm, '.csv'))

regkrig_vgmModel_50 <- as.data.frame(regkrig_model_50@vgmModel)
write.csv(regkrig_vgmModel_50, file = paste0('regkrig_variogram_model_50_var_model_', layer_name_sm, '.csv'))

#definition of predicted values and associated standard error
regkrig_pred_50 <- stack(regkrig_50_predict_map[[4]])
regkrig_pred_50 <- projectRaster(regkrig_pred_50, raster_reference)

regkrig_std_err_50 <- stack(regkrig_50_predict_map[[3]])
regkrig_std_err_50 <- projectRaster(regkrig_std_err_50, raster_reference)

#export of outputs to raster files
writeRaster(stack(regkrig_pred_50, regkrig_std_err_50), 
            file=paste0('regKriging_', layer_name_sm, '_pred_error_sm_50.tif'), overwrite = TRUE)

Cross validation for regression kriging spatial interpolation model

100% of valid pixels

acc_report_100_regkrig <- accuracy_report_100[c(1,4,5,9)]

acc_report_100_regkrig[1,4] <- 10
acc_report_100_regkrig[1,2] <- round(cor(regkrig_100_predict@validation$observed, 
                                         regkrig_100_predict@validation$var1.pred) ^2, 3)
acc_report_100_regkrig[1,3] <- round(rmse(regkrig_100_predict@validation$observed, 
                                          regkrig_100_predict@validation$var1.pred) ,3)

write.csv(acc_report_100_regkrig, file = paste0('cv_regkriging_100_', layer_name_sm, '.csv'))

sm_regkriging_cv_100_10fold_output <- as.data.frame(regkrig_100_predict@validation)
sm_regkriging_cv_100_10fold_output <- transform(sm_regkriging_cv_100_10fold_output, 
                                                YY = substr(layer_name_sm, 1, 4), 
                                                MM = substr(layer_name_sm, 6, 7))

write.csv(sm_regkriging_cv_100_10fold_output, file = paste0('cross_validation_regkriging_100_', 
                                                            layer_name_nasmd, '.csv'))

75% of valid pixels

acc_report_75_regkrig <- accuracy_report_75[c(1,4,5,9)]

acc_report_75_regkrig[1,4] <- 10
acc_report_75_regkrig[1,2] <- round(cor(regkrig_75_predict@validation$observed, 
                                         regkrig_75_predict@validation$var1.pred) ^2, 3)
acc_report_75_regkrig[1,3] <- round(rmse(regkrig_75_predict@validation$observed, 
                                          regkrig_75_predict@validation$var1.pred) ,3)

write.csv(acc_report_75_regkrig, file = paste0('cv_regkriging_75_', layer_name_sm, '.csv'))

sm_regkriging_cv_75_10fold_output <- as.data.frame(regkrig_75_predict@validation)
sm_regkriging_cv_75_10fold_output <- transform(sm_regkriging_cv_75_10fold_output, 
                                                YY = substr(layer_name_sm, 1, 4), 
                                                MM = substr(layer_name_sm, 6, 7))

write.csv(sm_regkriging_cv_75_10fold_output, file = paste0('cross_validation_regkriging_75_', 
                                                            layer_name_nasmd, '.csv'))

50% of valid pixels

acc_report_50_regkrig <- accuracy_report_50[c(1,4,5,9)]

acc_report_50_regkrig[1,4] <- 10
acc_report_50_regkrig[1,2] <- round(cor(regkrig_50_predict@validation$observed, 
                                         regkrig_50_predict@validation$var1.pred) ^2, 3)
acc_report_50_regkrig[1,3] <- round(rmse(regkrig_50_predict@validation$observed, 
                                          regkrig_50_predict@validation$var1.pred) ,3)

write.csv(acc_report_50_regkrig, file = paste0('cv_regkriging_50_', layer_name_sm, '.csv'))

sm_regkriging_cv_50_10fold_output <- as.data.frame(regkrig_50_predict@validation)
sm_regkriging_cv_50_10fold_output <- transform(sm_regkriging_cv_50_10fold_output, 
                                                YY = substr(layer_name_sm, 1, 4), 
                                                MM = substr(layer_name_sm, 6, 7))

write.csv(sm_regkriging_cv_50_10fold_output, file = paste0('cross_validation_regkriging_50_', 
                                                            layer_name_nasmd, '.csv'))

Generilized LInear Model (GLM)

This section generates soil moisture predictions over the region of interest, based on the correlation between soil moisture and meteorological parameters such as total monthly precipitation and monthly average of daily minimum temperature records.

Generation of training subsets

Training subsets with different percentage of valid data. These are the inputs for the multiple regression model.

####100% of available data
#This section project the data used for kriging interpolation to a lat-lon reference system that matches with the covariates (precipitation, minimum temperature) references system. 100% of available valid pixels in the region of interest.
sm_raster_krig <- spTransform(sm_raster_krig, CRSobj = wgs84)
covariates_values_100 <- extract(covariates,sm_raster_krig) 

lm_train_100 <- cbind(as.data.frame(sm_raster_krig, xy=TRUE), 
                      data.frame(covariates_values_100))  
lm_train_100 <- data.frame(sm =lm_train_100[1], prcp= lm_train_100[5], 
                           tmin=lm_train_100[6])

####75% of available pixels
#This section project the data used for kriging interpolation to a lat-lon reference system that matches with the covariates (precipitation, minimum temperature) references system. 75% of available valid pixels in the region of interest.
Train75 <- spTransform(Train75, CRSobj = wgs84)
covariates_values_75 <- extract(covariates,Train75) 

lm_train_75 <- cbind(as.data.frame(Train75, xy=TRUE), 
                     data.frame(covariates_values_75))  
lm_train_75 <- data.frame(sm =lm_train_75[1], prcp= lm_train_75[5], 
                          tmin=lm_train_75[6])

####50% of available pixels
#This section project the data used for kriging interpolation to a lat-lon reference system that matches with the covariates (precipitation, minimum temperature) references system. 50% of available valid pixels in the region of interest.
Train50 <- spTransform(Train50, CRSobj = wgs84)
covariates_values_50 <- extract(covariates,Train50) 

lm_train_50 <- cbind(as.data.frame(Train50, xy=TRUE), 
                     data.frame(covariates_values_50))  
lm_train_50 <- data.frame(sm =lm_train_50[1], prcp= lm_train_50[5], 
                          tmin=lm_train_50[6])

###Input and output matrices, Creation of input matrices for model generation, as well as matrices for outputs writing.
sm_and_predictors <- data.frame(sm_monthly_ref[3], prcp_montlhy_ref[3], 
                                tmin_montlhy_ref[3])

sm_monthly_outputs <- sm_monthly_ref
sm_monthly_outputs[4] <- NA
names(sm_monthly_outputs)[4] <- 'predicted_soil_moist'

100% of available pixels

ctrl1 <- trainControl(method = "repeatedcv", repeats = 1, number=10, savePredictions = TRUE)

#Generation of GLM for predicting soil moisture using precipitation and minimum temperature as covariates. 100% of available valid pixels in the region of interest.
glm_100_10fold <- train(soil_moist~., data=lm_train_100, method = "glm", 
                        trControl = ctrl1)

sm_prediction_100_10fold <- predict(covariates, glm_100_10fold)
names(sm_prediction_100_10fold) <- c('Prediction_100_valid_data_10_fold')

plot(sm_prediction_100_10fold$Prediction_100_valid_data_10_fold, main = '100% valid data 10 fold', 
     col=colorRampPalette(c("tan4", "tan2", "yellow1", "yellowgreen", "green", "springgreen2", "steelblue1", "steelblue4"))(20))

###Export of predicted layers with linear regression, to raster files
writeRaster(sm_prediction_100_10fold, file=paste0('glm_100_', layer_name_sm, 
                                                  '.tif'), overwrite = TRUE)

75% of available pixels

#Generation of GLM for predicting soil moisture using precipitation and minimum temperature as covariates. 75% of available valid pixels in the region of interest.
glm_75_10fold <- train(soil_moist~., data=lm_train_75, method = "glm", 
                       trControl = ctrl1)

sm_prediction_75_10fold <- predict(covariates, glm_75_10fold)
names(sm_prediction_75_10fold) <- c('Prediction_75_valid_data_10_fold')

plot(sm_prediction_75_10fold$Prediction_75_valid_data_10_fold, main = '75% valid data 10 fold', 
     col=colorRampPalette(c("tan4", "tan2", "yellow1", "yellowgreen", "green", "springgreen2", "steelblue1", "steelblue4"))(20))

###Export of predicted layers with linear regression, to raster files
writeRaster(sm_prediction_75_10fold, file=paste0('glm_75_', layer_name_sm, 
                                                  '.tif'), overwrite = TRUE)

50% of available pixels

#Generation of GLM for predicting soil moisture using precipitation and minimum temperature as covariates. 50% of available valid pixels in the region of interest.
glm_50_10fold <- train(soil_moist~., data=lm_train_50, method = "glm", trControl = ctrl1)

sm_prediction_50_10fold <- predict(covariates, glm_50_10fold)
names(sm_prediction_50_10fold) <- c('Prediction_50_valid_data_10_fold')

plot(sm_prediction_50_10fold$Prediction_50_valid_data_10_fold, main = '50% valid data 10 fold', 
     col=colorRampPalette(c("tan4", "tan2", "yellow1", "yellowgreen", "green", "springgreen2", "steelblue1", "steelblue4"))(20))

###Export of predicted layers with linear regression, to raster files
writeRaster(sm_prediction_50_10fold, file=paste0('glm_50_', layer_name_sm, 
                                                  '.tif'), overwrite = TRUE)

Cross validation for GLM spatial interpolation model

100% of valid pixels

acc_report_100_glm <- accuracy_report_100[c(1,6,7,10)]

acc_report_100_glm[1,4] <- 10
acc_report_100_glm[1,2] <- round((glm_100_10fold$results['Rsquared']), 3)
acc_report_100_glm[1,3] <- round((glm_100_10fold$results['RMSE']), 3)

write.csv(acc_report_100_glm, file = paste0('cv_glm_100_', layer_name_sm, '.csv'))

sm_regkriging_cv_100_10fold_output <- as.data.frame(regkrig_100_predict@validation)
sm_regkriging_cv_100_10fold_output <- transform(sm_regkriging_cv_100_10fold_output, 
                                               YY = substr(layer_name_sm, 1, 4), 
                                               MM = substr(layer_name_sm, 6, 7))

accuracy_glm_100_10fold <- glm_100_10fold$pred
accuracy_glm_100_10fold <- transform(accuracy_glm_100_10fold, 
                                     YY = substr(layer_name_sm, 1, 4), 
                                     MM = substr(layer_name_sm, 6, 7))

write.csv(accuracy_glm_100_10fold, file = paste0('cross_validation_glm_100_', layer_name_nasmd, '.csv'))

75% of valid pixels

acc_report_75_glm <- accuracy_report_75[c(1,6,7,10)]

acc_report_75_glm[1,4] <- 10
acc_report_75_glm[1,2] <- round((glm_75_10fold$results['Rsquared']), 3)
acc_report_75_glm[1,3] <- round((glm_75_10fold$results['RMSE']), 3)

write.csv(acc_report_75_glm, file = paste0('cv_glm_75_', layer_name_sm, '.csv'))

sm_regkriging_cv_75_10fold_output <- as.data.frame(regkrig_75_predict@validation)
sm_regkriging_cv_75_10fold_output <- transform(sm_regkriging_cv_75_10fold_output, 
                                                YY = substr(layer_name_sm, 1, 4), 
                                                MM = substr(layer_name_sm, 6, 7))

accuracy_glm_75_10fold <- glm_75_10fold$pred
accuracy_glm_75_10fold <- transform(accuracy_glm_75_10fold, 
                                     YY = substr(layer_name_sm, 1, 4), 
                                     MM = substr(layer_name_sm, 6, 7))

write.csv(accuracy_glm_75_10fold, file = paste0('cross_validation_glm_75_', layer_name_nasmd, '.csv'))

50% of valid pixels

acc_report_50_glm <- accuracy_report_50[c(1,6,7,10)]

acc_report_50_glm[1,4] <- 10
acc_report_50_glm[1,2] <- round((glm_50_10fold$results['Rsquared']), 3)
acc_report_50_glm[1,3] <- round((glm_50_10fold$results['RMSE']), 3)

write.csv(acc_report_50_glm, file = paste0('cv_glm_50_', layer_name_sm, '.csv'))

sm_regkriging_cv_50_10fold_output <- as.data.frame(regkrig_50_predict@validation)
sm_regkriging_cv_50_10fold_output <- transform(sm_regkriging_cv_50_10fold_output, 
                                               YY = substr(layer_name_sm, 1, 4), 
                                               MM = substr(layer_name_sm, 6, 7))

accuracy_glm_50_10fold <- glm_50_10fold$pred
accuracy_glm_50_10fold <- transform(accuracy_glm_50_10fold, 
                                    YY = substr(layer_name_sm, 1, 4), 
                                    MM = substr(layer_name_sm, 6, 7))

write.csv(accuracy_glm_50_10fold, file = paste0('cross_validation_glm_50_', layer_name_nasmd, '.csv'))

Field data validation

Correlation between original satellite data (100%, 75%, 50%) and field data from the NASMDB

Correlation between values from pixels in the original satellite data and field data, is intended to establish the reference correlation value expected between modeled data, both kriging and linear models, and field data.

In the first step, all valid data for the specified month are extracted from the North America Soil Moisture Data Base, this data will be correlated with satellite values over pixels with original satellite data.

#Preparation of references layers from satellite data, subsetting different percentages of valid data (100%, 75%, 50%)
Train100_raster <- sm_raster_krig
gridded(Train100_raster) <- TRUE
Train100_raster <- raster(Train100_raster)

Train75_raster <- Train75
gridded(Train75_raster) <- TRUE
Train75_raster <- raster(Train75_raster)

Train50_raster <- Train50
gridded(Train50_raster) <- TRUE
Train50_raster <- raster(Train50_raster)

#Generation of tables to record correlation outputs between satellite data with different percentage of 
#valid pixels (100%, 75%, 50%) and the data from the nasmd over the same valid pixels

satellite_field_correlation <- matrix(data = NA, nrow =1 , ncol =6) 
satellite_field_correlation <- as.data.frame(satellite_field_correlation)
headers <- c('ID', 'Layer', 'Correlation', 'RMSE', 'Number_Stations', 'Number_Points')
names(satellite_field_correlation) <- headers
satellite_field_correlation$ID <- i
satellite_field_correlation$Layer <- layer_name_sm

satellite_field_correlation_100 <- satellite_field_correlation
satellite_field_correlation_75 <- satellite_field_correlation
satellite_field_correlation_50 <- satellite_field_correlation

100% of valid pixels

validation_csv_values_100 <- na.omit(read.csv(nasmd_files[[i]]))

if(length(validation_csv_values_100$Station_ID) != 0){
  
  coordinates(validation_csv_values_100) <- ~Longitude+Latitude
  proj4string(validation_csv_values_100) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0") 
  validation_csv_values_100 <- spTransform(validation_csv_values_100, 
                                           CRS(projection(raster_reference)))
  
}

validation_csv_values_100$extract_values <- extract(Train100_raster, 
                                                    validation_csv_values_100)
validation_table_satellite_100 <- as.data.frame(validation_csv_values_100)

no_stations <- length(validation_table_satellite_100$Station_ID)
no_points <- na.omit(validation_table_satellite_100$extract_values)
no_points <- length(no_points)

correlation <- cor(validation_table_satellite_100$mean_sm_depth_5cm,
                   validation_table_satellite_100$extract_values, 
                   use = 'pairwise.complete.obs')
RMSE <- rmse(na.omit(validation_table_satellite_100$mean_sm_depth_5cm),
             na.omit(validation_table_satellite_100$extract_values))

satellite_field_correlation_100[1, 2] <- layer_name_nasmd
satellite_field_correlation_100[1, 3] <- correlation
satellite_field_correlation_100[1, 4] <- RMSE
satellite_field_correlation_100[1, 5] <- no_stations
satellite_field_correlation_100[1, 6] <- no_points

#Correlation output
write.csv(satellite_field_correlation_100, file = paste0('satellite_nasmd_100_validation_', layer_name_nasmd, '.csv'))

75% of valid pixels

validation_csv_values_75 <- na.omit(read.csv(nasmd_files[[i]]))

if(length(validation_csv_values_75$Station_ID) != 0){
  
  coordinates(validation_csv_values_75) <- ~Longitude+Latitude
  proj4string(validation_csv_values_75) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0") 
  validation_csv_values_75 <- spTransform(validation_csv_values_75, 
                                          CRS(projection(raster_reference)))
  
}

validation_csv_values_75$extract_values <- extract(Train75_raster, 
                                                   validation_csv_values_75)
validation_table_satellite_75 <- as.data.frame(validation_csv_values_75)

no_stations <- length(validation_table_satellite_75$Station_ID)
no_points <- na.omit(validation_table_satellite_75$extract_values)
no_points <- length(no_points)

correlation <- cor(validation_table_satellite_75$mean_sm_depth_5cm, 
                   validation_table_satellite_75$extract_values, 
                   use = 'pairwise.complete.obs')
RMSE <- rmse(na.omit(validation_table_satellite_75$mean_sm_depth_5cm),
             na.omit(validation_table_satellite_75$extract_values))

satellite_field_correlation_75[1, 2] <- layer_name_nasmd
satellite_field_correlation_75[1, 3] <- correlation
satellite_field_correlation_75[1, 4] <- RMSE
satellite_field_correlation_75[1, 5] <- no_stations
satellite_field_correlation_75[1, 6] <- no_points

#Correlation output
write.csv(satellite_field_correlation_75, file = paste0('satellite_nasmd_75_validation_', layer_name_nasmd, '.csv'))

50% of valid pixels

validation_csv_values_50 <- na.omit(read.csv(nasmd_files[[i]]))

if(length(validation_csv_values_50$Station_ID) != 0){
  
  coordinates(validation_csv_values_50) <- ~Longitude+Latitude
  proj4string(validation_csv_values_50) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0") 
  validation_csv_values_50 <- spTransform(validation_csv_values_50, 
                                          CRS(projection(raster_reference)))
  
}

validation_csv_values_50$extract_values <- extract(Train50_raster, 
                                                   validation_csv_values_50)
validation_table_satellite_50 <- as.data.frame(validation_csv_values_50)

no_stations <- length(validation_table_satellite_50$Station_ID)
no_points <- na.omit(validation_table_satellite_50$extract_values)
no_points <- length(no_points)

correlation <- cor(validation_table_satellite_50$mean_sm_depth_5cm, 
                   validation_table_satellite_50$extract_values, 
                   use = 'pairwise.complete.obs')
RMSE <- rmse(na.omit(validation_table_satellite_50$mean_sm_depth_5cm),
             na.omit(validation_table_satellite_50$extract_values))

satellite_field_correlation_50[1, 2] <- layer_name_nasmd
satellite_field_correlation_50[1, 3] <- correlation
satellite_field_correlation_50[1, 4] <- RMSE
satellite_field_correlation_50[1, 5] <- no_stations
satellite_field_correlation_50[1, 6] <- no_points

write.csv(satellite_field_correlation_50, file = paste0('satellite_nasmd_50_validation_', layer_name_nasmd, '.csv'))
##Export of points used for validation between satellite data and field data, to shapefile. Also validation tables are exported to CSV files, as well as correlation results
#100%
writeOGR(obj = validation_csv_values_100, dsn = paste0('validation_shapes_100_', layer_name_nasmd), 
         layer = paste0('validation_reference_100_', layer_name_nasmd), 
         driver = 'ESRI Shapefile', overwrite_layer = TRUE)

write.csv(validation_table_satellite_100, 
          file = paste0('validation_reference_satellite_100_', layer_name_nasmd, '.csv'))

#75%
writeOGR(obj = validation_csv_values_75, dsn = paste0('validation_shapes_75_', layer_name_nasmd), 
         layer = paste0('validation_reference_75_', layer_name_nasmd), 
         driver = 'ESRI Shapefile', overwrite_layer = TRUE)

write.csv(validation_table_satellite_75, 
          file = paste0('validation_reference_satellite_75_', layer_name_nasmd, '.csv'))

#50%
writeOGR(obj = validation_csv_values_50, dsn = paste0('validation_shapes_50_', layer_name_nasmd), 
         layer = paste0('validation_reference_50_', layer_name_nasmd), 
         driver = 'ESRI Shapefile', overwrite_layer = TRUE)

write.csv(validation_table_satellite_50, 
          file = paste0('validation_reference_satellite_50_', layer_name_nasmd, '.csv'))

Validation of Ordinary kriging model outputs

Correlation between Ordinary kriging model outputs (100%, 75%, 50%) and field data from the North American Soil Moisture Database.

Ordinary Kriging 100% of valid data

validation_csv_values_100 <- na.omit(read.csv(nasmd_files[[i]]))

if(length(validation_csv_values_100$Station_ID) != 0){
  
  coordinates(validation_csv_values_100) <- ~Longitude+Latitude
  proj4string(validation_csv_values_100) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0") 
  validation_csv_values_100 <- spTransform(validation_csv_values_100, 
                                           CRS(projection(raster_reference)))
  
}

validation_csv_values_100$extract_values <- extract(ordkrig_pred_100, validation_csv_values_100)
validation_table_ordkriging_100 <- as.data.frame(validation_csv_values_100)

no_stations <- length(validation_table_ordkriging_100$Station_ID)
no_points <- na.omit(validation_table_ordkriging_100$extract_values)
no_points <- length(no_points)

correlation <- cor(validation_table_ordkriging_100$mean_sm_depth_5cm, 
                   validation_table_ordkriging_100$extract_values, 
                   use = 'pairwise.complete.obs')
RMSE <- rmse(na.omit(validation_table_ordkriging_100$mean_sm_depth_5cm),
             na.omit(validation_table_ordkriging_100$extract_values))

ordkriging_field_correlation_100 <- satellite_field_correlation
ordkriging_field_correlation_100[1, 2] <- layer_name_nasmd
ordkriging_field_correlation_100[1, 3] <- correlation
ordkriging_field_correlation_100[1, 4] <- RMSE
ordkriging_field_correlation_100[1, 5] <- no_stations
ordkriging_field_correlation_100[1, 6] <- no_points

write.csv(ordkriging_field_correlation_100, file = paste0('ordkriging_nasmd_100_validation_', layer_name_nasmd, '.csv'))

#Export of validation tables from kriging models to CSV files, as well as correlation results
write.csv(validation_table_ordkriging_100, file = paste0('validation_reference_ordkriging_100', layer_name_nasmd, '.csv'))

Ordinary Kriging 75% of valid data

validation_csv_values_75 <- na.omit(read.csv(nasmd_files[[i]]))

if(length(validation_csv_values_75$Station_ID) != 0){
  
  coordinates(validation_csv_values_75) <- ~Longitude+Latitude
  proj4string(validation_csv_values_75) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0") 
  validation_csv_values_75 <- spTransform(validation_csv_values_75, 
                                           CRS(projection(raster_reference)))
  
}

validation_csv_values_75$extract_values <- extract(ordkrig_pred_75, validation_csv_values_75)
validation_table_ordkriging_75 <- as.data.frame(validation_csv_values_75)

no_stations <- length(validation_table_ordkriging_75$Station_ID)
no_points <- na.omit(validation_table_ordkriging_75$extract_values)
no_points <- length(no_points)

correlation <- cor(validation_table_ordkriging_75$mean_sm_depth_5cm, 
                   validation_table_ordkriging_75$extract_values, 
                   use = 'pairwise.complete.obs')
RMSE <- rmse(na.omit(validation_table_ordkriging_75$mean_sm_depth_5cm),
             na.omit(validation_table_ordkriging_75$extract_values))

ordkriging_field_correlation_75 <- satellite_field_correlation
ordkriging_field_correlation_75[1, 2] <- layer_name_nasmd
ordkriging_field_correlation_75[1, 3] <- correlation
ordkriging_field_correlation_75[1, 4] <- RMSE
ordkriging_field_correlation_75[1, 5] <- no_stations
ordkriging_field_correlation_75[1, 6] <- no_points

write.csv(ordkriging_field_correlation_75, file = paste0('ordkriging_nasmd_75_validation_', layer_name_nasmd, '.csv'))

#Export of validation tables from kriging models to CSV files, as well as correlation results
write.csv(validation_table_ordkriging_75, file = paste0('validation_reference_ordkriging_75', layer_name_nasmd, '.csv'))

Ordinary Kriging 50% of valid data

validation_csv_values_50 <- na.omit(read.csv(nasmd_files[[i]]))

if(length(validation_csv_values_50$Station_ID) != 0){
  
  coordinates(validation_csv_values_50) <- ~Longitude+Latitude
  proj4string(validation_csv_values_50) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0") 
  validation_csv_values_50 <- spTransform(validation_csv_values_50, 
                                          CRS(projection(raster_reference)))
  
}

validation_csv_values_50$extract_values <- extract(ordkrig_pred_50, validation_csv_values_50)
validation_table_ordkriging_50 <- as.data.frame(validation_csv_values_50)

no_stations <- length(validation_table_ordkriging_50$Station_ID)
no_points <- na.omit(validation_table_ordkriging_50$extract_values)
no_points <- length(no_points)

correlation <- cor(validation_table_ordkriging_50$mean_sm_depth_5cm, 
                   validation_table_ordkriging_50$extract_values, 
                   use = 'pairwise.complete.obs')
RMSE <- rmse(na.omit(validation_table_ordkriging_50$mean_sm_depth_5cm),
             na.omit(validation_table_ordkriging_50$extract_values))

ordkriging_field_correlation_50 <- satellite_field_correlation
ordkriging_field_correlation_50[1, 2] <- layer_name_nasmd
ordkriging_field_correlation_50[1, 3] <- correlation
ordkriging_field_correlation_50[1, 4] <- RMSE
ordkriging_field_correlation_50[1, 5] <- no_stations
ordkriging_field_correlation_50[1, 6] <- no_points

write.csv(ordkriging_field_correlation_50, file = paste0('ordkriging_nasmd_50_validation_', layer_name_nasmd, '.csv'))

#Export of validation tables from kriging models to CSV files, as well as correlation results
write.csv(validation_table_ordkriging_50, file = paste0('validation_reference_ordkriging_50', layer_name_nasmd, '.csv'))

Validation of Regression kriging model outputs

Correlation between Regression kriging model outputs (100%, 75%, 50%) and field data from the North American Soil Moisture Database.

Regression Kriging 100% of valid data

validation_csv_values_100 <- na.omit(read.csv(nasmd_files[[i]]))

if(length(validation_csv_values_100$Station_ID) != 0){
  
  coordinates(validation_csv_values_100) <- ~Longitude+Latitude
  proj4string(validation_csv_values_100) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0") 
  validation_csv_values_100 <- spTransform(validation_csv_values_100, 
                                           CRS(projection(raster_reference)))
  
}

validation_csv_values_100$extract_values <- extract(regkrig_pred_100, validation_csv_values_100)
validation_table_regkriging_100 <- as.data.frame(validation_csv_values_100)

no_stations <- length(validation_table_regkriging_100$Station_ID)
no_points <- na.omit(validation_table_regkriging_100$extract_values)
no_points <- length(no_points)

correlation <- cor(validation_table_regkriging_100$mean_sm_depth_5cm, 
                   validation_table_regkriging_100$extract_values, 
                   use = 'pairwise.complete.obs')
RMSE <- rmse(na.omit(validation_table_regkriging_100$mean_sm_depth_5cm),
             na.omit(validation_table_regkriging_100$extract_values))

regkriging_field_correlation_100 <- satellite_field_correlation
regkriging_field_correlation_100[1, 2] <- layer_name_nasmd
regkriging_field_correlation_100[1, 3] <- correlation
regkriging_field_correlation_100[1, 4] <- RMSE
regkriging_field_correlation_100[1, 5] <- no_stations
regkriging_field_correlation_100[1, 6] <- no_points

write.csv(regkriging_field_correlation_100, file = paste0('regkriging_nasmd_100_validation_', layer_name_nasmd, '.csv'))

#Export of validation tables from kriging models to CSV files, as well as correlation results
write.csv(validation_table_regkriging_100, file = paste0('validation_reference_regkriging_100', layer_name_nasmd, '.csv'))

Regression Kriging 75% of valid data

validation_csv_values_75 <- na.omit(read.csv(nasmd_files[[i]]))

if(length(validation_csv_values_75$Station_ID) != 0){
  
  coordinates(validation_csv_values_75) <- ~Longitude+Latitude
  proj4string(validation_csv_values_75) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0") 
  validation_csv_values_75 <- spTransform(validation_csv_values_75, 
                                          CRS(projection(raster_reference)))
  
}

validation_csv_values_75$extract_values <- extract(regkrig_pred_75, validation_csv_values_75)
validation_table_regkriging_75 <- as.data.frame(validation_csv_values_75)

no_stations <- length(validation_table_regkriging_75$Station_ID)
no_points <- na.omit(validation_table_regkriging_75$extract_values)
no_points <- length(no_points)

correlation <- cor(validation_table_regkriging_75$mean_sm_depth_5cm, 
                   validation_table_regkriging_75$extract_values, 
                   use = 'pairwise.complete.obs')
RMSE <- rmse(na.omit(validation_table_regkriging_75$mean_sm_depth_5cm),
             na.omit(validation_table_regkriging_75$extract_values))

regkriging_field_correlation_75 <- satellite_field_correlation
regkriging_field_correlation_75[1, 2] <- layer_name_nasmd
regkriging_field_correlation_75[1, 3] <- correlation
regkriging_field_correlation_75[1, 4] <- RMSE
regkriging_field_correlation_75[1, 5] <- no_stations
regkriging_field_correlation_75[1, 6] <- no_points

write.csv(regkriging_field_correlation_75, file = paste0('regkriging_nasmd_75_validation_', layer_name_nasmd, '.csv'))

#Export of validation tables from kriging models to CSV files, as well as correlation results
write.csv(validation_table_regkriging_75, file = paste0('validation_reference_regkriging_75', layer_name_nasmd, '.csv'))

Regression Kriging 50% of valid data

validation_csv_values_50 <- na.omit(read.csv(nasmd_files[[i]]))

if(length(validation_csv_values_50$Station_ID) != 0){
  
  coordinates(validation_csv_values_50) <- ~Longitude+Latitude
  proj4string(validation_csv_values_50) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0") 
  validation_csv_values_50 <- spTransform(validation_csv_values_50, 
                                          CRS(projection(raster_reference)))
  
}

validation_csv_values_50$extract_values <- extract(regkrig_pred_50, validation_csv_values_50)
validation_table_regkriging_50 <- as.data.frame(validation_csv_values_50)

no_stations <- length(validation_table_regkriging_50$Station_ID)
no_points <- na.omit(validation_table_regkriging_50$extract_values)
no_points <- length(no_points)

correlation <- cor(validation_table_regkriging_50$mean_sm_depth_5cm, 
                   validation_table_regkriging_50$extract_values, 
                   use = 'pairwise.complete.obs')
RMSE <- rmse(na.omit(validation_table_regkriging_50$mean_sm_depth_5cm),
             na.omit(validation_table_regkriging_50$extract_values))

regkriging_field_correlation_50 <- satellite_field_correlation
regkriging_field_correlation_50[1, 2] <- layer_name_nasmd
regkriging_field_correlation_50[1, 3] <- correlation
regkriging_field_correlation_50[1, 4] <- RMSE
regkriging_field_correlation_50[1, 5] <- no_stations
regkriging_field_correlation_50[1, 6] <- no_points

write.csv(regkriging_field_correlation_50, file = paste0('regkriging_nasmd_50_validation_', layer_name_nasmd, '.csv'))

#Export of validation tables from kriging models to CSV files, as well as correlation results
write.csv(validation_table_regkriging_50, file = paste0('validation_reference_regkriging_50', layer_name_nasmd, '.csv'))

Validation of GLM outputs

Correlation between GLM outputs (100%, 75%, 50%) and field data from the North American Soil Moisture Database

GLM 100% of valid data

validation_csv_values_100 <- na.omit(read.csv(nasmd_files[[i]]))

if(length(validation_csv_values_100$Station_ID) != 0){
  
  coordinates(validation_csv_values_100) <- ~Longitude+Latitude
  proj4string(validation_csv_values_100) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0") 
  validation_csv_values_100 <- spTransform(validation_csv_values_100, 
                                           CRS(projection(raster_reference)))
  
}

validation_csv_values_100$extract_values <- extract(sm_prediction_100_10fold, 
                                                    validation_csv_values_100)
validation_table_glm_100_10fold <- as.data.frame(validation_csv_values_100)

write.csv(validation_table_glm_100_10fold, file = paste0('validation_reference_glm_100_', 
                                                                 layer_name_nasmd, '.csv'))

no_stations <- length(validation_table_glm_100_10fold$Station_ID)
no_points <- na.omit(validation_table_glm_100_10fold$extract_values)
no_points <- length(no_points)

correlation <- cor(validation_table_glm_100_10fold$mean_sm_depth_5cm, 
                   validation_table_glm_100_10fold$extract_values, 
                   use = 'pairwise.complete.obs')
RMSE <- rmse(na.omit(validation_table_glm_100_10fold$mean_sm_depth_5cm),
             na.omit(validation_table_glm_100_10fold$extract_values))

glm_field_correlation_100_10fold <- satellite_field_correlation
glm_field_correlation_100_10fold[1, 2] <- layer_name_nasmd
glm_field_correlation_100_10fold[1, 3] <- correlation
glm_field_correlation_100_10fold[1, 4] <- RMSE
glm_field_correlation_100_10fold[1, 5] <- no_stations
glm_field_correlation_100_10fold[1, 6] <- no_points

write.csv(glm_field_correlation_100_10fold, file = paste0('glm_10fold_nasmd_100_validation_', layer_name_nasmd, '.csv'))

GLM 75% of valid data

validation_csv_values_75 <- na.omit(read.csv(nasmd_files[[i]]))

if(length(validation_csv_values_75$Station_ID) != 0){
  
  coordinates(validation_csv_values_75) <- ~Longitude+Latitude
  proj4string(validation_csv_values_75) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0") 
  validation_csv_values_75 <- spTransform(validation_csv_values_75, 
                                          CRS(projection(raster_reference)))
  
}

validation_csv_values_75$extract_values <- extract(sm_prediction_75_10fold, 
                                                   validation_csv_values_75)
validation_table_glm_75_10fold <- as.data.frame(validation_csv_values_75)

write.csv(validation_table_glm_75_10fold, file = paste0('validation_reference_glm_75_', 
                                                                layer_name_nasmd, '.csv'))

no_stations <- length(validation_table_glm_75_10fold$Station_ID)
no_points <- na.omit(validation_table_glm_75_10fold$extract_values)
no_points <- length(no_points)

correlation <- cor(validation_table_glm_75_10fold$mean_sm_depth_5cm, 
                   validation_table_glm_75_10fold$extract_values, 
                   use = 'pairwise.complete.obs')
RMSE <- rmse(na.omit(validation_table_glm_75_10fold$mean_sm_depth_5cm),
             na.omit(validation_table_glm_75_10fold$extract_values))

glm_field_correlation_75_10fold <- satellite_field_correlation
glm_field_correlation_75_10fold[1, 2] <- layer_name_nasmd
glm_field_correlation_75_10fold[1, 3] <- correlation
glm_field_correlation_75_10fold[1, 4] <- RMSE
glm_field_correlation_75_10fold[1, 5] <- no_stations
glm_field_correlation_75_10fold[1, 6] <- no_points

write.csv(glm_field_correlation_75_10fold, file = paste0('glm_10fold_nasmd_75_validation_', layer_name_nasmd, '.csv'))

GLM 50% of valid data

validation_csv_values_50 <- na.omit(read.csv(nasmd_files[[i]]))

if(length(validation_csv_values_50$Station_ID) != 0){
  
  coordinates(validation_csv_values_50) <- ~Longitude+Latitude
  proj4string(validation_csv_values_50) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0") 
  validation_csv_values_50 <- spTransform(validation_csv_values_50, 
                                          CRS(projection(raster_reference)))
  
}

validation_csv_values_50$extract_values <- extract(sm_prediction_50_10fold, 
                                                   validation_csv_values_50)
validation_table_glm_50_10fold <- as.data.frame(validation_csv_values_50)

write.csv(validation_table_glm_50_10fold, file = paste0('validation_reference_glm_50_', 
                                                                layer_name_nasmd, '.csv'))

no_stations <- length(validation_table_glm_50_10fold$Station_ID)
no_points <- na.omit(validation_table_glm_50_10fold$extract_values)
no_points <- length(no_points)

correlation <- cor(validation_table_glm_50_10fold$mean_sm_depth_5cm, 
                   validation_table_glm_50_10fold$extract_values, 
                   use = 'pairwise.complete.obs')
RMSE <- rmse(na.omit(validation_table_glm_50_10fold$mean_sm_depth_5cm),
             na.omit(validation_table_glm_50_10fold$extract_values))

glm_field_correlation_50_10fold <- satellite_field_correlation
glm_field_correlation_50_10fold[1, 2] <- layer_name_nasmd
glm_field_correlation_50_10fold[1, 3] <- correlation
glm_field_correlation_50_10fold[1, 4] <- RMSE
glm_field_correlation_50_10fold[1, 5] <- no_stations
glm_field_correlation_50_10fold[1, 6] <- no_points

write.csv(glm_field_correlation_50_10fold, file = paste0('glm_10fold_nasmd_50_validation_', layer_name_nasmd, '.csv'))

Final correlation report for all models and field data

#Final table comparing all cross-validation outputs as well as validation 
#using ground-truth data
Final_report_cross_validation <- read.csv('Final_report_cross_validation.csv')
Final_report_cross_validation <- Final_report_cross_validation[-1]

Final_report_cross_validation[1,4] <- acc_report_100_ordkrig[1,2]
Final_report_cross_validation[2,4] <- acc_report_75_ordkrig[1,2]
Final_report_cross_validation[3,4] <- acc_report_50_ordkrig[1,2]
Final_report_cross_validation[4,4] <- acc_report_100_regkrig[1,2]
Final_report_cross_validation[5,4] <- acc_report_75_regkrig[1,2]
Final_report_cross_validation[6,4] <- acc_report_50_regkrig[1,2]
Final_report_cross_validation[7,4] <- acc_report_100_glm[1,2]
Final_report_cross_validation[8,4] <- acc_report_75_glm[1,2]
Final_report_cross_validation[9,4] <- acc_report_50_glm[1,2]

Final_report_cross_validation[1,5] <- acc_report_100_ordkrig[1,3]
Final_report_cross_validation[2,5] <- acc_report_75_ordkrig[1,3]
Final_report_cross_validation[3,5] <- acc_report_50_ordkrig[1,3]
Final_report_cross_validation[4,5] <- acc_report_100_regkrig[1,3]
Final_report_cross_validation[5,5] <- acc_report_75_regkrig[1,3]
Final_report_cross_validation[6,5] <- acc_report_50_regkrig[1,3]
Final_report_cross_validation[7,5] <- acc_report_100_glm[1,3]
Final_report_cross_validation[8,5] <- acc_report_75_glm[1,3]
Final_report_cross_validation[9,5] <- acc_report_50_glm[1,3]

write.csv(Final_report_cross_validation, file = paste0('Final_report_cross_validation_', 
                                                       layer_name_nasmd, '.csv'))

Final_report_satellite_field_data <- read.csv('Final_report_satellite_field_data.csv')

Final_report_satellite_field_data[1,3] <- satellite_field_correlation_100[1,3]
Final_report_satellite_field_data[1,4] <- satellite_field_correlation_100[1,4]
Final_report_satellite_field_data[1,5] <- satellite_field_correlation_100[1,6]
Final_report_satellite_field_data[2,3] <- satellite_field_correlation_75[1,3]
Final_report_satellite_field_data[2,4] <- satellite_field_correlation_75[1,4]
Final_report_satellite_field_data[2,5] <- satellite_field_correlation_75[1,6]
Final_report_satellite_field_data[3,3] <- satellite_field_correlation_50[1,3]
Final_report_satellite_field_data[3,4] <- satellite_field_correlation_50[1,4]
Final_report_satellite_field_data[3,5] <- satellite_field_correlation_50[1,6]

write.csv(Final_report_satellite_field_data, file = paste0('Final_report_satellite_field_data_', 
                                                           layer_name_nasmd, '.csv'))

Final_report_model_outputs_field_data <- read.csv('Final_report_model_outputs_field_data.csv')

Final_report_model_outputs_field_data[1,4] <- ordkriging_field_correlation_100[1,3]
Final_report_model_outputs_field_data[1,5] <- ordkriging_field_correlation_100[1,4]  
Final_report_model_outputs_field_data[1,6] <- ordkriging_field_correlation_100[1,6]  
Final_report_model_outputs_field_data[2,4] <- ordkriging_field_correlation_75[1,3]
Final_report_model_outputs_field_data[2,5] <- ordkriging_field_correlation_75[1,4]  
Final_report_model_outputs_field_data[2,6] <- ordkriging_field_correlation_75[1,6]
Final_report_model_outputs_field_data[3,4] <- ordkriging_field_correlation_50[1,3]
Final_report_model_outputs_field_data[3,5] <- ordkriging_field_correlation_50[1,4]  
Final_report_model_outputs_field_data[3,6] <- ordkriging_field_correlation_50[1,6]

Final_report_model_outputs_field_data[4,4] <- regkriging_field_correlation_100[1,3]
Final_report_model_outputs_field_data[4,5] <- regkriging_field_correlation_100[1,4]  
Final_report_model_outputs_field_data[4,6] <- regkriging_field_correlation_100[1,6]  
Final_report_model_outputs_field_data[5,4] <- regkriging_field_correlation_75[1,3]
Final_report_model_outputs_field_data[5,5] <- regkriging_field_correlation_75[1,4]  
Final_report_model_outputs_field_data[5,6] <- regkriging_field_correlation_75[1,6]
Final_report_model_outputs_field_data[6,4] <- regkriging_field_correlation_50[1,3]
Final_report_model_outputs_field_data[6,5] <- regkriging_field_correlation_50[1,4]  
Final_report_model_outputs_field_data[6,6] <- regkriging_field_correlation_50[1,6]

Final_report_model_outputs_field_data[7,4] <- glm_field_correlation_100_10fold[1,3]
Final_report_model_outputs_field_data[7,5] <- glm_field_correlation_100_10fold[1,4]  
Final_report_model_outputs_field_data[7,6] <- glm_field_correlation_100_10fold[1,6]  
Final_report_model_outputs_field_data[8,4] <- glm_field_correlation_75_10fold[1,3]
Final_report_model_outputs_field_data[8,5] <- glm_field_correlation_75_10fold[1,4]  
Final_report_model_outputs_field_data[8,6] <- glm_field_correlation_75_10fold[1,6]
Final_report_model_outputs_field_data[9,4] <- glm_field_correlation_50_10fold[1,3]
Final_report_model_outputs_field_data[9,5] <- glm_field_correlation_50_10fold[1,4]  
Final_report_model_outputs_field_data[9,6] <- glm_field_correlation_50_10fold[1,6]

write.csv(Final_report_model_outputs_field_data, file = paste0('Final_report_model_outputs_field_data_', 
                                                               layer_name_nasmd, '.csv'))

kable(Final_report_cross_validation, digits = 3, caption = "Final Cross Validation Report")
Final Cross Validation Report
Method Data_Perc Folds Correlation RMSE
Ordinary Kriging 100 10 0.794 0.028
Ordinary Kriging 75 10 0.798 0.028
Ordinary Kriging 50 10 0.798 0.028
Regression Kriging 100 10 0.804 0.027
Regression Kriging 75 10 0.797 0.028
Regression Kriging 50 10 0.763 0.030
General Linear Model 100 10 0.573 0.041
General Linear Model 75 10 0.572 0.040
General Linear Model 50 10 0.572 0.042
kable(Final_report_satellite_field_data, digits = 3, caption = "Reference Validation Report, ESA CCI v45 vs NASMD")
Reference Validation Report, ESA CCI v45 vs NASMD
Reference Data_Perc Correlation RMSE Points_pairs
Satellite_(CCI) 100 0.506 0.077 107
Satellite_(CCI) 75 0.544 0.105 88
Satellite_(CCI) 50 0.438 0.097 53
kable(Final_report_model_outputs_field_data, digits = 3, caption = "Field Validation Report, Models Outputs vs NASMD")
Field Validation Report, Models Outputs vs NASMD
Method Data_Perc Folds Correlation RMSE Points_pairs
Ordinary Kriging 100 10 0.528 0.075 107
Ordinary Kriging 75 10 0.523 0.075 107
Ordinary Kriging 50 10 0.511 0.075 107
Regression Kriging 100 10 0.530 0.075 107
Regression Kriging 75 10 0.530 0.075 107
Regression Kriging 50 10 0.484 0.076 107
General Linear Model 100 10 0.417 0.080 107
General Linear Model 75 10 0.416 0.080 107
General Linear Model 50 10 0.414 0.080 107

References

Cressie, Noel. 1990. “The Origins of Kriging.” Mathematical Geology 22 (3): 239-52. https://doi.org/10.1007/BF00889887.

Dorigo, W. A., A. Gruber, R. A M De Jeu, W. Wagner, T. Stacke, A. Loew, C. Albergel, et al. 2015. “Evaluation of the ESA CCI Soil Moisture Product Using Ground-Based Observations.” Remote Sensing of Environment 162 (June): 380-95. https://doi.org/10.1016/j.rse.2014.07.023.

Gareth, James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning. Springer Texts in Statistics. Vol. 103. https://doi.org/10.1007/978-1-4614-7138-7.

Hengl, Tomislav, Gerard B M Heuvelink, and Alfred Stein. 2004. “A Generic Framework for Spatial Prediction of Soil Variables Based on Regression-Kriging.” Geoderma 120 (1-2): 75-93. https://doi.org/10.1016/j.geoderma.2003.08.018.

Hengl, Tomislav, Gerard B M Heuvelink, and David G Rossiter. 2007. “About Regression-Kriging : From Equations to Case Studies” Computers and Geosciences 33 (10): 1301-15. https://doi.org/10.1016/j.cageo.2007.05.001.

Kang, Jian, Rui Jin, and Xin Li. 2015. “Regression Kriging-Based Upscaling of Soil Moisture Measurements from a Wireless Sensor Network and Multiresource Remote Sensing Information over Heterogeneous Cropland.” IEEE Geoscience and Remote Sensing Letters 12 (1): 92-96. https://doi.org/10.1109/LGRS.2014.2326775.

Parker, J a, R V Kenyon, and D E Troxel. 1983. “Comparison of Interpolation Methods for Image Resampling.” IEEE Transactions on Medical Imaging 2 (1): 31-39. https://doi.org/10.1109/42.7784.

Quiring, Steven M., Trent W. Ford, Jessica K. Wang, Angela Khong, Elizabeth Harris, Terra Lindgren, Daniel W. Goldberg, and Zhongxia Li. 2016. “The North American Soil Moisture Database: Development and Applications.” Bulletin of the American Meteorological Society 97 (8): 1441-59. https://doi.org/10.1175/BAMS-D-13-00263.1.

Stein, Michael L. 1999. Interpolation of Spatial Data: Some Theory for Kriging. Springer.

Thornton, M.M., P.E. Thornton, Y. Wei, B.W. Mayer, R.B. Cook, and R.S. Vose. 2018. “Daymet: Monthly Climate Summaries on a 1-Km Grid for North America, Version 3.” Tennessee, USA: ORNL DAAC, Oak Ridge. https://doi.org/10.3334/ORNLDAAC/1345.