Linking Remotely Sensed Carbon and Water Use Efﬁciencies with In Situ Soil Properties

: The capacity of terrestrial ecosystems to sequester carbon dioxide (CO 2 ) from the atmosphere is expected to be altered by climate change and CO 2 fertilization, but this projection is limited by our understanding of how the soil system interacts with plants. Understanding the soil–vegetation interactions is essential to assess the magnitude and response of terrestrial ecosystems to the changing climate. Here, we used soil proﬁle and satellite data to explore the role that soil properties play in regulating water and carbon use by plants. Data obtained for 19 terrestrial ecosystem sites in a warm temperate and humid climate were used to investigate the relationship between remotely sensed data and soil physical and chemical properties. Classiﬁcation and regression tree results showed that in situ soil carbon isotope ( δ 13 C), and soil order were signiﬁcant predictors (r 2 = 0.39, mean absolute error (MAE) = 0 of 0.175 gC/KgH 2 O) of remotely sensed water use efﬁciency (WUE) based on the Moderate Resolution Imaging Spectroradiometer (MODIS). Soil extractable calcium (Ca), and land cover type were signiﬁcant predictors of remotely sensed carbon use efﬁciency (CUE) based on MODIS and Landsat data-(r 2 = 0.64–0.78, MAE = 0.04–0.06). We used gross primary productivity (GPP) derived from solar-induced ﬂuorescence (SIF) data, based on the Orbiting Carbon Observatory-2 (OCO-2), to calculate WUE and CUE (referred to as WUE SIF and CUE SIF , respectively) for our study sites. The regression tree analysis revealed that soil organic matter and soil extractable magnesium (Mg), δ 13 C, and soil silt content were the important predictors of both WUE SIF (r 2 = 0.19, MAE = 0.64 gC/KgH 2 O) and CUE SIF (r 2 = 0.45, MAE = 0.1), respectively. Our results revealed the importance of soil extractable Ca, soil carbon (S 13 C is a facet of soil carbon content), and soil organic matter predicting CUE and WUE. Insights gained from this study highlighted the importance of biotic and abiotic factors regulating plant and soil interactions. These types of data are timely and critical for accurate predictions of how terrestrial ecosystems respond to climate change.


Introduction
Plant growth is tightly coupled with soil nutrients, particularly nitrogen (N) and phosphorus (P) [1][2][3][4], and limitation of these nutrients is assumed to be one of the drivers of observed spatial variability in ecosystem carbon (C) fluxes [5][6][7], particularly terrestrial Remote Sens. 2021, 13, 2593 3 of 22 with soil depth and with satellite sensors' spatial resolutions. We present a novel approach combining CUE and WUE estimates from satellite GPP, NPP, ET, and SIF with soil properties to understand the regulatory mechanisms and effects of the nutrient cycles. Previous studies have shown that WUE varies with land use type, with forest having the highest WUE and croplands having the lowest WUE [39,53,54]. The relationship between CUE and land use type is more complex and is driven by climate controls, such as temperature and precipitation; CUE is higher in a drier and cooler climate than in a wet and warm climate [55]. In general, forest has lower CUE than grasslands, and crops have medium CUE values [55,56]. Based on previous studies, we hypothesize that the CUE and WUE depend on land use and depth-related soil properties. Moreover, this work elucidates how ecosystems are structured and function based on connections between aboveground vegetation productivity and soil properties.

Site Selection
We collected soil samples from 19 sites spatially distributed across a warm temperate and fully humid climate [57] in the Commonwealth of Kentucky, USA. The region is characterized by a humid subtropical climate, with cool to moderately cold winters and warm to hot summers. The mean annual temperature ranges from 11.7 • C in the northeast to 15 • C in the southwest. The warmest month is August, while the coldest month is January. Mean annual precipitation is~1300 mm in the south and 965 mm in the northern areas, with August the driest month and May the wettest month.
The site selection criteria consisted of two stages. Stage 1 involved the selection of potential sites that represented the major land cover types of Kentucky (i.e., agricultural, grassland, and forest). Stage 2 assessed differences in lithologic and topographic effect (slope) to assure analyses of vegetation-soil interaction were not confounded by parent material and relief. Following our site selection criteria, we identified 19 sites for soil sampling (Table  1). We limited our site selection to two soil orders: Alfisol and Ultisol, which minimized differences due to soil physical, chemical, and mineralogical properties ( Figure 1). These two soil orders at the 19 sites consist of fine-grained parent material that is slight to strongly leached of nutrients in the topsoil. These soils also have clay-enriched B horizons. Unlike surrounding Inceptisols and Entisols, these two soil orders more likely reflect the surrounding long-term climate and ecological conditions. These two soil orders cover approximately 18% of the global glacial free land area according to the Soil Science Society of America (https://www.soils.org/about-soils/basics/types/, accessed on 2 June 2021).

Soil Physical and Chemical Analysis
Continuous soil cores were extracted using a hammer-percussion coring device and the soil horizons were described [58]. The cores were cut in half in the laboratory and an interval-sampling approach was used, where soil samples were extracted from the core every 8 ± 2 cm, to ensure that each horizon was sampled at least once. The soil physical and chemical properties were measured on samples collected to a depth of 60 cm. Particle size, oven-dry bulk density, gravimetric water content, pH, and loss on ignition (LOI) were measured for the 19 sites at Murray State University using previously established methods [59] with modifications [60]. LOI describes the variance in soil organic matter content (SOM) and is herein referred to as OM%. Soil samples were sent to the University of Kentucky Soils Characterization Lab and extractable P, C, Ca, Mg, Zn, and Fe were measured using the Mehlich-3 extraction method [61,62].
We performed soil stable isotopes analysis because they are indicators of soil C and

Soil Physical and Chemical Analysis
Continuous soil cores were extracted using a hammer-percussion coring device and the soil horizons were described [58]. The cores were cut in half in the laboratory and an interval-sampling approach was used, where soil samples were extracted from the core every 8 ± 2 cm, to ensure that each horizon was sampled at least once. The soil physical and chemical properties were measured on samples collected to a depth of 60 cm. Particle size, oven-dry bulk density, gravimetric water content, pH, and loss on ignition (LOI) were measured for the 19 sites at Murray State University using previously established methods [59] with modifications [60]. LOI describes the variance in soil organic matter content (SOM) and is herein referred to as OM%. Soil samples were sent to the University of Kentucky Soils Characterization Lab and extractable P, C, Ca, Mg, Zn, and Fe were measured using the Mehlich-3 extraction method [61,62].
We performed soil stable isotopes analysis because they are indicators of soil C and N dynamics in an ecosystem and can be used to determine the sources of SOM and soil C turnover rates [63,64]. Soil samples were sent to the Stable Isotope Ecosystem Lab of the University of California, Merced (SIELO) for isotope analysis and elemental content of organic C and N. Splits of the soil sample (<2 mm size fraction) were prepared for stable isotope and elemental composition analysis with an initial homogenization step where samples were powdered using mortar and pestle for three to five minutes. One portion of this powdered subsample was placed in centrifuge vials and treated with 1N HCl to remove any inorganic C (calcite), which confounds the organic C signal for isotopic and elemental compositional analysis. After acidification, these subsamples were triple-rinsed in deionized water, then dried overnight at 50 • C. The other split was weighed without acid treatment to preserve δ 15 N values. All powdered samples were weighed to 20-40 mg (based on organic C and N content) into 5 × 9 tin capsules for stable isotope ( 13 C/ 12 C and 15 N/ 14 N) and organic elemental (C and N) analysis. The SIELO analyses content and isotopes of C and N with an Elemental Analyzer (Costech 4010) coupled to a Thermo Scientific Delta V Plus isotope ratio mass spectrometer via conflo IV. Reference materials for elemental and isotopic composition were run with all samples and included at least n = 3 of each Costech acetanilide, Peach Leaf (NIST 1547), and USGS 40. Reproducibility for these reference materials was <0.3‰ for both δ 13 C and δ 15 N values as well as <3% for C and N content ( Table 2). Stable isotope compositions are reported with VPDB and AIR as references for δ 13 C and δ 15 N values, respectively, and in per mille (‰). All C:N ratios are reported as weight percent.
We also used the GPP product (GOSIF GPP) based on SIF derived from the Orbiting Carbon Observatory-2 (OCO-2). GOSIF is a global, OCO-2 based SIF product, while GOSIF GPP is based on GOSIF and linear relationships between GPP and SIF [65,66]. Both products are available at http://globalecology.unh.edu, accessed on 13 May 2021. GOSIF GPP data, refer to as GPP SIF hereafter were resampled from 0.05 • to 250 m spatial resolution using the bilinear interpolation method to match the resolution of GPP 250 .
CUE based on each of the data inputs was calculated as follows: where CUE 30 We used GOSIF GPP along with MODIS NPP to calculate CUE and WUE to explore how SIF-based GPP can be used to examine the relationships of WUE and CUE with soil properties.

Statistical Analysis
To identify and rank the soil chemical and physical properties that are significant in predicting CUE and WUE, we used the classification and regression trees (CART) method (the rpart package in R). The advantage of the regression tree is that it uses recursive partitioning to split the data into nodes or branches. This allows the training data to be split into groups that behave independently without the need for data stratification before analysis. For soil data analysis, whereas soil variables are the results of interdependent soil processes, the CART model is very helpful because it reveals variable interactions [67,68]. Thus, CART can help address the interdependence between soil data with depth and CUE and WUE. The predictors used to estimate WUE and CUE were land cover type, soil order, soil horizon, soil depth, layer thickness, bulk density, silt, sand, clay, OM, EC, pH-CaCl2, pH-H2O (1:1 H 2 O), soil organic carbon (SOC), total C, total N, C:N ratio, δ 13 C, δ 15 N, and extractable soil P, K, Ca, Mg, Zn, and Fe. Prediction errors were estimated by randomly dividing the data to 70% training (n = 80 for WUE and WUE SIF ; n = 72 for CUE 30 , n = 63 for CUE 250 , and n = 76 for CUE SIF ) and 30% testing samples (n = 35 for WUE and WUE SIF ; n = 31 for CUE 30 , n = 28 for CUE 250 , and n = 33 for CUE SIF ). The differences in the sampling sizes for the CUE data were related to the availability of satellite data for our Remote Sens. 2021, 13, 2593 9 of 22 study sites. We built the regression tree using the training sample and then pruned the tree to minimize overfitting by selecting the number of splits in the CART that is associated with a minimized cross-validated error estimate. The CART results were used to identify the soil chemical and physical variables that were important predictors for CUE and WUE. Then, the mean absolute error (MAE) was calculated by using the testing sample with the pruned CART model. We applied the CART for the 30 m and 250 m CUE and WUE to investigate the impact of spatial resolution on the results of CART. Additionally, CART was developed for CUE SIF and WUE SIF . Finally, we analyzed the CART results based on GPP derived from MODIS and SIF data to detect any difference in soil chemical and physical variables identified as important predictors for CUE and WUE. The reason for such comparison is that GOSIF GPP is directly estimated from SIF that is a direct measure of energy emitted by plants during photosynthesis, while MODIS GPP and NPP are based on the light use efficiency model and process-based model, respectively. Thus, they represent two different approaches to estimate CUE and WUE. To test for the effects of land cover type, soil order, soil horizon, and soil depth, we performed four additional CART analyses.: (1) excluding only land cover from the CART model (CART nLC ); (2) excluding only soil order from the CART model (CART nSO ); (3) excluding only soil horizon from the CART model (CART nSH ); and (4) excluding only soil depth from the CART model (CART nSD ). This was performed for each of the WUE and CUE data and the model with the highest r 2 and lowest MAE was selected as the best model. Soil physical and chemical properties were compared to satellite CUE and WUE. To achieve this, we summed the 8-day MODIS GPP and ET and 16-day Landsat GPP to produce annual GPP for the study sites. Both MODIS and Landsat-based NPP are directly available as annual products. Then, we used the annual data to calculate WUE and CUE at the annual timescale.

Soil Data
The study sites sampled for this study (7 grasslands, 6 forestlands, and 6 agricultural sites) are dominated by silty clay and silty clay loam soils ( Figure 2). Forest sites showed the highest soil SOC and the lowest δ15N values in the top 10 cm of the soil, while agricultural soils had the highest P concentrations, which could be due to fertilization (Figure 3). The low forest δ15N in the top 10 cm of the soil can be related to litter inputs and low microbial processing [69] as evident in higher C:N in the soil A horizon (Table 2). In general, soil physical and chemical properties were not significantly different based on Tukey's HSD test for land use types (i.e., forest vs. grasslands vs. agriculture) (see Supplementary Data). We note that one of our sites was formerly agricultural land (currently converted to grassland) with soil extractable P values higher than other grassland sites that were not cultivated ( Figure 3C).
Tukey's HSD test for land use types (i.e., forest vs. grasslands vs. agriculture) (see Supplementary Data). We note that one of our sites was formerly agricultural land (currently converted to grassland) with soil extractable P values higher than other grassland sites that were not cultivated ( Figure 3C).

Soil Variables Versus CUE and WUE
The pruned CART model for MODIS WUE showed that soil δ 13 C was the primary splitting node indicating the role of SOC in regulating WUE since soil δ 13 C is a facet of soil carbon content. MODIS WUE decreased with greater soil δ 13 C values (less negative) and increased with lower soil δ 13 C values ( Figure 4A). The second split was soil order Alfisol that decreased MODIS WUE ( Figure 4A), indicating lower WUE in Alfisol soils. Soil OM was the only split for WUE SIF , showing a negative relationship between soil OM and WUE SIF ( Figure 5A). Table 3 shows the r 2 and MAE values for each of the CART models developed to address the impacts of land cover, soil order, soil horizon, and soil depth. The model with the best model was the one with land cover excluded (CART nLC ) from the analysis for MODIS WUE with r 2 of 0.39 and MAE of 0.17 gC/KgH 2 O ( Table 3). The WUE SIF models showed the highest MAE when all the soil variables were included (Table 3), and thus we selected the model with land cover excluded (CART nLC ) (r 2 = 0.19, MAE = 0.64 gC/KgH 2 O) ( Table 3). We found that MODIS WUE outperformed WUE SIF likely because MODIS had finer spatial resolution (250 m) than the GOSIF GPP (0.05 • ).

Soil Variables Versus CUE and WUE
The pruned CART model for MODIS WUE showed that soil δ 13 C was the primary splitting node indicating the role of SOC in regulating WUE since soil δ 13 C is a facet of soil carbon content. MODIS WUE decreased with greater soil δ 13 C values (less negative) and increased with lower soil δ 13 C values ( Figure 4A). The second split was soil order Alfisol that decreased MODIS WUE ( Figure 4A), indicating lower WUE in Alfisol soils. Soil OM was the only split for WUESIF, showing a negative relationship between soil OM and WUESIF ( Figure 5A). Table 3 shows the r 2 and MAE values for each of the CART models developed to address the impacts of land cover, soil order, soil horizon, and soil depth. The model with the best model was the one with land cover excluded (CARTnLC) from the analysis for MODIS WUE with r 2 of 0.39 and MAE of 0.17 gC/KgH2O (Table 3). The WUESIF models showed the highest MAE when all the soil variables were included (Table 3), and thus we selected the model with land cover excluded (CARTnLC) (r 2 = 0.19, MAE = 0.64 gC/KgH2O) ( Table 3). We found that MODIS WUE outperformed WUESIF likely because MODIS had finer spatial resolution (250 m) than the GOSIF GPP (0.05°).   gC/KgH2O) ( Table 3). We found that MODIS WUE outperformed WUESIF likely because MODIS had finer spatial resolution (250 m) than the GOSIF GPP (0.05°).    The pruned CART model for CUE 30 showed that land cover was the first split, grassland being the main land cover type in the first split ( Figure 6A). The second split was soil extractable Ca that showed a negative relationship with CUE 30 , with CUE 30 increasing at lower Ca values ( Figure 6A), and the CART model had r 2 of 0.64 and MAE of 0.04 (Table 3). Land cover was the first split and the most important predictor for CUE 250 , but with the forest as the land cover predictor rather than a grassland as in CUE 30 ( Figure 7A). CUE 250 decreased with forest with higher soil extractable Ca ( Figure 7A), and the model had r 2 of 0.78 and MAE of 0.06 (Table 3). Since the difference between the original CART and CART nSH for CUE 250 was almost identical, we decided to use the model that included all the soil variables ( Table 3). The pruned CART model for CUE SIF showed that soil δ 13 C was the first split showing a positive relationship with CUE SIF , that the second split was soil silt content with lower CUE SIF at lower soil silt content, and that the third split was soil extractable Mg that a positive relation with CUE SIF ( Figure 8A); the model had r 2 of 0.45 and MAE of 0.1 ( Table 3). The variable importance plots for the CART models are shown in the Supplementary Figures S1-S5. As expected, soil order and soil δ 13 C were the most important predictors of WUE ( Figure S1), while OM was the most powerful predictor of WUE SIF ( Figure S2). Land cover was the most important predictor of CUE 30 and CUE 250 ( Figures S3 and S4). Soil δ 13 C was the most important predictor of CUE SIF ( Figure S5). It is important to note that all other soil variables had some amount of predictive capacity for both WUE and CUE (Figures S1-S5). 0.45 and MAE of 0.1 ( Table 3). The variable importance plots for the CART models are shown in the Supplementary Figures S1-S5. As expected, soil order and soil δ 13 C were the most important predictors of WUE ( Figure S1), while OM was the most powerful predictor of WUESIF ( Figure S2). Land cover was the most important predictor of CUE30 and CUE250 (Figures S3 and S4). Soil δ 13 C was the most important predictor of CUESIF ( Figure  S5). It is important to note that all other soil variables had some amount of predictive capacity for both WUE and CUE (Figures S1-S5).

Impacts of Different Productivity Measures and Spatial Resolution on the Relationship Satellite-Derived CUE and, WUE Withsoil Properties
We applied the CART for MODIS WUE and WUE SIF to investigate the impacts of different productivity measures (based on the MODIS light use efficiency model vs. GPP based on OCO-2 SIF) on the CART results. There were no common variables detected between CART-MODIS and CART-SIF (Figures 4 and 5). We compared GEOSIF GPP and MODIS GPP and found a low correlation (r 2 = 0.19) between these two products ( Figure S7). In general, GOSIF GPP values were much higher than MODIS GPP which was reported to underestimate site measured GPP, particularly for cropland [40,70]. Besides, we applied CART for Landsat CUE, MODIS CUE, and SIF-based CUE to investigate the impact of spatial resolution on the results of CART. The land cover type and soil extractable Ca were the common variables between MODIS CUE and Landsat CUE ( Figures 7A and 8A) highlighting the importance of these soil variables for predicting CUE. No common soil variables were detected among CUE SIF , CUE 30, and CUE 250 (Figures 6-8). This could be due to different algorithms used to estimate MODIS and Landsat GPP compared to the GPP derived from SIF and the different spatial resolutions of these data products (Table 4). Table 4. Summary of the GPP algorithm for the remote sensing data set used in this study.

MODIS 250 m
Light use efficiency model using MODIS 250 m derived leaf area index and the fraction of photosynthetically active radiation [40] Landsat 30 m Light use efficiency model using Landsat 30 m derived leaf area index and the fraction of photosynthetically active radiation [40] GOSIF GPP OCO-2 SIF vs. GPP universal relationships [66,67]

Discussion
Overall, our approach is robust and applicable to our study sites given the significant relationships of soil chemical and physical properties with CUE and WUE. The recursive partitioning that CART is built upon can model WUE and CUE from soil physical and chemical properties by providing a simplified, but yet powerful perspective on the drivers of WUE and CUE. The CART models showed higher predictability for CUE than for WUE likely due to the uncertainties in MODIS ET or other environmental controls on WUE (e.g., vapor pressure deficit). Our results did not reject our hypothesis that CUE depends on the land cover type with grasslands and forest being the most important predictors of CUE 30 and CUE 250 , respectively. Conversely, our results rejected the hypothesis that WUE is dependent on land cover type.
The CART WUE and CUE results revealed indirectly the correlations of soil δ 13 C and OM with soil depth. In general, OM was higher and soil δ 13 C values were lower in the topsoil layers and OM becomes lower and δ 13 C values increase with an increase in soil depth [71], which was also the case for our study sites ( Figure S6). Although CART WUE and CART CUE SIF models did not show that soil depth was an important predictor of WUE and CUE SIF , they did reveal an indirect indication of the soil depth-related properties in regulating WUE and CUE as evident in Figures 4, 5 and 8. Therefore, our results failed to reject our hypothesis that WUE and CUE are independent of soil depth-related properties.
Our analysis revealed the importance of soil extractable Ca and Mg in regulating CUE and WUE and calls for more data collections and synthesis to better understand the potential role that these soil properties play in moderating the effects of climate change on CUE and WUE. Our methodology demonstrated the ability of remotely sensed data to capture the observed relationships of CUE, WUE, NDVI, and EVI with soil and provides a platform for future exploration of the soil-vegetation interactions using satellite vegetation data and in situ soil data.
The regression tree trained using MODIS data relied on soil δ 13 C and soil order to predict WUE (Figure 4). Soil δ 13 C values showed a positive correlation with WUE ( Figure S4A) similar to the observed positive correlations between Leaf δ 13 C and intrinsic WUE [72,73]. This relationship can be attributed to the close relationship between SOC and δ 13 C values [71,74] and the role that SOC plays in controlling soil water availability. Regression tree analysis showed that soil order Alfisol was important for predicting WUE. This could be driven by our three sites with soil order Ultisol that on average had higher WUE than the other sites. In general, the three forest sites with soil order Ultisol had lower soil extractable Ca and K than the other sites. An increase in the uptake of soil extractable Ca and K can enhance WUE by decreasing ET [37]. We hypothesized that higher WUE in these three sites is due to an increase in the uptake of soil extractable Ca and K. We plan to measure leaf nutrient concentration data from our study sites to test this hypothesis in future studies. Soil OM was the only statistically significant predictor of WUE SIF ( Figure  5). Additionally, OM showed a negative correlation with WUE and can indicate structural water in clay minerals and/or SOC and can affect soil water holding capacity. This is because an increase in soil OM is critical for forming soil aggregate that can increase the soil-plant available water and the latter controls the ET rates [75]. These findings indicate that an increase in OM could lead to a decrease in WUE.
The regression trees trained using Landsat and MODIS data were very similar, with both relying on land cover and soil extractable Ca to predict CUE. Land cover was the most important predictor for CUE highlighting the difference between forest and grassland ecosystems, which was similar to the observed pattern in MODIS-derived CUE [33,55]. The importance of soil extractable Ca for predicting CUE highlighted the close interactions between plant N uptake, soil N pools, and soil extractable Ca [76,77], and an increase in soil extractable Ca increases GPP and eventually decreased CUE. The CUE increased with a decrease in soil extractable Mg because of the role that Ca plays in regulating physiological responses related to plant stress and growth [37,78]. For instance, the deficiency in soil Ca was related to an increase in root biomass and a decrease in respiration [78], which can lead to an increase in NPP and potentially CUE. The importance of the soil δ 13 C as a predictor of the CUE SIF has not been reported before mainly because of limited site CUE data. As more soil δ 13 C data becomes available, more in-depth studies can potentially further probe into the linkages between soil δ 13 C values and vegetation WUE and CUE. The CART CUE SIF model showed a positive relationship between CUE SIF and soil extractable Mg. Soil Mg is involved in various processes that affect photosynthesis and plant growth, such as chlorophyll a and b production and root and shoot biomass [37,79,80]. A recent meta-analysis study found that an increase in soil extractable Mg leads to an increase in NPP and biomass [81], which can eventually increase CUE as predicted by our CART results.
The regression tree analysis revealed relationships of soil chemical and physical properties with WUE and CUE. The CART analysis allowed us to: unravel the interactions among various soil properties and their and combined effects on WUE and CUE; and to identify soil variables that are most important for predicting WUE or CUE. Our approach provided an understanding of the interrelationships within the datasets (e.g., soil depth) more easily than using, for example, simple or multiple regression analysis. It should be noted that the results shown in Figure 4, Figure 5, Figure 6, Figure 7, Figure 8 did not mean that these variables were the only soil variables important for predicting WUE and CUE, but rather that the CART method found these variables to be able to minimize the variability in WUE and CUE between all the tree nodes. Figures S1-S5 show the ranking of the importance of soil variables in predicting WUE and CUE. Thus, all the other soil variables (Figures S1-S5) had certain predictive power for WUE and CUE suggesting the need for more comprehensive soil-vegetation analysis to better untangle the complex soil-vegetation processes that underlie WUE and CUE.
All soil chemical and physical variables included in this study have direct or indirect effects on vegetation processes, including plant growth, WUE, GPP, and ET [37,79,80,82]. Soil extractable Zn and C:N ratio were the only soil variables that did not show any predictive power for either WUE or CUE. (Figures S1-S5). Soil δ 15 N values appeared once as a variable that had predictive power for WUE ( Figure S1). Thus, for our study sites, these soil variables were not important in minimizing the errors in CUE or WUE between the CART tree nodes.
We collected soil samples from two soil orders to minimize the effects of soil order on soil chemical properties. We expect that our soil chemical and physical data to have little variability as they were collected from the center of Landsat 30 m pixels with a homogenous land cover type. This is less likely to be the case for the coarse-resolution MODIS (250 m) and GOSIF GPP (0.05 • ) data, as individual grid cells could contain multiple land cover types. With time, vegetation type and land use could alter soil properties and result in different soil chemical concentrations, leading to heterogeneous soil properties in a pixel with multiple land cover types. Moreover, soil chemical properties vary significantly as a function of topographic position. Although we controlled for land cover type (for Landsat data), we did not control for topographic variability. In addition, the scale mismatch between Landsat and MODIS can result in differences in the land cover type assigned for calculating GPP and NPP. Thus, MODIS GPP and NPP values might not represent the same land cover type as Landsat derived GPP and NPP. The scale mismatch among site, MODIS, and GOSIF GPP could influence the results since no common soil variables were detected between MODIS and GOSIF GPP CART results for CUE and WUE. Moreover, the difference in algorithms MODIS and GOSIF GPP products (Table 4) could have contributed to the differences in our results as evident in the low correlation between MODIS GPP and GOSIF GPP.
Finally, in implementing our approach, we limited the soil sampling to two soil orders to control for the effect of the variability in soil type on the results. Limiting our sampling to two soil orders could affect the broad application of our results. We anticipated that the observed relationships of CUE and WUE with soil chemical properties could apply to these two soil orders in different climatic regions, but more research is needed to verify the correlations that we observed. The robustness of our results is dependent on the accuracy of satellite data, used to estimate CUE and WUE. Newer, higher spatial resolution data on WUE from ECOSTRESS have the potential to significantly refine these results further. ECOSTRESS produces an operational WUE product at 70 × 70 m spatial resolution every 1-5 days from the International Space Station [42,83]. ECOSTRESS measures thermal infrared radiance, which is converted into a land surface temperature and emissivity (L2_LSTE) product; L2_LSTE is then converted into ET (L3_ET_PT-JPL) using the Priestley-Taylor Jet Propulsion Laboratory (PT-JPL) method [84]. ET is then combined with GPP from MOD17 to create the WUE product (L4_WUE); newer versions of ECOSTRESS replace MOD17 with a native 70 m GPP based on the Breathing Earth System Simulator [85]. ECOSTRESS can detect spatial variability in WUE never seen before ( Figure S8). These differences are evident throughout the study region landscape, especially in relation to topography, hydrology, and land cover, and land use. Nevertheless, to our knowledge, our effort is one of the few that relate satellite data to soil properties for investigating the soil-vegetation interactions. CUE and WUE terms are important indicators of plants' adaptability to changing environmental conditions, such as precipitation and temperature. Understanding changes in CUE and WUE is thus critical to quantify the response of the terrestrial ecosystem to climate change.

Conclusions
CUE and WUE are key physiological parameters linking carbon and water cycles. Understanding changes in these efficiencies is critical to quantify future terrestrial ecosystem responses to climate change. Here, we presented a different approach to understand the stand-level soil-vegetation interactions by using remotely sensed data. Data from the 19 sites representative of the major land cover types occurring in a warm-temperate and humid climate were used to investigate the capability of satellite data to derive meaningful information about the soil-vegetation interactions. The results revealed significant relationships between satellite-derived CUE and WUE and measured soil chemical properties. The CART model identified soil variables that were important to predict CUE and WUE, implying that satellite data have the potential for elucidating soil-vegetation interactions.
We have learned from this study that satellite spatial resolution was not important for more accurate detection of relationships between satellite vegetation products and soil chemical properties. Interestingly, coarser-resolution MODIS data produce the best estimates for CUE than high-resolution remote sensing products (e.g., Landsat CUE). Our study is limited by the number of sites used and the reported relationships may differ with more data or additional biomes or soil types. Nevertheless, our study represents a new approach that can be used to better understand the observed spatial variability in satellite-derived CUE and WUE data and can be used to inform land surface models about missing processes that are essential for more accurate predictions of vegetation CUE and WUE [86].
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/rs13132593/s1. Figures S1-S5: Variable importance ranking based on the total reduction in mean square error (MSE) for calibrated CART CUE and WUE models, Figure S6: variability of soil carbon isotope and organic matter with soil depth, Figure S7: regression analysis for the relationship between MODIS GPP and GOSIF GPP for our study sites. Figure S8: Spatial image of WUE from ECOSTRESS. and supplementary data showing the Tukey's HSD test for our study sites.

Data Availability Statement:
The data presented in this study is available upon request from the corresponding author.