Ad Hoc Modeling of Root Zone Soil Water with Landsat Imagery and Terrain and Soils Data.

Agricultural producers require knowledge of soil water at plant rooting depths, while many remote sensing studies have focused on surface soil water or mechanistic models that are not easily parameterized. We developed site-specific empirical models to predict spring soil water content for two Montana ranches. Calibration data sample sizes were based on the estimated variability of soil water and the desired level of precision for the soil water estimates. Models used Landsat imagery, a digital elevation model, and a soil survey as predictor variables. Our objectives were to see whether soil water could be predicted accurately with easily obtainable calibration data and predictor variables and to consider the relative influence of the three sources of predictor variables. Independent validation showed that multiple regression models predicted soil water with average error (RMSD) within 0.04 mass water content. This was similar to the accuracy expected based on a statistical power test based on our sample size (n = 41 and n = 50). Improved prediction precision could be achieved with additional calibration samples, and range managers can readily balance the desired level of precision with the amount of effort to collect calibration data. Spring soil water prediction effectively utilized a combination of land surface imagery, terrain data, and subsurface soil characterization data. Ranchers could use accurate spring soil water content predictions to set stocking rates. Such management can help ensure that water, soil, and vegetation resources are used conservatively in irrigated and non-irrigated rangeland systems.

Landsat multispectral satellite imagery might be used to account for the empirical relationship between evapotranspiration and the spatial distribution of soil water. Landsat imagery has been used to accurately estimate leaf area [6], which in turn should be highly correlated to evapotranspiration [7]. Empirical relationships between evapotranspiration and soil water content are site- and date-specific, but are considerably easier to develop at a ranch scale than mechanistic modeling approaches. Such empirical models avoid the radiometric correction and universal calibration issues that mechanistic remote sensing-based models must confront. Satellite imagery also has been used to directly estimate soil water [e.g., 8], but the characteristics of the imagery have resulted in a focus on surface soil water. Surface soil water is highly important for certain applications, but it is not particularly valuable for estimating plant growth, which is a function of water at plant rooting depths.
DEMs can be used to derive hydrologically important topographic variables such as slope and aspect. Topographic variables can be used to account for relative amounts of evapotranspiration across a landscape [9]. Terrain has been shown to be a better predictor of soil water content in wet versus dry conditions [9,10]. Soil water content in semi-arid Montana environments, however, has been found to have limited correlation with terrain subdivisions and topographic indices [10,11].
Soil water distribution might be more closely related to hydrologically important soil characteristics, such as texture, than to topographic variables in semi-arid Montana rangelands [10]. NCSS soil surveys provide one source of spatially explicit soil attribute data that might be appropriate for modeling soil water at a ranch scale. Soil surveys, however, have limited accuracy [12]. Attribute data is often interpolated and/or extrapolated from a handful of lab characterized pedons for an entire survey area [13]. The addition of site-specific soils data to soil water content models based on soil survey and terrain data has been recommended for future research in semi-arid Montana agricultural systems [10].
The overall goal of this study was to assess the ability of Landsat and ancillary soil and terrain data to accurately model spring soil water content in semi-arid rangelands of the Northern Great Plains (NGP). This was carried out in the context of relatively easily developed, site- and date-specific ad hoc empirical models developed for two Montana ranches. The first objective of the study was to test whether soil water could be predicted accurately with the models. The second objective was to consider the relative influence of Landsat and ancillary predictor variables. Two questions were considered in regard to the second objective: (1) Is Landsat imagery a useful predictor of soil water content at the plant rooting zone? (2) Do the terrain and soil ancillary data sources provide predictive ability in addition to the Landsat imagery?

Site and Methods
Fieldwork and modeling were carried out for two study sites: the Decker/Bales ranch and the BBar ranch (Figure 1). The Decker/Bales ranch, located in southwestern Powder River County in southeastern Montana, is approximately 100 km². The landscape is part of Montana's non-glaciated plains and is characterized by dissected sedimentary layers that form a low relief, fluvially incised landscape. Range vegetation consists of grassland communities of western wheatgrass (Agropyron smithii Rydb.), needle and thread (Stipa comata Trin. & Rupr.), blue grama (Bouteloua gracilis Willd. ex Kunth), and big sagebrush (Artemisia tridentata Nutt.) [14]. Soils include loamy, calcareous Ustorthents formed in siltstones, clayey, calcareous Ustorthents formed in shales, fine to coarse-loamy Haplustalfs formed in slope alluvium, loamy-skeletal Haplustalfs formed in scoria beds, and fine Natrustalfs that are often associated with prairie dog communities [14,15]. The area receives approximately 30 cm of mean annual precipitation, the soil temperature regime is on the boundary between Mesic and Frigid, and the soil moisture regime is on the boundary between Ustic and Aridic [16].
The BBar ranch is approximately 30 km² and is located in northern Sweet Grass County in southcentral Montana (Figure 1). It lies in a valley near the Rocky Mountain front in the westernmost extent of Montana's non-glaciated plains. The landscape consists of rolling, sedimentary bedrock-controlled hills vegetated with grassland communities of western wheatgrass, little bluestem (Andropogon scoparius Michx.), needle and thread, and blue grama [14]. Soils at this site range from fine Argiustolls on backslopes, footslopes, and toeslopes, to loamy-skeletal Ustorthents on summit and shoulder positions, as well as fine Natrustalfs on toeslopes and valley floor positions, and fine and fine-loamy Torrifluvents in drainageways [14,15]. The area receives approximately 35 cm of mean annual precipitation, the soil temperature regime is Frigid, and the moisture regime is Ustic [17].
A statistical power test [18] was performed on a small preliminary data set of depth of moist soil measurements from the Decker/Bales ranch (n = 11). Depth of moist soil for this data set was measured at several routinely monitored pastures by one of the ranch's owners with a Paul Brown push probe in April 2003. The data set had a mean of 74 cm of moist soil and a standard deviation of 9.6 cm of moist soil. The power test showed that 41 points were necessary to be able to detect a significant difference of 7.6 cm of moist soil. At least 41 points for model development and 41 points for validation were targeted for each study area during sampling.
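The sample-size reasoning above can be illustrated with a short sketch. It uses a normal-approximation formula for a two-sided, one-sample test, which is an assumption: the source does not report the test form, significance level, or power used in [18], so `alpha` and `power` below are illustrative defaults and the result is not expected to reproduce the 41 points reported here. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(sd, difference, alpha=0.05, power=0.80):
    """Approximate n needed to detect `difference` given standard deviation `sd`.

    Normal-approximation formula for a two-sided, one-sample test:
        n = ((z_{1 - alpha/2} + z_{power}) * sd / difference) ** 2
    alpha and power are assumed values; the source does not state them.
    """
    if sd <= 0 or difference <= 0:
        raise ValueError("sd and difference must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(((z_alpha + z_power) * sd / difference) ** 2)


# Preliminary Decker/Bales values from the text: sd = 9.6 cm, difference = 7.6 cm.
n_target = required_sample_size(sd=9.6, difference=7.6)
# Halving the detectable difference roughly quadruples the required sample size.
n_finer = required_sample_size(sd=9.6, difference=3.8)
print(n_target, n_finer)
```

The useful property for a manager is the inverse-square relationship: tightening the detectable difference sharply increases the number of calibration points that must be collected.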
Soil survey maps were used to choose a set of sample locations that was representative of the variability in soil type as well as the variability in slope, aspect, landform, and landform position at each ranch. Random points were selected within each soil survey map unit, with at least one location for each named map unit. Navigation to the points was accomplished with a GPS receiver with an accuracy of < 1 m. A hand auger was used to collect soil samples in 10 cm increments to 100 cm depth at each sample location. Samples were not collected from 60-70 cm and 80-90 cm for efficiency.
A total of 100 locations were sampled at the Decker/Bales ranch and 82 locations were sampled at the BBar ranch. Sampling was completed during the first week of May 2004 at the Decker/Bales ranch and during the second week of May 2004 at the BBar ranch. Mass water content (Ө) was measured for each sample and used as the response variable for water content modeling at both study locations.
The satellite imagery predictor variables were bands 1-7 from a Landsat 5 TM scene selected from the previous growing season for each study site. Scenes were selected by proximity to the expected peak of growing season biomass production and by cloud free quality. A scene from 1 August 2003 was selected for the BBar site and a scene from 3 August 2003 was selected for the Decker/Bales ranch. Satellite image band values were extracted for the individual pixels corresponding to each sample site.
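Extracting band values at a sample site amounts to converting the site's map coordinates into a pixel row and column. The sketch below is a minimal illustration, assuming a north-up grid with square 30-m pixels whose origin is the upper-left corner of the upper-left pixel. The function names, coordinates, and band values are hypothetical and are not taken from the study.

```python
def pixel_index(easting, northing, origin_x, origin_y, pixel_size=30.0):
    """Return (row, col) of the pixel containing a map coordinate.

    Assumes a north-up grid; (origin_x, origin_y) is the upper-left corner
    of the upper-left pixel, and pixels are square.
    """
    col = int((easting - origin_x) // pixel_size)
    row = int((origin_y - northing) // pixel_size)
    if row < 0 or col < 0:
        raise ValueError("coordinate lies outside the grid")
    return row, col


def extract_band_values(bands, row, col):
    """Collect the value of every band at one pixel; `bands` maps name -> 2-D list."""
    return {name: grid[row][col] for name, grid in bands.items()}


# Hypothetical 2 x 2 rasters and sample coordinate, for illustration only.
bands = {
    "band3": [[10, 20], [30, 40]],
    "band4": [[50, 60], [70, 80]],
}
row, col = pixel_index(500040.0, 4999990.0, origin_x=500000.0, origin_y=5000000.0)
site_values = extract_band_values(bands, row, col)
print(row, col, site_values)
```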
Terrain predictor variables were derived from a seamless, 30-m USGS DEM. Percent slope and aspect layers were created in ArcGIS using the Spatial Analyst surface functions. Aspect, measured in degrees from north, was transformed to its cosine.
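The cosine transformation turns circular aspect values into a continuous north-south index, so that 359 degrees and 1 degree receive nearly the same value. A minimal sketch, with a hypothetical function name:

```python
import math


def cosine_of_aspect(aspect_degrees):
    """Transform aspect (degrees clockwise from north) to its cosine.

    North-facing slopes approach +1, south-facing slopes approach -1,
    and east- or west-facing slopes are near 0.
    """
    return math.cos(math.radians(aspect_degrees))


north = cosine_of_aspect(0.0)
east = cosine_of_aspect(90.0)
south = cosine_of_aspect(180.0)
print(north, east, south)
```

With this coding, a positive regression coefficient for aspect corresponds to higher predicted values on north-facing slopes.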
Soil survey predictor variables were created from digitized soil maps and associated attribute data. These data have been found to be appropriate in scale for use with Landsat-based imagery [4]. Maps of individual soil characteristics were developed from the soil survey attribute data in ArcGIS. Percent clay content and percent soil organic matter (SOM) were calculated as depth-weighted average values for the entire profile. Maps of depth to root restrictive layer (cm), plant available water holding capacity (AWC) by volume (cm³/cm³), and equivalent depth (cm) of plant available water (PAW) were also created based on soil survey data.
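A depth-weighted average weights each horizon's value by its thickness. The sketch below shows one way this could be computed; the horizon depths and clay percentages are hypothetical, not values from the soil surveys used here.

```python
def depth_weighted_average(horizons):
    """Depth-weighted mean of a soil property over a profile.

    `horizons` is a list of (top_cm, bottom_cm, value) tuples.
    """
    total_thickness = 0.0
    weighted_sum = 0.0
    for top, bottom, value in horizons:
        thickness = bottom - top
        if thickness <= 0:
            raise ValueError("each horizon needs bottom > top")
        total_thickness += thickness
        weighted_sum += thickness * value
    if total_thickness == 0:
        raise ValueError("at least one horizon is required")
    return weighted_sum / total_thickness


# Hypothetical profile: (top cm, bottom cm, percent clay).
profile = [(0, 20, 18.0), (20, 60, 30.0), (60, 100, 25.0)]
clay_percent = depth_weighted_average(profile)
print(clay_percent)
```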
Sample data for each ranch were split randomly into two equally sized data sets (n = 50 for Decker/Bales and n = 41 for BBar) prior to modeling soil water content. One data set for each ranch was used for model calibration and the other for independent validation. Multiple regression models were constructed stepwise by predicting Ө with the possible predictor variables of Landsat bands, DEM-derived slope and aspect layers, and the set of soil survey variables.
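The random split into equal calibration and validation sets can be sketched as follows. The seed, function name, and integer sample identifiers are assumptions for illustration; the source does not describe how the random split was generated.

```python
import random


def split_calibration_validation(sample_ids, seed=0):
    """Randomly split samples into two equally sized, non-overlapping sets."""
    ids = list(sample_ids)
    if len(ids) % 2 != 0:
        raise ValueError("an even number of samples is needed for equal halves")
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    rng.shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]


# 100 Decker/Bales locations and 82 BBar locations, as reported in the text.
decker_cal, decker_val = split_calibration_validation(range(100))
bbar_cal, bbar_val = split_calibration_validation(range(82))
print(len(decker_cal), len(decker_val), len(bbar_cal), len(bbar_val))
```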
Two models were developed, one for each ranch. The best performing models for the calibration data were selected for each ranch. Independent model validation consisted of predicting water content for the reserved data set. Least squares regressions of the validation water content as a function of predicted values were constructed, and scatter plots of the relationship containing points, regression lines, and 1:1 lines (slope = 1, intercept = 0) were examined. Mean squared deviation (MSD) and root mean square deviation (RMSD) were calculated for the predicted versus observed values and the MSD was broken into components of standard bias (SB), non-unity (NU), and lack of correlation (LC) [19].
SB quantifies the proportion of the MSD related to the deviance of the least squares fit from a 1:1 relationship in the y direction (intercept). NU quantifies the proportion of the MSD related to the deviance of the least squares fit from a 1:1 relationship in the slope of the fitted line. LC quantifies the proportion of the MSD related to the scatter of the points in relation to the 1:1 line.
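The MSD partition described above can be sketched in a few lines. This assumes that [19] uses the commonly cited additive form MSD = SB + NU + LC, with the slope and correlation taken from the regression of observed on predicted values; the function name and the data are hypothetical and are not results from this study.

```python
import math


def msd_components(predicted, observed):
    """Partition mean squared deviation into SB, NU, and LC (assumed form).

    SB = (mean(pred) - mean(obs))^2
    NU = (1 - b)^2 * var(pred), b = slope of observed regressed on predicted
    LC = (1 - r^2) * var(obs)
    Variances here are population variances (divided by n).
    """
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    mean_x = sum(predicted) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    syy = sum((y - mean_y) ** 2 for y in observed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, observed))
    if sxx == 0 or syy == 0:
        raise ValueError("predicted and observed values must each vary")
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    msd = sum((x - y) ** 2 for x, y in zip(predicted, observed)) / n
    return {
        "MSD": msd,
        "RMSD": math.sqrt(msd),
        "SB": (mean_x - mean_y) ** 2,
        "NU": (1 - slope) ** 2 * sxx / n,
        "LC": (1 - r_squared) * syy / n,
    }


# Hypothetical mass water contents, for illustration only.
predicted = [0.10, 0.12, 0.15, 0.18, 0.20]
observed = [0.11, 0.10, 0.17, 0.16, 0.23]
parts = msd_components(predicted, observed)
print(parts)
```

Dividing each of SB, NU, and LC by MSD gives the proportion of the total deviation attributable to that component.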
Hypothesis tests were used to test for significant differences between predicted and observed validation soil water content. Levene's test was used to test whether predicted and observed sample populations had significantly different variances [20]. The Mann-Whitney test of the paired predicted and validation samples was used to test whether the mean of the differences between the samples was statistically significantly different from zero [20].

Results and Discussion
Regression models constructed for both study sites (Table 1) contained at least one variable from the Landsat, DEM, and soil data sources. The model developed for the BBar site explained 21% more variability in soil water content than the model developed for the Decker/Bales site.
Table 1. Average soil profile (100 cm) Өm (profile) models using soil survey data that were selected for validation for the BBar and Decker/Bales (D/B) ranches. Band numbers represent Landsat 5 TM bands. Slope is percent topographic slope. Aspect is cosine transformation of topographic aspect in degrees from north. Clay is entire profile weighted average percent clay of soil survey map unit major component. All models are significant at p-value < 0.05 and all variables are significant at p-value < 0.05 unless noted with #. R² values, adjusted for degrees of freedom, are presented.
The model for each site predicted Ө with an average prediction error (RMSD) within 0.04 gravimetric water content (Table 2). This was similar for many of the soils at both study sites to the expected margin of error of 7.6 cm of moist soil predicted using the statistical power test at the outset of the study. No statistically significant difference between predicted and observed water content was found for the BBar model (Table 2). Predicted and observed water contents were statistically significantly different for the Decker/Bales model (Table 2).
Table 3. Correlation coefficients for soil water content (water) and selected predictor variables at the BBar ranch. Bands refer to Landsat 5 Thematic Mapper image bands, awc refers to plant available water holding capacity, clay refers to percent clay content, depth refers to depth to root restrictive layer, som refers to percent soil organic matter, and paw refers to plant available water.
Correlation coefficients for predictor variables and soil water content (Tables 3 and 4) provided insight into the relative influence of the four predictor data sources. The strongest correlations with soil water content at the BBar ranch were for Landsat bands and plant available water holding capacity. At the Decker/Bales ranch, Landsat bands were again the most strongly correlated variables, although band 4 (near infrared) was less so, while slope and soil organic matter were relatively strongly correlated. There was also substantial correlation between Landsat bands and certain other predictor variables, most notably slope and soil organic matter at both sites.
Table 4. Correlation coefficients for soil water content (water) and selected predictor variables at the Decker/Bales ranch. Bands refer to Landsat 5 Thematic Mapper image bands, awc refers to plant available water holding capacity, clay refers to percent clay content, depth refers to depth to root restrictive layer, som refers to soil organic matter, and paw refers to plant available water.
The variables included in the final models were at multiple scales. The thermal band from Landsat 5, for example, was at a coarser scale than the remaining bands and terrain variables (120 m as opposed to 30 m), while soil map units encompass large and varying numbers of 30-m pixels. This meant that, to the extent the thermal band and soils data were included in our final models, the variability that was accounted for by those data was modeled at a coarser scale than sources of variability attributable to other Landsat data and topographic variables. This will affect the precision of any soil water maps produced from these models, but was also inherent in our validation analysis, so does not negate the results presented.

Can soil water be predicted accurately with site-and-date-specific empirical models?
Water content was predicted within the statistically expected average prediction error. The statistical power built into the study resulted in errors that were a substantial percentage of measured soil water, however, primarily because sampling occurred during the sixth year of drought conditions, which resulted in very low average soil water. Greater prediction precision might be necessary for such conditions or for other purposes. This approach, however, enables a manager to weigh the practical advantages of more precise predictions against the number of samples needed. Collecting the samples for this study required one day of sampling per ranch.

Is Landsat imagery a useful predictor of soil water content?
Landsat imagery from the peak of the previous growing season explained the most variability in soil water content of all data sources. The Landsat variables appeared to detect vegetation and ground surface patterns that were related to soil water distribution at both ranches.
Landsat bands 3 and 4 were the most common predictor variables as individual predictors and in interaction terms. The positive near infrared (band 4) coefficient and negative red (band 3) coefficient in the model for the Decker/Bales ranch indicated that locations with more growing season green biomass had higher spring soil water contents, since vegetation reflects strongly in the near infrared and absorbs strongly in red wavelengths [21]. This might suggest that water-collecting landscape positions, or positions that held more water due to soil characteristics such as texture and depth, might have supported both higher plant productivity in the previous growing season and also higher soil water content in the spring. This is the opposite of an evapotranspiration-driven interpretation where areas of lower leaf area with resulting limited evapotranspiration might have been expected to conserve soil water for later seasons, such as in a fallow agricultural field [22].
The negative red (band 3) and near infrared (band 4) coefficients for the model developed at the BBar site suggest that spring soil water content decreased with an increase in both red and near infrared reflectance during the previous growing season. Band 3 and 4 reflectances have been shown to decrease with increased surface soil water content on fallow agricultural surfaces [23]. Reflectance for both bands has been suggested to be high for both increased cover of senescent litter [23] and exposed bare soil [24,25]. The negative coefficients for bands 3 and 4 in the BBar models might suggest that spring soil water content was more related to surface water content, senescent vegetation cover, and/or bare soil than to abundance of healthy green biomass during the previous growing season. The semi-arid, droughty conditions during our study suggest, however, that there was minimal surface water content, so we believe the surface water explanation is unlikely in this instance. The different band 3 and 4 responses between the two sites indicate that, while Landsat imagery can be an important predictor for soil water, a single model might not be appropriate and our approach of site-specific, ad hoc models might be more appropriate.
Landsat thermal response (band 6) was a useful predictor of spring soil water content at Decker/Bales when bands 3 and 4 were in the model. Emittance measured by the thermal band might be influenced by ground surface temperature and water content [21]. Remote sensing in the thermal range has been used to link ground surface temperature with evapotranspiration rates [26]. Plants that have adequate water are cooler than plants under water deficit. High band 6 emittance at the peak of the previous growing season might have suggested higher surface temperatures. This might have indicated water-limited areas or areas with high exposed bare soil and low vegetation cover. Higher evaporation rates can lead to greater soil water depletion and lower soil water content in later seasons [22].

Do the terrain ancillary data sources provide predictive ability in addition to the Landsat imagery?
The slope and aspect variables were more strongly correlated with Landsat and soil predictor variables than with soil water. Slope was substantially correlated with Landsat bands 3 and 4. It was used as a predictor in interaction terms with these variables for the BBar model. Aspect was used by the model developed for the Decker/Bales ranch and was as strongly correlated with Landsat band 7 (middle infrared) as it was with soil water content at this ranch. The positive coefficient for aspect in the Decker/Bales model suggests that southerly aspects had lower soil water contents and northerly aspects had relatively higher soil water contents. This is an expected relationship in the mid-latitudes of the Northern Hemisphere, as south-facing slopes receive more direct solar radiation, resulting in higher evaporation rates [27].
The sign of the coefficient for the percent slope variable at the BBar ranch suggests that water content was lower on steeper slopes. Water content is generally expected to be lower on steeper slopes due to surface and sub-surface flow [9,28] and because steeper slopes tend to have shallower soils due to colluvial processes. The redistribution of soil water by sub-surface flow, however, probably is not substantial in semi-arid environments where soil water content might not be highly influenced by terrain [10,11]. The water content and slope relationship might have been mitigated by soil characteristics such as texture, SOM, and depth, as well as vegetation characteristics [29,30].

Summary and Conclusions
This study found that site- and date-specific, ad hoc empirical models based on Landsat imagery, USGS DEMs, and NCSS soil surveys could be used to predict spring soil water content in Montana semi-arid rangelands within a statistically expected level of accuracy. Drought conditions might require higher precision and, therefore, greater calibration sampling. Soil water in semi-arid environments is generally expected to be highly variable and the level of variability further affects the level of calibration sampling required. This approach, however, is one of general applicability and allows the land manager to evaluate the relative value of increased precision versus sampling intensity prior to beginning the modeling process.
Our results in terms of the performance of the predictor variable data sources provided important insights, especially considering the expected difficulty of modeling soil water in a semi-arid system. Landsat imagery was found to be a strong predictor of soil water relative to the terrain and soil ancillary data sources. Soil survey variables explained additional variability in soil water that was unexplained by the Landsat variables.
Ranchers or their consultants seeking to apply these methods in the future will need to assemble several data sources. Moderate-resolution imagery is needed that includes at least the red and near infrared portions of the spectrum, while our results indicate that other portions of the spectrum, such as thermal infrared, might also be valuable. All other moderate-resolution satellite imagery sources, of which there are many (www.asprs.org/news/satellites/ASPRS_DATABASE_090706.pdf), include both the red and near infrared bands; however, access to moderate-resolution thermal data could be problematic. Both topographic and soils data appear to be important in ad hoc soil water modeling, are publicly available, and are easily accessible (http://gisdata.usgs.net/website/Seamless and http://soildatamart.nrcs.usda.gov, respectively). A statistical sample size test is a necessary prerequisite. This test requires knowledge of the variability of soil water, which can be estimated with a small sample and verified with the full sample, as well as a decision as to the desired level of precision. Once the collected samples have been measured for soil water content, multiple regression models developed on a site- and time-specific basis can be calculated and used for mapping soil water across a given site.
Ranchers in the NGP could benefit from a modeling approach that produces accurate predictions of spring soil water content. Water content predictions could be used to estimate forage production, set stocking rates, and ensure the rangeland resource is neither over- nor under-utilized. Accurate predictions of spring soil water could also be used to conserve and budget water in both non-irrigated and irrigated systems.
Soil water is often modeled via subsurface processes. In the NGP, spring soil water might be more readily empirically relatable to land surface patterns than to subsurface soil characteristics. This study showed that soil water prediction in these environments was successful within the statistical power expected from our models with a combination of land surface imagery, terrain data, and subsurface soil characterization data.