Monitoring Climate Impacts on Annual Forage Production across U.S. Semi-Arid Grasslands

: The ecosystem performance approach, used in a previously published case study focusing on the Nebraska Sandhills, proved to minimize impacts of non-climatic factors (e.g., overgrazing, ﬁre, pests) on the remotely-sensed signal of seasonal vegetation greenness resulting in a better attribution of its changes to climate variability. The current study validates the applicability of this approach for assessment of seasonal and interannual climate impacts on forage production in the western United States semi-arid grasslands. Using a piecewise regression tree model, we developed the Expected Ecosystem Performance (EEP), a proxy for annual forage production that reﬂects climatic inﬂuences while minimizing impacts of management and disturbances. The EEP model establishes relations between seasonal climate, site-speciﬁc growth potential, and long-term growth variability to capture changes in the growing season greenness measured via a time-integrated Normalized Difference Vegetation Index (NDVI) observed using a Moderate Resolution Imaging Spectroradiometer (MODIS). The resulting 19 years of EEP were converted to expected biomass (EB, kg ha − 1 year − 1 ) using a newly-developed relation with the Soil Survey Geographic Database range production data (R 2 = 0.7). Results were compared to ground-observed biomass datasets collected by the U.S. Department of Agriculture and University of Nebraska-Lincoln (R 2 = 0.67). This study illustrated that this approach is transferable to other semi-arid and arid grasslands and can be used for creating timely, post-season forage production assessments. When combined with seasonal climate predictions, it can provide within-season estimates of annual forage production that can serve as a basis for more informed adaptive decision making by livestock producers and land managers. within the 90% conﬁdence interval represent the model prediction variability within the error term of the model. Residuals that fall beyond this limit represent signiﬁcant under-and overperformance anomalies and demonstrate the inﬂuence of land management and disturbances (vertical movement of points). B.K.W., D.D.; M.P.; M.P., B.K.W., M.J.H., B.D.W., D.J.B., Y.A.B., J.D.D., P.A.F., W.H.S., J.D.V., P.W.; visualization, M.P.; supervision, B.K.W., B.D.W., D.J.B.; project administration, funding M.J.H.,


Introduction
Rangelands in the United States cover around 312 million ha representing about 30% of the total land area [1]. Climatic changes, including more frequent and longer-duration droughts and long-term directional shifts in temperature and precipitation seasonality and amounts, can substantially affect forage production [2][3][4][5]. Reduced forage production can lead to lower livestock gains, soil erosion, water quality and quantity issues, and often results in rippling socio-economic impacts like physical and emotional stress [6] or financial and social instability [7]. For example, a widespread severe drought in 2012 led to poor or very poor conditions in 59% of the total U.S. pasture and range areas [8], costing taxpayers over $2.6 billion through record payouts via the U.S. Department of Agriculture's Livestock Forage Program [9]. Despite this drought assistance, many producers experienced losses in welfare that may have persisted for several years [10].
Semi-arid rangelands of the United States have inherently high interannual differences in forage production mostly connected to inter-and intra-annual variability in precipitation [11][12][13]. Matching animal demand to forage availability using adaptive management can improve animal weight gains [14] and increase sustainability of land management. Previous management emphasized fixed moderate to low stocking rates, based on decadalscale research, that optimized weight gains per animal and per unit land area [15] with slow but substantial changes to these stocking rates over long periods of time (e.g., multiple decades) [16]. These fixed stocking rates result in underutilization of forage during years with above-average forage production while also creating difficult and costly grazing management decisions to reduce numbers of livestock or shorten the grazing season in drought years. The use of flexible stocking rates, adjusted seasonally/annually based on forage availability, provides opportunities for more efficient utilization of available forage within and across years and can be advantageous for economic returns [17][18][19][20].
A fundamental need for producer decision-making using adaptive management and flexible stocking rates is the accurate simulation or prediction of forage production prior to and during the grazing season. Modeling efforts are available but the complexity of plant communities in rangelands remains challenging for site-level estimates [21] and cross-site efforts are more problematic [22]. Lack of large-scale, long-term, coordinated ground observations of forage production hinders development of a robust model covering large spatial extents that would rely solely on ground-observed data. Alternatively, advancements in remote sensing and processing of big data have produced emergent tools in the United States for estimating forage production in rangelands at large spatial scales with enhanced accuracy (Table 1). These tools can be generally categorized into two groups based on the timing of the provided information: (1) a post-growing season assessment and (2) a within-growing season prediction of annual forage yield. Characteristics and limitations of these tools are provided in Table 1. Long return intervals of Landsat (16-days) with frequent cloud contamination can introduce errors in these observations [23,24] https://rangelands.app (last accessed 18 November 2021) [25] Grass-Cast Within-growing season prediction production anomaly Process-based model connected with empirical biomass observations.

km Great Plains and Southwest
Coarse spatial resolution when compared to an average size of a ranch. Except for the SDDT, all the tools identified in Table 1 in some way use satelliteobserved Normalized Difference Vegetation Index (NDVI, [27]) data for estimating forage production. While effectively capturing spatial and temporal patterns of greenness, vegetation health, and annual biomass production [28][29][30], NDVI and tools using NDVI do not provide information about the underlying causes of changes in grassland productivity [31].
Many factors affect forage production across spatial and temporal scales. Long-term spatial patterns of forage production are primarily affected by soil characteristics, vegetation type, long-term climate, and topography [11,32,33]. Temporal scale variability, in contrast, is driven primarily by the intra-and interannual variability in climate conditions [12,13,34,35], fire [36], pest outbreaks [37], management [13], and growing conditions of the previous year [38,39]. To limit the focus to the effect of intra-and inter-annual climate variability on forage production, other non-climatic effects that cause variability in forage production during the growing season, such as disturbances caused by overgrazing, fire, flooding, and pests, would need to be minimized or eliminated completely. In a case study from the Nebraska Sandhills, the Expected Ecosystem Performance (EEP) approach [40] was used to separate the climate signal from other annual effects [3]. The amount of expected forage production was estimated solely based on seasonal climate and site-specific characteristics using a data-driven piecewise regression tree model. This model used a long-term historical NDVI dataset at 250-m spatial and weekly temporal resolutions as a proxy for annual forage production. While the approach was successful in estimating annual forage production based on observed climate, the model was developed over a specific and relatively small ecoregion.
Here, we test the extension of the EPP approach with the piecewise regression tree model on semi-arid grasslands of the western United States. The objectives are as follows: (1) develop an EEP model and historical maps of EEP derived from this model for the years 2000 until 2018 that retain the information from the climate signal but reduce other nonclimatic factors, (2) convert the EEP to a measure of forage biomass (e.g., kg ha −1 ), (3) assess the accuracy of EEP estimates across various sites within the study area using independent ground-based observations, and (4) test whether the model can be used beyond the model training period (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017)(2018). The outcome of our efforts was to ascertain the applicability of EEP for post-growing season assessments and, if used in conjunction with seasonal climate forecasts or scenarios, for timely within-growing season annual forage predictions.

Site Description
The spatial extent of the study and resulting biomass maps cover major grassland areas in western United States ( Figure 1). The eastern border of the study area was determined using the Level II Omernik ecoregion classification [41] that divides the Great Plains into west-central and south-central semi-arid prairies from temperate prairies. Most of the tallgrass temperate prairies, characterized by higher precipitation amounts and resultant greater soil moisture content, were excluded from the analysis because the majority of this grassland type has been converted to cultivated crop land [42]. The western border of the study area is delineated by the mountain ranges of the Sierra Nevada and Cascade Range and the coinciding Level III Omernik ecoregions [41]. We excluded desert ecoregions including the Mojave Basin and Range, Sonoran Basin and Range, Arizona-New Mexico Plateau, and Chihuahuan Desert, where growth dynamics of sparse vegetation cover are difficult for moderate resolution sensors to detect. We divided the study site into two regions that differ in the percentage of herbaceous land cover compared to other land cover classes in each Level III ecoregion ( Figure 1). More detailed description of the differences between these two regions is provided in Appendix A. The study site has a total area of over 3 million km 2 and is diverse in climate, elevation, soil properties, and land cover types. Annual precipitation ranges from~100 mm to 2500 mm, the minimum February temperature from −19 to 11 • C, and the maximum August temperature from 13 to 38 • C. As summarized in The Fourth National Climate Assessment [43], over the last few decades the western part of the United States has seen the largest increases in annual temperatures (over 0.8 • C) when compared to the entire United States. Since the beginning of the 20th century, the United States has experienced changes in precipitation patterns with increases in annual mean precipitation in the Midwest and Great Plains and decreases in the Southwest. These trends are expected to continue in the future [43]. Elevation in the study site ranges from~30 m to 4300 m. The soil orders include, but are not limited to, Mollisols, Entisols, Aridisols, and Alfisols. Most of the area is well drained, while some locations are somewhat excessively drained (e.g., Nebraska Sandhills). Although the area contains many different Land Cover (LC) classes, we focused purely on the grassland/herbaceous class as identified in the National Land Cover Database 2016 (NLCD 2016). Because of the large geographic extent of the study area, livestock management strategies vary based on location-specific characteristics.
Remote Sens. 2021, 13, x FOR PEER REVIEW 5 of 27 [43], over the last few decades the western part of the United States has seen the largest increases in annual temperatures (over 0.8 °C) when compared to the entire United States. Since the beginning of the 20th century, the United States has experienced changes in precipitation patterns with increases in annual mean precipitation in the Midwest and Great Plains and decreases in the Southwest. These trends are expected to continue in the future [43]. Elevation in the study site ranges from ~30 m to 4300 m. The soil orders include, but are not limited to, Mollisols, Entisols, Aridisols, and Alfisols. Most of the area is well drained, while some locations are somewhat excessively drained (e.g., Nebraska Sandhills). Although the area contains many different Land Cover (LC) classes, we focused purely on the grassland/herbaceous class as identified in the National Land Cover Database 2016 (NLCD 2016). Because of the large geographic extent of the study area, livestock management strategies vary based on location-specific characteristics. Figure 1. Location of the study area. Region A represents the Great Plains with higher percentage of herbaceous cover, while Region B represents the southern and western areas with much lower total herbaceous cover. The green areas display the distribution of areas that were identified as suitable for placement of model training points. Locations of ground-based validation are displayed as red dots.

Methods
This study builds upon the methods used in a proof-of-concept study by Poděbradská et al. [3]. We introduce the forage production model methodology but focus primarily on the adjustments made to the methodology of Poděbradská et al. [3] with emphasis on additional steps necessary for regional application at that larger scale that is more variable in climate, land cover, elevation, and land use. Location of the study area. Region A represents the Great Plains with higher percentage of herbaceous cover, while Region B represents the southern and western areas with much lower total herbaceous cover. The green areas display the distribution of areas that were identified as suitable for placement of model training points. Locations of ground-based validation are displayed as red dots.

Methods
This study builds upon the methods used in a proof-of-concept study by Poděbradská et al. [3]. We introduce the forage production model methodology but focus primarily on the adjustments made to the methodology of Poděbradská et al. [3] with emphasis on additional steps necessary for regional application at that larger scale that is more variable in climate, land cover, elevation, and land use.
A data-driven, rule-based Regression Tree (RT) model is used to estimate annual forage production based on a combination of remotely sensed site-specific characteristics and seasonal climate data [3]. RT models provide an accurate representation of complex systems and bring insight to non-linear relations and higher-order interactions that occur between the dependent and independent variables [40]. In RT models, the input data space is stratified into several small multi-dimensional zones. In each zone, a linear regression model is fit to predict the dependent variable using some or all of the independent variables. Each zone is then represented by the definition of the zone (similar to decisions trees) and a corresponding regression equation [44].
The output of our RT model is the EEP, which represents a proxy for the total annual biomass expected to be produced at a specific location under certain climate conditions, omitting other influencing factors like management, fire, or pests, which are not model input variables. The ecosystem performance approach was originally developed for boreal forest ecosystems [40] but has been frequently used in grasslands [3,[45][46][47][48].

Model Inputs
The RT model was designed and run using Cubist ® software [49]. The target (dependent) variable in the RT model was the Growing Season NDVI (hereon called GSN), a proxy of annual vegetation growth, e.g., [26,50,51]. The NDVI is one of the most widely used vegetation indices capturing changes in vegetation greenness. It uses information about the reflectance of the red and near infrared (NIR) portion of the electromagnetic spectrum (Equation (1)), which can be utilized for assessing vegetation health, drought stress, biomass production, and other vegetation-related information [28][29][30].
The GSN is often approximated using an integral of the NDVI collected multiple times between the start and end dates of the growing season [52,53]. In large geographic areas, the start and end dates of the growing season can vary considerably with location and from year to year. To address this, we developed a method to dynamically approximate the GSN without defining the specific start and end dates of the growing season using archived modified NDVI data obtained from the U.S. Geological Survey Earth Observation and Science Center (USGS EROS). The modified NDVI (mNDVI data) consists of Expedited Moderate Resolution Imaging Spectroradiometer (eMODIS) weekly NDVI composites with 250-m spatial resolution [54] from the MODIS Terra collection 5 (2000-2002) and MODIS Aqua collection 6 (2003-2018) that are quality masked, temporally smoothed [55], and scaled to 0-200. The GSN was calculated as the sum of weekly (n = 52) positive differences (negative differences are omitted) between the mNDVI and the scaled NDVI value 120 (hereafter called the base NDVI), which corresponds to a NDVI of 0.2 (0.2 NDVI * 100 + 100). The base NDVI represents a signal from soil background and dead vegetation [56]. The calculation of the GSN is expressed as: This approach ensures that the GSN represents forage growth during the growing season, which varies in start, end, and duration based on geographical location, site specific characteristics, and interannual differences in growing conditions.
The independent (explanatory) variables in the model include: seasonal climate data, site potential, and a long-term mean absolute error of the GSN (MAE GSN). Seasonal climate data were calculated from monthly climate variables (minimum, mean, and maximum temperature, and precipitation) obtained from the PRISM Climate Group [57]. Monthly data, available at a 4-km spatial resolution, were bilinearly interpolated to 250 m and converted to seasonal values (winter-December, January, February; spring-March, April, May; summer-June, July, August) depending on the specific variable-mean of 3 months for temperature variables and sum of 3 months for precipitation. Site poten-tial ( Figures 2 and A1), defined as the mean of annual GSN values that are higher than long-term GSN median, represents the spatial variability in forage production potential governed by local biogeophysical properties that stay relatively stable across time. This method of site potential approximation has been previously used in grassland areas [3,45]. The MAE GSN ( Figure 2) serves as the measure of mean interannual variability in forage growth and has been used in estimates of annual exotic herbaceous cover in the sagebrush ecosystem [58]. These independent variables were selected based on their relationship with plant growth, data availability, and appropriateness of their spatial and temporal resolution. Additionally, these variables correspond to those used in the proof-of-concept case study by Poděbradská et al. [3] and similar studies, e.g., [45][46][47][48]58].
Remote Sens. 2021, 13, x FOR PEER REVIEW 7 of 27 months for temperature variables and sum of 3 months for precipitation. Site potential (Figures 2 and A1), defined as the mean of annual GSN values that are higher than longterm GSN median, represents the spatial variability in forage production potential governed by local biogeophysical properties that stay relatively stable across time. This method of site potential approximation has been previously used in grassland areas [3,45]. The MAE GSN ( Figure 2) serves as the measure of mean interannual variability in forage growth and has been used in estimates of annual exotic herbaceous cover in the sagebrush ecosystem [58]. These independent variables were selected based on their relationship with plant growth, data availability, and appropriateness of their spatial and temporal resolution. Additionally, these variables correspond to those used in the proof-of-concept case study by Poděbradská et al. [3] and similar studies, e.g., [45][46][47][48]58].

EEP, EB, Annual Biomass Deviation, and Percent Normal Biomass
The GSN serves as the predicted (dependent) variable in the EEP model. Site potential, the long-term MAE of the GSN, and seasonal climate variables are used as the explanatory (independent) variables ( Figure 3). Thus, EEP is solely based on climate and longterm site-specific variations and lacks inputs that would explain EEP variability due to management or disturbances included in the model training database. This facilitates the isolation of climate driven dynamics for the EEP predictions. The number of rules and other model parameters were based on the recommendations of Gu et al. [59] that aim to minimize over-and under-fitting tendencies. Model output, specifically, a set of rules and multiple linear regressions, was further used to develop maps of EEP in MapCubist, an internal USGS EROS code for application of Cubist model output. MapCubist can be substituted with open-source python modules (e.g., GDAL, Pickle) as underlying mapping

EEP, EB, Annual Biomass Deviation, and Percent Normal Biomass
The GSN serves as the predicted (dependent) variable in the EEP model. Site potential, the long-term MAE of the GSN, and seasonal climate variables are used as the explanatory (independent) variables ( Figure 3). Thus, EEP is solely based on climate and long-term sitespecific variations and lacks inputs that would explain EEP variability due to management or disturbances included in the model training database. This facilitates the isolation of climate driven dynamics for the EEP predictions. The number of rules and other model parameters were based on the recommendations of Gu et al. [59] that aim to minimize overand under-fitting tendencies. Model output, specifically, a set of rules and multiple linear regressions, was further used to develop maps of EEP in MapCubist, an internal USGS EROS code for application of Cubist model output. MapCubist can be substituted with open-source python modules (e.g., GDAL, Pickle) as underlying mapping applications. The method for delineating the area that was used for placement of model training points and displaying of model results is described in Appendix A (Table A1). applications. The method for delineating the area that was used for placement of model training points and displaying of model results is described in Appendix A (Table A1). The rangeland productivity dataset within the Soil Survey Geographic Database (SSURGO, obtained from https://nrcs.app.box.com/v/soils, last accessed on 18 November 2021) was used to convert a unitless EEP to biomass units for producers ( Figure 3). Biomass values, typically measured in kg ha −1 or lbs acre −1 , are more relevant to rangeland managers and livestock producers than GSN index values. Accordingly, we established a relation between EEP and dry matter biomass values obtained from the SSURGO range production database. The SSURGO database contains three biomass categories that illustrate low, representative, and high production years. We created a dataset of the 25th, 50th, and 75th percentiles from the time-series of annual EEP values (2000-2018) to match these three categories. We extracted SSURGO and EEP values from randomly located points (n = 37,455) within the mapped areas across the three production categories to establish an empirical relation using weighted least squares (WLS) regression (R 2 = 0.7), which was used to convert EEP maps into values of Expected Biomass (EB, kg ha −1 yr −1 of dry biomass matter) for each pixel and year. The WLS regression model, which was chosen instead of an ordinary least squares regression due to an observed heteroscedasticity, was fit to the data. The pixel-based EEP was converted to annual biomass values (in kg ha −1 ) using the resulting regression equation (Equation (3)).
Forage production (kg ha −1 yr −1 ) = 2.18 * EEP + 340. 4 (3) To reduce the observation bias in EEP and SSURGO productivity values, the randomly located points were grouped into 20 clusters of EEP values equally spaced along the range of the values. The mean and standard deviation of the SSURGO productivity was calculated for these clusters. A regression analysis was performed on the cluster EEP and SSURGO data. This approach gives a near-equal importance to data along the entire range of the data values and avoids over-or underestimation biases in the lowest or highest range, which are relatively rare occurrences in the space for time dataset of the regression equation.
The annual biomass deviation was calculated as the EB in a given year minus the long-term biomass median on a pixel basis. The percent anomaly was calculated as the division of EB in a given year by a long-term EB median multiplied by 100. We used the long-term median (as opposed to long-term mean) to calculate the annual biomass deviation and the percent normal biomass. Compared to the mean, the use of the median does The rangeland productivity dataset within the Soil Survey Geographic Database (SSURGO, obtained from https://nrcs.app.box.com/v/soils, last accessed on 18 November 2021) was used to convert a unitless EEP to biomass units for producers (Figure 3). Biomass values, typically measured in kg ha −1 or lbs acre −1 , are more relevant to rangeland managers and livestock producers than GSN index values. Accordingly, we established a relation between EEP and dry matter biomass values obtained from the SSURGO range production database. The SSURGO database contains three biomass categories that illustrate low, representative, and high production years. We created a dataset of the 25th, 50th, and 75th percentiles from the time-series of annual EEP values (2000-2018) to match these three categories. We extracted SSURGO and EEP values from randomly located points (n = 37,455) within the mapped areas across the three production categories to establish an empirical relation using weighted least squares (WLS) regression (R 2 = 0.7), which was used to convert EEP maps into values of Expected Biomass (EB, kg ha −1 yr −1 of dry biomass matter) for each pixel and year. The WLS regression model, which was chosen instead of an ordinary least squares regression due to an observed heteroscedasticity, was fit to the data. The pixel-based EEP was converted to annual biomass values (in kg ha −1 ) using the resulting regression equation (Equation (3)).
Forage production (kg ha −1 yr −1 ) = 2.18 * EEP + 340. 4 (3) To reduce the observation bias in EEP and SSURGO productivity values, the randomly located points were grouped into 20 clusters of EEP values equally spaced along the range of the values. The mean and standard deviation of the SSURGO productivity was calculated for these clusters. A regression analysis was performed on the cluster EEP and SSURGO data. This approach gives a near-equal importance to data along the entire range of the data values and avoids over-or underestimation biases in the lowest or highest range, which are relatively rare occurrences in the space for time dataset of the regression equation.
The annual biomass deviation was calculated as the EB in a given year minus the long-term biomass median on a pixel basis. The percent anomaly was calculated as the division of EB in a given year by a long-term EB median multiplied by 100. We used the long-term median (as opposed to long-term mean) to calculate the annual biomass deviation and the percent normal biomass. Compared to the mean, the use of the median does not assume the data to be normally distributed and this statistical measure is also less affected by outliers.
We used the Standardized Precipitation Index (SPI, [60]) to qualitatively assess the spatial distribution of abnormally dry and wet conditions and compared it to areas of Remote Sens. 2022, 14, 4 9 of 27 abnormally high and low biomass production as estimated by the EEP model. Specifically, we used SPI for August 31 with a 6-month time-step that summarizes the moisture conditions over the period of the growing season (from the beginning of March through to the end of August). The SPI data were obtained from https://www.drought.gov/data-mapstools/us-gridded-standardized-precipitation-index-spi-nclimgrid-monthly (last accessed on 18 November 2021).

Ground Validation
We obtained long-term ground clipping biomass data from multiple sources for 11 sites in five states of the study region ( Figure 1) with details of each found in Table 2. The mean of biomass from multiple clipping locations established in each pasture was used to represent a pasture-wide biomass. The collection of data and the management of the specific data collection pastures differed based on the validation location and its primary research purpose. In some locations exclosure cages were used to completely eliminate grazing pressure on the collected biomass, and in some locations samples were collected from pastures with low grazing pressure, while in others the entire stands were harvested and weighed at the end of the season. These differences can introduce certain inconsistencies when validating the modeled biomass; however, we argue that on a moderate spatial scale these differences are expected to be fairly negligible. The methodology of data collection for each validation site is described in detail in Table 2. We spatially matched the pastures where clipping data were obtained and averaged the pixel biomass values matching the extent of these pastures. A regression and a time-series analysis were performed on the dataset. Data from Texas were excluded from the regression analysis due to a difference in collection methodology (collecting all standing biomass from the stands and weighing non-dried samples), and data from Oklahoma were excluded from the time-series analysis due to a short period of record. We also estimated historical biomass since 1982 (moderate and heavy grazed) and 1991 (light grazed) using our established EB model output by extrapolating the model output using historical climate data and validated these results using long-term ground observations from the USDA Agricultural Research Service research site located near Cheyenne, Wyoming. This site was chosen due to data availability prior to our model training period. We compared the validation results of the historical estimates to estimates created within the model training period (2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017)(2018). Additionally, we also examined the annual sign of trend (increase/decrease) when compared to a previous year's production for all locations of modeled and observed data.

Site Potential and MAE GSN
The site potential captured a productivity gradient over the Great Plains from southeast to northwest (Figure 2). The long-term interannual variability captured by the MAE GSN did not follow the same spatial patterns. Lower, long-term interannual variability was observed in the Middle Rockies, Wyoming Basin, Nebraska Sandhills, and Flint Hills. Higher interannual production variability was observed generally in Texas and South Dakota with spots scattered around the entire study area (Figure 2).

EEP Model
The EEP model was trained on 80% of the total points (n = 277,931) with 20% of the points used for independent model validation. The model was limited to 33 rules and used three committee models (which use mean predictions from multiple model runs to fine-tune the model [63]). Table 3 provides a list of the independent variables used in the model ranked by overall importance and the model evaluation on training and testing datasets. Table 3. Summary of independent variables used in the Expected Ecosystem Performance (EEP) model and their overall importance expressed as the mean of the frequency with which they were used in the model stratification and prediction.

Independent Variable
Overall Importance (%) Plotting EEP model estimates against the GSN observations ( Figure 4) reveals a good fit. The distance between each point and the 1:1 line represents the model residual. Residuals located within the 90% confidence interval represent the model prediction variability within the error term of the model. Residuals that fall beyond this limit represent significant underand overperformance anomalies and demonstrate the influence of land management and disturbances (vertical movement of points).

Figure 4. Growing Season Normalized Difference Vegetation Index (GSN) regressed on Expected
Ecosystem Performance (EEP) predicted from the EEP model with 1:1 line, regression line, and 90% confidence limits (normal, gray color) displayed. Over-(purple color) and underperformance (green color) points located outside of the 90% confidence limits represent the influence of non-climatic and site factors (e.g., overgrazing, fire, pests).

Conversion to Biomass
Annual EEP values serve as a proxy for forage production. The scatter plot of the annual EEP and SSURGO production shows a strong relation (R 2 = 0.70) between these two variables ( Figure 5). We observed some non-linear tendencies in the higher values of the EEP (EEP > 1500). Despite these, the linear fit between EEP and SSURGO productivity was found to be stronger than other non-linear fits (2nd degree polynomial and logarithmic). The observation density indicates that most observations fall along the regression line, while lower densities are associated with higher biomass values and points that are farther away from the regression line. This finding together with the increasing error rate (higher standard deviation for higher values) for higher biomass values might imply a lower predictability of biomass from EEP for high biomass areas. We observed a very strong relation (R 2 = 0.99) between the cluster-averaged variables ( Figure 5).

Figure 4. Growing Season Normalized Difference Vegetation Index (GSN) regressed on Expected
Ecosystem Performance (EEP) predicted from the EEP model with 1:1 line, regression line, and 90% confidence limits (normal, gray color) displayed. Over-(purple color) and underperformance (green color) points located outside of the 90% confidence limits represent the influence of non-climatic and site factors (e.g., overgrazing, fire, pests).

Conversion to Biomass
Annual EEP values serve as a proxy for forage production. The scatter plot of the annual EEP and SSURGO production shows a strong relation (R 2 = 0.70) between these two variables ( Figure 5). We observed some non-linear tendencies in the higher values of the EEP (EEP > 1500). Despite these, the linear fit between EEP and SSURGO productivity was found to be stronger than other non-linear fits (2nd degree polynomial and logarithmic). The observation density indicates that most observations fall along the regression line, while lower densities are associated with higher biomass values and points that are farther away from the regression line. This finding together with the increasing error rate (higher standard deviation for higher values) for higher biomass values might imply a lower predictability of biomass from EEP for high biomass areas. We observed a very strong relation (R 2 = 0.99) between the cluster-averaged variables ( Figure 5). Figure 6. Annual EB maps for the entire study period (2000-2018) can be found in Appendix B ( Figure A2). These maps capture the spatial and temporal variability of biomass across the study area and years of analysis. The EB data are publicly available [64].

EEP was converted to biomass and is displayed for two years as examples in
Annual biomass deviation maps (Figures 6 and A3) provide information of biomass departure from the long-term median value. Two example years of biomass deviation are presented in Figure 6, and all years (2000-2018) can be found in Appendix B ( Figure A3). These maps show areas in specific years where biomass was lower or higher than normal. Remote Sens. 2021, 13, x FOR PEER REVIEW 13 of 27

EB, Annual Biomass Deviation, and Percent Normal Biomass
EEP was converted to biomass and is displayed for two years as examples in Figure  6. Annual EB maps for the entire study period (2000-2018) can be found in Appendix B ( Figure A2). These maps capture the spatial and temporal variability of biomass across the study area and years of analysis. The EB data are publicly available [64].
Annual biomass deviation maps (Figures 6 and A3) provide information of biomass departure from the long-term median value. Two example years of biomass deviation are presented in Figure 6, and all years (2000-2018) can be found in Appendix B ( Figure A3). These maps show areas in specific years where biomass was lower or higher than normal.
The percent anomaly maps (Figures 6 and A4), derived from the EB and long-term median EB, effectively show the spatial distribution of areas where the seasonal climate had a positive or a negative effect on biomass. Example percent anomaly maps from 2012 and 2017 are displayed in Figure 6. Maps for all years (2000-2018) can be found in Appendix B ( Figure A4). In 2012, the biomass deviation map and the percent anomaly map showed low biomass production in the majority of the Great Plains region. On the other hand, in 2017, there was higher than normal biomass in the southern Great Plains and the Great Basin area, while a reduction was observed in the northern Great Plains. The areas of abnormally low and high production correspond to areas of drought and abnormally high precipitation as summarized for the period of the growing season (from the The percent anomaly maps (Figures 6 and A4), derived from the EB and long-term median EB, effectively show the spatial distribution of areas where the seasonal climate had a positive or a negative effect on biomass. Example percent anomaly maps from 2012 and 2017 are displayed in Figure 6. Maps for all years (2000-2018) can be found in Appendix B ( Figure A4). In 2012, the biomass deviation map and the percent anomaly map showed low biomass production in the majority of the Great Plains region. On the other hand, in 2017, there was higher than normal biomass in the southern Great Plains and the Great Basin area, while a reduction was observed in the northern Great Plains. The areas of abnormally low and high production correspond to areas of drought and abnormally high precipitation as summarized for the period of the growing season (from the beginning of March through the end of August) using the SPI with a 6-month time-step ( Figure 6).

Validation of the EEP Model
Comparison of the model-derived and the ground-observed biomass showed a moderately strong relation for the overall relation across time and space (Figure 7). Based on the regression analysis, the model explained 67% of the overall ground-observed variability. beginning of March through the end of August) using the SPI with a 6-month time-step ( Figure 6).

Validation of the EEP Model
Comparison of the model-derived and the ground-observed biomass showed a moderately strong relation for the overall relation across time and space (Figure 7). Based on the regression analysis, the model explained 67% of the overall ground-observed variability.
The time-series of modeled and observed biomass revealed that the model captured the variability in ground-observed biomass (Figure 8) across spatial and temporal dimensions. Underprediction of higher observed ground biomass was more pronounced for more sites in Colorado and Wyoming. The annual trend analysis revealed that the modeled data follow the same trend (decrease/increase) when compared to a previous year of production as the observed data in 75% of cases. The time-series of modeled and observed biomass revealed that the model captured the variability in ground-observed biomass (Figure 8) across spatial and temporal dimensions. Underprediction of higher observed ground biomass was more pronounced for more sites in Colorado and Wyoming. The annual trend analysis revealed that the modeled data follow the same trend (decrease/increase) when compared to a previous year of production as the observed data in 75% of cases.
Time-series and regression analyses were performed separately for three validation sites near Cheyenne, Wyoming, to assess the model predictions beyond the time period of the model training database (Figure 9) using extrapolation of the model output using the historical climate data. The time-series plots ( Figure 9A) capture the similarities and differences in the peaks and troughs between the modeled and ground-observed biomass, while the regression plots ( Figure 9B) show the similarities of the model predictions, with respect to ground-observed data, in the training period and prior to it. The regression lines created for the historical and recent data (1982)(1983)(1984)(1985)(1986)(1987)(1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)       of the model training database (Figure 9) using extrapolation of the model output using the historical climate data. The time-series plots ( Figure 9A) capture the similarities and differences in the peaks and troughs between the modeled and ground-observed biomass, while the regression plots ( Figure 9B) show the similarities of the model predictions, with respect to ground-observed data, in the training period and prior to it. The regression lines created for the historical and recent data (1982-1999 and 2001-2018, respectively) fell within the 90 th confidence limits of the recent data regression (2001-2018).

Discussion and Conclusions
The remoteness of certain U.S. areas makes tracking changes and quantifying impacts of drought on vegetation difficult to monitor [51]. Our findings contribute to this effort by providing a landscape-scale understanding of forage dynamics across major U.S. semiarid grasslands. For this area we developed an EEP model that estimates the amount of annual forage in a specific location based on seasonal and interannual climate variability, reducing the effect on non-climatic influences. The EEP was converted to biomass (kg ha −1 yr −1 ) and compared to ground-based biomass observations resulting in moderately strong relations. The maps of percent anomaly from a long-term median derived from the EEP model capture the effect of interannual climate variability on forage production with respect to a long-term production at a specific location. Because the regression lines created for the historical and recent data ( Figure 9B) are sufficiently similar, we conclude that the model can be used for estimating forage production in years prior and after the model training period (2000-2018).
The ecosystem performance approach, used in this study, has been well established for the observation of ecosystem performance anomalies (difference between observed

Discussion and Conclusions
The remoteness of certain U.S. areas makes tracking changes and quantifying impacts of drought on vegetation difficult to monitor [51]. Our findings contribute to this effort by providing a landscape-scale understanding of forage dynamics across major U.S. semi-arid grasslands. For this area we developed an EEP model that estimates the amount of annual forage in a specific location based on seasonal and interannual climate variability, reducing the effect on non-climatic influences. The EEP was converted to biomass (kg ha −1 yr −1 ) and compared to ground-based biomass observations resulting in moderately strong relations. The maps of percent anomaly from a long-term median derived from the EEP model capture the effect of interannual climate variability on forage production with respect to a long-term production at a specific location. Because the regression lines created for the historical and recent data ( Figure 9B) are sufficiently similar, we conclude that the model can be used for estimating forage production in years prior and after the model training period (2000-2018).
The ecosystem performance approach, used in this study, has been well established for the observation of ecosystem performance anomalies (difference between observed GSN and EEP) caused by fire, management, and other non-climatic factors [40,45,47,48,50]. Our study focused on the opposite process, using the EEP to capture the interannual variability of specific locations caused explicitly by changes in seasonal and interannual climate, while reducing the effect of non-climatic factors, a concept used by Poděbradská et al. [3]. Based on the model evaluation using the testing dataset we conclude that the model prediction was robust with a low degree of over-and underfitting, as well as minimal bias (Figure 4). The model under-and overperformances can be attributed mostly to non-climatic factors such as, fire, pest or disease infestation, land use and land cover changes, and management (e.g., overgrazing). Based on the model attribute usage, we found that the site potential is the most important factor when estimating grass biomass, which is consistent with findings of Wylie et al. [50]. The second most important factor is the accumulated summer precipitation in June, July, and August. These two variables were also found to be the most important in the Nebraska Sandhills, only in the opposite order (summer precipitation the most important, site potential the second most important) [3]. This result is expected because our model represents much larger growth potential variability than represented in the Nebraska Sandhills and therefore can capture more differences in annual biomass production across the area than the summer precipitation.
The site potential captures areas of higher and lower grass growth potential (Figure 2). The spatial patterns of growth potential are consistent with previous findings. For example, the site potential reflects a production gradient across the Great Plains from the more productive south-east region to the less productive north-west region [11], a similar gradient across the Greater Platte River Basin [48,65] and Nebraska Sandhills [3], and depicts similar patterns of growth potential across the Upper Colorado River Basin as found by Gu and Wylie [47].
The MAE of long-term GSN provides insight for areas with high and low interannual variability in forage production. We found an agreement with Reeves et al. [5] in certain locations of low (e.g., Flint Hills, Nebraska Sandhills, and north-central Wyoming) and high (e.g., central and western South Dakota and southern Texas) variability, while there were also differences in our findings (e.g., eastern Texas and eastern Wyoming). Based on the combined information from the site potential and MAE of long-term GSN, locations across the western United States can be categorized into four groups: (1) low growth potential with high interannual variability, (2) low growth potential with low interannual variability, (3) high growth potential with high interannual variability, and (4) high growth potential with low interannual variability. This information can be related to adaptive range management and flexible stocking. For example, areas characterized by low growth potential with high interannual variability (e.g., some areas in Utah, southeast Montana, and western Texas) might benefit from highly flexible livestock operations. On the other hand, areas that are highly productive and with low interannual variability (e.g., Flint Hills) may be able to manage their operations in a more conservative way without implementing highly flexible management strategies.
The results of conversion between unitless EEP and SSURGO show good relations for both the entire dataset and the binned values. The observed noise in this relation can be mostly attributed to the characteristics of the SSURGO dataset. The values of biomass production in SSURGO are determined based on ground observations and further assigned to areas that have the same biophysical properties (e.g., soil associations). This sometimes leads to large areas having the exact same biomass production values, which is unrealistic. Similar results were found by Gu et al. [65] for the Greater Platte River Basin. Gu et al. [65] used a slightly different method for representing the GSN and used only the representative SSURGO values depicting a "normal year" for precipitation and forage production. We observed a non-linear relation between EEP and SSURGO for higher EEP values, due to an underestimation of ecosystem productivity in dense vegetation canopies as found by previous studies [66,67].
We used the time-integrated NDVI as a proxy for annual forage production because it has been widely used both for biomass estimation and in EEP models. However, some studies observed better relations between biomass and other remotely sensed biomass proxies, for example, the Absorbed Photosynthetically Active Radiation (APAR, [68]). Future research could investigate the use of these or other vegetation indices such as the leaf area index, enhanced vegetation index, and APAR in EEP models.
The EB maps captured the wet and dry years as well as the variability in growth potential across the study area (Figures 6 and A2). The maps of annual biomass deviation (Figures 6 and A3) depict the biomass departure from the long-term median, which reflects the effect of seasonal and interannual climate conditions on biomass production. Maps of percent anomaly (Figures 6 and A4) depict the percent departure from the long-term median biomass, which captures the effect of seasonal and interannual climate conditions on biomass production with respect to production at a specific location. These maps are the most intuitive for depicting the below and above normal biomass conditions in different years caused by intra-and interannual climate variability and can be connected to the four groups of site potential and long-term MAE. For example, during the drought year 2012, the biomass in the Flint Hills (eastern Kansas) was around 3500 kg ha −1 . The biomass deviation maps show a negative departure between 250 and 500 kg ha −1 , which is depicted as two of the most severe negative categories, but the percent anomaly maps revealed only a 10-20% departure. The Flint Hills are in a category of high site potential and low interannual variability. On the other hand, low-productive areas in northern Utah showed a slight negative biomass deviation (~−150 kg ha −1 ) while also having a high negative percent anomaly (less than 70% of normal). This is also observed for positive biomass deviations and percent anomalies (e.g., as observed in the same Utah area in 2017, Figure 6). If these maps were provided as predictions during the growing season, livestock producers and land managers would be able to see where the major positive and negative departures from normal are expected to occur, which they could use as a basis for their decision-making (e.g., where to relocate cattle or buy additional feed).
To validate our EEP model together with the conversion to biomass using SSURGO data, we compared the modeled biomass data with long-term grass clipping data from various locations in the study area. The R 2 value of the observed versus modeled biomass indicated a moderately strong relation and was comparable to values observed in other studies [3,26,69]. The ground-observed clipping data were in most cases obtained from multiple small rectangles (0.25-m 2 ) and scaled to the entire pasture, while the modeled data captured much larger areas (62,500-m 2 ) averaged over a few pixels covering the entire pasture. Both techniques can introduce errors that can result in lower R 2 values. In most cases the ground observations were obtained from small areas that were not grazed during the current season, while the modeled data that are based on satellite observations capture a signal from the entire pasture that was under a certain grazing pressure. This can partially explain the bias in the observed versus modeled data. Additional errors could also have been introduced during the conversion of the EEP values to biomass using the SSURGO database. This database is known to contain state and county discontinuities that are caused by differences in survey methods over decades of soil data collection; however, there are emerging techniques that address this issue (e.g., the POLARIS soil series map, [70]). The time-series validation plots reveal that the model reasonably predicts the trend in biomass amount. However, overestimation of very low and underestimation of very high groundobserved biomass can be observed. The time-series and regression analysis on the extended historical data from the Cheyenne, Wyoming, area indicate that the model can be used beyond its training period, creating opportunities for analysis of temporal trends of forage production in the past and future.
Using this modelling method, we established a relation between biomass production and seasonal climate for western U.S. semi-arid grasslands. This relation can be utilized for creating historical and future biomass estimates without the use of remotely sensed data. This is especially valuable for time periods when remote sensing data are not available, for example, before the Landsat era starting in 1972 or for future climate change scenarios. However, these estimates warrant interpretation with caution as some important local characteristics such as species composition and land use can change over longer time periods. Additionally, the older historical gridded climate data are created from a smaller number of weather stations leading to more extrapolation between the observed data and to potential errors in the climate data, an input of the forage model. The model can also be used for early predictions of annual biomass together with seasonal climate forecasts and scenarios to support within-season decision-making of livestock producers and land managers.
Future research could focus on comparison of the annual biomass estimates in this study with other similar approaches like the Rangeland Production Monitoring Service, Rangeland Analysis Platform, and Grass-Cast end-of-season estimates. Grassland biomass growth is affected by many complex processes that occur on the stand level. Further research could be helpful to understand the effect of hot dry summer conditions on the grass dormancy. A better understanding and effective large scale monitoring of the geographic distribution of warm and cool season grasses can help better explain the dynamics of forage production in the future. Because grasslands are not the only component of rangelands, which also include shrublands and savannas, exploring application of this methodology to other rangeland land covers is desirable. The method introduced in this manuscript can be used with global NDVI and climate datasets to develop estimates of forage biomass across global grassland areas. Finally, higher spatial resolution (e.g., 30-m) would lead to an improvement in capturing the within-pasture variability in forage production and would be more relevant for individual livestock operations and from the perspective of fire fuel load modeling. Newly emerging data fusion techniques (e.g., fusion of NDVI observations from Landsat and Sentinel-2) can leverage this type of spatial resolution with high observation frequency, which is important for characterization of growing season phenology [71] and precise estimates of the growing season NDVI [72]. Future efforts could explore the use of these or similar NDVI datasets in the forage production model. However, it is important to note that other datasets used in the forage production model would need to match the spatial resolution of the NDVI data. This is problematic especially for temperature and precipitation data, which are usually produced with a coarser spatial resolution (e.g., 1-km).
of the model input data) that contained a high percentage of herbaceous cover. The signal can therefore be attributed mostly to herbaceous LC. Specifically, we used the NLCD (available on www.mrlc.gov) in multiple epochs (2001, 2003, 2006, 2008, 2011, 2013, and 2016) and the National Gap Analysis Project Land Cover mapping (available on https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap, last accessed on 18 November 2021) for identification of 30-m LC categories. Irrigated land (from Moderate Resolution Imaging Spectroradiometer (MODIS) Irrigated Agriculture available on https://www.usgs.gov/special-topics/monitoring-vegetation-drought-stress/science/ modis-irrigated-agriculture (last accessed on 18 November 2021) was excluded from the suitable training areas, because the relation between vegetation condition and climate can be altered by application of irrigation water. Areas with a high percentage of annual cheatgrass cover (https://rangelands.app/cheatgrass/, last accessed on 18 November 2021) were excluded from training and mapped areas due to difference in seasonality compared to perennial vegetation [73]. Areas with a high amount of elevation variation (from The National Map available on https://nationalmap.gov/elevation.html, last accessed on 18 November 2021) can experience high variability in precipitation and temperature over small linear distances [74] and therefore were excluded due to uncertainties connected to the climate data downscaling process. Finally, alpine tundra (data from the National GAP Land Cover mapping) was excluded from the training and mapped areas due to frequent snow cover and lower responsiveness to the variability in precipitation [51]. The remaining pixels were identified as suitable for our model training database. Table A1 summarizes the criteria based on which pixels were considered suitable for placement of training points and criteria used for displaying the model results (mapped areas), which use slightly less restrictive criteria. The criteria were more restrictive for Region A (Figure 1), which has more continuous herbaceous cover, than Region B, which is characterized by high occurrence of shrubland LC with scattered herbaceous areas. The suitable training areas determined pixels where the model database training points could be placed. The number of points used for the model training database was based on the number of points from Poděbradská et al. [3] and scaled to the total mapped area (n = 347,414). The placement of the training points should be stratified across space, time, and forage production gradient. The number of training points placed in each Omernik Level III ecoregion (L3E, [75]) was determined based on a logarithmically weighted proportion of the mapped area within each ecoregion compared to a total mapped area. This ensured an inclusion of training points from areas with both high and low density of herbaceous cover. To represent a range of possible annual forage production, the GSN in every L3E was divided into 6 groups for every year of our analysis (2000-2018). One-sixth of the training points for each L3E was then randomly placed into the suitable training areas of each productivity category for each year. This ensured that both extremely low and high productivity values are represented equally in the model and theoretically lead to a better estimation of extreme values as recommended by Poděbradská et al. [3]. low density of herbaceous cover. To represent a range of possible annual forage production, the GSN in every L3E was divided into 6 groups for every year of our analysis (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017)(2018). One-sixth of the training points for each L3E was then randomly placed into the suitable training areas of each productivity category for each year. This ensured that both extremely low and high productivity values are represented equally in the model and theoretically lead to a better estimation of extreme values as recommended by Poděbradská et al. [3]. The total unmasked study area was over 3 million km 2 , while the area considered for displaying results was around 550 thousand km 2 or about 17 percent of the entire study area. The area used for placement of model database training points was considerably smaller, around 400 thousand km 2 or 12.5 percent of the entire study area. Larger differences in mapped and training areas were observed for the western part of the study region The total unmasked study area was over 3 million km 2 , while the area considered for displaying results was around 550 thousand km 2 or about 17 percent of the entire study area. The area used for placement of model database training points was considerably smaller, around 400 thousand km 2 or 12.5 percent of the entire study area. Larger differences in mapped and training areas were observed for the western part of the study region (Region B) due to the smaller size of grassland patches and intermixed shrubland components. An example of these differences is provided in Appendix A ( Figure A1