Application of Normalized Radar Backscatter and Hyperspectral Data to Augment Rangeland Vegetation Fractional Classification



Introduction
Rangeland ecosystems in the western United States are vulnerable to climate change, fire, and anthropogenic disturbances, yet available geospatial data for assessing trends in condition or fire risk, for example, may not be sufficient to inform management practices. To address this need, scientists from the U.S. Geological Survey (USGS) and Bureau of Land Management (BLM) developed the Rangeland Condition Monitoring Assessment and Projection (RCMAP) project [1][2][3]. RCMAP provides robust, long-term, and floristically detailed maps of vegetation cover across western North America using Landsat imagery and machine learning from 1985 to 2023 at yearly time-steps. In the most recent version [4], the RCMAP product suite consists of 10 fractional components, including annual herbaceous, bare ground, herbaceous, litter, non-sagebrush shrub, perennial herbaceous, sagebrush (Artemisia spp.), shrub, tree, and shrub height, in addition to the temporal trends of each component.
Classification of rangeland areas tends to be difficult due to frequently sparse vegetation canopies that increase the influence of soils and senesced vegetation, the overall abundance of senesced vegetation, heterogeneity of life forms (shrub versus herbaceous cover versus tree), and limited ground-based data [5][6][7]. Moreover, biological soil crusts, high spatial heterogeneity, and irregular, often short-duration peak phenology pose challenges [7][8][9]. Differentiation among various rangeland plant communities is often hampered by similar spectral absorption profiles [10]. Additional classification challenges exist in rangeland burns (e.g., [11]). Although satellite imagery does detect burn influence, the post-fire signal reflects a faster recovery relative to field observations [11]. Accordingly, error rates are likely to be higher than average in recent (<~10 years) burns and in areas with unique spectral properties due to soil mineralogy, albedo, and increased annual grasses and forbs. Rangelands are characterized by slow vegetation growth and substantial natural and anthropogenic disturbances, resulting in an often noisy temporal signal, with high inter-annual variation in weather and frequent disturbances from fire. Most pixels in the RCMAP products show at least some change over the time-series [12,13]. Thus, the decoupling of ecosystem function and structure from optical reflectance remains one of the biggest challenges in rangeland mapping [7]. Within the context of global change, accurate measurements of rangeland interannual dynamics at a large spatiotemporal scale are important to improve our understanding of the variability and change in rangelands [2].
RCMAP validation relative to high-resolution imagery (World View imagery [13]), field observations, and long-term monitoring plots shows moderate to strong relationships [1]. Consistent patterns emerge from the analyses: (1) spatial patterns are better characterized than temporal ones; (2) bare ground, herbaceous, and tree cover have lower error relative to shrub, sagebrush, and annual herbaceous cover; (3) modeling tends to generalize relative to field observations, resulting in prediction inflation and deflation at the low and high ends of the data distribution, respectively; (4) accuracy tends to be lower in more arid locations with higher bare ground (similar to [8]). Of particular concern to the current study are (1) improved discrimination between herbaceous and shrub cover and (2) to a lesser extent, improved discrimination between tree and shrub cover.
Additional sources of data, including space-borne solar-induced chlorophyll fluorescence (SIF), thermal infrared (TIR), light detection and ranging (lidar), and hyperspectral data, present new information to augment multispectral data in classification [7,14]. In the current study, we focus on Synthetic Aperture Radar (SAR) data from Sentinel 1 and Earth Surface Mineral Dust Source Investigation (EMIT) hyperspectral data as new sources of information. Hyperspectral data offer the promise of discriminating among vegetation communities that are indiscernible in multispectral data [10,15]. Hyperspectral data can provide information on vegetation, including water, nitrogen, chlorophyll, carotenoid, and xanthophyll concentrations, leading to more robust classification [7]. The focus of the EMIT mission is mapping surface mineralogy in desert dust source regions [16], which has proven to be successful for gypsum and other materials [17]. However, EMIT data can also provide data on more general spectral properties [18]. Indeed, the hyperspectral feature space of EMIT was found to be much more informative than that of multispectral sensors, with good separation of vegetation spectral end members and detection of subtle differences in vegetation pigmentation, cellulose/lignin absorption, and mesophyll reflectance [18]. Finally, Dennison et al. [19] demonstrated the potential of EMIT to enhance classification of non-photosynthetic vegetation (NPV).
SAR data are sensitive to the physical properties of the land surface, such as moisture content, roughness, and vegetation structure, which makes SAR data useful for monitoring vegetation type, structure, and cover in rangelands [20][21][22][23][24]. Moreover, SAR data are not limited by saturation issues found in spectral data [24]. SAR data can be acquired in different polarizations, such as HH (horizontal transmit and horizontal receive), VH (vertical transmit and horizontal receive), and VV (vertical transmit and vertical receive) [25][26][27]. There is no clear consensus on the best SAR polarization for rangelands [24,25,28,29]. The choice of polarization depends on the vegetation type and structure being observed and the specific application of the data. The HH polarization is commonly used in rangeland studies due to its sensitivity to rough surfaces and vegetation canopies. However, some studies have shown that using multiple polarizations, including VH and VV, can provide additional information and improve classification accuracy [29][30][31][32].
Here, we focus on C-band (centered at 5.405 GHz) VV and VH SAR (processed as normalized radar backscatter) data from Sentinel 1. SAR operates in different frequency bands, including L, C, X, Ku, and Ka. Each band has different characteristics, and the choice of band depends on the specific application and the desired spatial resolution [24,33,34]. C-band SAR (between 4 and 8 GHz) is more sensitive to changes in surface roughness and can penetrate vegetation to a limited extent, making it useful for monitoring soil moisture and surface deformation. The C-band has a higher spatial resolution than the L-band, making it more suitable for detecting smaller features [35,36]. C-band SAR data have been used to detect soil moisture content in arid regions, which could be useful for rangeland management [37]. In some cases, the use of multiple bands can provide more comprehensive information about the rangeland environment [37][38][39]. For example, multiple bands can help distinguish between different vegetation types and provide information on soil moisture, which is important for rangeland management [40]. Ultimately, each band has its unique strengths and weaknesses.
The superspectral imaging capacity of Landsat NEXT (LNEXT), with a scheduled launch date of ~2030, will be greatly enhanced relative to legacy multispectral Landsat data. The 26 spectral bands proposed in LNEXT include all 11 bands from legacy Landsat data for continuity, 5 with a capacity similar to Sentinel 2 to support data fusion, and 10 new bands designed to support applications in water quality, crop mapping, snow research, and mineral mapping [41]. Specific enhancements of LNEXT are in the shortwave infrared (SWIR) 2 band, which has improved specifications for mapping NPV due to the upgrade from one to three bands [19]. Relative to vegetation mapping, the new yellow, red edge, liquid water, and SWIR 2 bands may be useful in the detection of leaf chlorosis and vegetation stress, leaf area index, chlorophyll content, vegetation water content, and cellulose [41].
Hyperspectral data have greater potential to retrieve information on surface condition; however, their availability has remained limited [42]. Previous work [43] has demonstrated the challenges with hyperspectral data, notably their limited spatio-temporal footprint and the higher dimensionality driving the need for increased training samples (i.e., the Hughes phenomenon, [44]). However, mapping accuracy has generally been higher with hyperspectral compared to multispectral imagery. For example, Bostan et al. [43] observed a classification accuracy of 80% for crop type mapping using hyperspectral Hyperion imagery compared to 70% with Landsat. Similarly, Govender et al. [45] found generally improved classification accuracy of various tree species using Hyperion imagery relative to multispectral imagery. Sankey et al. [46] used hyperspectral imagery and lidar to map rangeland plant functional types, finding improved accuracy and discriminatory power relative to equivalent testing using multispectral data. Specifically, the authors found 36% overall accuracy for a nine-class map with multispectral imagery, 71% with hyperspectral imagery, and 87% with hyperspectral imagery with lidar fusion. The latter finding indicates the potential importance of height indicators to improve mapping accuracy, which is accomplished using SAR in the current study.
Many surface targets and vegetation species have unique absorption features that are better distinguished using hyperspectral compared to multispectral imagery. For example, Mutanga and Kumar [47] used hyperspectral imagery (HyMAP) to estimate grass phosphorus concentration, finding a unique absorption feature at 2015-2019 nanometers (nm). Sibanda et al. [48] underscore the importance of the red edge wavelengths in mapping rangeland management practices. Lu et al. [42] found that the red edge and red spectra were important for mapping vegetation chlorophyll content.
Though prior studies have shown the potential accuracy improvements with hyperspectral or SAR data compared to multispectral data, none have applied the combination of all three to the challenging domain of rangeland fractional vegetation cover mapping. The main objective of this case study is to apply several new data streams (Sentinel SAR and EMIT hyperspectral data) to the RCMAP model, with the goal of understanding the effects of additional modalities/spectral resolution on fractional component classification accuracy. Additionally, we use EMIT data to replicate 20 of the proposed LNEXT bands and test our models with these data. We strive to determine how the additional data can contribute to mapping of RCMAP components utilizing an Artificial Intelligence/Machine Learning (AI/ML) approach. Given the findings of previous researchers, we expect to obtain an accuracy improvement with both EMIT and SAR. Finally, to help explain the changes in classification accuracy with the additional datasets, we carried out a spectral profile analysis to illustrate the efficacy of EMIT and Landsat in discriminating among rangeland vegetation types.

Study Area
Our study area includes 56,311 km² in southwest Montana and adjacent portions of northeast Idaho and northwestern Wyoming (Figure 1). This highly diverse region includes several forested (mostly coniferous) mountain ranges with subalpine to alpine environments at the highest elevations dominated by grasses and forbs, but additionally includes stands of various species of big sagebrush (Artemisia tridentata spp.). Valleys tend to be more arid and are dominated by various mixtures of bunch grasses and sagebrush, with some irrigated meadows and hayfields. Invasion of annual grasses (e.g., cheatgrass; Bromus tectorum L.) has occurred in some of the lower-elevation valleys, especially in the southwest portion of the study extent. In RCMAP development at both the high-resolution and Landsat scale, we found that the cool season perennial bunchgrasses prevalent in this region (e.g., Rough fescue [Festuca campestris Rydb.], Idaho fescue [Festuca idahoensis Elmer], and bluebunch wheatgrass [Pseudoroegneria spicata Á. Löve]) can be confused with shrub, and vice versa. We test the hypothesis that this confusion arises because the shadowing and overall texture created by these grasses resemble those of shrubs, and that SAR data could distinguish the two based on height.

Training Data
RCMAP models are trained using both direct field observations, collected in house and by the BLM Assessment, Inventory, and Monitoring (AIM) program, and a unique dataset composed of millions of pixels worth of downscaled, high-resolution, component cover predictions [1]. These high-resolution data form most of the training data used in RCMAP. Specifically, classifiers for high-resolution images (primarily World View 2 and 3) are trained using ~95 field samples per image. Field samples are designed to capture homogenous patches of ground cover across the range of conditions at each site. Neural net classifiers are then used to predict component cover at 2 m resolution; the predictions are subsequently downscaled to 30 m resolution. In the current study area, a total of 483,921 pixels of high-resolution derived training data are available.
RCMAP also collected field observations of component cover at a 30 m scale [1], with 716 such plots available within the study area. Additionally, we used 499 observations collected by the BLM AIM program. A total of 485,136 training sites were available in the pooled training database, including high-resolution predictions, AIM data, and RCMAP 30 m scale data.

Landsat Composites and Ancillary Data
We used the same Landsat composites generated in RCMAP production [4], including 6 bands (Supplemental Table S1). These composites are based on 30 m Collection 2 Landsat analysis ready data [49] consisting of level 1 data with Albers projection and level 2 surface reflectance data. Composites are generated using a percentile approach, where the 10th, 50th, and 90th percentiles of all clear observations for each band in each year are retrieved; in our tests, we include composites from 2016 only. However, we have found that Landsat imagery composites alone are not sufficient to produce maps of acceptable accuracy. Accordingly, we have included multiple ancillary datasets as predictor variables, including elevation, topographic slope, aspect, position index, and various derived spectral indices (Table S1). Additional discriminatory power may be achieved with higher-fidelity representation of phenology through additional Landsat composites; however, doing so presents challenges of clear image availability. The Landsat composites and ancillary "base" data were included in all tests.
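The percentile compositing described above can be sketched as follows. This is a minimal illustration only, assuming a NumPy array for one band in one year with shape (observations, rows, columns) in which unclear observations have already been set to NaN; the function name and array layout are hypothetical and are not taken from the RCMAP production code.

```python
import numpy as np

def percentile_composites(stack, percentiles=(10, 50, 90)):
    """Per-pixel percentile composites from a (time, rows, cols) stack.

    NaN marks unclear (e.g., cloudy) observations and is ignored.
    Returns an array of shape (len(percentiles), rows, cols).
    """
    stack = np.asarray(stack, dtype=float)
    return np.nanpercentile(stack, percentiles, axis=0)

# Toy example: five observations of a 1 x 2 pixel image, one unclear value.
obs = np.array([
    [[0.10, 0.20]],
    [[0.20, np.nan]],
    [[0.30, 0.40]],
    [[0.40, 0.60]],
    [[0.50, 0.80]],
])
p10, p50, p90 = percentile_composites(obs)
print(p50)  # median of the clear observations at each pixel
```

Each percentile layer would then be treated as a separate predictor band, so a six-band sensor yields 18 composite layers per year under this scheme.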

Sentinel SAR Data
Sentinel 1 SAR (Normalized Radar Backscatter) CEOS-compliant (Committee on Earth Observation Satellites) data were sourced from the Sinergise Sentinel Hub Card4L Tool (refer to [50]). The data were Level 1 (ground range detected) with radiometric terrain correction and are given in gamma-0 backscatter, which resolves some variation related to observation geometry. All images were acquired in ascending mode. Speckle filtering was not applied to the data. The data were in the local UTM projection (WGS 84 datum) at 10 m resolution; we resampled them to 30 m resolution using bilinear interpolation and reprojected them to Albers NAD 83 to match the resolution of the training data and Landsat composites. We produced monthly mosaics of SAR data using a median value approach for each month from April to October 2016. Due to limited spatial coverage, we opted to use only the August 2016 data. For our SAR tests, we used both VV and VH data and added the square root of each dataset as input.
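The monthly median mosaic and square-root inputs described above might be assembled roughly as in the sketch below. It assumes co-registered 30 m arrays of gamma-0 backscatter in linear power units (so that the square root is defined) with NaN outside each scene's footprint; the units, function names, and toy values are assumptions for illustration, not details given in the text.

```python
import numpy as np

def monthly_median_mosaic(scenes):
    """Per-pixel median over all scenes in a month; NaN means no coverage."""
    return np.nanmedian(np.asarray(scenes, dtype=float), axis=0)

def sar_features(vv, vh):
    """Stack VV, VH, and the square root of each as four predictor layers."""
    vv = np.asarray(vv, dtype=float)
    vh = np.asarray(vh, dtype=float)
    return np.stack([vv, vh, np.sqrt(vv), np.sqrt(vh)], axis=0)

# Toy August scenes for a 1 x 2 pixel area (linear gamma-0, hypothetical values).
vv_scenes = [np.array([[0.04, 0.09]]),
             np.array([[0.16, np.nan]]),
             np.array([[0.09, 0.25]])]
vh_scenes = [np.array([[0.01, 0.04]]),
             np.array([[0.01, 0.04]])]

vv_aug = monthly_median_mosaic(vv_scenes)
vh_aug = monthly_median_mosaic(vh_scenes)
features = sar_features(vv_aug, vh_aug)
print(features.shape)  # four layers over the 1 x 2 pixel area
```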

EMIT Data
We obtained EMIT Level 2A Estimated Surface Reflectance and Uncertainty and Masks (EMITL2ARFL data; LP DAAC-EMITL2ARFL; [51]). The EMIT instrument installed on the International Space Station uses imaging spectroscopy to take mineralogical measurements, targeting the Earth's arid dust source regions. The surface reflectance data used are non-orthocorrected, 60 m resolution with 285 bands spanning 381-2493 nm at a spectral resolution of ~7.5 nm. We used the associated reflectance mask, containing six binary flag bands, to exclude clouds, water, and spacecraft. Additional hand edits were applied as needed to remove non-target features. To obtain broad spatial coverage, we included data from April to October 2023 in the EMIT composite. These data were reprojected to Albers NAD 83, resampled using bilinear interpolation to 30 m resolution, and mosaicked using a median value approach across the study area to match other inputs. Bands 128-142 and 188-213 were missing in our EMIT data. We used all available EMIT bands in our testing. We replicated 20 proposed LNEXT bands [41] by averaging the corresponding wavelengths in the EMIT bands. Band wavelength matching was not exact due to the 7.5 nm range of the EMIT data. LNEXT has five thermal infrared bands and a cirrus band in wavelengths not available in EMIT; these were not replicated.
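Replicating a broad band by averaging the narrow EMIT bands that fall inside it can be sketched as below. This is an illustration under stated assumptions: the band centers are approximated as evenly spaced between 381 and 2493 nm, the band limits of 640-670 nm are hypothetical placeholders rather than actual LNEXT specifications, and missing narrow bands are assumed to be stored as NaN.

```python
import numpy as np

def simulate_broad_band(reflectance, centers_nm, lo_nm, hi_nm):
    """Average the narrow bands whose centers fall inside [lo_nm, hi_nm].

    reflectance: array of shape (bands, rows, cols)
    centers_nm:  array of shape (bands,) with band-center wavelengths
    """
    centers_nm = np.asarray(centers_nm, dtype=float)
    inside = (centers_nm >= lo_nm) & (centers_nm <= hi_nm)
    if not inside.any():
        raise ValueError("no narrow bands fall inside the requested range")
    # nanmean skips narrow bands that are missing (NaN) at a pixel
    return np.nanmean(np.asarray(reflectance, dtype=float)[inside], axis=0)

# Toy cube: 285 evenly spaced band centers (an approximation), 2 x 2 pixels,
# with reflectance set equal to wavelength in micrometers for easy checking.
centers = np.linspace(381.0, 2493.0, 285)
cube = np.tile(centers[:, None, None] / 1000.0, (1, 2, 2))

red_like = simulate_broad_band(cube, centers, 640.0, 670.0)  # hypothetical limits
print(red_like.shape)
```

Because the narrow-band centers rarely align with the edges of the target band, the simulated band only approximates the intended spectral response, which is consistent with the inexact wavelength matching noted above.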

Model Architecture
From the pool of 485,136 training points available in the study area, we randomly selected 10% as test. Based on prior RCMAP testing, we used a Deep Neural Network (DNN) with a neuron width of 512, 4 layers deep, a learning rate of 10⁻⁴, a dropout between layers of 0.2, batch normalization, and clamping of 0-100%. We doubled the training data in the high end of distribution tails (>+1 standard deviation of the mean) for each component to better capture the range of each component, enhance spatial contrast, and reduce model propensity to regress toward the mean. However, if mean minus 1 standard deviation was less than 0, we did not double the low end of the distribution (e.g., in tree). We used a weighted mean squared logarithmic error (WMSLE) loss function, where zeros have half the weight of nonzero values. WMSLE penalizes higher relative errors. For example, a prediction of 5% cover with a training value of 0% is given a higher penalty than a prediction of 50% cover with a training value of 55%.
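The behavior of the WMSLE loss in the example above can be checked with a small NumPy sketch. The exact formulation used in RCMAP is not given in the text, so this version makes several assumptions: the logarithm is taken as log(1 + x), weights are 0.5 where the training value is zero and 1 otherwise, and the weighted terms are averaged over the number of samples. The function names are hypothetical.

```python
import numpy as np

def weighted_sle(y_true, y_pred, zero_weight=0.5):
    """Per-sample weighted squared logarithmic error (assumed form)."""
    y_true = np.asarray(y_true, dtype=float)
    # Clamp predictions to the valid cover range of 0-100%
    y_pred = np.clip(np.asarray(y_pred, dtype=float), 0.0, 100.0)
    weights = np.where(y_true == 0.0, zero_weight, 1.0)
    return weights * (np.log1p(y_pred) - np.log1p(y_true)) ** 2

def wmsle(y_true, y_pred, zero_weight=0.5):
    """Mean of the per-sample weighted terms."""
    return float(np.mean(weighted_sle(y_true, y_pred, zero_weight)))

# The two cases from the text: (training 0%, predicted 5%) and
# (training 55%, predicted 50%). Both have an absolute error of 5%.
terms = weighted_sle([0.0, 55.0], [5.0, 50.0])
print(terms)          # the first term is far larger than the second
print(wmsle([0.0, 55.0], [5.0, 50.0]))
```

Even with the half weight applied to the zero-valued training sample, the low-cover error dominates because the loss acts on the ratio of (1 + prediction) to (1 + training value) rather than on their difference.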

Tests
We ran five tests to evaluate the effect on accuracy of adding SAR and EMIT data as independent variables (Table 1). Tests are designed to isolate the effect of each dataset on accuracy. We evaluate the performance of each test on each component independently and with an average across all components per test. We use our independent validation data (n = 398) and cross-validation test to assess accuracy, with the coefficient of determination (R²) and root mean square error (RMSE) as metrics. Independent validation consisted of data sampled by RCMAP and random 50% withholding of BLM AIM data. Both datasets were collected using a line point intercept method [52], collected to represent a single Landsat pixel. All tests represent a circa 2016 condition. A land cover mask targeted to omit water bodies, urban areas, cultivated croplands, and perennial snow/ice [4] was applied to our mapping results, as these areas are untrained. The land cover mask was identified using National Land Cover Database 2021 [53] land cover classes and the 2021 Cropland Data Layer [54]. Considering the land cover mask and masking applied to EMIT and SAR data, a total of 26,635 km² was mapped in our analysis. The study area was defined based on the common area available in the selected SAR and EMIT datasets.
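The two accuracy metrics named above can be computed as in the following sketch. The text does not state which definition of R² was used; here it is assumed to be the squared Pearson correlation between field observations and predictions, and the toy values are invented solely to show the calculation.

```python
import numpy as np

def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs (% cover)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))

def r_squared(observed, predicted):
    """Squared Pearson correlation (an assumed definition of R^2)."""
    r = np.corrcoef(np.asarray(observed, dtype=float),
                    np.asarray(predicted, dtype=float))[0, 1]
    return float(r ** 2)

# Invented field and model values showing over-prediction at low cover
# and under-prediction at high cover.
field = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
model = np.array([5.0, 12.0, 20.0, 28.0, 35.0])

print(rmse(field, model))
print(r_squared(field, model))
```

Note that a squared correlation can remain high when predictions are compressed toward the mean, which is why RMSE is reported alongside it.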

Spectral Profile Analysis
We undertook a spectral profile analysis to illustrate the efficacy of EMIT and Landsat in discriminating among important rangeland targets, and specifically how this varies across wavelengths. First, we classified data from RCMAP high-resolution training sites in the region into several classes: dense herbaceous (≥50% herbaceous cover and 0% tree and shrub), mixture of herbaceous and shrub (≥20% cover of both shrub and herbaceous and 0% tree), non-sagebrush shrub (<20% cover of herbaceous, ≥20% non-sagebrush cover, <5% sagebrush cover, 0% tree cover), sagebrush (<20% cover of herbaceous, <5% non-sagebrush cover, ≥5% sagebrush cover, 0% tree cover), and tree (<20% cover of herbaceous, 0% shrub cover, ≥20% tree cover). These classes are not inclusive of all training data; rather, they reflect end members of vegetation types important to land management. Next, we evaluated the mean spectral response in the EMIT and Landsat data for each class. At each wavelength (band) of the EMIT and Landsat data, we calculated the mean spectral response of the five vegetation classes, and the standard deviation among the five classes was used as a proxy for separability. A higher standard deviation indicates higher separability at a given wavelength. Second, we analyzed the mean EMIT and Landsat spectral values by 5% cover increments of shrub cover.
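The separability proxy described above (the standard deviation among class-mean spectra at each band) can be illustrated with a short sketch. It assumes a table of pixel spectra with one class label per pixel and uses the population standard deviation, since the text does not specify which form was used; the class names and reflectance values are invented for the example.

```python
import numpy as np

def class_mean_spectra(spectra, labels):
    """Mean spectrum per class.

    spectra: (pixels, bands) reflectance; labels: (pixels,) class names.
    Returns (classes, means) with means of shape (classes, bands).
    """
    spectra = np.asarray(spectra, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    means = np.stack([spectra[labels == c].mean(axis=0) for c in classes])
    return classes, means

def separability(class_means):
    """Standard deviation among class means at each band."""
    return np.std(class_means, axis=0)

# Invented two-band example: classes are identical in band 0 and
# differ in band 1, so only band 1 offers any separability.
spectra = np.array([
    [0.10, 0.30], [0.10, 0.30],   # sagebrush
    [0.10, 0.40], [0.10, 0.40],   # herbaceous
    [0.10, 0.50], [0.10, 0.50],   # tree
])
labels = np.array(["sagebrush", "sagebrush", "herbaceous",
                   "herbaceous", "tree", "tree"])

classes, means = class_mean_spectra(spectra, labels)
sep = separability(means)
print(sep)  # larger values indicate bands that better separate the classes
```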

Caveats
Training and validation data in the study area are primarily from 2014 to 2019 and align with the data availability of SAR and Landsat data. EMIT data are not available prior to 2022, so for the purposes of tests including EMIT data, we must assume that RCMAP high-resolution training data from 2014 to 2019 are comparable. We undertook analysis to scope the magnitude of error in this assumption by comparing 2016 and 2023 RCMAP production-scale predictions of component cover over the training footprint. Although mean differences in predictions between years were <0.5%, and correlations between years were ~95% averaged across components, differences of up to 65% in herbaceous cover and 87% in tree cover existed. To reduce these discrepancies, we removed sites that had burned since 1984, as indicated by Monitoring Trends in Burn Severity, from the training pool. Doing so removed most locations with large differences between years, but some degree of inconsistency will remain when using 2016 data to train 2023 imagery.

Accuracy Metrics
Cross-validation results were robust compared to production-scale RCMAP products, as we developed the models over a relatively small geographic area, which allows for improved fit (Table 2). On average, test accuracy varied little among runs, but adding SAR slightly improved results, and adding LNEXT and EMIT led to a larger improvement compared to the base run. Independent validation results show a ranking of accuracy by component similar to prior RCMAP testing, where bare ground and tree are top performers. Bare ground and herbaceous component accuracy in this region is lower than across the western United States [2], one of the reasons we selected this area for testing (Table 3). The addition of SAR data improved the R² by an average of 7.5% and lowered the RMSE by 2.47%, with the greatest improvement observed in herbaceous and litter components (Table 3, Figure 2). Inclusion of EMIT data had even more profound effects on accuracy, with the R² improving by an average of 29% and the RMSE reduced by 10.34%. EMIT improved the independent R² of herbaceous cover by 44%, litter by 99%, and sagebrush by 22%. The combined test showed even greater accuracy, although the incremental improvements from SAR and EMIT were not additive (e.g., the 5.5% improvement in bare ground with SAR and the 21.7% improvement with EMIT did not yield a net 27.2% improvement with both, but rather 22.3%). This indicates some redundancy or conflicting information between SAR and EMIT. In summary, adding SAR, but especially EMIT data, greatly improved independent accuracy. The LNEXT test shows an intermediate level of accuracy improvement (18.2% higher R², 5.5% lower RMSE). The scatterplots of independent validation sites (Figure 2) demonstrate the tightening of the relationship between field observation and prediction with the addition of EMIT data, but also the propensity of models to flatten the data distribution (positive and negative bias in predictions at the low and high ends of data distribution, respectively).

Predictions
Mean predicted component cover was relatively consistent among tests (Table 4). However, the SAR test predicted lower tree and bare ground cover and higher shrub cover compared to the other tests. On average, spatial similarity to the base test was 74% for the SAR test, 73% for EMIT, 68% for the All test, and 84% for the LNEXT test. Spatial similarity was highest in herbaceous (82%) and lowest in shrub cover (59%). Spatial patterns overall captured the expected ecological, climatic, and disturbance gradients (Figure 3). EMIT predictions contained some artifacts due to limited image availability, leading to phenological mismatch across seamlines. Additionally, SAR predictions contained some artifacts, especially in areas with boulder fields and standing dead trees, which are sometimes misclassified as tree and shrub cover. Depiction of burn history through effects on vegetation structure is a key application of rangeland mapping data. We evaluated shrub and tree cover predictions from the four predictions in fires from varying time periods. Models including SAR data tended to accurately depict tree cover as lower in burns compared to those without SAR. Case studies of prediction differences by model run are presented in Supplemental Figures S1 and S2.

SAR and Spectral Analysis
To better understand the relative improvements related to including SAR and EMIT data by component, we undertook several analyses. First, we evaluated individual component relationships with SAR data at RCMAP training sites (Figure 4). Taller components (tree and shrub) had stronger and more positive relationships with the SAR data. Bare ground, where the ground surface itself would be the return, yielded negative relationships. Of note, we found that SAR VV and VH data were only moderately related (R² = 0.33), revealing the unique information in each polarization. The EMIT data offered opportunities to explore the spectral profiles of important rangeland cover targets and the optimal spectral wavelengths for discrimination. We classified our high-resolution RCMAP training data into five classes (refer to the Methods section for explanation) and plotted the average spectral response using both EMIT and Landsat data (Figure 5). The average spectral profiles of the dense herbaceous and tree targets are quite distinct. The remaining three profiles are relatively inseparable except for ~1000-1800 nm (e.g., approximately the Landsat SWIR 1 band, Figure 6). Next, we evaluated the separability of 5% increments of shrub cover (Figure 7). EMIT reflectance profiles tend to decrease with increasing shrub cover. Key wavelengths for discrimination among shrub percentage cover are similar to those of the vegetation classes in Figure 5, although the visible portion of the spectrum tends to be somewhat more important for discrimination among shrub cover values.

Discussion
Accurate mapping of rangeland is essential to ensure user confidence and effective decision making, especially when considering the temporal dimension that can amplify errors. As alluded to in Figure 7, to fully realize their potential benefit, products need to be accurate to cover increments of potentially 1%, particularly for analyses that threshold data to the percentage (e.g., the sagebrush conservation design framework, [55]). Our testing results indicate a clear accuracy benefit of adding SAR and EMIT data to the RCMAP model. The ability of SAR data to observe vegetation height allows for more accurate classification of vegetation types and potential degradation [24].
EMIT's continuous characterization of the spectral curve boosts discriminatory power relative to multispectral data ([56], Figures 5-7). Our spectral profile analysis reveals that the enhanced classification power with EMIT is related to both the improved spectral resolution and representation of the entire domain as compared to Landsat. In general, the visible spectrum and Landsat SWIR 2 range offer modest separability among the herbaceous and shrub, non-sagebrush shrub, and sagebrush classes (Figure 6). Conversely, the tree and dense herbaceous targets are quite distinct spectrally in the visible to SWIR 1 wavelengths. Separability tends to be lowest in the red and SWIR 2 wavelengths, with enhanced separation in the red edge to SWIR 1 wavelengths, likely responsive to vegetation water content. Indeed, the standard deviation among classes (i.e., separability) is 2.67× greater on average in these wavelengths than in the visible spectra. For the three classes including shrubs, mean separability is lower than in the dense herbaceous and tree classes, with the exception of the SWIR 2 range. However, the red edge to SWIR 1 wavelengths still hold the greatest separability in these classes. Hanson et al. [57] evaluated the spectra of dead sagebrush, live sagebrush, grass, and bare ground in Idaho using a field spectroradiometer. Like our results, the authors show little separation among these targets in the visible spectra, with greater separation in the infrared and SWIR ranges.
Our study area was selected due to the difficulty of separating bunchgrasses and shrubs and the diversity of vegetation types in this region. The accuracy benefit of hyperspectral data may be diminished in regions with less floristic and structural diversity. However, if such regions contain complex mineralogy and high amounts of bare ground, the EMIT instrument may reduce classification uncertainty [17]. Moreover, legacy Landsat data are suboptimal for classification of NPV [19], which is a major constituent of rangelands.
One key finding is that legacy Landsat bands largely miss portions of the electromagnetic spectrum where enhanced separation among important rangeland targets exists (Figure 6), namely in the range of 900-1250 nm and 1500-1780 nm. The former range is typically considered responsive to liquid water and water vapor and is used in snow/ice applications, and the latter is in the SWIR spectra, which can be responsive to NPV [19]. Our LNEXT test includes these ranges and demonstrates their effects on enhancing classification accuracy. However, the reduced spectral resolution of LNEXT bands (average of 33.8 nm) compared to 7.5 nm for EMIT ostensibly drives the reduced accuracy of the LNEXT test compared to the EMIT test.
Adding these datasets came with a substantial time investment and some necessary assumptions regarding temporal compatibility. Moreover, SAR data introduced errors in burned areas and caused shrub/tree confusion, whereas EMIT data introduced some seamlines. Resampling native 60 m EMIT data to 30 m resolution also likely introduced some error. Expanding the number of predictor variables can introduce the Hughes phenomenon, where the increased dimensionality results in increased discriminatory power but increases the amount of training data required to retain potential accuracy [44,56,58]. RCMAP has >60 million training points available, which presumably is sufficient to remediate this concern, especially when combined with methods designed to reduce dimensionality [58].
Due to the highly variable precipitation regime driving varied phenology, sensors with infrequent return intervals such as Landsat often miss important events [7]. Even daily temporal resolution data from the Moderate Resolution Imaging Spectroradiometer often cannot fully capture the dynamics [59]. One major weakness of EMIT, and our testing overall, is the incomplete exploitation of the temporal domain through within-season phenological variation. RCMAP has found that the temporal domain (i.e., phenology) offers key discriminatory power, unique from the spectral domain. For example, prior RCMAP research has indicated that the uniquely short and early growing season of invasive annual grasses relative to perennial grasses can be exploited with monthly imagery composites to enhance separation [12,60]. The 10th, 50th, and 90th percentile Landsat composites used in this study were found to capture a large amount of the within-season variation, but geomedian composites (e.g., [61]) or monthly composites may offer improved depiction of key phenological events. Future work combining SAR, EMIT, and enhanced temporal frequency with HLS (Harmonized Landsat Sentinel; [62,63]) or similar data is warranted. Wood et al. [8], for example, found a 5-10% accuracy improvement in rangeland classification with an increase from one to eight drone images over the course of the growing season. Additional information may be derived from increased spatial resolution, especially from texture metrics that become more meaningful at a finer scale. For example, RCMAP has demonstrated that the three-by-three variance of 2 m World View imagery is strongly associated with shrub and tree cover. Moreover, higher-resolution imagery serves to decrease the diversity of features represented in a pixel, thus creating a purer training relationship. Additional avenues for exploration are vegetation optical depth, chlorophyll fluorescence, and thermal imaging [8].

Conclusions
We show the promise of enhanced classification accuracy using EMIT data and, to a lesser extent, SAR. Integrating these datasets in the current study was a laborious exercise, however, and the limited spatial and temporal extent of both EMIT and SAR datasets would render them incompatible with production-scale modeling as in RCMAP. The superspectral resolution (26 bands) proposed in LNEXT may bridge this gap while offering higher temporal resolution than legacy Landsat observatories, with synoptic coverage. Importantly, many of the spectral gaps we found potentially important to rangeland fractional component classification will be covered by LNEXT bands. Specifically, the inclusion of red edge wavelengths and additional SWIR bands in both EMIT and LNEXT enhances the capture of vegetation water content, stress, and NPV, which are important to separability among cover types and fractional cover.
Even with the multispectral, hyperspectral, SAR, and topographic variables included in our best overall model, error rates improved but remain noteworthy. Exploitation of the temporal domain may further reduce error, but the relatively small separation among rangeland features in spectral space (Figures 5-7) indicates a limit. The high overall amount of bare ground and diverse mineralogy create complex spectral conditions. Moreover, given the seven RCMAP components, each ranging from 0-100%, the theoretical number of permutations of cover is 1.6 × 10^16. AI/ML has enhanced the ability to parameterize this complexity, but some moderate level of error is likely inevitable in rangeland fractional classification at a broad spatio-temporal scale.

Figure 1. Study area used to analyze impact of Synthetic Aperture Radar (SAR) and Earth Surface Mineral Dust Source Investigation (EMIT) hyperspectral data in rangeland vegetation classification. Base image is a 50th percentile composite of 2016 Landsat imagery. Inset map shows study location in the conterminous United States.

Figure 2. Scatterplots of independent validation (n = 399) for selected component cover predictions by model run.Line of best fit and 1-to-1 line indicated by dashed blue and solid red line,

Figure 3. Component predictions by model for shrub (top) and herbaceous cover (bottom). White indicates either land cover or Earth Surface Mineral Dust Source Investigation (EMIT) masking. SAR = Synthetic Aperture Radar.

Figure 4. Correlation (r) between Rangeland Condition Monitoring Assessment and Projection (RCMAP) high-resolution training data and Synthetic Aperture Radar (SAR) horizontal transmit and vertical receive (VH)/vertical transmit and vertical receive (VV) data at n = 60,000 pixels.

Figure 5. Average spectral profiles of important rangeland targets (see Methods Section for classification details) based on high-resolution Rangeland Condition Monitoring Assessment and Projection (RCMAP) data. Line data reflect Earth Surface Mineral Dust Source Investigation (EMIT) profiles, and points of the same color represent Landsat.

Figure 6. Separability of rangeland classes plotted in Figure 5 as measured by the standard deviation among spectral profiles across the classes at each Earth Surface Mineral Dust Source Investigation (EMIT) band. Widths of Landsat band ranges (bands 2-7 used in Rangeland Condition Monitoring Assessment and Projection [RCMAP] analysis) are plotted in black.

Figure 7. Average spectral profile from Earth Surface Mineral Dust Source Investigation (EMIT) by 5% shrub cover bins (color) in Rangeland Condition Monitoring Assessment and Projection (RCMAP) high-resolution training sites. Some bins are omitted for clarity. We removed pixels with >0% tree cover and ≥20% herbaceous cover from this analysis. All species of shrub are included.

Table 1. Tests implemented in the current study to evaluate rangeland fractional component mapping accuracy. All tests include a set of independent variables listed in Table S1. LNEXT refers to Earth Surface Mineral Dust Source Investigation (EMIT) data synthesized into proposed Landsat NEXT bands. SAR = Synthetic Aperture Radar.

Table 3. Model independent validation at Rangeland Condition Monitoring Assessment and Projection (RCMAP) and Analysis Inventory and Monitoring (AIM) sites (n = 399). Average difference from base represents the mean relative improvement among components for each test relative to the base run, in percent. SAR = Synthetic Aperture Radar; EMIT = Earth Surface Mineral Dust Source Investigation; LNEXT = Landsat NEXT; N/A = not applicable.

Table 4. Mean predicted value by component per test based on a random sample across the study area (n = 10,000). SAR = Synthetic Aperture Radar; EMIT = Earth Surface Mineral Dust Source Investigation; LNEXT = Landsat NEXT.