Imaging Spectroscopic Analysis of Biochemical Traits for Shrub Species in Great Basin, USA

The biochemical traits of plant canopies are important predictors of photosynthetic capacity and nutrient cycling. However, remote sensing of biochemical traits in shrub species in dryland ecosystems has been limited, mainly due to sparse vegetation cover, manifold shrub structures, and complex light interactions between the land surface and canopy. To examine the performance of airborne imaging spectroscopy for retrieving biochemical traits in shrub species, we collected Airborne Visible Infrared Imaging Spectrometer-Next Generation (AVIRIS-NG) images and surveyed four foliar biochemical traits (leaf mass per area, water content, nitrogen content and carbon content) of sagebrush (Artemisia tridentata) and bitterbrush (Purshia tridentata) in the Great Basin semi-desert ecoregion, USA, in October 2014 and May 2015. We examined the correlations between biochemical traits and developed partial least squares regression (PLSR) models to compare spectral correlations with biochemical traits at canopy and plot levels. PLSR models for sagebrush showed comparable performance between calibration (R²: LMA = 0.66, water = 0.7, nitrogen = 0.42, carbon = 0.6) and validation (R²: LMA = 0.52, water = 0.41, nitrogen = 0.23, carbon = 0.57), while prediction for bitterbrush remained a challenge. Our results demonstrate the potential for airborne imaging spectroscopy to measure shrub biochemical traits over large shrubland regions. We also highlight challenges when estimating biochemical traits with airborne imaging spectroscopy data.


Introduction
The biochemical traits of plant foliage provide important information for studying photosynthetic capacity and biogeochemical cycling in ecosystems [1-5]. Among the widely studied biochemical traits are leaf mass per area, water content, nitrogen content, and carbon content. Variations in leaf mass per area (LMA, the ratio of leaf dry mass to leaf area) correspond to the fundamental tradeoffs in leaf construction costs vs. light-intercepting surface area and are driven by a range of environmental controls [6,7]. While foliage gains carbon and builds structure, LMA corresponds to several biochemical and structural compounds in the plant, including cellulose and lignin (positive correlation) and protein (negative correlation). These relate to the physiological processes of photosynthesis, primary production and leaf decomposition [8,9]. Water is one of the most important factors regulating plant growth and development in ecosystems [10]. Leaf water content is an important parameter for assessing drought and predicting susceptibility to wildfire. Nitrogen content is a fundamental component of [...] (iii) the modeling complexity at canopy and plot levels. To answer these questions, we collected shrub samples for biochemical traits in representative sagebrush and bitterbrush communities concurrent with AVIRIS-NG flights. Then we examined the correlation between traits and developed partial least squares regression (PLSR) models to evaluate spectral correlation at canopy and plot levels.

Study Sites
We selected two sites on the eastern slopes of the Sierra Nevada Mountains, in Owens Valley, CA, USA (Figure 1a). One site was located on large depositional fans (bajadas) out of the Sierra Nevada Mountains and represents the southern extent of sagebrush communities in the region. The other site was farther south, near the southern end of a long ecotone that transitions into the Mojave Desert ecosystem. The two sites span a geographic region of approximately 57 km² at elevations between 1215 m and 1892 m. The study areas are within the rain shadow of the Sierra Nevada Mountains and consist of a diverse mixture of semiarid vegetation communities. The continental climate has cold winters and hot summers, with a precipitation regime dominated by winter storms. The total annual precipitation was about 9.6 cm in 2014 and 6.9 cm in 2015. The vegetation community in our study area is complex due to heterogeneous soil types, rocks, and disturbances including grazing, wildfires, and the regional drought in years 2012-2016. Most of the Sierra Nevada fan vegetation is a mix of the Great Basin sagebrush (Artemisia tridentata) semi-desert ecosystem type and the bitterbrush (Purshia tridentata) and blackbrush (Coleogyne ramosissima) transitional type, while the valley floor is dominated by alkali grasslands (Distichlis spicata, Sporobolus airoides) and saltbush (Atriplex sp., Sarcobatus vermiculatus) communities.
Remote Sens. 2018, 10, 1621

Field Data Collection
We established 10 × 10 m plots (north-south orientation) in October 2014 and revisited them in May 2015 (Figure 1b). These plots represented relatively homogeneous patches of sagebrush and bitterbrush. Within each plot we set up five 10-m-long line transects at 1, 3, 5, 7, and 9 m (Figure 1c). Along each transect we recorded the shrub species and the shrub canopy coverage length, calculated the percentage of shrub cover by dividing the coverage length by 10 m, and then averaged the shrub cover of the five transects to obtain the plot shrub cover (%). We randomly chose one shrub between each transect if a shrub was present. We recorded the GPS locations of the chosen shrubs and the four corners of each plot with a Trimble GeoExplorer® 7 series GPS. For every chosen shrub we measured the major (widest) canopy width, the minor (narrowest) canopy width (in meters), and the leaf area index (LAI) with a LI-COR plant canopy analyzer (LAI-2200 model); then we randomly collected four branch tips (3-5 cm) from the canopy. All samples were immediately stored in sealed plastic bags in a cooler with ice and transported to the laboratory for biochemical analysis. In total, we sampled 17 sagebrush plots (99 shrubs) and 12 bitterbrush plots (50 shrubs) in 2014, and 18 sagebrush plots (95 shrubs) and 12 bitterbrush plots (24 shrubs) in 2015, when fewer shrubs were available because of shrub mortality.
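The line-transect cover calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the intercept lengths in `example_plot` are hypothetical, and only the 10 m transect length and the five-transect average come from the text.

```python
TRANSECT_LENGTH_M = 10.0  # each line transect is 10 m long (from the text)


def transect_cover_percent(intercept_lengths_m, transect_length_m=TRANSECT_LENGTH_M):
    """Percent shrub cover on one transect: canopy coverage length / transect length."""
    covered = sum(intercept_lengths_m)
    if covered > transect_length_m:
        raise ValueError("canopy coverage cannot exceed the transect length")
    return 100.0 * covered / transect_length_m


def plot_cover_percent(transects):
    """Plot shrub cover: mean of the per-transect cover percentages."""
    covers = [transect_cover_percent(t) for t in transects]
    return sum(covers) / len(covers)


# Hypothetical canopy coverage lengths (m) along the five transects of one plot.
example_plot = [
    [1.2, 0.8],       # transect at 1 m
    [2.5],            # transect at 3 m
    [],               # transect at 5 m (no shrub intercepted)
    [0.5, 1.0, 0.5],  # transect at 7 m
    [1.5],            # transect at 9 m
]

print(f"plot shrub cover: {plot_cover_percent(example_plot):.1f}%")
```

Averaging per-transect percentages, rather than pooling all coverage lengths, mirrors the order of operations given in the text; with equal-length transects the two give the same result.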

Biochemical Measurements
Within 12 h after sample collection we removed the leaves from the sampled branch tips and weighed the fresh leaf mass. The four branch tips from each shrub were mixed to represent one sample for that shrub. Then we laid the leaves on a flat-bed scanner and recorded grayscale scanned images. The leaf area (cm²) was calculated from the ratio of the number of black pixels to the total number of pixels. The leaves were dried in a convection oven for 24 h at 60 °C and re-weighed to determine dry mass. Water mass was calculated as the difference between fresh leaf mass and dry mass. Foliar LMA (mg/cm²) was determined by dividing dry leaf mass by leaf area, and water content (mg/cm²) was calculated as the ratio of water mass to leaf area. Then the dried leaves were ground to measure nitrogen and carbon percentage (% of dry mass) using combustion-gas chromatography (Costech ECS 4010).
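As a worked illustration of the trait definitions above, the sketch below computes leaf area, LMA, and water content for one sample. It assumes that the black-pixel fraction is multiplied by a known scanned area to obtain leaf area in cm²; that scaling step, the function names, and all numbers are hypothetical and not taken from the study.

```python
def leaf_area_cm2(black_pixels, total_pixels, scan_area_cm2):
    """Leaf area: fraction of leaf (black) pixels times the scanned area.

    The scanned area is an assumed input; the text only gives the pixel ratio.
    """
    return scan_area_cm2 * black_pixels / total_pixels


def leaf_traits(fresh_mass_mg, dry_mass_mg, area_cm2):
    """Return (LMA, water content), both in mg/cm^2, as defined in the text."""
    water_mass_mg = fresh_mass_mg - dry_mass_mg  # water mass = fresh mass - dry mass
    lma = dry_mass_mg / area_cm2                 # dry mass per unit leaf area
    water_content = water_mass_mg / area_cm2     # water mass per unit leaf area
    return lma, water_content


# Hypothetical sample: 5% of a 600 cm^2 scan is leaf, 900 mg fresh, 480 mg dry.
area = leaf_area_cm2(black_pixels=50_000, total_pixels=1_000_000, scan_area_cm2=600.0)
lma, water = leaf_traits(fresh_mass_mg=900.0, dry_mass_mg=480.0, area_cm2=area)
print(f"area = {area:.1f} cm^2, LMA = {lma:.1f} mg/cm^2, water = {water:.1f} mg/cm^2")
```

Expressing both traits per unit leaf area keeps them on the same basis, so their ratio directly reflects the water-to-dry-matter balance of the sample.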

Remote Sensing Data
On 9 October 2014 and 13 June 2015, AVIRIS-NG collected images of our study area with a nominal pixel resolution of 2.6 m. The AVIRIS-NG instrument measures spectral radiance over the wavelength range from 380 nm to 2510 nm with 5 nm sampling [39]. The raw images were orthocorrected, calibrated to radiance, and atmospherically corrected to surface reflectance using the Atmosphere Removal Algorithm (ATREM) program [40]. Individual shrub canopy-level spectra were extracted from the image pixels at the GPS locations after removal of the following wavebands due to noise and water vapor absorption: 346-391 nm, 1348-1428 nm, 1778-1949 nm, and 2485-2510 nm. The plot-level spectra were the averaged spectra of pixels within each 10 × 10 m plot area.
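The band removal and plot-level averaging described above can be sketched as follows. The excluded ranges are those listed in the text; the function names and the regular 5 nm wavelength grid are assumptions made for illustration, and the real AVIRIS-NG band centers may differ.

```python
# Wavelength ranges (nm) removed due to noise and water vapor absorption (from the text).
EXCLUDED_RANGES_NM = [(346, 391), (1348, 1428), (1778, 1949), (2485, 2510)]


def keep_band(wavelength_nm, excluded=EXCLUDED_RANGES_NM):
    """True if the band center lies outside every excluded range."""
    return not any(lo <= wavelength_nm <= hi for lo, hi in excluded)


def mask_spectrum(wavelengths_nm, reflectance):
    """Drop excluded bands from one spectrum; returns (wavelengths, reflectance)."""
    kept = [(w, r) for w, r in zip(wavelengths_nm, reflectance) if keep_band(w)]
    return [w for w, _ in kept], [r for _, r in kept]


def mean_spectrum(pixel_spectra):
    """Band-by-band mean of several pixel spectra, e.g. all pixels inside one plot."""
    n = len(pixel_spectra)
    return [sum(band_values) / n for band_values in zip(*pixel_spectra)]


# Assumed regular 5 nm grid from 380 to 2510 nm, used here only for illustration.
grid_nm = list(range(380, 2515, 5))
flat = [0.3] * len(grid_nm)  # hypothetical flat reflectance spectrum
kept_wl, kept_refl = mask_spectrum(grid_nm, flat)
print(len(grid_nm), "bands before masking,", len(kept_wl), "after")
```

Masking before averaging or averaging before masking gives the same plot-level spectrum, because the mask depends only on wavelength.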

Data Analysis
We pooled foliar biochemical traits and shrub spectra from the 2 years. Then we calculated correlation coefficients between pairs of biochemical traits in each species to analyze correlations between traits. Partial least squares regression (PLSR) is a widely used multivariate statistical method in chemometrics and near-infrared spectroscopy for analyzing quantitative relationships between multiple predictor and response variables [41]. PLSR has been a valuable method in spectroscopy for analyzing quantitative relationships between reflectance data and vegetation biochemical traits [15,17,33,42]. We implemented a bootstrap method to develop a single-response PLSR model between each biochemical trait and the spectra, by species. In each species dataset, we first used the Kennard-Stone algorithm to divide the dataset into 80% for calibration (training) and 20% for validation [43]. The Kennard-Stone algorithm uses a uniform mapping method to ensure that both the calibration and validation datasets provide uniform coverage of the entire dataset [43]. We then ran 500 iterations, in each of which we randomly selected 80% of the calibration dataset to build a PLSR model with leave-one-out validation. The optimal number of PLSR components was determined when the prediction residual sum of squares (PRESS) was minimized and successive PLSR components did not improve the root mean square error of prediction (RMSEP), as assessed using a t-test [44]. Lastly, we applied the 500 PLSR models from the calibration dataset, with the determined number of components, to the validation dataset. To enable comparison between models, we calculated the mean and standard deviation of R² and of the relative root mean square error (rRMSE; RMSE divided by the range of observed values) to measure modeling accuracy in the calibration and validation models. We also calculated the mean of the variable importance in the projection (VIP) metric to identify regions of the reflectance spectra that were significant in the 500 calibration models.
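The Kennard-Stone split mentioned above can be sketched in plain Python as below. This is the textbook form of the algorithm (start from the two most distant samples, then repeatedly add the sample farthest from those already selected), written under the assumption of Euclidean distance in spectral space; it is not the authors' implementation, and the function name and toy data are hypothetical.

```python
import math


def kennard_stone_split(X, train_fraction=0.8):
    """Return (calibration_indices, validation_indices) for samples X (list of vectors)."""
    n = len(X)
    n_train = max(2, round(train_fraction * n))
    dist = [[math.dist(a, b) for b in X] for a in X]

    # Seed the calibration set with the two most distant samples.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]

    # Repeatedly add the sample whose nearest selected neighbor is farthest away.
    while len(selected) < n_train and remaining:
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining


# Toy example: ten one-band "spectra" spread evenly between 0 and 9.
toy = [[float(v)] for v in range(10)]
calibration, validation = kennard_stone_split(toy)
print("calibration:", sorted(calibration), "validation:", sorted(validation))
```

Because selection always favors samples far from those already chosen, the calibration set covers the extremes of the data, and the samples left for validation fall inside that covered range.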
VIP is a weighted sum of squares of the PLSR weights, in which each PLSR component is weighted by the amount of variance in the response variable that it explains. Higher VIP values indicate greater importance of wavelengths in projecting traits. Generally, wavelengths with VIP values larger than 1 are considered important [45]. In plot-level models, we calculated biochemical traits as the mean leaf traits of the sampled shrubs within each plot, and then we implemented the same PLSR analytic workflow as for the canopy-level models.
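To make the VIP definition above concrete, the sketch below fits a single-response PLS model with the NIPALS algorithm and computes one VIP score per predictor. It is a minimal standard-library illustration under the usual VIP formula, not the authors' code; the function name and toy data are hypothetical, and real analyses would typically rely on an established PLS library.

```python
import math


def pls1_vip(X, y, n_components):
    """Fit single-response PLS by NIPALS and return one VIP score per predictor."""
    n, p = len(X), len(X[0])
    # Mean-center the predictors and the response.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]

    weights, explained = [], []  # unit-length weights w_a and SS_a = q_a^2 * (t_a . t_a)
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:  # no remaining covariance between predictors and response
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        loading = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate predictors and response before extracting the next component.
        E = [[E[i][j] - t[i] * loading[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        weights.append(w)
        explained.append(q * q * tt)

    total = sum(explained)
    if total == 0.0:
        raise ValueError("response has no variance explained by the predictors")
    return [
        math.sqrt(p * sum(ss * w[j] ** 2 for ss, w in zip(explained, weights)) / total)
        for j in range(p)
    ]


# Toy data: the response equals the first of three "bands", so that band should dominate.
toy_X = [[1, 0.5, 0.2], [2, 0.1, 0.4], [3, 0.4, 0.1], [4, 0.2, 0.3], [5, 0.3, 0.5]]
toy_y = [row[0] for row in toy_X]
print([round(v, 3) for v in pls1_vip(toy_X, toy_y, n_components=2)])
```

Because each weight vector has unit length, the squared VIP scores average to 1 across predictors, which is why 1 is the usual threshold for an important wavelength.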

Variation in Biochemical Traits and Reflectance
The biochemical traits varied between years and species (Table 1). In general, foliar LMA and carbon content were higher, and water and nitrogen content lower, in October 2014 than in May 2015. Sagebrush had smaller mean LMA and water content than bitterbrush. The two species had similar canopy measurements, including LAI and major and minor widths. Correlation coefficients between pairs of canopy biochemical traits are shown in Table 2. While most correlations were not significant, LMA and water content were positively correlated in sagebrush, and LMA and carbon content were positively correlated in bitterbrush. The reflectance spectra varied in association with the biochemical traits at both shrub canopy and plot levels (Figure 2). The extracted spectra were from pixels mixed with features like background soil and dry grass. In contrast with typical mesic green leaf spectra, the shrub spectra did not show a strong pattern of pigment absorption in the visible wavelengths. High reflectance in the near-infrared and shortwave infrared wavelengths indicated water-stressed vegetation, abundant dry biomass in stems and litter, and bright soils in the background.

PLSR Analysis
PLSR models of sagebrush produced comparable accuracy between calibration and validation for all biochemical traits at the canopy level (Table 3). The R² values in the calibration models were above 0.4 for all traits. Validation models generally showed smaller R² and larger rRMSE, in ranges comparable to the calibration models (Figure 3). For bitterbrush, the calibration models for LMA and nitrogen could not predict the validation dataset, and R² in the water validation model was small. PLSR models also identified high VIP values (larger than 1) in spectral regions associated with biochemical traits (Figure 4).

Note: R-square (R²), relative root mean square error (rRMSE), and standard deviation (S.D.) are from the calibration and validation models, respectively. Dash lines denote results that are not significant.

The plot-level PLSR models showed that the reflectance spectra could model biochemical traits (Table 4). PLSR calibration data generally produced higher or comparable R² and smaller rRMSE values than validation data in sagebrush plots. The rRMSE of nitrogen and water models were larger than values of LMA and water models. In bitterbrush plots, only water content could be modeled with comparable accuracy between calibration and validation data.

Discussion
Our study sites represent a typical sagebrush- and bitterbrush-dominated community in the Great Basin, although the sites are at the extreme southwest boundary of this vegetation type, in the ecotone transition to the warmer and drier Mojave Desert ecosystem. Our sampling dates were long after the rainy season ended in both 2014 and 2015, and the severe drought in California had significantly stressed forests and rangelands throughout the southern Sierra Nevada, producing extensive mortality [37]. Leaf water content changes seasonally between 75% and 45% of fresh weight in shrub species [42], but water in our samples averaged about 46% of fresh weight in sagebrush and 42% in bitterbrush. These low values of leaf water content are consistent with the drought conditions during both years of this project, indicating a high degree of water stress on the plants [46]. The variation in foliar traits across years represented seasonal variation in plant physiological processes and the environment. As shrubs grew from spring to fall, foliar LMA and carbon content increased, but water and nitrogen content decreased. Shrubs utilized nitrogen to construct the foliar structure, likely resulting in higher LMA, and water scarcity during the summer led to the lower water content.
Despite the complexity of the environmental conditions, the spectroscopic analysis was still able to demonstrate the potential to predict biochemical traits. Our model accuracy for the four traits in sagebrush was comparable to or higher than that of previous studies of nitrogen content [33]. At the canopy level, only the validation models of LMA and carbon content showed R² values above 0.5 and rRMSE values comparable to the calibration models. In comparison, canopy-level PLSR models of bitterbrush showed much lower prediction ability than those of sagebrush. Bitterbrush leaves are about 1 cm long (Jepson eFlora, http://ucjeps.berkeley.edu/eflora/, last accessed in August 2018), but our samples showed significantly smaller leaf sizes and fragmented shapes. The complex canopy structure and especially small leaves of bitterbrush make it more difficult to estimate traits spectrally. Various methods have been used to minimize noise and resolve overlapping absorption features, such as difference and logarithm transformations of reflectance [47,48]. We tested the difference and logarithm transformations and found no significant improvement in prediction (results not reported).
Our PLSR models identified high VIP values in spectral regions significantly correlated with biochemical trait variation. High VIP values in LMA models were generally located in the near-infrared and short-wave infrared [13,49-51], which are associated with leaf structure and dry matter such as cellulose and lignin. Significant spectral regions in the short-wave infrared were also shown in the models of carbon, which is the major component of dry matter. VIP values of water models showed some high values near 1450 and 2300 nm, as identified in previous studies [52], and smaller values in spectral regions associated with dry matter. Nitrogen is a relatively small component of leaf dry matter content, with an average of 1.5% in bitterbrush and 2.3% in sagebrush in our leaf samples. Kokaly [53] identified two absorption features at 2055 and 2172 nm, on the shoulders of 2100 nm, corresponding to absorption features of proteins. Our nitrogen models showed high VIP values in this spectral region. VIP values of the four vegetation traits covered common dry matter-related wavelengths in the short-wave infrared, which demonstrated that the main spectral signatures of water-stressed shrubs in this ecoregion were attributable to dry matter.
Plot-level LMA models in sagebrush showed higher R² values and lower rRMSE values than canopy-level models. The nitrogen content model showed a higher R² value, but its rRMSE value increased as well. Results for water and carbon content were poorer than in the canopy-level models. Compared to study sites in a wetter section of the Great Basin [33], our sagebrush had a much lower nitrogen content, and shrub vegetation cover in our plots was also much lower. Many of our surveyed shrub canopies had diameters smaller than the AVIRIS-NG pixel size of 2.6 m. This study showed significant effects of canopy structure on remote sensing of traits. Knyazikhin et al. [54] argued that canopy structure plays an important role in canopy radiative transfer and in the spectral determination of nitrogen content in closed-canopy environments. Several papers have investigated the practical limits on discriminating sparse vegetation cover in dryland environments [25], and several methods have been proposed to estimate vegetation structure and cover, such as spectral unmixing and data fusion of passive and active remote sensing data [55-60]. We suggest that future research should examine whether combining an airborne imaging spectrometer with a high-resolution multispectral camera or lidar sensor can better predict shrub biochemical traits over large landscapes.

Conclusions
The characterization of plant biochemical traits is important for understanding dryland ecosystems and their response to environmental change. This study demonstrated the performance of the airborne imaging spectrometer AVIRIS-NG in estimating biochemical traits of sagebrush and bitterbrush in the Great Basin ecoregion, including leaf mass per area, water content, nitrogen content and carbon content. Sparse vegetation cover, complex canopy structure and small foliage size made the spectral estimation more challenging. The spectral regions identified by VIP values in PLSR models displayed significant distinctions at specific wavelengths corresponding to known biochemical absorption features, including those related to foliar lignin, cellulose, nitrogen, and water. A regional remote sensing estimation of vegetation canopy structure will facilitate a more robust prediction of vegetation traits. A future step will be to combine similar data sets from other shrub species to refine and standardize both data and methods as a basis for operationally estimating foliar traits in dryland ecosystems with the NEON airborne observation platform, NASA's proposed space-borne Hyperspectral Infrared Imager (HyspIRI), and the German EnMAP mission.