Inferring Grassland Drought Stress with Unsupervised Learning from Airborne Hyperspectral VNIR Imagery

: The 2018–2019 Central European drought had a grave impact on natural and managed ecosystems, affecting their health and productivity. We examined patterns in hyperspectral VNIR imagery using an unsupervised learning approach to improve ecosystem monitoring and the understanding of grassland drought responses. The main objectives of this study were (1) to evaluate the application of simplex volume maximisation (SiVM), an unsupervised learning method, for the detection of grassland drought stress in high-dimensional remote sensing data at the ecosystem scale and (2) to analyse the contributions of different spectral plant and soil traits to the computed stress signal. The drought status of the research site was assessed with a non-parametric standardised precipitation–evapotranspiration index (SPEI) and soil moisture measurements. We used airborne HySpex VNIR-1800 data from spring 2018 and 2019 to compare vegetation condition at the onset of the drought with the state after one year. SiVM, an interpretable matrix factorisation technique, was used to derive typical extreme spectra (archetypes) from the hyperspectral data. The classiﬁcation of archetypes allowed for the inference of qualitative drought stress levels. The results were evaluated using a set of geophysical measurements and vegetation indices as proxy variables for drought-inhibited vegetation growth. The successful application of SiVM for grassland stress detection at the ecosystem canopy scale was veriﬁed in a correlation analysis. The predictor importance was assessed with boosted beta regression. In the resulting interannual stress model, carotenoid-related variables had among the highest coefﬁcient values. The signiﬁcance of the photochemical reﬂectance index that uses 512 nm as reference wavelength (PRI 512 ) demonstrates the value of combining imaging spectrometry and unsupervised learning for the monitoring of vegetation stress. It also shows the potential of archetypical reﬂectance spectra to be used for the remote estimation of photosynthetic efﬁciency. More conclusive results could be achieved by using vegetation measurements instead of proxy variables for evaluation. It must also be investigated how the method can be generalised across ecosystems.


Introduction
The changing background climate state is projected to lead to an increase in the frequency and severity of extreme weather and climate events in many parts of the world [1][2][3]. extreme spectra (archetypes), which can be used for estimating vegetation stress levels. However, as far as we know, the method has only been used in small-scale experimental settings. Therefore, this study aims at (1) evaluating the application of SiVM to ecosystem scale hyperspectral VNIR data of a grassland site for qualitative drought stress classification and (2) analysing the contributions of different evaluation variables to the inferred stress signal.
Image acquisition was conducted during two aircraft campaigns. The first in spring 2018 at the onset of the 2018-2019 European drought, and the second in spring 2019 after one year of prolonged drought conditions. The drought status of the grassland research site was assessed with a non-parametric implementation of the standardised precipitation-evapotranspiration index (SPEI) [41] and soil moisture measurements. Because no vegetation measurements were made, statistical evaluation was based on a set of plant and soil trait proxy variables, including a number of established vegetation indices that are sensitive to drought-inhibited vegetation growth. After a correlation analysis, different models were fit for explanatory variable analysis using boosted beta regression.

Study Site and Drought Status
The intensive research site "Am Grossen Bruch" (DE-GsB) is part of the Bode catchment hydrological observatory that is a component of the German TERrestrial ENvironmental Observatories (TERENO) [42,43]. TERENO sites are equipped with numerous sensors to monitor ecosystem behaviour and status, such as carbon, water, and energy fluxes with the eddy covariance method, distributed soil moisture, as well as biotic and abiotic drivers. DE-GsB is also an associated site of the Integrated Carbon Observation System Research Infrastructure (ICOS) that aims to provide consistent long-term measurements of sources and sinks of greenhouse gases [44]. Being located in the Central German Lowland in the state of Saxony-Anhalt (see Figure 1), the region surrounding DE-GsB is characterised by a drier subcontinental climate, which is partly due to the rain shadow of the Harz mountains. The area has a negative climatic water balance [45], which makes it susceptible to more severe impacts from drought events. DE-GsB is a mesophilic grassland that is regularly used for cattle grazing. It is located near a water channel (52.03 • N, 11.10 • E) and is affected by seasonal flooding. Testing SiVM at the ecosystem scale with data from a flat, mixed grassland has certain advantages, such as minimizing the influence of uncontrolled variables, like species composition, canopy structure, and leaf angle distribution.
We assessed the hydrological conditions of DE-GsB with the SPEI, which is used for analysing anomalies in the climatic water balance from a distributional perspective. Thus, it can be used for the standardised detection of droughts and comparisons of drought severity in a statistically robust way [46]. We used a non-parametric approach to determine an empirical distribution function using a kernel density estimator, following the method that was described by Vergni et al. [47]. Gridded (1 × 1 km) daily meteorological data from a follow-up product of the High-Resolution Dataset of Water Fluxes and States for Germany [48] were used as input data, covering a time span from 1947-2019. These forcing data itself are derived from weather station data from the German Meteorological Service (Deutscher Wetterdienst, DWD), which were interpolated with external drift kriging, using terrain elevation as the drift variable. We used a Gaussian kernel for SPEI calculation and optimised the bandwidth selection with cross validation [49]. In addition, we evaluated continuous soil water content (SWC) data that were derived from a profile with TDR measurements in three depths (CS616, Campbell Scientific, Shepshed, UK) to assess the severity of the local soil water deficiency resulting from the prolonged drought. This was done to determine whether the two images showed contrasting environmental conditions, justifying factorisation with SiVM.

Airborne Data
Flight campaigns for hyperspectral image acquisition on the DE-GsB site were conducted on 7 May 2018 and 23 April 2019. The images were collected using a HySpex VNIR sensor (NEO, Oslo, Norway) mounted on a Cessna 207 airplane from an altitude of 3 km a.g.l between 11:30 and 13:30 CEST under clear sky conditions. The spectra were recorded over the range 409-989 nm with a spectral resolution of 3.2 nm (see Table 1). Raw HySpex image blocks were radiometrically and geometrically corrected for surface reflectance with HySpexRAD and HySpexNAV (NEO). Georeferencing and calibration were performed with PARGE [50] using a digital elevation model (DEM) with an original resolution of 10 m obtained from the German Federal Agency for Cartography and Geodesy (Bundesamt für Kartographie und Geodäsie, BKG) and ground control points set in ENVI (Harris Geospatial Solutions, Boulder, CO, USA). Atmospheric correction was done with ATCOR4 [51] with visibility values from the nearby (10 km distance) Ummendorf DWD weather station [52]. The resulting image cubes had a spatial resolution of 0.5 m (2018) and 0.4 m (2019).
The atmosphere-corrected hyperspectral imagery went through the following preprocessing routines: The 2018 dataset was resampled to 0.4 m using cubic spline interpolation to align the spatial resolution and extent of both datasets. The 0.4 m target resolution was selected to increase the sample size for the statistical evaluation. The data were tested for noisy bands and bad pixels with a correlation-based approach [53]. The bands were deemed to be noisy when the Pearson correlation coefficient to adjacent bands/pixels fell below 0.8. Bad pixel candidates were selected as pixels with a value of at least 80% of the maximum band value. If the candidate value exceeded the maximum value of the same pixel in adjacent bands by 90% or more, the pixel was considered to be abnormal. Pixels with zero reflectance in all bands were also classified as bad pixels. All of the identified noisy bands were removed from all image cubes, resulting in the removal of the first seven of 182 bands from all data sets. Bad pixels were masked and excluded from factorisation. Except for the statistical evaluation of results, all of the data preprocessing and analyses were implemented in Python 3.7.3 (Python Software Foundation, https://www.python.org/, accessed on 23 February 2021).

Geophysical Data
The geophysical measurements originate from a campaign in March 2014 that was conducted following a 10 day period of heat and no precipitation. Measurements were conducted with electromagnetic induction (EMI) sensors, which establish magnetic fields at the soil surface and measure the induced magnetic field of subsurface materials [54]. The sensor models EM38DD and EM31MK2 from Geonics, Canada were used (EM38 and EM31 in the following). Additionally, gamma ray (GR) emissions of Thorium 232 Th were measured using a portable gamma ray spectrometer from GF instruments. All of the sensors were attached to sledges and pulled over the research site with GPS positioning to measure apparent electrical conductivity (ECa) and nuclide concentrations [55].
ECa is controlled by a number of soil properties and states, such as soil texture, layering, or porosity [56], which can influence vegetation ecophysiology indirectly. Concerning soil moisture, Martini et al. [57] concluded that there is a complex interplay between factors controlling ECa and soil moisture, but not in any case a direct link. EM31 and EM38 measurements were taken in vertical coil orientation, with an effective penetration depth up to 1.5 m (EM38) and 4.0 m (EM31), respectively [55]. The gamma ray measurements are related to properties that are influenced by the source rock, soil genesis, management (e.g., pH), and soil moisture. For spatial interpolation, the transects were subsampled and then examined in a variogram analysis to find the best linear unbiased estimator for ordinary kriging using the GSTools module (v1.2.1) [58]. Figure 2 provides an overview of the analysis procedure. SiVM allows the calculation of convexity constrained latent components [40], similar to precursor algorithms like Convex-NMF (C-NMF) [59] or Archetypal Analysis (AA) [60]. In the case of clustering of hyperspectral data, convex latent components are desirable as they are identical with actual measured data points, assigning them a physical meaning [61]. Furthermore, in contrast to common clustering methods, like k-means or DBSCAN, whose cluster centers represent average values of certain data regions, convexity constrained factorisation yields the characteristic extreme data points (archetypes) of a dataset [40]. These properties facilitate the interpretability of the resulting latent variable model. One major advantage of SiVM over similar factorisation algorithms is that it runs in linear time. This feature makes it a suitable algorithm for processing large amounts of data, e.g., hyperspectral image cubes at landscape scale. So far, SiVM has been used successfully for water stress detection and prediction on the individual plant scale [26,61,62]. In this study, its applicability to the ecosystem scale was evaluated.

Unsupervised Classification with Simplex Volume Maximisation
Before factorisation, all of the image cubes were transposed and concatenated along the band axis, which results in a data matrix V d×n with n samples of d-dimensional vectors. With SiVM, V d×n was decomposed into a basis (archetype) matrix W d×k and a coefficient matrix H k×n by iterative distance computations only. Thurau et al. [40] showed that fitting a (k − 1)-simplex of maximal volume to the d-dimensional input data minimizes the Frobenius norm V − HW , a common optimisation target in matrix factorisation. The simplex approximates the convex hull of the data cloud, and it's vertices are the archetypical data points in W. When compared to solving the quadratic optimisation problem as proposed by Cutler and Breiman [60] in their work on Archetypal Analysis, this approximation is computationally more efficient, and it minimizes the residual of H while preserving convexity: For h i = h 1 , ..., h k , h ij ≥ 0 and ∑ i h ij = 1. The number of archetypes should be low enough to enable generalisation, but also high enough to cover a range of different extreme reflectance signatures. In this study, the number of archetypes was set to 30. To develop a qualitative classification of grassland stress levels from the latent components, the reflectance signatures in W were classified into three categories. Some archetypes resembled the spectra of healthy plants, others showed signs of stress, and a third group comprised spectra of non-vegetated pixels (background). The classification of archetypes was based on the results of field and lab experiments [63][64][65][66][67][68], visual assessment of archetype maps, and expert knowledge. From a probabilistic viewpoint, the archetypes span a probability space in which each data point can be expressed as a convex combination of the archetypes. In other words, the coefficients h ij in H provide a measure of similarity between any input pixel and the archetypical reflectance spectra in W [26]. Therefore, all of the data points in X can be viewed as draws from a specific Dirichlet distribution. This distribution is often used for proportional data and it has the advantage of imposing the convexity constraint on the coefficients in H [61]. The Dirichlet has one parameter α, which is a k-dimensional vector and it can be estimated with a maximum-likelihood approach as described by Minka [69]. The Dirichlet has a useful aggregation property that allows to merge parts of the sample space. For example, if one tosses a six-sided dice, then the probability of all rollable numbers can be described by Dir(α 1 , ..., α 6 ). Now, if the goal is to obtain the probability of rolling odd and even numbers, the aggregated (two-event) sample space still follows a Dirichlet distribution with aggregated parameter Dir(α 1 + α 3 + α 5 , α 2 + α 4 + α 6 ) [70]. In the same manner, archetypes were lumped together according to their category, "healthy", "stressed", or "background". This allowed creating maps of drought stress by summarizing coefficient values of healthy, stressed, and background archetypes, which are abbreviated as ζ, ν and ξ below. Each resulting aggregated archetype is still beta-distributed. Factorisation was performed with the pymf module (v0.3) [71].

Evaluation with Spectral Indices and Geophysical Measurements
Unsupervised learning techniques cluster datasets by exploring structures and patterns in the data and do not rely on ground truth (labels) for model building. Since vegetation measurements were not available for the research site DE-GsB, an internal validation of model performance was not feasible. An external evaluation that was based on a set of reliable proxy variables was implemented instead. This set consists of vegetation indices, which capture spectral plant traits. These traits are proxies for drought-inhibited plant growth resulting from changes to different biochemical, physiological, and structural plant traits [72]. A number of geophysical measurements as proxies for different soil traits and a high-resolution DEM derived from UAV data were included in the evaluation data set. We selected VNIR vegetation indices that either have been used in stress detection before or are related to ecophysiological traits impacted by drought stress, although those relationships are not always linear. For most of them, successful application to grassland sites had been reported in the literature (see the following paragraphs). The indices range in complexity and use a variety of distinct spectral feature combinations. Table 2 provides details about the geophysical measurements and the DEM used in the evaluation dataset, while Table 3 gives an overview about the used vegetation indices. The broadband indices Chl REopt and Car REopt are used as proxies for vegetation chlorophyll (C ab ) and carotenoid content (C xc ). The indices were developed by Féret et al. [24] using statistical models and model inversion on a large number of experimental and synthetic datasets in order to integrate variability between species. The red edge normalised difference vegetation index (RENDVI) was used to include an established chlorophyllrelated vegetation index. The index is applicable for a broad range of species and it does not saturate for high-chlorophyll cases like dense canopies [73].
The PRI is correlated with short-term (minutes) changes of light use efficiency (LUE) through tracking variations in the de-epoxidation state of xantophylls, and water-stress indicators, like stomatal conductance [74][75][76]. It has proven to be applicable across a wide range of different types of leaf morphology, photosystems, and ecosystems. Beyond that, the PRI allows for the detection of long-term (weeks to months) changes of LUE caused by shifts in the C xc /C ab ratio [77][78][79] and is, therefore, correlated with seasonal variations in net CO 2 uptake [80]. Plant stress results in an increase of the C xc /C ab ratio and a decrease of photosynthetic LUE, which makes PRI well-suited for water stress detection. When compared to the original formulation, the PRI 512 developed by Hernández-Clemente et al. [75] is less sensitive to structural effects. We assume the former index to have very similar proxy capabilities because the PRI 512 has the same diagnostic band as the PRI.
The combination of a chlorophyll-sensitive red edge band and a NIR band makes the CTR2 an effective ratio index for stress detection [66,81]. The improved modified chlorophyll absorption ratio index (MCARI2) is a chlorophyll-corrected vegetation index for green LAI estimation [82]. In contrast to its predecessors, the index is less sensitive to variability in C ab concentrations, and it has performed well when applied to grasslands for LAI estimation [83]. The improved modified soil-adjusted vegetation index (MSAVI2) [84] was included in the analysis to account for potential soil effects. It has been applied for LAI estimation and drought detection on different grassland sites [85,86]. Because MSAVI2 is a broadband index, it was calculated from the hyperspectral data based on Sentinel-2 band specifications.
Changes in RWC can be tracked with the WBI under progressive water stress [87,88]. However, studies found that derivative spectra of the slope of the NIR water absorption feature outperformed the reflectance-based water indices in RWC estimation in different experimental settings, including grassland sites in the Netherlands and UK [89,90]. WBI and the first-order derivative of the reflectance at 950.6 nm ( f (ρ 950.6 )) were both added to the set of evaluation variables. Derivatives were calculated by applying Savitzky-Golay smoothing with a window size of 67 nm and second-order polynomial fitting [91]. Table 3. List of vegetation indices used for evaluation of the SiVM-based stress level estimates (ρ denotes reflectance at a specific wavelength or wavelength range).

Index Name
Abbr. Equation

Ref.
Opt. chlorophyll red edge index Modified chlorophyll absorption ratio index 2 MCARI2 Modified soil-adjusted vegetation index 2 MSAVI2

Statistical Inference
Spatial autocorrelation should be minimised for the statistical evaluation of the latent variable model. Therefore, a variogram analysis was conducted, followed by a systematic subsampling of data points. Background archetypes were used to exclude data points with ξ ≥ 0.4. After the removal of defective (all-zero) pixels, the evaluation data set comprised 832 data points from both 2018 and 2019. Statistical analyses were carried out with R (v3.6.3) [92]. In a first step, a Pearson correlation matrix was computed to assess the agreement between ζ, ν and the evaluation data set using the package corrplot (v0.84) [93]. In addition, the matrix gave a first impression about the occurrence of multicollinearity among evaluation variables.
A regression approach was adopted for statistical inference and multicollinearity testing. Because ζ, ν are limited to the standard unit interval (0, 1), beta regression, as introduced by Ferrari and Cribari-Neto [94], was used. Their parameterisation of the beta density uses µ as the mean of the response and φ as a precision parameter. The density of y is then defined as where 0 < µ < 1, φ > 0, and Γ(·) is the gamma function. For regressing the dependent variable on a set of independent variables, the logit link is used. The estimation of coefficients and distribution parameters is based on maximum likelihood. Beta regression was carried out using the betareg package (v3.1-3) [95]. Multicollinearity in the fitted model was assessed with the mctest package (v1.3.1) [96]. The condition number of the regressor matrix X and variance inflation factors (VIF) were used as criteria to determine collinearity among predictor variables. The objective of the statistical inference was to determine those variables (and, hence, plant traits) that most strongly influence the stress signal of the DE-GsB grassland site. Multicollinearity leads to unstable coefficient estimates and biased statistics [97]. The detection of multicollinearity in the fitted beta regression model required an alternative model fitting approach that is less sensitive to collinearity. Boosted beta regression is a statistical learning technique that is suitable for this purpose. The algorithm is based on the principles of gradient boosting [98]: In an iterative process, weak learners are fit to the data to maximise the negative gradient of a given loss function. The model is updated in every iteration, eventually yielding a strong learner. The implementation of boosted beta regression uses the gamboostLSS [99] framework, which is an extension of generalised additive models for location, scale, and shape (GAMLSS), as introduced by Stasinopoulos and Rigby [100]. In contrast to many classical methods, like generalised additive models (GAM), this flexible, semiparametric approach to regression does not restrict the response variable to follow an exponential family distribution. Moreover, it is not restricted to modeling the conditional mean of the response. Instead, every distribution parameter can be modeled as a function of different predictor variables. In the case of beta regression, this allows accounting for overdispersion by regressing φ on one or more predictor variables [101]. The predictor-response relationships are not limited to linear functions, but also encompass nonlinear and smooth functions, e.g., regression splines. The additive predictors are computed in the same way as in traditional GAMs [102].
Component-wise fitting of covariates constitutes the link between gradient boosting and GAMLSS. In the multivariate case, only one component of the regressor matrix X is fitted to the gradient vector per weak learner [103]. In each boosting iteration, the model is updated with the best-performing weak learner, which leads to a strictly additive structure [99]. For boosted beta regression, a weak learner will typically be a method, like univariate linear regression, penalised regression splines, or ridge regression. Because all of the distribution parameters are estimated, the algorithm constructs a distinct additive predictor for each distribution parameter by component-wise updates of separate prediction functions per iteration. Optimizing beta regression with gamboostLSS can be written as where η µ and η φ are the additive predictors, ρ is the negative log-likelihood, and (X, Y) are predictor matrix and response vector [99]. Although, in practice, instead of minimizing the negative log-likelihood, the loss function is minimised by gradient descent, as the expected value is unknown. When setting an early stopping iteration m stop , this approach has some beneficial properties: covariates that are never selected in the updating process are excluded from the final model [104]. Thus, data-driven variable selection to exclude less important predictors is performed, which alleviates collinearity [101]. Moreover, the coefficients of the selected predictors are shrunk towards zero. Similar to regularised regression approaches, early stopping in gamboostLSS introduces an estimation bias in exchange for decreased variance, which results in a more parsimonious model [105]. In low-dimensional settings with few predictors and many observations, boosting techniques are known to include too many noisy variables [106]. Therefore, m stop was set to 300 to achieve sufficient variable selection and shrinkage, and the learning rate was set to 0.05. Three different models were fit: besides the full (interannual) model for the combined 2018-2019 dataset, two subset models were fit using only data points from 2018 or 2019. The evaluation datasets that were used in fitting of the models were mean-centered and standardised independently, and a global intercept variable was added. Boosted beta regression was performed using the gamboostLSS package (v2.0-1.1) [107].

Drought Status
We investigated the drought and soil water status of the research site with threemonths SPEI values and relative soil water content measurements (see Figure 3). The SPEI time series shows that the 2018 hyperspectral flight took place at the onset of an exceptional meteorological drought period that lasted until early 2019. At the time of first image acquisition, soil moisture had started to decrease. However, after a rather cool and wet summer and winter in 2017, soil moisture at 30 cm depth was still around 50%. During spring 2018, soil moisture was decreasing rapidly and SPEI values of ≤−2 show persistent extreme drought conditions in summer 2018. The second image acquisition flight took place in spring 2019 after a winter with near-average precipitation and temperature conditions and subsequent re-emerging drought conditions, as indicated by the SPEI values. The soil moisture time series gives insight into the impact of the 2018-2019 drought on water availability at DE-GsB as the relative water content was quickly approaching the low levels of 2018 again. Because water reservoirs were insufficiently replenished by winter precipitation, the second image acquisition date is characterised by rather low soil water contents at around 25 or 35%. The analysis of these drought indicators confirms that the two hyperspectral scenes show contrasting environmental conditions that can be deemed to be sufficiently extreme for application of SiVM. It also illustrates the intensity and duration of the drought period that affected the research site between the two flight campaigns.

Archetypes and Stress Maps
Matrix factorisation with SiVM resulted in a set of archetypes that encompass a range of spectra with distinct features. Figure 4 shows the classified archetypes. The background class holds more than half of all calculated archetypes due to the wide range of observed non-vegetation land cover. They mostly lack the trough in red reflectance and the sharp increase in the red edge area, which are characteristic for vegetation reflectance spectra [108]. Differences between healthy and moderately stressed vegetation are subtle. Vegetation stress leads to an increase in red reflectance, which reduces the red edge slope until the red reflectance trough around 680 nm disappears [23]. Healthy archetypes show a decrease in NIR reflectance beyond 900 nm, which is not the case for most stressed archetypes.

Correlation Analysis
The results shown in Figure 6 show an overall good agreement of the computed qualitative stress metric and vegetation indices, implying a robust performance of SiVMbased stress classification at the canopy level. The correlation matrix of the aggregated archetypes ζ and ν, as well as the evaluation variables, demonstrates the correspondence between the latent variable-based stress index and established vegetation growth proxies as well as potentially relevant geophysical measurements. Correlation analysis of the spectral indices revealed considerable collinearity among the predictor variables. The absolute correlations of ζ are consistently stronger than for ν. We assume this to be a consequence of the stronger similarity between stressed and background spectra that makes an exact distinction of these categories more difficult. Therefore, we focused on ζ as a response variable in the statistical analyses. The RENDVI as a proxy for C ab content and plant vitality is positively correlated with ζ. Likewise, the two optimised pigment indices Chl REopt and Car REopt point to higher pigment concentrations in less stressed vegetation, with Car REopt exhibiting the strongest correlation of all evaluation variables (r = 0.89, p < 0.001). Of the two vegetation indices that are related to structural traits, the MCARI2 exhibited slightly higher correlation values than the MSAVI2. The results indicate a larger photosynthetically active leaf area in vegetation with higher ζ values, as expected. The RWC-related spectral indices WBI and f (ρ 950.6 ) lead to very similar results and suggest a good level of agreement between the latent variable-based stress index and plant water content.
The correspondence between the ECa and GR measurements, as well as stress level estimates ζ/ν, is low. For these soil proxy variables, the strongest correlations result from the EM38 measurements (r = −0.28, p < 0.001). The DEM exhibits similarly low correlation values. Geophysical variables and DEM seem to be rather unrelated to the computed stress signal. Nevertheless, the variables were included in the statistical model to verify this assertion by variable selection.

Boosted Beta Regression
Before interpreting the results of the statistical analysis, multicollinearity diagnostics were performed with a beta regression model that included ζ as the dependent and all evaluation variables as the independent variables. The beta distribution parameters µ and φ were both estimated by the full set of regressors. The regressor matrix has a condition number of 3388, which suggests severe multicollinearity. VIF also indicate multicollinearity in all vegetation indices, except for the PRI 512 , which had a VIF of 6.8 (see Table 4). While Car REopt had a VIF of 19, all other vegetation index VIF values were in the range of 73-216, which indicates severe multicollinearity. The rest of the predictor variables exhibited substantially lower VIF values. Table 4 shows the estimated coefficient values from the three different model fits for the expected value µ and the precision parameter φ. The interannual full model (FM) results show that substantial variable selection has taken place. Four out of nine spectral indices that were related to plant traits were not included in the estimation of µ. MCARI2, CTR2, and RENDVI were excluded or marginalised in all µ models. Likewise, all of the geophysical variables and DEM were either excluded or shrunk towards zero in all µ models. Car REopt is the most important predictor with the highest µ coefficient value in all models. Thus, increased C xc is associated with an increased probability of observing a healthy pixel. The same applies to higher PRI 512 values in FM µ. This dependency is much weaker in the subset models M18 µ and M19 µ. Increased WBI values are associated with less vegetation stress in all models, with weaker relationships in the subset models. WBI and f (ρ 950.6 ) had been almost perfectly collinear (r = −0.98) in the correlation analysis, so that the effect of the derivative reflectance did not contribute to FM µ. The situation is less clear in the subset models where both of the predictors that are related to relative water content contribute to the regression, although f (ρ 950.6 ) has relatively small coefficient values. In contrast to the results of the correlation analysis, the coefficients for Chl REopt and MSAVI2 are negative in M18 µ and M19 µ. e.g., MSAVI2 is related to green LAI, but a correspondence of lower LAI and healthier vegetation is implausible. The inclusion of collinear variables induces instability in the model and it can result in erratic coefficient estimation. We assume this to be the case for Chl REopt and MSAVI2, as both are highly correlated with Car REopt . In the subset models, MSAVI2 shows small, negative coefficient values, while the picture is not clear for Chl REopt . In conclusion, the strongest predictors in the full model are linked to carotenoids (Car REopt , PRI 512 ), LAI (MSAVI2), and water content (WBI). The subset µ models contain three main predictors: Car REopt , Chl REopt , and WBI. The difference is especially large for the PRI 512 which is only relevant in the FM µ. Coefficients for M19 µ are generally smaller, probably due to the prevalence of stressed vegetation and, hence, the smaller range of stress values in the 2019 dataset. Table 4. Variance inflation factors (VIF) from multicollinearity analysis of the evaluation variables. The estimated coefficient values from fitting the interannual full model (FM) and subset models (M18, M19) with boosted beta regression (early stopping after 300 iterations, learning rate: 0.05, coefficient values rounded to three digits). Car REopt is the most relevant predictor in two out of three φ models. φ is a precision parameter and, therefore, vegetation with higher Car REopt values is associated with a larger variance of y. In contrast to the µ models, CTR2 was an important predictor in every φ model. Likewise, the DEM and geophysical variables have small, mostly negative coefficients in the precision models, with the EM38 measurements being the most relevant. However, in this study, focus was on the location parameter µ.

Archetype Classification
The classification of archetypical spectra shown in Figure 4 is in line with results from previous experiments. Several studies reported decreases in NIR reflectance beyond 850 nm under increasing water stress [63,64,66]. Likewise, the water absorption feature around 970 nm becomes less pronounced [110]. The decrease in NIR reflectance over 900 nm exhibited by healthy archetypes is probably related to stronger water absorption in healthy vegetation. Nevertheless, NIR reflectance exhibits much more variability with varying biophysical variables, like leaf angle distribution or LAI, when compared to the visible spectrum [111]. Therefore, distinguishing healthy and stressed spectra is more reliable when based on visible spectrum wavelengths [108]. Further criteria like NIR reflectance should be consulted in the case of ambiguity because a classification of mixed pixels cannot be highly selective.
The maps presented in Figure 5 show strongly increased vegetation stress levels across the entire site in the 2019 image. This can be attributed to the prolonged drought in the region (see Figure 3). The substantial decrease in soil water availability has a marked impact on the pasture. The background patch that is shown in the upper left quarter of both images is related to grazing and management of the site, while the extended background patch in the lower left quarter of the 2018 image can be explained by a depression that is regularly flooded. In Figure 5b, vegetation in the depression exhibits among the highest stress levels owing to a combination of multiple stress agents. This shows that the computed stress index does not distinguish between the two stressors that occurred in the area of investigation before the second flight campaign.
The analysis of field data entails limited control of influencing factors. In the case of the DE-GsB grassland site, this is particularly the case for grazing. Depending on its intensity, grazing can be an additional stressor, but it can also have beneficial effects, such as a reduction of competition between plants [112]. Because detailed information about grazing intensity and timing was not available, we did not include the influence of grazing in the analysis. Most grass species are shallow-rooted, so that their water status is more dependent on surface soil water and, hence, precipitation [113]. Therefore, long-term stress has different consequences for grassland species, which are often adapted to cope with periods of water shortage when compared to deep-rooted plants. For such a dynamic ecosystem, the comparison of the 2018 and 2019 stress classification should, thus, not be interpreted as the aggregated effect of a prolonged drought period. The datasets rather represent two extreme cases that can be viewed as reference points to observe anomalies.

Importance of Variables
We validated the latent variable-based stress classification with a set of evaluation variables. As a general remark, these results are not based on direct vegetation measurements, which limits the informative value of the computed relationships. Nevertheless, the results of the correlation analysis imply the successful application of SiVM at the ecosystem scale as the inferred stress signal has a high degree of correspondence with changes in plant trait proxies indicating inhibited growth. Best-performing vegetation indices are Car REopt (proxy for C xc ), PRI 512 (proxy for C xc /C ab ratio and LUE) and WBI (proxy for RWC). For assessing the severity of vegetation stress, pigment-related indices Chl REopt , Car REopt , and PRI 512 are important. Carotenoids serve multiple purposes in plant physiology, playing an important role in capturing light energy as well as dissipating excess energy [114]. Because they protect the plants' photosystems from damage, e.g., counteracting different forms of reactive oxygen species, increases in C xc are usually observed under severe stress conditions [115]. Therefore, we expected an increase in Car REopt (and hence C xc ) in stressed vegetation, but the lower Car REopt values in the 2019 dataset imply only moderate stress conditions. However, one would expect C ab to decrease faster than total C xc [73,116] under stress conditions. Such an increase in the C xc /C ab ratio would manifest in a tendency towards lower PRI values [78]. In fact, the strong positive correlation of ζ and PRI 512 suggests an increase in the C xc /C ab ratio following the drought period.
These results can be explained by the sensitivity of grasslands to precipitation patterns [117], which allows for faster recovery after rainfall events. Although soil water was decreasing again in spring 2019, winter precipitation had allowed some replenishment of the diminishing reservoirs (see Figure 3). The timing of data collection is another influencing factor. Both of the images were taken in spring, so that the ecosystems were not subject to the heat stress of the summer months. We conclude that the average vegetation stress level at DE-GsB in spring 2019 was not severe, but moderate. This claim is also supported by the shape of the archetypes in Figure 4. Severe stress leads to a notable "blue shift" of the red edge transition point that moves to shorter wavelengths [65,118]. No such shift could be observed in the stressed archetypes with the given spectral resolution of 3.2 nm.
Boosted beta regression was useful in removing redundant predictors from the model, despite some presumed remaining instability in the final models due to multicollinearity [119,120]. Nevertheless, statistical inference supports the conclusions regarding the variable importance drawn from the correlation analysis. All of the geophysical measurements and the DEM were excluded from all µ models or shrunk towards zero. In case of the DEM, the exclusion can be attributed to the level research site which only has an elevation range of about 1 m within it's 200 × 200 m extent. The estimation of the contribution of soil traits to the vegetation stress signal would require the integration of actual soil traits, such as soil moisture, porosity or pH. Keeping the strong collinearity among certain vegetation indices in mind, the notion of "irrelevant" proxy variables should be considered with caution [121]. If there is a physiological cause for collinearity among predictors, the variables in question have to be considered as relevant. This is the case for the pigment indices Chl REopt and Car REopt (see Figure 6). We assume a close correlation of C ab and C xc as long as plants are not severely stressed [122], even though Car REopt is the vegetation index that shows the closest agreement with ζ values. PRI 512 is the predictor variable with the smallest VIF value in multicollinearity testing and it is an important variable in the full µ model. However, it was quite irrelevant in the subset models. This demonstrates that the PRI 512 tracked "long-term" stress responses of the grassland site, but it is less related to the within-image stress gradient. We reason that the PRI 512 contributed most unique information to the estimation procedure. The importance of this spectral trait constitutes a possible link between the latent variable model and photosynthesis.

Potential of SiVM for Vegetation Monitoring
Even though there is a strong correlation between ζ and the carotenoid-related variables Car REopt and PRI 512 , additional campaigns to collect imagery of severely stressed grassland would be required to confirm the results. This illustrates a restriction of SiVMbased stress detection that does not result in a standardised, quantitative measurement of stress, but generates a qualitative classification of stress levels occuring in the input datasets. Hyperspectral reference datasets of extreme environmental conditions can be included in an analysis to address this issue.
The present work also highlights several advantages of the factorisation-based stress detection approach. The integrated treatment of background, shade, and other effects that can adversely affect an analysis makes different preprocessing steps obsolete. Because SiVM is a method of unsupervised learning, it does not require previous knowledge or labeled data to perform clustering. This property makes it a valuable tool for areas where ground truth is not available, location of spectral features of interest is unknown, or in the case of rapid diagnosis of stress levels of vegetation, irrespective of the actual driver. Given that all calculated archetypes and their combinations are restricted to the unit interval, results from the factorisation are easily and intuitively interpretable. Although not investigated in this study, SiVM has proven to be successful in pre-symptomatic stress detection, being able to track temporal stress development several days earlier when compared to visual analysis or vegetation indices [26]. The method has also been expanded to include Gaussian process priors of arbitrary covariates, such as plant species or location for prediction tasks [61].
The demonstrated approach to latent variable stress detection did not result in a distinction of different stressors. The fact that drought and waterlogging both lead to an increased production of reactive oxygen species and changes in carotenoid content [123] is one example for the difficulties in stress agent distinction. The reflectance changes in the visible wavelengths are known to be similar for different stress sources [65]. Stress responses are species dependent, but the 0.4 m spatial resolution in this analysis did not allow for detecting species responses. Thus, we inferred the functional stress response of a grassland ecosystem. The quality and specificity of stress classifications could be improved by incorporating SWIR and TIR wavelengths into the factorisation [26]. Both of the spectral regions contain essential information regarding the vegetation health status related to heat and drought stress, and they have been employed in many applications [22,124,125]. Nevertheless, the visible spectrum is a reliable indicator of plant stress in general. The use of visible spectrum sensors is cheaper and, therefore, more widespread, which makes the methodology presented in this study suitable for low-cost monitoring networks [126], whether in precision agriculture or scientific applications [127].
The generalisation of the methodology has to be further tested. As a first step, SiVM should be applied to imagery from different ecosystems to evaluate the robustness of the resulting stress index and investigate necessary adjustments of the approach. While the method has performed satisfactory using data from a rather homogeneous canopy of simple-structured grass leaves, its application to a structurally complex canopy remains to be evaluated. Extrapolation of derived stress indices to larger scales has to be evaluated in detail. Different applications of SiVM for stress detection at the leaf scale have already been proven successful [26,61,128]. In this study, we could show that the methodology can be transferred to the ecosystem scale. Hence, an upscaling effort to the regional or continental scale based on spaceborne data seems to be worthwhile, when considering the relevance of satellites for monitoring tasks and the increasing ability of next-gen spaceborne spectrometers, like DESIS [129], PRISMA [130], or the upcoming EnMAP [131].

Conclusions
The main achievement of this study is the successful application of SiVM for an unsupervised classification of grassland drought stress at the ecosystem scale. The stressrelated vegetation indices included in the analysis were strongly correlated with the stress classification computed with SiVM (four out of nine with |r| > 0.8, none with |r| < 0.75, p < 0.001). Previously, this method has been applied at a close range above individual plants and a homogeneous corn canopy [26,61]. The findings of this study suggest that SiVM is robust towards mixed pixel effects, and it can be transferred to field scale applications and beyond. This property, the capability for pre-symptomatic stress detection, and the easily interpretable results make SiVM a valuable tool for precision agriculture applications that use remote sensing data from airborne sensors. Large-scale stress detection is also highly valuable for reducing uncertainties in land surface and ecosystem models. Physiological stress indices can be integrated to improve the estimation of the impact of stress on carbon and water fluxes of ecosystems [132,133].
We performed different statistical analyses to assess the quality of SiVM-based stress classification using a set of evaluation variables that included spectral indices and geophysical measurements. In the correlation analysis, Car REopt had the highest correlation with computed vegetation stress levels (r = 0.89, p < 0.001). This proxy for carotenoid content indicates higher carotenoid levels in healthier vegetation. In contrast to our expectations, Car REopt values in the 2019 dataset decreased, which suggested a C xc reduction. This shows that neither prolonged water stress during 2018 nor the re-emering drought in spring 2019 had a lasting or severe impact on vegetation health at DE-GsB. Immediately after an intensive heat and drought event, like the summer 2018 drought, increased C xc values would most likely have been observable. However, a validation with direct vegetation measurements would have generated more reliable results.
Statistical inference with boosted beta regression revealed that PRI 512 , the second carotenoid-related index, is closely related to the unsupervised stress classification in the interannual full model and it shows comparably low multicollinearity with other evaluation variables. Likewise, covariation between PRI and carbon uptake of plants has been found in a number of studies [74,80,134]. These results suggest that a relationship between the SiVM-based stress classification and photosynthetic efficiency exists. Thus, the methodology could possibly be used for the estimation of carbon fluxes based on hyperspectral data. Further research on the issue would require local hyperspectral data of high temporal resolution in combination with eddy covariance measurements to enable the calibration of remotely sensed flux estimates [25]. Local scale optical sensors are less dependent on favorable atmospheric conditions than satellites and enable continuous measurements. If carbon flux estimation with SiVM proves to be practical at the local scale, transfer to satellite scale could be considered for large-scale productivity monitoring.
Plant stress that is caused by water scarcity or water logging cannot be distinguished so far. This limitation could be improved by introducing more categories into the archetype classification. However, a more precise distinction requires in-depth knowledge regarding the subtle disparities of reflectance spectra due to different stress agents for different species and landscape types. A database of labeled spectral fingerprints of stress effects could be used to achieve a more differentiated classification with high accuracy using supervised learning techniques. Beyond that, labeling data within already operative "fingerprinting" efforts, like the Spectranomics database [135], would produce synergies that could foster an easier usage of hyperspectral data for different monitoring tasks. The authors are thankful for the good cooperation and inspiring discussions within the project group. We kindly thank F. Boeing for providing interpolated daily meteorological data and S. Kögler for providing the DEM. We thank M. Pohle for acquisition of geophysical data. The technical support from A. Schmidt and M. Wu is gratefully acknowledged.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: