Validating the Predictive Power of Statistical Models in Retrieving Leaf Dry Matter Content of a Coastal Wetland from a Sentinel-2 Image

Abstract: Leaf dry matter content (LDMC), the ratio of leaf dry mass to fresh mass, is a key plant trait and an indicator of many critical aspects of plant growth and survival. Accurate and fast detection of the spatiotemporal dynamics of LDMC would help in understanding plants' carbon assimilation and relative growth rate, and could then be used as an input for vegetation process models to monitor ecosystems. Satellite remote sensing is an effective tool for predicting such plant traits non-destructively. However, studies on the applicability of remote sensing for LDMC retrieval are scarce. Only a few studies have looked into the practicality of using remotely sensed data for the prediction of LDMC in a forest ecosystem. In this study, we assessed the performance of partial least squares regression (PLSR) plus 11 widely used vegetation indices (VIs), calculated based on different combinations of Sentinel-2 bands, in predicting LDMC in a coastal wetland. The accuracy of the selected methods was validated using LDMC, destructively measured in 50 randomly distributed sample plots at the study site in Schiermonnikoog, the Netherlands. The PLSR applied to canopy reflectance of Sentinel-2 bands resulted in accurate prediction of LDMC (coefficient of determination (R²) = 0.71, RMSE = 0.033). PLSR applied to the studied VIs provided an R² of 0.70 and RMSE of 0.033. Four vegetation indices (enhanced vegetation index (EVI), specific leaf area vegetation index (SLAVI), simple ratio vegetation index (SRVI), and visible atmospherically resistant index (VARI)) computed using band 3 (green) and band 11 of Sentinel-2 performed equally well and achieved a good measure of accuracy (R² = 0.67, RMSE = 0.034). Our findings demonstrate the feasibility of using Sentinel-2 surface reflectance data to map LDMC in a coastal wetland.


Introduction
Many ecological studies have shown that plant functional traits control a variety of terrestrial ecosystem properties, including productivity, nutrient dynamics, and soil carbon storage (e.g., [1,2]). Individual species and whole plant communities respond to natural and anthropogenic gradients, as well as to climatically different growing seasons, by adjusting their physiology. This can be studied by quantifying traits [3]. Leaf dry matter content (LDMC) is one of the most widely used plant functional traits from the leaf economics spectrum [4] and provides essential information on the response of plant communities to changing environmental conditions [5].
Verrelst, et al. [38] sub-categorized the techniques into: (i) parametric regression, (ii) non-parametric regression, (iii) physically-based, and (iv) combined methods. The different forms of vegetation indices and parametric approaches based on quasi-continuous spectral band configurations are grouped under the parametric (statistical) approaches. Stepwise multiple linear regression (SMLR), principal components regression (PCR), partial least squares regression (PLSR), artificial neural networks (ANNs), Kernel methods, and Bayesian networks are some of the non-parametric (statistical) approaches. Physical approaches are based on radiative transfer model inversion. The combined methods merge elements of statistical as well as physical models.
Many of the statistically-based approaches, such as vegetation indices (VIs), are simple and convenient algebraic combinations of spectral information used to retrieve variables from remotely sensed data. They are easy and fast to implement, but often have the limitation of being specific to time, vegetation type, and location. Moreover, the representativeness of the relationship in statistical models is limited to the representativeness of the database [26]. Physically-based and combined methods, on the other hand, allow the creation of simulated training databases covering a wide range of spectral data to which inversion algorithms can be applied to retrieve variables. However, they are computationally demanding, and uncertainties in models may result in large variations in results [39]. It is difficult to obtain optimal parameterized solutions for radiative transfer model inversions [40], which may then provide challenges when extrapolating models in space and time.
Therefore, due to their ease of computation, robustness, and capacity, statistics-based approaches have been widely used. A number of researchers [20,41-44] have validated the performance of various forms of VIs in predicting vegetation parameters, such as chlorophyll content, leaf area index (LAI), SLA, fractional vegetation cover, and biomass, from remotely sensed data. Inoue, Guerif [20] reported the better performance of the non-parametric model PLSR, which can utilize spectral information of many wavebands, compared to other statistical models in predicting canopy chlorophyll content (CCC) from different remotely sensed datasets. Similarly, Atzberger, et al. [45], who investigated the predictive power and noise sensitivity of three non-parametric regression methods (i.e., SMLR, PCR, and PLSR) to assess CCC of winter wheat using spectroradiometric measurements obtained at multiple sites and dates, found PLSR to be relatively insensitive to sensor noise and to outperform the other techniques.
Although estimation of leaf traits from remotely sensed data has been widely studied, to our knowledge, no reports exist that validate the performance of remote sensing methods in predicting LDMC from the recently launched Sentinel-2 MultiSpectral Instrument (MSI). Sentinel-2 is one of a new generation of satellites with medium spectral and high spatial resolution imagery and provides high temporal imaging for regional, continental, and global vegetation studies. The potential contributions of Sentinel data products for environmental monitoring were recently assessed by several authors (e.g., [46,47]). The improvements in the spectral and spatial resolution of this imagery may enable accurate prediction of biodiversity variables over a large spatiotemporal extent. This study aims to test how accurately LDMC can be estimated in a saltmarsh and grassland ecosystem from Sentinel-2 data by examining the performance of statistics-based approaches, including partial least squares regression (PLSR) and vegetation indices optimized to Sentinel-2 band settings.
There may be a seasonal variation in the amount of LDMC. Climate conditions in different seasons may lead to varying amounts of LDMC during the growth periods. Therefore, it is of high importance to have a long-term record of LDMC to disentangle the temporary changes that occur under normal growth conditions from permanent alterations of LDMC that indicate change patterns in the functioning of the ecosystem. However, here we tested the potential of statistical methods to retrieve LDMC from Sentinel-2 data so that the recommended approach could be used in future studies to predict those long-time-series records of LDMC products required to examine and understand plant responses to climate and other environmental changes. Thus, spatiotemporal variation analysis is beyond the scope of this study, and our objective is to identify the best statistical algorithm for accurate prediction of LDMC using a Sentinel-2 image that matches our field campaign.

Study Area
Schiermonnikoog is one of the Dutch barrier islands. It is located in the northern part of the Netherlands (province of Friesland), with central geographical coordinates 53°29′21.7464″ N and 6°13′51.2796″ E (Figure 1). The island is about 40 km² in area and has approximately 1000 inhabitants in its single village. It has an annual rainfall of 824 mm and an annual average temperature of 10.2 °C.

The island predominantly consists of natural landscapes, including beaches in the north, dunes extending from west to east, and saltmarsh in the south and southeast. The island is a valuable coastal wetland site for essential ecological services and was designated a Ramsar wetland site in 2000 [48]. The island's entire natural area was officially declared a national park in 1989. Forest, shrub, and grass form the main vegetation cover types (Figure 1). The dune area is forested and the saltmarsh area is covered in herbs, sedges, rushes, and grasses [49].

Field Data
A field campaign was conducted between September 26 and October 5, 2017. The test site was the grassland and saltmarsh area in the southeast part of the island. First, the test site was divided into six strata based on the existing vegetation type map of the island (Figure 1). We randomly selected 50 plots from four of the six main vegetation cover strata. Samples were collected from 24 plots in the middle-high marsh, nine plots in the brackish marsh, nine plots in the high marsh, and eight plots in the low marsh areas (a total of 50 plots). Due to harsh weather and site conditions, no data were collected from the pre-pioneer and pioneer zone marsh areas. The plots were, on average, 250 m from an open water body to avoid the effect of water on the sample plots' reflectance. Given time, budget, and resource constraints, we considered a sample of 50 plots sufficient to represent the selected test site. This sample size (50) has been previously used for leaf area index estimation of the same saltmarsh [34].
Remote Sens. 2019, 11, 1936

Each plot was 20 × 20 m. The coordinates of each plot were recorded by averaging 50 GPS readings (Garmin eTrex 30x, ±2 m accuracy), and leaf samples were collected from five representative subplots of 1 × 1 m, evenly distributed within a given plot (see Figure 2). Leaf samples were placed in a zip-locked plastic bag, transported to the laboratory, and stored in a cold dark room. All samples were processed on the day of collection. A high-precision digital scale was used to measure the fresh weight of each sample. The samples were then oven-dried at 60 °C for 72 h, and their dry biomass was weighed. LDMC was then computed from each sample's fresh and dry weight, according to Equation (1):

$$\mathrm{LDMC} = \frac{W_d}{W_f} \qquad (1)$$

where LDMC is leaf dry matter content (g/g), and $W_d$ and $W_f$ are leaf dry and fresh weight in g, respectively. Table 1 presents a summary of the samples' statistics.

Sentinel-2 Image and Pre-Processing
The Sentinel-2 satellite image (Level-1C) covering the study area from October 13, 2017, was found to be relatively cloud-free (<10%), and thus was selected for the study. The downloaded image, which had been corrected for systematic radiometric and geometric errors, was atmospherically corrected and converted from top-of-atmosphere (TOA) Level-1C reflectance to bottom-of-atmosphere (BOA) Level-2A reflectance using the Sen2Cor 2.5.5 stand-alone software, which is freely distributed under the GNU General Public License (http://step.esa.int/main/third-party-plugins-2/sen2cor/). The output of the process provided three sets of images of spectral reflectance of Sentinel-2 bands with 10, 20, and 60 m resolution, respectively. The bands with 10 m spatial resolution were resampled to 20 m, and spectral information from nine bands (bands 2, 3, 4, 5, 6, 7, 8A, 11, and 12) at a 20 m cell size, which approximates the sample plots' size (20 m × 20 m), was used for this study. Bands 1, 9, and 10 of Sentinel-2 are mainly used for atmospheric corrections and were not relevant for our purpose. The reflectance values of the sample plots were extracted from the projected reflectance image and used for calibration and validation of the algorithms used in this study.

Methods
From the overarching methods available in the literature, this study evaluated the performance of 11 vegetation indices and PLSR in predicting LDMC of a coastal wetland from Sentinel-2 imagery. The selection of those statistical models was based on their simplicity, computational efficiency, and accuracy regarding reliable and fast estimation of plant traits from remotely sensed data. We aimed to demonstrate for the first time the applicability of Sentinel-2 data for fast and reliable measurement of the weight-based plant trait LDMC. Developing new algorithms or comparing different existing methods to improve the prediction accuracy and precision was beyond the scope of this study.

Vegetation Indices
There are no studies that have tested the performance of vegetation indices in retrieving LDMC from Sentinel-2 satellite images. In the literature, the shortwave infrared (SWIR) region of the electromagnetic spectrum is reported as the most sensitive region for predicting dry-matter-related parameters [13,24,28,50,51]. However, Sentinel-2 only has a few bands in this region, and therefore, examination of all possible combinations of the Sentinel-2 bands is imperative for accurate retrieval of LDMC.
We used vegetation indices that have proven robust for estimating leaf dry-mass-related traits, such as SLA, from airborne and ground spectroradiometer measurements [44]. Table 2 shows the selected indices that were tested to assess their capability in predicting LDMC from Sentinel-2 reflectance data. The bands for each index were determined by testing the performance of every combination of Sentinel-2 bands. The best band combination for an index was the one that provided the highest coefficient of determination (R²) between the index and the in situ measured LDMC dataset.
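The band search described above can be sketched for the simplest case, a two-band simple ratio. This is a minimal illustration rather than the authors' implementation: the function name `simple_ratio_r2`, the `BANDS` list, and the synthetic reflectance and LDMC values are all assumptions made for the example, and R² is taken here as the squared Pearson correlation between the index and LDMC.

```python
import numpy as np

# The nine Sentinel-2 bands used in the study, in column order (labels are illustrative).
BANDS = ["B2", "B3", "B4", "B5", "B6", "B7", "B8A", "B11", "B12"]

def simple_ratio_r2(reflectance, ldmc):
    """R^2 between the simple ratio band_i / band_j and LDMC for every ordered band pair."""
    n_bands = reflectance.shape[1]
    r2 = np.full((n_bands, n_bands), np.nan)  # diagonal stays NaN (a band over itself is constant)
    for i in range(n_bands):
        for j in range(n_bands):
            if i == j:
                continue
            index = reflectance[:, i] / reflectance[:, j]
            r2[i, j] = np.corrcoef(index, ldmc)[0, 1] ** 2
    return r2

# Synthetic stand-in for 50 plots; NOT the study's data.
# LDMC is constructed to depend on B11 / B3 so the search has a known answer.
rng = np.random.default_rng(0)
reflectance = rng.uniform(0.05, 0.5, size=(50, len(BANDS)))
ldmc = 0.20 + 0.05 * reflectance[:, 7] / reflectance[:, 1] + rng.normal(0, 0.001, 50)

r2 = simple_ratio_r2(reflectance, ldmc)
i, j = np.unravel_index(np.nanargmax(r2), r2.shape)
print(f"best simple ratio: {BANDS[i]}/{BANDS[j]}, R^2 = {r2[i, j]:.3f}")
```

The resulting matrix has the same layout as a two-dimensional R² plot over band pairs. Indices with three bands would need a third loop, and their formulas are not reproduced here.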

Multivariable Regression Models
Multivariable regression models, such as partial least squares regression (PLSR), have been proven to be highly applicable when aiming to quantify vegetation characteristics from remotely sensed data (e.g., [20,37,45]). Unlike vegetation indices, multivariable regression models are able to use all of the spectral information from remotely sensed data, and may thus have higher predictive power than vegetation indices. Therefore, to make a general comparison of the performance of statistics-based approaches, we also examined the performance of PLSR in predicting LDMC from Sentinel-2 data using spectral wavebands. We also applied PLSR to develop empirical relationships between LDMC and all of the vegetation indices tested in this study, as previous research indicated that integration of multiple vegetation indices could improve prediction accuracy [62]. The number of components of the multivariable methods was optimized by testing different combinations of explanatory variables (i.e., Sentinel-2 bands and vegetation indices), adding an extra component to the models, and observing measures of accuracy (RMSE and R²) between the in situ and predicted values of the leaf trait. To examine the influence of each predictor variable in the model, an analysis of the variable importance in the prediction (VIP) was undertaken. The analysis was performed in Matlab R2017b using the Integrated Library for Partial Least Squares Regression (libPLS) toolbox [63] to calculate VIP scores.
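For readers who want to see the mechanics, the sketch below fits a single-response PLS model with the NIPALS algorithm and computes VIP scores in NumPy. It is not the libPLS code used in the study: the function name `pls1_fit`, the VIP formula (the commonly used weighted sum of squared unit-norm weights) and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit single-response PLS by NIPALS; return coefficients, intercept, and VIP scores."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean            # centred copies, deflated in the loop
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))   # unit-norm weight vectors
    P = np.zeros((n_features, n_components))   # X loadings
    q = np.zeros(n_components)                 # y loadings
    ss = np.zeros(n_components)                # y sum of squares explained per component
    for a in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w                             # scores of component a
        tt = t @ t
        p = Xr.T @ t / tt
        q[a] = yr @ t / tt
        Xr = Xr - np.outer(t, p)               # deflate X
        yr = yr - q[a] * t                     # deflate y
        W[:, a], P[:, a] = w, p
        ss[a] = q[a] ** 2 * tt
    coef = W @ np.linalg.solve(P.T @ W, q)     # regression coefficients in original X space
    intercept = y_mean - x_mean @ coef
    vip = np.sqrt(n_features * (W ** 2 @ ss) / ss.sum())
    return coef, intercept, vip

# Synthetic stand-in for 50 plots x 9 predictors; NOT the study's data.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 9))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=50)

coef, intercept, vip = pls1_fit(X, y, n_components=3)
print("predictors with VIP >= 1:", np.flatnonzero(vip >= 1))
```

Because each weight vector has unit norm, the squared VIP scores sum to the number of predictors, which is why a threshold of 1 is a natural cut-off for "above average" importance.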

Validation
Accuracy assessment of both the index- and multivariate-based approaches was performed using cross-validation. The in situ measured LDMC and the corresponding reflectance data extracted from the Sentinel-2 image were used in a leave-one-out cross-validation procedure, in which a calibration set of n − 1 samples is used to fit the predictive model, which is then evaluated using the sample that has been left out. Root mean square error (RMSE), bias, and R² were calculated as statistical measures of accuracy of the methods (Equations (2)-(4)). Models with high R², low RMSE, and a bias close to zero were considered to be more appropriate predictors of LDMC from remotely sensed data.
where $y_i$ and $\hat{y}_i$ are the actual and predicted values for sample $i$, and $n$ is the number of samples considered.
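A minimal sketch of the leave-one-out procedure for a single-predictor linear model (for example, one vegetation index) is given below. The function names `loocv_linear` and `accuracy`, and the synthetic data, are hypothetical. The metric definitions are assumptions rather than the paper's exact formulas: bias is taken as the mean of predicted minus measured values, and R² as the squared Pearson correlation between measured and predicted values.

```python
import numpy as np

def loocv_linear(x, y):
    """Leave-one-out predictions from a straight-line fit of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    pred = np.empty_like(y)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i                 # calibration set of n - 1 samples
        slope, intercept = np.polyfit(x[keep], y[keep], 1)
        pred[i] = slope * x[i] + intercept            # predict the held-out sample
    return pred

def accuracy(y, pred):
    """RMSE, bias, and R^2 under the assumed definitions stated in the lead-in."""
    y = np.asarray(y, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = pred - y
    rmse = np.sqrt(np.mean(err ** 2))
    bias = np.mean(err)
    r2 = np.corrcoef(y, pred)[0, 1] ** 2
    return rmse, bias, r2

# Synthetic stand-in for one index and LDMC over 50 plots; NOT the study's data.
rng = np.random.default_rng(2)
vi = rng.uniform(0.5, 2.0, 50)
ldmc = 0.15 + 0.10 * vi + rng.normal(0, 0.02, 50)

pred = loocv_linear(vi, ldmc)
rmse, bias, r2 = accuracy(ldmc, pred)
print(f"RMSE = {rmse:.4f}, bias = {bias:.4f}, R^2 = {r2:.2f}")
```

The same loop applies to PLSR by swapping the line fit for a multivariate fit on the n − 1 calibration samples.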

Sentinel-2 Reflectance and LDMC
To assess the response of Sentinel-2 reflectance to LDMC variation, the relationship between reflectance in the different bands of Sentinel-2 and the field-measured LDMC was investigated before the calibration of vegetation indices. As shown in Figure 3, the reflectance of most Sentinel-2 bands did not show a statistically significant correlation with LDMC variation. Band 3 (560 nm) and Band 11 (1614 nm) were the only bands that showed a significant but weak correlation (R² = 0.08 and R² = 0.06, respectively; p < 0.05).

Band Optimization of Vegetation Indices
To our knowledge, no vegetation indices have been developed or suggested specifically for accurate retrieval of LDMC from remotely sensed satellite data. Existing vegetation indices are mainly calibrated for prediction of other vegetation traits, such as chlorophyll, leaf mass per area, water, nitrogen content, and leaf area index, but not optimized for LDMC estimation. Directly applying indices obtained from the literature did not result in any strong correlation with LDMC (results not shown here). Consequently, the band combinations of the 11 tested vegetation indices were determined by comparing the performance of all possible band combinations using R² as a measure of accuracy. Figure 4 provides the 2D graphical representation (matrix) of R² used to identify the optimal band combinations of Sentinel-2 for LDMC estimation using the simple ratio vegetation index. Only a limited number of band combinations correlate significantly with LDMC. Vegetation indices based on bands at 560 nm and 1614 nm central wavelengths showed noticeably higher R² with the measured LDMC dataset.
The results of the correlational analysis between the optimized VIs and measured LDMC are illustrated in Figure 5. As expected, the optimization substantially improved the correlation between the VIs and measured LDMC. All vegetation indices showed a statistically significant linear correlation (Table 3) with LDMC (p < 0.05). A relatively lower correlation was observed for two vegetation indices, the modified chlorophyll absorption in reflectance index (MCARI) and the transformed chlorophyll absorption in reflectance index (TCARI) (R² = 0.48).
Table 3. The performance of optimized vegetation indices in predicting LDMC from Sentinel-2 top-of-canopy reflectance, cross-validated with the measured dataset (n = 50). λ1, λ2, and λ3 denote the central wavelengths of the Sentinel-2 bands used in the models. Accuracy and correlation of the statistics-based models against in situ leaf dry matter content were evaluated using the coefficient of determination (R²), root mean square error (RMSE), normalized RMSE (NRMSE), and bias. Models with higher accuracy, such as the enhanced vegetation index (EVI), specific leaf area vegetation index (SLAVI), simple ratio vegetation index (SRVI), partial least squares regression based on bands (PLSR-bands), and PLSR based on vegetation indices (PLSR-VIs), are shown in bold.

Choosing the Number of Components for PLSR
As demonstrated in Figure 6, the expected error decreased continuously up to the tenth component when PLSR was applied to the reflectance data, though at a very low rate after the fourth component. For VI-based regression, however, the expected error reached its minimum at the third component without showing any significant subsequent decrease. To avoid overfitting, an RMSE ≥ 2% change criterion was applied to determine the most appropriate number of components [37]. This led to four components for reflectance-data-based PLSR and three for VI-based PLSR. The expected prediction errors were generally much higher for many of the band-based regression techniques than for VI integration. Analysis of variable importance in the prediction indicated the importance of the near-infrared and shortwave infrared Sentinel-2 bands for the projection. However, only three vegetation indices (i.e., MCARI, TCARI, and TCARI/OSAVI) exhibited a VIP value ≥ 1.
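One way to read the 2% criterion is that an additional component is kept only if it lowers the cross-validated RMSE by at least 2% relative to the previous model. The sketch below encodes that reading; the function name `choose_n_components` and the RMSE values are hypothetical, and the exact rule used in [37] may differ.

```python
def choose_n_components(rmse_by_component, min_rel_drop=0.02):
    """Return the number of components after which RMSE stops dropping by >= min_rel_drop.

    rmse_by_component[k] is the cross-validated RMSE of the model with k + 1 components.
    """
    n_components = 1
    for k in range(1, len(rmse_by_component)):
        previous, current = rmse_by_component[k - 1], rmse_by_component[k]
        rel_drop = (previous - current) / previous
        if rel_drop < min_rel_drop:
            break                      # the extra component no longer pays for itself
        n_components = k + 1
    return n_components

# Hypothetical RMSE curves (NOT values from the study), shaped like the behaviour described above.
rmse_bands = [0.050, 0.042, 0.037, 0.0335, 0.0333, 0.0332]
rmse_vis = [0.045, 0.038, 0.034, 0.0339]

print(choose_n_components(rmse_bands))  # 4
print(choose_n_components(rmse_vis))    # 3
```

Stopping at the first sub-threshold improvement, rather than at the global RMSE minimum, is what guards against adding components that only fit noise.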

Accuracy Assessment of the Prediction
The cross-validation result demonstrated that four of the tested VIs (i.e., EVI, SLAVI, SRVI, and VARI) showed similar performances (Table 3). The optimal band combinations for many of the VIs, including the four better-performing ones, were band 11 and band 3 of Sentinel-2. However, predictions by PLSR slightly outperformed all of the VIs when regressed using either reflectance or the VIs themselves as predictors. PLSR regressed on reflectance showed a slightly higher correlation (R² = 0.71) but a slightly lower accuracy (RMSE = 0.0333) than PLSR regressed on VIs (R² = 0.70, RMSE = 0.0330). Figure 7 shows no outliers in the relationship between PLSR-predicted and measured LDMC, which contributes to the lower RMSE.

Discussion
The results of this study demonstrate the feasibility of using satellite remote sensing for accurate prediction of LDMC in combination with statistical models, which are simple and fast to implement. The findings confirmed that optimization of VIs greatly improved the relationship between spectral information and LDMC. Comparison of the results illustrated in Figures 3-5 shows the marked improvement in the relationship. All of the tested VIs exhibited a statistically significant correlation. VIs are known to significantly improve the spectrally sensitive information of vegetation variables [64,65]. However, each VI has its suitability and specific uses, and for practical applications, the choice of a VI needs to be made with caution by optimizing the existing VIs that are to be applied in a specific environment [66].
Consequently, the applicability of selected VIs in retrieving LDMC was investigated by optimizing them based on Sentinel-2 band settings. As can be seen in Table 3, the wavebands selected most frequently across the vegetation indices were the Sentinel-2 bands with central wavelengths of 560 nm and 1614 nm (band 3 and band 11). It seems that these two bands form the ideal combination for LDMC retrieval from Sentinel-2 data using the tested VIs. Sentinel-2 band 12, with a central wavelength of 2202 nm, provided the second-highest R² when used in the formulation of the VIs (see Figure 4). Sentinel-2 bands 11 and 12 are located in the SWIR region of the electromagnetic spectrum, which is reported to be the most sensitive region for different forms of leaf dry matter [13,23,24,26,27,50]. This region is reportedly sensitive to subtle variations in vegetation chemical composition, such as starch, cellulose, and lignin [67], which are constituents of LDMC.
LDMC was accurately predicted from Sentinel-2 data by using both parametric (VIs) and non-parametric (PLSR) regression methods. However, a more accurate result (R² = 0.71 and RMSE = 0.0333) was obtained by PLSR than by any of the vegetation indices (R² = 0.67, RMSE = 0.034). The superior performance of PLSR over VIs may be partly attributed to the capability of non-parametric regression approaches to utilize more spectral information from remotely sensed data. Our findings are in agreement with those by Inoue, Guerif [20], who confirmed the superior performance of PLSR compared to VIs in predicting vegetation canopy chlorophyll content from different remotely sensed datasets.
However, the empirical relationship developed by applying PLSR to the selected VIs did not improve the retrieval accuracy. Although employing PLSR on vegetation indices showed a sharp decline in RMSE % as the number of components increased (Figure 6), the cross-validation result (R² = 0.70 and RMSE = 0.033) was not significantly different from that of the reflectance-based approach. This finding is contrary to previous studies, which have suggested that applying non-parametric regression methods to integrate multiple spectral indices can improve the prediction accuracy [63]. The reason may partly be the existing correlation between predictors. When applying PLSR, a strong correlation between relevant predictors is a prerequisite to achieving good performance [68]; however, in this study, the VIs were less correlated with each other than the reflectance of the different bands (not shown). Moreover, the VIP analysis showed more bands (four) than VIs (three) with high VIP values (VIP ≥ 1) (Figure 6), which may explain why the set of VIs did not perform better than the bands as PLSR predictor variables.
It is worth noting that many of the vegetation indices optimized in this study performed well for LDMC retrieval. Generally, in the cross-validation results for VIs, R² ranged from 0.49 to 0.67, and RMSE from 0.0344 to 0.041. Of the VIs that utilized the green and SWIR bands of Sentinel-2, four provided an accurate LDMC prediction (Table 3). The proposed VIs, as well as PLSR, predicted LDMC with a higher R² than a similar study performed in a temperate forest (R² = 0.58) by Ali, Skidmore [51], which used airborne hyperspectral data and wavelet analysis. However, the lowest RMSE (0.033 g/g) in this study was higher than the latter authors' finding (RMSE = 0.016 g/g). Another study conducted in a temperate humid forest also obtained slightly higher prediction accuracy (RMSE = 0.022) by applying an artificial neural network to Landsat 7 Enhanced Thematic Mapper Plus data [11]. One possible reason may be that in the natural grassland of this study area, LDMC variability was much higher than the variability in the temperate forests. Records show the presence of more than 120 species in the study site, of which approximately 15 are dominant species [69]. The more diverse species composition of the coastal wetland may lead to high variability in LDMC, which in turn may lead to higher prediction errors. This is in agreement with earlier findings by le Maire, Francois [26], who reported larger errors for leaf mass per area (LMA) retrieval at the leaf level than at the canopy level due to the high variability of LMA in leaf-level compared to canopy-level measurements. Similarly, Ali, Darvishzadeh [28] obtained a normalized root mean square error (NRMSE) of 9%, which is close to the lowest NRMSE (10.98%) in this study, while estimating two leaf functional traits (i.e., LDMC and SLA) by inversion of the leaf radiative transfer model PROSPECT (leaf optical PROperties SPECTra model).
Several studies have investigated the direct relationship between plant traits and canopy reflectance to remotely measure and monitor plants' responses to environmental changes. For instance, Sari et al. [70] found a high Pearson correlation (R 2 = 0.86) between leaf chlorophyll and red band reflectance measured with a portable spectroradiometer. A strong relationship (R 2 = 0.88) between specific leaf area and canopy reflectance at the SWIR band of a hyperspectral sensor was reported by Ali, Darvishzadeh [44]. However, in this study, statistically significant correlations were not found between LDMC and many of the Sentinel-2 bands (Figure 3). Even the statistically significant correlation (R 2 = 0.08) found between LDMC and Band 11 reflectance was marginal. This result may be explained by the fact that area-based variables, such as chlorophyll (µg/cm 2 ) and SLA (cm 2 /g), are more closely related to canopy reflectance than mass-based variables, such as LDMC (g/g). This finding has important implications for the types of variables that can be remotely measured. The absence of a direct relationship between reflectance and a variable of interest does not impede the prediction of the variable using remotely sensed data and approaches, as the correlation between plant traits may cause strong indirect relations with remotely sensed data.
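To illustrate why a correlation with R 2 = 0.08 can be statistically significant yet marginal, the following sketch converts a squared Pearson correlation into a t statistic. It assumes that all 50 sample plots entered the correlation and that a two-sided test at the 5% level was used; neither detail is stated in this section, and the function name is hypothetical.

```python
import math

def t_statistic_from_r2(r_squared, n_samples):
    """t statistic for testing a Pearson correlation against zero (n - 2 degrees of freedom)."""
    return math.sqrt(r_squared * (n_samples - 2) / (1.0 - r_squared))

# Assumed inputs: R2 = 0.08 for Band 11 and n = 50 plots.
t_band11 = t_statistic_from_r2(0.08, 50)
# The two-sided 5% critical value for 48 degrees of freedom is about 2.01,
# so a statistic only slightly above it indicates a weak but detectable relationship.
```

Under these assumptions the statistic lies just above the critical value, consistent with describing the correlation as significant but marginal.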
In general, the accurate prediction of grassland LDMC in this study confirmed that a key leaf trait, LDMC, is measurable with Sentinel-2 reflectance data. This may lay the foundation for other ecological plant trait studies using remotely sensed satellite data over large spatiotemporal scales. It also highlights the potential role that the new Sentinel-2 sensor may play in biodiversity assessment and monitoring in different biomes across the globe. Hence, remotely sensed plant traits could facilitate future studies on plant responses to biodiversity and climate change.

Conclusions
In this study, we compared the performance of parametric and non-parametric regression methods, specifically vegetation indices and partial least squares regression (PLSR), in retrieving LDMC from Sentinel-2 data. LDMC measured in situ in 50 sample plots in a coastal wetland site on Schiermonnikoog was used to calibrate the algorithms and to validate the results through leave-one-out cross-validation.
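The leave-one-out procedure can be sketched as follows for the simplest case of a linear regression on a single predictor such as one vegetation index. This is an illustration under stated assumptions rather than the study's code: the function name is hypothetical, the data are synthetic, and R 2 is taken here as the squared correlation between observed and predicted values, which may differ from the definition used in the study.

```python
import numpy as np

def loocv_linear(x, y):
    """Leave-one-out cross-validation of a one-predictor linear regression."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    predictions = np.empty_like(y)
    indices = np.arange(len(y))
    for i in indices:
        keep = indices != i                        # hold out plot i
        slope, intercept = np.polyfit(x[keep], y[keep], deg=1)
        predictions[i] = slope * x[i] + intercept  # predict the held-out plot
    cv_rmse = float(np.sqrt(np.mean((y - predictions) ** 2)))
    cv_r2 = float(np.corrcoef(y, predictions)[0, 1] ** 2)
    return predictions, cv_r2, cv_rmse

# Synthetic, noise-free example: every held-out value is predicted exactly.
vi_demo = np.linspace(0.5, 2.0, 10)
ldmc_demo = 0.1 * vi_demo + 0.2
pred_demo, r2_demo, rmse_demo = loocv_linear(vi_demo, ldmc_demo)
```

Each plot is predicted by a model that never saw it, so the resulting R 2 and RMSE describe predictive rather than fitted accuracy.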
Our results showed that LDMC could be quickly and accurately estimated by applying a non-parametric regression approach (PLSR) to all available Sentinel-2 bands. Unlike in other studies, applying PLSR to vegetation indices did not improve the accuracy. We also examined the retrieval accuracy of vegetation indices for LDMC prediction by testing combinations of bands, and identified a suite of VIs with optimal band combinations from Sentinel-2 that can be utilized to remotely estimate LDMC. To our knowledge, this is the first time that VIs suitable for estimating LDMC from Sentinel-2 have been identified. Vegetation indices formulated based on band 3 and band 11 of Sentinel-2, such as EVI, SLAVI, SRVI, and VARI, provided more accurate LDMC estimates than those based on any other band combination. Since non-parametric regression approaches are computationally demanding and prone to overfitting, the identified VIs may provide an operationally efficient approach, particularly for large spatiotemporal scale prediction of LDMC, to better understand and monitor ecosystem function. Despite these promising results, further work is required to validate the applicability of the proposed indices in ecosystems other than wetlands.
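As an illustration of how a two-band index of this kind could be derived from Sentinel-2 surface reflectance, the sketch below computes a generic simple ratio. The band ordering (band 11 over band 3), the function name, and the reflectance values are assumptions made for the example; the exact index formulations are those defined in the methods and are not reproduced here.

```python
import numpy as np

def simple_ratio(numerator_band, denominator_band):
    """Element-wise ratio of two reflectance arrays; NaN where the denominator is zero."""
    num = np.asarray(numerator_band, dtype=float)
    den = np.asarray(denominator_band, dtype=float)
    ratio = np.full(num.shape, np.nan)
    np.divide(num, den, out=ratio, where=den != 0)
    return ratio

# Hypothetical surface reflectance for three pixels (not the study data).
band3_green = np.array([0.05, 0.08, 0.0])
band11_swir = np.array([0.20, 0.16, 0.10])
ratio_demo = simple_ratio(band11_swir, band3_green)  # assumed ordering: band 11 / band 3
```

Pixels with zero reflectance in the denominator are returned as NaN so that they can be masked out before any regression step.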