Estimating Community-Level Plant Functional Traits in a Species-Rich Alpine Meadow Using UAV Image Spectroscopy

Abstract: Plant functional traits at the community level (plant community traits hereafter) are commonly used in trait-based ecology for the study of vegetation–environment relationships. Previous studies have shown that a variety of plant functional traits at the species or community level can be successfully retrieved by airborne or spaceborne imaging spectrometers in homogeneous, species-poor ecosystems. However, findings from these studies may not apply to heterogeneous, species-rich ecosystems. Here, we aim to determine whether unmanned aerial vehicle (UAV)-based hyperspectral imaging could adequately estimate plant community traits in a species-rich alpine meadow ecosystem on the Qinghai–Tibet Plateau. To achieve this, we compared the performance of four non-parametric regression models, i.e., partial least square regression (PLSR), the genetic algorithm integrated with PLSR (GA-PLSR), random forest (RF) and extreme gradient boosting (XGBoost), for the retrieval of 10 plant community traits using visible and near-infrared (450–950 nm) UAV hyperspectral imaging. Our results show that chlorophyll a, chlorophyll b, carotenoid content, starch content, specific leaf area and leaf thickness were estimated with good accuracies, with the highest R² values between 0.64 (nRMSE = 0.16) and 0.83 (nRMSE = 0.11). Meanwhile, the estimation accuracies for nitrogen content, phosphorus content, plant height and leaf dry matter content were relatively low, with the highest R² varying from 0.3 (nRMSE = 0.24) to 0.54 (nRMSE = 0.20). Among the four tested algorithms, GA-PLSR produced the highest accuracy, followed by PLSR and XGBoost, and RF showed the poorest performance. Overall, our study demonstrates that UAV-based visible and near-infrared hyperspectral imaging has the potential to accurately estimate multiple plant community traits for the natural grassland ecosystem at a fine scale.


Introduction
Plant functional traits may directly or indirectly affect plant fitness by influencing individuals' growth, reproduction and survival [1]. They reflect the morphological, physiological or phenological responses of species to the environment and serve as proxies for life strategies [2][3][4]. However, plants primarily occur and survive as part of a plant community rather than as separate species or individuals. Therefore, the variation in traits among species cannot represent the characteristics of plant communities or vegetation, nor the ecological processes at the ecosystem level [5,6].
Plant communities are composed of sets of species with various abundances and maintain a dynamic balance through interactions between species [7]. Therefore, plant functional traits at the community level (plant community traits hereafter) include information not only on plant functional traits but also on species composition. Plant community traits offer a trait-based approach to address several key questions related to plant community assembly and productivity regulation [8,9] and will assist in exploring how vegetation responds to climate change [10].
The community-weighted mean (CWM) trait, which aggregates plant functional traits at the community level using the weighted-mean approach, is a commonly used indicator for the study of vegetation-environment relationships [7]. The CWM trait is typically calculated as the mean of plant functional trait values at the species level weighted by the relative abundance of taxa [5]. However, measuring CWM traits through field surveys is time-consuming and labour-intensive, while using trait data from published databases is hindered by differences in sampling and measurement criteria and usually ignores intraspecific variation of traits [11]. Moreover, functional traits derived from discrete sites could hardly reflect the continuous spatial change in vegetation characteristics [12]. Hence, an alternative approach to collecting trait information at the community level covering variations within and between species is of great significance [13].
Remote sensing has the potential to provide a spatially continuous representation of plant functional traits and intraspecific variations [14,15]. Previous studies have shown that a variety of plant functional traits at the species or community level in forest ecosystems can be successfully retrieved by airborne or spaceborne imaging spectrometers [16,17]. As for grassland ecosystems, although a few studies have attempted to quantify trait variations at the community level, they are either limited to relatively homogeneous conditions with few co-existing species [18], dependent on a well-managed experimental platform and observed datasets [19] or focused on only a limited set of plant traits [20]. The conclusions drawn from these studies may not apply to heterogeneous and species-rich grasslands in natural conditions [21].
The unmanned aerial vehicle (UAV) platform can be operated flexibly according to weather and field conditions, so it has become increasingly used in ecological research [22,23]. UAV-based imaging spectroscopy is a relatively new remote sensing technology with significant benefits for high-resolution remote sensing applications, making it possible to study trait variations at the community level at a fine scale [24]. Moreover, UAV-based imaging spectroscopy could offer a link between field investigation and satellite observation, which may support the estimation of plant functional traits at the community level on a broader spatial scale [25]. However, despite the potential of UAV-based imaging spectroscopy, little research has been performed on the estimation of plant community traits directly from UAV hyperspectral imagery, particularly in heterogeneous natural grasslands. Thus, in this paper, we aim to determine whether UAV-based hyperspectral imaging could adequately estimate plant community traits in a species-rich alpine meadow. Specifically, we set out to assess the performance of four non-parametric regression models for retrieving 10 plant community traits from visible and near-infrared (450-950 nm) UAV hyperspectral imagery.

Study Area
Our study is within a river basin located in the northeastern Qinghai-Tibet Plateau in Qinghai Province, China. The basin covers an area of approximately 244.8 km² with an average elevation of 3385 m a.s.l. (Figure 1). This area is characterised by a continental monsoon climate with a mean annual temperature of -1.1 °C and a mean annual precipitation of 485 mm. Around 80% of the precipitation falls in the growing season from mid-April to mid-October. The dominant vegetation communities are alpine meadows and alpine shrubs. These communities are very rich in plant species, ranging from 30 to 50 per square meter [26].
Figure 1. (C) Study area with the location of the UAV flight sites; and (D) sampling design for each site, in which the red boxes denote the three 2 m × 2 m sampling plots, and the green (25 cm × 25 cm), blue (50 cm × 50 cm) and purple (1 m × 1 m) boxes denote the quadrats for biomass harvesting, UAV spectra extraction and species survey, respectively, in each plot. Aboveground biomass in the subplots 1, 2 and 3; 2, 3 and 4; and 1, 3 and 4 for plots 1, 2 and 3, respectively, was harvested after the spectral measurements.

Hyperspectral Data Collection and Pre-Processing
In this study, we selected 20 survey sites along an altitude gradient ranging from 3040 to 3450 m. Each survey site was 100 m × 60 m in size. UAV flight campaigns were conducted between 10 and 26 August 2021 during the growing season of the meadow. At each survey site, we collected hyperspectral data with a Cubert UHD185 Firefly spectrometer (UHD185) mounted on a hexacopter UAV (DJI M600 PRO). The DJI M600 PRO was equipped with the A3 Pro flight controller, including three Inertial Measurement Units (IMU) and three Global Navigation Satellite System (GNSS) units. The UHD185 comprises 125 spectral channels and spans the spectral range from 450 to 950 nm at a 4-nm sampling interval, with a spectral resolution of 8 nm. One panchromatic band and 125 hyperspectral bands were recorded simultaneously by the UHD185 during the flight. Before each flight, we calibrated the UHD185 spectrometer using a white reference panel and a black plastic lens cap. Three 1.2 m × 1.2 m standard reference panels (with approximately 10%, 50% and 80% reflectance, respectively) were set up in the flight area for the follow-up relative normalisation [27]. To minimise atmospheric perturbations and bidirectional reflectance distribution function (BRDF) effects, we conducted all flight campaigns between 11:00 and 15:00 local time on clear sunny days. The flight speed was 4.8 m/s at a flight altitude of 40 m above ground level. The UAV survey was designed to acquire 70% forward overlap and 60% side overlap. The average size of the UAV flight strips was 17 m × 100 m. The spatial resolution was about 0.02 m for the panchromatic image and about 0.3 m for the hyperspectral image. The collected hyperspectral images were first fused with the corresponding panchromatic images using Cube-Pilot software (Cubert GmbH, Ulm, Germany). The entire hyperspectral image of each site was then mosaicked from the fused images using Agisoft PhotoScan (Agisoft, St. Petersburg, Russia). As a result, 14 of the 20 mosaicked UAV hyperspectral images were retained after eliminating blurred images caused by sudden strong turbulence over the plateau.
We extracted the field spectra of the three standard reference panels from the centre pixels of each panel in the hyperspectral image. In addition, the reference reflectance of each panel was measured in a laboratory integrating sphere using an Analytical Spectral Devices full-range spectroradiometer (ASD-FR). Based on the field spectra and reference spectra of these panels, the images of the different study sites were calibrated using an empirical line method [24,27,28]. All images were smoothed by the Savitzky-Golay filter with a factor of 5 to remove high-frequency noise (Figure S1).
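As a rough illustration of the empirical line step, the sketch below fits, for a single band, a least-squares line that maps the image signal over the three reference panels to their laboratory reflectance, and then applies that line to a vegetation pixel. The function names and all numeric values are hypothetical and are not taken from the study; in practice the fit would be repeated for each of the 125 bands.

```python
def fit_empirical_line(panel_signal, panel_reflectance):
    """Least-squares gain and offset so that reflectance ~ gain * signal + offset."""
    n = len(panel_signal)
    mean_x = sum(panel_signal) / n
    mean_y = sum(panel_reflectance) / n
    sxx = sum((x - mean_x) ** 2 for x in panel_signal)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(panel_signal, panel_reflectance))
    gain = sxy / sxx
    offset = mean_y - gain * mean_x
    return gain, offset


def apply_empirical_line(signal, gain, offset):
    """Convert a raw image signal to reflectance with the fitted line."""
    return gain * signal + offset


# Hypothetical single-band values: image signal over the ~10%, ~50% and ~80% panels
panel_signal = [410.0, 2050.0, 3280.0]
# Hypothetical laboratory reflectance of the same panels in this band
panel_reflectance = [0.10, 0.50, 0.80]

gain, offset = fit_empirical_line(panel_signal, panel_reflectance)
pixel_reflectance = apply_empirical_line(1500.0, gain, offset)
```

Using three panels rather than two gives one spare degree of freedom per band, so a poor fit can flag a contaminated panel spectrum.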
We used a 25 × 25 pixel window (around 50 cm × 50 cm in size) at the four corners of each 2 m × 2 m plot to extract spectra from the plot (Figure 1). Image processing was performed with ENVI 5.3 (Exelis Visual Information Solutions, Boulder, CO, USA).

Field Data Collection
The field samples were collected on the same day as the UAV flight campaign, i.e., between 10 and 26 August 2021. At each survey site, we randomly set three 2 m × 2 m plots within the 100 m × 60 m site (Figure 1). The distance between any two plots was at least 15 m. We investigated the 1 m × 1 m area in the centre of each plot for species composition and species-wise coverage. To do so, we divided the 1 m² quadrat into a grid of 100 squares, each representing 1% cover, and then estimated the percentage cover occupied by each species in the quadrat. We marked each plot in its centre for identification in the images. In each plot, we sampled the species that together accounted for more than 80% of the cumulative coverage of the plot.
We collected 20 fully mature leaves of each sampled species at three vertical canopy positions along the plant stem: lower (n = 6), middle (n = 6) and upper (n = 8). We mixed the sampled leaves of each species and divided them into two equal subsamples. One subsample was quickly stored in liquid nitrogen for physiological trait measurement, and the other subsample was wrapped in wet tissue and stored in an icebox for structural trait measurement. For each sampled species, we randomly selected 5-10 mature and healthy individuals for the plant height measurement and calculated the average. In each plot, the four 25 cm × 25 cm subplots at the corners were numbered clockwise, with the southern corner ranked 1 (Figure 1). Aboveground biomass in the subplots 1, 2 and 3; 2, 3 and 4; and 1, 3 and 4 of plots 1, 2 and 3, respectively, was harvested after the spectral measurements. In total, 40 out of 60 investigated plots were considered in this study (Figure 2). The relative coverage of each species in each plot was calculated based on these sampled species according to the following formula:
$$rc_i = \frac{C_i}{\sum_{j=1}^{n} C_j} \qquad (1)$$
where rc_i represents the relative coverage of the ith species in a given plot, C_i is the coverage of the ith species, C_j is that of the jth species in the plot, and n is the total number of all species in the plot.

Foliar Trait Measurements and Plant Community Trait Calculation
In this study, we measured six biochemical traits including chlorophyll a content, chlorophyll b content, carotenoid content, nitrogen content, phosphorus content and starch content as well as four structural traits including plant height, leaf thickness, leaf dry matter content and specific leaf area (Table S1).
We measured chlorophyll a and b and carotenoid contents, as well as all structural traits, with the community-weighted mean approach. We determined chlorophyll a and b and carotenoid contents with a UV/VIS Spectrophotometer (UV-1800PC, Shanghai Mapada Instruments Co., Ltd., Shanghai, China). Except for plant height, which was measured during the field investigation, the structural traits (leaf thickness, leaf dry matter content and specific leaf area) were measured on the day of sampling. Leaf thickness was measured with a micrometer, and fresh weight was measured with an analytical balance. After the leaf thickness measurement, we scanned the leaves for fresh leaf area with a flatbed scanner and then oven-dried them at 65 °C for 72 h to a constant weight to determine the specific leaf area (fresh area/dried weight) and leaf dry matter content (dried weight/fresh weight).
We calculated the CWM values of chlorophyll a and b and carotenoid contents, plant height, leaf thickness, leaf dry matter content and specific leaf area according to the following formula:
$$t = \sum_{i=1}^{d} t_i \times \frac{C_i}{\sum_{j=1}^{n} C_j} \qquad (2)$$
where t represents a community-level functional trait, t_i denotes the functional trait of the ith species, C_i is the coverage of the ith species, d represents the total number of sampled species in a given plot and n is the total number of all species in a given plot.
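The cover-weighted averaging described here can be sketched in a few lines. The example below is only an illustration with hypothetical species names and values. It also assumes that the relative coverage is normalised over the species passed in (here, the sampled species), which is one possible reading of the text rather than a confirmed detail of the study.

```python
def relative_cover(cover):
    """Relative coverage of each species: its cover divided by the summed cover."""
    total = sum(cover.values())
    return {species: c / total for species, c in cover.items()}


def community_weighted_mean(cover, trait):
    """Cover-weighted mean of a species-level trait for one plot."""
    rc = relative_cover(cover)
    return sum(rc[species] * trait[species] for species in cover)


# Hypothetical plot: percentage cover and mean plant height (cm) per sampled species
cover = {"species_a": 40.0, "species_b": 30.0, "species_c": 10.0}
height_cm = {"species_a": 12.0, "species_b": 20.0, "species_c": 8.0}

rc = relative_cover(cover)
cwm_height = community_weighted_mean(cover, height_cm)
```

Because the weights sum to one, the result always lies between the smallest and largest species-level value, which is a quick sanity check on field data.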
In addition, we measured plant community nitrogen, phosphorus and starch contents using mixed samples. We oven-dried the harvested biomass at 65 °C for at least 72 h, then shredded and homogenised it into mixed samples. The nitrogen content was analysed using an elemental analyser (Vario MACRO Cube, Frankfurt, Germany). The phosphorus content was measured by the molybdate-ascorbic acid method after H2SO4-H2O2 digestion [29]. Moreover, the starch content was measured by the anthrone colorimetric method using a UV/VIS Spectrophotometer (UV-1800PC, Shanghai Mapada Instruments Co., Ltd., Shanghai, China). Chlorophyll a, chlorophyll b and nitrogen contents were quantified in an area-based approach [30,31]. Area-based traits (mg/cm²) were calculated according to the following formula:

Mapping Plant Community Traits
Here we tested the capability of four non-parametric models to retrieve various community-level traits (Figure 3). The two linear models are partial least square regression (PLSR) and the genetic algorithm integrated with PLSR (GA-PLSR). PLSR is a widely used algorithm for retrieving vegetation parameters from hyperspectral data [32,33]. As a model designed to handle multicollinearity, PLSR derives a smaller number of latent variables from the original data [34]. In this way, PLSR can down-weight the less informative variables and concentrate most of the explanatory information in a few latent variables. However, the "large p-small n" problem (a large number of variables but few samples) can still spoil the PLSR result [35]. In this situation, variable selection as a pre-processing step is known to improve PLSR performance [36]. Here, we adopted GA-PLSR, which adds a band selection procedure to PLSR [37]. Numerous studies have shown GA-PLSR to be useful in improving PLSR model performance [38,39]. The genetic algorithm mimics biological evolution and natural selection to select informative features: important features survive multiple iterations of model fitting and feature selection. We selected random forest (RF) and extreme gradient boosting (XGBoost) to evaluate nonlinear model performance in trait estimation. The RF model is one of the popular techniques for foliar trait prediction [40] and has been applied to map vegetation parameters at various scales [41,42]. XGBoost is an emerging machine learning algorithm that has shown satisfactory model performance in recent research [43].
In total, 40 plots were used as input for the four tested models. We used the leave-one-out cross-validation (LOOCV) approach for model training and validation. Based on the pls package in R [44], we determined the number of latent factors used in PLSR for each community-level trait dataset by the predicted residual sum of squares (PRESS) statistic [45]. The feature selection of GA-PLSR was performed with the plsVarSel package in R [39]. After that, the standard PLSR routine was performed to determine the latent factors. As for the nonlinear models, we conducted a hyper-parameter optimisation process for each trait dataset (Table S2). RF models were built with the randomForest package in R [46]. The number of trees (ntree, 100-1000 at intervals of 100) and the number of variables randomly sampled as candidates at each split (mtry, 1-125) were tuned, and each combination was replicated 10 times to obtain the optimal parameters with the highest correlation coefficient. The XGBoost models were built with the xgboost package in R [47], and the learning rate (eta, 0.1-1), maximum depth of a tree (max_depth, 0.1-1) and iteration rounds (nrounds, 1-100) were tuned to find the parameter combination with the highest correlation coefficient.
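To make the LOOCV mechanics concrete, the sketch below holds out one plot at a time, refits a model on the remaining plots and accumulates the PRESS statistic from the held-out predictions. It is a simplified stand-in and not the study's workflow: a one-predictor ordinary least-squares line replaces PLSR, the data are hypothetical, R² is computed as 1 - PRESS/SS_total, and nRMSE is normalised by the observed range. The last two conventions are assumptions, since the text does not state which definitions were used.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loocv(x, y):
    """Leave-one-out predictions with PRESS, R2 and range-normalised RMSE."""
    preds = []
    for k in range(len(x)):
        # Train on every sample except the k-th, then predict the k-th
        slope, intercept = fit_line(x[:k] + x[k + 1:], y[:k] + y[k + 1:])
        preds.append(slope * x[k] + intercept)
    press = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - press / ss_total
    rmse = math.sqrt(press / len(y))
    nrmse = rmse / (max(y) - min(y))  # assumption: normalised by observed range
    return preds, press, r2, nrmse


# Hypothetical data: one spectral predictor and one community trait for six plots
x = [0.10, 0.15, 0.22, 0.30, 0.34, 0.41]
y = [1.1, 1.4, 2.0, 2.4, 2.9, 3.3]
preds, press, r2, nrmse = loocv(x, y)
```

With PLSR, the same loop would be run once per candidate number of latent factors, and the number giving the lowest PRESS would be kept.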
We selected the model with the best performance for plant community trait mapping to visually represent plant community traits in space. We defined the best model as the one that achieved the highest R² across the tested traits. To test whether the selected model could be adequately applied in a species-rich meadow, we further analysed the relationship between the predicted residuals of all plant community traits and the number of dominant species in each plot. To display the spatial patterns of traits, we chose one image covering an area composed of a fenced meadow and a highly disturbed meadow. The location of this image is indicated in Figure 1. Thanks to the fine spatial resolution, we were able to exclude non-vegetation pixels by supervised classification. The image was resampled to 50-cm spatial resolution by the nearest-neighbour algorithm (Figure S2). The relative uncertainty (standard deviation/mean) was calculated based on the 40 models generated from LOOCV. Data analysis was performed with R 4.1.0 [48].
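The relative uncertainty of a mapped pixel can be illustrated as follows: each LOOCV model yields one prediction for the pixel, and the spread of those predictions is divided by their mean. The values below are hypothetical (four predictions instead of 40), and the use of the sample standard deviation is an assumption, as the text does not say whether the sample or population form was used.

```python
import statistics


def relative_uncertainty(predictions):
    """Standard deviation of the model predictions divided by their mean."""
    mean = statistics.mean(predictions)
    # Assumption: sample standard deviation (n - 1 in the denominator)
    return statistics.stdev(predictions) / mean


# Hypothetical predictions of one trait for one pixel from four LOOCV models
pixel_predictions = [2.0, 2.1, 1.9, 2.0]
ru = relative_uncertainty(pixel_predictions)
```

Values near zero indicate that leaving out any single plot barely changes the prediction for that pixel.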

Results
Among the biochemical traits, chlorophyll a and b, carotenoid and starch contents showed good predictive accuracy, with the highest R² value among the four models ranging from 0.64 to 0.83. Phosphorus and nitrogen showed R² values lower than 0.60 (Figure 4). As for the structural traits, specific leaf area (highest R² = 0.70) and leaf thickness (highest R² = 0.68) were both estimated well (Figure 4), while the estimates of plant height (R² = 0.44) and leaf dry matter content (R² = 0.30) were relatively poor.
Among the four estimation models, GA-PLSR proved the most accurate, as it produced the highest R² value and the lowest nRMSE for most of the 10 plant community traits (Table 1). The other models performed differently for different traits. PLSR showed good performance for most traits, such as the biochemical traits related to photosynthesis (chlorophyll a and b and carotenoid contents) and specific leaf area. XGBoost achieved model performance comparable with GA-PLSR for carotenoid content and plant height. However, it showed relatively low R² values for nitrogen and starch contents. RF presented the worst predictive accuracy for most traits. Moreover, GA-PLSR had nRMSE values all below 0.4, while the other three models produced higher nRMSE values for most structural traits. Among the 10 traits, model performance for leaf thickness, nitrogen content and starch content differed markedly among models, with R² values ranging from 0.04 to 0.68. As for the linear models, the performance of PLSR increased considerably with band selection (GA-PLSR). In addition, GA-PLSR overcame the weakness of PLSR in the retrieval of some traits, such as nitrogen content, and yielded more satisfactory results than the two nonlinear models. For the two nonlinear models, XGBoost showed an obvious advantage over RF for all traits. However, all four tested models yielded R² values below 0.4 for leaf dry matter content, making it the most poorly predicted of the 10 traits.
In addition, we analysed the relationships between the predicted residuals of plant community traits and the number of dominant species in each plot. There was no statistically significant relationship (p > 0.05) between the predicted residuals of traits and the number of species (Figure 5). Based on these results, we mapped the spatial patterns of all tested traits using GA-PLSR, as it was stable and outperformed the other models (Figure 6). Our results showed that biochemical traits displayed more homogeneous distributions, while structural traits, especially plant height, had obvious spatial patterns. We calculated uncertainties from the predicted maps (Figure S3). For all traits, most uncertainty values were near zero.

Discussion
In this study, we predicted 10 significant plant community traits from near-ground UAV-based hyperspectral imagery in a highly heterogeneous grassland with high species richness, with 9 traits producing moderate to good accuracies. Among all tested traits, chlorophyll a and b contents, carotenoid content, specific leaf area, leaf thickness and starch content were predicted with an accuracy comparable to that of previous studies, with R² values greater than 0.60 [19].
The retrieval of plant height, phosphorus content and nitrogen content produced moderate predictive accuracy, with R² values varying from 0.44 to 0.54. This may be because the spectral regions that correlate with non-pigment foliar compounds and structural properties lie mostly at longer wavelengths, while the weak correlation that can be detected may also be affected by the strong absorption of water in fresh leaves [49,50]. Nevertheless, the R² values of nitrogen and phosphorus content are comparable to those reported by other research in grasslands [18,19,51]. Similarly, all our tested models of leaf dry matter content produced poor predictive accuracies in this study, probably because leaf dry matter content reflects leaf water content. The distinct absorption features of foliar liquid water are at 1450 and 1950 nm, which are outside the spectral range available in this study [50]. This result suggests that visible and near-infrared spectra have a relatively limited capacity for the retrieval of certain traits, which should be considered in future studies.
In this study, we tested four non-parametric models, including two linear models and two nonlinear models, in the direct retrieval of multiple traits. The results suggested that the predictive accuracy of some traits might be influenced to some extent by the choice of model. For example, an obvious increase in R² was observed for starch content and leaf thickness when the linear model was applied. In contrast, a few traits, such as plant height and carotenoid content, yielded relatively consistent predictive accuracies across the four models.
As for the linear models, PLSR showed average performance comparable to that of the other models but failed for some traits, such as nitrogen and leaf dry matter content. The better performance of GA-PLSR was consistent with previous studies which confirmed the effects of feature selection in PLSR [52]. On the one hand, these improvements in model performance may be because of the "large p-small n" problem. Because PLSR draws on all given features to a greater extent than nonlinear models do, redundant variables may obscure the truly informative bands [39]. As a result, PLSR with relatively limited training samples cannot handle hundreds of correlated bands well [53]. On the other hand, environmental and instrumental noise was inevitably mixed with the spectral data, which may weaken the predictive accuracy.
We found that the performance of GA-PLSR was not sensitive to species richness, indicating its robustness in extracting information on functional traits from UAV hyperspectral imaging at a fine resolution. Therefore, the GA-PLSR model could adequately estimate various plant community traits in species-rich alpine meadows.
Between the two nonlinear models, XGBoost showed considerable advantages in predictive accuracy over RF. This result was consistent with recently published studies [43,54]. As XGBoost is effective in high-dimensional data analysis, it is becoming a reliable method in vegetation parameter modelling using UAV-based hyperspectral data [52]. It may therefore be a useful option for similar research that requires the selection of a nonlinear model.
Overall, GA-PLSR, the PLSR model using the GA feature selection approach, outperformed the other candidates for all tested traits in this research. The trade-off between complexity reduction and information preservation is a great challenge in hyperspectral data analysis. GA-PLSR can deal with estimations of various traits from canopy spectra with relatively satisfactory predictive accuracies. Maps generated from GA-PLSR depicted various patterns at the local level that could not be achieved only by site investigation. Meanwhile, the PLSR model combined with feature selection could serve as an option for researchers facing the "large p-small n" problem.

Conclusions
In this study, we investigated whether UAV-based hyperspectral imaging could be used to estimate 10 different plant functional traits at the community level in a species-rich alpine meadow ecosystem on the Qinghai-Tibet Plateau. In addition, we compared the performance of four non-parametric regression models, i.e., PLSR, GA-PLSR, RF and XGBoost. Based on the results, we conclude that UAV-based hyperspectral imagery can be used to adequately estimate plant community traits in a species-rich alpine meadow with moderate to high accuracy. Specifically, we show that chlorophyll a, chlorophyll b, carotenoid content, starch content, specific leaf area and leaf thickness were estimated with good accuracies, with the highest R² values between 0.64 (nRMSE = 0.16) and 0.83 (nRMSE = 0.11). Meanwhile, the estimation accuracies for nitrogen content, phosphorus content, plant height and leaf dry matter content were relatively low, with the highest R² varying from 0.3 (nRMSE = 0.24) to 0.54 (nRMSE = 0.20). Among the four tested algorithms, GA-PLSR produced the highest accuracy, followed by PLSR and XGBoost, and RF showed the poorest performance. Our study demonstrates the potential of UAV-based visible and near-infrared hyperspectral imagery to directly estimate various plant community traits in a natural grassland ecosystem at a fine scale.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs14143399/s1, Figure S1: Examples of corrected spectra of alpine meadow from five different flight sites. Figure S2: The UAV hyperspectral image used for mapping plant community traits. The upper one is the raw image and the lower one is the corrected image shown in true colour composites. This image covers plot 32, plot 33 and plot 34 in Figure 2. The white arrows indicate the location of the three reference panels. Figure S3: Frequency distribution of relative uncertainty of the 10 plant community trait maps produced by the GA_PLSR model. Table S1: Summary statistics of the 10 plant functional traits for sampled plots (n = 40). Table S2: An overview of optimal hyper-parameters of two machine learning models for 10 plant community traits.