1. Introduction
Grasslands provide forage for ruminant livestock to produce meat, milk, wool, and hide [
1] and for biogas production. Permanent grasslands are the predominant type of grassland in topographically and climatically disadvantaged regions, such as mountainous areas [
2]. In contrast with the intensively utilized grasslands that occur in agriculturally more favorable areas, and which usually consist of only a few plant species, the permanent grasslands in mountainous and alpine regions provide species-rich vegetation and are utilized under moderate management regimes [
3]. Grassland vegetation typically comprises grasses, herbs, and legumes. These species groups and plant species represent different functional traits [
4] and feed values; knowledge of their relative proportions, therefore, offers advantages for site-specific management and livestock feeding. Grass–legume mixtures generally outperform pure grass stands in yield, resilience, and nitrogen efficiency because of leguminous nitrogen fixation in symbiosis with the soil bacteria rhizobia, weed suppression, and forage quality [
5,
6,
7,
8,
9]. Rasmussen et al. [
10] observed more than 300 kg/ha per year of N₂-fixation by Trifolium pratense L. and Medicago sativa L. with Lolium perenne L. as companion grass species. This allows a substantial reduction in the mineral nitrogen fertilizer input and thus a reduction in CO₂ emissions from nitrogen fertilizer production [
11]. Another advantage of grass–clover mixtures is their increased feeding efficiency due to their high nutritional value. In particular, high crude protein content helps to meet the increasing demand for proteins [
7]. Because of spatio-temporal variability in species groups and species composition, comprehensive mapping of the grassland sward can contribute substantially to improvement of site-specific management, especially with respect to fertilization and feed efficiency.
As noted by Peratoner and Pötsch [
12], estimating species groups and species composition is laborious, time-consuming, and requires advanced botanical and agronomic knowledge. Remote sensing is a promising alternative for assessing botanical composition, yield, and forage quality. It is non-destructive and can be used for the reproducible sensing of large areas in a very efficient manner [
13]. Sensors for grassland monitoring can be based on the technical principles of photography, spectrometry, spectral imaging, synthetic aperture radar, light detection and ranging, and ultrasound [
13].
Besides the high number of different technical principles and their possible combinations, the number of applications in grassland itself is also high. The large number of recent publications concerning remote sensing in grassland emphasizes the importance of this area. Common applications include modeling of grassland successional stages [
14], forage quality parameters [
15], biomass [
16,
17,
18,
19], legume N-fixation [
17], chlorophyll content [
20], species richness [
21,
22], leaf area index [
23,
24], and (species) classification [
25,
26,
27,
28,
29,
30]. Next to the vast number of applications, there is a large variability of grassland types due to local and regional factors [
18,
26,
27,
30]. In particular, grassland species and their composition vary by management type and site conditions. Comparability between studies is therefore limited, and this study focuses exclusively on managed permanent grassland in Austria, which is representative of many areas in the European alpine arc. Concerning grassland species classification, traditional computer vision approaches using morphological operators, such as those implemented by Bonesmo et al. [
31] for mapping white clover in pastures, led to systems that are highly sensitive to adjustments and, thus, generalize poorly. More recently, Bateman et al. [
32], Skovsen et al. [
33], and Sun et al. [
34] developed species distribution mapping systems for grass–clover mixtures using RGB images and convolutional neural networks. However, their systems were trained on a forage ley, which has limited comparability to permanent grasslands. Compared to such broadband sensors, hyperspectral sensors with narrow and near-continuous spectra facilitate better granularity [
13]. Furthermore, multispectral and hyperspectral sensors often extend the spectral range from the visible spectrum (VIS) to the near-infrared region (NIR). Taken together, these methods might enable the detection of even subtle differences in plants.
In principle, the use of spatial information can be beneficial for practical grassland classification applications. In an ideal scenario, however, without any spectral mixing of multiple plants, spectral information alone might be sufficient to train machine learning models to analyze the species composition. An exclusive use of spectral information would therefore allow simpler sensor systems and reduced machine learning effort to obtain satisfactory models. Combining spectral and spatial information could further increase classification quality for field applications.
Conti et al. [
21] successfully used a six-channel multispectral camera and a spatial resolution of approximately 3
to assess the link between species diversity and spectral characteristics for permanent grassland biodiversity. The work and results of Suzuki et al. [
29] are promising for spectral-based classification of grassland. They used hyperspectral data to analyze the botanical composition of Japanese grassland with respect to the classes perennial ryegrass, white clover, and other plants. They reached an overall accuracy of 80.3% based on linear discriminant analysis models. However, their work differentiated only three classes, and many new and optimized machine learning frameworks are now available, such as CatBoost [
35] for gradient boosting and PyTorch [
36] for neural networks.
Analysis of species groups and species composition of managed permanent grasslands based on spectral data in the visible and near-infrared range has not been covered in detail so far.
Spectral signatures may vary, depending not only on the species groups but also on the plant parts captured (flower, leaf, and stem). The latter can even be absent at certain times (e.g., flowers) or show different characteristics over the course of the vegetation period due to species-specific differences in phenology and development, but also due to certain environmental conditions such as drought (e.g., leaf structure). Further, including information on the plant part composition might reveal insights into sward structures. The authors are not aware of any other investigations or publications regarding species group and plant part classification of managed, species-rich permanent grasslands.
Vegetation indices can be used with comparatively low effort [
37] because, unlike machine learning, they require no training, and they allow plant functional traits to be estimated [
38]. However, many indices use only a few spectral channels [
37] and hence might exclude substantial information. Furthermore, indices might be affected by saturation problems [
39]. Machine learning models based on hyperspectral data might overcome these limitations by using all available information at the same time.
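The limitation described above can be illustrated with a minimal sketch of one widely used index, the normalized difference vegetation index (NDVI). It is shown only as an example of an index that draws on two channels; the band centres (670 nm and 800 nm), the helper names, and the sample values are illustrative assumptions and are not taken from this study.

```python
def nearest_band(wavelengths, target_nm):
    """Return the index of the band whose centre wavelength is closest to target_nm."""
    return min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target_nm))


def ndvi(spectrum, wavelengths, red_nm=670.0, nir_nm=800.0):
    """Compute NDVI = (NIR - red) / (NIR + red) from a single reflectance spectrum.

    red_nm and nir_nm are assumed band centres chosen for illustration.
    """
    red = spectrum[nearest_band(wavelengths, red_nm)]
    nir = spectrum[nearest_band(wavelengths, nir_nm)]
    if nir + red == 0:
        raise ValueError("red and NIR reflectance sum to zero")
    return (nir - red) / (nir + red)


# Hypothetical five-band reflectance spectrum of a green leaf.
wavelengths = [450.0, 550.0, 670.0, 800.0, 900.0]
reflectance = [0.04, 0.10, 0.05, 0.45, 0.47]
index_value = ndvi(reflectance, wavelengths)
```

Only two of the five channels enter the result, so any information carried by the remaining bands is discarded, which is the point made in the paragraph above.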
Partial least squares discriminant analysis (PLS-DA) and random forest (RF) are popular classification algorithms, and neither system suffers from the multicollinearity usually present in high-dimensional spectral data [
17]. A powerful alternative might be multilayer perceptron (MLP). This feedforward neural network type can characterize and learn features for prediction purposes [
40]. For omics data [
41] and weed and grass discrimination [
42], MLP classification outperformed various other machine learning algorithms. Independent of the algorithm utilized, the quantity and quality of data are of utmost importance for a reliable analysis. This depends on thorough data acquisition with an accurate calibration process for spectral data, but biases and uncertainties remain [
43]. As for data acquisition, image quality depends mainly on consistent illumination and sufficient spatial resolution. While adjusting these parameters in on-field applications can be challenging [
44], a laboratory setting provides a controlled environment assuring constant data quality with the added possibility of radiometric calibration.
Dark and bright reflection standards are commonly used for calibration purposes [
45] to consider the sensor-specific dark current and the light source’s heterogeneous spectrum. Because parameters such as plant height and light conditions influence spectral signals and may render calibration techniques unsuccessful, laboratory conditions are advantageous. Further, data preprocessing might be a substantial step in enhancing the model performance. The use of derivatives with spectral data is a common technique [
30,
46]. It removes background signals and visualizes spectral curve shape differences that might not be evident in the spectra [
47]. Smoothing operations such as Savitzky–Golay filtering are frequently applied [
16,
45] as well as data standardization or normalization.
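A calibration with dark and bright reflection standards is commonly expressed as a per-band flat-field correction. The following is a minimal sketch under the assumption that one dark and one bright reference spectrum are available for each measurement; the function name and sample values are hypothetical, and the exact procedure used in this study may differ.

```python
def calibrate_reflectance(raw, dark, white):
    """Convert raw sensor counts to relative reflectance, band by band.

    reflectance = (raw - dark) / (white - dark)

    Subtracting the dark reference accounts for the sensor's dark current;
    dividing by the bright reference accounts for the lamp's uneven spectrum.
    """
    if not (len(raw) == len(dark) == len(white)):
        raise ValueError("raw, dark, and white must have the same number of bands")
    reflectance = []
    for r, d, w in zip(raw, dark, white):
        span = w - d
        if span <= 0:
            raise ValueError("bright reference must exceed dark reference in every band")
        reflectance.append((r - d) / span)
    return reflectance


# Hypothetical counts for three bands.
raw_counts = [110.0, 260.0, 410.0]
dark_counts = [10.0, 10.0, 10.0]
white_counts = [210.0, 510.0, 810.0]
calibrated = calibrate_reflectance(raw_counts, dark_counts, white_counts)
```

In this example the raw counts differ strongly between bands only because the light source does; after calibration all three bands show the same relative reflectance.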
A first step towards spectral-based classification of permanent grassland vegetation is to examine spectral properties under laboratory conditions, thereby enabling reproducible results to be obtained by minimizing the effects of influencing factors such as changing illumination or spatial distance variations. Only a systematic analysis under such conditions can reveal the influence of the vast number of data processing variants in combination with machine learning on the classification accuracy of grassland plants. Accordingly, to lay a foundation for the development of spectral-based applications for managed permanent grassland, the objectives of this study are as follows:
Determine the spectral-based classification potential of grassland plants with respect to species group and plant part, and
perform a systematic analysis regarding the influence of model type, calibration variant, and data preprocessing on classification accuracy.
4. Discussion
For both species group and plant part, MLP showed the highest classification accuracies. For the best models, these were 96.8% for species group and 88.6% for plant part. Similarly, a pot experiment that used hyperspectral imaging data to discriminate three weed species and one grass species with PLS-DA, support vector machine (SVM), and MLP found the MLP model to be superior, with 89.1% accuracy [
42]. However, the classification of individual species is not directly comparable to the classification of species groups. In other areas, MLP also demonstrated powerful classification abilities. Research conducted by de Castro et al. [
64] classifying cruciferous weeds, wheat, and broad beans under field conditions yielded an MLP model with nearly 100% accuracy. However, these results are hardly comparable to those presented here, mainly because of different methods, research objects, and exogenous conditions.
Next to the best-performing MLP models, the commonly used model types PLS-DA and RF were investigated. All three model types have in common that hyperparameters influence the achievable accuracy. However, for PLS-DA and RF, it is unlikely that unsuitable hyperparameters are the reason for their lower performance compared with MLP. For PLS-DA, the number of components included (ncomp) is the standard parameter that is tuned. It is usually tuned over the entire possible range up to the number of spectral channels. However, as the spectral dataset used is highly collinear, high ncomp values cannot be realized. Because of this collinearity and the typical decline in accuracy increment per increased ncomp step, the maximum ncomp value was set to 64 (approximately of available predictors). On average, PLS-DA model accuracies increased steeply up to approximately 30 and 60 components for plant part and species group, respectively. The results based on the top five models for species group and plant part show that only one model for species group used all available 64 components. Furthermore, for the last five ncomp steps, the absolute average accuracy changes were only % for species groups and % for plant part. Therefore, no significant gain with higher ncomp values is expected.
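The plateau argument above rests on the average absolute change in accuracy over the last few ncomp steps. A minimal sketch of that calculation follows; the function name and the accuracy values are hypothetical and do not reproduce the results of this study.

```python
def mean_abs_change(accuracies, last_steps=5):
    """Average absolute accuracy change over the final `last_steps` increments.

    `accuracies` is ordered by increasing number of components (ncomp).
    """
    if last_steps < 1:
        raise ValueError("last_steps must be at least 1")
    if len(accuracies) < last_steps + 1:
        raise ValueError("need at least last_steps + 1 accuracy values")
    tail = accuracies[-(last_steps + 1):]
    changes = [abs(later - earlier) for earlier, later in zip(tail, tail[1:])]
    return sum(changes) / last_steps


# Hypothetical accuracy curve that flattens as ncomp grows.
accuracy_by_ncomp = [0.50, 0.70, 0.80, 0.85, 0.87, 0.88, 0.88, 0.89, 0.89]
plateau_change = mean_abs_change(accuracy_by_ncomp, last_steps=5)
```

A small value indicates that additional components add little, which supports capping ncomp below the number of spectral channels.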
Several hyperparameters could be tuned using the ranger RF package. According to Probst et al. [
65], the two most essential hyperparameters are the sample fraction and mtry, which were tuned in our analysis. However, the tunability of ranger is generally low [
65]. During the hyperparameter search for a representative variant, the accuracy was already stable with 400 trees; no further accuracy gain with additional tuning was expected.
For MLP an assessment of further tunability is virtually impossible as, for neural networks, even random variables such as the weights for network initialization may have a noticeable effect on model quality [
66]. In addition, other search spaces or even different model architectures could significantly affect model accuracy. The accuracies achieved by our models indicate that the chosen parameters are acceptable. In comparison, Adagbasa et al. [
25] used a similar MLP design to classify grass species based on Sentinel-2 MSI data. For their purposes, they proposed 3–5 hidden layers with a learning rate of 0.001, a weight decay of 1 × 10
and the usage of the ADAM optimizer, which is similar to our configuration. Interestingly, their trained MLP outperformed other machine learning algorithms, including RF.
Independent of the algorithm used, data quality is of the utmost importance to achieve high-performance machine learning models. When dealing with spectral data, the covered spectral range is a crucial parameter. The observed spectral range in our case was 440.43 nm to 957.9 nm. Asner [
67] determined that plant tissue properties are wavelength-dependent. In the case of green leaves, the smallest variation occurs in the VIS region, whereas the variation is larger in the NIR region. Therefore, for foliar chemistry, the NIR region provided the best link. The studies of Yu et al. [
30] underline the importance of this to discriminate different grassland species, where NIR and SWIR performed better than VIS alone. Furthermore, Basinger et al. [
68] and Pfitzner et al. [
28] indicate that utilizing the spectral range VIS-SWIR could benefit species discrimination. Therefore, considering this range could potentially lead to improved model performance.
Spatial resolution is another important parameter in addition to spectral resolution. As individual grassland samples show spectral variations, a low spatial resolution leads to averaging effects [
69]. The high spatial resolution of approximately 60 pixel/
in the deployed measurement system circumvents this issue. It allows the capture of spectral signatures for distinct sample regions. Consequently, with the usage of a high spatial sampling resolution, manual averaging or algorithms with high generalization ability are mandatory to achieve high model accuracies.
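The manual averaging mentioned above can be sketched as a per-band mean over all pixel spectra belonging to one sample region. The function name and the sample values are hypothetical; how regions were selected in this study is not specified here.

```python
def mean_spectrum(pixel_spectra):
    """Average a list of pixel spectra band by band into one region spectrum."""
    if not pixel_spectra:
        raise ValueError("at least one pixel spectrum is required")
    n_bands = len(pixel_spectra[0])
    if any(len(spectrum) != n_bands for spectrum in pixel_spectra):
        raise ValueError("all pixel spectra must have the same number of bands")
    n_pixels = len(pixel_spectra)
    # zip(*...) groups the values of each band across all pixels.
    return [sum(band_values) / n_pixels for band_values in zip(*pixel_spectra)]


# Hypothetical region of three pixels with four bands each.
region_pixels = [
    [0.10, 0.20, 0.40, 0.50],
    [0.12, 0.18, 0.44, 0.46],
    [0.08, 0.22, 0.36, 0.54],
]
region_spectrum = mean_spectrum(region_pixels)
```

Averaging reduces pixel-to-pixel variation at the cost of spatial detail, so the region size is a trade-off between the two.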
Biological characteristics that result in similar reflection properties can give rise to misclassifications. Here, leaf sheaths covering the stems might have caused the misclassifications of stems. In terms of accuracy, herb classification underperformed compared with grass and legume classification. This might be because the number of herb samples was small compared with the other classes and because of the potentially greater variability within the herb species group.
Parameters such as spectral and spatial resolutions are commonly determined by the equipment used. However, different calibration and data processing methods are applicable independently of this constraint. According to Dao et al. [
45], proper calibration of airborne hyperspectral imaging data is mandatory to detect slight differences in spectral curve changes, especially in vegetation such as grassland. Here, it seems that for certain model types, specific calibration variants result in better-performing models. However, all calibration variants were present in similar fractions in the groups of best models that did not differ significantly from one another, except for RF in the species group classification. Further, our results show, on average, no statistically significant difference between the calibration variants UC, DC, and RC for each model type for species group and plant part. This does not imply that the UC variant is sufficient, because light conditions, although changing between measurement days due to variations in lamp positioning, were kept relatively constant compared with realistic field conditions. The 3D canopy structure can significantly impact reflection behavior [
45,
67] with plant height differences and light scattering effects making a physically correct radiometric calibration unfeasible. Nevertheless, our results suggest that calibration does not influence the model performance under nearly constant light and three-dimensional conditions.
MLP and PLS-DA performed well with a wide range of preprocessing variants, but this was not the case with RF. The main reason for this is that RF usually uses only a few predictors at the tree level to form a decision boundary [
70], which makes it more sensitive to data variations than MLP and PLS-DA. Thus, MLP and PLS-DA show a high generalization ability with respect to preprocessing variants compared with RF. Preprocessing variants that apply a Savitzky–Golay filter before a derivative work particularly well for data with small spectral band spacing. In this case, differences between successive spectral channels may be slight compared to random noise [
47]. Other variants can also benefit from Savitzky–Golay filtering employed as a noise reduction technique. Preprocessing variants that performed well, independent of the model and calibration type, included the combination S-D without a second D. These variants were present in the groups of best models that did not differ significantly from one another. However, for RF this holds only for RC data variants and for species group classification; RC is the only calibration variant present in the best group. This underlines the usefulness of spectral gradients in combination with smoothing for machine learning applications. Independent of model type and calibration, Z-standardization showed no significant differences on average. Preprocessing steps that do not lead to increased accuracy, such as Z-standardization, should be avoided for the sake of simplicity.
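The S-D combination discussed above (smoothing followed by a single derivative) can be sketched as follows. The 5-point, second-order Savitzky–Golay weights and the simple finite-difference derivative are illustrative assumptions; the window length, polynomial order, and derivative method used in this study are not given in this section, and all function names are hypothetical.

```python
# 5-point, second-order Savitzky-Golay smoothing weights; they sum to SG_NORM.
SG_WEIGHTS = (-3, 12, 17, 12, -3)
SG_NORM = 35


def savgol_smooth(spectrum):
    """Smooth a spectrum; the two bands at each edge are dropped."""
    half = len(SG_WEIGHTS) // 2
    if len(spectrum) < len(SG_WEIGHTS):
        raise ValueError("spectrum is shorter than the filter window")
    smoothed = []
    for centre in range(half, len(spectrum) - half):
        window = spectrum[centre - half:centre + half + 1]
        smoothed.append(sum(w * v for w, v in zip(SG_WEIGHTS, window)) / SG_NORM)
    return smoothed


def first_derivative(spectrum, band_spacing=1.0):
    """Finite-difference first derivative between neighbouring bands."""
    return [(b - a) / band_spacing for a, b in zip(spectrum, spectrum[1:])]


def smooth_then_derive(spectrum, band_spacing=1.0):
    """S-D variant: Savitzky-Golay smoothing followed by one derivative."""
    return first_derivative(savgol_smooth(spectrum), band_spacing)


# A quadratic test signal is reproduced exactly by a second-order filter.
quadratic = [float(x * x) for x in range(7)]
smoothed_quadratic = savgol_smooth(quadratic)
gradient = smooth_then_derive(quadratic)
```

Smoothing first keeps the derivative from amplifying band-to-band noise, which matters most when neighbouring bands are closely spaced.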
5. Conclusions
The vegetation composition of grasslands with respect to species group and plant part could be determined under laboratory conditions with high accuracy. The exclusive use of spectral information seems to be sufficient for grassland classification according to these criteria. In particular, MLP outperformed PLS-DA and RF and, thus, can be recommended for further research and applications. Interestingly, under laboratory conditions, calibration on average did not influence the classification accuracy for any of the tested model types. Although raw spectral data variants led to acceptable classification accuracies, further data preprocessing before model training can improve the classification performance. In particular, variants including Savitzky–Golay smoothing and a subsequent derivative seem beneficial with respect to classification quality. In contrast, data processing that includes two derivatives seems to be detrimental. The presented results provide an essential basis for the comprehensive mapping of grassland species group distribution to aid site-specific management and feed efficiency. However, these findings do not directly apply to field-like conditions, as in this case the system complexity increases. While seasonal and local canopy structure effects were implicitly included in the presented results, effects of the 3D structure were not considered. Here, further research, e.g., with potted plants, could be a next step. Still, this work can contribute to existing sensor systems, and it demonstrates the potential for future precision farming applications in general. For grassland management in particular, an automated, non-destructive discrimination of species groups and plant parts would be beneficial to adapt management strategies and increase process sustainability.
However, other aspects and challenges such as location influence, species differentiation, illumination, 3D sward structure, spectrally mixed pixels, and identification of wavelengths relevant for classification need to be addressed in further research.