1. Introduction
One-third of the earth’s land area is forest [1], which plays a vital role in sequestering carbon and releasing oxygen, regulating climate, preserving soil and water, conserving water sources, and purifying the air [2,3,4]. Therefore, there is considerable interest in obtaining information on the state of forests. Trees are the dominant growth form in forests. Accurate and timely mapping of tree species is a critical foundation for natural resource management and conservation, including precision silviculture, biodiversity monitoring, and the evaluation of terrestrial carbon pool dynamics [3,4,5].
So far, remote sensing has been widely used for tree species classification [6,7,8,9]. Satellite images cover a large area but have low spatial and temporal resolution and suffer from severe cloud contamination, making it challenging to obtain high-quality satellite images [10]. In contrast, UAV-based remote sensing has many advantages, including (1) effectively reducing cloud contamination; (2) efficiently and cost-effectively collecting data with high spatial, spectral, and temporal resolution; and (3) a relatively flexible operating area. It has therefore been applied in tree species classification and many other fields [11,12,13,14,15]. These advantages increase the feasibility and practicality of UAVs for classifying forest tree species.
The potential of RGB images for tree species classification has also been convincingly demonstrated [16,17,18]. Easy-to-use consumer drones can obtain the RGB image data required for classification at a reasonable price. This expands the applicability of the approach in a variety of contexts. While existing studies have demonstrated the feasibility of tree species classification using RGB imagery, current approaches predominantly rely on deep learning architectures [19,20,21,22,23,24]. However, the substantial dependency of these algorithms on large volumes of high-quality training data [25] poses significant constraints for practical implementation, particularly in sample-limited areas. A promising alternative lies in synergizing UAV-collected RGB imagery with classification approaches that balance moderate sample requirements with sustained accuracy, potentially enabling effective tree species mapping in data-scarce regions.
Random forest (RF) and support vector machine (SVM) models are effective in processing high-dimensional and complex datasets, so they are widely used in tree species classification [26,27,28,29]. They can still achieve good results even with a small number of samples [30,31] and unbalanced samples [32,33,34,35]. Furthermore, both classifiers have been found to provide a high level of accuracy with no significant differences in tree species classification [13,35,36,37,38,39]. These classifiers could therefore serve as a computationally efficient alternative to deep learning architectures for UAV RGB-based tree species classification under limited training sample conditions.
The use of vegetation indices (VIs) improves classification performance compared with using spectral signatures individually [25,40]. To date, research on VIs for tree species classification has mainly focused on multispectral and hyperspectral images. However, few studies have assessed the impact of VIs derived from UAV-based RGB images on the classification of forest tree species, and relatively little is known about the contribution of RGB-based spectral information to tree species classification. Investigating the spectral attributes inherent in UAV-acquired RGB imagery presents a scientifically grounded pathway to enhance the classification efficacy of RF and SVM models in tree species discrimination tasks.
The Loess Plateau is a typical area where artificial forests are concentrated. In recent decades, several large-scale forestry projects have been implemented to restore the ecological environment, such as the “Grain for Green” program [41]. These programs have been highly successful in increasing forest cover. There is an urgent need to monitor forest quality and vegetation changes in the region, and tree species classification is the basis for these efforts. Although numerous local and international studies have addressed forest tree species classification and mapping, systematic research on tree species classification in the Loess Plateau region remains limited. For example, Zhang et al. [42] applied multi-source remote sensing data to classify forest types in the Loess Plateau, but their approach lacked species-level resolution.
Despite substantial progress in RGB image-based tree species classification, current methodologies predominantly depend on deep learning architectures. Because these approaches require large labeled training datasets, they face operational constraints in data-scarce contexts. This study proposes a pragmatic alternative by integrating UAV-RGB imagery with sample-efficient machine learning classifiers (SVM and RF), systematically assessing its operational feasibility and ecological generalizability in the Loess Plateau region. To this end, we selected three study sites to include as many representative forest species of the Loess Plateau as possible within a limited area. The three research sites encompass several tree species, including Robinia pseudoacacia, Platycladus orientalis, Pinus tabuliformis, Quercus wutaishanica, Larix gmelinii, Populus davidiana, Hippophae rhamnoides, and several other less common broadleaf species. Our main research objectives are as follows: (1) to evaluate the feasibility of combining RGB images with two machine learning algorithms for accurately mapping forest tree species on the Loess Plateau; (2) to investigate the impact of input features on the classification results of tree species on the Loess Plateau; and (3) to explore the differences in classification outcomes across various sites on the Loess Plateau.
3. Results
3.1. Feature Analysis
A total of 516 valid samples across nine categories were used to analyze sample separability, which was assessed using analysis of variance (ANOVA). The results showed that the values of all 22 features depend significantly on sample type at the 95% confidence level (p < 0.01), and the between-group variance is significantly greater than the within-group variance (Figure 3, Table S2). This indicates good separability among the different sample types.
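As an illustration of the between-group versus within-group comparison described above, the following minimal sketch computes a one-way ANOVA F statistic for a single feature. It assumes a one-way design; the function name `one_way_anova_f` and the sample values are hypothetical and are not taken from this study. Converting F to a p-value additionally requires the F distribution, which is omitted here.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of sample groups.

    Each group is a list of values of ONE feature (e.g., blue-band DN)
    for ONE class. A large F means the between-group variance dominates
    the within-group variance, i.e., the classes are well separated.
    """
    k = len(groups)                              # number of classes
    n_total = sum(len(g) for g in groups)        # total number of samples
    means = [sum(g) / len(g) for g in groups]    # per-class means
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: spread of class means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of samples around their own class mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical feature values for three classes (illustration only).
example_groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
f_stat = one_way_anova_f(example_groups)
print(f"F = {f_stat:.2f}")
```

In practice the same test would be repeated for each of the 22 features, with one group per category.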
Figure 4 shows the results of the feature importance analysis using the Boruta algorithm. In this analysis, 22 features were considered, including 3 original bands, 3 normalized DNs, and 16 vegetation indices based on RGB images. The importance values of all 22 features far exceed those of the shadow features; thus, all features are deemed significant. Among these features, the blue band has the highest importance, followed by INT and the green band, both of which are markedly more important than the rest. The remaining 19 variables have similar importance values, all of which exceed the maximum value of the shadow features.
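To make the feature types concrete, the sketch below derives a few RGB-based features for a single pixel. The definitions used here are common ones from the literature and are assumptions with respect to this study: normalized DNs as each band divided by the band sum, INT as the mean of the three bands, and the excess green index (ExG = 2g − r − b) as one representative VI. The function name `rgb_features` and the dictionary keys are hypothetical, and the study's full set of 16 VIs is not reproduced.

```python
def rgb_features(red, green, blue):
    """Derive illustrative RGB-based features for one pixel.

    Assumed definitions (not confirmed by the source text):
      r, g, b = band / (R + G + B)   normalized DNs
      INT     = (R + G + B) / 3      mean intensity
      ExG     = 2g - r - b           excess green index
    """
    total = red + green + blue
    if total == 0:
        # A fully black pixel has no defined chromatic coordinates.
        return {"r_norm": 0.0, "g_norm": 0.0, "b_norm": 0.0, "INT": 0.0, "ExG": 0.0}

    r_norm = red / total
    g_norm = green / total
    b_norm = blue / total
    return {
        "r_norm": r_norm,
        "g_norm": g_norm,
        "b_norm": b_norm,
        "INT": total / 3.0,
        "ExG": 2.0 * g_norm - r_norm - b_norm,
    }


# Hypothetical DN values for a vegetated pixel (illustration only).
features = rgb_features(60, 120, 20)
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

Because the normalized DNs sum to one, indices built from them are less sensitive to overall brightness than the raw bands.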
3.2. Tree Species Classification
3.2.1. Tree Species Classification Based on RF
Table 5 shows the overall accuracy of tree species classification using the RF classifier. At all three sites, RGBVIs and RGBFS consistently improved the overall classification performance. RGBVIs performed best, with OA (Kappa) values of 84.96% (0.80), 86.67% (0.84), and 87.69% (0.84), followed by RGBFS, with slightly lower OA and Kappa. However, the RGBR’G’B’ feature did not significantly improve the classification performance and even caused a slight decrease (OA: 1.5%, Kappa: 0.02) in ZN compared to the RGB feature.
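For readers less familiar with these metrics, the following sketch shows how OA, Kappa, and the per-class UA, PA, and F1 score can be computed from a confusion matrix using the standard definitions. It assumes that rows hold reference classes and columns hold predicted classes, and that F1 is the harmonic mean of UA and PA; the function name `accuracy_metrics` and the example matrix are hypothetical and do not reproduce any table in this study.

```python
def accuracy_metrics(cm):
    """Compute OA, Kappa, and per-class UA, PA, F1 from a confusion matrix.

    cm[i][j] = number of samples of reference class i predicted as class j
    (row = reference, column = prediction; this layout is an assumption).
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_tot = [sum(cm[i]) for i in range(k)]                        # reference totals
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]   # predicted totals
    diag = [cm[i][i] for i in range(k)]                             # correct samples

    oa = sum(diag) / n                                              # overall accuracy
    # Agreement expected by chance, from the marginal totals.
    p_e = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    kappa = (oa - p_e) / (1.0 - p_e)

    # Producer's accuracy (recall) and user's accuracy (precision) per class.
    pa = [diag[i] / row_tot[i] if row_tot[i] else 0.0 for i in range(k)]
    ua = [diag[i] / col_tot[i] if col_tot[i] else 0.0 for i in range(k)]
    f1 = [2 * u * p / (u + p) if (u + p) else 0.0 for u, p in zip(ua, pa)]
    return {"OA": oa, "Kappa": kappa, "UA": ua, "PA": pa, "F1": f1}


# Hypothetical two-class confusion matrix (illustration only).
example_cm = [[40, 10],
              [5, 45]]
metrics = accuracy_metrics(example_cm)
print(metrics)
```

A large gap between UA and PA for a class signals that misclassification and omission errors are unbalanced for that class.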
Figure 5 presents a comparison of class accuracies for the three regions based on four input features using the RF classifier. It is evident that the RGBVI and RGBFS features generally improved class accuracy, particularly in YS and ZN.
In YS, the identification of R. pseudoacacia and the “other” category was significantly better than that of P. tabuliformis and P. orientalis. When compared to the sole use of RGB imagery, the F1 scores for R. pseudoacacia, P. orientalis, P. tabuliformis, and the “other” category increased by 13.37% (10.20%), 16.00% (20.00%), 8.28% (6.81%), and 14.97% (13.05%), respectively, when RGBVI (RGBFS) features were used as inputs.
In ZN, there was a significant disparity in UA and PA when using RGB imagery and RGBR’G’B’ features, leading to considerable misclassification and omission issues, particularly with H. rhamnoides, which reached an accuracy drop of 25.68%. The use of RGBVIs and RGBFS features notably alleviated these problems. Compared to the sole use of RGB imagery, when RGBVI (RGBFS) features were input, the F1 scores for L. gmelinii, P. tabuliformis, R. pseudoacacia, P. davidiana, H. rhamnoides, and the “other” category improved by 7.88% (8.24%), 11.74% (10.09%), 12.22% (3.98%), 9.21% (7.92%), 10.73% (8.01%), and 4.76% (−3.56%), respectively.
In YC, all categories generally exhibited high accuracy, with F1 scores exceeding 80%, and the issues of misclassification and omission were notably less severe than in YS and ZN. However, the RGBVIs and RGBFS features did not significantly improve class accuracy. When using RGBVI (RGBFS) features, the accuracy for Q. wutaishanica did not improve and even decreased by 3.16%. Conversely, the F1 scores for R. pseudoacacia, broadleaf species, and the “other” category increased by 4.13% (4.56%), 5.38% (4.10%), and 6.65% (8.17%), respectively.
3.2.2. Tree Species Classification Based on SVM
Based on the overall accuracy of the three regions, when using the SVM classifier, the other three feature sets improved the overall accuracy (3.5–15.05%) and Kappa (0.05–0.2) to varying degrees compared with using RGB images alone (Table 6). Among them, RGBVIs produced the most obvious improvement, especially in YS, where OA and Kappa increased by 15.05% and 0.19, respectively. Meanwhile, RGBR’G’B’ and RGBFS did not show consistent superiority.
Figure 6 illustrates the class accuracy for the three regions based on the SVM classifier using four input features. In the various sites, the RGBVIs and RGBFS features led to varying degrees of improvement in class accuracy.
In YS, R. pseudoacacia and the “other” category were identified significantly better than P. tabuliformis and P. orientalis. Compared to the sole use of RGB imagery, when RGBVI (RGBFS) features were employed, the F1 score for R. pseudoacacia, P. orientalis, P. tabuliformis, and the “other” category increased by 17.05% (14.11%), 17.78% (15.15%), 11.11% (11.11%), and 11.18% (11.18%), respectively.
In ZN, there were significant issues with misclassification and omission when using RGB and RGBR’G’B’ features. Taking H. rhamnoides for example, the disparity between UA and PA reached as high as 20.57%. The use of RGBVIs and RGBFS features significantly alleviated this problem. When RGBVI (RGBFS) features were input, the F1 score for H. rhamnoides improved dramatically, increasing by 19.05% and 13.33% compared to the sole use of RGB imagery. In contrast, the F1 scores for L. gmelinii, P. tabuliformis, R. pseudoacacia, and P. davidiana showed only slight improvements, while the F1 score for the “other” category experienced a minor decline (0.93% and 3.12%). However, it remained above 80%.
Overall, the accuracy for all categories in YC is higher than that in YS and ZN, particularly after adding extra features, with the F1 score exceeding 85%. Additionally, the issues of misclassification and omission are significantly less severe compared to YS and ZN. In contrast to YS and ZN, all three additional features contributed to varying degrees of improvement in class accuracy: R. pseudoacacia improved by 7.73% to 9.52%, Q. wutaishanica by 1.66% to 4.56%, broadleaf species by 2.38% to 4.90%, and the “other” category by 2.28% to 5.6%. Among these, RGBVIs had the most significant impact, while RGBFS and RGBR’G’B’ did not demonstrate a consistent superiority.
3.2.3. Spatial Difference Analysis of Tree Species Classification
We evaluated the classification effectiveness of the three sites based on the average and optimal levels of classification accuracy for each feature set. As shown in Figure 7, regardless of the input features, the average OA, Kappa, and F1 score followed a consistent trend: YC > ZN > YS. The maximum OA (OA_max) and maximum Kappa (Kappa_max) also followed the same trend, except that YS and ZN are nearly equal under the RGBFS features. When considering F1_max, YC significantly outperformed both ZN and YS. When RGB imagery is used, ZN is higher than YS; however, when RGBVI features are input, YS surpasses ZN.
This study compared the performance of SVM and RF classifiers using three additional features against the sole use of RGB imagery, assessing overall accuracy, Kappa, and the average F1 score for each tree species. The results are illustrated in Figure 8. In all three study sites, the RGBVI features exhibited the best enhancement effects. Regardless of the classifier used, the improvement in accuracy for the YS area was the most significant, with overall accuracy increasing by 2.65% to 15.04%, Kappa values rising by 0.03 to 0.20, and F1 scores increasing by 2.01% to 14.28%. In the YC area, all three features demonstrated relatively stable and comparable enhancement effects, with overall accuracy improving by 2.31% to 6.15%, Kappa increasing by 0.03 to 0.08, and F1 scores rising by 2.57% to 6.15%. For ZN, when using the RF classifier, the RGBR’G’B’ feature exhibited some inhibitory effects; however, the RGBVI features significantly enhanced classification performance.
To compare the optimal classification effects across study sites, the best features selected for each region were used to evaluate classification performance. As shown in Table 7, when employing the RF classifier, the classification effectiveness ranked as follows: YC > ZN > YS, with OA exceeding 84% in all regions, Kappa values equal to or greater than 0.8, and F1 scores also exceeding 84%. When the SVM classifier was utilized, a similar trend was observed (YC > ZN > YS); however, the performance of YC was notably superior to the other two regions, achieving an OA of 90%, a Kappa of 0.87, and an F1 score of 90.08%.
3.3. Tree Species Mapping
Through the research conducted in Section 3.2, we identified the optimal classification combinations for each study area, ultimately producing maps and statistics on the area distribution and proportions of each category, as shown in Figure 9.
In YS, although the tree species distribution is simple, there is a severe imbalance in category distribution. R. pseudoacacia dominates, accounting for 70.7% of the total area, followed by non-forested land at 22.2%. The proportions of P. tabuliformis and P. orientalis are minimal, comprising only 6.6% and 0.5%, respectively.
In contrast, ZN exhibits a relatively uniform and complex distribution of categories. The most prevalent tree species is R. pseudoacacia, covering 1.88 ha (33.2%), followed by P. tabuliformis at 1.31 ha (23.2%), L. gmelinii at 0.62 ha (11%), P. davidiana at 0.29 ha (5.1%), and H. rhamnoides, which only occupies 0.22 ha (3.9%).
The category distribution in YC is relatively even, primarily consisting of broadleaf species. The largest area is occupied by Q. wutaishanica, covering 5.40 ha (30.6%), followed by broadleaf trees at 28.3%, while R. pseudoacacia accounts for only 19.5% (3.45 ha).
4. Discussion
4.1. Visible Band and Derived VIs
4.1.1. The Importance of Visible Band in Classification
The visible band, especially the blue band, plays a significant role in mapping tree species. Our results indicated that the visible band can be used to discriminate the dominant tree species on the Loess Plateau. Numerous studies based on multispectral satellite imagery [71,72,73] and aerial imagery [74,75,76,77] have also highlighted the importance of the visible band. In the feature importance test results of this study, the blue band was significantly more important than other features, indicating its crucial role in tree species classification. This may be due to the blue band’s sensitivity to chlorophyll and its resistance to canopy shading [78]. Many previous studies have also highlighted the significance of the blue band [14,17,78,79]. Therefore, the blue band is likely to become increasingly important in future research [17,80].
4.1.2. The Importance of VIs in Classification
This study quantitatively evaluated 16 VIs derived from standard RGB spectral bands for their efficacy in tree species discrimination. Our findings revealed the outstanding contribution of VIs derived from RGB images in accurate tree species classification. This may be because the RGB image-derived VI correlates with various aspects of plant growth and development, including canopy closure, density, leaf area index, biomass, and yield [81,82,83,84]. In addition, empirical investigations confirm the extensive integration of UAV-acquired RGB imagery-derived VIs into forest monitoring and management [85,86]. This further confirms the critical potential of UAV-based RGB images and their derived VIs in vegetation monitoring, especially tree species identification.
4.2. Feature Selection
In classification tasks, feature selection aims to identify the most pertinent features for accurate classification, which helps improve computational speed and classification accuracy [87]. In this study, we selected the top seven ranked features as the RGBFS features and compared the results with RGBVIs, as shown in Figure 10. Although RGBFS achieved good results in all cases, there was still a slight gap compared to RGBVIs. The largest difference was observed using the RF classifier in site ZN, where OA, Kappa, and F1_avg were only 3.67%, 0.04, and 3.66% lower than RGBVIs, respectively. This finding differs slightly from earlier studies [72,88], which reported that selected features achieved the highest accuracy. However, some studies [89,90,91] also concluded that feature selection does not necessarily yield better results than using all features combined. We think that this discrepancy may be due to the difference in the number of features. Aneece et al. pointed out that classification accuracy increases with the number of features and tends to stabilize at around 15–20 features [92]. In this study, the maximum number of features was 22, which may explain why RGBVIs showed higher accuracy than RGBFS.
4.3. Classification in Regions with Imbalanced Class Distributions
4.3.1. The Effect of Imbalanced Class Distributions in Classification
Class imbalances negatively affected the results for specific classes. In our study, YS performed the worst among the three sites, and R. pseudoacacia was classified more stably and accurately than several other categories. This may be attributed to the severe class imbalance in the tree species distribution within the study area, with more than 70% of the pure forests being R. pseudoacacia. This may bias the predicted probability distribution of the classification model toward R. pseudoacacia. Many previous studies have observed similar patterns [71,93,94,95,96].
In cases of imbalanced class distribution, training samples can significantly impact classification results. In this study, P. orientalis and P. tabuliformis exhibited poor classification performance. This may be because they typically appear in mixed forests within the research area and are rarely found in pure stands, especially P. tabuliformis. As a result, it is challenging to find reliable reference samples. Therefore, some of the observed misclassifications may be due to errors in the reference dataset. Another potential reason could be spectral overlap due to the broad spectral range of some tree species [34].
4.3.2. Mitigating the Impact of Class Imbalance on Classification
To mitigate the effect of class imbalance on classification results, it is usually necessary to draw approximately the same number of sampling units for each class [97,98]. This can be achieved, for example, by downsampling the large class (R. pseudoacacia) to match the sample size of the small classes (P. tabuliformis, P. orientalis) [99]. However, research on this issue remains relatively scarce, and we plan to study it systematically and in depth in future work.
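The downsampling idea mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study: the function name `downsample_to_smallest`, the fixed random seed, and the sample counts are all hypothetical, and every class is reduced to the size of the smallest class by random selection without replacement.

```python
import random
from collections import defaultdict


def downsample_to_smallest(samples, labels, seed=0):
    """Randomly downsample every class to the size of the smallest class.

    Returns a shuffled list of (sample, label) pairs with equal class counts.
    """
    rng = random.Random(seed)          # fixed seed for reproducibility
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    n_min = min(len(items) for items in by_class.values())

    balanced = []
    for label in sorted(by_class):
        # Draw n_min samples without replacement from each class.
        for sample in rng.sample(by_class[label], n_min):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return balanced


# Hypothetical, strongly imbalanced reference set (counts are illustrative only).
labels = (["R. pseudoacacia"] * 70
          + ["P. tabuliformis"] * 7
          + ["P. orientalis"] * 5)
samples = list(range(len(labels)))     # stand-ins for per-sample feature vectors
balanced = downsample_to_smallest(samples, labels)
print(len(balanced), "samples after balancing")
```

The trade-off is that downsampling discards reference samples from the dominant class, which matters when the total sample size is already small.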
4.4. Comparison of RF and SVM Classifiers
Our analysis revealed no consistent superiority between RF and SVM classifiers across the three study sites (Figure 11). At the YS site (Figure 11a), neither algorithm demonstrated statistically dominant performance across evaluated metrics. For ZN (Figure 11b), SVM achieved systematically higher accuracy than RF when using RGB and RGBR’G’B’ features, whereas this pattern was reversed with RGBVIs and RGBFS. In contrast, SVM consistently outperformed RF in YC (Figure 11c) regardless of input feature sets, suggesting region-specific algorithm sensitivity to ecological heterogeneity.
Our findings align with previous comparative studies of SVM and RF in forest and land cover classification [13,34,100]. Cetin and Yastikli [27] reported RF’s superior performance over SVM in urban tree species mapping, whereas other studies observed the opposite trend under specific conditions [101,102]. A meta-analysis by Sheykhmousa et al. [36] of 32 remote sensing classification studies employing both algorithms revealed that RF outperformed SVM in 19 cases. This discrepancy may stem from RF’s inherent advantage in processing low-spatial-resolution imagery, where its ensemble-based feature selection effectively mitigates spectral noise. However, no consistent algorithmic superiority was observed with medium-to-high spatial resolution data, likely due to SVM’s enhanced capability to resolve fine-scale spectral-textural patterns.
These results suggest that neither classifier is universally superior for UAV RGB-based tree species classification. The optimal algorithm choice should be guided by spatio-spectral characteristics of the target imagery and ecological complexity of the study area, rather than relying on generic performance assumptions.
4.5. Limitations and Future Research Directions
Impact of imaging acquisition conditions: The classification performance varied significantly across the three study areas in this research. These discrepancies may be partially attributable to variations in data acquisition conditions—such as illumination intensity, solar zenith angle, atmospheric conditions, and local elevation—across different regions. However, due to the lack of quantitative environmental parameters recorded during data acquisition in this study, a rigorous analysis of these factors was not feasible. Future investigations could validate this hypothesis through systematic controlled experiments designed to isolate and quantify the impacts of specific environmental variables on classification accuracy.
Integration of multisensor data for enhanced tree species classification: While UAV-based RGB imagery demonstrates preliminary feasibility in tree species classification, its accuracy and robustness require further improvement. Hyperspectral and LiDAR systems have shown considerable potential in this domain, particularly as technological advancements have rendered them increasingly accessible. Future research should prioritize the development of robust classification frameworks that synergistically integrate multi-source data (e.g., spectral, structural, and spatial features) to achieve higher classification consistency across heterogeneous forest environments.
Incorporation of textural features for enhanced classification accuracy: In a study on tree species classification utilizing UAV-based multispectral imagery, Abdollahnejad et al. [103] demonstrated that the integration of textural variables improved the OA by 4.24%. This finding highlights the significant potential of textural features in UAV-driven tree species discrimination. Future research should prioritize leveraging the abundant textural information inherent in high-spatial-resolution RGB imagery acquired by UAV systems, which may further optimize classification performance.
Addressing class imbalance in forest resource classification: Future research should address the pervasive class distribution skew in forest resource inventories through two strategies: (1) integrating adaptive resampling techniques during sample selection to balance species representation and (2) developing cross-region transfer learning frameworks to leverage external ecological knowledge. These approaches collectively aim to enhance classification reliability by improving dataset representativeness and model generalizability for underrepresented species.