Individual Tree Species Classification in a Mining Area of the Yellow River Basin Using UAV-Based LiDAR, Hyperspectral, and RGB Data

Wang, Guo; Nie, Sheng; Xi, Xiaohuan; Wang, Cheng; Wang, Hongtao

doi:10.3390/rs18091361


by Guo Wang ^1,2, Sheng Nie ^2,3,*, Xiaohuan Xi ^2,3, Cheng Wang ^2,3 and Hongtao Wang ^2

^1 School of Civil Engineering, Henan University of Engineering, Zhengzhou 451191, China

^2 School of Surveying and Land Information Engineering, Henan Polytechnic University, Jiaozuo 454000, China

^3 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China

^* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(9), 1361; https://doi.org/10.3390/rs18091361

Submission received: 27 February 2026 / Revised: 11 April 2026 / Accepted: 20 April 2026 / Published: 28 April 2026

(This article belongs to the Special Issue Remote Sensing and Smart Forestry (Third Edition))


Highlights

What are the main findings?

In the ecologically fragile and structurally complex mining landscapes of the Yellow River Basin, the fusion of UAV-based LiDAR, hyperspectral, and RGB data enabled the extraction of 278 complementary features for each individual tree. These features comprehensively captured biochemical, structural, and textural characteristics, facilitating the discrimination of spectrally similar species within heterogeneous tree–shrub–grass mosaics.
Through rigorous statistical evaluation using 5 × 5 repeated cross-validation combined with Friedman and Nemenyi tests, XGBoost was identified as the optimal classifier for this challenging mining environment. It achieved superior performance (Overall Accuracy = 0.897, Kappa = 0.811), demonstrating high stability and computational efficiency compared with linear, instance-based, and single-tree models.
Feature importance analysis further indicated that blue-edge spectral bands sensitive to stress-induced pigment variation, red-edge vegetation indices, and LiDAR-derived canopy height were the dominant contributors to accurate species discrimination. These results confirm that the integration of biochemical information with three-dimensional structural attributes is essential for reliable classification in mining-disturbed ecosystems.

What are the implications of the main findings?

The XGBoost-based framework establishes a reproducible pipeline for individual tree species mapping, supported by statistical significance testing (Friedman and Nemenyi tests). However, due to severe class imbalance (e.g., only 27 samples for Ligustrum quihoui) and other limitations (single site, minimal LiDAR exploitation), the framework currently achieves reliable discrimination only for dominant and moderately represented classes; high-precision mapping of rare species remains an unresolved challenge.
In addition, the resulting wall-to-wall species distribution map supports evidence-based and spatially targeted restoration strategies. It enables precision revegetation planning, identification of areas where restoration progress has stagnated, and long-term adaptive management through repeatable and cost-effective UAV surveys. Consequently, the framework directly contributes to the sustainable ecological rehabilitation of mining-impacted regions.

Abstract

The Yellow River Basin contains abundant coal resources; however, its ecological environment is inherently fragile, and vegetation degradation has been further intensified by extensive mining activities. Accurate classification of individual tree species in mining-affected areas is therefore essential for assessing ecological conditions and establishing a scientific foundation for targeted restoration and sustainable management. To address this need, a machine learning framework was developed and evaluated for individual tree species classification in a coal mining area of the Yellow River Basin using integrated unmanned aerial vehicle (UAV) data. A comprehensive feature set was constructed by extracting 278 attributes per tree. These attributes included 224 spectral bands and 29 hyperspectral indices derived from hyperspectral imagery, 24 textural metrics obtained from RGB orthophotos, and one canopy height feature generated from a LiDAR-derived model. Based on ground-truth data from 1095 individual trees, seven machine learning algorithms were trained and systematically compared: Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Decision Tree (DT), Gradient Boosting (GB), Logistic Regression (LR), and XGBoost. Statistical significance testing based on 5 × 5 repeated cross-validation with the Friedman test and post hoc Nemenyi test, together with an additional model stability analysis, consistently identified XGBoost as the optimal classifier. On an independent test set, XGBoost achieved high accuracy (Overall Accuracy = 0.897, Kappa = 0.811) with an efficient training time of 2.36 s. Further analysis demonstrated the critical and complementary roles of hyperspectral and structural features in species discrimination. The optimized model was subsequently applied to generate a detailed wall-to-wall tree species map across the entire mining area.
Overall, this study presents a statistically informed comparison of classifiers for multi-source feature-based species discrimination and delivers an evaluated and practical pipeline for effective vegetation monitoring. The proposed framework provides a scientific tool for assessing and managing ecological recovery in complex mining environments, particularly within ecologically sensitive regions such as the Yellow River Basin.

Keywords:

tree species classification; mining area; UAV remote sensing; multi-source data fusion; machine learning

1. Introduction

Serving as both an ecological corridor and economic hub, the Yellow River Basin contributes significantly to China’s sustainable development goals [1]. It holds rich mineral resources and is widely recognized as China’s “Energy Basin” [2]. However, large-scale mining operations have caused surface subsidence, ground fissures, and declines in groundwater levels. These disturbances have led to widespread degradation of trees, shrubs, and grasses [3,4]. In the Henan section of the basin, these intensive extraction activities have further engendered severe environmental consequences, including the loss of vegetation ecological functions and exacerbated soil erosion [5,6], which hinder ecological protection and high-quality development in the Yellow River Basin [7].

Accurate tree species classification is a fundamental task for vegetation monitoring, biodiversity assessment, and evaluation of ecological restoration outcomes [8,9]. It helps assess ecosystem stability and restoration success, especially in human-disturbed environments like mining regions [10,11]. However, in post-mining landscapes, vegetation is often sparse, fragmented, and heavily influenced by variable ground conditions and human activities. These factors make species-level identification using remote sensing data exceptionally difficult [12,13].

Ground surveys provide accurate tree species information at the plot level and remain a standard method for vegetation inventory. However, this approach is difficult to apply in mining areas due to rough terrain, high safety risks, and the near impossibility of achieving complete spatial coverage [14]. Remote sensing offers a broader view and has been widely used for ecological monitoring in mining areas over the past 50 years [15]. Researchers have applied indices such as NDVI, FVC, and the Ecological Environment Quality Index to assess restoration status at regional scales [16,17,18]. Most existing studies use medium-resolution satellite images. These images have mixed-pixel problems, and vegetation indices like NDVI tend to saturate in complex mining environments [19].

Recent advances in unmanned aerial vehicle (UAV) platforms have addressed many of these limitations by enabling rapid acquisition of high-spatial-resolution imagery and three-dimensional data over rugged and inaccessible terrain [20,21]. LiDAR sensors capture canopy height and three-dimensional structure, which helps separate individual trees from the underlying shrub and grass layers [22,23,24,25]. Hyperspectral sensors record continuous reflectance spectra and support biochemical discrimination among species [26,27,28]. High-resolution RGB imagery further provides fine-scale texture and morphological details [29,30]. The combination of these sensors produces complementary information layers. This multi-source data framework significantly improves the accuracy and reliability of tree species classification in complex, multi-layered vegetation environments [31,32,33]. Recent advancements in UAV-based multi-source data fusion have significantly improved the accuracy of tree species classification in various forest ecosystems, demonstrating great potential for fine-scale vegetation mapping [34,35,36].

However, accurate tree species classification in mining areas remains challenging due to heterogeneous vegetation structure, spectral similarity among species, and limited field accessibility. Recent studies have applied UAV-based remote sensing to address these issues, yet most efforts remain constrained to single-sensor or dual-sensor configurations. Luo et al. [37] used UAV RGB imagery alone and developed an improved Faster R-CNN model for individual tree detection in a coal mine afforestation area. Although detection accuracy was high, the method lacked spectral information necessary for species-level discrimination. Deng et al. [38] employed UAV hyperspectral imagery and proposed a 3DCNN model with attention mechanisms for tree species classification in a mining restoration site. While spectral resolution was sufficient, structural data such as canopy height and three-dimensional form were not captured.

To overcome the limitations of single-sensor approaches, several studies have explored dual-source fusion. Zhong et al. [39] integrated UAV LiDAR and RGB data using an improved YOLOv8 model and achieved higher tree species identification accuracy in complex mixed forests than either data source alone. He et al. [19] fused UAV RGB and LiDAR data and proposed a multi-scale hierarchical classification method for fine-scale vegetation mapping in an open-pit phosphate mining area. However, both studies lacked hyperspectral data, which provides continuous spectral signatures essential for biochemical discrimination among closely related species.

In non-mining environments, multi-sensor fusion has shown greater potential. By integrating UAV-based LiDAR, hyperspectral, and ultrahigh-resolution RGB data, Qin et al. [34] successfully classified tree species in subtropical broadleaf forests with high accuracy. Yet their method relied on a single random forest classifier and was developed in a non-mining environment; its transferability to heterogeneous, ecologically disturbed mining restoration sites remains uncertain. Meanwhile, Gominski et al. [40] proposed an automated method to mine species labels from public inventory data, but their work focused on scalable labeling rather than multi-sensor fusion in disturbed mining landscapes.

Despite these advances, the combined use of all three sensor types has not been systematically investigated in mining areas, which differ fundamentally from natural forests [41,42]. As a result, methods developed for natural forests cannot be directly transferred to mining areas, where several unique challenges converge. (1) Canopy structure is often heterogeneous and irregular due to tree growth on constrained or reconstructed soils. (2) The species pool is deliberately limited yet ecologically strategic, resulting in complex mosaics of trees, shrubs, and grasses. (3) Mixed-age stands emerge from phased planting over time. (4) Spectral and structural confusion frequently occurs both among different life forms (such as young trees versus tall shrubs) and among species that exhibit similar stress responses to residual mining impacts. This widespread pattern of tree–shrub–grass intergrowth, as observed in the mining areas along the Henan section of the Yellow River Basin, adds substantial complexity to individual tree detection and feature extraction [43,44]. These conditions call for analytical approaches specifically designed for heterogeneous, human-disturbed mining landscapes [45].

To address this research gap, this study centers on the Yushan coal mining area in the Henan section of the Yellow River Basin, a representative ecological restoration site, and develops a UAV-based multi-source remote sensing framework tailored for individual tree species classification in complex, multi-layered mining environments. The specific objectives are to: (1) extract 278 features per individual tree from synchronously acquired LiDAR, hyperspectral, and RGB data; (2) systematically evaluate the classification performance of seven machine learning algorithms including Random Forest, Support Vector Machine, K-Nearest Neighbors, Decision Tree, Gradient Boosting, Logistic Regression, and XGBoost, using 1095 ground-truth tree samples; (3) identify the optimal classifier under the dual constraints of statistical significance and model stability via 5 × 5 repeated cross-validation combined with the Friedman test and Nemenyi post hoc analysis; and (4) analyze the key discriminative features, elucidate the complementary roles of hyperspectral and structural attributes, and apply the optimized model to generate a high-resolution species distribution map at the individual tree level across the entire mining area to establish a practical, empirically evaluated technical pipeline for effective tree monitoring.

2. Materials and Methods

2.1. Study Area

The study area is situated in the Yushan Coal Mine, located in the hilly region of western Henan Province within the middle reaches of the Yellow River Basin (approximately 34.69°N, 112.10°E), as illustrated in Figure 1. The region experiences a warm-temperate continental monsoon climate, with a mean annual temperature of around 14.2 °C and limited precipitation that falls primarily during the summer months. The Yushan mine is an underground operation with an annual production capacity of 700,000 tons. Influenced by both the climatic conditions and prolonged mining activities, the site’s native vegetation primarily consists of drought-tolerant warm-temperate deciduous broad-leaved shrubs and grasses. In some localized sections, this is supplemented by artificially restored secondary forests or other green vegetation, contributing to an overall fragile ecological baseline. Notably, in 2023, the mine was included in Henan Province’s Green Mine Catalogue, reflecting its recognized achievements in ecological management practices and sustainable resource utilization, and making it a pertinent and representative site for investigating vegetation restoration within a regulated mining context in the Yellow River Basin.

2.2. Data Acquisition and Preprocessing

2.2.1. Field Data Collection

Field surveys were conducted in August 2023 across the entire study region. Geographic coordinates of randomly selected scattered trees were recorded using a Real-time Kinematic Global Navigation Satellite System (RTK-GNSS), and plant species identification was performed by a plant ecologist. Field data collection activities are illustrated in Figure 2. After data cleaning and the removal of incomplete records, a total of 1095 samples were retained for analysis. These samples represented four tree species commonly used in the Yellow River Basin and one shrub category: 27 Ligustrum quihoui, 108 Populus tomentosa, 116 Quercus variabilis, 688 Sophora japonica, and 156 shrubs.

2.2.2. UAV-Based LiDAR Data and RGB Imagery

LiDAR (Light Detection and Ranging) data were acquired in July 2023 using the LiDAR500 laser scanning module mounted on a D2000S UAV platform from Shenzhen Feima Robotics Technology Co., Ltd. (Shenzhen, China). The system operated at a wavelength of 905 nm with a triple-echo mode, providing a default pulse repetition rate of up to 192 points per second. The scanner features 32 laser lines with a ±20.3° vertical field of view. The UAV was flown at an altitude of approximately 100 m above ground level, with a side overlap of 70% and a forward overlap of 80%, achieving an average point density greater than 150 points/m². Concurrently, high-resolution RGB (Red-Green-Blue) imagery was captured by the integrated 24-megapixel camera with a 20 mm focal length. Raw point clouds were processed using the Cloth Simulation Filter (CSF) [46] module in CloudCompare software (version 2.14) to generate digital terrain models (DTM) and digital surface models (DSM) at a 0.1 m resolution. Canopy height models (CHM) with the same resolution were derived by subtracting DTM from DSM.
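The CHM derivation described above amounts to a cell-by-cell subtraction of terrain from surface elevation. The following is a minimal sketch, assuming the DSM and DTM are already aligned grids of equal shape. The function name, the toy elevations, and the clamping of negative differences to zero are illustrative assumptions, not steps reported in this study.

```python
def canopy_height_model(dsm, dtm, nodata=None):
    """Return CHM = DSM - DTM for two aligned grids (lists of rows).

    Negative differences (interpolation noise) are clamped to 0.0;
    this clamping is an assumption made for the sketch.
    """
    chm = []
    for dsm_row, dtm_row in zip(dsm, dtm):
        row = []
        for surface, terrain in zip(dsm_row, dtm_row):
            if nodata is not None and (surface == nodata or terrain == nodata):
                row.append(nodata)  # propagate missing cells unchanged
            else:
                row.append(max(0.0, surface - terrain))
        chm.append(row)
    return chm


# Toy 2 x 2 elevations in metres (hypothetical values).
dsm = [[312.4, 318.9], [311.8, 311.9]]
dtm = [[311.9, 312.0], [311.8, 312.0]]
chm = canopy_height_model(dsm, dtm)
```

In practice the same operation would be applied to the 0.1 m rasters with a raster library; the pure-Python form is used here only to keep the sketch self-contained.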

2.2.3. Hyperspectral Data

The dataset utilized in this study was acquired from 24 to 29 July 2023 using a GaiaSky-mini3-VN hyperspectral imager integrated with a DJI M300 RTK UAV (DJI, Shenzhen, China). Data collection was performed under clear, cloud-free weather conditions, with an average temperature of approximately 28 °C. The survey primarily targeted the areas above the coal mining face within the mining site. The UAV was flown at an altitude of 100 m with 60% overlap in both heading and side directions. The resulting hyperspectral images achieved a spatial resolution of approximately 0.14 m. Detailed sensor specifications are provided in Table 1. Preprocessing procedures included radiometric calibration, atmospheric and geometric correction, orthorectification, as well as image cropping and mosaicking.

3. Methodology

3.1. Feature Extraction at Sample Points

3.1.1. Structural Features from LiDAR

The primary structural information for individual trees was derived from the Canopy Height Model (CHM). To ensure geometric consistency with other data sources, the CHM was resampled to match the spatial resolution and coordinate system of the hyperspectral and RGB orthomosaic.

For each of the 1095 field-surveyed sample trees, a structural feature, tree height, was extracted. This was accomplished by creating a circular buffer zone with a predefined radius of 0.5 m centered on the precise GNSS coordinates of each tree. The maximum pixel value within this buffer zone was extracted from the CHM and recorded as the representative height for that individual (Tree Height, TH). This method mitigates the potential misalignment between the sample point and the actual treetop location in the raster data.
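The buffer-based height extraction can be sketched as follows. This is a minimal illustration, assuming a north-up CHM stored as a list of rows with its origin at the upper-left corner. The function name, the grid layout, and the toy values are hypothetical; only the 0.5 m radius and the 0.1 m cell size come from the text.

```python
import math


def tree_height(chm, x, y, origin_x, origin_y, cell=0.1, radius=0.5):
    """Maximum CHM value among cells whose centres lie within `radius` of (x, y)."""
    col_c = int(math.floor((x - origin_x) / cell))
    row_c = int(math.floor((origin_y - y) / cell))
    reach = int(math.ceil(radius / cell))  # search half-width in cells
    best = None
    for r in range(max(0, row_c - reach), min(len(chm), row_c + reach + 1)):
        for c in range(max(0, col_c - reach), min(len(chm[0]), col_c + reach + 1)):
            cx = origin_x + (c + 0.5) * cell  # cell-centre easting
            cy = origin_y - (r + 0.5) * cell  # cell-centre northing
            if math.hypot(cx - x, cy - y) <= radius:
                if best is None or chm[r][c] > best:
                    best = chm[r][c]
    return best


# Toy 2 m x 2 m CHM: a 7.5 m treetop near the GNSS point and a taller
# 15 m object that lies outside the 0.5 m buffer.
chm = [[2.0] * 20 for _ in range(20)]
chm[8][12] = 7.5
chm[2][2] = 15.0
th = tree_height(chm, x=1.0, y=-1.0, origin_x=0.0, origin_y=0.0)
```

Taking the maximum rather than the value at the exact coordinate is what absorbs small offsets between the surveyed point and the raster treetop, while the radius keeps taller neighbours from being picked up.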

While LiDAR point clouds can yield a vast array of structural descriptors (e.g., canopy volume, density profiles, or complexity metrics), this study intentionally focused on the fundamental metric of height for the model comparison phase. This deliberate simplification serves two purposes: (1) it provides a clear, universally understood structural variable to test its discriminative power within the fused feature set, and (2) it allows the subsequent comparative analysis of machine learning classifiers to focus on the interaction between spectral, textural, and this basic structural attribute. This deliberate simplification is justified because TH serves as a stable proxy for tree size and vertical position—attributes consistently linked to species-specific growth strategies in mixed mining restoration forests. Thus, the LiDAR contribution in this study is intentionally minimal. We do not assert that this represents a full exploitation of LiDAR point clouds. The term ‘fusion’ is used to indicate that data from three sensor types were co-registered and combined in the feature space, not that every possible feature from each sensor was extracted.

3.1.2. Spectral Features from Hyperspectral Data

Twenty-nine spectral features were calculated [34], focusing on vegetation stress indicators relevant to mining environments (Table 2).

3.1.3. Textural Features from RGB Imagery

For each RGB band, 8 textural features, listed in Table 3, were extracted using Gray-Level Co-occurrence Matrix (GLCM) analysis.

Table 3 lists the 8 GLCM texture features per band, giving a total of 24 texture features across the three RGB bands. Each feature is computed from the normalized co-occurrence matrix

P_{i, j} = \frac{V_{i, j}}{\sum_{i, j = 0}^{N - 1} V_{i, j}}

where V_{i, j} represents the pixel value at row i and column j within the moving window, and N denotes the window dimension. A 3 × 3 moving window was empirically selected as a balance between capturing fine-scale textural variations within the crown and maintaining computational efficiency. In initial explorations, larger windows (e.g., 5 × 5, 7 × 7) tended to introduce excessive smoothing; however, a systematic optimization of window size was not performed.
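To make the normalization step concrete, the sketch below builds a symmetric co-occurrence matrix for one 3 × 3 window, normalizes it so the entries sum to one, and derives two common GLCM statistics. The horizontal offset, the four gray levels, and the choice of contrast and homogeneity are assumptions made for illustration; the eight features actually used are those listed in Table 3.

```python
def glcm_probabilities(window, levels, offset=(0, 1), symmetric=True):
    """Normalized co-occurrence matrix P for an integer-quantized window."""
    dr, dc = offset
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = window[r][c], window[r2][c2]
                counts[a][b] += 1
                if symmetric:
                    counts[b][a] += 1  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def contrast(p):
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))


def homogeneity(p):
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(len(p)) for j in range(len(p)))


# One 3 x 3 window, already quantized to 4 gray levels (hypothetical values).
window = [[0, 0, 1],
          [1, 2, 2],
          [3, 3, 3]]
p = glcm_probabilities(window, levels=4)
window_contrast = contrast(p)
window_homogeneity = homogeneity(p)
```

With only six horizontal pixel pairs in a 3 × 3 window, each matrix is sparse, which is one reason the window size trades fine detail against statistical stability.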

3.2. Dataset Construction and Splitting

The predictive modeling dataset was constructed by integrating the field-collected species labels with the corresponding multi-source feature vectors extracted from the co-registered remote sensing data. This resulted in a total of 1095 complete and valid samples, each representing an individual tree characterized by a 278-dimensional feature set (224 spectral bands, 29 hyperspectral features, 24 textural, 1 structural) and its ground-truth species class.

To ensure a proper evaluation of model generalization, the complete dataset was partitioned into three independent subsets using a stratified random split: a training set (70%, n = 767 samples), a validation set (15%, n = 164 samples), and an independent test set (15%, n = 164 samples).

The stratified splitting was performed using a fixed random seed to guarantee reproducibility. It preserved the original proportion of each tree species across all three subsets, ensuring their representativeness. After splitting the dataset, we performed feature scaling tailored to each classifier’s requirements. The details of this scaling strategy are described in Section 3.3.2, along with the corresponding model training protocols. All preprocessing steps—including any necessary normalization—were fitted exclusively on the training set to prevent data leakage, and the derived parameters were then applied to the validation and test sets.
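A stratified 70/15/15 split of this kind can be sketched with the standard library alone. The function name, the seed value 42, and the per-class rounding rule are assumptions for illustration; the class counts are those reported in Section 2.2.1.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15), seed=42):
    """Return (train, val, test) index lists that preserve class proportions."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, val, test = [], [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        n_train = round(fractions[0] * len(members))
        n_val = round(fractions[1] * len(members))
        train += members[:n_train]
        val += members[n_train:n_train + n_val]
        test += members[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


counts = {
    "Ligustrum quihoui": 27,
    "Populus tomentosa": 108,
    "Quercus variabilis": 116,
    "Sophora japonica": 688,
    "Shrubs": 156,
}
labels = [name for name, n in counts.items() for _ in range(n)]
train_idx, val_idx, test_idx = stratified_split(labels)
```

Because rounding is applied per class, the validation and test sizes produced by this sketch can differ by a sample from the reported 164/164, while every class, including the 27-sample Ligustrum quihoui, remains present in all three subsets.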

To address the potential impact of the observed class imbalance (27 Ligustrum quihoui vs. 688 Sophora japonica), we assessed model performance using metrics that are robust to skewed class distributions, such as the F1-score and the macro-averaged accuracy, rather than relying solely on overall accuracy. While resampling techniques were considered, we opted to preserve the natural distribution of the tree species so that the model reflects the actual vegetation composition of the Yellow River Basin, thereby avoiding the introduction of synthetic biases.

3.3. Machine Learning Classifiers and Training

3.3.1. Classification Algorithms Used for Comparison

Seven established algorithms were selected to ensure comprehensive comparison, covering a range of learning paradigms including linear, instance-based, tree-based, and ensemble methods:

Logistic Regression (LR) [47]: Linear classifier; baseline for assessing linear separability.
K-Nearest Neighbors (KNN) [48]: Instance-based; evaluates local discriminative power.
Support Vector Machine (SVM) [49]: Non-linear kernel method; maximizes margin.
Decision Tree (DT) [50]: Interpretable rule-based model; foundation for ensembles.
Random Forest (RF) [51]: Bagging ensemble; stable, with feature importance.
Gradient Boosting (GB) [52]: Boosting ensemble; sequential error correction.
XGBoost [53]: Optimized gradient boosting with regularization; state-of-the-art.

This portfolio systematically contrasts inductive biases, namely linear separability (LR), local similarity (KNN), margin separation (SVM), hierarchical rules (DT), bagging (RF), and boosting (GB/XGBoost), for tree species classification using multi-source features.

3.3.2. Model Training and Hyperparameter Optimization

To ensure a fair and reproducible comparison of the seven machine learning algorithms, a consistent and theoretically grounded training protocol was established. Given the high-dimensional feature space relative to the sample size, the primary objective was to configure each model to achieve stable generalization while mitigating the risk of overfitting.

A fixed set of well-established hyperparameters was selected for each classifier, informed by common practices in remote sensing classification and ecological modeling. These parameters were chosen to balance model complexity with predictive stability, avoiding extensive tuning that could lead to over-optimism on the validation set. The key hyperparameter configurations are summarized in Table 4.

As outlined in Section 3.2, feature scaling was applied selectively according to each classifier’s sensitivity to input scale. Algorithms whose performance depends on the scale of the input features (SVM, KNN, and LR) were trained and evaluated on standardized features (zero mean, unit variance). In contrast, the tree-based models (DT, RF, GB, and XGBoost) are invariant to monotonic feature transformations and were therefore applied to the original, unscaled feature values. This protocol lets each algorithm operate under the input conditions for which it is designed.
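The selective scaling rule, together with the requirement that scaling parameters come from the training data only, can be sketched as below. The helper names and the toy feature rows are hypothetical, and the population standard deviation is an assumption chosen to mirror common library defaults.

```python
import math

SCALE_SENSITIVE = {"SVM", "KNN", "LR"}  # models trained on standardized inputs


def fit_standardizer(train_rows):
    """Per-feature mean and standard deviation, computed on training rows only."""
    n = len(train_rows)
    n_features = len(train_rows[0])
    means = [sum(row[j] for row in train_rows) / n for j in range(n_features)]
    stds = []
    for j in range(n_features):
        var = sum((row[j] - means[j]) ** 2 for row in train_rows) / n
        stds.append(math.sqrt(var) or 1.0)  # constant feature: avoid division by zero
    return means, stds


def transform(rows, means, stds):
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def prepare(model_name, train_rows, other_rows):
    """Standardize only for scale-sensitive models; tree-based models get raw values."""
    if model_name not in SCALE_SENSITIVE:
        return train_rows, other_rows
    means, stds = fit_standardizer(train_rows)
    return transform(train_rows, means, stds), transform(other_rows, means, stds)


train = [[1.0, 10.0], [3.0, 10.0]]
held_out = [[5.0, 12.0]]
svm_train, svm_held_out = prepare("SVM", train, held_out)
xgb_train, xgb_held_out = prepare("XGBoost", train, held_out)
```

The held-out rows are transformed with the training means and standard deviations, so no statistic from validation or test data enters the fitted scaler.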

All models were trained on the combined training and validation subset (n = 931). The final, locked model for each algorithm was then applied to the strictly independent test set (n = 164) to obtain an unbiased estimate of its predictive performance, as reported in Section 4.1. This entire process was conducted with a fixed random seed to guarantee the complete reproducibility of the results.

3.4. Model Evaluation

3.4.1. Core Classification Accuracy Metrics

The classification performance was assessed using four standard metrics: Overall Accuracy (OA), User’s Accuracy (UA), Producer’s Accuracy (PA), and the Kappa coefficient [54]. Specifically, OA quantifies the overall classification accuracy, while UA and PA assess the accuracy of individual class-specific classifications. The Kappa coefficient measures the agreement between predicted and observed classifications beyond chance, providing a more stable assessment of classification reliability than overall accuracy alone. In addition to these metrics, the F1-score provides a more complete reflection of the reliability of the classification results in the case of class imbalance. Therefore, in this study, we also calculated the F1-score as a supplementary metric to enhance the stability of classification accuracy assessment. These metrics were computed using the following equations:

OA = \frac{\sum_{i = 1}^{n} P_{i}}{N}

(1)

UA = \frac{P_{i}}{P_{i +}}

(2)

PA = \frac{P_{i}}{P_{+ i}}

(3)

Kappa = \frac{N \sum_{i = 1}^{n} P_{i} - \sum_{i = 1}^{n} (P_{i +} \times P_{+ i})}{N^{2} - \sum_{i = 1}^{n} (P_{i +} \times P_{+ i})}

(4)

F1\text{-}score = 2 \times \frac{Precision \times Recall}{Precision + Recall}

(5)

In Equations (1)–(5), n denotes the total number of classes, and N represents the overall number of validation samples. For each class i, P_{i} is the number of correctly classified samples, P_{i +} is the total number of samples assigned to that class, and P_{+ i} is the total number of samples actually belonging to that class. User’s Accuracy (UA) therefore corresponds to Precision, and Producer’s Accuracy (PA) corresponds to Recall.
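As a worked illustration of these metrics, the sketch below computes them from a confusion matrix whose rows are the assigned (predicted) classes and whose columns are the reference classes, so that UA behaves as Precision and PA as Recall. The function name and the 3-class matrix are hypothetical and are not results from this study.

```python
def accuracy_metrics(cm):
    """cm[i][j] = samples assigned to class i whose reference class is j."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    correct = [cm[i][i] for i in range(n_classes)]
    assigned = [sum(cm[i]) for i in range(n_classes)]  # row totals
    actual = [sum(cm[r][i] for r in range(n_classes)) for i in range(n_classes)]  # column totals
    oa = sum(correct) / total
    ua = [c / a if a else 0.0 for c, a in zip(correct, assigned)]  # precision
    pa = [c / a if a else 0.0 for c, a in zip(correct, actual)]    # recall
    chance = sum(a * b for a, b in zip(assigned, actual))
    kappa = (total * sum(correct) - chance) / (total ** 2 - chance)
    f1 = [2 * u * p / (u + p) if (u + p) else 0.0 for u, p in zip(ua, pa)]
    return {"OA": oa, "UA": ua, "PA": pa, "Kappa": kappa, "F1": f1}


# Hypothetical 3-class confusion matrix with 100 samples.
cm = [[50, 3, 2],
      [4, 30, 3],
      [1, 2, 5]]
metrics = accuracy_metrics(cm)
```

In this toy matrix the smallest class has UA = 0.625 and PA = 0.5 even though OA is 0.85, which is the kind of gap that motivates reporting per-class metrics and the F1-score alongside overall accuracy under class imbalance.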

3.4.2. Statistical Significance Testing

A 5-fold cross-validation repeated 5 times (denoted as 5 × 5 cross-validation) was performed, resulting in 25 different training-validation splits. This procedure accounts for both model variability and data sampling effects, thereby enhancing the stability of the statistical inference. To determine whether the differences observed in classifier performance were statistically significant and not due to random chance, a two-step non-parametric statistical testing procedure was employed:

The Friedman test evaluates the null hypothesis that all classifiers perform equally. Based on the cross-validation accuracy scores, we calculated the Friedman test statistic as defined [55]:

F_{r} = \frac{12}{n k (k + 1)} \sum_{j = 1}^{k} R_{j}^{2} - 3 n (k + 1)

(6)

where n is the number of blocks (here, the cross-validation splits), k is the number of classifiers being compared, and R_{j} is the sum of ranks obtained by classifier j across all blocks.
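The Friedman statistic can be computed by ranking the classifiers within each cross-validation split and summing the ranks per classifier. The sketch below assumes rows are splits (blocks), columns are classifiers, and rank 1 goes to the highest accuracy. The function names and the toy scores are hypothetical, and no tie correction is applied to the statistic.

```python
def average_ranks(values):
    """Rank values in descending order (best = 1), averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the tied rank positions
        for i in order[pos:end + 1]:
            ranks[i] = shared
        pos = end + 1
    return ranks


def friedman_statistic(scores):
    """scores[b][j] = accuracy of classifier j on block b."""
    n_blocks = len(scores)
    n_classifiers = len(scores[0])
    rank_sums = [0.0] * n_classifiers
    for block in scores:
        for j, r in enumerate(average_ranks(block)):
            rank_sums[j] += r
    stat = (12.0 / (n_blocks * n_classifiers * (n_classifiers + 1))
            * sum(r ** 2 for r in rank_sums)
            - 3.0 * n_blocks * (n_classifiers + 1))
    return stat, rank_sums


# Toy example: 4 splits, 3 classifiers (hypothetical accuracies).
scores = [[0.90, 0.85, 0.80],
          [0.88, 0.86, 0.79],
          [0.91, 0.84, 0.82],
          [0.87, 0.89, 0.81]]
stat, rank_sums = friedman_statistic(scores)
```

The statistic is then compared against a chi-squared distribution with one degree of freedom fewer than the number of classifiers to obtain the p-value.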

To test the null hypothesis that all seven classifiers exhibit equivalent performance, the Friedman test was initially applied to the OA and Kappa values obtained from the test set. A significant result (p < 0.05) would indicate that not all classifiers are equivalent.

Given the significant result of the Friedman test, we performed post hoc pairwise comparisons [56] using the Nemenyi test to identify which specific pairs of classifiers exhibited statistically significant performance differences. The Nemenyi test assesses the significance of differences between classifiers by computing a Critical Difference (CD) value, defined as:

CD = q_{\alpha, k, \infty} \sqrt{\frac{k (k + 1)}{6 n}}

(7)

where q_{\alpha, k, \infty} is the critical value derived from the Studentized range statistic, k denotes the number of groups, and n is the number of blocks.

This test identifies which specific pairs of classifiers differ significantly in their performance, controlling for multiple comparisons. The results are used to group classifiers into statistically distinct tiers of performance.
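For orientation, the Critical Difference can be evaluated numerically as sketched below. The critical values are the constants commonly tabulated for the Nemenyi test at α = 0.05 and should be checked against a statistical table before reuse. Treating the 25 cross-validation splits as the blocks is an assumption, and the classifier labels A to G with their average ranks are hypothetical placeholders, not results from this study.

```python
import math

# Commonly tabulated Nemenyi critical values at alpha = 0.05, keyed by
# the number of groups k (verify against a published table before reuse).
Q_ALPHA_005 = {2: 1.960, 3: 2.343, 4: 2.569, 5: 2.728, 6: 2.850,
               7: 2.949, 8: 3.031, 9: 3.102, 10: 3.164}


def critical_difference(k, n, q_table=Q_ALPHA_005):
    """CD for k groups compared over n blocks."""
    return q_table[k] * math.sqrt(k * (k + 1) / (6.0 * n))


def significant_pairs(avg_ranks, n):
    """Pairs whose average ranks differ by more than the Critical Difference."""
    cd = critical_difference(len(avg_ranks), n)
    names = sorted(avg_ranks, key=avg_ranks.get)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if avg_ranks[b] - avg_ranks[a] > cd]


cd = critical_difference(k=7, n=25)  # assumes 7 classifiers and 25 blocks

# Hypothetical average ranks (lower is better).
avg_ranks = {"A": 1.5, "B": 2.5, "C": 3.0, "D": 4.0, "E": 5.0, "F": 5.5, "G": 6.5}
pairs = significant_pairs(avg_ranks, n=25)
```

Under these assumptions the Critical Difference is roughly 1.8 rank units, so two classifiers would need average ranks about two positions apart before the difference is declared significant.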

3.5. Operational Application for Wall-to-Wall Mapping

3.5.1. Treetop Detection of the Mining Area

Accurate treetop detection is a prerequisite for subsequent individual tree-level analysis, such as species classification [57]. However, this task is particularly challenging in post-mining restoration landscapes due to the complex and heterogeneous environment. The study area is characterized by steep slopes, undulating terrain from past excavation and deposition, and a mosaic of planted tree stands at varying stages of growth intermixed with patches of shrubs, grasses, bare soil, and remnant mining infrastructure. If a standard area-wide detection algorithm is applied, this heterogeneity can cause significant commission errors, in which non-tree objects such as buildings, rocks, or dense shrub patches are falsely identified as trees.

To minimize false positives and focus on forested areas, we first created a precise vegetation mask by manually delineating forest boundaries directly on the ultra-high-resolution RGB orthomosaic, thereby excluding buildings, roads, and agricultural fields. All subsequent processing was restricted to this masked forest region.

For treetop detection within this masked region, we employed the core principles of the Watershed-Spectral-Textural-controlled Normalized Cut (WST-Ncut) framework [34], adapting it specifically for pinpointing tree apex locations rather than performing full crown delineation. The rationale for this adaptation was twofold: (1) The complex canopy structure in the mixed-species, uneven-aged stands—featuring overlapping crowns and varied crown morphologies—makes unambiguous boundary delineation highly uncertain and error-prone at this stage. (2) The primary goal for the wall-to-wall mapping was to generate reliable point locations (treetops) for feature extraction, as the classification model was trained on features extracted from field-located tree points.

The adapted procedure was as follows. First, a local maximum filter was applied to the Canopy Height Model (CHM) within the masked area to identify potential treetop candidates. Each local maximum seed was then used to initiate a marker-controlled watershed segmentation on the CHM, producing an initial set of over-segmented crown objects. Crucially, instead of proceeding to merge these objects into final crowns, we extracted the mean reflectance values from five representative hyperspectral bands centered at 468 nm (blue-edge), 550 nm (green), 670 nm (red), 720 nm (red-edge), and 800 nm (NIR). A graph was constructed where nodes represented these segments, and edge weights reflected their spectral-textural similarity and spatial proximity. The normalized cut algorithm was then applied, not to merge segments into crowns, but to prune and validate the initial treetop seeds. Seeds associated with segments that were grouped into inconsistent or spectrally anomalous clusters (suggesting they originated from non-tree objects or noise in the CHM) were filtered out. This step leveraged the multi-source data fusion to reduce errors from terrain artifacts or within-crown height variations that commonly plague CHM-based methods alone.
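Only the first step of this procedure, the local maximum filter that proposes treetop candidates, is sketched below; the watershed segmentation and the normalized-cut pruning are not reproduced. The window size, the 2 m minimum height, the function name, and the toy CHM are hypothetical choices for illustration, since the text does not report these parameters.

```python
def treetop_seeds(chm, half_window=1, min_height=2.0):
    """Return (row, col, height) for cells strictly higher than all neighbours."""
    rows, cols = len(chm), len(chm[0])
    seeds = []
    for r in range(rows):
        for c in range(cols):
            h = chm[r][c]
            if h < min_height:  # skip ground, grass, and low shrubs
                continue
            neighbours = [
                chm[rr][cc]
                for rr in range(max(0, r - half_window), min(rows, r + half_window + 1))
                for cc in range(max(0, c - half_window), min(cols, c + half_window + 1))
                if (rr, cc) != (r, c)
            ]
            if all(h > v for v in neighbours):
                seeds.append((r, c, h))
    return seeds


# Toy CHM in metres with two crowns (hypothetical values).
chm = [[0.0, 0.0, 0.0, 0.0, 0.0],
       [0.0, 6.0, 3.0, 0.0, 0.0],
       [0.0, 3.0, 2.0, 1.0, 0.0],
       [0.0, 0.0, 1.0, 4.5, 1.0],
       [0.0, 0.0, 0.0, 1.0, 0.5]]
seeds = treetop_seeds(chm)
```

The strict comparison means flat-topped crowns yield no seed in this sketch; a production filter would need a tie-breaking rule, and the later pruning stage is what removes candidates arising from terrain artifacts or non-tree objects.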

Thus, the output was a refined set of geolocated treetops, optimized for the complex mining area context by integrating structural (CHM), spectral, and textural information to enhance detection accuracy and reliability. This conservative approach of generating high-confidence treetop locations, rather than potentially erroneous full crown polygons, provided the optimal input for the next stage: extracting the identical set of 278 features (as used for model training) for each detected tree and applying the optimal model for tree species prediction across the entire site.

3.5.2. Feature Extraction for All Objects

Following the treetop detection within the forest mask, the identical suite of 278 remote sensing features used for model training was extracted for each detected tree location to ensure consistency between model development and application. This step transformed the structural map of individual trees into a comprehensive feature database ready for species prediction. The extraction was performed on each detected treetop coordinate to ensure that the extracted features represented the spectral, textural, and structural properties of the individual tree crown as an integrated entity.

For each detected tree object, features were extracted from the three co-registered data sources following the same protocols defined in Section 3.1.

This process generated a feature matrix for the entire study area, where each row corresponded to a detected tree and each column to one of the 278 predefined features. As described in Section 3.3.2, classifiers differ in their sensitivity to feature scaling, so the scaling applied to the wall-to-wall feature matrix depended on the optimal model selected through the comparative analysis in Section 4.1. For a scale-sensitive algorithm (e.g., SVM, KNN, or LR), the feature matrix was standardized using the mean and standard deviation derived exclusively from the training set (as established in Section 3.2) to ensure consistency and prevent data leakage. For a tree-based method (e.g., DT, RF, GB, or XGBoost), which is invariant to monotonic feature transformations, the original unscaled feature values were used directly for prediction. This conditional approach ensured that the feature preprocessing for wall-to-wall mapping matched the conditions under which the final model was trained and validated.
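The conditional scaling rule can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the toy feature values are hypothetical, and the population standard deviation is used on the assumption that scaling follows the common standardization convention.

```python
from statistics import mean, pstdev

# Algorithms the text describes as sensitive to feature scale
SCALE_SENSITIVE = {"SVM", "KNN", "LR"}


def fit_scaler(train_rows):
    """Compute per-feature mean and standard deviation from the training set only."""
    columns = list(zip(*train_rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]  # guard against constant features
    return means, stds


def prepare(rows, model_name, scaler):
    """Standardize rows for scale-sensitive models; pass them through otherwise."""
    if model_name not in SCALE_SENSITIVE:
        return [list(row) for row in rows]
    means, stds = scaler
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Toy data: two features per tree (illustrative values only)
train = [[10.0, 0.2], [14.0, 0.4], [12.0, 0.6]]
site = [[16.0, 0.4]]

scaler = fit_scaler(train)                    # statistics never see the site-wide matrix
scaled = prepare(site, "SVM", scaler)         # standardized with training statistics
unscaled = prepare(site, "XGBoost", scaler)   # left untouched for tree-based models
```

Fitting the scaler on the training set and reusing it for the wall-to-wall matrix is what prevents information from the prediction data leaking into preprocessing.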

3.5.3. Spatial Prediction

The final stage of the workflow involved applying the optimized classifier, selected through the comprehensive comparative analysis, to predict the tree species for every individual object detected across the Yushan mining area. This step transformed the feature matrix of all detected trees into a spatially explicit species distribution map, fulfilling the applied objective of the study.

Depending on the scaling requirements of the optimal classifier identified in Section 4.1, the feature matrix was either used in its original unscaled form (if a tree-based method was selected) or standardized using the training-set-derived parameters (if a scale-sensitive algorithm was chosen). This preprocessed feature matrix then served as the input for the predictive model. The pre-trained and saved optimized classifier model, along with the associated label encoder, was loaded. For each feature vector representing a detected tree, the model generated a vector of prediction probabilities across all species classes present in the training data. The species label for each tree was determined by selecting the class with the highest predicted probability.
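The final labelling step amounts to an argmax over each probability vector followed by decoding of the class index. The sketch below assumes an alphabetical class order, as a label encoder would typically produce, and uses made-up probability vectors; neither detail is stated in the text.

```python
# Assumed (alphabetical) class order; the actual encoder order is not given in the text
CLASS_NAMES = [
    "Ligustrum quihoui",
    "Populus tomentosa",
    "Quercus variabilis",
    "Shrubs",
    "Sophora japonica",
]


def predict_labels(probabilities, class_names):
    """Return (label, probability) for the most probable class of each tree."""
    results = []
    for probs in probabilities:
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((class_names[best], probs[best]))
    return results


# Hypothetical probability vectors for two detected trees
probabilities = [
    [0.05, 0.10, 0.05, 0.70, 0.10],
    [0.10, 0.15, 0.20, 0.15, 0.40],
]
predictions = predict_labels(probabilities, CLASS_NAMES)
print(predictions)
```

Keeping the winning probability alongside the label makes it possible to flag low-confidence trees, such as the second one here, for later inspection.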

Finally, the vector of species labels and confidence values was joined back to the spatial layer containing the coordinates of the detected treetops. This created a georeferenced point dataset, which was then used to generate the final wall-to-wall raster classification map of the mining area. For visualization and area summary statistics, each point was rasterized using the tree’s coordinates and assigned the pixel value corresponding to its predicted species class. The resulting map not only visualizes the spatial distribution of tree species but also enables quantitative analysis, such as calculating the relative abundance and spatial aggregation patterns of different species across the complex post-mining landscape. This final product directly serves the management objective of providing a detailed, high-resolution baseline inventory for monitoring ecological restoration progress.
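Rasterizing the point dataset and summarizing relative abundance can be sketched as below. The grid origin, pixel size, class codes, and coordinates are hypothetical; a production workflow would more likely rely on a GIS library, which is not assumed here.

```python
import math
from collections import Counter


def rasterize(points, x_min, y_max, pixel_size, n_rows, n_cols, nodata=0):
    """Burn (x, y, class_id) points into a grid whose origin is the top-left corner."""
    grid = [[nodata] * n_cols for _ in range(n_rows)]
    for x, y, class_id in points:
        col = math.floor((x - x_min) / pixel_size)
        row = math.floor((y_max - y) / pixel_size)  # rows increase southwards
        if 0 <= row < n_rows and 0 <= col < n_cols:
            grid[row][col] = class_id
    return grid


def relative_abundance(points):
    """Share of detected trees per class code."""
    counts = Counter(class_id for _, _, class_id in points)
    total = sum(counts.values())
    return {class_id: n / total for class_id, n in counts.items()}


# Hypothetical treetops: (x, y, predicted class code)
points = [(0.5, 2.5, 1), (1.5, 2.5, 1), (2.2, 0.3, 2)]
grid = rasterize(points, x_min=0.0, y_max=3.0, pixel_size=1.0, n_rows=3, n_cols=3)
shares = relative_abundance(points)
```

Abundance is computed from the points rather than from the raster so that two trees falling in the same pixel are both counted.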

4. Results

4.1. Comparative Performance of Machine Learning Models

4.1.1. Overall Model Performance

The comprehensive evaluation of seven machine learning classifiers revealed distinct performance patterns (Table 5). XGBoost emerged as the superior algorithm, achieving the highest test accuracy (OA = 0.897) and Kappa coefficient (κ = 0.811), closely followed by Gradient Boosting (OA = 0.891, κ = 0.796). Ensemble methods dominated the performance ranking, with boosting algorithms occupying the top two positions. Random Forest secured third place (OA = 0.824), while Logistic Regression and K-Nearest Neighbors achieved identical test accuracies (OA = 0.812). Decision Tree (OA = 0.794) and Support Vector Machine (OA = 0.715) demonstrated progressively lower performance.

4.1.2. Statistical Significance Testing Based on 5 × 5 Cross-Validation

1. Friedman Test for Overall Performance Differences

A 5 × 5 cross-validation scheme combined with the non-parametric Friedman test was employed to systematically assess whether the observed performance differences among the seven classifiers were statistically significant. This approach accounts for both model variability and data sampling effects, thereby enhancing the stability of the statistical inference. Figure 3 presents the average ranks of the seven classifiers across the 25 cross-validation folds.
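A 5 × 5 scheme (five repetitions of five-fold cross-validation) yields the 25 folds referred to above. The sketch below shows one way to generate such splits; stratification by class and the random seed are assumptions, since the text does not state how folds were drawn.

```python
import random
from collections import defaultdict


def repeated_stratified_folds(labels, n_splits=5, n_repeats=5, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated, class-stratified k-fold CV."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            # Deal each class round-robin so every fold keeps the class proportions
            for pos, idx in enumerate(shuffled):
                folds[pos % n_splits].append(idx)
        for k in range(n_splits):
            test_idx = sorted(folds[k])
            train_idx = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
            yield train_idx, test_idx


# Toy labels with a 2:1 class imbalance (illustrative only)
labels = ["A"] * 20 + ["B"] * 10
splits = list(repeated_stratified_folds(labels))
print(len(splits))  # 5 repeats x 5 folds
```

Each classifier is then scored on all 25 test folds, and these paired scores feed the rank-based tests.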

The Friedman test yielded a chi-square statistic of 138.4187 with a p-value < 0.000001, providing strong evidence to reject the null hypothesis. This result confirms that the seven algorithms exhibit statistically significant differences in classification performance.
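For reference, the Friedman chi-square statistic can be computed from the per-fold accuracy scores as sketched below. The scores in the example are invented to keep the arithmetic checkable, and the formula shown omits any tie correction, so it should be read as an illustration rather than a reproduction of the reported value.

```python
def average_ranks(scores_per_fold):
    """Rank classifiers within each fold (1 = highest score; ties share the mean rank)."""
    k = len(scores_per_fold[0])
    rank_sums = [0.0] * k
    for scores in scores_per_fold:
        order = sorted(range(k), key=lambda j: -scores[j])
        i = 0
        while i < k:
            j = i
            while j + 1 < k and scores[order[j + 1]] == scores[order[i]]:
                j += 1
            shared_rank = (i + j) / 2 + 1
            for pos in range(i, j + 1):
                rank_sums[order[pos]] += shared_rank
            i = j + 1
    n = len(scores_per_fold)
    return [s / n for s in rank_sums]


def friedman_chi_square(avg_ranks, n_folds):
    """Friedman statistic: 12N / (k(k+1)) * (sum of squared mean ranks - k(k+1)^2 / 4)."""
    k = len(avg_ranks)
    return 12 * n_folds / (k * (k + 1)) * (
        sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4
    )


# Invented accuracies: 4 folds (rows) x 3 classifiers (columns)
scores = [
    [0.90, 0.80, 0.70],
    [0.90, 0.80, 0.70],
    [0.80, 0.90, 0.70],
    [0.90, 0.80, 0.70],
]
ranks = average_ranks(scores)
chi2 = friedman_chi_square(ranks, n_folds=len(scores))
print(ranks, chi2)
```

With k classifiers the statistic is compared against a chi-square distribution with k − 1 degrees of freedom, that is, 6 degrees of freedom for the seven classifiers evaluated here.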

2. Model Ranking and Cross-Validation Performance

The average ranks and performance metrics derived from the 5 × 5 cross-validation procedure are presented in Table 6, where a lower average rank indicates superior performance.

The cross-validation results exhibit strong consistency with the independent test set performance. XGBoost maintained the highest ranking across both evaluation approaches, with a cross-validation mean accuracy of 0.8877 closely matching its test set accuracy of 0.8970. This alignment underscores the stable generalization capability of the models.

3. Post hoc Nemenyi Test for Pairwise Comparisons

The critical difference (CD) was calculated as 1.644, and pairwise rank differences exceeding this threshold were deemed statistically significant. In Figure 4, classifiers connected by a horizontal line are not significantly different at α = 0.05. The results delineate a clear performance hierarchy:

Top tier: XGBoost and Gradient Boosting exhibited statistically equivalent performance (rank difference = 0.800 < CD).

Intermediate tier: Random Forest, Logistic Regression, and K-Nearest Neighbors formed a middle group, significantly outperformed by the top tier, though within-tier differences were largely non-significant.

Lower tier: Decision Tree and SVM ranked lowest, demonstrating statistically significant inferiority to all other classifiers.
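The Nemenyi critical difference follows the standard form CD = q_α · sqrt(k(k + 1) / (6N)), where k is the number of classifiers and N the number of folds. In the sketch below, q_α is left as an explicit argument because it must be read from a table of studentized-range critical values for the chosen α, and the text does not report the value used; the function names are illustrative.

```python
import math


def nemenyi_cd(q_alpha, k, n):
    """Critical difference for k classifiers compared over n folds."""
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n))


def is_significant(rank_difference, cd):
    """A pair differs significantly when its mean-rank gap exceeds the CD."""
    return abs(rank_difference) > cd


# Scale factor for this study's design (k = 7 classifiers, N = 25 folds):
# the CD equals q_alpha multiplied by this value.
scale = nemenyi_cd(1.0, k=7, n=25)

# Values reported in the text: CD = 1.644, XGBoost vs Gradient Boosting gap = 0.800
top_tier_differs = is_significant(0.800, 1.644)
print(round(scale, 3), top_tier_differs)
```

Because the reported rank gap of 0.800 is below the reported CD, the check returns False, matching the statement that the two boosting models are statistically equivalent.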

4.1.3. Model Performance and Stability Analysis

1. Cross-Validation Performance Stability

The standard deviation of cross-validation accuracy reflects model stability (Figure 5):

High Performance, High Stability: XGBoost: 0.8877 ± 0.0162, Gradient Boosting: 0.8780 ± 0.0151. These algorithms demonstrate both high accuracy and low variability.

Moderate Performance, Moderate Stability: Random Forest: 0.8281 ± 0.0267, Logistic Regression: 0.8199 ± 0.0191, K-Nearest Neighbors: 0.8135 ± 0.0170.

Lower Performance, Higher Variability: Decision Tree: 0.7587 ± 0.0306, Support Vector Machine: 0.6944 ± 0.0237.

2. Performance-Computation Efficiency Trade-off

XGBoost achieves the optimal trade-off, delivering top-tier accuracy with a training time of 2.36 s—a 61.7% reduction relative to Gradient Boosting while maintaining statistically equivalent performance. KNN trains rapidly (0.002 s) but at a substantial cost to accuracy. Random Forest offers moderate efficiency (0.243 s) yet underperforms boosting-based models in classification accuracy (Figure 6).

3. Generalization Performance Assessment

Strong agreement between cross-validation and test set accuracy confirms stable generalization. XGBoost (CV: 0.8877, Test: 0.8970, Δ = +0.0093) and Gradient Boosting (CV: 0.8780, Test: 0.8909, Δ = +0.0129) exhibit slight positive gaps; both fall within one standard deviation of the corresponding cross-validation accuracy (0.0162 and 0.0151, respectively) and are therefore consistent with sampling variability rather than a systematic difference. Random Forest shows near-identical performance (CV: 0.8281, Test: 0.8242, Δ = −0.0039). Overall consistency across evaluation protocols underscores model stability (Figure 7).
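The gaps quoted above can be recomputed directly from the reported figures and compared with the cross-validation standard deviations given in Section 4.1.3. All numbers in the sketch are taken from the text; only the variable names are introduced here.

```python
# (CV mean accuracy, CV standard deviation, test accuracy) as reported in the text
results = {
    "XGBoost": (0.8877, 0.0162, 0.8970),
    "Gradient Boosting": (0.8780, 0.0151, 0.8909),
    "Random Forest": (0.8281, 0.0267, 0.8242),
}

gaps = {}
for name, (cv_mean, cv_std, test_acc) in results.items():
    gap = test_acc - cv_mean
    within_one_std = abs(gap) <= cv_std
    gaps[name] = (round(gap, 4), within_one_std)
    print(f"{name}: gap = {gap:+.4f}, within one CV std: {within_one_std}")
```

For all three models the absolute gap is smaller than the fold-to-fold standard deviation.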

4.1.4. Statistical Performance Tiers

Statistical analysis delineated three distinct performance tiers among the seven classifiers (Figure 8), with direct implications for ecological monitoring applications:

Tier 1 (Optimal)—XGBoost and Gradient Boosting: statistically equivalent top-tier accuracy; recommended for effective tree species discrimination.

Tier 2 (Competent)—Random Forest, Logistic Regression, KNN: viable alternatives under computational or interpretability constraints, though significantly outperformed by Tier 1.

Tier 3 (Limited)—Decision Tree and SVM: substantially lower accuracy; not recommended for detailed species mapping in this context.

4.1.5. Conclusion of Comparative Analysis

Comprehensive statistical evaluation using 5 × 5 cross-validation and significance testing supports four main conclusions:

(1) Significant performance differences exist among the seven classifiers (Friedman test: χ² = 138.42, p < 0.000001).

(2) XGBoost is the optimal classifier, achieving the best (lowest) average rank (1.10) with accuracy statistically equivalent to Gradient Boosting but superior computational efficiency.

(3) Three statistically distinct performance tiers were identified, with boosting ensembles (XGBoost, Gradient Boosting) consistently leading.

(4) Model stability is confirmed by strong agreement between cross-validation and test set performance, validating generalization capability.

These findings justify the selection of XGBoost for subsequent feature importance analysis and operational species mapping in the Yushan mining restoration area. The systematic 5 × 5 cross-validation framework ensures conclusions are stable against sampling variability, providing reliable, evidence-based guidance for ecological monitoring and restoration in the Yellow River Basin.

4.2. Feature Importance of the Optimal Model

To elucidate the decision-making mechanism of the optimal XGBoost model and validate the contribution of multi-source data, an analysis of feature importance was conducted. The results, measured by the mean decrease in impurity [58], reveal the relative contribution of individual features to the classification process. The top 20 most important features are listed in Figure 9, and the cumulative feature importance distribution is shown in Figure 10.
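A cumulative importance distribution of the kind shown in Figure 10 can be derived by sorting normalized importances and accumulating them. The feature names, importance values, and 80% threshold below are hypothetical and serve only to illustrate the calculation.

```python
def features_for_share(importances, share=0.8):
    """Return the top-ranked features whose cumulative normalized importance reaches `share`."""
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    cumulative, selected = 0.0, []
    for name, value in ranked:
        cumulative += value / total
        selected.append(name)
        if cumulative >= share:
            break
    return selected


# Hypothetical importances for four features (not values from the study)
importances = {"texture_a": 0.1, "band_a": 0.4, "PSSR": 0.2, "TH": 0.3}
top = features_for_share(importances, share=0.8)
print(top)
```

The length of the returned list indicates how many features are needed to account for a given share of the total importance.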

The XGBoost feature importance analysis reveals three key principles underlying its species discrimination capability:

(1) Blue-edge spectral dominance: the top-ranked feature (ρ468.62, importance = 0.0358), located at the short-wavelength edge of the 470–500 nm region, is sensitive to carotenoids and leaf surface traits. Four additional blue-region bands in the top ten underscore pigment-related absorption as critical for taxonomic differentiation. However, as demonstrated by the correlation filtering analysis (Section 4.3), the high importance of these blue-edge bands largely stems from strong collinearity with adjacent spectral bands. After removing highly correlated features (|r| > 0.9), none of the original blue-edge bands were retained, yet the XGBoost performance remained identical (OA = 0.897, Kappa = 0.811). Therefore, these importance values should be viewed as directional indications rather than precise ecological attributions. The discriminative information originally associated with blue-edge wavelengths can be equivalently captured by other, less correlated features (e.g., vegetation indices, tree height, texture).

(2) Structural–biochemical integration: tree height (TH) ranks third, confirming that three-dimensional structure is essential for accurate classification. Three vegetation indices (PSSR, VOG, RVSI) in the top eleven provide complementary physiological and biochemical information beyond raw spectra.

(3) Multi-spectral synergy: the model leverages features across green, red, red-edge, NIR, and NIR–SWIR transition regions, each linked to chlorophyll reflectance, absorption, leaf internal structure, or canopy water content. This broad spectral utilization enables robust species discrimination even when raw signatures are similar.

The feature importance results should be interpreted with caution. The importance metric used here, mean decrease in impurity, is known to be sensitive to correlated predictors. To empirically assess the impact of multicollinearity, we conducted a correlation-based filtering analysis (see Section 4.3). After removing one feature from each pair with an absolute Pearson correlation coefficient above 0.9, the retained set of 40 features yielded XGBoost performance identical to that of the full set (OA = 0.897, Kappa = 0.811). Notably, none of the original blue-edge bands were retained in this reduced set (see Supplementary Figure S2 for the list of retained features), indicating that their high importance in the full set largely stemmed from collinearity rather than independent discriminative information. Because model performance was unchanged without these bands, the same discriminative information is evidently captured by other features (e.g., vegetation indices, tree height, texture). The ecological interpretation of blue-edge importance should therefore be considered directional rather than precise.

These results demonstrate that XGBoost excels by synthesizing structural, pigment, physiological, and biophysical information, an approach consistent with ecological theory. The top-ranked features, namely blue-edge bands (~468 nm), key vegetation indices (PSSR, VOG), and structural metrics (TH), indicate where discriminative information is concentrated. However, because the importance of specific blue-edge bands is largely driven by collinearity, dimensionality reduction should target feature families (e.g., spectral indices, texture, structure) rather than individual narrow bands; doing so can preserve accuracy while improving efficiency in large-scale mapping. Future work should test the consistency of this importance pattern across diverse forest types and sensor systems.

4.3. Feature Redundancy Analysis

To assess the impact of feature redundancy and multicollinearity on model performance and feature importance interpretation, we conducted a correlation-based filtering analysis. Pairwise Pearson correlation coefficients were computed among all 278 features using the training set. Feature pairs with an absolute correlation coefficient greater than 0.9 were identified, and one feature from each such pair was removed, resulting in a reduced set of 40 relatively uncorrelated features (only ~14.4% of the original 278). The optimal classifier (XGBoost) was retrained on this reduced set following the same protocol as in Section 3.2.
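The filtering step can be sketched as a greedy pass over the features. The text does not specify which member of each correlated pair was removed, so the rule used here (keep the first feature encountered, drop later ones that correlate with it above the threshold) is an assumption, as are the feature names and values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant feature has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(columns, threshold=0.9):
    """Keep a feature only if |r| with every already-kept feature is at most the threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical training-set columns: two adjacent bands and tree height
columns = {
    "band_468": [1.0, 2.0, 3.0, 4.0, 5.0],
    "band_472": [1.1, 2.0, 3.2, 3.9, 5.1],
    "TH": [5.0, 3.0, 6.0, 2.0, 4.0],
}
kept = drop_correlated(columns)
print(kept)
```

Because the outcome of a greedy filter depends on feature order, the identity of the retained features (though not necessarily their number) can change if the columns are rearranged.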

The results are summarized in Supplementary Figure S1. XGBoost achieved an overall accuracy of 0.897 and a Kappa of 0.811 on the reduced feature set, identical to the performance obtained with the full 278-feature set. The training time decreased from 2.36 s to 0.53 s (a reduction of 77.5%).

Notably, although blue-edge bands ranked highly in the original feature importance analysis (Section 4.2), none of them were retained in the reduced set. This indicates that the high importance of these bands largely stemmed from collinearity with adjacent spectral bands, rather than from independent discriminative information. The fact that XGBoost performance remained unchanged after removing these bands demonstrates that the discriminative information originally attributed to specific blue-edge wavelengths can be equivalently captured by other features (e.g., vegetation indices, tree height, texture). This redundancy analysis supports the robustness of the model and suggests that future work can safely reduce feature dimensionality without loss of accuracy, while gaining substantial computational efficiency. Detailed results, including the list of retained features and the correlation matrix, are provided in the Supplementary Materials.

4.4. Class-Wise Performance of the Optimal Model

The classification performance of the optimal model for each target species was evaluated using User’s Accuracy (UA), Producer’s Accuracy (PA), and F1-score (Figure 11). Per-species performance varied markedly and was closely tied to the number of training samples per class and to class distinctiveness, with additional difficulties arising from spectral and morphological similarities among certain species.

Sophora japonica (n = 688) achieved the highest F1-score (0.939), supported by high user’s accuracy (UA = 0.917) and exceptional producer’s accuracy (PA = 0.962), indicating robust discriminative learning and effective generalization. Shrubs (n = 156) also attained strong and balanced metrics (F1 = 0.917), suggesting that their distinctive structural or spectral signatures were well captured despite moderate sample size.

For Quercus variabilis (n = 116), the model delivered high recall (PA = 0.941) but moderate precision (UA = 0.842), implying that while most Quercus variabilis individuals were correctly identified, confusion occurred with other broadleaved species.

Greater difficulties emerged for Populus tomentosa (n = 108) and Ligustrum quihoui (n = 27). Populus tomentosa exhibited high UA (0.889) yet critically low PA (0.500), indicating that half of the Populus tomentosa individuals were misclassified, likely due to phenotypic similarity to coexisting tall species or insufficient coverage of intra-class variability. Ligustrum quihoui, with the fewest samples, suffered from severe underfitting across all metrics (UA = PA = F1 = 0.500), reflecting a failure to learn class-specific features under extreme data scarcity. The normalized confusion matrix (Figure 12) further illustrates the specific misclassification patterns among species.
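The three per-class metrics are linked: UA corresponds to precision, PA to recall, and the F1-score is their harmonic mean. The sketch below checks this relation against the UA and PA values reported above and shows how the metrics follow from a confusion matrix; the 2 × 2 matrix is invented for illustration.

```python
def f1_from_ua_pa(ua, pa):
    """Harmonic mean of user's accuracy (precision) and producer's accuracy (recall)."""
    return 2 * ua * pa / (ua + pa)


def per_class_metrics(confusion):
    """confusion[i][j] = number of reference-class-i samples predicted as class j."""
    n = len(confusion)
    metrics = []
    for c in range(n):
        tp = confusion[c][c]
        reference_total = sum(confusion[c])                       # row sum
        predicted_total = sum(confusion[r][c] for r in range(n))  # column sum
        pa = tp / reference_total if reference_total else 0.0
        ua = tp / predicted_total if predicted_total else 0.0
        f1 = f1_from_ua_pa(ua, pa) if ua + pa else 0.0
        metrics.append((ua, pa, f1))
    return metrics


# Reported UA/PA pairs from the text
print(round(f1_from_ua_pa(0.917, 0.962), 3))  # Sophora japonica -> 0.939
print(round(f1_from_ua_pa(0.889, 0.500), 3))  # Populus tomentosa -> 0.64

# Invented two-class confusion matrix
example = per_class_metrics([[8, 2], [1, 9]])
```

The first value reproduces the F1-score reported for Sophora japonica, and the second shows how a PA of 0.500 pulls the F1-score of Populus tomentosa down to about 0.64 despite its high UA.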

These results demonstrate that classification performance is strongly conditioned by training set composition. Abundant or distinct classes achieve reliable discrimination, whereas minority classes or those with ambiguous spectral signatures are prone to omission or confusion. Addressing class imbalance—via data augmentation, strategic sampling, or incorporation of additional discriminative features—is therefore essential for improving model equity and robustness in future applications.

The variance in sample size among the five categories (Ligustrum quihoui, Populus tomentosa, Quercus variabilis, Sophora japonica, and Shrubs) reflects the ecological reality of the study area. Although this imbalance is associated with lower accuracy for minority classes (e.g., UA = PA = 0.500 for Ligustrum quihoui), the model maintained high overall robustness. Future improvements will involve exploring cost-sensitive learning or refined class-weighting strategies to further optimize the detection of underrepresented species without distorting their natural frequency in the landscape.
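One widely used class-weighting heuristic assigns each class the weight n_samples / (n_classes × n_class_samples), so that rarer classes contribute more to the loss. The sketch applies it to the sample counts reported in Section 4.4; whether this matches any weighting used in the study is not stated, so it is shown only as an example of the strategy.

```python
# Sample counts per class as reported in the text
counts = {
    "Sophora japonica": 688,
    "Shrubs": 156,
    "Quercus variabilis": 116,
    "Populus tomentosa": 108,
    "Ligustrum quihoui": 27,
}


def balanced_weights(class_counts):
    """Inverse-frequency weights: total / (number of classes * class count)."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {name: total / (n_classes * n) for name, n in class_counts.items()}


weights = balanced_weights(counts)
for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

Under this heuristic a Ligustrum quihoui sample weighs roughly 25 times as much as a Sophora japonica sample, which illustrates how strongly the minority class would need to be up-weighted.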

4.5. Tree Species Classification Map of the Yushan Mining Area

The ultimate applied output of this study is the wall-to-wall tree species classification map for the Yushan mining area, presented in Figure 13. This map was generated by applying the optimized XGBoost model, validated in the preceding sections, to predict the species for every individual tree object detected across the entire study area.

The resulting classification map (Figure 13) visualizes the patterns quantified in Table 7. Shrubs constitute the most abundant class and exhibit a widespread, pervasive distribution across the study area, indicative of early successional stages or areas experiencing limited tree establishment. Sophora japonica, a recognized pioneer species, forms the second-largest component, often appearing in sizable, contiguous patches that likely correspond to areas of historical planting or vigorous natural colonization. Quercus variabilis is present in significant numbers but displays a more fragmented or clustered spatial pattern, potentially associated with specific microhabitats or later successional niches. The populations of Populus tomentosa and Ligustrum quihoui are minimal and highly localized, suggesting they are minor components in the current vegetation assemblage.

This spatially explicit inventory translates the model’s predictive capability into a tangible representation of the restoration landscape’s structure. The prevalence of shrubs and pioneer trees, as captured by the map, provides a critical baseline for assessing the current successional stage and informing future management interventions aimed at steering the ecosystem towards a more mature and diverse forest state.

5. Discussion

5.1. Interpretation of Model Performance and the Superiority of Data Fusion

Comparative analysis of the seven classifiers reveals two key insights that underpin the methodological contribution of this study.

First, XGBoost consistently outperformed all competing models, achieving the highest overall accuracy (0.897), Kappa coefficient (0.811), and macro F1-score (0.891). Its high Kappa value indicates substantial agreement beyond chance, confirming robust classification reliability. Gradient Boosting delivered comparable accuracy but required roughly 2.6 times the training time (XGBoost’s 2.36 s represents a 61.7% reduction; Section 4.1.3), highlighting XGBoost’s superior optimization and computational efficiency.

Second, the strong performance of tree-based ensemble methods—particularly boosting algorithms—can be directly attributed to the characteristics of the fused multi-source dataset. The integration of spectral, textural, and structural features produces a high-dimensional, heterogeneous feature space characterized by non-linear and interactive relationships with target classes. Tree-based models are inherently well suited to such complexity: they capture non-linear decision boundaries and hierarchical interactions via recursive partitioning, while their built-in feature importance metrics facilitate dimensionality navigation and mitigate sensitivity to noise or redundancy.

Conversely, models relying on linear separability or distance-based metrics exhibited clear limitations. Logistic Regression achieved only moderate accuracy (OA = 0.812), suggesting an inability to fully model the non-linear decision surfaces embedded in the fused feature space. K-Nearest Neighbors (KNN), despite near-zero training time, suffered from degraded predictive performance (F1 = 0.787)—a manifestation of the curse of dimensionality, wherein distance metrics lose discriminative power in high-dimensional spaces. Support Vector Machine (SVM), despite its theoretical strength in high-dimensional settings, yielded the lowest overall accuracy. This is likely attributable to increased class overlap and distributional complexity in the fused feature space, which hinder the identification of a globally optimal separating hyperplane.

Collectively, the observed performance hierarchy (boosting ensembles first, followed by Random Forest and the linear and instance-based models, with Decision Tree and SVM last) is consistent with the value of the data fusion paradigm. The fused feature set provides rich, complementary information that sophisticated non-linear models can effectively exploit, whereas simpler models fail to unlock its full potential. Thus, the synergy between comprehensive multi-source data fusion and the powerful pattern recognition capacity of gradient boosting constitutes the foundation of the effective classification framework established in this study.

It should be noted that the visual appearance in Figure 13 is influenced by species spatial distribution patterns and cartographic overlay effects among vegetation of different heights; the exact quantitative comparison should be based on Table 7.

To contextualize our findings within the broader literature, Table 8 provides a qualitative comparison of key methodological aspects across relevant prior studies and the present work.

As summarized in Table 8, direct numerical comparison is hindered by substantial differences in data sources, environments, classifiers, and class definitions. Our study extends prior efforts by combining multi-source fusion with a systematic multi-classifier evaluation under rigorous statistical testing in a heterogeneous post-mining restoration landscape.

5.2. Ecological Insights from Feature Importance and Species Distribution Patterns

Beyond technical performance, the feature importance analysis of the optimized XGBoost model and the resultant wall-to-wall species map provide ecologically meaningful insights into forest recovery dynamics within the Yushan mining area.

5.2.1. Decoding Feature Importance: A Trait-Based Perspective

The predominance of red-edge indices, chlorophyll-sensitive vegetation metrics, and LiDAR-derived canopy height among top predictors reveals a trait-based foundation for species discrimination. High importance of photosynthetic-related spectral features indicates that foliar biochemistry—particularly pigment composition and nitrogen status—varies substantially among species, reflecting divergent strategies in light use, stress tolerance, or resource acquisition. For instance, the distinct spectral signature of Sophora japonica, a nitrogen-fixing pioneer, likely facilitates its reliable identification.

Maximum canopy height delineates a structural–successional gradient, serving as a proxy for competitive dominance, colonization timing, and life history strategy rather than a mere dimensional metric. Taller individuals, typically early colonizers or fast-growing species, dominate the overstory, whereas suppressed or understory cohorts indicate later successional status or microsite constraints. The model’s reliance on this metric underscores vertical stratification as a key axis of forest recovery.

5.2.2. Interpreting Spatial Distribution Patterns

The final classification map reveals non-random spatial aggregation of species, interpretable through the lens of restoration history and environmental heterogeneity. Contiguous patches of pioneer species likely reflect historical planting blocks, preserving the spatial legacy of initial rehabilitation interventions. In contrast, more dispersed or topographically constrained distributions of slower-growing, site-sensitive species suggest niche partitioning along moisture, radiation, or edaphic gradients.

Spatial mixing or abrupt boundaries between species patches further inform ecological processes such as competition, succession, and facilitation. Fine-scale interspersion may indicate natural regeneration or secondary succession, while sharp edges often demarcate anthropogenic planting limits or abrupt soil transitions. Thus, the map functions as a spatial hypothesis generator, identifying priority zones for targeted fieldwork on soil properties, microclimate, or planting history.

The XGBoost model effectively learns to discriminate species based on ecologically meaningful traits—biochemical and structural—that are directly linked to adaptive strategies and life-history variation. The resulting species distribution map transcends a mere classification product; it constitutes a spatially explicit snapshot of ongoing ecological processes—succession, competition, and niche differentiation—shaping the recovery trajectory of this anthropogenically disturbed ecosystem. This trait-mediated interpretation bridges remote sensing observations with foundational ecological theory, thereby enhancing the scientific and applied value of the methodological framework for restoration monitoring.

5.3. Implications for Restoration Monitoring and Management in the Yellow River Basin

The proposed framework, integrating UAV-based multi-source remote sensing with machine learning, offers transformative potential for evidence-based restoration in mining-disturbed regions of the ecologically fragile Yellow River Basin. Beyond proof-of-concept, it provides actionable solutions to key challenges in large-scale, long-term restoration monitoring and adaptive management.

5.3.1. Transition to Spatially Explicit, Quantitative Baselines

Conventional monitoring, based on plot surveys or moderate-resolution satellite imagery, fails to capture fine-scale heterogeneity in ecosystem recovery. This study provides high-resolution, wall-to-wall quantitative baselines that deliver precise metrics—such as pioneer species density, late-successional canopy cover, and woody–shrub configuration—across entire sites. These spatially explicit inventories establish objective reference states, enabling systematic quantitative evaluation of restoration interventions and successional trajectories.

5.3.2. Precision Restoration Informed by Spatial Ecology

The generated species distribution map encodes critical species–environment relationships. Coupling species patterns with LiDAR-derived topographic attributes (slope, aspect, topographic wetness index) allows identification of key species’ micro-habitat preferences. For instance, associating Populus tomentosa with moist depressions and Sophora japonica with dry south-facing slopes directly informs spatially targeted revegetation. Matching species to empirically delineated suitable habitats optimizes seed sourcing and planting, enhancing survival rates, reducing costs, and accelerating self-sustaining ecosystem development. Moreover, mapping restoration-stalled areas (e.g., persistent shrublands) enables prioritization of secondary interventions.

5.3.3. A Cost-Effective Adaptive Management Framework

Rapid UAV surveys, coupled with a machine learning pipeline, establish a repeatable, scalable, and cost-effective monitoring protocol. Biennial or seasonal resurveys using consistent methods enable longitudinal assessment of forest dynamics, informing adaptive management through data-driven adjustments. Repeated species maps can track native species expansion, while spectral stress indices (e.g., the Photochemical Reflectance Index, PRI) can flag health decline, offering early warning signals for timely intervention. This approach reconfigures restoration management from static, project-based efforts into a dynamic, feedback-driven process, which is critical for achieving long-term sustainability in the complex and variable Yellow River Basin.
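As an illustration of how a spectral stress index could be tracked between surveys, the PRI is commonly defined as (R531 − R570) / (R531 + R570), where R denotes reflectance at the given wavelength in nanometres. The reflectance values and survey labels below are hypothetical, and a decline in PRI is treated as a prompt for closer inspection rather than a diagnosis.

```python
def pri(r531, r570):
    """Photochemical Reflectance Index from reflectance at 531 nm and 570 nm."""
    return (r531 - r570) / (r531 + r570)


# Hypothetical crown-mean reflectances for one tree in two surveys
survey_1 = pri(r531=0.050, r570=0.052)
survey_2 = pri(r531=0.050, r570=0.060)

change = survey_2 - survey_1
print(f"PRI survey 1: {survey_1:.4f}, survey 2: {survey_2:.4f}, change: {change:+.4f}")
```

Comparing the index per detected tree across dates requires that treetops be matched between surveys, which in turn depends on consistent co-registration.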

5.4. Limitations and Future Research

While this study presents an effective framework for tree species classification in a coal mining context, several limitations should be acknowledged to contextualize the findings and guide subsequent work.

5.4.1. Limitations of the Current Study

(1) Sample size and class imbalance: Despite a total of 1095 samples, severe class imbalance persisted. Although partially mitigated via class weighting, this likely biased performance toward dominant species and limited reliable detection of rare taxa. Thus, the demonstrated high accuracy primarily applies to species with adequate training samples; for severely underrepresented classes, the framework remains exploratory and requires further validation.

(2) Spatial and temporal specificity: Data were collected from a single mining area during a single phenological stage. Model transferability to other regions, forest types, or seasons remains unvalidated; captured spectral–structural signatures are temporally constrained and may not reflect full annual cycles.

(3) Minimal use of LiDAR structural information: This study extracted only tree height (TH) from the LiDAR data. Richer three-dimensional metrics—such as crown volume, canopy porosity, vertical complexity, and point density profiles—were not used. Consequently, the term “LiDAR fusion” in this paper refers to the integration of a single structural summary rather than a comprehensive set of LiDAR-derived features. The full potential of LiDAR point clouds for species discrimination in complex mining environments remains untapped here and awaits future investigation.

(4) High-dimensional feature space relative to sample size: Another limitation concerns the high dimensionality of the fused feature set (278 features) relative to the sample size (1095 samples). Although tree-based ensemble methods such as XGBoost incorporate regularization (e.g., L1/L2 penalties, column subsampling) to mitigate overfitting, the risk remains non-negligible, especially for minority classes. In this study, we did not perform explicit feature selection before model training. Future work should systematically apply dimensionality reduction techniques—such as recursive feature elimination (RFE), principal component analysis (PCA), or Boruta—to identify the most discriminative subset of features. Such approaches could further enhance model generalizability, reduce computational overhead, and improve interpretability, particularly when transferring the framework to other mining restoration sites with different species compositions.

(5) Potential multi-source registration errors: Although we carefully co-registered the LiDAR, hyperspectral, and RGB data to a common spatial resolution (0.14 m) and coordinate system, residual misalignment—especially in areas with steep terrain or complex crown boundaries—cannot be completely ruled out. Such registration errors may introduce noise in feature extraction (e.g., mismatches between spectral signatures and canopy height) and could disproportionately affect smaller crowns or species with irregular geometries. Future work should quantify registration uncertainty (e.g., using ground control points or mutual information metrics) and develop co-registration refinement strategies tailored to heterogeneous mining landscapes. Additionally, using a multi-scale feature extraction window or probabilistic feature assignment could help mitigate the impact of residual misalignments.
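One simple way to quantify the registration uncertainty raised in limitation (5) is a planimetric root-mean-square error over matched check points, expressed in pixels of the common 0.14 m grid. The sketch below is illustrative only: the function name and the coordinates are invented, and it assumes the same ground features have been identified in both data layers.

```python
import math

def registration_rmse(reference_points, target_points):
    """Planimetric RMSE (in map units) between matched check points."""
    if len(reference_points) != len(target_points) or not reference_points:
        raise ValueError("need two non-empty, equally sized point lists")
    squared = [
        (xt - xr) ** 2 + (yt - yr) ** 2
        for (xr, yr), (xt, yt) in zip(reference_points, target_points)
    ]
    return math.sqrt(sum(squared) / len(squared))

PIXEL_SIZE_M = 0.14  # common grid resolution reported in the text

# Invented check-point coordinates in metres, for illustration only.
lidar_xy = [(100.00, 200.00), (150.00, 260.00), (180.00, 310.00)]
hyperspectral_xy = [(100.03, 200.04), (150.03, 260.04), (180.03, 310.04)]

rmse_m = registration_rmse(lidar_xy, hyperspectral_xy)
rmse_px = rmse_m / PIXEL_SIZE_M  # misalignment expressed in pixels
```

Reporting the error in pixels makes it easier to judge whether small crowns, which span only a few pixels, are likely to receive spectra from neighbouring objects.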

5.4.2. Future Research Directions

(1) Expanding data diversity and volume: Future efforts should prioritize larger, balanced, and multi-temporal datasets. Incorporating seasonal acquisitions would enable phenology-aware classification and improve discrimination of deciduous vs. evergreen species. Extending the framework to multiple restoration sites would facilitate systematic assessment of model transferability.

(2) Advanced LiDAR feature engineering: Full exploitation of 3D point cloud data—including crown volume, vertical foliage profiles, and point density metrics—promises to enrich structural feature sets and narrow performance gaps between spectrally similar species.

(3) Incorporation of environmental context: Integrating auxiliary spatial data—topographic derivatives, soil maps, or planting records—would enable models to learn species–environment relationships, improving predictive accuracy in sparsely vegetated or topographically complex areas and enhancing ecological realism.

(4) Robust feature importance evaluation: Given the high dimensionality and collinearity among spectral features, future work should apply more robust importance measures—such as permutation importance, SHAP values, or grouped importance by feature family—to validate ecological interpretations derived from mean decrease in impurity. The correlation filtering analysis in Section 4.3 demonstrates that collinearity can substantially inflate the importance of individual narrow bands; therefore, caution is warranted when attributing ecological meaning to specific wavelengths without proper decorrelation.
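The grouped importance mentioned in direction (4) can be sketched as a permutation test applied to whole feature families rather than single bands. The version below is a minimal, library-free illustration and not the procedure used in this study. The `predict` callable, the group names, and the toy data are assumptions. Columns within a group are shuffled together, so collinear bands in the same family cannot substitute for one another.

```python
import random

def accuracy(predict, X, y):
    return sum(1 for row, label in zip(X, y) if predict(row) == label) / len(y)

def grouped_permutation_importance(predict, X, y, groups, n_repeats=10, seed=0):
    """Mean accuracy drop when all columns of a feature family are shuffled jointly."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importance = {}
    for name, cols in groups.items():
        drops = []
        for _ in range(n_repeats):
            order = list(range(len(X)))
            rng.shuffle(order)
            X_perm = [list(row) for row in X]
            for i, src in enumerate(order):
                for c in cols:  # same row permutation for every column in the group
                    X_perm[i][c] = X[src][c]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importance[name] = sum(drops) / n_repeats
    return importance

# Toy stand-in for a fitted classifier: it uses only column 0 (a height-like feature).
def toy_predict(row):
    return "tall" if row[0] > 5 else "short"

X = [[2, 0.1], [3, 0.9], [8, 0.2], [9, 0.8]]
y = ["short", "short", "tall", "tall"]
groups = {"structure": [0], "texture": [1]}
importance = grouped_permutation_importance(toy_predict, X, y, groups)
```

In the toy case the unused "texture" group receives an importance of exactly zero. This is the behaviour that makes the measure less sensitive to the collinearity effects discussed above than impurity-based scores.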

6. Conclusions

This study developed and evaluated an integrated framework for the accurate classification of individual tree species within a complex coal mining landscape of the Yellow River Basin. By integrating UAV-based hyperspectral data (full bands and indices), RGB textural features, and a minimal LiDAR-derived structural metric (tree height), and by conducting a rigorous, statistically grounded evaluation of seven machine learning classifiers, the research provides clear insights for both methodological advancement and practical ecological management.

Our principal findings are threefold. First, the systematic comparative analysis, reinforced by statistical significance testing, identified XGBoost as the optimal classifier. It achieved superior and stable test-set performance (Overall Accuracy = 0.897, Kappa = 0.811), demonstrating a strong capacity to model the complex, non-linear interactions within the high-dimensional, multi-source feature space. Second, the analysis demonstrated the importance of multi-sensor data fusion. The complementary information from LiDAR (tree height), hyperspectral imagery (biochemical properties), and RGB data (crown texture) was indispensable for high classification accuracy. Feature importance analysis confirmed that spectral and structural features were the primary drivers of model performance. Third, the operational workflow, from effective treetop detection using an adapted WST-Ncut method in a heterogeneous landscape to the application of the optimal model, successfully generated a high-resolution, wall-to-wall tree species distribution map for the entire Yushan mining area.

The primary methodological contribution of this work is the establishment of a reproducible, robust, and statistically evaluated processing chain for individual tree species classification, encompassing multi-sensor data fusion, comprehensive model evaluation, and operational large-area mapping. Beyond methodology, this study delivers significant practical value. The resulting detailed species map and the underlying framework equip ecological restoration managers in the Yellow River Basin with a potentially powerful tool for establishing quantitative baselines, enabling precision restoration planning, and facilitating long-term monitoring through adaptive management. This work demonstrates that the integration of multi-source UAV remote sensing with advanced machine learning offers a precise, scalable, and efficient solution for monitoring ecosystem recovery, thereby contributing directly to the sustainable management of ecologically vulnerable regions. We explicitly acknowledge that LiDAR data were severely underutilized in this study. Future work must incorporate richer three-dimensional structural metrics to more fully exploit the complementary information offered by LiDAR point clouds. Given the study’s limitations (single site, single season, class imbalance, and minimal LiDAR feature exploitation), the framework should be viewed as a demonstration of potential rather than a fully operational solution. Future work across multiple sites and seasons is necessary to assess generalizability.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs18091361/s1, Figure S1: Correlation filtering analysis—(a) XGBoost performance (Overall Accuracy and Kappa) before (278 features) and after (40 features) removing highly correlated feature pairs (|r| > 0.9). (b) Distribution of pairwise Pearson correlation coefficients among all 278 features; the red dashed line indicates the threshold of 0.9; Figure S2: Comparison of blue-edge band importance before and after correlation filtering. Blue bars represent importance in the original full feature set (278 features); orange bars represent importance in the reduced set (40 features). Only blue-edge bands (470–500 nm) that appeared in the top five of the original set are shown. None of these bands were retained in the reduced set, indicating that their high importance largely stemmed from collinearity.
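A correlation filter of the kind summarized in Figure S1 can be sketched as follows. This is a greedy, order-dependent variant written for illustration; the exact procedure that reduced 278 features to 40 is not specified here, so the implementation, the feature names, and the toy values are assumptions. Only the |r| > 0.9 threshold is taken from the text.

```python
import math

def pearson_r(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # constant column: treated as uncorrelated in this sketch
    return cov / math.sqrt(var_a * var_b)

def correlation_filter(columns, threshold=0.9):
    """columns: dict name -> list of values. Keeps a feature only if its
    absolute correlation with every already-kept feature is <= threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson_r(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Invented toy features: two nearly collinear neighbouring bands and one height column.
features = {
    "band_470nm": [1.0, 2.0, 3.0, 4.0, 5.0],
    "band_473nm": [2.0, 4.0, 6.0, 8.0, 10.1],
    "tree_height": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = correlation_filter(features)
```

Because the result depends on the order in which features are visited, a practical implementation would normally fix that order explicitly, for example by prior importance or by wavelength.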

Author Contributions

Proposing the innovative points and conceiving the study, theoretical analysis, data analysis and validation, and writing the original manuscript, G.W.; theoretical analysis, S.N.; data curation, X.X. and C.W.; validation, C.W. and H.W.; resources, H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant U22A20566, in part by the Henan Province Key Research and Development Program under Grant 262102321113, and in part by key scientific research projects of colleges and universities in Henan Province under Grant 25A420004.

Data Availability Statement

Data cannot be publicly shared due to ongoing mining operations but are available from the corresponding author upon reasonable request.

Acknowledgments

The authors express their sincere gratitude to the editors and anonymous reviewers for their insightful comments and constructive suggestions. The authors also gratefully acknowledge the UAV data acquisition and field survey team for their support in collecting the hyperspectral imagery and LiDAR point cloud data over the Yushan mining area in Luoyang, within the Yellow River Basin, and for their assistance in the reference tree species investigation and validation data preparation.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Wang, G. Estimation of ecological products in the Yellow River Basin. Carbonsphere 2025, 1, 9510001.
2. Jian, S.; Zhang, Q.; Wang, H. Spatial–Temporal Trends in and Attribution Analysis of Vegetation Change in the Yellow River Basin, China. Remote Sens. 2022, 14, 4607.
3. Li, W.; Xie, H.; Sun, W.; Han, Y.; Jiang, X.; Huang, G.; Tao, P. Mine Water Production, Treatment, and Utilization in the Yellow River Basin: Spatial Patterns and Sustainable Transformation Pathways. Appl. Sci. 2025, 15, 12353.
4. Li, X.; Du, S.; Hu, S.; Dong, D.; Jiang, D.; Cao, C.; Lin, G.; Zhang, J. Simulation of surface water–groundwater interaction in coal mining subsidence areas: A case study of the Kuye River Basin in China. J. Hydrol. 2025, 659, 133243.
5. Zhou, J.; Li, P.; Zhang, H. Research status and prospect of ecological restoration technology for coal mines in the Yellow River basin. China Min. Mag. 2021, 30, 8–14.
6. Lin, M.; Hou, L.; Qi, Z.; Wan, L. Impacts of climate change and human activities on vegetation NDVI in China’s Mu Us Sandy Land during 2000–2019. Ecol. Indic. 2022, 142, 109164.
7. Chai, H.; Guan, P.; Hu, J.; Geng, S.; Ding, Y.; Xu, H.; Zhao, Y.; Xu, M. Temporal and Spatial Variations in the Normalized Difference Vegetation Index in Shanxi Section of the Yellow River Basin and Coal Mines and Their Response to Climatic Factors. Appl. Sci. 2023, 13, 12596.
8. Jonsson, M.; Bengtsson, J.; Gamfeldt, L.; Moen, J.; Snall, T. Levels of forest ecosystem services depend on specific mixtures of commercial tree species. Nat. Plants 2019, 5, 141–147.
9. Fassnacht, F.E.; Latifi, H.; Stereńczak, K.; Modzelewska, A.; Lefsky, M.; Waser, L.T.; Straub, C.; Ghosh, A. Review of studies on tree species classification from remotely sensed data. Remote Sens. Environ. 2016, 186, 64–87.
10. Wu, X.; Cheng, X.; Ju, Q.; Su, S.; Zhang, X.; Zhu, T. Research progress in mining ecological restoration technology. J. Ind. Saf. Environ. Prot. 2024, 1, 100004.
11. Harries, K.L.; Woinarski, J.; Rumpff, L.; Gardener, M.; Erskine, P.D. Characteristics and Gaps in the Assessment of Progress in Mine Restoration: Insights from Five Decades of Published Literature Relating to Native Ecosystem Restoration after Mining. Restor. Ecol. 2023, 32, e14016.
12. Aili, A.; Zhang, Y.G.; Lin, T.; Xu, H.L.; Wahhed, A.; Zhao, W.Y.; Kuerban, A.; Liu, K.; Dou, H.T. Optimizing vegetation restoration: A comprehensive index system for reclaiming abandoned mining areas in arid regions of China. Biology 2025, 14, 23.
13. Dehkordi, M.M.; Nodeh, Z.P.; Dehkordi, K.S.; Salmanvandi, H.; Khorjestan, R.R.; Ghaffarzadeh, M. Soil, air, and water pollution from mining and industrial activities: Sources of pollution, environmental impacts, and prevention and control methods. Results Eng. 2024, 23, 102729.
14. Perikleous, D.; Margariti, K.; Velanas, P.; Blazquez, C.S.; Gonzalez-Aguilera, D. Aerial Drones for Geophysical Prospection in Mining: A Review. Drones 2025, 9, 383.
15. McKenna, P.B.; Lechner, A.M.; Phinn, S.; Erskine, P.D. Remote sensing of mine site rehabilitation for ecological outcomes: A global systematic review. Remote Sens. 2020, 12, 3535.
16. Xie, J.Y.; Liu, Y.X.; Xie, M.M.; Xia, L.; Yang, R.J.; Li, J.A. Exploring the Restoration Stability of Abandoned Open-pit Mines by Vegetation Resilience Indicator Based on the LandTrendr Algorithm. Ecol. Indic. 2024, 166, 112392.
17. Song, D.Y.; Hu, Z.Q.; Zeng, J.Y.; Sun, H. Influence of Mining on Vegetation in Semiarid Areas of Western China Based on the Coupling of above Ground and below Ground—A Case Study of Daliuta Coalfield. Ecol. Indic. 2024, 161, 111964.
18. Lian, Z.K.; Hao, H.C.; Zhao, J.; Cao, K.Z.; Wang, H.S.; He, Z.C. Evaluation of Remote Sensing Ecological Index Based on Soil and Water Conservation on the Effectiveness of Management of Abandoned Mine Landscaping Transformation. Int. J. Environ. Res. Public Health 2022, 19, 9750.
19. He, S.; Deng, Y.; Wang, J.; Luo, M. Fine identification of vegetation types in open pit mining regions using combined UAV RGB imagery and LiDAR point cloud data. Int. J. Digit. Earth 2025, 18, 2515269.
20. Zhang, Z.; Zhu, L. A Review on Unmanned Aerial Vehicle Remote Sensing: Platforms, Sensors, Data Processing Methods, and Applications. Drones 2023, 7, 398.
21. Lockhart, K.; Sandino, J.; Amarasingam, N.; Hann, R.; Bollard, B.; Gonzalez, F. Unmanned Aerial Vehicles for Real-Time Vegetation Monitoring in Antarctica: A Review. Remote Sens. 2025, 17, 304.
22. Wang, L.; Lu, D.; Xu, L.; Robinson, D.T.; Tan, W.; Xie, Q.; Guan, H.; Chapman, M.A.; Li, J. Individual tree species classification using low-density airborne multispectral LiDAR data via attribute-aware cross-branch transformer. Remote Sens. Environ. 2024, 315, 114456.
23. Cao, L.; Coops, N.C.; Innes, J.L.; Dai, J.; Ruan, H.; She, G. Tree species classification in subtropical forests using small-footprint full-waveform LiDAR data. Int. J. Appl. Earth Obs. Geoinf. 2016, 64, 59–70.
24. Dong, T.; Zhang, X.; Ding, Z.; Fan, J. Multi-layered tree crown extraction from LiDAR data using graph-based segmentation. Comput. Electron. Agric. 2020, 170, 105071.
25. Rust, S.; Stoinski, B. Enhancing Tree Species Identification in Forestry and Urban Forests through Light Detection and Ranging Point Cloud Structural Features and Machine Learning. Forests 2024, 15, 188.
26. Marconi, S.; Weinstein, B.G.; Zou, S.; Bohlman, S.A.; Zare, A.; Singh, A.; Stewart, D.; Harmon, I.; Steinkraus, A.; White, E.P. Continental-scale hyperspectral tree species classification in the United States National Ecological Observatory Network. Remote Sens. Environ. 2022, 282, 113223.
27. Juola, J.; Hovi, A.; Rautiainen, M. Classification of tree species based on hyperspectral reflectance images of stem bark. Eur. J. Remote Sens. 2023, 56, 2161420.
28. Hou, C.; Liu, Z.; Chen, Y.; Wang, S.; Liu, A. Tree Species Classification from Airborne Hyperspectral Images Using Spatial–Spectral Network. Remote Sens. 2023, 15, 5679.
29. Chen, X.; Shen, X.; Cao, L. Tree Species Classification in Subtropical Natural Forests Using High-Resolution UAV RGB and SuperView-1 Multispectral Imageries Based on Deep Learning Network Approaches: A Case Study within the Baima Snow Mountain National Nature Reserve, China. Remote Sens. 2023, 15, 2697.
30. Vélez, S.; Vacas, R.; Martín, H.; Ruano-Rosa, D.; Álvarez, S. High-Resolution UAV RGB Imagery Dataset for Precision Agriculture and 3D Photogrammetric Reconstruction Captured over a Pistachio Orchard (Pistacia vera L.) in Spain. Data 2022, 7, 157.
31. Li, Q.; Hu, B.; Shang, J.; Li, H. Fusion Approaches to Individual Tree Species Classification Using Multisource Remote Sensing Data. Forests 2023, 14, 1392.
32. Wang, M.; Zhang, Q.; Liu, X.; Zhang, J.; Yu, F.; Zhang, X.; Zhao, R. Research progress on multimodal data fusion in forest resource monitoring. Front. Plant Sci. 2026, 16, 1710618.
33. Hu, B.; Li, Q.; Hall, G.B. A decision-level fusion approach to tree species classification from multi-source remotely sensed data. Int. J. Appl. Earth Obs. Geoinf. 2021, 103, 102471.
34. Qin, H.; Zhou, W.; Yao, Y.; Wang, W. Individual tree segmentation and tree species classification in subtropical broadleaf forests using UAV-based LiDAR, hyperspectral, and ultrahigh-resolution RGB data. Remote Sens. Environ. 2022, 280, 114229.
35. Wang, A.; Shi, S.; Yang, J.; Luo, Y.; Tang, X.; Du, J.; Bi, S.; Qu, F.; Gong, C.; Gong, W. Integration of LiDAR and Hyperspectral Imagery for Tree Species Identification at the Individual Tree Level. Photogramm. Rec. 2025, 40, e70007.
36. Gong, Y.; Zhu, D.; Li, X.; Lv, L.; Zhang, B.; Xuan, J. Using UAV LiDAR Intensity Frequency and Hyperspectral Features to Improve the Accuracy of Urban Tree Species Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 2849–2865.
37. Luo, M.; Tian, Y.; Zhang, S.; Huang, L.; Wang, H.; Liu, Z.; Yang, L. Individual Tree Detection in Coal Mine Afforestation Area Based on Improved Faster RCNN in UAV RGB Images. Remote Sens. 2022, 14, 5545.
38. Deng, Y.; He, S.; Wang, J.; Zhang, J.; Yi, B.; Hu, C.; Jiang, Y.; Chen, A. Joint cascaded 3DCNN and SDTA encoding for tree species classification using UAV-based hyperspectral image in mining areas. Comput. Electron. Agric. 2026, 244, 111438.
39. Zhong, H.; Zhang, Z.; Liu, H.; Wu, J.; Lin, W. Individual Tree Species Identification for Complex Coniferous and Broad-Leaved Mixed Forests Based on Deep Learning Combined with UAV LiDAR Data and RGB Images. Forests 2024, 15, 293.
40. Gominski, D.; Ortiz-Gonzalo, D.; Brandt, M.; Mugabowindekwe, M.; Fensholt, R. Mining Field Data for Tree Species Recognition at Scale. arXiv 2024, arXiv:2408.15816.
41. Zhang, K.; Ning, L.; Ning, K.; Jin, Z.; Wang, X.; Zhu, H. Structural characteristics and spatial heterogeneity of vegetation and below-ground habitat during the long-term succession of ecosystems in mining areas. Ecol. Eng. 2025, 214, 113456.
42. Xu, Y.; Yang, G.; Zhang, Y.; Guo, J.; Zhang, C. Mapping and interpreting spatio-temporal trends in vegetation restoration following mining disturbances in large-scale surface coal mining areas. Front. Environ. Sci. 2025, 13, 1531424.
43. Chen, Z.; Liu, X.; Feng, H.; Wang, H.; Hao, C. The Spatiotemporal Evolution of Vegetation in the Henan Section of the Yellow River Basin and Mining Areas Based on the Normalized Difference Vegetation Index. Remote Sens. 2024, 16, 4419.
44. Wang, G.; Wang, C.; Wang, H.T.; Chen, C.; Yang, F.Q. Vegetation extraction of mining areas in the Yellow River Basin based on UAV dense matching point cloud. China Min. Mag. 2023, 32, 65–71.
45. Liu, J.; Zhang, Q.; Zhang, Z. Analysis of spatial and temporal evolution of vegetation cover in mining areas based on the three-endmember linear spectral mixture model. Geocarto Int. 2025, 40, 1.
46. Zhang, W.; Qi, J.; Wan, P.; Wang, H.; Xie, D.; Wang, X.; Yan, G. An Easy-to-Use Airborne LiDAR Data Filtering Method Based on Cloth Simulation. Remote Sens. 2016, 8, 501.
47. Miyazaki, Y.; Kawakami, M.; Kondo, K.; Hirabe, A.; Kamimoto, T.; Akimoto, T.; Hijikata, N.; Tsujikawa, M.; Honaga, K.; Suzuki, K.; et al. Logistic regression analysis and machine learning for predicting post-stroke gait independence: A retrospective study. Sci. Rep. 2024, 14, 21273.
48. Uddin, S.; Haque, I.; Lu, H.; Moni, M.A.; Gide, E. Comparative performance analysis of K-nearest neighbour (KNN) algorithm and its different variants for disease prediction. Sci. Rep. 2022, 12, 6256.
49. Hearst, M.A.; Dumais, S.T.; Osuna, E.; Platt, J.; Scholkopf, B. Support vector machines. IEEE Intell. Syst. 1998, 13, 18–28.
50. Murthy, S.K. Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey. Data Min. Knowl. Discov. 1998, 2, 345–389.
51. Salman, H.A.; Kalakech, A.; Steiti, A. Random Forest Algorithm Overview. Babylon. J. Mach. Learn. 2024, 2004, 69–79.
52. Bentéjac, C.; Csörgő, A.; Martínez-Muñoz, G. A comparative analysis of gradient boosting algorithms. Artif. Intell. Rev. 2021, 54, 1937–1967.
53. Gharagoz, M.M.; Noureldin, M.; Kim, J. Explainable machine learning (XML) framework for seismic assessment of structures using Extreme Gradient Boosting (XGBoost). Eng. Struct. 2025, 327, 119621.
54. Amin, G.; Imtiaz, I.; Haroon, E.; Saqib, N.U.; Shahzad, M.I.; Nazeer, M. Assessment of Machine Learning Algorithms for Land Cover Classification in a Complex Mountainous Landscape. J. Geovis. Spat. Anal. 2024, 8, 34.
55. Niedoba, T.; Surowiak, A.; Hassanzadeh, A.; Khoshdast, H. Evaluation of the Effects of Coal Jigging by Means of Kruskal–Wallis and Friedman Tests. Energies 2023, 16, 1600.
56. Agbangba, C.E.; Sacla Aide, E.; Honfo, H.; Glèlè Kakai, R. On the use of post-hoc tests in environmental and biological sciences: A critical review. Heliyon 2024, 10, e25131.
57. Diószegi, G.; Molnár, V.É.; Nagy, L.A.; Enyedi, P.; Török, P.; Szabó, S. A new method for individual treetop detection with low-resolution aerial laser scanned data. Model. Earth Syst. Environ. 2024, 10, 5225–5240.
58. Theng, D.; Bhoyar, K.K. Feature selection techniques for machine learning: A survey of more than two decades of research. Knowl. Inf. Syst. 2024, 66, 1575–1637.

Figure 1. Location of Yushan Coal Mine: (a) the Yellow River Basin, (b) Henan Province, (c) topography of Henan Section of Yellow River Basin, (d) Yushan Coal Mine site and (e) Study area of Yushan Coal Mine.

Figure 2. Field data collection activities at the Yushan mining area.

Figure 3. Friedman test result.

Figure 4. Model ranks with critical difference. Lines connect classifiers whose average ranks are not significantly different (pairwise rank difference < critical difference, Nemenyi test, α = 0.05).

Figure 5. Cross-validation accuracy distribution of seven classifiers.

Figure 6. Scatter plot of training time vs. overall accuracy.

Figure 7. Comparison of cross-validation and test set accuracy.

Figure 8. Statistical performance tiers result.

Figure 9. The top 20 most important features of the optimal model. Note: TH = Tree Height; PSSR = Pigment Specific Simple Ratio; VOG = Vogelmann Red Edge Index. Different colors indicate feature categories: blue for spectral features, and green for structural features.

Figure 10. The cumulative feature importance distribution.

Figure 11. Species-specific classification performance of the optimal model.

Figure 12. Normalized confusion matrix of the optimal model.

Figure 13. Tree species classification map of the Yushan mining area. Note: The visual dominance of Sophora japonica in this map is primarily due to its spatially aggregated (contiguous patch) distribution. Shrubs, although having a higher total count (20,102 vs. 13,800 in Table 7), are more discretely scattered and are often overlaid by tree crown symbols because they are generally shorter than trees. This overlay effect further reduces their visual prominence. Readers are advised to refer to Table 7 for exact quantitative comparisons.

Table 1. Parameters of GaiaSky-mini3-VN.

Parameter	Value
Spectral range	400–1000 nm
Spectral resolution	5.5 nm
Spectral sampling rate	2.7 nm (at 224 channels)
Spectral channel number	224
Single image resolution	1024 × 1000

Table 2. List of hyperspectral features.

Features	Formula	Features	Formula
Normalized difference vegetation index	$\mathrm{NDVI} = \frac{ρ_{800} - ρ_{670}}{ρ_{800} + ρ_{670}}$	Slope of red edge	$\mathrm{SL} = \frac{ρ_{740} - ρ_{690}}{50}$
Enhanced vegetation index	$\mathrm{EVI} = 2.5 \times \frac{ρ_{798} - ρ_{679}}{1 + ρ_{798} + 6 \times ρ_{679} - 7.5 \times ρ_{482}}$	Mean red edge	$\mathrm{Mean}_{690-740} = \frac{\sum_{i=690}^{740} ρ_{i}}{n}$
Soil-adjusted vegetation index	$\mathrm{SAVI} = 1.5 \times \frac{ρ_{798} - ρ_{679}}{ρ_{798} - ρ_{679} + 0.5}$	Red edge ratio vegetation index	$\mathrm{MRESRI} = \frac{ρ_{750} - ρ_{445}}{ρ_{750} + ρ_{445}}$
Green index	$\mathrm{GI} = \frac{ρ_{798}}{ρ_{553}} - 1$	Plant pigment ratio	$\mathrm{PPR} = \frac{ρ_{550} - ρ_{450}}{ρ_{550} + ρ_{450}}$
Photochemical reflectance index	$\mathrm{PRI} = \frac{ρ_{570} - ρ_{530}}{ρ_{570} + ρ_{530}}$	Water content index	$\mathrm{WBI} = \frac{ρ_{898}}{ρ_{969}}$
Plant senescence reflectance index	$\mathrm{PSRI} = \frac{ρ_{695}}{ρ_{760}}$	Anthocyanin content index	$\mathrm{ACI} = \frac{ρ_{650}}{ρ_{550}}$
Structurally insensitive pigment index	$\mathrm{SIPI} = \frac{ρ_{800} - ρ_{450}}{ρ_{800} + ρ_{680}}$	Ratio vegetation stress index	$\mathrm{SI} = \frac{ρ_{600}}{ρ_{760}}$
Pigment-specific simple ratio	$\mathrm{PSSR} = \frac{ρ_{800}}{ρ_{680}}$	Band value of 550 nm	ρ₅₅₀
Datt chlorophyll content index	$\mathrm{Datt} = \frac{ρ_{850} - ρ_{710}}{ρ_{850} - ρ_{680}}$	Band value of 750 nm	ρ₇₅₀
Anthocyanin reflectance index	$\mathrm{ARI} = ρ_{800} \times \left(\frac{1}{ρ_{500}} - \frac{1}{ρ_{700}}\right)$	1st principal component	PCA1
Chlorophyll index	$\mathrm{CI} = \frac{ρ_{760}}{ρ_{700}} - 1$	2nd principal component	PCA2
Transformed chlorophyll absorption in reflectance index	$\mathrm{TCARI} = 3 \times \left[(ρ_{700} - ρ_{670}) - 0.2 \times (ρ_{700} - ρ_{550}) \times \frac{ρ_{700}}{ρ_{670}}\right]$	3rd principal component	PCA3
Modified NDVI	$\mathrm{MNDVI} = \frac{ρ_{750} - ρ_{705}}{ρ_{750} + ρ_{705} - ρ_{445} \times 2} - 1$	4th principal component	PCA4
Red-edge vegetation stress index	$\mathrm{RVSI} = \frac{ρ_{714} + ρ_{752}}{2} - ρ_{733}$	5th principal component	PCA5
Vogelmann red edge index	$\mathrm{VOG} = \frac{ρ_{740}}{ρ_{720}}$

Here, $ρ_{\mathrm{wavelength}}$ denotes the reflectance at a specific wavelength (e.g., ρ₈₀₀ represents the reflectance at 800 nm), and n is the total number of spectral bands. Note: This table lists only the 29 derived features; the original 224 spectral bands are not included. ρ₅₅₀ and ρ₇₅₀ denote reflectance at specific wavelengths; PCA1–PCA5 are the first five principal components derived from the 224 spectral bands; all other entries are vegetation indices computed from reflectance values.
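As an illustration of how the indices in Table 2 are evaluated on a sampled spectrum, the sketch below computes NDVI, PRI, and VOG exactly as written in the table. It assumes that each nominal wavelength is mapped to the nearest available band centre; this nearest-band lookup, the helper names, and the reflectance values are assumptions made for the example and are not taken from the study.

```python
def reflectance_at(spectrum, target_nm):
    """Reflectance of the band whose centre is closest to target_nm.
    spectrum: dict mapping band-centre wavelength (nm) -> reflectance."""
    nearest = min(spectrum, key=lambda wavelength: abs(wavelength - target_nm))
    return spectrum[nearest]

def ndvi(spectrum):
    r800, r670 = reflectance_at(spectrum, 800), reflectance_at(spectrum, 670)
    return (r800 - r670) / (r800 + r670)

def pri(spectrum):
    # Band order follows the PRI entry in Table 2.
    r570, r530 = reflectance_at(spectrum, 570), reflectance_at(spectrum, 530)
    return (r570 - r530) / (r570 + r530)

def vog(spectrum):
    return reflectance_at(spectrum, 740) / reflectance_at(spectrum, 720)

# Invented crown-mean reflectances at a handful of band centres (nm).
crown_spectrum = {530.0: 0.08, 570.0: 0.10, 670.0: 0.05,
                  720.0: 0.30, 740.0: 0.45, 800.0: 0.50}

indices = {"NDVI": ndvi(crown_spectrum),
           "PRI": pri(crown_spectrum),
           "VOG": vog(crown_spectrum)}
```

With 224 bands sampled roughly every 2.7 nm, the nearest band centre lies within about 1.4 nm of any nominal wavelength, which is well inside the 5.5 nm spectral resolution listed in Table 1.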

Table 3. Textural features for canopy characterization.

Features	Formula
Mean	$\mathrm{MEA} = \sum_{i,j=0}^{N-1} i \, P_{i,j}$
Variance	$\mathrm{VAR} = \sum_{i,j=0}^{N-1} P_{i,j} (i - \mathrm{MEA})^{2}$
Homogeneity	$\mathrm{HOM} = \sum_{i,j=0}^{N-1} \frac{P_{i,j}}{1 + (i - j)^{2}}$
Contrast	$\mathrm{CON} = \sum_{i,j=0}^{N-1} P_{i,j} (i - j)^{2}$
Dissimilarity	$\mathrm{DIS} = \sum_{i,j=0}^{N-1} P_{i,j} |i - j|$
Entropy	$\mathrm{ENT} = \sum_{i,j=0}^{N-1} P_{i,j} (- \ln P_{i,j})$
Second moment	$\mathrm{SM} = \sum_{i,j=0}^{N-1} P_{i,j}^{2}$
Correlation	$\mathrm{COR} = \sum_{i,j=0}^{N-1} P_{i,j} \left[\frac{(i - \mathrm{MEA}_{i}) (j - \mathrm{MEA}_{j})}{\sqrt{\mathrm{VAR}_{i} \, \mathrm{VAR}_{j}}}\right]$
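The quantities in Table 3 are functions of a normalized grey-level co-occurrence matrix P. The following sketch builds P for one pixel offset and evaluates five of the listed measures on two tiny quantized images. Whether the study used a symmetric matrix, which offsets, and how many grey levels is not stated here, so those choices, along with the function names and toy images, are assumptions.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[value / total for value in row] for row in counts]

def texture_features(P):
    n = len(P)
    cells = [(i, j, P[i][j]) for i in range(n) for j in range(n)]
    return {
        "CON": sum(p * (i - j) ** 2 for i, j, p in cells),
        "DIS": sum(p * abs(i - j) for i, j, p in cells),
        "HOM": sum(p / (1 + (i - j) ** 2) for i, j, p in cells),
        "SM": sum(p * p for i, j, p in cells),
        "ENT": sum(-p * math.log(p) for i, j, p in cells if p > 0),
    }

# Two-level toy images: horizontally uniform rows versus vertical stripes.
flat_rows = [[0, 0], [1, 1]]
vertical_stripes = [[0, 1], [0, 1]]

flat_tex = texture_features(glcm(flat_rows, levels=2))
stripe_tex = texture_features(glcm(vertical_stripes, levels=2))
```

For the horizontal offset used here, the uniform rows give zero contrast and maximal homogeneity, while the stripes give the opposite pattern. This shows how such measures respond to crown-scale texture direction.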

Table 4. Hyperparameter configurations for the evaluated machine learning classifiers.

Classifier	Key Hyperparameter Settings
XGBoost	n_estimators = 200, max_depth = 5, learning_rate = 0.1, gamma = 0, min_child_weight = 1, subsample = 0.8, colsample_bytree = 0.8
Gradient Boosting	n_estimators = 200, max_depth = 5, learning_rate = 0.1, min_samples_split = 2, min_samples_leaf = 1, max_features = ‘sqrt’
Random Forest	n_estimators = 200, max_depth = 20, min_samples_split = 5, min_samples_leaf = 2, max_features = ‘sqrt’, bootstrap = True
Support Vector Machine	C = 1.0, kernel = ‘rbf’, gamma = ‘scale’, class_weight = ‘balanced’
Decision Tree	max_depth = 10, min_samples_split = 2, min_samples_leaf = 1, class_weight = ‘balanced’
K-Nearest Neighbors	n_neighbors = 5, weights = ‘distance’, metric = ‘minkowski’, p = 2
Logistic Regression	C = 1.0, penalty = ‘l2’, solver = ‘lbfgs’, class_weight = ‘balanced’, max_iter = 1000
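Several classifiers in Table 4 are configured with class_weight = ‘balanced’. In scikit-learn this setting weights each class by n_samples / (n_classes × class count). The sketch below reproduces that heuristic in plain Python to show how strongly rare classes are up-weighted. The per-class counts are invented for illustration; only their total of 1095 matches the sample size reported in the text.

```python
def balanced_class_weights(class_counts):
    """Weight per class: n_samples / (n_classes * count), the 'balanced' heuristic."""
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in class_counts.items()
    }

# Hypothetical class counts (NOT the study's actual distribution); they sum to 1095.
example_counts = {
    "Shrubs": 500,
    "Sophora japonica": 350,
    "Quercus variabilis": 200,
    "Populus tomentosa": 30,
    "Ligustrum quihoui": 15,
}
weights = balanced_class_weights(example_counts)
```

Under these invented counts the rarest class receives a weight more than thirty times that of the most common one. This illustrates why weighting alone may not compensate for having very few training crowns.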

Table 5. Comparative Analysis of Machine Learning Models.

Rank	Model	OA	Kappa	F1-Score	Training Time (s)
1	XGBoost	0.897	0.811	0.891	2.360
2	Gradient Boosting	0.891	0.796	0.880	6.165
3	Random Forest	0.824	0.701	0.823	0.243
4	Logistic Regression	0.812	0.693	0.825	1.625
5	K-Nearest Neighbors	0.812	0.643	0.787	0.002
6	Decision Tree	0.794	0.671	0.813	0.123
7	Support Vector Machine	0.715	0.575	0.749	0.216
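For readers less familiar with the metrics in Table 5, the sketch below shows how overall accuracy and Cohen's Kappa follow from a confusion matrix. The two-class matrix is invented for the example and is unrelated to the study's results. Rows are taken as reference labels and columns as predictions.

```python
def overall_accuracy_and_kappa(confusion):
    """confusion[i][j]: number of samples of reference class i predicted as class j."""
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Invented 2 x 2 example.
toy_confusion = [[45, 5],
                 [10, 40]]
oa, kappa = overall_accuracy_and_kappa(toy_confusion)
```

Because Kappa discounts agreement expected from the class marginals, it is always at or below overall accuracy and drops more sharply when a dominant class inflates raw agreement. This is relevant to the class imbalance discussed in Section 5.4.1.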

Table 6. Cross-validation performance and ranking of machine learning algorithms.

Model	Average Rank	CV Mean Accuracy	CV Std Accuracy
XGBoost	1.10	0.8877	0.0162
Gradient Boosting	1.90	0.8780	0.0151
Random Forest	3.78	0.8281	0.0267
Logistic Regression	4.02	0.8199	0.0191
K-Nearest Neighbors	4.22	0.8135	0.0170
Decision Tree	6.08	0.7587	0.0306
Support Vector Machine	6.90	0.6944	0.0237
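The average ranks in Table 6 can be related to the Nemenyi critical difference, CD = q_α · sqrt(k(k + 1) / (6N)), where k is the number of classifiers and N the number of paired evaluations. The sketch below evaluates this under two assumptions that are not confirmed in this section: that N = 25, corresponding to the folds of a 5 × 5 repeated cross-validation, and that q_0.05 = 2.949 for k = 7, the commonly tabulated Studentized-range value divided by √2.

```python
import math

Q_005_K7 = 2.949  # tabulated Nemenyi critical value for k = 7, alpha = 0.05 (assumed)

def nemenyi_cd(k, n_evaluations, q_alpha):
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n_evaluations))

# Average ranks copied from Table 6.
avg_rank = {
    "XGBoost": 1.10,
    "Gradient Boosting": 1.90,
    "Random Forest": 3.78,
    "Logistic Regression": 4.02,
    "K-Nearest Neighbors": 4.22,
    "Decision Tree": 6.08,
    "Support Vector Machine": 6.90,
}

cd = nemenyi_cd(k=len(avg_rank), n_evaluations=25, q_alpha=Q_005_K7)
best = min(avg_rank, key=avg_rank.get)
not_separable_from_best = [
    name for name, rank in avg_rank.items()
    if name != best and rank - avg_rank[best] < cd
]
```

Under these assumptions the critical difference is about 1.80 rank units. XGBoost and Gradient Boosting (rank gap 0.80) would therefore not be separable, while the gap to Random Forest (2.68) would exceed the threshold. A different N changes the threshold accordingly.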

Table 7. Statistics of predicted tree species classes for the Yushan mining area.

Species Class	Predicted Number of Individual Trees	Percentage of Total (%)
Shrubs	20,102	48.93
Sophora japonica	13,800	33.59
Quercus variabilis	6554	15.95
Populus tomentosa	401	0.98
Ligustrum quihoui	229	0.56
Total	41,086	100.00

Table 8. Comparison of key methodological aspects among related studies.

Study	Data Sources	Environment	Classifier(s)	Key Challenge Addressed	Major Limitation
[37]	RGB only	Coal mine afforestation area	Faster R-CNN	Individual tree detection	RGB only; no spectral and LiDAR data
[38]	Hyperspectral only	Mining restoration site	3DCNN	Tree species classification	Hyperspectral only; no structural data
[34]	LiDAR + Hyperspectral + RGB	Subtropical natural forest	Random Forest	Multi-source fusion classification	Single classifier; non-mining environment
Our study	LiDAR + Hyperspectral + RGB	Coal mining restoration area with heterogeneous tree–shrub–grass mosaics	Seven algorithms	Effective classification in complex mining environment	Class imbalance; high dimensionality

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
