1. Introduction
Forests are the mainstay of the global terrestrial ecosystem, with functions such as water conservation, soil conservation, and maintaining ecosystem balance [
1,
2]. Tree species’ composition is an essential component of forest certification programs, and information on the spatial distribution of tree species can provide basic information for studies such as biodiversity assessment, invasive alien species monitoring, and wildlife habitat mapping [
3].
Traditional methods of forest surveying involve random sampling and field surveys at each sample site, which are time-consuming and labor-intensive. Remote sensing technology can obtain forest information from rough terrain or hard-to-reach areas and speed up the work progress [
4]. The application of remote sensing images to tree species classification originated from the visual interpretation of color aerial photographs [
5], and then developed to multi-spectral image classification [
6,
7], multi-temporal remote sensing images [
8,
9], hyperspectral image analysis [
10,
11], and fusion of multiple sensor images [
12,
13]. Medium resolution images have been applied to classify tree species on the regional scale. However, species classification accuracy was inadequate for forest inventory at the stand scale. Hyperspectral images with continuous narrow wavelength bands, especially in visible and near-infrared regions, can detect subtle changes from different tree species on the canopy and leaf levels [
14]. Over the past decade, airborne platforms, especially UAVs, have been widely used; they are commonly equipped with multi-spectral or hyperspectral sensors to meet the requirements of flexible spectral and spatial resolution for tree species classification [
15,
16]. LiDAR can detect the 3D structural information of tree species and can obtain rich spectral and structural features, providing a favorable platform and technical basis for fine tree species classification and forest parameter estimation at small and medium scales [
17,
18,
19].
However, these approaches are difficult to extend to plant diversity monitoring and forest resource surveys over large areas, for which satellite-based spectral remote sensing data are a better choice.
Several studies have used satellite-based hyperspectral data for tree species classification. Papeş et al. (2010) verified that Hyperion imaging spectroscopy has the potential for regional mapping of large-crowned tropical trees [
20]. George et al. (2014) demonstrated the potential utility of the narrow spectral bands of Hyperion data in discriminating tree species in a hilly terrain [
21]. Lu et al. (2017) generated fine spatial-spectral-resolution images by blending the environment 1A series satellite (HJ-1A) multispectral images; the spatial and spectral information was utilized simultaneously to distinguish various forest types [
22]. Xi et al. (2019) used OHS-1 hyperspectral images combined with RF models to classify tree species, and the overall classification accuracy was 80.61% [
23]. Vangi et al. (2021) compared the capabilities of the new PRISMA sensor and the well-known Sentinel-2A/2B Multispectral Instrument (MSI) in identifying different forest types. The PRISMA hyperspectral sensor was able to distinguish forest types [
24]. Wan et al. (2020) used Landsat 8 OLI, simulated Hyperion, and GF-5 image data sets with RF and SVM classifiers to classify mangrove species, and the results showed that GF-5 achieved the highest accuracy [
25]. Compared with earlier hyperspectral sensors, GF-5 has an advantage in band number and swath width, with 330 bands and a swath width of 60 km [
26]. However, its spatial resolution is 30 m, so tree species are still frequently misclassified and the classification accuracy does not meet the demands of forest inventory. To compensate for the low spatial resolution of hyperspectral images, many researchers have explored fusing images with high spatial resolution and images with high spectral resolution.
In recent years, many methods of hyperspectral image fusion have emerged, which can be classified into four categories: Component Substitution (CS), Multiresolution Analysis (MRA), Variational Optimization (VO)-based methods, and Deep Learning (DL)-based methods [
27]. Gram-Schmidt (GS) is a CS method that is widely used in the fusion of multi-source remote sensing data, for example in fusing multi-spectral and high-resolution images such as Gaofen-2 (GF-2), ZiYuan-3 (ZY-3), Worldview-2, QuickBird, and IKONOS imagery [
28,
29,
30]. Image quality evaluations showed that its spectral fidelity was better than that of the traditional IHS, Brovey, and PCA fusion methods [
31].
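The component-substitution idea behind GS fusion can be sketched roughly as follows. This is a simplified illustration, not the processing chain used in this paper: it assumes the low-resolution bands have already been co-registered and resampled to the grid of the high-resolution band, that each image is flattened to a list of pixel values, and that the synthetic intensity is a plain band average. The function names are hypothetical, and the gain used here is the covariance-based injection gain commonly associated with the GS formulation.

```python
from statistics import fmean


def covariance(x, y):
    """Population covariance of two equally long pixel lists."""
    mx, my = fmean(x), fmean(y)
    return fmean([(a - mx) * (b - my) for a, b in zip(x, y)])


def gs_like_fusion(bands, pan):
    """Component-substitution fusion with GS-style injection gains.

    bands: list of bands, each a flat list of pixel values already
           resampled to the grid of `pan` (assumption).
    pan:   flat list of pixel values from the high-resolution band.
    """
    n_pixels = len(pan)
    # Synthetic low-resolution intensity: plain average of all bands (assumption).
    intensity = [fmean([band[i] for band in bands]) for i in range(n_pixels)]
    var_i = covariance(intensity, intensity)

    # Shift the high-resolution band to the mean of the intensity so that
    # only spatial detail, not a brightness offset, is injected.
    shift = fmean(intensity) - fmean(pan)
    detail = [p + shift - i for p, i in zip(pan, intensity)]

    fused = []
    for band in bands:
        # Bands that co-vary strongly with the intensity receive more detail.
        gain = covariance(band, intensity) / var_i if var_i > 0 else 0.0
        fused.append([b + gain * d for b, d in zip(band, detail)])
    return fused
```

When the high-resolution band carries no detail beyond the synthetic intensity, the bands are returned unchanged, which is the spectral-fidelity property that motivates this family of methods.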
Harmonic analysis was first applied to power systems; its theory was later applied for the first time to the NDVI time series analysis of vegetation from remote sensing images such as AVHRR and MODIS [
32,
33]. Jakubauskas et al. (2003) applied this algorithm to crop species identification in their subsequent harmonic study [
34]. The image fusion algorithm based on harmonic analysis overcomes the low fidelity and limited general applicability of fused data, is compatible with panchromatic, single-band, or multispectral images, and achieves good fusion results, compensating for the low spatial resolution of hyperspectral data [
35].
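The decomposition that underlies harmonic analysis can be sketched as follows: a sampled signal (for example a pixel spectrum or an NDVI time series) is expressed as a mean plus a sum of cosine terms, each with an amplitude and a phase. This is only a minimal sketch of the decomposition step; how the cited fusion algorithm modifies these components is not shown, and the function names are hypothetical.

```python
import math


def harmonic_decompose(signal, n_harmonics):
    """Return (mean, [(amplitude, phase), ...]) for harmonics 1..n_harmonics.

    The 2/n scaling is valid only below the Nyquist harmonic,
    so n_harmonics must be smaller than len(signal) / 2.
    """
    n = len(signal)
    if n_harmonics >= n / 2:
        raise ValueError("n_harmonics must be smaller than len(signal) / 2")
    mean = sum(signal) / n
    terms = []
    for h in range(1, n_harmonics + 1):
        # Cosine and sine coefficients of the h-th harmonic.
        a = 2.0 / n * sum(v * math.cos(2 * math.pi * h * k / n)
                          for k, v in enumerate(signal))
        b = 2.0 / n * sum(v * math.sin(2 * math.pi * h * k / n)
                          for k, v in enumerate(signal))
        # a*cos(t) + b*sin(t) == amplitude * cos(t - phase)
        terms.append((math.hypot(a, b), math.atan2(b, a)))
    return mean, terms


def harmonic_reconstruct(mean, terms, n):
    """Rebuild an n-sample signal from the mean and (amplitude, phase) pairs."""
    return [
        mean + sum(amp * math.cos(2 * math.pi * h * k / n - phase)
                   for h, (amp, phase) in enumerate(terms, start=1))
        for k in range(n)
    ]
```

Keeping only the first few harmonics gives a smooth approximation of the signal, while the amplitudes and phases themselves can serve as compact descriptors.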
Sentinel-2A imagery is available free of charge, has a spatial resolution of 10 m, and includes three red-edge bands that are advantageous for vegetation classification. Fusing it with GF-5 images yields hyperspectral images with high spatial resolution; the aim is to combine the advantages of the two sensors to improve the accuracy of tree species classification and to explore the classification performance and wider application value of the fused data.
Due to their excellent performance, machine learning algorithms are increasingly used in tree species classifications. Commonly used machine learning algorithms include Support Vector Machine (SVM) and Random Forest (RF) classifier [
36]. RF can rank the importance of features based on their contribution to accuracy, so that the most important feature set can be selected before classification [
37].
In this paper, Sentinel-2A and GF-5 images were fused using two methods, Gram-Schmidt and harmonic analysis. Image quality evaluation was then performed to select the better fused image, followed by band selection; spectral features, vegetation indices, texture features, and topographic features were then extracted as feature variables. RF was chosen as the classification method to map the spatial distribution of tree species. The study had three objectives: to compare the two image fusion methods, Gram-Schmidt and harmonic analysis; to assess the change in tree species classification accuracy before and after GF-5 fusion; and to determine the extent to which the fused images and their features improve tree species classification.
4. Discussion
Spectral features are the basis of tree species classification. Sentinel-2A images have higher spatial resolution and lower spectral resolution than GF-5 images. The spectral resolution of GF-5 images is about 5 nm for VNIR and 10 nm for SWIR [
47]. High spectral resolution showed great potential in the classification of complex tree species [
25,
26]. When fusing hyperspectral and multispectral images, the ratio of their spatial resolution also had an impact on the results. In this study, the ratio of spatial resolution between hyperspectral and multispectral images is three. When the ratio of spatial resolution is two, major changes cannot be represented, and image fusion methods can improve spatial fidelity [
52].
In this paper, Gram-Schmidt (GS) and Harmonic Analysis Fusion (HAF) were used to fuse GF-5 and Sentinel-2A images. Ren et al. (2020) showed that Adaptive Gram-Schmidt (GSA) and Smoothing Filter-based Intensity Modulation (SFIM) can be used to fuse GF-5 with Sentinel-2A images [
53]. In another study, fusing ZY-3 and Sentinel-2A images increased the number of spectral bands from 4 to 10 and improved the classification accuracy by 14.2% at 2 m spatial resolution, showing that image fusion played an important role in improving tree species classification accuracy [
54]. In this study, the classification accuracy of fused images was 17.98% higher than that of GF-5 images (Scheme 1 and Scheme 2).
Zhang et al. (2018) studied the fusion of Gaofen-2 (GF-2) images, fusing the visible and near-infrared bands separately. The GS method was found to be more suitable for fusing near-infrared band images and for monitoring vegetation and water indicators [
55]. Dalla Mura et al. (2015) combined panchromatic images with multispectral or hyperspectral data; the results illustrated that the Gram-Schmidt pansharpening method can be used as an example to evaluate the performance of global and local gain estimation strategies. In the fusion of Synthetic Aperture Radar (SAR) images with optical images [
56], Yan et al. (2020) used the Non-Subsampled Contourlet Transform (NSCT) to improve the GS method and obtain high-resolution images containing both spectral information and SAR image detail [
57]. These studies showed the superiority of the GS method for image fusion.
Zhang et al. (2020) constructed an improved feature set, HGFM, by combining multi-scale Guided Filter (GF)-optimized Harmonic Analysis (HA) with morphological operations for HSI classification, and evaluated different feature sets using random forests and other classifiers; experimental results confirmed the effectiveness of the feature set in terms of classification accuracy and generalization ability [
58]. The hyperspectral anomaly detection method based on harmonic analysis and low-rank decomposition proposed by Xiang et al. (2019) was validated using public hyperspectral datasets (University of Pavia ROSIS, Indian Pines AVIRIS, Salinas AVIRIS), and it performed well in terms of visual quality, ROC curves, and AUC values [
59]. In this paper, only the common HAF method was used for fusion, and the fusion results were less reliable than those of the GS method (
Table 2); improvement of the HAF method is needed for future hyperspectral fusion research.
In addition, most of the top-ranked features were vegetation indices such as NDVI and WBI. The Water Band Index (WBI) is very sensitive to changes in canopy water status and has important applications in canopy stress analysis, productivity prediction, and modeling [
60]. NDVI is widely used in tree species classification. These vegetation indices accounted for the majority of the top-ranked features and improved the classification results: tree species differ in canopy pigmentation and moisture content, and these differences are important for distinguishing them. This is similar to the results of Li et al. (2020), who found that forests in cloud-shaded areas could be classified using narrow-band vegetation indices (NDVI705, mSR705, mNDVI705, VOG1, VOG2, REP) and texture information, with better results than using reflectance images alone [
61]. Zagajewski et al. (2015) also demonstrated the validity of NDVI and NDWI for tree species classification [
62].
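As an illustration of how such indices are derived from a hyperspectral pixel, the following minimal sketch computes NDVI and the red-edge NDVI705 from a single spectrum by picking the band closest to each target wavelength. The function names are hypothetical, and the 800 nm and 670 nm centres used for the broadband-style NDVI are assumptions chosen for illustration; NDVI705 uses the 750 nm and 705 nm bands of its usual definition.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def nearest_band(wavelengths, target_nm):
    """Index of the band whose centre wavelength is closest to target_nm."""
    return min(range(len(wavelengths)),
               key=lambda i: abs(wavelengths[i] - target_nm))


def narrowband_indices(wavelengths, reflectance):
    """NDVI and NDVI705 for one pixel spectrum.

    wavelengths: band centres in nm; reflectance: values in the same order.
    """
    def r(target_nm):
        return reflectance[nearest_band(wavelengths, target_nm)]

    return {
        # Band centres of 800 nm (NIR) and 670 nm (red) are assumptions.
        "NDVI": normalized_difference(r(800), r(670)),
        # Red-edge variant: (R750 - R705) / (R750 + R705).
        "NDVI705": normalized_difference(r(750), r(705)),
    }
```

For example, a spectrum with reflectances of 0.05, 0.15, 0.40, and 0.45 at 670, 705, 750, and 800 nm gives an NDVI of 0.8.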
Aspect and Slope were ranked highly (
Figure 4), suggesting that topographic features can improve the classification accuracy of tree species (Scheme 6, Scheme 7). Ma et al. (2021) added topographic features as variables to the spectral bands and found that elevation, slope, and aspect all influenced the current spatial distribution of the four tree species in the study area [
36]. The slope derived from the DEM data contributed the most to the classification of forest types, which is consistent with the findings of this paper (
Figure 4) [
63].
Trees of the same type were texturally similar, whereas natural forests showed more variation in canopy size, height, and density. Textural features could effectively reduce the “salt-and-pepper noise” and improve the integrity of patches and classification accuracy [
64]. This is consistent with the findings of Ye et al. (2021), who used WorldView-2 and Sentinel-2A image fusion to classify eight land cover classes using spectral and texture features, and who also assessed the relative importance of these features, with GLCM-Mean ranking second in importance [
2].
In terms of producer's accuracy (PA), all species exceeded 70%, and
Celtis sinensis Pers. (CSP) and
Phyllostachys edulis (BA) were both 98%. The highest accuracy occurred in homogeneous forest areas with high forest and tree canopy cover [
65]; these species were more spatially concentrated and had fewer omitted pixels.
Pinus elliottii (PE) had an accuracy of less than 80%; these two trees were more dispersed in space.
Pinus elliottii (PE) covered a smaller area with many mixed pixels, so more of its pixels were omitted and assigned to other species.
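For reference, PA can be computed from a confusion matrix as sketched below. The sketch assumes that rows hold the reference classes and columns the predicted classes; the function names and the toy matrix in the usage note are hypothetical and are not results from this study.

```python
def producers_accuracy(confusion, labels):
    """Per-class producer's accuracy.

    confusion[i][j] is the number of reference samples of class i that
    were classified as class j (rows = reference; an assumption).
    """
    result = {}
    for i, label in enumerate(labels):
        reference_total = sum(confusion[i])
        # Share of reference samples of this class that were correctly mapped.
        result[label] = (confusion[i][i] / reference_total
                         if reference_total else float("nan"))
    return result


def overall_accuracy(confusion):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total if total else float("nan")
```

With a toy matrix `[[9, 1, 0], [0, 10, 0], [2, 1, 7]]` for classes A, B, and C, the third class has the lowest PA because three of its ten reference samples are assigned to other classes, which is the omission pattern described above for dispersed species.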
This study showed that GF-5 and Sentinel-2A images have the potential to identify tree species, which can provide new options for forest resource monitoring and management; the classification of seven forest tree species showed that the method can achieve high classification accuracy. However, only a single Sentinel-2A image was used for image fusion. Time series images have been shown to play an important role in tree species classification because they reflect spectral changes during tree growth and development more directly, so fusing multi-temporal Sentinel-2A images with GF-5 could be considered to capture these spectral change patterns in finer detail [
66,
67,
68]. In addition, only one machine learning method (RF) was used in this paper; other machine learning methods such as Support Vector Machine (SVM), Artificial Neural Network (ANN), K-Nearest Neighbor (KNN), and BP neural network could also be used. Deep learning methods (AlexNet, VGG-16, ResNet-50, LSTM, etc.) have also been used recently in tree species classification [
1,
69,
70], and these methods can be considered in subsequent studies to compare the classification accuracy of different approaches, which may further improve the mapping of forest tree species.