Examining the Roles of Spectral, Spatial, and Topographic Features in Improving Land-Cover and Forest Classifications in a Subtropical Region

Abstract: Many studies have investigated the effects of spectral and spatial features of remotely sensed data and topographic characteristics on land-cover and forest classification results, but they are mainly based on individual sensor data. How these features from different kinds of remotely sensed data with various spatial resolutions influence classification results is unclear. We conducted a comprehensive comparative analysis of how spectral and spatial features from ZiYuan-3 (ZY-3), Sentinel-2, and Landsat and their fused datasets, with spatial resolutions of 2 m, 6 m, 10 m, 15 m, and 30 m, together with topographic factors, influence land-cover classification results in a subtropical forest ecosystem using the random forest approach. The results indicated that the combined spectral (fused data based on ZY-3 and Sentinel-2), spatial, and topographical data with 2-m spatial resolution provided the highest overall classification accuracy of 83.5% for 11 land-cover classes, as well as the highest accuracies for almost all individual classes. Increasing the number of spectral bands from 4 to 10 through fusion of ZY-3 and Sentinel-2 data increased overall accuracy by 14.2% at 2-m spatial resolution, and by 11.1% at 6-m spatial resolution. Textures from high spatial resolution imagery play more important roles than textures from medium spatial resolution images. The incorporation of textural images into spectral data in the 2-m spatial resolution imagery improved overall accuracy by 6.0–7.7% compared to 1.1–1.7% in the 10-m to 30-m spatial resolution images. Incorporation of topographic factors into spectral and textural imagery further improved overall accuracy by 1.2–5.5%. The classification accuracies for coniferous forest, eucalyptus, other broadleaf forests, and bamboo forest can reach 85.3–91.1%. This research provides new insights for using proper combinations of spectral bands and textures corresponding to images of specific spatial resolutions in improving land-cover and forest classifications in subtropical regions.


Introduction
The subtropical ecosystem in China plays an important role in the global carbon cycle because of its abundant forest cover and high carbon sequestration [1]. This region is characterized by highly distinct dry and wet seasons with a long summer and short winter. The average annual temperature is about 21 °C and the average annual rainfall is about 1300 mm. The study area has low terrain in the southwest and high terrain in the northeast. The altitude is between 70 m and 900 m and the slope is mainly between 25° and 35°. Gaofeng Forest Farm was established in 1953 and is the largest state-owned forest farm in Guangxi. The total area is about 220 km² with forest coverage of about 85%. The dominant forest types in this farm are plantations, including Masson pine, Chinese fir, eucalyptus, and other broadleaf evergreen forests.

The Proposed Framework
The strategy of mapping land-cover and forest distribution using different data sources is illustrated in Figure 2. The major steps included the following: (1) Collection and organization of different data sources such as remotely sensed and field survey data. For this study, all remotely sensed data were registered to the Universal Transverse Mercator (UTM) coordinate system and atmospherically and topographically corrected; and all field survey data were organized and randomly selected as training samples or validation samples. (2) Extraction of DEM data from the ZiYuan-3 (ZY-3) stereo image. The resulting DEM data were used for topographic correction of remotely sensed data and for calculation of topographic factors. (3) Data fusion of different spatial resolution images or different optical sensor data. (4) Extraction and selection of textures and design of data scenarios.

Data Preparations
Datasets used in this research include different optical sensor data (ZY-3, Sentinel-2, and Landsat 8 Operational Land Imager (OLI)), field survey data, and DEM data developed from the ZY-3 stereo data (Table 1). All data were registered to the UTM coordinate system.

Dataset | Description | Acquisition Date
ZiYuan-3 (ZY-3) (L1C) | Four multispectral bands (blue, green, red, and near infrared (NIR)) with 5.8-m spatial resolution and stereo imagery (panchromatic nadir-view image with 2.1-m spatial resolution; backward and forward views with 3.5-m spatial resolution) were used. |
DEM data | The DEM data with 2-m spatial resolution were produced from digital surface model (DSM) data, which were extracted from the ZY-3 stereo data. |

During field surveys, the coordinates of a site and detailed information about land-cover/forest types, as well as the tree species' composition, ages of plantation, and estimated height of mixed forest, were recorded for each site visited. All the recorded field data were imported into ArcGIS software and organized in digital format. Field survey data were refined by overlaying the high spatial resolution images from Google Earth and compared with existing forest maps obtained from Farm archives. The refined samples were randomly divided into two groups: training samples and validation samples. Based on field surveys and our research objectives, a classification system consisting of 11 land-cover classes with special emphasis on forest types (Table 2) was designed.

Collection and Preprocessing of Different Remotely Sensed Data
The remotely sensed data used in this research include ZY-3, Sentinel-2, and Landsat 8 OLI. The ZY-3 multispectral and panchromatic data were orthorectified, then atmospherically calibrated using the fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) method, and topographically corrected using the SCS+C approach (a modified sun-canopy-sensor topographic correction) [59]. The multispectral images were resampled to a pixel size of 6 m, while the panchromatic band was resampled to 2 m. The ZY-3 imagery was used as the reference. Both Sentinel-2 and Landsat 8 OLI images were registered to the UTM coordinate system with a root mean square error of less than 0.5 pixels. The ZY-3 stereo image was used to develop digital surface model (DSM) data with 2-m spatial resolution [19]. Since a DSM represents the land surface height, not the bare ground height, it was necessary to post-process the DSM to reduce the impacts of canopy heights on elevation data [60]. Filtering is a commonly used approach to achieve such a goal. Thus, in this research, a minimum filtering algorithm with a window size of 5 by 5 pixels was applied first. Then, a median filtering algorithm with the same window size was applied, so the processed DSM could be used as a proxy of DEM for further use in topographic correction of these optical sensor data. The 2-m DEM data were then resampled to 6 m, 10 m, and 30 m using the mean algorithm, matching the cell sizes of satellite images of ZY-3, Sentinel-2, and Landsat OLI, respectively.
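The two-step DSM filtering described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names are hypothetical, the raster is a small nested list, and windows are simply clipped at the raster edges (the edge treatment used in the study is not stated).

```python
from statistics import median


def window_values(grid, row, col, half):
    """Return the values inside a square window, clipped at the raster edges."""
    rows, cols = len(grid), len(grid[0])
    return [
        grid[r][c]
        for r in range(max(0, row - half), min(rows, row + half + 1))
        for c in range(max(0, col - half), min(cols, col + half + 1))
    ]


def moving_filter(grid, size, reducer):
    """Apply `reducer` (e.g. min or median) over a size-by-size moving window."""
    half = size // 2
    return [
        [reducer(window_values(grid, r, c, half)) for c in range(len(grid[0]))]
        for r in range(len(grid))
    ]


def dsm_to_dem_proxy(dsm, size=5):
    """Minimum filter first (pulls canopy tops down), then median filter (smooths)."""
    lowered = moving_filter(dsm, size, min)
    return moving_filter(lowered, size, median)


# Toy DSM (hypothetical values): flat ground at 100 m with a 3 x 3 block
# of 20-m-high canopy in the middle.
dsm = [[100.0] * 9 for _ in range(9)]
for r in range(3, 6):
    for c in range(3, 6):
        dsm[r][c] = 120.0

dem_proxy = dsm_to_dem_proxy(dsm)
```

In this toy case the canopy block is narrower than the 5 by 5 window, so the minimum filter removes it completely; wider canopy patches would only be partly lowered.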
The 10 spectral-band Sentinel-2 data (Level-1C product) with 10-m and 20-m spatial resolutions (see Table 1) were used here. The Sentinel-2 Atmospheric Correction (Sen2Cor) was used to conduct atmospheric calibration [61] and Sen2Res was used to convert all spectral bands to the 10-m spatial resolution [62]. For the seven spectral-band Landsat 8 OLI (Level-2 product) data with 30 m for multispectral bands and 15 m for panchromatic band, the LaSRC (Land Surface Reflectance Code) [63] was used for atmospheric calibration. Because the study area is in a mountainous region, undulating terrain has serious impacts on the optical sensor data. It was necessary to conduct topographic correction to reduce the terrain impacts on the surface reflectance values. In this research, the SCS+C approach was used to conduct the topographic correction for both Sentinel-2 and Landsat 8 OLI data, as this algorithm has proven to provide better correction effects, especially when the solar elevation angle is relatively low [59].
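For readers unfamiliar with SCS+C, the following sketch shows the commonly published form of the correction: reflectance is scaled by (cos(slope)·cos(solar zenith) + C) / (cos(i) + C), where i is the solar incidence angle on the slope and C = b/m comes from a linear regression of reflectance on cos(i). The function names and sample values are hypothetical, and the authors' actual implementation may differ in detail.

```python
import math


def cos_incidence(slope_deg, aspect_deg, sun_zenith_deg, sun_azimuth_deg):
    """Cosine of the solar incidence angle on a tilted surface."""
    s, a = math.radians(slope_deg), math.radians(aspect_deg)
    z, az = math.radians(sun_zenith_deg), math.radians(sun_azimuth_deg)
    return math.cos(z) * math.cos(s) + math.sin(z) * math.sin(s) * math.cos(az - a)


def fit_c(cos_i_values, reflectances):
    """Fit reflectance = m * cos(i) + b by least squares and return C = b / m."""
    n = len(cos_i_values)
    mean_x = sum(cos_i_values) / n
    mean_y = sum(reflectances) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cos_i_values, reflectances))
    sxx = sum((x - mean_x) ** 2 for x in cos_i_values)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return b / m


def scs_c(reflectance, slope_deg, aspect_deg, sun_zenith_deg, sun_azimuth_deg, c):
    """SCS+C correction of one pixel value for one band."""
    cos_i = cos_incidence(slope_deg, aspect_deg, sun_zenith_deg, sun_azimuth_deg)
    numerator = math.cos(math.radians(slope_deg)) * math.cos(math.radians(sun_zenith_deg)) + c
    return reflectance * numerator / (cos_i + c)


# Hypothetical example: a sun-facing 30-degree slope is darkened after correction,
# while a flat pixel is left unchanged.
c_band = fit_c([0.2, 0.5, 0.9], [0.14, 0.20, 0.28])
flat = scs_c(0.2, 0.0, 0.0, 40.0, 135.0, c_band)
sunlit = scs_c(0.2, 30.0, 135.0, 40.0, 135.0, c_band)
```

The parameter C is estimated per band, which is why `fit_c` takes the samples of a single band.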

Multisensor/Multiresolution Data Fusion
Many data fusion algorithms are available (see review papers by Pohl and van Genderen [39] and Zhang [40]), and some algorithms such as High Pass Filter (HPF), Gram-Schmidt (GS), and wavelet are often used for data fusion because they can effectively preserve multispectral features while improving spatial details in the fused image [10,19]. HPF extracts rich spatial features from high spatial resolution imagery using a high-pass filtering approach and then adds the extracted features into individual spectral bands [10]. In this research, we used HPF to produce the fused datasets (note: the abbreviations ZY, ST, and LS represent ZiYuan-3, Sentinel-2, and Landsat 8 OLI, respectively). In addition, the original multispectral bands from ZY6 (ZY-3 MS (6 m): 4 multispectral bands with 6-m spatial resolution), ST10 (Sentinel-2 MS (10 m): 10 multispectral bands with 10-m spatial resolution), and LS30 (Landsat 8 OLI MS (30 m): 6 multispectral bands with 30-m spatial resolution) were also used as comparisons. The PC1 from the ZY-3 multispectral image was used here because it concentrates most information from the multispectral bands (over 80% of total variance explained in this research) and only one band with high spatial resolution was required in the HPF data fusion.
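The idea behind HPF fusion, extracting high-frequency detail from the high-resolution band and adding it to each upsampled spectral band, can be sketched as follows. This is a simplified illustration, not the procedure used in the study: the high-pass component is taken as the band minus its box-filter mean, upsampling is nearest neighbour, and the `kernel` and `weight` parameters are hypothetical placeholders (operational HPF implementations choose them from the resolution ratio and band statistics).

```python
def box_mean(grid, size):
    """Moving-window mean with windows clipped at the raster edges."""
    half = size // 2
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        line = []
        for c in range(cols):
            vals = [
                grid[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            line.append(sum(vals) / len(vals))
        out.append(line)
    return out


def upsample(grid, factor):
    """Nearest-neighbour upsampling by an integer factor."""
    return [
        [grid[r // factor][c // factor] for c in range(len(grid[0]) * factor)]
        for r in range(len(grid) * factor)
    ]


def hpf_fuse(ms_band, high_res, factor, kernel=5, weight=1.0):
    """Add the high-pass detail of `high_res` to the upsampled spectral band."""
    ms_up = upsample(ms_band, factor)
    low = box_mean(high_res, kernel)
    return [
        [ms_up[r][c] + weight * (high_res[r][c] - low[r][c]) for c in range(len(ms_up[0]))]
        for r in range(len(ms_up))
    ]


# Toy data (hypothetical): a 2 x 2 spectral band fused with a 4 x 4 high-resolution band.
ms = [[0.2, 0.4], [0.6, 0.8]]
flat_detail = [[10.0] * 4 for _ in range(4)]
fused_flat = hpf_fuse(ms, flat_detail, factor=2, kernel=3)

with_edge = [row[:] for row in flat_detail]
with_edge[1][1] = 20.0  # one bright high-resolution pixel
fused_edge = hpf_fuse(ms, with_edge, factor=2, kernel=3)
```

A featureless high-resolution band contributes no detail, so `fused_flat` equals the upsampled spectral band, while the bright pixel in `with_edge` raises the fused value at that location.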

Extraction and Selection of Textural Images
Optical spectral bands may be the most important variables in land-cover classification [35]. As spatial resolution increases, how to effectively incorporate spatial features into spectral data has long been an important research topic [37,64,65]. One approach to using spatial features is to calculate the textural image, a newly generated image from a spectral one, by using a proper texture measure and window size [37]. The GLCM texture measures (mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment, and correlation) are often used in practice due to their easy implementation and effectiveness in extracting spatial information [66–68]. In order to increase efficiency, we did not conduct the same texture calculation on each spectral band; instead, we conducted principal component analysis on multispectral bands. Then, the PC1 (the first component image) was used for calculation of textural images because the PC1 concentrated the largest amount of information from the multispectral image. Depending on the spatial resolution of the images, different window sizes were explored when calculating textural images based on PC1. Specifically, we explored (1) window sizes 5 × 5 to 31 × 31 (14 sizes) at intervals of two for the image at 2-m spatial resolution; (2) window sizes 3 × 3 to 21 × 21 (10 sizes) at intervals of two for the image at 6-m spatial resolution; (3) window sizes 3 × 3 to 15 × 15 (7 sizes) for the images at 10-m and 15-m spatial resolution; and (4) window sizes 3 × 3 to 11 × 11 (6 sizes) for the images at 30-m spatial resolution.
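To make the eight GLCM measures concrete, the sketch below computes them for a single window of quantized PC1 values. It uses the textbook definitions; the number of gray levels, the pixel offset (one pixel to the right), and the symmetric normalization are assumptions for illustration, and commercial software may use slightly different conventions.

```python
import math


def glcm(window, levels, d_row=0, d_col=1):
    """Symmetric, normalized gray-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def glcm_textures(p):
    """Return the eight texture measures for a normalized GLCM `p`."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    mean = sum(i * v for i, j, v in cells)
    variance = sum((i - mean) ** 2 * v for i, j, v in cells)
    covariance = sum((i - mean) * (j - mean) * v for i, j, v in cells)
    return {
        "mean": mean,
        "variance": variance,
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for i, j, v in cells if v > 0),
        "second_moment": sum(v * v for i, j, v in cells),
        # Correlation is undefined for a constant window; report 0.0 in that case.
        "correlation": covariance / variance if variance > 0 else 0.0,
    }


# Two toy windows quantized to 2 gray levels (hypothetical values).
uniform = glcm_textures(glcm([[0, 0], [0, 0]], levels=2))
striped = glcm_textures(glcm([[0, 1], [0, 1]], levels=2))
```

Because the matrix is symmetric, the row and column means and variances coincide, which is why a single `mean` and `variance` are enough for the correlation term. A textural image is obtained by sliding the window over PC1 and storing one measure per pixel.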
Use of many variables in land-cover classification cannot guarantee a higher classification accuracy [35]. On the contrary, using more variables in a classification procedure requires a large number of training samples and demands long processing time and heavy workloads. Because some variables may have limited roles in land-cover classification or may be highly correlated with each other, identifying the optimal combination of variables becomes necessary before implementing classification [10,33]. One possible solution is to use the RF algorithm to rank variable importance, which is often done in land-cover classification [68–70]. Among the potential variables selected using RF, Pearson's correlation analysis is used to examine the correlation coefficients between them. If two variables are highly correlated, the one with the lower importance ranking may not be needed and, thus, can be removed. This process is repeated until a minimum number of variables is identified while classification performance remains stable [19]. Therefore, in this research, we used RF to identify the key variables of textural images for each designed dataset. The selected textural images were incorporated into the spectral bands for land-cover classification.
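The pruning step described above (rank by importance, then drop the less important member of any highly correlated pair) can be sketched as a greedy pass. The importance scores are taken as given (in the study they come from RF), the variable names and the 0.8 correlation threshold are hypothetical, and the final check that classification performance stays stable is not included.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def prune_correlated(importance, columns, threshold=0.8):
    """Keep variables in order of importance, skipping any that is highly
    correlated with a variable that has already been kept."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    kept = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical texture variables sampled at five training pixels.
columns = {
    "mean_w9": [1.0, 2.0, 3.0, 4.0, 5.0],
    "variance_w9": [2.0, 4.0, 6.0, 8.0, 10.1],   # nearly a multiple of mean_w9
    "contrast_w15": [5.0, 1.0, 4.0, 2.0, 3.0],
}
importance = {"mean_w9": 0.5, "variance_w9": 0.3, "contrast_w15": 0.2}

selected = prune_correlated(importance, columns)
```

Here `variance_w9` is dropped because it is almost perfectly correlated with the more important `mean_w9`, while `contrast_w15` is retained.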

Design of Data Scenarios
Based on sensor data with various spatial and spectral resolutions (ZY-3, Sentinel-2, and Landsat), as well as topographic factors, different data scenarios were designed for a comparative analysis of classification results to understand how incorporation of textures and topographic factors influences land-cover and forest classification. For each dataset, three data scenarios were designed: spectral bands; combination of spectral and textural images; and combination of spectral bands, textures, and topographic data. Thus, a total of 24 data scenarios were available, as summarized in Table 3. The textures in this table were identified using the RF approach based on calculated textural images for each data scenario mentioned before. The DEM-derived elevation, slope, and aspect variables were incorporated into different data scenarios.
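The scenario design amounts to a cross product of datasets and feature combinations. The short sketch below enumerates it; the dataset names are those that appear in the results, the scenario codes SP, SPTX, and SPTXTP follow the text, and the data structure itself is only a hypothetical way of organizing them.

```python
from itertools import product

# Eight datasets named in the results: sensor/fusion code, pixel size, (number of bands).
DATASETS = [
    "ZY2(4)", "ZY6(4)", "STZY2(10)", "STZY6(10)",
    "ST10(10)", "LS15(6)", "LS30(6)", "LSZY6(6)",
]

# Three feature combinations per dataset.
FEATURE_SETS = {
    "SP": ("spectral bands",),
    "SPTX": ("spectral bands", "textures"),
    "SPTXTP": ("spectral bands", "textures", "elevation", "slope", "aspect"),
}

scenarios = [
    {"dataset": dataset, "scenario": code, "features": FEATURE_SETS[code]}
    for dataset, code in product(DATASETS, FEATURE_SETS)
]
```

Enumerating the combinations this way gives 8 × 3 = 24 scenarios, matching the count stated in the text.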

Land-Cover Classification Using the Random Forest Classifier
Selection and optimization of training samples are critical steps in the classification procedure [35]. Based on field surveys, a total of 1268 training samples covering 11 land-cover classes were collected. The number of samples for each land cover is provided in Table 2. Transformed divergence was used to examine the separability of land-cover classes, and optimization of the training samples for each class was conducted through examining spectral curves.
Many classification algorithms, such as maximum likelihood classifier (MLC), artificial neural network (ANN), and support vector machine (SVM), are available [35], but the researcher must select a classification algorithm suitable for a given study area and datasets. Machine learning algorithms such as RF and SVM have been proven to provide better classification than MLC when multisource data are used [11,27,71,72]. Compared with ANN and SVM, which require optimization of different parameters and much longer optimization processing times, RF has the advantages of easy optimization of parameters and much less computation time [71,73]. RF is a nonparametric classifier based on a decision tree strategy and has been extensively used for land-cover classification [19,68–70]. Three parameters need to be optimized: (1) ntree, the number of regression trees (default is 500); (2) mtry, the number of input variables per node (default value is one-third of the total variables); and (3) node size (default is one). Node size and mtry are often kept at their default values. Thus, the emphasis is on the optimization of ntree. As previous research has indicated, the best ntree value can be 50 [74,75], 100 [68], or even 500 [71] to reach a stable classification accuracy, depending on the classification system and data used. In this research, we explored different ntree values and finally selected 500 as the best value. As summarized in Table 3, RF was used to produce the classification result based on each data scenario. The classification results were then recoded to a system with 11 land-cover classes (see Table 2) for comparative analyses using the accuracy assessment approach.

Comparative Analysis of Classification Results
A stratified sampling approach with a maximum of 200 samples and a minimum of 30 samples was used to collect validation samples. Based on field surveys, all validation samples were visually checked to decide the land-cover type. A total of 648 validation samples were collected and are summarized in Table 2. An error matrix for each scenario was used to evaluate classification results. Overall accuracy and kappa coefficient were calculated from the error matrix [76,77] and used for a comparative analysis of classification results from different scenarios. Meanwhile, user's accuracy (UA) and producer's accuracy (PA) were also calculated from each error matrix and used to evaluate the classification accuracy for each land-cover class, especially for individual forest types. In order to easily compare the accuracies among different land-cover types, a mean accuracy (MA) based on PA and UA was calculated; that is, MA = (PA + UA)/2 (Xie et al. 2019). Through comparative analysis of the classification results from different data scenarios, we can better understand the performances of different data sources and whether or not the classification system is suitable for practical applications. In this way, a better forest classification system can be proposed according to the research objective and the capability of data sources for a given subtropical forest ecosystem.
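The accuracy measures named above follow directly from the error matrix. A minimal sketch, assuming rows hold the classified labels and columns hold the reference labels; the function name and the two-class example matrix are hypothetical.

```python
def accuracy_report(matrix, labels):
    """Overall accuracy, kappa, and per-class UA, PA, and MA from an error matrix.

    Rows are classified labels and columns are reference labels.
    """
    k = len(labels)
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    diagonal = [matrix[i][i] for i in range(k)]

    overall = sum(diagonal) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (overall - expected) / (1 - expected)

    classes = {}
    for i, label in enumerate(labels):
        ua = diagonal[i] / row_totals[i]   # user's accuracy
        pa = diagonal[i] / col_totals[i]   # producer's accuracy
        classes[label] = {"UA": ua, "PA": pa, "MA": (ua + pa) / 2}
    return {"OA": overall, "kappa": kappa, "classes": classes}


# Hypothetical two-class error matrix with 100 validation samples.
report = accuracy_report([[40, 10], [5, 45]], ["forest", "non-forest"])
```

For this toy matrix the overall accuracy is 0.85 and the chance agreement is 0.5, so kappa is 0.70; the user's accuracy of the first class is 40/50 and its producer's accuracy is 40/45.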

Comparative Analysis of Classification Results Based on Overall Accuracies
The overall classification results based on different scenarios (Table 4) indicate that STZY2(10) under SPTXTP (combination of spectral bands, textures, and topographic factors) provided the best classification with overall accuracy of 83.5% and kappa coefficient of 0.80, followed by STZY2(10) under SPTX (combination of spectral bands and textures) and STZY6(10) under SPTX or SPTXTP with overall accuracies of 76.2-78.1% and kappa coefficients of 0.71-0.74. All other scenarios had overall accuracies of less than 74.2% and kappa coefficients of less than 0.69. These results imply the important roles of both spectral and spatial resolutions through fusion of ZY-3 and Sentinel-2 data and the importance of incorporating textural images and topographic factors into spectral bands. The comprehensive role of both textures and topographic factors in 2-m spatial resolution images improved overall accuracy by 11.4-11.6% compared with only spectral bands.

The Role of Spectral Features in Land-Cover Classification
Considering the classification results using spectral bands only, STZY6(10) provided the best accuracy of 73.6%, followed by STZY2(10) with 72.1%, while ZY2(4) provided the poorest accuracy of 57.9%, followed by LS30(6) with 59.9%, implying the importance of combined spectral and spatial features in improving land-cover classification. Overall, 10 spectral bands (STZY2(10), STZY6(10), ST10(10)) had better classification accuracy (68.2-73.6%) than six bands (59.9-66.2%) and four bands (57.9-62.4%), implying the important role of an increased number of spectral bands in improving land-cover classification. For example, at 2-m spatial resolution, increasing the number of spectral bands from 4 (ZY2(4)) to 10 (STZY2(10)) improved classification accuracy by 14.2%. At 6-m spatial resolution, the same increase in spectral bands from ZY6(4) to STZY6(10) increased the overall accuracy by 11.1%, but increasing the spectral bands from four (ZY6(4)) to six (LSZY6(6)) increased the overall accuracy by only 3.7%, implying the important roles of using red-edge and narrow NIR bands (only in Sentinel-2) in addition to the SWIR bands (in both Sentinel-2 and Landsat OLI).
The role of data fusion varies in improving land-cover classification, depending on the number of spectral bands and spatial resolution. If the spatial resolution is the same, increasing the number of spectral bands can considerably improve classification accuracy, for example, from ZY2(4) to STZY2(10) and from ZY6(4) to LSZY6(6) or STZY6(10). However, if the number of spectral bands is the same, the role of improved spatial resolution varies; for instance, higher spatial resolution may produce higher heterogeneity of the same forest class, resulting in reduced classification accuracy, as shown in the data scenarios between ZY6(4) and ZY2(4), and between STZY6(10) and STZY2(10). However, for the relatively coarse spatial resolution images, improved spatial resolution is indeed helpful for land-cover classification, as shown in the data scenarios among LS30(6), LS15(6), and LSZY6(6), and between ST10(10) and STZY6(10). These results imply that a high spatial resolution image without a sufficient number of spectral bands (e.g., ZY2(4)) or a relatively coarse spatial resolution image (LS30(6)) provides poor classification accuracy (less than 60% in this research), while the imagery having both high spatial and spectral resolutions (e.g., STZY6(10)) provides the best classification accuracy, implying the importance of selecting remotely sensed data with both spectral and spatial resolutions suitable for land-cover classification.

The Role of Textures in Land-Cover Classification
For the data scenarios under SPTX, STZY2(10) and STZY6(10) provided the best accuracies, of 78.1% and 76.2%, followed by ST10(10), LSZY6(6), and ZY6(4) with overall accuracies of 68.8-69.4%, and LS30(6) had the poorest accuracy at only 61%, implying different roles of textures in improving land-cover classification. One important finding in Table 4 is that incorporation of textures into spectral bands improved land-cover classification, but the contribution of textures varied, depending on spatial resolution and the number of spectral bands. As shown in Table 4, using textures improved overall accuracy by 6.3% and 7.7% in ZY6(4) and ZY2(4), respectively; by 1.2%, 2.6%, and 6.0% in ST10(10), STZY6(10), and STZY2(10), respectively; and by 1.1%, 1.7%, and 2.8% in LS30(6), LS15(6), and LSZY6(6), respectively. For the same number of spectral bands, the textures from higher spatial resolution imagery (better than 6 m) played more important roles than the ones from relatively coarse spatial resolution imagery. This finding indicates the need to combine the spatial features from high spatial resolution imagery into spectral bands to produce accurate land-cover classification.

The Role of Topographic Factors in Land-Cover Classification
Incorporation of topographic factors into spectral and textural datasets (SPTXTP) improved overall accuracy by 1.2-5.5% for all scenarios. The largest improvements, of 5.0-5.5%, were from the STZY2(10), ZY6(4), LS15(6), and LS30(6) scenarios, implying the complex roles of topographic factors in land-cover classification. The topographic factors had relatively low effects (increased accuracy by only 1.2-2.8%) on data scenarios such as STZY6(10) and LSZY6(6) with 6-m spatial resolution and 10 or six spectral bands. The results in Table 4 indicate that the role of topographic factors is more important in high spatial resolution images (better than 6 m) or relatively low spatial resolution images (15 or 30 m here) than in the 6-m spatial resolution images with more spectral bands.

The Comprehensive Roles of Textures and Topographic Factors in Land-Cover Classification
Compared to spectral bands alone (SP data scenario), incorporation of both textures and topography into spectral data improved overall land-cover classification accuracies by 3.9-11.8%. In particular, the data scenarios with high spatial resolution such as ZY2(4), STZY2(10), and ZY6(4) improved overall accuracies by 11.4-11.8%. Generally speaking, textures played more important roles than topography for high spatial resolution images, whereas the reverse held for relatively coarse spatial resolution images. The least improvement using both textures and topography occurred in STZY6(10), with increased accuracy of only 3.9%. This implies that the combined effects of textures and topographic factors depend on the spectral and spatial resolutions.

Comparative Analysis of Classification Results Based on Individual Forest Classes
The mean accuracies of individual land-cover classes (Tables 5 and 6) indicate that different data sources have their own performances in classifying individual classes. Spectral features are still the most important for land-cover classification, and incorporation of textures and topographic factors improved some land-cover classes. Overall, eucalyptus had high classification accuracies of 72.5-90.1%, no matter which datasets were used. In particular, STZY2(10) and STZY6(10) provided the best accuracies of 83.1-90.1% and 84.2-85.6%, respectively, for eucalyptus, implying the importance of both spatial and spectral features. The following subsections will mainly focus on the accuracy analysis based on forest types.

The Role of Spectral Features in Individual Forest Classification
Although spectral signature is the most important feature in forest classification, its role varies depending on specific forest types. For example, at 2-m spatial resolution, four spectral bands with three visible bands and one NIR band (ZY2(4)) cannot effectively separate forest classes, but the 10 spectral bands with visible, red-edge, NIR, and SWIR bands (STZY2(10)) can considerably improve classification accuracy. For example, Tables 5 and 6 show that the classification accuracy for Castanopsis hystrix can increase from 13.8% to 56.7%, and bamboo forest from 33.9% to 78.3%. Overall, no matter which data source was used, eucalyptus had the best accuracies, of 72.5-85.6%, but Masson pine, Chinese fir, and other broadleaf trees had accuracies of less than 54.5%, 60.9%, and 54.4%, respectively. Other forest types had various accuracy ranges, depending on which kinds of data sources were used.
With the same spatial resolution, an increased number of spectral bands improved classification accuracies for the majority of forest types; for example, STZY2(10) provided much better classification accuracies for eucalyptus, Chinese anise, Castanopsis hystrix, Schima, and bamboo forest than ZY2(4). A similar situation occurred with ZY6(4), LSZY6(6), and STZY6(10) for eucalyptus, Chinese anise, and bamboo forest, but not for Masson pine or Chinese fir. In contrast, for the same number of spectral bands, increased spatial resolution (e.g., ZY2(4) vs. ZY6(4)) did not improve forest classification accuracy and made it worse for most forest types. This may be due to the high spatial resolution resulting in high spatial heterogeneity in images. For the 10 spectral bands, STZY6(10) provided the best accuracies for Masson pine, Chinese fir, and eucalyptus, but STZY2(10) provided the best for Schima and bamboo forest, while ST10(10) provided the best for Chinese anise, Castanopsis hystrix, and other broadleaf trees, implying that spatial resolution may play important roles for forest classification but different forest types require different spatial resolutions because of their unique forest canopy structures. Considering six spectral bands with spatial resolutions from 6 m to 15 m to 30 m, LSZY6(6) provided the best classification accuracies for Chinese fir, eucalyptus, Castanopsis hystrix, and Schima, but LS30(6) provided the best for Masson pine and Chinese anise. Overall, spectral signatures alone cannot provide high classification accuracy for most forest types, and 6 m, instead of 2 m or coarser than 10 m, is the optimal spatial resolution. This implies that proper selection of spatial and spectral resolutions is needed for forest classification. Individual sensor data do not have both high spatial and high spectral resolutions, so data fusion is an alternative for improving both spatial and spectral features and, in turn, forest classification, as shown in Tables 5 and 6.

The Role of Textures in Individual Forest Classification
Spectral bands alone, especially without NIR and SWIR bands, make it difficult to extract some forest types such as Masson pine, Chinese anise, Castanopsis hystrix, other broadleaf trees, and bamboo forest, but incorporation of textures considerably improved their classification accuracies. However, the effectiveness of using textural images is influenced by spatial resolution. As shown in Tables 5 and 6, the textures from STZY2(10) provided the best accuracies for Masson pine, eucalyptus, other broadleaf trees, and bamboo forest. The textures from STZY6(10) worked best for Chinese fir and Chinese anise. The textures from ZY2(4) worked best for Schima and the textures from LSZY6(6) worked best for Castanopsis hystrix, implying that textures from high spatial resolution images played more important roles than those from medium spatial resolution images in improving forest classification.
For ZY2(4) with 2-m spatial resolution, incorporation of textures into spectral bands improved classification accuracies for all forest types. In particular, the accuracies for Chinese anise and Castanopsis hystrix increased from 41.8% and 13.8%, respectively, based on spectral bands alone to 59.3% and 40.0% based on the combination of spectral and textural images. For relatively coarse spatial resolution data, such as LS15(6) and LS30(6), incorporation of textural images into spectral bands yielded some improvement for forest types such as Chinese fir, Chinese anise, and Schima, but may reduce accuracies for other forest types such as Masson pine, eucalyptus, and bamboo forest. For 6-m spatial resolution, incorporation of textural images into spectral bands improved classification accuracy for most forest types. The results in Tables 5 and 6 imply the need to identify suitable textures that correspond to specific forest types; no single set of textures is optimal for all forest types because of their differences in forest stand structures, patch sizes, and shapes.

The Role of Topographic Features in Individual Forest Classification
The roles of topographic factors depend on specific forest types and the spatial and spectral resolutions of the datasets used. Overall, the incorporation of topographic factors into spectral and textural images improved classification accuracies of most forest types; the improvement was as high as 18.2% for Schima in ST10(10) and 19.0% and 21.5% for other broadleaf trees in ZY6(4) and LS30(6), respectively. However, these forest types had relatively low accuracies based on the combination of spectral and textural images. In some cases, topographic factors reduced accuracy; for example, their use in STZY6(10) reduced classification accuracies for Chinese fir and Chinese anise by 6.6% and 8.9%, respectively. This implies that the use of topographic factors as extra bands should take into account how sensitive forest distribution is to topography.

The Comprehensive Roles of Textures and Topographic Factors in Individual Forest Classification
Overall, STZY2(10) SPTXTP provided the best classification accuracies for most forest types in this study (Tables 5 and 6): the accuracies for eucalyptus, bamboo forest, and Schima reached 90.1%, 91.1%, and 87.1%, respectively, while Masson pine, Chinese fir, and other broadleaf trees had their best accuracies of 66.7-76.5%, implying that the incorporation of spectral bands, textural images, and topographic factors based on high spectral and spatial resolutions is needed for forest classification. On the other hand, the best classification accuracy for Chinese anise was 79.1% with STZY6(10) and the best accuracy for Castanopsis hystrix was 73.3% with LSZY6(6) under the SPTX scenario, implying the important roles of both spectral bands and textures in forest classification, although topographic factors may not be needed for some forest types. Tables 5 and 6 show the difficulty of classification for some forest types and the need to design a proper forest classification system that takes the classification accuracies, research objectives, and complexity of the forest ecosystem into account.

Design of Different Forest Classification Systems
The classification results in Table 4 show that the data scenario STZY2(10) SPTXTP provided the best classification accuracy of 83.5%. However, Tables 5 and 6 indicate that some forest types, such as Castanopsis hystrix, Masson pine, and Chinese fir, had relatively low accuracies of 64.8-69.4%. As shown in Table 7, Chinese fir and Castanopsis hystrix had PAs of 51.2% and 53.3%, respectively, and Masson pine had a UA of 52.7%, implying major confusion among some forest types. For example, the error matrix from the STZY2(10) SPTXTP classification result (Table 8) shows that Masson pine is highly confused with Chinese fir, and Chinese anise is confused with Castanopsis hystrix. This result indicates that current classifications cannot provide sufficiently high accuracy for some forest types, and it is necessary to design a proper classification system by combining research objectives, remotely sensed data, and classification procedure so the accuracy for each type can meet the user's requirement for real applications.

Note (Table 7): PA, UA, and MA represent producer's accuracy, user's accuracy, and mean accuracy, respectively, of an individual class [i.e., MA = (PA + UA)/2]; OA, overall accuracy; KA, kappa coefficient; meanings of the land-cover abbreviations used in this table are given in Table 2.

Table 8. Error matrix based on the best accuracy result from the STZY2(10) SPTXTP data scenario.
         MP   CF   EU   CA   CH   SC  OBT  BBF   SH   NP  OLC
MP       29   18    2    3    1    1    0    0    0    0    1
Total    36   41  194   33   30   32   46   71   35   42   88
Note: The meanings of the land-cover abbreviations used in this table are given in Table 2.
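The accuracy measures defined in the note above can be computed directly from an error matrix. The following is a minimal sketch, assuming rows hold the classified classes and columns hold the reference classes; the function name `accuracy_metrics` and the three-class matrix `toy` are hypothetical and are not taken from the study. The last lines reuse the Masson pine row as printed in Table 8 to show how a UA of 52.7% arises.

```python
def accuracy_metrics(matrix):
    """Per-class PA, UA, MA plus OA and kappa from a square error matrix.

    Assumed layout: rows = classified classes, columns = reference classes.
    Every class is assumed to have at least one sample in its row and column.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diag = [matrix[i][i] for i in range(n)]

    ua = [diag[i] / row_totals[i] for i in range(n)]  # user's accuracy
    pa = [diag[i] / col_totals[i] for i in range(n)]  # producer's accuracy
    ma = [(p + u) / 2 for p, u in zip(pa, ua)]        # MA = (PA + UA) / 2
    oa = sum(diag) / total                            # overall accuracy
    # Chance agreement expected from the row and column marginals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (oa - expected) / (1 - expected)
    return {"PA": pa, "UA": ua, "MA": ma, "OA": oa, "KA": kappa}


# Hypothetical three-class error matrix, for illustration only
toy = [
    [45, 4, 1],
    [6, 38, 6],
    [2, 3, 45],
]
metrics = accuracy_metrics(toy)

# Masson pine row as printed in Table 8: diagonal count over the row sum
mp_row = [29, 18, 2, 3, 1, 1, 0, 0, 0, 0, 1]
mp_ua = mp_row[0] / sum(mp_row)
print(f"OA = {metrics['OA']:.3f}, kappa = {metrics['KA']:.3f}")
print(f"Masson pine UA = {100 * mp_ua:.1f}%")
```

The 18 samples in the Masson pine row that fall in the Chinese fir column account for most of the loss in UA, which is the confusion described in the text.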

Based on the data scenarios currently used, some forest types have relatively low classification accuracies that are not suitable for real applications such as forest management. Because Masson pine and Chinese fir both belong to coniferous forest, they can be grouped into that class, while Chinese anise, Castanopsis hystrix, Schima, and other broadleaf trees, which constitute a small proportion of this study area, can be grouped into one class called other broadleaf species (except eucalyptus). The newly merged classes, coniferous forest and other broadleaf species, had average accuracies of 91.0% and 85.3%, respectively. Overall accuracy and kappa coefficient of the seven classes became 88.9% and 0.86, respectively, and mean accuracies of forest classes reached 85.3-91.1% (see Table 7). As an example of the classification image with seven land-cover classes, Figure 3 shows that eucalyptus plantations were distributed throughout the study area and made up the largest proportion, followed by coniferous forest. Other broadleaf forests and bamboo forests had small proportions and were widely dispersed.
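The effect of grouping classes on an error matrix can be illustrated with a short sketch. It assumes that merging is done by summing the rows and columns of the grouped classes; the function `merge_classes`, the class subset, and all counts below are hypothetical and do not reproduce the study's matrix.

```python
def merge_classes(matrix, labels, groups):
    """Collapse an error matrix by summing rows and columns of grouped classes.

    groups maps every original label to its merged label.
    """
    merged_labels = []
    for lab in labels:
        if groups[lab] not in merged_labels:
            merged_labels.append(groups[lab])
    index = {g: k for k, g in enumerate(merged_labels)}
    size = len(merged_labels)
    out = [[0] * size for _ in range(size)]
    for i, row_lab in enumerate(labels):
        for j, col_lab in enumerate(labels):
            out[index[groups[row_lab]]][index[groups[col_lab]]] += matrix[i][j]
    return merged_labels, out


def overall_accuracy(matrix):
    """Share of samples on the diagonal."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


# Hypothetical counts for three classes (MP, CF, EU), for illustration only
labels = ["MP", "CF", "EU"]
matrix = [
    [30, 15, 2],
    [8, 25, 3],
    [1, 2, 180],
]
groups = {"MP": "coniferous", "CF": "coniferous", "EU": "EU"}

merged_labels, merged = merge_classes(matrix, labels, groups)
print(merged_labels, merged)
print(overall_accuracy(matrix), overall_accuracy(merged))
```

Samples that were confused between the two conifer classes move onto the diagonal of the merged matrix, so accuracy rises without any change to the underlying classification.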

Increasing the Number of Spectral Bands to Improve Land-Cover and Forest Classification
Spectral signature is fundamental for land-cover and forest classification, and multispectral imagery is commonly used but cannot produce sufficiently high classification accuracies for all classes [19,20]. This is especially true when high spatial resolution images with only a limited number of spectral bands (visible and NIR) are used. For complex forest landscapes with various patch sizes, this research demonstrated the importance of using images with both high spatial and high spectral resolution. In practice, however, the majority of high spatial resolution optical sensor images, such as Quickbird, IKONOS, ZY-3, and GaoFen-1, have only four spectral bands consisting of visible and NIR bands. Having so few spectral bands limits their capability to differentiate forest types and species, especially in subtropical regions with rich tree species, because of the spectral confusion among some forest types and the impacts of undulating terrain on surface reflectance values. This research indicated the importance of including more spectral bands, such as red-edge and SWIR, in the classification procedure. On the other hand, Landsat images with relatively coarse spatial resolution are not sufficient for fine forest classification in a complex forest ecosystem with relatively small patch sizes because of the mixed pixel problem. To overcome this problem, data fusion is an effective tool to integrate high spatial and spectral features into a new dataset. This research used HPF to fuse data from different sensors (e.g., ZY-3 PAN and Sentinel-2 MS) or images with different spatial resolutions from the same sensor (e.g., ZY-3 MS and PAN, Landsat 8 OLI MS and PAN) and found that the fused images indeed improved classification accuracy, a conclusion similar to that of tropical forest classification research in the Brazilian Amazon [10,27,33].
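The idea behind high-pass filter (HPF) fusion can be sketched in a few lines: spatial detail is extracted from the PAN band with a high-pass filter and added to the resampled multispectral band. The sketch below is a simplified illustration, not the parameterization used in this research. The box filter, the nearest-neighbour resampling, the window radius tied to the resolution ratio, and the fixed `weight` are all assumptions, and the function names are hypothetical.

```python
def box_mean(img, radius):
    """Mean filter with a (2 * radius + 1) square window, clipped at the edges."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out


def upsample_nearest(band, factor):
    """Resample a coarse band to the PAN grid by pixel replication."""
    return [[band[y // factor][x // factor]
             for x in range(len(band[0]) * factor)]
            for y in range(len(band) * factor)]


def hpf_fuse(ms_band, pan, factor, weight=1.0):
    """Add high-pass PAN detail (PAN minus its local mean) to one MS band."""
    ms_up = upsample_nearest(ms_band, factor)
    low = box_mean(pan, radius=factor)  # assumed window: radius = resolution ratio
    return [[ms_up[y][x] + weight * (pan[y][x] - low[y][x])
             for x in range(len(pan[0]))]
            for y in range(len(pan))]


# Hypothetical 2 x 2 MS band and 4 x 4 PAN band (resolution ratio 2)
ms = [[1, 2], [3, 4]]
flat_pan = [[10] * 4 for _ in range(4)]
detail_pan = [row[:] for row in flat_pan]
detail_pan[1][1] = 20  # one bright PAN pixel

fused_flat = hpf_fuse(ms, flat_pan, factor=2)
fused_detail = hpf_fuse(ms, detail_pan, factor=2)
```

With a featureless PAN band the fused band equals the resampled MS band, while a bright PAN pixel raises the fused value at that location, which is the spatial detail the fusion is meant to transfer.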
Another way to increase the number of spectral bands is to use multitemporal images. Use of images from different seasons is especially valuable for distinguishing deciduous and evergreen vegetation classes [11,19,45]. However, in subtropical regions where evergreen forests dominate, multitemporal images may not provide much new information for distinguishing forest classes, and cloud-free images are relatively rare. An alternative is to use hyperspectral images such as Hyperion or airborne hyperspectral images [78-80]. However, hyperspectral imagery has not been used extensively for forest classification because of the difficulty of acquiring images for a given study area. Therefore, more research may be focused on making full use of multisource data, such as spatial features inherent in the spectral signatures and ancillary data, in a classification procedure.

Incorporating Textures into Spectral Data to Improve Land-Cover and Forest Classification
This research showed that incorporation of textures into spectral bands can considerably improve overall classification accuracies (see Table 4), especially when high spatial resolution images are used. Many previous studies also came to this conclusion [10,33,37]. The roles of textures in improving specific forest classes vary, depending on spatial and spectral features. This implies the difficulty in identifying universal textures that can be used to improve classification accuracy for each class.
One key in using textural images is to identify an optimal combination of them, which includes deciding how many textural images should be selected. Some previous studies focusing on vegetation classification indicated that incorporating two or three textural images as extra bands into spectral data is suitable, based on separability analysis of training samples [37]. Because there are so many potential combinations, it is often difficult to identify an optimal one. This research used RF to identify the best combination of textural images based on an importance ranking, which proved to be an effective method.
The important role of textures in improving forest classification is well recognized, but the lack of universal knowledge to guide the selection of textures in a study area hinders the process. Selection of a suitable window size for calculation of textural images is critical and depends on the spatial resolution of the image and the complexity of the forest landscapes under investigation [37]. Window sizes of 7 × 7 and 9 × 9 pixels were found to be suitable for calculation of textures from Landsat images [10]. As spatial resolution increases, such as with Quickbird, the window size can be as large as 21 × 21 pixels [81]. This research indicated that a large window size such as 31 × 31 or 25 × 25 is needed for high spatial resolution images (2 m), and a small window size such as 5 × 5 is needed for medium spatial resolution images (30 m). Also, a combination of textural images from different window sizes and texture measures is necessary but may not improve the accuracies for some forest types. Therefore, more research should be conducted on the selection of suitable textures for specific forest classes rather than for overall land cover.
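The relation between window size and spatial resolution can be made concrete by converting each window into its ground footprint. The sketch below does this for the window sizes reported above and also computes a simple moving-window variance as a stand-in texture measure. The texture measures used in this research are not reproduced here; the variance measure, both function names, and the small test image are assumptions for illustration.

```python
def window_footprint_m(window_px, pixel_size_m):
    """Side length on the ground (in metres) covered by a square window."""
    return window_px * pixel_size_m


def local_variance(img, window_px):
    """Variance within an odd-sized window centred on each pixel.

    Windows are clipped at the image edges.
    """
    r = window_px // 2
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            mean = sum(vals) / len(vals)
            out[y][x] = sum((v - mean) ** 2 for v in vals) / len(vals)
    return out


# Window sizes and pixel sizes reported in the text
for window_px, pixel_size_m in [(31, 2), (25, 2), (5, 30)]:
    side = window_footprint_m(window_px, pixel_size_m)
    print(f"{window_px} x {window_px} at {pixel_size_m} m covers {side} m per side")

# Hypothetical 3 x 3 checkerboard patch
patch = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
texture = local_variance(patch, window_px=3)
```

In ground units, the 31 × 31 window at 2 m spans 62 m per side, while the 5 × 5 window at 30 m spans 150 m, so a small window on medium resolution imagery still covers more ground than a large window on high resolution imagery.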

Using Ancillary Data to Improve Land-Cover and Forest Classification
Ancillary data such as population density, DEM, and soil type are easily obtainable and may be used in a land-cover or forest classification. How to effectively employ ancillary data to improve land-cover classification has long been an important research topic [35]. In mountainous regions, the spatial distribution of different land-cover or forest types is often related to topography and soil types. For example, agricultural lands and villages are usually located in relatively flat areas, and some tree species are likely to appear in sunny- or shady-slope areas. Previous studies have explored the effectiveness of employing topographic factors to improve forest classification [19]. This research confirmed their important roles in improving forest classification. In addition, we found that topographic factors have different effects on the differentiation of forest types. As shown in Tables 5 and 6, topographic factors can improve classification accuracies of Masson pine, Chinese fir, and eucalyptus for different data scenarios, but may reduce accuracies for Chinese anise in data scenarios such as STZY2(10), ZY6(4), and STZY6(10). This also implies that direct use of topographic factors as extra variables may not be an optimal method. Suitable expert knowledge must be developed about the relationships between topographic factors and the distribution of specific forest types to enhance forest classification. With such knowledge, a hierarchically based approach that can effectively determine specific variables for extraction of forest types may be preferable [20].

The Importance of Using Multiple Data Sources to Improve Land-Cover and Forest Classification
Single-sensor remotely sensed data have limitations in spectral and spatial features and, thus, may not produce accurate land-cover classification, especially for forest types in subtropical regions with complex terrain and rich tree species. This research confirmed that proper integration of different data sources, such as spectral bands, textural images, and topographic factors, can considerably improve forest classification. However, it is necessary to consider the spatial and spectral resolutions when different data sources are combined, because their ability to improve classification performance may differ greatly. Differences in stand structure among forest types, especially for plantations such as eucalyptus in this research, are important features that can be exploited, and texture is one of the features that can reflect these differences. Previous research has indicated that proper use of spectral mixture analysis on Landsat multispectral imagery can improve forest classification [10,82]. Canopy features derived from Lidar and stereo imagery may also be incorporated into optical sensor data to improve forest classification [83-85]. More research is needed to explore how to effectively integrate different data sources in a classification procedure and which classification algorithm is optimal for forest classification based on multiple data sources.
In tropical and subtropical regions, cloud cover often hinders the collection of cloud-free optical sensor data. Thus, data from different sensors with various acquisition dates have to be used in practice. In this case, caution should be taken to reduce the effects of different acquisition dates on land-cover classification. As shown in Table 1, we used ZY-3 imagery acquired on 10 March 2018 and Landsat OLI imagery acquired on 1 February 2017. The one-year gap between the two images may influence the data fusion result because the fast growth of some tree species, such as eucalyptus in this research, may change tree crown size and forest canopy density, thus affecting forest reflectance in the optical sensor data. This difference may in turn affect forest classification results. Therefore, we paid close attention to the selection of training and validation samples to avoid the potential impacts of land-cover change (e.g., eucalyptus harvest) on the classification and accuracy assessment.
In addition to the careful selection of suitable input variables from different data sources (e.g., remotely sensed data, ancillary data), collection of a sufficient number of training and validation samples for each class is also critical for a successful land-cover classification. In general, samples are collected from field surveys or from visual interpretation of high spatial resolution images in Google Earth. Considering the expense and intensive labor of field work, effective use of existing open data sources is an alternative way to collect more samples. Crowdsourced data such as OpenStreetMap and Volunteered Geographic Information obtained through Citizen Science may be used for collection of training and validation samples [86,87]. This is especially important when multiple-source data are used as input variables for land-cover and forest classification using advanced machine learning algorithms such as deep learning [15,85,88]. More research is needed to design an optimal procedure to include multiple data sources as input variables for classification and to select useful samples from open data sources in addition to the field survey data.

Conclusions
This research explored land-cover classification with emphasis on forest types in a subtropical region through a comprehensive comparison of classification results based on multiple data sources (i.e., ZY-3, Sentinel-2, and Landsat 8 OLI). Data scenarios based on spectral, textural, and topographic data with spatial resolutions ranging from 2 m to 30 m were designed, and RF was used to conduct the classification. Major conclusions are summarized as follows:
(1) Spectral signature is more important than spatial resolution in land-cover and forest classification. High spatial resolution images with a limited number of spectral bands (i.e., only visible and NIR) cannot produce accurate classifications, but increasing the number of spectral bands in high spatial resolution images through data fusion can considerably improve classification accuracy. For instance, increasing the number of spectral bands from 4 to 10 increased overall land-cover classification accuracy by 14.2% based on 2-m spatial resolution and by 11.1% based on 6-m spatial resolution.
(2) The best classification scenario was STZY2(10) with SPTXTP, with overall land-cover classification accuracy of 83.5% and kappa coefficient of 0.8, indicating the comprehensive roles of high spatial and spectral resolutions and topographic factors. Overall, incorporation of both textures and topographic factors into spectral data can improve land-cover classification accuracy by 3.9-11.8%. In particular, overall accuracy increased by 11.4-11.6% in high spatial resolution images (2 m), whereas medium spatial resolution images (10-30 m) yielded only a 5.6-7.2% improvement.
(3) Textures from high spatial resolution imagery play more important roles in improving land-cover classification than textures from medium spatial resolution images. The incorporation of textural images into spectral data in the 2-m spatial resolution imagery raised overall accuracy by 6.0-7.7%, compared with only 1.1-1.7% in the 10-m to 30-m spatial resolution images. Incorporation of topographic factors into spectral and textural imagery can further improve overall land-cover classification accuracy by 1.2-5.5%, especially for the medium spatial resolution imagery (10-30 m), where accuracy improved by 4.3-5.5%.
(4) Integration of spectral, textural, and topographic factors is effective in improving forest classification accuracy in the subtropical region, but their roles vary, depending on the spatial and spectral data used and the specific forest types. Increasing the number of spectral bands in high spatial resolution images through data fusion is especially valuable for improving forest classification. Incorporation of textures into spectral bands can further improve forest classification, but textures from high spatial resolution images work better than those from medium spatial resolution images.
(5) Forest classification with detailed plantation types was still difficult even using the best data scenario (i.e., STZY2(10) with SPTXTP) in this research. The classification accuracies for Masson pine, Chinese fir, Chinese anise, and Castanopsis hystrix were only 64.8-70.7%, while the accuracies for coniferous forest, eucalyptus, other broadleaf forest, and bamboo forest could reach 85.3-91.1%, indicating the necessity of designing a suitable forest classification system. The roles of textures and topographic factors in improving forest classification vary, depending on specific forest types.
(6) More research is needed on selection of the proper combination of textural images and topographic factors corresponding to specific forest types, instead of overall land-cover or forest classes. A hierarchically based classification procedure that can effectively identify optimal variables for each class could be a new research direction for further improving forest classification based on the use of multiple data sources covering spectral, spatial, and topographic features and forest stand structures (e.g., from Lidar-derived height features).