Classifying Forest Types over a Mountainous Area in Southwest China with Landsat Data Composites and Multiple Environmental Factors

: Accurate information about forest type and distribution is critical for many scientiﬁc applications. It is possible to make a forest type map from the satellite data in a cost effective way. However, forest type mapping over a large and mountainous geographic area is still challenging, due to complex forest type compositions, spectral similarity among various forest types, poor quality images with clouds or cloud shadows and difﬁculties in managing and processing large amount data. Based on the Google Earth Engine (GEE) cloud platform, a method of forest types mapping using Landsat-8 OLI imagery and multiple environmental factors was developed and tested within Yunnan Province (about 390,000 km 2 ) of China. The proposed approach employed a pixel-based seasonal image compositing method to produce two types of seasonal composite images, i.e., four 7-spectral-band composite images and four 5-VI-band composite images associated in spring, summer, autumn, and winter. Then, single-season feature bands and multi-seasonal feature bands were combined with the feature bands of topography, temperature, and precipitation, respectively, and resulting in 17 feature combinations. Finally, using a random forest (RF) classiﬁer, 17 feature combinations were separately experimented to classify the forest type over the study area. The study area was ﬁrstly classiﬁed into the forest and the non-forest, and then the forest was sub-classiﬁed into ﬁve forest types (evergreen needleleaf forest, deciduous needleleaf forest, evergreen broadleaf forest, deciduous broadleaf forest, and mixed forest). The results showed that the pixel-based multi-seasonal median composite can produce a cloud-free image for the entire region and is suitable for forest type mapping. with a single-season composite, a multi-seasonal composite can distinguish different forest types more effectively. The environmental factors also improve the accuracy of forest type mapping. With the ground survey samples as reference values, the classiﬁcation performance of 17 feature combinations was compared, and the optimal feature combination was found out. For the optimal feature combination, its overall accuracy of the forest/non-forest cover map and the forest type map reached 97.57% (Kappa = 0.950) and 70.30% (Kappa = 0.628), respectively. The proposed approach has demonstrated strong potential of high classiﬁcation accuracy and convenient calculation when mapping forest types over a national or global scale, and its product of 30 m resolution forest type map is capable of contributing to forest resource management. validation, L.W. and G.O.; formal analysis, R.L., W.Z., and L.W.; investigation, G.O.; resources, L.W. and G.O.; data curation, R.L., P.F., and L.W.; writing—original draft preparation, R.L. and L.W.; writing—review and editing, L.W., W.Z., and X.H.; visualization, R.L. and P.F.; supervision, L.W. and W.Z.; project administration, L.W.;


Introduction
As the largest biological resource bank on the Earth, forests cover about 30% of the land surface and play an irreplaceable role in mitigating climate change, improving the environment, and ensuring ecological security [1]. Through increasing carbon storage and water holding capacity and decreasing soil erosion, the ecological effects of forests contribute to the biodiversity of the Earth [2]. At the same time, the ecological function of forest ecosystems varies with forest types, and the extent of different forest types is significant for the ecological effectiveness of forests [3]. Yunnan Province, located in the southwest of China, is rich in forest resources, and forest covers 65.04% of the whole area [4]. Because of the rugged terrain and diverse climates, Yunnan Province owns an extremely rich biodiversity and a wide variety of vegetation types in China [5]. Therefore, information about the distribution of different forest types in this area is important for a variety of forest research.
The ground survey is a traditional method to collect forest stand and composition information. Usually, forest field inventory data are collected and aggregated by measuring sampling plots, and therefore the final ground survey data are generally at the forest stand level [6]. However, most forest areas, especially in mountainous regions, are often remote and even inaccessible [7], and thus the ground survey is costly, time-consuming, and spatially restricted [6]. In recent decades, remote sensing technology has become an important and efficient approach to obtain forest spatial distribution information at the regional and national levels in a more efficient way and it allows a higher update frequency [6,8]. Thanks to the free access to Landsat and Sentinel, as well as rapid improvements in data-storage and computing capabilities, a Chinese forest type product (Forest2010) [9] and several global land-cover products (FROM_GLC2010 [10], FROM_GLC2015 [11], GlobeLand30 [12] and FROM_GLC10 [13]) at fine spatial resolution (30 m and 10 m) have been successfully developed. However, the current main source of 30 m forest type data is global land cover products, and these land cover products did not focus on forest type classification and do not make better use of phenological information, which limits the research and application of forest type classification.
In the research of forest type mapping, phenological characteristics of forests are often used as useful features to discriminate forest types. Phenology includes obvious processes such as the coloring of leaves in deciduous temperate forests in autumn due to leaf senescence, the intense green colors of fresh leaves and needles in springtime as well as flowering events [14]. The seasonal variation of visible spectral response in forest species and the phenological differences in senescence among tree species could present unique forest classification opportunities [15]. The research on forest type classification using phenological characteristics is mainly divided into three types: (1) Identify one forest type by the key phenology phases, which are extracted through analyzing the vegetation index time series of different land cover types [16]. (2) Map multiple forest types by collecting a set of cloudless images from specific phenological windows, which show different phenological characteristics for different types. For this type of research, it not only requires good prior knowledge about the phenological characteristics of different forests, but also faces challenges to acquire cloudless imagery for specific phenological events, especially in areas with frequent cloud cover [8,15]. (3) Map multiple forest types using spectral information from time series data. This type of research does not require image selection and good prior knowledge about the phenological characteristics of different forests. For instance, Li et al. used forest phenological characteristics from MODIS EVI time-series data to map forests of China [9]. However, the methods, which use dense timeseries data to map forest types over large area, usually only use one or several vegetation index time-series to classify forests without considering the original band features; and they will lose the multi-seasonal spectral band information that may be more effective for forest classification. In addition, when mapping forest types on a large scale, especially in large mountainous areas, clouds and rain often reduce image quality and then greatly impact the accuracy of forest type mapping.
In previous studies, topographic factors have been proven to improve the accuracy of forest type classification [6,17]. Topographic factors can be represented by DEM or variables derived from a DEM [18]. In addition, climate plays an important role in the dynamic change of forests, so the impact of climate factors on forest type could not be neglected especially in mountainous areas [8]. Among many climate factors, temperature and precipitation are very critical for local forest types [19]. However, climate factors were less considered in previous studies because climate factors with low spatial resolution, e.g., precipitation, cannot be measured or deduced easily. In addition, forest type mapping over a large scale requires powerful geo-big data processing capabilities. The production of products over a large scale implemented on a supercomputer or processor PC is timeconsuming and laborious and is not easy to update in time. Google Earth Engine (GEE) is a planetary-scale platform for large-scale Earth observation data and geographical analysis, and it can provide massive geospatial data and machine learning algorithms [20]. GEE has the entire Landsat archive along with many free raster datasets from NASA, European Space Agency (ESA), and other imagery [21]. The remote sensing cloud computing platforms can effectively solve limitations such as data collection, managing, and processing [20,22]. GEE has been a study platform for many studies [23][24][25][26][27][28][29]. There are also many studies on forest-type mapping based on GEE, such as extracting bamboo using Landsat time series [30] and mapping tree species using Sentinel-2 imagery [31]. However, there are still very limited ways to combine multiple environmental factors with a cloud platform to map forest types in large mountainous areas using multi-seasonal composites of one year of Landsat imagery [6].
In this context, to meet the needs of various fields for forest type maps over a large area, this study aims to explore methods of mapping forest types over large, complicated mountain areas by using one year of Landsat imagery and multiple environmental factors, based on the Google Earth Engine cloud platform. The study tries to address the impact of multi-seasonal and climate-related variables and determine the optimal feature combination for the forest type classification.

Study Area
As shown in Figure 1a, the study area is situated in the southwest of China, Yunnan Province. With a substantially large area of 390,000 km 2 and 16 administrative divisions, Yunnan Province stretches over more than 8 latitudes between tropical and subtropical climate zones. This region is featured as a mountainous plateau and has diversified terrain types. The whole terrain inclines from north to the south and southeast [32], with an elevation difference of more than 6000 m (Figure 1b). Different elevational belts can be found within the study area (montane, alpine and sub-alpine zone, and basin). The climate of Yunnan belongs to the plateau-type tropical monsoon climate, with annual average temperature of 15 • and annual mean precipitation of over 1000 mm [33]. However, the distribution of annual precipitation dramatically decreases from west to east, with rainy areas in the west and south and dry areas in the northwest (Figure 1d). Due to its complex and varied terrain, the inner local climates of Yunnan vary greatly, including hot-dry, tropical, and subtropical, temperate, and cold climates.  [34]); and (d) the annual total precipitation (OpenLandMap Precipitation monthly product [35]).

Landsat OLI Imagery
The research used the Landsat 8 surface reflectance (SR) Tier 1 images archived in the GEE platform (Table 1). According to the path/row of Landsat data, Yunnan Province is fully covered by 35 scenes of Landsat images. Given the frequent cloud coverage, especially in a mountainous region, all the available Landsat 8 images from 1 March 2017 to 1 March 2018 were collected to ensure that each pixel location has as much cloud-free data as possible. In total, 530 valid images were collected during the study period, with 155 images in spring, 91 images in summer, 133 images in autumn, and 151 images in winter ( Figure 2). Because Yunnan is located between tropical and subtropical zones and has a large area, there are few cloud-free Landsat images available in the growing season.   [34]); and (d) the annual total precipitation (OpenLandMap Precipitation monthly product [35]).

Landsat OLI Imagery
The research used the Landsat 8 surface reflectance (SR) Tier 1 images archived in the GEE platform (Table 1). According to the path/row of Landsat data, Yunnan Province is fully covered by 35 scenes of Landsat images. Given the frequent cloud coverage, especially in a mountainous region, all the available Landsat 8 images from 1 March 2017 to 1 March 2018 were collected to ensure that each pixel location has as much cloud-free data as possible. In total, 530 valid images were collected during the study period, with 155 images in spring, 91 images in summer, 133 images in autumn, and 151 images in winter ( Figure 2). cloudy from June to October in Yunnan that most images in this period have more than 50% cloud coverage and few have less than 10% cloud coverage and this makes it difficult to capture the phenological change of forests during the growing season, which usually last from June to September in Yunnan.

Environmental Factors
The growth of forest relies on environmental conditions, so forest types are closely related to environmental factors. Adding environmental factors into remote sensing classification is bound to improve the accuracy of forest type mapping. Many studies have shown that topographic data, especially elevation, slope, and aspect, could increase the estimation accuracy on the land cover when classifying with spectral bands. At the same time, temperature and precipitation are critical for the growth of forests. Some studies have shown that rising temperatures could lead to a significant advancement of the spring phenology of plants in temperate and frigid zones, leading to a prolonged plant growth season [36]. Precipitation indirectly affects forest phenology by changing accumulated temperature and radiation. Hence, environmental factors, including terrain factors  Because Yunnan is located between tropical and subtropical zones and has a large area, there are few cloud-free Landsat images available in the growing season. Figure 2 shows the number of images with different cloud cover percentage levels in each month from 1 March 2017 to 1 March 2018. Images with a cloud coverage of less than 10% are mainly from January to May and November to December. The five months from June to October are the rainy season in Yunnan, so the images available in this period are less than the rest of the months because some poor-quality images are taken as invalid. It is so cloudy from June to October in Yunnan that most images in this period have more than 50% cloud coverage and few have less than 10% cloud coverage and this makes it difficult to capture the phenological change of forests during the growing season, which usually last from June to September in Yunnan.

Environmental Factors
The growth of forest relies on environmental conditions, so forest types are closely related to environmental factors. Adding environmental factors into remote sensing classification is bound to improve the accuracy of forest type mapping. Many studies have shown that topographic data, especially elevation, slope, and aspect, could increase the estimation accuracy on the land cover when classifying with spectral bands. At the same time, temperature and precipitation are critical for the growth of forests. Some studies have shown that rising temperatures could lead to a significant advancement of the spring phenology of plants in temperate and frigid zones, leading to a prolonged plant growth season [36]. Precipitation indirectly affects forest phenology by changing accumulated temperature and radiation. Hence, environmental factors, including terrain factors (elevation, slope, and aspect), temperature, and precipitation were also considered in the study. However, it is worth noting that the temperature and precipitation data used in the study both are 1-km spatial resolution data products (Table 1). In addition, the temperature data were introduced from the MOD11A2 V6 product [34], which provides an average 8-day land surface temperature (LST). The monthly precipitation dataset used in the research was from OpenLandMap Precipitation monthly product [35]. Here, topographic data were derived from the Shuttle Radar Topography Mission (SRTM) with 30 m resolution [37]. In addition, the two environmental factors were obtained and resampled to 30 m resolution by the GEE platform.

Select of Sample Site
In order to have a reasonable coverage of samples, samples were collected from eight sites through the evenly distributed sampling scheme, as shown in Figure 3. The spatial locations of the eight sites are roughly located in the east, southeast, south, west, northwest, north, northeast, and central of Yunnan Province, and they include all climate types from the cold temperate zone to subtropical zone. There are significant differences in temperature, precipitation, and altitude among eight sites.
There are multiple climate types in the eight counties (cities) due to the complex geographic environment. In terms of temperature, the monthly average temperatures of site 5 and site 6 are significantly lower than other sites; in particular, the average monthly temperature is in the range of 12.9-22.2 • C; and the average monthly temperature of other sites is in the range of 14.2-31.4 • C. From Figure 3d, it can be said that the precipitation of site 2, site 3, and site 5 are higher than that of other regions in most months of a year; the monthly averages of precipitation of these three regions are extremely uneven in different seasons, with 208-324 mm in July and only 15-43 mm in February. Moreover, the monthly average precipitation of site 6, with a range of 3 mm (December)-186 mm (July), is lower than other sites. The climate of site 6 is relatively dry and cold. The eight sites also have an obvious difference in altitude. Compared with other sites, site 1 has the smallest altitude difference with an altitude range of 156 m-1834 m. In addition, the altitude differences between the lowest and highest altitudes at site 2, site 3, site 5, site 7, and site 8 are in the range of 1776-2974 m, and the altitude differences at site 4 and site 6 are 3434 m and 3894 m, respectively.
The eight sites include evergreen needleleaf forests, deciduous needleleaf forests, evergreen broadleaf forests, deciduous broadleaf forests, and mixed forests. The tree species of evergreen coniferous forests in Yunnan Province mainly include spruce (Picea asperata Mast.), fir (Abies fabr (Mast.) Craib), cypress (Cupressus funebris Endl.), Chinese white pine (Pinus armandii Franch.), Yunnan pine (Pinus yunnanensis), sikang pine (Pinus densata Mast.), khasi pine (Pinus kesiya var. langbianensis), and so on. Deciduous broadleaf forest is the most common forest type in the temperate zone. Evergreen broad-leaved forest is also called Sub-tropical evergreen broad-leaved forest because it occurs in the areas dominated by a subtropical monsoon climate [38,39]. The evergreen broad-leaved forests in Yunnan distribute from 800 to about 3000 m and most are abundant between 1100 and 2700 m. In Yunnan, evergreen broad-leaved forest almost covers the entire tropical mountains and the subtropical area, except for the alpine and subalpine areas above 3000 m asl in northwestern Yunnan [39]. The distribution of deciduous broad-leaved forests is widely distributed, mainly distributed in the area north of 23 • 39 N, but the area is small and sporadic [39]; this type of forest is all deciduous in winter, with obvious seasonal changes in the community physiognomy. Mixed forests occur in the transition zone between the evergreen broad-leaved forest belt and the subalpine coniferous forest, mainly in the mountainous areas of 2400 m and 3100 m [40]; this type of forest is concentrated on both sides of ditches and gentle slopes within a small area.

Methods
The methodological workflow of this study consists of the following steps ( Figure 4): (1) extracting seven spectral reflectance bands and five vegetation indexes for all the images; (2) extracting three environmental factors; (3) implementing seasonal image composite; (4) building 17 feature combinations using all existing features; (5) implementing forest/nonforest classification and forest type classification using RF classifier based on the 17 feature combinations; (6) accuracy assessment and comparison of 17 results; (7) improving forest type classification accuracy based on the feature combination with the best accuracy in (6); (8) analyzing the results using the information of RF feature importance and feature group ranking; (9) comparing the forest/non-forest map and the forest type map with four public reference products, Forest2010, FROM_GLC2015, GLC_FCS2020 and MCD12Q1.

Classification Scheme
Due to the diverse climate types and topographical conditions in Yunnan Province, its forest has a complex stand structure and environmental conditions, which makes it difficult to improve the accuracy of forest types classification. The spectral characteristics of forest types and other vegetation types (such as crops, shrubs and grasslands) are easy to be confused in the classification of land cover. In addition, the spectral characteristics of forest types and non-vegetation land cover types (such as bare land, construction land, water body, snow and cloud) are obviously different, which is easy to classify. To reduce the impact of other land cover types on forest classification, the study adopts two-level classification and distinguishes forests and non-forests firstly.
For the classification of forest type, the forest areas were further classified into 5 types, including evergreen needleleaf forest (ENF), deciduous needleleaf forest (DNF), evergreen broadleaf forest (EBF), deciduous broadleaf forest (DBF), and mixed forest (MF). The definitions of the five forest types are drawn from the International Geosphere-Biosphere Program (IGBP) classification schemes [41]. Table 2 lists the forest type classification scheme adopted by the study.

Classification Scheme
Due to the diverse climate types and topographical conditions in Yunnan Province, its forest has a complex stand structure and environmental conditions, which makes it difficult to improve the accuracy of forest types classification. The spectral characteristics of forest types and other vegetation types (such as crops, shrubs and grasslands) are easy to be confused in the classification of land cover. In addition, the spectral characteristics of forest types and non-vegetation land cover types (such as bare land, construction land, water body, snow and cloud) are obviously different, which is easy to classify. To reduce the impact of other land cover types on forest classification, the study adopts two-level classification and distinguishes forests and non-forests firstly.
For the classification of forest type, the forest areas were further classified into 5 types, including evergreen needleleaf forest (ENF), deciduous needleleaf forest (DNF), evergreen broadleaf forest (EBF), deciduous broadleaf forest (DBF), and mixed forest (MF). The definitions of the five forest types are drawn from the International Geosphere-Biosphere Program (IGBP) classification schemes [41]. Table 2 lists the forest type classification scheme adopted by the study.

Sampling Strategy
Supervised classification methods rely on training samples [42]. The influence of training sample size on classification accuracy is greater than that of the adopted algorithm [42,43]. Thus, the size and quality of training samples are the key to classification [42]. Representative samples were collected in the above eight counties (cities) (Figure 3) from Chinese Forest Management Inventory (FMI) data and high-resolution images from Google Earth. Chinese FMI is a ground survey of every ten years at forest stand level. FMI collects forest information about forest management type, dominant tree, average height, the average diameter at breast height, age, canopy density, topographic factors, and soil type, etc. [1,44]. FMI is extremely laborious and time-consuming, and the latest one was carried out in 2016. Here, it is assumed that there were no dramatic changes in the forest types from the year of FMI data (2016) to the year of acquisition of remote sensing images (March 2017-March 2018). Moreover, with exhaustive comparison with high-spatial-resolution images from Google Earth, the quality of reference samples can be guaranteed.
Based on the consideration of spectral characteristics of different classes involved in the classification scheme (Table 2), two levels of samples with different sample size were collected in the research. For the first level, the non-forest classes consisted of multiple land cover types, which involved multiple spectral. Therefore, samples of non-forest types are needed to fully mix the spectra of multiple land cover types, and a large sample plot with 300 m × 300 m was employed. For the forest-type classification level, the aim is the division of internal natural attributes of the forest land. The sample unit with a larger coverage area may lead to more mixed pixels, which represent the spectral mixture of different forest types. Hence, 30 m × 30 m sample pixels, rather than 300 m × 300 m, were involved in the classification of forest type.
The 300 m × 300 m sample plots were produced by buffering sample points that generated randomly and distributed evenly in the eight sites ( Figure 3a). Each sample plot was reviewed and categorized into the forest and the non-forest using high-resolution images from Google Earth. Sample plots that caused the problems with spectral confusion would be removed or manually modified. Sample pixels for five forest types were also generated randomly, and the forest type of each pixel was specified according to the dominant tree information of FMI. An internal of at least 90 m was set for adjacent sample pixels (3 pixels at least on Landsat 8 images). At the same time, sample pixels located within 600 m (20 pixels) of the adjacent area of different forest types were eliminated. The remaining sample pixels were also reviewed using field work data, high spatial resolution Google Earth images, and the Landsat NDVI time series curve. Unfortunately, it is not easy to check every forest type sample due to the huge quantity of samples and only parts of the forest type sample pixels were randomly selected for checking. Finally, the remaining sample plots and pixels were randomly divided into training and validation sets according to the ratio of 7 to 3, respectively (Table 3). There is no overlap between the training sample and validation sample sets.

Image Composite and Feature Extraction
Although the long-term plans for periodic and systematic data acquisitions have guaranteed large amounts of Landsat data available, the availability of Landsat data is not ideal for persistently cloud-contaminated areas, i.e., the mountain area, is less than ideal [45,46]. Most of the Landsat 8 images for Yunnan are not free of clouds, and images from June to September are practically affected by clouds and cloud shadows ( Figure 2). In this research, pixels affected by cloud and cloud shadows in each Landsat imagery (530 images in total) were identified and eliminated by Pixel Quality Assessment (QA) band from Landsat 8 SR collection [47][48][49]. The QA band generated from the CFMASK algorithm [48], and each pixel in the QA band contains a value that represents bit-packed combinations of conditions, such as cloud, cloud shadow, snow, and water, that can affect the given pixel [50]. Therefore, cloud and cloud shadow-affected pixels were masked.
After the elimination of cloud-affected pixels, seven spectral bands were derived for each image from the nine bands of original OLI image. Seven spectral bands are blue, green, red, NIR, SWIR1, SWIR2, and NIR_SWIR1. Because the spectral reflectance difference of different forest types in the NIR and SWIR1 is obvious, a compound band, NIR_SWIR1, is calculated by adding NIR and SWRI1. Phenology is a useful trait for forest type mapping by using the multi-temporal vegetation index approach [51]. The link between vegetation indexes (VIs) and vegetation phenological variability is more robust than that of original single spectral bands and vegetation phenological variability [52]. Therefore, five vegetation indexes, NDVI [53], EVI [54], SAVI [55], LSWI [56,57], and NBR2 [21] were calculated for each image using formulas shown in Table 4. Table 4. Vegetation indexes used in the study.

Index Abbreviation Formula
Normalized Difference Vegetation Index Then, a pixel-based multi-seasonal composite method was used to produce two types of seasonal composite images, i.e., four 7-spectral-band composite images and four 5-VIbands composite images corresponding to spring, summer, autumn and winter, respectively. Firstly, the time series were divided into four seasonal periods according to the seasonal characteristics of forest phenology and seasonal climate of Yunnan. Next, an annual composite was generated using all images in a year (from 1 March 2017 to 1 March 2018), and four seasonal composites for spring, summer, autumn, and winter were also respectively produced using temporal statistic operators. As for the production of four seasonal composites, if there was no observation due to the elimination of cloud-affected pixels, pixel values from the annual composite would be used instead. The temporal statistic operator, the mean, and median values of time series pixels were explored to perform the seasonal image compositing. Here, the seasonal pixel-based composite method was applied to the time series of 7-spectral-band images and the time series of 5-VI-band images, respectively. As a result, four seasonal 7-spectral-band composites (Spr_Ref, Sum_Ref, Aut_Ref, and Win_Ref) and four seasonal 5-VI-band composites (Spr_VI, Sum_VI, Aut_VI, and Win_VI) were generated in the GEE cloud platform, respectively (shown in Table 5). Table 5. Features, including four seasonal of spectral reflectance images and four seasonal vegetation index images, three terrain factors, temperature, and precipitation in 12 months, used in the research. The monthly mean temperature for 12 months. P_12s 12

Number of Feature Bands Description
The monthly mean precipitation for 12 months.
In addition, three commonly used terrain factors, elevation, slope, and aspect, were derived from the SRTM image. The average monthly temperature and average monthly precipitation data for 12 months were also produced from MOD11A2 Terra Land Surface Temperature and Emissivity and OpenLandMap Precipitation monthly data, respectively. The temperature and precipitation data were resampled from 1 km to 30 m resolution using nearest neighbor resampling to match the spatial resolution of other data. In summary, three types of feature sets, a total 75 bands, were derived, including four seasonal spectral bands sets, four seasonal vegetation index sets, and environmental factors ( Table 5). The code for the implementing multi-seasonal image composite is provided in Supplementary Materials.

Random Forest Classification
Random Forest (RF) is an integrated learning method based on a decision tree, which is combined with many ensembles regression or classification trees [58], and has become increasingly common in remote sensing applications due to its flexible, nonparametric nature and ability to limit overfitting [59,60]. Bremian found that this method performs better than many other classifiers, such as discriminant analysis, support vector machines, and neural networks [58]. Many studies have shown that the RF classifier can successfully process high-dimensional data with fast calculating speed and is insensitive to over fitting [61]. In the past 20 years, RF classifier has attracted much attention and has been widely used in the processing remote sensing images [21,61].
Different features make the final result of classification different in accuracy [62]. In this study, based on the RF classifier, 17 feature combinations were built to perform hierarchical classification. As shown in Table 6, 17 feature combinations can be included in several types, which consist of single season features (R SS , VI SS , and RVI SS ), multi-seasonal features (R 4S , VI 4S , and RVI 4S ), and the combination of multi-seasonal features and environmental factors (RVI 4S T and RVI 4S TTP). The hierarchical classification process was performed on GEE. In the first level, 17 feature combinations were separately carried out the forest/non-forest classification, and the accuracy of 17 forest/non-forest classification results were evaluated by the overall accuracy (OA) and Kappa. Then, the forest area of the result with the highest classification accuracy was taken as the final forest distribution area. In the second level, based on the forest cover produced from the first classification level, the forest was subdivided into five forest types ( Table 2). Forest type classification was also conducted on 17 feature combinations, respectively, and the classification accuracy of 17 results were also evaluated and compared by OA and Kappa.
The classification accuracy of RF classifier is insensitive to two adjustable parameters: the number of trees and the number of prediction variables [45,63]. Based on the balance between classification accuracy and computation time, a final RF model with 100 trees for forest/non-forest classification and 200 trees for forest type classification was set up in the study. The number of selected prediction variables were set to the square root of the number of variables for both forest/non-forest classification and forest type classification.

Feature Importance Assessment
Feature importance assessment is a vital step when high-dimensional datasets are used [64]. In the research, in addition to considering the impact of different feature combinations on forest classification, we also used "Gini importance" to find out which spectral, vegetation index, and environmental factor features were most important for classification of different forest types. "Gini importance" is a feature importance measure method embedded in Random Forest classifier [58]. The Gini importance score describes the relative importance of features after comparing the input features, and it corresponds to features that are consistently found more often and higher up in the split of individual decision trees [64]. Therefore, based on GEE, feature importance score was calculated by an additional RF classification with all the 75 features (The set Ref_VI 4S _TTP shown in Table 6) using "Gini importance".
Furthermore, although including more predictor variables potentially adds additional information to separate classes, this increased dimensionality and complexity may result in a decrease in classification accuracy [42]. Using only the most important features may result in higher classification accuracies than using all available features [8]. Even if the accuracy is not improved by only the most important features, some studies have shown that the loss of the overall classification accuracy is relatively small when the number of features was reduced, during mapping land cover with RF [42,65]. In addition, the reduction of predictor variables may help to simplify the model, shorten data processing times, and make the experiment reproducible. In this study, the Random Forest recursive feature elimination (RFRFE) algorithm was used to perform feature selection (the algorithm principle is demonstrated in [18,66]). We performed RFRFE on GEE.

Accuracy Evaluation and Comparison
Accuracy assessment of our resultant forest maps include two aspects: (1) evaluation by calculating confusion matrix with the test samples collected from FMI, and (2) visual and quantitate comparison with four public forest maps.
The confusion matrix is often used to evaluate the classification accuracy of remote sensing image, and it is able to identify the confusion among categories and the possible sources of errors. In this study, the confusion matrix was calculated by using the test samples that came from FMI, which is built through a field survey. The overall accuracy (OA), user accuracy (UA), producer accuracy (PA), commission error (CE), omission error (OE), and Kappa coefficient were calculated for the classification results of both levels.
Furthermore, the forest/non-forest classification result with the best OA and the forest type classification result with the best OA were selected to compare with four commonly used forest type products, including Forest2010 [9], FROM_GLC2015 [11], GLC_FCS2020 [67], and MCD12Q1 [68], to assess the quality of the forest type mapping product of this study. Among them, Forest2010 is a product focusing on mapping forest types over China, and the others are global land cover products with 30 m resolution and 500 m resolution. Forest2010 and FROM_GLC2015 also use IGBP classification schemes to define their six types of forest [9,11,68]. Compared to the five forest types in this study, Forest 2010 has one more type of bamboo forest, while FROM_GLC2015 divides mixed forest into evergreen mixed and deciduous mixed forests. GLC_FCS2020 has ten forest types according to forest canopy density and forest types (Table 7). In this study, the ten forest types of GLC_FCS2020 has been categorized into five forest types used in this study for comparison. MCD12Q1 includes a land cover classification scheme from IGBP classification, and this research used the MCD12Q1 products produced in 2017. The comparisons were conducted on the Yunnan Province as a whole and eight selected case regions (Figure 3a, red rectangle), respectively.

Results
In order to evaluate the effectiveness of the proposed method, many experiments were conducted. Firstly, the mean and median statistical operators were tested on four seasonal composites and a multi-seasonal composite image by comparing forest/non-forest classification accuracy and forest classification accuracy, to determine the optimal statistics operator (Section 3.1). Secondly, based on the optimal multi-seasonal image composite, by comparing the accuracies of hierarchical classification of 17 feature combinations, the feature combination obtained the best accuracy was fixed (Section 3.2). Thirdly, by performing feature importance analysis, an optimal feature subset for forest type classification were selected from the feature combination (Section 3.3). In addition, finally, the forest and the non-forest classification and forest type classification results were produced by using the optimal feature combination and subset, respectively (Section 3.4).

Input Seasonal Image Composite
As highlighted by Phan et al. [69], the choice of the image composite method can influence the accuracy of the classification process. Different statistical operation leads to different accuracy values at the end of the classification process. As mentioned in Materials and Methods, the mean and median statistical operators were tested on four seasonal composites with 7-spectral-band composites (Spr_Ref, Sum_Ref, Aut_Ref, and Win_Ref) and a multi-seasonal 28-spectral-band composite (R 4S ), and the two operators were compared by assessing the forest type classification accuracy of their corresponding composites. Table 8 reports the main results of two statistical operations on four seasonal composites. In the first level where forest and non-forest classification was performed, the median-based composites achieved higher accuracy than mean-based composites, and the best result of forest/non-forest classification was obtained using the median-based multi-seasonal composite image, with a Kappa of 0.945, and OA of 97.33%. As far as the forest type classification was concerned, the results of mean-based composites in spring, summer, and autumn obtained better accuracy than that of median-based composites; however, the median-based multi-seasonal composite performed better than mean-based multi-seasonal composite, with a Kappa of 0.442, and OA of 58.81%. Therefore, a median statistic operator was finally chosen to produce four seasonal 7-spectral-band composites and a multi-seasonal 28-spectral-band composite image.

Random Forest Classification Results
Different features lead to different accuracy values at the end of the classification process. To assess the best achievable accuracy, based on the RF classifier, 17 feature combinations were separately carried out the forest classification on GEE (Table 6). Table 9 compares the classification accuracy of all the experimental results with OA and Kappa coefficient. For the forest/non-forest classification, RVI 4S TTP obtained the best accuracy with OA and Kappa values of 97.64% and 0.952. Therefore, the forest/non-forest map from set RVI 4STTP was used as the basis of forest type classification. For the forest type classification, the best accuracy value was reached when using 75-feature set RVI 4S TTP again (71.14% and 0.638 of OA and Kappa, respectively). Using only single season features (R SS , VI SS , and RVI SS ) resulted in their OA and Kappa lower than 60% and 0.500. The usage of multi-seasonal spectral reflectance features (R 4S ) achieved an OA of 61.23%, Kappa of 0.513, and RVI 4S , which was formed by adding vegetation indexes to R 4S , displayed an OA to 63. 82%. Moreover, RVI 4S T, which had three additional terrain features (elevation, slope, and aspect) than RVI 4S , resulted in an obvious improvement in accuracy (65.28% and 0.564 of OA and Kappa, respectively), so it could be concluded that the additional environmental factors also improved the OA and Kappa of RVI 4S TTP.
The lowest accuracy found in the result from VI_sum which has five VI bands in summer, showing 44.54% and 0.304 of OA and Kappa, respectively.

Feature Importance
Feature importance score was used to find out which features contributed more in the classification of forest types. In addition, it was performed using the Gini criterion in the RF algorithm of GEE platform. Figure 5 shows the feature importance of 75 features in the classification of five forest types. The variables are ranked by the score of importance. According to feature importance score, vegetation index features and environmental features show significant importance. The top 10 features were six vegetation index features and four environmental features, and the elevation contributed the most to the classification, which is consistent with [6]. Among all the vegetation index features, the VI features in winter composite image and autumn composite image contributed the most to the classification of five forest types. Furthermore, the top 36 features were 14 vegetation index variables, 12 temperature features, six precipitation features, three terrain features, and only one spectral reflectance feature. Therefore, besides vegetation index features and terrain features, the monthly mean temperature and monthly mean precipitation over 12 months also show their importance. It could be found that spectral reflectance features (blue in Figure 5) show less importance since they mainly ranked in the last 30 variables.

Feature Importance
Feature importance score was used to find out which features contributed more in the classification of forest types. In addition, it was performed using the Gini criterion in the RF algorithm of GEE platform. Figure 5 shows the feature importance of 75 features in the classification of five forest types. The variables are ranked by the score of importance. According to feature importance score, vegetation index features and environmental features show significant importance. The top 10 features were six vegetation index features and four environmental features, and the elevation contributed the most to the classification, which is consistent with [6]. Among all the vegetation index features, the VI features in winter composite image and autumn composite image contributed the most to the classification of five forest types. Furthermore, the top 36 features were 14 vegetation index variables, 12 temperature features, six precipitation features, three terrain features, and only one spectral reflectance feature. Therefore, besides vegetation index features and terrain features, the monthly mean temperature and monthly mean precipitation over 12 months also show their importance. It could be found that spectral reflectance features (blue in Figure 5) show less importance since they mainly ranked in the last 30 variables. From the feature with the highest score to the one with the lowest score, RFRFE starts with all the features to execute forest type classification and recursively removes one insignificant feature each time. Figure 6 shows how the OA and Kappa in the forest type classification increase with the increase in the number of features. According to Figure 6, the highest accuracy was obtained when using 44 variables. The accuracy obviously increases from using four features (60.63% of OA) to using 16   From the feature with the highest score to the one with the lowest score, RFRFE starts with all the features to execute forest type classification and recursively removes one insignificant feature each time. Figure 6 shows how the OA and Kappa in the forest type classification increase with the increase in the number of features. According to Figure 6, the highest accuracy was obtained when using 44 variables. The accuracy obviously increases from using four features (60.63% of OA) to using 16 features used (70.03% of OA). After reaching the peak when 44 features are used (72.17% and 0.657 of OA and Kappa, respectively), the resulting differences among the following results are minor, with OA values oscillating between 69.33 and 72.17%. Considering achieving a better OA with a faster process speed on GEE, the forest type mapping result from 16 features were used. The 16 features are elevation, Winter_LSWI, Winter_NBR2, Winter_NDVI, Autumn_NDVI, Autumn_LSWI, Autumn_NBR2, LST_12, Slope, Precip_7, Precip_8, LST_7, Precip_4, Spring_NBR2, Summer_LSWI, Precip_6.  Figure 7 illustrates the final classification outputs of this study. The forest/non-forest classification results were finally produced by using 75 features since the highest accuracy was achieved when using 75 variables according to Table 9. Specifically, 28 features are multispectral bands from seasonal composite images in four seasons, 28 features are VIs from seasonal composite images in four seasons, 3 features are from terrain features, and 24 features are from monthly mean precipitation and monthly mean temperature over 12 months. The forest type classification results were finally produced by using 16 features, considering achieving a better OA and a faster process speed on GEE (depicted in Section 3.3). Through the evaluation of sufficient forest and non-forest sample plots that covers more than 150,000 pixels, the results show that these features can get 97.64% OA for distinguishing the forest and non-forest area (Table 10), and the class of forest also obtained high UA (97.72%) and PA (98.22%) for the class of forest. Therefore, the identified forest cover could be the basis of forest type classification. Figure 7a shows Figure 7 illustrates the final classification outputs of this study. The forest/non-forest classification results were finally produced by using 75 features since the highest accuracy was achieved when using 75 variables according to Table 9. Specifically, 28 features are multispectral bands from seasonal composite images in four seasons, 28 features are VIs from seasonal composite images in four seasons, 3 features are from terrain features, and 24 features are from monthly mean precipitation and monthly mean temperature over 12 months. The forest type classification results were finally produced by using 16 features, considering achieving a better OA and a faster process speed on GEE (depicted in Section 3.3).  Figure 7 illustrates the final classification outputs of this study. The forest/non-forest classification results were finally produced by using 75 features since the highest accuracy was achieved when using 75 variables according to Table 9. Specifically, 28 features are multispectral bands from seasonal composite images in four seasons, 28 features are VIs from seasonal composite images in four seasons, 3 features are from terrain features, and 24 features are from monthly mean precipitation and monthly mean temperature over 12 months. The forest type classification results were finally produced by using 16 features, considering achieving a better OA and a faster process speed on GEE (depicted in Section 3.3). Through the evaluation of sufficient forest and non-forest sample plots that covers more than 150,000 pixels, the results show that these features can get 97.64% OA for distinguishing the forest and non-forest area (Table 10), and the class of forest also obtained high UA (97.72%) and PA (98.22%) for the class of forest. Therefore, the identified forest cover could be the basis of forest type classification. Figure 7a shows   Through the evaluation of sufficient forest and non-forest sample plots that covers more than 150,000 pixels, the results show that these features can get 97.64% OA for distinguishing the forest and non-forest area (Table 10), and the class of forest also obtained high UA (97.72%) and PA (98.22%) for the class of forest. Therefore, the identified forest cover could be the basis of forest type classification. Figure 7a shows the final map result of forest/non-forest mapping in the southwest of Yunnan Province. We can see that the nonforest land covers, including urban, cropland, water, and grass types, are mainly located in the flat areas and valleys, while the forests are mainly located in the mountainous areas with an altitude of less than 4000 m. The area of forest cover in the southwest half of Yunnan is more than that of the northeast half, and this is consistent with the precipitation trend. Most of the forests are distributed in the mountainous areas where the precipitation is more than 400 mm. Table 11 reports the confusion matrix for the second stage classification. Forest type classification achieved an overall accuracy of 70.30% and a Kappa coefficient of 0.628. The PA of the five forest types ranged from 56.99% (mixed forest) to 95.44% (deciduous needleleaf forest), while UA ranged from 62.21% (mixed forest) to 94.49% (deciduous needleleaf forest). The identification of deciduous needleleaf forest is better than other forest types. Mixed forests had the highest CE (37.79%) and OE (43.01%), and it had been found to be mostly misclassified as evergreen needleleaf forest and evergreen broadleaf forest. In addition, there was confusion among evergreen coniferous forests and evergreen broad-leaved forests, probably due to these two forest types belong to evergreen forests, and the seasonal variation of phenology between the two forest types is relatively small.  Figure 7b shows the result of forest type classification. The forest type distribution map shows that needleleaf forests are mainly distributed in northwestern and northeast of Yunnan, and mixed forests are mainly distributed in southern Yunnan and the western boundary of Yunnan. It also displays that the forest types in Yunnan Province gradually change from coniferous forests in the northwest to broadleaf forests in the middle, and then to mixed forests in the south. Deciduous needleleaf forest is mainly distributed in the northwestern part of Yunnan Province and less in other regions. Comparing the forest type map (Figure 7b) to the elevation map of Yunnan (Figure 1b), it could be found that the transition of forest types is consistent with the transition trend of altitude.

Seasonal Composite Image
A key challenge in mapping forest type over a large region with remote sensed data is that the coverage of valid data is limited both time and in area due to the presence of clouds. Due to the complex topography and climate of Yunnan Province, there are more clouds and rain from June to October, which leads to fewer high-quality images (Figure 2). Except for the overlapping area of the adjacent scenes, the original observation frequency of 35 scenes of images covering Yunnan is in the range of 10 to 20 times from 1 March 2017 to 1 March 2018 (Figure 8a). However, after cloud elimination, the number of observations per pixel in a 12-month period is mostly in the range of 0 to 10 times (Figure 8b). Therefore, it is difficult to obtain a high-quality single-date image at every location of 35 scenes within a season, especially in summer and autumn (Figure 9a-d). In addition, for a large research area, it takes more time to select the images with less amounts of clouds.
Different from the traditional way, there is no valid image selection in the production of the median-based seasonal/annual composite images on GEE. All images in the 12-month period were used to produce the composite images, making sure all cloud-free pixels can be used. As shown from Figure 9e-h, four seasonal cloud-free images covering the whole area were generated using the pixel-based median composite method. The colors of spring, autumn, and winter composite images look much smoother across the whole area, while the summer composite image has obvious color boundaries. This is mainly because all the images for summer composition have clouds or cloud shadows in some specific areas, and thus the summer composite output values in these areas are derived from the annual composite image rather than the median values from the images for summer. As a result, the large spectral differences of some land covers between summer and other seasons make the color boundaries in the summer composite image. In summary, the seasonal image composite method using 12-month time series images offers an opportunity to overcome the limited availability of valid data.

Seasonal Composite Image
A key challenge in mapping forest type over a large region with remote sensed data is that the coverage of valid data is limited both time and in area due to the presence of clouds. Due to the complex topography and climate of Yunnan Province, there are more clouds and rain from June to October, which leads to fewer high-quality images ( Figure  2). Except for the overlapping area of the adjacent scenes, the original observation frequency of 35 scenes of images covering Yunnan is in the range of 10 to 20 times from 1 March 2017 to 1 March 2018 (Figure 8a). However, after cloud elimination, the number of observations per pixel in a 12-month period is mostly in the range of 0 to 10 times ( Figure  8b). Therefore, it is difficult to obtain a high-quality single-date image at every location of 35 scenes within a season, especially in summer and autumn (Figure 9a-d). In addition, for a large research area, it takes more time to select the images with less amounts of clouds.
Different from the traditional way, there is no valid image selection in the production of the median-based seasonal/annual composite images on GEE. All images in the 12month period were used to produce the composite images, making sure all cloud-free pixels can be used. As shown from Figure 9e-h, four seasonal cloud-free images covering the whole area were generated using the pixel-based median composite method. The colors of spring, autumn, and winter composite images look much smoother across the whole area, while the summer composite image has obvious color boundaries. This is mainly because all the images for summer composition have clouds or cloud shadows in some specific areas, and thus the summer composite output values in these areas are derived from the annual composite image rather than the median values from the images for summer. As a result, the large spectral differences of some land covers between summer and other seasons make the color boundaries in the summer composite image. In summary, the seasonal image composite method using 12-month time series images offers an opportunity to overcome the limited availability of valid data.    Figure 10 shows the spectral reflectance curves of five forest types in a summer singledate image and summer composite image. As shown in Figure 10, due to the median statistic operation for the composite image, the spectral reflectance values of five forest types from composite image are lightly lower than that of from the single-date image. In general, the spectral reflectance values of five forest types in the summer composite image are close to the corresponding values in the cloudless single-date image for five forest types. Analysis of these spectral curve suggests that the median-based seasonal composite values are credible.  Figure 10 shows the spectral reflectance curves of five forest types in a summer single-date image and summer composite image. As shown in Figure 10, due to the median statistic operation for the composite image, the spectral reflectance values of five forest types from composite image are lightly lower than that of from the single-date image. In general, the spectral reflectance values of five forest types in the summer composite image are close to the corresponding values in the cloudless single-date image for five forest types. Analysis of these spectral curve suggests that the median-based seasonal composite values are credible.

The Analysis of Different Feature Combinations
(1) The influence of different single-season composite images In the experiments of R SS , VI SS and RVI SS , four single-season composite images were separately used to classify forest types. Through the analysis of the OA and Kappa of the three sets of experimental results, it is found that the season of input image has a significant impact on the forest classification accuracy. All the experimental results from R SS , VI SS and RVI SS showed a similar pattern that the composite images from winter had the highest OA, followed by the autumn composite images, and the classification accuracy of the summer composite images were the lowest ( Table 9). The OAs of forest type classification using winter composite in R SS , VI SS and RVI SS are 8.82%, 9.54%, and 10.21%, higher than that of the summer composites, respectively (Table 9). Given the best season of composite image to extract each forest type may vary with forest type, the PA and UA of five forest types were compared in Figure 11. For deciduous needleleaf forests, composite images from autumn produced the highest PA and UA in all the single-season feature combinations. In addition, when classifying evergreen needleleaf forests, deciduous broadleaf forests, and mixed forests, composite images from winter produced the highest PA and UA in all single-season feature combinations. The results of this study are different from that of [18] who found forest species can be best separated using image in the growing season. One of the reasons is that the levels of forest type classification focused on the two studies are different. For example, we classified the forest area into five broad forest categories and [18] classified forest types in more detail and determined the forest types at the individual tree level. Another possible explanation for the inconsistent result is that the study areas adopted by the two studies have different climates characteries. Yunnan Province located in the tropical and subtropical climate zones and the phenological difference of different forest types are more obvious in the leaf-off season. For instance, the length of season life of broad-leaved forest is longer than that of needleleaf forest; therefore, the vegetation index value of broad-leaved forest decreases later than that of needleleaf forest. Furthermore, the vegetation index values of deciduous forests are significantly reduced in winter, while that of evergreen forests decreased only slightly in winter.

The Analysis of Different Feature Combinations
(1) The influence of different single-season composite images In the experiments of RSS, VISS and RVISS, four single-season composite images were separately used to classify forest types. Through the analysis of the OA and Kappa of the three sets of experimental results, it is found that the season of input image has a significant impact on the forest classification accuracy. All the experimental results from RSS, VISS and RVISS showed a similar pattern that the composite images from winter had the highest OA, followed by the autumn composite images, and the classification accuracy of the summer composite images were the lowest ( Table 9). The OAs of forest type classification using winter composite in RSS, VISS and RVISS are 8.82%, 9.54%, and 10.21%, higher than that of the summer composites, respectively (Table 9). Given the best season of composite image to extract each forest type may vary with forest type, the PA and UA of five forest types were compared in Figure 11. For deciduous needleleaf forests, composite images from autumn produced the highest PA and UA in all the single-season feature combinations. In addition, when classifying evergreen needleleaf forests, deciduous broadleaf forests, and mixed forests, composite images from winter produced the highest PA and UA in all single-season feature combinations. The results of this study are different from that of [18] who found forest species can be best separated using image in the growing season. One of the reasons is that the levels of forest type classification focused on the two studies are different. For example, we classified the forest area into five broad forest categories and [18] classified forest types in more detail and determined the forest types at the individual tree level. Another possible explanation for the inconsistent result is that the study areas adopted by the two studies have different climates characteries. Yunnan Province located in the tropical and subtropical climate zones and the phenological difference of different forest types are more obvious in the leaf-off season. For instance, the length of season life of broad-leaved forest is longer than that of needleleaf forest; therefore, the vegetation index value of broad-leaved forest decreases later than that of needleleaf forest. Furthermore, the vegetation index values of deciduous forests are significantly reduced in winter, while that of evergreen forests decreased only slightly in winter. Figure 11. The accuracy comparison of four seasonal single-season composite images on 7 spectralreflectance bands, 5 VI bands, and the combination of bands of spectral reflectance and VI, respectively. Figure 11. The accuracy comparison of four seasonal single-season composite images on 7 spectralreflectance bands, 5 VI bands, and the combination of bands of spectral reflectance and VI, respectively. Previous research has clearly demonstrated the effectiveness of using multi-seasonal images to distinguish among forest types [1,15,64,70]. It was found that that multi-seasonal feature sets consistently and significantly outperformed all single season feature sets across the result comparison between R 4S , VI 4S , RVI 4S and Ref SS , VI SS , RVI SS . For all the classifications using spectral reflectance features alone, vegetation index features alone, or a combination of spectral reflectance and VI features, their OAs showed that the combined use of multi-seasonal images features (R 4S , VI 4S , RVI 4S ) could achieve higher accuracy than using single-season image features alone (Ref SS  Furthermore, the strategy of using multi-seasonal images varies in different research. Ref. [1] selected images from the leaf-off, growing, and leaf-on season, and then extracted spectral reflectance and vegetation index features from these images. Ref. [15] screened the image from the period which is the best phenological state for the classification of different tree species; therefore, five images in different seasons were used. Ref. [64] collected all available Landsat observations from 1985 to 2015 and then extracted harmonic and phenology features from time series of all cloud-clear observations. For the forest type mapping over a large region, as suggested in [64], multi-seasonal feature composites tend to show more spatial variability compared to individual date images.

(3) The influence of spectral reflectance and vegetation index
The influence of spectral band and vegetation index on forest type classification accuracy is related to the season of the image. For example, the specific single-season spectral reflectance features in R SS performs better than their corresponding single-season vegetation index features in VI SS , and the OAs of four single-season experiments from set R SS are 1.81-3.58% higher than that of from VI SS . In contrast, the OA of multi-seasonal spectral reflectance composite (R 4S ) is 2.04% lower than that of multi-seasonal VI composite (VI 4S ). This indicates that multi-seasonal VI features are crucial for the forest type classification because multi-seasonal VI can reflect the seasonal phenology differences of different forests. In general, the combination of the spectral reflectance and vegetation index features has a better classification result than either of them alone, regardless of whether the combination is used in a single season or multiple seasons, i.e., RVI SS is better than R SS or VI SS , and RVI 4S is better than R 4S or VI 4S .

(4) The influence of environmental factors
In this research, three topographic features, elevation, slope, and aspect were chosen as features to classify forest. According to the feature importance score, elevation is the most important features among all features involved in forest type classification ( Figure 5). In the classification of five forest types, the addition of three terrain features into the feature combination for forest type classification resulted in an improvement of PA of 2.54-3.65%, and UA of 0.44-3.71% (Figure 12). Elevation and slope are also reported as important features in mapping tree species distribution in [1,8] as they can provide geographical characteristics. In addition, topographic data, especially elevation and slope, are frequently used in mapping land use and land cover over a regional or global scale [10,45,71]. Therefore, topographical features should also be considered when mapping global forest types.
Temperature and precipitation are crucial factors that cause the change of phenology with time [72]. Our results show that the features of precipitation and temperature characteristics also have obvious effects on improving forest type classification accuracy. Figure 12 compared RVI 4S and RVI 4S TTP, and it could be found that the addition of the features of precipitation and temperature factors brought a significant improvement of PA of 3.37-12.28%, and UA of 4.85-10.01% on evergreen broadleaf forest, deciduous broadleaf forest, and mixed forest. The accuracy increases of the classification results on evergreen needleleaf and deciduous needleleaf forest are not obvious. This is because the distribution of forest types is related to the distribution of the trend of temperature and precipitation and broad-leaved forests require more water and are more sensitive to the variation of precipitation. In the research, due to the lack of temperature and precipitation data with 30 m resolution over a large region, the 1-km temperature and precipitation data were to resample by the reproject function embedded in GEE. This processing might influence the classification accuracy. In this study, the impact of resampling method on forest type classification was not discussed. Temperature and precipitation are crucial factors that cause the change of phenology with time [72]. Our results show that the features of precipitation and temperature characteristics also have obvious effects on improving forest type classification accuracy. Figure 12 compared RVI4S and RVI4STTP, and it could be found that the addition of the features of precipitation and temperature factors brought a significant improvement of PA of 3.37-12.28%, and UA of 4.85-10.01% on evergreen broadleaf forest, deciduous broadleaf forest, and mixed forest. The accuracy increases of the classification results on evergreen needleleaf and deciduous needleleaf forest are not obvious. This is because the distribution of forest types is related to the distribution of the trend of temperature and precipitation and broad-leaved forests require more water and are more sensitive to the variation of precipitation. In the research, due to the lack of temperature and precipitation data with 30 m resolution over a large region, the 1-km temperature and precipitation data were to resample by the reproject function embedded in GEE. This processing might influence the classification accuracy. In this study, the impact of resampling method on forest type classification was not discussed.

Comparisons to Other Products
As mentioned in the Introduction, there are some 30 m resolution products with fined forest types, including Forest2010 [9], FROM_GLC2015 [11], GLC_FCS2020 [67], and MCD12Q1 [68]. The methods of forest type mapping or land use and cover mapping were compared, and there are four products adopting different strategy to select remote sensing data and features (Table 12). The early product, Forest2010, was derived from singledate Landsat images circa 2010 [9]. FROM_GLC2015 was produced after the launch and free access of Landsat 8, and therefore it could use multi-seasonal Landsat images [11]. FROM_GLC2015 chose one image with the least cloud for each season (spring, summer, autumn, and winter) and the vegetation growing season, so there are five multi-seasonal images for each path/row position, which can better distinguish vegetation types with similar spectral characteristics in a particular season and improve mapping accuracy. GLC_FCS2020 adopted a new data selection method, time series Landsat data, which uses all Landsat imagery from specific period [45,67]. With the advancement of computer computing ability, the production of forest type mapping over large areas has also shifted from local computers to cloud computing platforms. Taking FMI as "true" data, the forest/non-forest classification and the forest type classification results of this study were

Comparisons to Other Products
As mentioned in the Introduction, there are some 30 m resolution products with fined forest types, including Forest2010 [9], FROM_GLC2015 [11], GLC_FCS2020 [67], and MCD12Q1 [68]. The methods of forest type mapping or land use and cover mapping were compared, and there are four products adopting different strategy to select remote sensing data and features (Table 12). The early product, Forest2010, was derived from single-date Landsat images circa 2010 [9]. FROM_GLC2015 was produced after the launch and free access of Landsat 8, and therefore it could use multi-seasonal Landsat images [11]. FROM_GLC2015 chose one image with the least cloud for each season (spring, summer, autumn, and winter) and the vegetation growing season, so there are five multi-seasonal images for each path/row position, which can better distinguish vegetation types with similar spectral characteristics in a particular season and improve mapping accuracy. GLC_FCS2020 adopted a new data selection method, time series Landsat data, which uses all Landsat imagery from specific period [45,67]. With the advancement of computer computing ability, the production of forest type mapping over large areas has also shifted from local computers to cloud computing platforms. Taking FMI as "true" data, the forest/non-forest classification and the forest type classification results of this study were further compared with four reference products, respectively.

Comparisons on Forest/Non-Forest
The forest/forests classification result of this study were compared with the reference products in two aspects. On one hand, the comparison was made at the administrative division level of Yunnan Province; and on the other hand, it was carried out on the eight selected case regions selected from Forest Management Inventory. Because the four reference products have many forest types and non-forest types, all the non-forest types of each of the four products were combined into one type named as non-forest, and different forest types were combined as forest.
Yunnan Province has a total of 16 administrative divisions. This study calculated the forest area of each division in the products of this study and the four references. Taking FMI data as a standard, the forest area of 16 divisions in each product were respectively compared to FMI. As shown in Figure 13, the estimated forest area of this study had the best correlation with FMI with R 2 = 0.9816, followed by Forest2010 with R 2 = 0.977 and GLC_FCS2020 with R 2 = 0.9742. FROM_GLC2015 has the lowest correlation with FMI with an R 2 = 0.8341.  The forest/forests classification result of this study were compared with the reference products in two aspects. On one hand, the comparison was made at the administrative division level of Yunnan Province; and on the other hand, it was carried out on the eight selected case regions selected from Forest Management Inventory. Because the four reference products have many forest types and non-forest types, all the non-forest types of each of the four products were combined into one type named as non-forest, and different forest types were combined as forest.
Yunnan Province has a total of 16 administrative divisions. This study calculated the forest area of each division in the products of this study and the four references. Taking FMI data as a standard, the forest area of 16 divisions in each product were respectively compared to FMI. As shown in Figure 13, the estimated forest area of this study had the best correlation with FMI with R 2 = 0.9816, followed by Forest2010 with R 2 = 0.977 and GLC_FCS2020 with R 2 = 0.9742. FROM_GLC2015 has the lowest correlation with FMI with an R 2 = 0.8341. Besides the above quantitative assessment, Figure 14 showed the comparisons of the classification performance of each product with FMI over eight case regions (Figure 3a), which covers various climate and landscape environment. The forest/non-forest map of this study has great performance in Regions 3, 4, 5, 6, and 8. In Regions 1, 2, and 7, there Besides the above quantitative assessment, Figure 14 showed the comparisons of the classification performance of each product with FMI over eight case regions (Figure 3a), which covers various climate and landscape environment. The forest/non-forest map of this study has great performance in Regions 3, 4, 5, 6, and 8. In Regions 1, 2, and 7, there are some obvious non-forest cover lands that were classified as forest in this study. As for Forest 2010, it performs well in Regions 4, 6, 7, and 8, but its forest distribution is significantly sparser than FMI in Regions 1, 2, and 3, and some of the forest on the 8 side of the its Region 5 has not been identified. The distribution of forests in FROM_GLC2015 is significantly more than FMI, which is obvious in Regions 1, 2, and 7. Compared with FMI, FLC_FCS2020 also has great performance in Regions 4, 5, 6, and 8, and forest distribution is significantly more than FMI in Regions 1, 2, and 7, and the details of Region 3 are not as good as our products. Because of the coarse resolution, MCD12Q1 shows less detail in all eight case regions. Based on the above comparisons, it could be found that the forest/non-forest mapping result of this study is superior to other existing products, and therefore it can be used as the basis for forest type classification.

Comparisons on Forest Types
By comparing the distribution of forest types of different products over the whole Yunnan Province, it can be found that the differences between different products are very large [73]. Due to the lack of FMI statistical data on the distribution area of different forest types in the Yunnan Province, we only compared the classification details of this study and the reference products in eight case regions. Figure 15 shows that: (1) In the results of this study, Regions 3, 4, and 8 have achieved satisfactory classification results, while Region 2 and Region 7 have serious confusion between DBF and MF, the classification in Region 5 is not as good as the result in Forest2010, and, in Region 6, some ENF are divided into EBF.
(2) As for Forest2010, it works well in Region 5 and Region 8, but the difference between Forest2010 and FMI is obvious in the other six regions. In addition, there is no deciduous needleleaf forest in Yunnan Province in Forest2010. (3) The pattern of FROM_GLC2015 is relatively fragmented, which is quite different from FMI. The most obvious differences between FROM_GLC2015 and FMI occur in Regions 2, 3, 4, and 8. In Region 2, many pixels with deciduous broadleaf forest are classified as evergreen broadleaf forest or evergreen deciduous forest. In Region 3, the mixed forest is not distinguished. In Region 4, the mixed forest is classified as evergreen broadleaf forest. In Region 8, evergreen coniferous forest is classified as mixed forests. (4) GLC_FCS2020 works well in Region 6 and Region 8, and obvious difference exists in other six regions by comparing with FMI. (5) MCD12Q1 product has poor visual effects due to its low resolution. It could be found that the forest product of this study is better than that of others by comparing with FMI. This study and Forest 2010 are forest products that focus on forest type mapping. FROMGLC 2015 and MCD12Q1 are land cover products that focus on the classification of all land cover types, and forest cover or forest types is one of all types. Therefore, the results of this study and Forest2010 are more accurate that FROM_GLC2015 and MCD12Q1. Since FMI is a ground survey result, it is taken as a correct data for comparison. Overall, the result of this study has a better performance.
In addition, the samples in this study are distributed in eight sites instead of the entire Yunnan Province because it is difficult to obtain samples from the entire study area. As a result, the classification is inevitably affected by the local distribution of samples. Therefore, we will use samples evenly distributed in the entire study area for further research. GEE provides great convenience for massive data managing, processing, and analysis. The large amount of calculation that originally took several days can be completed in a few hours, which greatly saves work time. The using of Landsat time-series imagery and GEE provides a new direction for the improvement of large-scale forest type mapping.

Conclusions
In the research, a method of forest types mapping using Landsat-8 OLI imagery and multiple environment factors has been developed and tested within Yunnan Province of China, a large mountainous region with landscape of high geographic and climate heterogeneity. The approach employs a seasonal image compositing methodology adapted for incomplete annual data availability and frequent cloud cover. By comparing with a single-date image, the multi-seasonal image composite method could generate a cloud-free image over the whole mountainous region, Yunnan Province. The results of comparative experiments, based on a two-step classification strategy combining a feature combination procedure, confirmed that the multi-seasonal features can distinguish forest types more accurately than only single-seasonal feature. Seasonal characteristics quantify meaningful spectral and temporal variability in the complex forested landscape of Yunnan Province. Our results also demonstrated that adding environmental factors can significantly improve the classification accuracy of forest types. The feature combination of multi-seasonal spectral reflectance features, and vegetable index features, and three environmental factors (topography, temperature data, and precipitation data) obtained the highest accuracy on both forest/non-forest and forest type classification. Through the method of this study, an updated annual forest type distribution product with a resolution of 30 m can be obtained. In addition, benefitting from the powerful computing abilities and archive of Google Earth Engine cloud platform, the production of the products in this study is convenient and conducive to the real-time update of future data. Based on Google Earth Engine, the approach has the potential to be used in other areas and applied to remotely sensed images from different sources to map forest types at regional or global scales.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy restrictions.