Tree Species Distribution Change Study in Mount Tai Based on Landsat Remote Sensing Image Data

This study was conducted in the Mount Tai state-owned forest farm. Landsat multispectral remote sensing data from 2000 and 2016 were processed on the GEE (Google Earth Engine) platform, and four images per year were selected according to phenological period. Samples of tree species distribution for 2000 and 2016 were obtained from the 2000 map of the current state of forestry resources and from the 2016 field survey data, respectively. After topographic correction with the empirical rotation model, the random forest (RF) classifier was used to classify tree species in the 2000 and 2016 images, with overall accuracies of 0.76 and 0.77, respectively. The results showed that, over 16 years, the share of pine species in the forest decreased from 55.69% to 50.22%, a decline of 5.47 percentage points. The share of black locust (Robinia pseudoacacia) increased from 10.15% in 2000 to 13.75% in 2016, a gain of 3.60 percentage points. The area of Quercus also increased. These results reflect the expansion of black locust.


Introduction
Tree species classification is an essential task in forest management [1]. Investigating tree species distribution involves a considerable workload and is time-consuming if it relies only on manual work. Because remote sensing technology can promptly collect large volumes of surface information, it has been used to classify tree species for the past 40 years [2]. In recent years, many studies have explored remote sensing methods for improving tree species classification accuracy [3][4][5][6][7][8][9][10][11].
Although the application of remote sensing technology to forest tree species classification is relatively mature, few studies have analyzed changes in forest tree species by comparing classification results from two periods. There are objective reasons for this. First, the classifier needs sample data; thus, the availability of historical data becomes a key limiting factor. Second, the satellite data that are widely used at present may be too recent to study long-term changes in forest tree species. For instance, QuickBird was the first commercial satellite in the world to provide sub-meter resolution. Its imagery reaches a resolution of 0.61 m, which is a great advantage in tree species classification. However, QuickBird was launched in 2001; thus, even if ground survey sample data from before 2001 were available, QuickBird data could not be used to classify historical tree species. By contrast, the first Landsat satellite was launched in 1972. Despite the lower spatial resolution of 30 m, Landsat data are more suitable for studying changes in tree species distribution over a longer period when accurate ground survey sample data are available. Although few studies have so far used Landsat data for tree species classification, some scholars have tried. For example, Chi et al. [12] used Landsat 8 OLI (Operational Land Imager) imagery data and the normalized difference vegetation index to distinguish the distribution of tree species in Changting County, Fujian Province, China.
Mount Tai Forest Farm was selected as the research site. Within this farm, many forest farm professionals and technicians believe that the area of black locust expanded a lot in recent years, while the proportion of oil pine (Pinus tabulaeformis) is decreasing year by year. Some experts even believe that black locust should be listed as an ecological invasive species in Mount Tai. In practice, technicians often identify the impact of tree species change and take corresponding measures based on subjective experience, but the accurate area and location of the tree species change remain to be further investigated. Therefore, adopting multi-temporal remote sensing images and existing classification techniques is vital to the study of tree species distribution change, and the analysis of the driving mechanism of tree species distribution change is significant for the formulation of corresponding forest management measures.
Based on Landsat image data, this study used the random forest method on the GEE platform to classify the tree species in the study area from multi-temporal images of 2000 and 2016. The study therefore addressed not only tree species classification but also long-term changes in tree species distribution. It used Landsat image data, which are not widely used in tree species classification, and showed that such data are feasible for tree species classification when reliable sample data are available.

Study Area
Mount Tai state-owned forest farm was the research area, and it is mainly located in Tai'an, Shandong Province, China, with a total area of 11,732.96 ha. Small parts of Mount Tai state-owned forest farm are also distributed in Changqing District and Licheng District of Jinan, Shandong Province, China.
Tai'an city belongs to the warm temperate continental semi-humid monsoon climate zone, with four distinct seasons. It is dry and windy in spring, hot and rainy in summer, sunny and airy in autumn, and cold and snowless in winter. According to the 2017 statistical yearbook of Tai'an, the annual average sunshine hours of the city were 2120 h, 193 h less than in former years (2313 h); the annual average temperature was 14.6 °C, 1.0 °C higher than in former years (13.6 °C); the annual average rainfall was 621 mm, 6.5% less than in former years (664 mm). In 2017, the annual sunshine hours on the top of Mount Tai were 2564 h, 100 h less than in former years (2664 h); the annual average temperature was 6.9 °C, 1.0 °C higher than in former years (5.9 °C); the annual rainfall was 905 mm, 13.5% less than in former years (1046 mm). The temperature changes markedly with altitude. The foot of the mountain is a warm temperate zone, and the top of the mountain is a middle temperate zone.
According to the data provided by Mount Tai Forest Farm, the main type of vegetation in Mount Tai is temperate coniferous forest, which is also the most representative vegetation type, accounting for 66.4% of the total area of forest vegetation in Mount Tai, and it is composed of oil pine forest, Chinese arborvitae (Platycladus orientalis) forest, Japanese red pine (Pinus densiflora) forest, Japanese black pine (Pinus thunbergii) forest, Armand pine (Pinus armandii) forest, and lace-bark pine (Pinus bungeana) forest in order of proportion; the deciduous broad-leaved forest in the warm temperate zone mainly includes sawtooth oak (Quercus acutissima) forest, Black locust forest, truncate-leaved maple (Acer truncatum) forest, miscellaneous wood forest, and orchard, accounting for 33.2% of the forest area. There is no typical cold temperate coniferous forest vegetation in Mount Tai, but there are some cold temperate tree species introduced from other places, mainly including Japanese larch (Larix kaempferi), Korea pine (Pinus koraiensis), and Mongol scotch pine (Pinus sylvestris var. mongolica), accounting for 0.4% of the total area of Mount Tai forest vegetation. In the forest farm, all kinds of tree species are irrigated mainly by natural rainfall, and there is basically no artificial irrigation; thus, the weather conditions have a great impact on the growth of trees. The main objective of this study was to distinguish main tree species, including Pines, Quercus, Chinese arborvitae, and black locust, and to assess the change in tree species distribution over a time span of 16 years.
The stratum of Mount Tai is mainly composed of mixed rock, mixed granite, and gneiss. The soil types are mainly brown soil, cinnamon soil, and mountain meadow soil. There are 11 subordinate management districts in Mount Tai Forest Farm. Every year, forestry technicians in each subordinate management district tend the forest and conduct related investigations. However, the investigations mainly focus on pest outbreaks and rarely cover the entire forest farm.
The location of the Mount Tai state-owned forest farm in China is shown in Figure 1.

Data Source and Pretreatment
Landsat 7 ETM+ (Enhanced Thematic Mapper Plus) and Landsat 8 OLI Data
In this study, tree species classification was based on GEE, which means that multi-spectral Landsat image data from 2000 and 2016 were obtained from the GEE platform. Following the approach of Hościło [13], four images for each year were used to reflect different phenological periods. For 2016, the images were acquired on 26 March, 16 July, 18 September, and 5 November. The first image (26 March) was taken in early spring, when the broadleaf trees had just begun to turn green; the second (16 July) was taken in summer, when the canopy was very dense and deep green; the third (18 September) was taken in early autumn, when the leaves of some trees were gradually changing color; the fourth (5 November) was taken in early winter, when most of the broadleaf trees had become discolored or shed their leaves.
To compare with the 2016 data and identify the pattern of change in tree species distribution in Mount Tai, Landsat 7 ETM+ images from 2000 were used. The remote sensing observation conditions of Landsat 7 ETM+ and Landsat 8 OLI are basically the same, and their satellite parameters such as altitude, inclination, coverage period, and swath width are the same. Considering these variables, the two sets of image data of 2000 and 2016 in the study area could be used for comparison. The corresponding bands of Landsat 7 ETM+ and Landsat 8 OLI used in the study are shown in Table 1. In addition, the Landsat 7 ETM+ and Landsat 8 OLI data products downloaded from the GEE platform were surface reflectance data, which had been geometrically and atmospherically corrected.
The vector boundary of the study area was uploaded to the GEE platform. Through masking, the remote sensing image of the study area was obtained.

Sample Data
The map of current forestry resources in each subordinate management district of Mount Tai Forestry Farm in 2000 was generated from historical data. Combining the boundary of each subordinate management district with the ridges, valleys, and landmarks in the image, we marked the distribution area of each tree species. Taking the area as the sample unit, we obtained 235 training samples and 78 validation samples. The samples for 2000 are shown in Figure 2.
Sample data for 2016 were drawn from the forest resources survey conducted in 2016. Taking the forest sub-compartment as the unit, the survey contained information on forest conditions such as tree species structure, dominant tree species, and forest age, as well as geographic information such as slope, aspect, and soil conditions. In China, the sub-compartment is the basic unit of forest resource protection planning, investigation, statistics, and management. Sub-compartments should be delimited along obvious terrain and object boundaries as far as possible, taking into account the needs of resource investigation and management. In general, the area of a sub-compartment in forest land is between 0.067 ha and 25 ha. The survey data were imported into the GIS (geographic information system) platform. By simultaneously extracting the tree species structure (pure or mixed forest) and the dominant tree species, we obtained the sample data of the four main tree species (pines, Chinese arborvitae, Quercus, and black locust) in Mount Tai. The samples were divided into 328 training samples and 163 validation samples. The samples for 2016 are shown in Figure 3.

DEM (Digital Elevation Model) Data
The correction for topographic illumination should be considered as a standard pre-processing step for land-cover classification and land-use change detection, especially for mountainous areas [14]. In order to calculate parameters in the terrain correction model, the 30-m resolution DEM data provided by NASA (National Aeronautics and Space Administration) JPL (Jet Propulsion Laboratory) were used in this study [15].

Research Method
Topographic Illumination Correction
Illumination condition (IC) is the basis of all reflectivity compensation correction models. IC is a proportional relationship determined by the cosine of the angle between the solar zenith and the normal line of the slope. It is defined as follows [16]:

$$IC = \cos\theta_z \cos\theta_s + \sin\theta_z \sin\theta_s \cos(\phi_z - \phi_s)$$

where $\theta_z$ is the solar zenith angle, $\theta_s$ is the topographic slope angle, $\phi_z$ is the solar azimuth angle, and $\phi_s$ is the surface slope direction. Four phases of images in 2000 and 2016 were fused on the GEE platform, and the image was corrected by the empirical rotation model [17]. The model is defined as follows:

$$L_H(\lambda) = L_I(\lambda) - a\,(IC - IC_H)$$

where $L_H$ is the corrected reflectance (for a horizontal surface), $\lambda$ represents a specific wavelength, $L_I$ is the observed reflectance on the inclined surface, $a$ is the slope of the linear regression for a specific wavelength, $IC$ ranges from −1 (minimum illumination) to 1 (maximum illumination), and $IC_H$ is the IC for a horizontal surface, which is equal to the cosine of the solar zenith angle.
Several topographic illumination correction models are in common use, and some researchers have compared their advantages and disadvantages. Tan et al. [14,16] compared the empirical rotation model with the cosine and C models and found that the rotation model performed consistently well for both top-of-atmosphere and top-of-canopy Landsat reflectance data. In this study, we found that, after topographic illumination correction by the empirical rotation model, the images of the study area had a good visual effect, and the band standard deviation was reduced. The topographic illumination corrected images in 2000 and 2016 are shown in Figure 4.
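The correction described in this subsection can be sketched in a few lines. This is a minimal illustration, not the processing chain run on GEE: it assumes the standard cosine form of IC, the rotation model written as L_H = L_I − a(IC − IC_H) with IC_H = cos(θ_z), and a regression slope a fitted by ordinary least squares. The function names, terrain values, and reflectances are hypothetical.

```python
import math

def illumination_condition(zenith_deg, slope_deg, sun_azimuth_deg, aspect_deg):
    """Cosine of the angle between the sun direction and the slope normal."""
    tz = math.radians(zenith_deg)
    ts = math.radians(slope_deg)
    d_az = math.radians(sun_azimuth_deg - aspect_deg)
    return math.cos(tz) * math.cos(ts) + math.sin(tz) * math.sin(ts) * math.cos(d_az)

def regression_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def rotation_correct(reflectance, ic, zenith_deg):
    """Apply L_H = L_I - a * (IC - IC_H) to one band, with IC_H = cos(zenith)."""
    a = regression_slope(ic, reflectance)
    ic_h = math.cos(math.radians(zenith_deg))
    return [l_i - a * (v - ic_h) for l_i, v in zip(reflectance, ic)]

# Hypothetical pixels given as (slope, aspect) in degrees; sun at zenith 40, azimuth 150.
zenith, azimuth = 40.0, 150.0
terrain = [(0, 0), (20, 150), (20, 330), (35, 90), (10, 200)]
ic = [illumination_condition(zenith, s, azimuth, asp) for s, asp in terrain]
observed = [0.05 + 0.30 * v for v in ic]  # synthetic band that brightens with IC
corrected = rotation_correct(observed, ic, zenith)
```

In this synthetic case the observed reflectance depends linearly on IC, so the corrected values collapse to the single value a flat pixel would have. Real bands scatter around the regression line, and the correction removes only the linear trend.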

Tree Species Classification
The RF classifier was used for classification in this study. RF is a non-parametric machine learning algorithm based on decision trees. Each decision tree is trained on a bootstrap sample (drawn from the training data with replacement), and its error is estimated from the samples left out of that draw, known as the OOB (out-of-bag) error. At each node, RF does not consider all the variables to determine the optimal split threshold; instead, it uses a random subset of the original feature set, which yields a large number of weakly correlated decision trees. RF can achieve good classification results with relatively little training data. Belgiu and Dragut [17] summarized the application of RF in remote sensing and confirmed that the RF classifier is less sensitive to training sample quality and to overfitting than other streamlined machine learning classifiers, because a large number of decision trees are randomly generated. Naidoo et al. [18] combined LiDAR and hyperspectral data to study eight tropical savanna tree species in Kruger National Park and concluded that RF was the most practical method for tree species classification in a highly heterogeneous environment.
Feature selection aims to retain the most informative feature variables. Considering that different vegetation types have different vegetation index values in different phenological periods, this study extracted the normalized difference vegetation index (NDVI) and the enhanced vegetation index (EVI) as features. Combined with spectral characteristics, that is, the mean and standard deviation of seven bands, 16 feature variables were generated.
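The bootstrap and out-of-bag idea can be illustrated without any machine learning library. The sketch below only shows how one tree's training draw and its out-of-bag set are formed; it is not a random forest implementation, and the function name and seed are hypothetical. The sample count of 328 is the number of 2016 training samples reported in this paper.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in chosen]
    return in_bag, out_of_bag

rng = random.Random(0)                   # fixed seed so the sketch is repeatable
in_bag, oob = bootstrap_split(328, rng)  # 328 training samples (2016)
oob_fraction = len(oob) / 328
```

For large sample sizes, roughly 37% of the samples are expected to fall outside any single bootstrap draw. Each tree can therefore be checked against samples it never saw during training.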
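One way to arrive at 16 variables is sketched below. It assumes that the count is 7 band means plus 7 band standard deviations plus NDVI and EVI; the text does not spell out this breakdown. It also assumes the statistics are taken across the four image dates and that the indices are computed from the band means. The band names are placeholders for the seven bands of Table 1, and the reflectance values are invented.

```python
from statistics import mean, pstdev

# Placeholder names for the seven bands; the actual band list is given in Table 1.
BANDS = ["blue", "green", "red", "nir", "swir1", "swir2", "band7"]

def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)

def evi(nir, red, blue):
    """Enhanced vegetation index with the standard coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)

def pixel_features(observations):
    """observations: one dict per image date, mapping band name to reflectance."""
    feats = {}
    for band in BANDS:
        values = [obs[band] for obs in observations]
        feats[band + "_mean"] = mean(values)
        feats[band + "_std"] = pstdev(values)
    feats["ndvi"] = ndvi(feats["nir_mean"], feats["red_mean"])
    feats["evi"] = evi(feats["nir_mean"], feats["red_mean"], feats["blue_mean"])
    return feats

# Hypothetical reflectances of one pixel on four dates (spring to early winter).
dates = [
    {"blue": 0.04, "green": 0.06, "red": 0.07, "nir": 0.22, "swir1": 0.20, "swir2": 0.12, "band7": 0.03},
    {"blue": 0.03, "green": 0.07, "red": 0.04, "nir": 0.40, "swir1": 0.18, "swir2": 0.08, "band7": 0.03},
    {"blue": 0.03, "green": 0.06, "red": 0.05, "nir": 0.33, "swir1": 0.19, "swir2": 0.09, "band7": 0.03},
    {"blue": 0.04, "green": 0.06, "red": 0.08, "nir": 0.20, "swir1": 0.21, "swir2": 0.13, "band7": 0.03},
]
features = pixel_features(dates)
```

The standard deviation across dates captures how strongly a pixel changes through the phenological cycle. That seasonal change is the signal that separates deciduous broadleaf trees from evergreen conifers.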
Based on the training sample data, we set the parameters of the classifier and trained the classifier using the training data of 2000 and 2016. Then, we classified the images of these two years.

Accuracy Evaluation
The accuracy of the classification results was verified with the validation samples. The confusion matrix method was used to analyze the results. The classification accuracy is expressed as overall accuracy and the kappa coefficient. The overall accuracy is the ratio of correctly classified pixels to all validated pixels, expressed as a percentage. The kappa coefficient measures how much better the classification agrees with the validation data than a random assignment would.
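Both measures follow directly from the confusion matrix. The sketch below uses the usual definitions: overall accuracy as the diagonal sum over the total, and Cohen's kappa as (p_o − p_e)/(1 − p_e). The 4 × 4 matrix is hypothetical and is not the matrix obtained in this study.

```python
def overall_accuracy(matrix):
    """Share of validation pixels on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def cohen_kappa(matrix):
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p_o = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical confusion matrix (rows: reference, columns: predicted) for
# pines, Chinese arborvitae, Quercus, and black locust.
confusion = [
    [30, 3, 2, 1],
    [4, 25, 3, 2],
    [2, 3, 28, 3],
    [1, 2, 4, 26],
]
oa = overall_accuracy(confusion)
kappa = cohen_kappa(confusion)
```

Kappa is never larger than the overall accuracy, because it discounts the agreement that the class proportions alone would produce by chance.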

Comparison of Classification Results
The classification results of 2000 and 2016 were loaded onto the GIS platform and converted from raster to vector format, and the transfer matrix of the two classification results was calculated. We created a new attribute field, compared the tree species code in 2000 with that in 2016, kept the changed areas, and deleted the unchanged areas, so as to obtain the areas where tree species changed. It is worth noting that, since the classification was based on remote sensing images using the random forest classifier, 100% accuracy is an ideal state that cannot be achieved; thus, the overlay map of tree species changes may contain some very fragmented patches, especially at the boundaries between different tree species. These patches may be caused by misclassification in one of the two classifications. Therefore, this study considered that the tree species in these fragmented areas did not change and deleted them.
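The transfer matrix can also be computed directly on the classified pixels. The sketch below is a simplified, raster-based stand-in for the vector overlay described here, not the GIS workflow itself. It assumes co-registered 30 m pixels, so each pixel covers 0.09 ha. The label lists are hypothetical, and the removal of fragmented patches is not reproduced.

```python
from collections import Counter

PIXEL_AREA_HA = 0.09  # one 30 m x 30 m Landsat pixel = 900 m^2 = 0.09 ha

def transition_matrix(labels_2000, labels_2016, pixel_area_ha=PIXEL_AREA_HA):
    """Area (ha) for every (class in 2000, class in 2016) pair."""
    counts = Counter(zip(labels_2000, labels_2016))
    return {pair: n * pixel_area_ha for pair, n in counts.items()}

def changed_only(matrix):
    """Keep only the pairs whose class differs between the two dates."""
    return {pair: area for pair, area in matrix.items() if pair[0] != pair[1]}

# Hypothetical class labels for six co-registered pixels.
labels_2000 = ["pine", "pine", "pine", "locust", "quercus", "pine"]
labels_2016 = ["pine", "locust", "locust", "locust", "quercus", "quercus"]

matrix = transition_matrix(labels_2000, labels_2016)
changes = changed_only(matrix)
```

Reading along the pine entries in this toy example shows how much pine area stayed pine and how much moved to each other class. The changed-area map summarizes the same information spatially.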

Classification Accuracy
Validation against the validation samples gave the following results: for 2016, the overall accuracy was 0.77 and the kappa coefficient was 0.70; for 2000, the overall accuracy was 0.76 and the kappa coefficient was 0.72.

Classification Results
Based on the GEE platform, the tree species distribution map of the study area was obtained by the RF classification algorithm. The classification results in terms of area and percentage of each tree species are shown in Table 3. The species distribution map is shown in Figure 5. As can be seen from Figure 5 and Table 3, the locations of the four main tree species did not change significantly. Pine species were widely distributed over the whole study area; Chinese arborvitae was primarily distributed in the northernmost and southern parts of the study area; Quercus species were mainly distributed in the north, center, and south of the study area; black locust was sparsely distributed in the south, center, and north of the study area.
The changes in area and percentage of the four main tree species can be seen in Table 3. The species that changed most were pines and black locust. After 16 years of forest evolution, the share of pines decreased from 55.69% in 2000 to 50.22% in 2016, an area reduction of 641.58 ha. The share of black locust increased from 10.15% in 2000 to 13.75% in 2016, an area gain of 421.85 ha. This reflects the expansion of black locust. The area of Chinese arborvitae decreased slightly. The area of Quercus species increased slightly, with its share rising by 1.77 percentage points.
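The reported percentage changes and area changes can be cross-checked against each other. The sketch below assumes that the percentages are expressed relative to the total forest farm area of 11,732.96 ha given in the Study Area section. The classified area underlying Table 3 may differ slightly, so this is a consistency check rather than a reproduction of the table.

```python
TOTAL_AREA_HA = 11732.96  # total area of the forest farm (Study Area section)

def share_change(pct_start, pct_end):
    """Change in area share, in percentage points."""
    return round(pct_end - pct_start, 2)

pine_change = share_change(55.69, 50.22)    # pines, 2000 to 2016
locust_change = share_change(10.15, 13.75)  # black locust, 2000 to 2016

# Reported area changes expressed as a share of the assumed total area.
pine_area_share = 641.58 / TOTAL_AREA_HA * 100
locust_area_share = 421.85 / TOTAL_AREA_HA * 100
```

Under this assumption the two sets of figures agree to within rounding: 641.58 ha is about 5.47% of the total area, and 421.85 ha is about 3.60%.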

Result Comparison
The change in tree species in the study area with a span of 16 years is shown in Figure 6.

Discussion
The results show that pines and black locust changed most among the main tree species in the study area. In the late 1950s, limited by technical conditions, plantations in the study area were dominated by pure forests, among which pines, especially oil pine, accounted for a large proportion. Over the long term, however, the area of pine species decreased, while broadleaf species, especially black locust, expanded. The 16-year change in tree species distribution in the Mount Tai state-owned forest farm shows that the area of black locust expanded continuously: it increased by 421.85 ha, and its share of the area rose by 3.60 percentage points. The share of pines was 55.69% in 2000; although pines still occupied more than half of the area in 2016, their share fell by 5.47 percentage points. Meanwhile, black locust, as a broadleaf species, grows fast and generally shades coniferous species such as pines, affecting their growth. To deal with the invasion of black locust, technicians in the forest farm tend to overcut it. However, because of its strong capacity for root regeneration, black locust regrows denser and thinner after cutting, which seriously affects the landscape effect and ecological function. Therefore, overcutting black locust forests is not recommended in forest management. In contrast, if the invasion of black locust leads to the formation of mixed forests, the likelihood of pests and diseases could be far lower. The drivers of black locust expansion should therefore be a focus of future research on Mount Tai Forest Farm. This study analyzed these drivers from several aspects.
Firstly, regarding the community environment, Mount Tai is no exception to global warming. Zhang et al. [19] compared and analyzed climate observation data of Mount Tai from 1981 to 2015 and concluded that the annual average temperature over those 35 years showed an overall upward trend, while precipitation showed a decreasing trend. In general, pines adapt less well than black locust to drought and low rainfall in spring and to continuous high temperature in summer. Black locust is a drought-tolerant, fast-growing tree, and its plasticity in the form of ecophysiological and morphological adaptations to drought is an important precondition for its successful growth in dry areas [20]. The investigation by Mantovani [21] showed that drought stress increased the nodule biomass of black locust in order to maintain biological nitrogen fixation and to counteract the lower soil nitrogen availability. The biological nitrogen fixation of drought-stressed trees could be maintained at relatively higher values compared to the well-watered trees. The average leaf nitrogen content varied between 2.8% and 3.0% and was not influenced by the drought stress. Carbon fixation, carbon allocation, and biological nitrogen fixation were to some extent balanced at low irrigation and allowed black locust to cope with long-term water constraints. On the other hand, oil pine, a species native to China, is widely distributed in Mount Tai. Many scholars in China have studied the effect of water stress on oil pine. For example, Zhao et al. [22] studied the physiological responses of oil pine seedlings to water stress and discussed the drought resistance mechanism. They indicated that enzyme activity decreased when the water stress was excessive.
Secondly, from the perspective of interspecific competition, our investigation found that the thorns of black locust can destroy the apical dominance of pines and keep them from growing tall. Because pines are light-demanding, long-term shade in areas mixed with black locust is bound to affect their growth. According to the investigation, black locust and Quercus expanded more strongly on the sunny slope of the mountain, while pines grew better on the shady slope. This is closely related to the light preference of black locust and Quercus.
Thirdly, in terms of diseases and insect pests, pines were the most seriously affected of the four main tree species. Pine caterpillars, bumblebees, Alternaria alternata, and other pests mainly attack pines, which weakens their growth [23].
The plantation structure in Mount Tai forest farm was relatively simple. This situation is common in many state-owned forest farms in China. From a historical point of view, the plantations established in the 1950s resulted in rapid greening, but they still have certain defects compared with natural forests. However, how they developed over a long period can guide future forest management to some extent. For example, there are four standard sample plots in the Mount Tai state-owned forest farm, which are surveyed annually, recording the changes in every tree's height and diameter. However, the area of the four standard plots is about 667 m², which is too small for Mount Tai Forest Farm. It is urgent that the survey be extended to the whole forest farm. For this purpose, remote sensing technology building on sample plot surveys is crucial. In fact, there is a standard sample plot in Mount Tai Forest Farm, located in the Cherry Garden subordinate management district. Twenty years ago, oil pines accounted for 80% of its total area and black locust accounted for 20%, but now the proportions of the two tree species are completely reversed. This dramatic change in tree species in Mount Tai demonstrates the significance of this study.
The primary challenge of tree species classification via remote sensing technology is that, even for different tree species, the reflection in the image represents the spectral characteristics of vegetation; thus, there is little difference in the reflectance value on the image. At present, hyperspectral images have strong advantages in tree species classification, and they can provide accurate classification results for large areas based on relatively less training data [24]. Using hyperspectral data was proven to be a feasible way of classifying forest tree species [25][26][27][28][29]. High-resolution images are also widely used in tree species classification [30][31][32]. In addition, LiDAR data were successfully used to distinguish different tree species [33][34][35][36][37].
Landsat remote sensing images, as medium-resolution and multi-spectral images, are not as widely used in tree species classification as hyperspectral images, high-resolution images, and LiDAR images. However, some scholars acquired better classification results based on Landsat images combined with other analysis methods [12]. The advantage of Landsat itself is its longer history. Since the launch of Landsat 1 on 23 July 1972, the Landsat satellite was applied in forestry. In 1976, Hall [38] and others used Landsat data to classify the national forests of northeastern California into four coniferous tree types. In addition, Landsat 8 OLI narrows the wavelength range of several spectral bands, making them more sensitive [39,40]. These achievements are positive for extracting tree species distribution information. Landsat data are optimal in the identification of forest species distribution changes in the study area, especially the long-term changes.
Landsat has many incomparable advantages over other image data in terms of availability and application scope. However, we cannot ignore its biggest drawback, that is, spatial resolution. Without accurate sample data, the spatial resolution of 30 m really hinders the study of tree species classification. In fact, this study only categorized four main tree species, mainly because these four main tree species occupy most of the area of the Mount Tai state-owned forest farm. However, in addition to these four main tree species, there are a few economic forests in Mount Tai. In addition, oil pine, Japanese black pine, Japanese red pine, and others are also included in Mount Tai. However, it is very difficult to distinguish them in detail using Landsat images.
Landsat data have advantages in studying the historical changes of tree species distribution; however, with the development of remote sensing hardware, Landsat data will not be the best choice for studying changes in tree species distribution in the future. In contrast, the Sentinel-2A satellite, for example, was launched in 2015 with a 10-m spatial resolution and a 10-day revisit period. Among optical data, Sentinel-2A data are the only data with three bands in the red-edge range, which is very effective for monitoring vegetation health information. In addition, Sentinel data are also freely available.
As to the selection of classification methods, traditional methods can hardly meet the demands of classification accuracy. There are increasing discussions on tree species classification. For example, Matsuki et al. [1] combined spectral features with tree-crown features and input the data into a support vector machine (SVM) classifier pixel by pixel, and they divided the data of Tomo Forest Science Park in Tokyo, Japan into 16 tree species. Dalponte et al. [41] proposed a new semi-supervised support vector machine classifier for the study of the classification of tree species at the single crown level. Lindberg et al. [42] proposed a new clustering approach to delineate tree crowns in three dimensions (3D) based on ellipsoidal tree crown models, and this approach was aimed at deriving information on the understory vegetation. Zou et al. [43] proposed a new voxel-based deep learning method to classify tree species in 3D point clouds collected from complex forest scenes. These researchers achieved good classification results of their cases, but the generalization of these methods still needs further research. Fassnacht et al. [2] suggested that the future application of remote sensing technology in tree species classification research should not only focus on its effectiveness under certain conditions, but also on its limitations under other circumstances.
No matter which classifier is used, it needs the support of sample data. Accurate sample data are the basis of better classification results. The sample data used in this paper in 2000 were from the historical data of Mount Tai Forest Farm. The sample point distribution of various tree species was determined by combining historical pictures with the ridge lines, valley lines, and other key features in the image. The sample data of 2016 were from the field survey of Mount Tai Forest Farm. Therefore, the sample data of this study were reliable and solid.
At the early stage of this study, we intended to extend the research time scale, as we could obtain Landsat image data in the 1980s, but this approach failed because the field survey data at that time could not be verified, as only a small amount of text description currently remains. As a result, the time span of this study was only 16 years. If the earlier field survey data could be obtained, it would be feasible to use the Landsat image to analyze the change in tree species over a more extended period.

Conclusions
(1) As medium-resolution remote sensing data, Landsat data have a critical application in the analysis of tree species distribution change in the study area because of their advantages in terms of accessibility and launch time. By combining with accurate sample data, ideal tree species classification results can be obtained.
(2) The random forest classifier was used to classify tree species in the study area, achieving ideal results. On the basis of topographic correction of the images in the study area, 16 feature variables were extracted to obtain the classification results in 2000 and 2016.
(3) After 16 years of forest change, the area of each tree species changed in the study area. The most noticeable changes were the decrease in pine species and the increase in black locust species, which reflected the expansion of black locust.