Mapping Torreya grandis Spatial Distribution Using High Spatial Resolution Satellite Imagery with the Expert Rules-Based Approach

Rapid expansion of Torreya forests in the mountainous region in Zhejiang Province in the past three decades has produced many environmental problems such as soil erosion and poor water quality, requiring an update of its spatial distribution in a timely way. However, to date there are no suitable approaches available for mapping Torreya forest distribution, especially the new Torreya plantations, due to the complex landscapes. This research used high spatial resolution Chinese Gaofen (GF-1) and Ziyuan (ZY-3) satellite images and digital elevation model (DEM) data to extract old Torreya forests and new Torreya plantations using a newly proposed expert rules-based approach. Different variables such as spectral bands, vegetation indices, textural images, and DEM-derived variables were examined, and separability analyses of different land covers were explored. An expert rules-based approach was developed for the extraction of old Torreya forests and new Torreya plantations. The accuracy assessment using field survey data and Google Earth images indicates that this newly-proposed approach can effectively distinguish both old Torreya forests and new Torreya plantations from other land covers with producer’s accuracies of 84% and 92%, and user’s accuracies of 77% and 85%, respectively, much better classification accuracies than the maximum likelihood classifier. This new approach may be used for other study area for extracting Torreya forest distribution. This research provides valuable data sources for better managing existing Torreya forests and planning potential Torreya expansions in this region in the near future.


Introduction
Torreya grandis (hereafter Torreya), an evergreen coniferous tree species, has over thousand years of cultivation history [1][2][3], and plays an important role in improving economic conditions for local farmers in Zhejiang Province [4,5].Because of their high economic value, Torreya plantations have been expanded rapidly in the past three decades [4].However, rapid expansion of Torreya plantations has resulted in many environmental problems such as soil erosion, polluted waterbodies, biodiversity loss, and land degradation due to improper management and cultivation [5,6].It is an urgent task to accurately map Torreya spatial distribution to provide fundamental resources for making better decisions for management of existing Torreya plantations and for planning of potential expansion in near future.
Although Torreya's contribution in improving farmer's economic condition has been highly recognized, its spatial distribution and areas are not clear because of lack of a suitable approach to extract its distribution, especially the young and new plantations [1,3].The area statistics are from the reports of local farmers and companies under investigation, but they cannot accurately represent its real situation.The most accurate approach to obtaining the spatial distribution and ecological characteristics at tree species and landscape scales is based on field survey, but it is only feasible for a small area considering the time-consuming and labor-intensity.As remote sensing data are available from the past four decades, many classification approaches have been developed to map land cover spatial distribution [7].In particular, since high spatial resolution satellite images such as IKONOS, QuickBird, Pleiades, and Worldview have been readily available since the early 21st century, mapping Torreya forests at the local and regional scales becomes feasible [8][9][10][11][12] considering that patch sizes of Torreya plantations vary from less than hundred square meters to several hundred hectares according to our field survey in 2015.
Extraction of individual tree species such as Torreya, hickory and rubber plantations is often difficult due to its similar spectral features with other forest types [13][14][15].Fassnacht et al. [16] provide a comprehensive review of tree species classification from different types of remotely sensed data.Many previous studies have explored the approaches to map individual tree species or plantations (e.g., rubber, hickory, oil palm, and coffee) using different remote sensing data [13,14,[17][18][19][20][21][22][23].Selection of suitable remote sensing data is important for a successful classification of tree species.Hyperspectral (e.g., Hyperion, AVIRIS (Airborne Visible/Infrared Imaging Spectrometer)) and high spatial resolution images (e.g., QuickBird, Worldview) are commonly used [24][25][26][27][28].Because of the different characteristics of various sensor data such as optical, radar, and lidar data, much research has shifted to the combined use of multiple sensor data [29][30][31][32].Also, much research attempts to identify a suitable approach to classify specific tree species through comparison of different algorithms [25,33,34].The common approaches can be grouped as pixel-based, subpixel-based, and object-oriented [7].To date, there are no universal approaches that can be used for classification of different tree species because of the big difference in tree structures (e.g., tree crown size, leaf shape and size, growing stages) and different characteristics of various sensor data.
Depending on specific tree species, the phenology information is very useful and often used in tree species classification [35].For example, rubber is a deciduous tree species, and integration of different seasons of optical sensor data has been proven valuable in improving rubber extraction accuracy [13,16,19].Another important feature of plantations is the forest stand structure in which plantations have more homogeneity of canopy cover than other forest covers.Thus, use of spatial features based on texture and segmentation approaches and sub-pixel information using spectral mixture analysis has been proven effective in improving mapping accuracy [15,[36][37][38].In particular, spatial information in the high spatial resolution images is valuable, thus textures and object-oriented approaches are often used [8][9][10][11][12][13]17,39,40].For example, the object-oriented approach has been used for mapping Torreya forest distribution with high spatial resolution images such as IKONOS [8][9][10][11][12].However, these studies only focused on mapping old Torreya forest distribution with high canopy density without taking the new and young Torreya plantations into account.
In reality, mapping young and new plantations is especially critical because of its important role in improving economic conditions in the near future and because of low canopy density, resulting in serious environmental problems such as soil erosion and biodiversity loss.However, to date there is no suitable approach for accurately extracting young and new Torreya plantations because of the complex landscapes which consist of small-size tree crowns, bare soils, grass, shrubs, and shadows.These complexities make classification accuracy poor if pixel-based classification algorithms are directly used based on training sample plots [7].Our initial experiments using the object-oriented approach also indicate the difficulty in extracting young and new Torreya plantation.The main reason may be the great heterogeneity of spectral signatures and largely different patch sizes in the young and new plantations, resulting in difficulty in determining optimal parameters for producing optimal image segmentation.Therefore, the objective of this research is to develop a new approach based on expert rules to accurately map old Torreya forest and new Torreya plantation distribution using high spatial resolution satellite images.

Study Area and Unique Characteristics of Torreya Plantations
A typical region of Torreya forests with approximately 100 km 2 in Zhuji County, Zhejiang Province was selected as the study area (Figure 1).This region is a core area of Torreya cultivation with top ranking in its yield and quality in China [1,6,41].The Torreya forest is a pillar industry of the town's agricultural economy.The study area has different land cover types, such as bamboo forest, evergreen and deciduous forests, shrub, cropland, bare soil, water, and impervious surfaces.Torreya is generally distributed in the regions with slope ranges of    , having moist soil condition.Torreya has strong ability to adapt to low temperature [6].This region belongs to subtropical monsoon climate with annual temperature of 12.3-16.1 • C and with hot and humid summer.The terrain is undulating (Figure 1b,c) with an average altitude of about 390 m.
optimal parameters for producing optimal image segmentation.Therefore, the objective of this research is to develop a new approach based on expert rules to accurately map old Torreya forest and new Torreya plantation distribution using high spatial resolution satellite images.

Study Area and Unique Characteristics of Torreya Plantations
A typical region of Torreya forests with approximately 100 km 2 in Zhuji County, Zhejiang Province was selected as the study area (Figure 1).This region is a core area of Torreya cultivation with top ranking in its yield and quality in China [1,6,41].The Torreya forest is a pillar industry of the town's agricultural economy.The study area has different land cover types, such as bamboo forest, evergreen and deciduous forests, shrub, cropland, bare soil, water, and impervious surfaces.Torreya is generally distributed in the regions with slope ranges of 10°~40°, having moist soil condition.Torreya has strong ability to adapt to low temperature [6].This region belongs to subtropical monsoon climate with annual temperature of 12.3-16.1 °C and with hot and humid summer.The terrain is undulating (Figure 1b,c) with an average altitude of about 390 m.Torreya forests have very different stand structures, densities, and land cover composition (Figure 2), depending on ages and growing seasons.The age of Torreya forests has huge variations, from as old as hundreds and even over one thousand years for old Torreya forests and as young as Torreya forests have very different stand structures, densities, and land cover composition (Figure 2), depending on ages and growing seasons.The age of Torreya forests has huge variations, from as old as hundreds and even over one thousand years for old Torreya forests and as young as just planted recently with various patch sizes [1,41].Old Torreya trees have large and irregular crown sizes [41].During the harvest season of Torreya fruits, understory was completely removed for the sake of access and easy pick-up of nuts (see Figure 2a), but in the no-harvest seasons, dense grass and shrub covers may co-exist underneath Torreya forests.In the young or new Torreya plantations, grass and shrub covers may or may not exist, depending on different management ways [6], as shown in Figure 2b-d.The big difference between old Torreya forests and new Torreya plantations is their canopy structure and density, including tree height, crown size, and grass/shrub coverage.The majority of new Torreya plantations have little grass and shrub coverage (see Figure 2d) because the owners removed all vegetation covers during the preparation of planting.According to our communication with the Torreya owners during our field survey in 2015, the owners also removed the grasses and shrubs three or four times every year for the sake of reducing nutrient competition between Torreya and other vegetation types and for the convenience of nut collection during the harvest.The unique bioecological characteristics and management activities of Torreya plantations have certain advantages for field survey but produce a challenge to using remote sensing data due to their complex landscapes in different stages and geolocations.
just planted recently with various patch sizes [1,41].Old Torreya trees have large and irregular crown sizes [41].During the harvest season of Torreya fruits, understory was completely removed for the sake of access and easy pick-up of nuts (see Figure 2a), but in the no-harvest seasons, dense grass and shrub covers may co-exist underneath Torreya forests.In the young or new Torreya plantations, grass and shrub covers may or may not exist, depending on different management ways [6], as shown in Figure 2b-d.The big difference between old Torreya forests and new Torreya plantations is their canopy structure and density, including tree height, crown size, and grass/shrub coverage.The majority of new Torreya plantations have little grass and shrub coverage (see Figure 2d) because the owners removed all vegetation covers during the preparation of planting.According to our communication with the Torreya owners during our field survey in 2015, the owners also removed the grasses and shrubs three or four times every year for the sake of reducing nutrient competition between Torreya and other vegetation types and for the convenience of nut collection during the harvest.The unique bioecological characteristics and management activities of Torreya plantations have certain advantages for field survey but produce a challenge to using remote sensing data due to their complex landscapes in different stages and geolocations.

Field Data Collection and Organization
Fieldwork was conducted in September 2015.During the fieldwork, the coordinates for each plot were recorded and corresponding photos were taken using a camera with a global positioning system (GPS) function.A total of 615 samples covering different land covers such as needle-leaf forest, broadleaf forest, bamboo forest, croplands and other non-vegetation types (e.g., impervious surfaces, bare soils, water), in addition to different ages of Torreya plantations, were collected during the fieldwork.The sample plots were identified from the field survey based on the representativeness of different land covers and sufficiently large areas compared with the cell size of satellite images used in this research.These plots were linked to the high spatial resolution images based on coordinates, and they were randomly separated into training and test samples.Meanwhile, Google Earth was also used to collect more sample plots for different land covers in this study area.

Remote Sensing Data Collection and Pre-Processing
The Chinese Gaofen satellite image (GF-1) has four multispectral bands (three visible bands and one near-infrared band) with 8 m spatial resolution, and one panchromatic band with 2 m spatial resolution.This L1A product with 60 km swath was acquired on 29 July 2015 with solar zenith angle of 15.57degrees.The Chinese Ziyuan satellite image (ZY-3) has four multispectral bands (three visible bands and one near-infrared band) with 5.8 m spatial resolution, and one panchromatic band with 2.1 m spatial resolution.This image with 51 km swath was acquired on 17 December 2015 with solar zenith angle of 56.67 degrees.

Field Data Collection and Organization
Fieldwork was conducted in September 2015.During the fieldwork, the coordinates for each plot were recorded and corresponding photos were taken using a camera with a global positioning system (GPS) function.A total of 615 samples covering different land covers such as needle-leaf forest, broadleaf forest, bamboo forest, croplands and other non-vegetation types (e.g., impervious surfaces, bare soils, water), in addition to different ages of Torreya plantations, were collected during the fieldwork.The sample plots were identified from the field survey based on the representativeness of different land covers and sufficiently large areas compared with the cell size of satellite images used in this research.These plots were linked to the high spatial resolution images based on coordinates, and they were randomly separated into training and test samples.Meanwhile, Google Earth was also used to collect more sample plots for different land covers in this study area.

Remote Sensing Data Collection and Pre-Processing
The Chinese Gaofen satellite image (GF-1) has four multispectral bands (three visible bands and one near-infrared band) with 8 m spatial resolution, and one panchromatic band with 2 m spatial resolution.This L1A product with 60 km swath was acquired on 29 July 2015 with solar zenith angle of 15.57degrees.The Chinese Ziyuan satellite image (ZY-3) has four multispectral bands (three visible bands and one near-infrared band) with 5.8 m spatial resolution, and one panchromatic band with 2.1 m spatial resolution.This image with 51 km swath was acquired on 17 December 2015 with solar zenith angle of 56.67 degrees.
Both GF-1 and ZY-3 images were orthorectified and registered with DEM data using 20-30 control points.The quadratic polynomial model and the bilinear resampling approach were used to conduct image-to-image (and DEM) registration with the GF-1 as the reference imagery.The root mean squared errors of less than 0.5 pixels were obtained.The Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) algorithm [42] was used for atmospheric calibration of both images.The C-correction model [43] was then used to conduct topographic correction.For the GF-1 image with high solar elevation angle, one C-correction model was used for entire study area.For the ZY-3 image with relatively low solar elevation angle, current topographic correction approaches such as Minnart and C-correction models cannot effectively correct the topographic effects [44].Therefore, multiple C-correction models were used for the ZY-3 imagery based on the relationship between slope and C-correction parameter, a similar approach as used in [44].
The multispectral bands and panchromatic band have different advantages, that is, multispectral bands provide rich spectral information for different land covers which are critical for land cover classification.Although the panchromatic data have only one spectral band, the high spatial resolution provides detailed spatial information that can reduce the mixed pixel problem, which is especially valuable for the land covers with relatively small patch sizes [45].The data fusion of both multispectral and panchromatic data can improve spatial resolution while preserving multispectral features if proper data fusion techniques are used; this is especially important for quantitative analysis such as land cover classification [46].Many data fusion techniques are available, as summarized in previous literature (e.g., [45,47,48]).Gram-Schmidt Pan Sharpening has been regarded as a good data fusion technique, especially for high spatial resolution images because this approach can effectively preserve the fidelity of spectral features while improving spatial resolution [49,50], which is critical for quantitative analysis such as land cover classification [46,51].Therefore, this technique was used in this research to integrate the multispectral and panchromatic data in GF-1 and ZY-3 images separately.The fused images for both GF-1 and ZY-3 data have spatial resolution of 2 m.

Ancillary Data Collection and Analysis
Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Global Digital Elevation Model (GDEM) data with 30 m spatial resolution are often used for topographic correction because of its similar spatial resolution with Landsat imagery at no cost [37,52].However, the GDEM data are not suitable for this research because this research used high spatial resolution satellite images (2 m here).Unfortunately, high spatial resolution DEM data are not available.Therefore, we bought a DEM data product with 10 m spatial resolution and used it in this research.Elevation, slope and aspect were calculated from the DEM data.Meanwhile, the topographic wetness index (TWI) was calculated for describing the extent of catchment area and the influence of spatial position on soil water accumulation [53].The TWI can be expressed as TWI = ln(SCA/tanSl), where SCA is specific catchment area: SCA = (flow accumulation + 1) × pixel size of DEM data.Sl is Slope.The use of TWI in this research is that TWI may be a good variable to distinguish Torreya plantations and croplands (or bare soils in the agricultural lands).This is because Torreya roots have relatively weak water absorption ability, and need moist soil conditions, but not the waterlogged sites [6], while croplands are almost planted on the flat land and the catchment convenient slopes.

Extraction and Selection of Variables from Remote Sensing and Ancillary Data
The GF-1 fused image in summer 2015 has better image quality and less impact of topographic factors on the spectral features than the ZY-3 fused image in winter 2015 with relatively low solar elevation angle.Thus, the major analysis was to use the GF-1 fused image for land cover classification.The spectral signatures of major land covers were first examined to understand the potential separability of different land covers using individual spectral bands in the GF-1 images.According to field survey, the land cover types in this study include evergreen forests, deciduous forests, bamboos, shrubs, herbs, farmland, bare land, impervious surface, and water.Since Torreya belongs to evergreen forest with wide variation in tree attributes (e.g., crown size, age, and height), its spectral signatures with wide ranges were confused with other land covers such as forest types, agriculture lands, and bare soils, depending on the growing stage of Torreya forests.Therefore, the emphasis is on the analysis of spectral signatures of old Torreya forests, young and new Torreya plantations with different conditions of understory.After the initial analysis based on spectral bands, other variables such as typical vegetation indices (e.g., simple ratios, the normalized difference vegetation index (NDVI), and modified soil-adjusted vegetation index (MSAVI)) and textural images were explored.The major variables used in this research are summarized in Table 1.
Transformed variables Sum of visible bands (i.e., B + G + R) Because surface reflectance values of individual spectral bands are often influenced by external factors such as topographic and moisture conditions, vegetation indices can reduce these impacts to a certain degree and extract some specific vegetation features such as vegetation health/vigor and leaf area index [54].Many vegetation indices have been proposed in previous studies [54,55] which are mainly based on multispectral or hyperspectral bands.In this research, we examined some simple ratios, traditional vegetation indices (e.g., NDVI), and modified soil-adjusted vegetation index (MSAVI) [54,55] after the initial spectral analysis based on visible and near infrared wavelength (NIR) bands.Meanwhile, the common image transform algorithm-principal component analysis-was used to concentrate the major information on the first two components [56].

Principal component analysis
In addition to the pixel-level features, the spatial information inherent in the high spatial resolution images is also important for improving land cover classification [36,57,58].Previous studies on land cover classification have indicated that proper selection of textural images, especially from high spatial resolution images, is valuable [7,36,58].The key is to identify optimal textural images that are helpful in separating the interesting land covers from others.A good texture is an optimal combination of a suitable texture measure, remote sensing variable, and window size [36,57], thus, it requires a comparative analysis of the textural images through separability analysis of different land covers.In this research, the grey-level co-occurrence matrix (GLCM) approach (e.g., mean, entropy, homogeneity, and variance) was used to extract textural images based on the first principal component (PC1) with different window sizes from 3 × 3 to 13 × 13 pixels.This is because PC1 from the GF-1 fused image concentrated the largest proportion of information.Separability analysis of the textural images based on training samples was examined and the best texture images for distinguishing Torreya forests from other land covers were selected.
Because Torreya forests have a wide variation in vegetation coverage, tree crown size, growing stages, and proportions of grass and bare soil, direct extraction of Torreya forests based on remote sensing variable alone is difficult.Vegetation phenology is important information for distinguishing evergreen and deciduous forests, especially for the plantations [7].The difference of NDVI images between the GF-1 in summer and ZY-3 in winter was used to distinguish Torreya forests from deciduous forests in this research.Meanwhile, slope, aspect, and TWI were calculated from the DEM data and they were used as potential variables for separating new Torreya plantations from agricultural lands and impervious surfaces because Torreya plantations are mainly located in the mountainous regions with slope ranges of greater than 10 degrees, while croplands are almost planted on the flat land and the catchment convenient slope.

Extraction of Torreya Forests
The maximum likelihood classifier (MLC) is a traditional supervised classification approach and is often used for land cover classification [7].In this research, it was first used to examine whether different Torreya forests (old, young, new) can be classified or not from remote sensing variables using training samples.Based on visual interpretation between the classified image and color composite of GF-1 imagery, we found that MLC cannot effectively extract different Torreya forests, because old Torreya forests were considerably confused with high dense forests, and young/new plantations are confused with agricultural lands, shrubs, and bare soils.Therefore, we explored the object-oriented approach to extract the different Torreya forest classes, because we used multispectral images with high spatial resolution images (2 m) in this research.Based on previous research, image segmentation can reduce the spectral heterogeneity of the same land covers and can improve land cover classification accuracy [59][60][61][62][63][64][65].Unfortunately, we found the object-oriented approach was not suitable for the Torreya forest mapping in this research.The possible reasons for the poor classification results are that (1) patch sizes of different Torreya forests (old, young, new plantations) vary largely; (2) some new plantations are mixed with grass and shrubs with different densities and other new plantations look like bare soils; (3) the spectral signatures of different Torreya forests are confused with other land covers such as dense forests, bare soils, and shrubs, depending on the growing stages of Torreya forests.Because of the special characteristics of Torreya forests, especially young and new plantations, the existing classification approaches based on training sample plots cannot effectively extract new Torreya plantations.It is necessary to develop a new approach suitable for the Torreya forest mapping.Because our concern in this research is to extract Torreya forest (old Torreya forest and new plantations), but the landscape of Torreya forest is very complex in the high spatial resolution, we need to explore a new approach to directly extract Torreya forests while other land covers can be ignored.
Based on field survey, we can separate Torreya forests into two categories-old Torreya forests and young/new Torreya plantations.Old Torreya forests usually have relatively dense canopy with limited understory consisting of grass and/or bare soils (Figure 2a).The Torreya canopy may account for more than 60% of the land cover area.Young/new Torreya plantations have limited Torreya canopy (Figure 2c,d), accounting for less than 40% of the land cover area, while grass/shrub and/or bare soils account for the majority of the understory.In addition to Torreya forests, other land covers in this study include evergreen forests, deciduous forests, bamboos, shrubs, herbs, farmland (with or without crops), bare soil, impervious surfaces, and water.Through analysis of the spectral signatures among different land covers based on the GF-1 fused multispectral data, we can broadly group the land covers into high vegetation cover class and low vegetation cover class (Figure 4).  4 has low reflectance in all spectral bands, including water, low-albedo impervious surfaces, and some shadows.The old Torreya forests are included in the high vegetation cover groups and young or new Torreya plantations are mainly included in low vegetation cover class.Based on this analysis, our objectives are to identify suitable rules and variables to distinguish old Torreya forests from high vegetation cover class, and new Torreya plantations from low vegetation cover class.Based on this idea, Figure 5 illustrates the framework to extract old Torreya forests and new Torreya plantations using the expert rules-based approach.
In this framework, the critical steps are to identify the variables from remote sensing and/or ancillary data and to develop the expert rules.Since many variables such as spectral bands, vegetation indices, and textures (see Table 1) may be used for land cover classification, but not all variables are needed considering their spectral separability and correlation between the variables, it is necessary to use an approach to identify suitable variables [7].Different approaches, such as separability analysis, spectral analysis of different land covers, random forest for ranking the importance of the variables, and analyst's experiences, can be used to determine the variables for land cover classification [7].The Torreya forest is a complex land cover type that may be confused with different land covers.For example, the old Torreya forests may include the following components: Torreya forest canopy, shadows cast by tall Torreya tree crowns, and gaps between Torreya trees (the gaps could be grass, shrubs, and bare soil) in the 2 m spatial resolution images.On the other hand, the young and new Torreya plantations may mainly include small size of Torreya tree crown, and relatively large gaps between the trees (e.g., the gaps mainly include grass and shrubs or mainly bare soils, depending on the farmer's cultivation methods).Therefore, we need to separate the complex Torreya cover from others but it is a challenge due to the confusion with different land covers.Based on the training sample plots for different land covers, the variables The high vegetation cover class has very high reflectance in NIR and very low reflectance in visible bands.This category can be separated into two subcategories: (1) high vegetation cover with very high reflectance in NIR, such as evergreen forest (including old Torreya forest), deciduous forests, bamboos, shrubs, herbs and crops; and (2) high vegetation cover class with relatively low reflectance in NIR, such as shadows from tall trees (e.g., old Torreya forest, other evergreen forest, deciduous forest and bamboo).The low or non-vegetation cover class usually has high reflectance values in visible bands, including new Torreya plantations, bare land, high-albedo impervious surfaces, and low vegetation coverage farmland.Another special cover in Figure 4 has low reflectance in all spectral bands, including water, low-albedo impervious surfaces, and some shadows.The old Torreya forests are included in the high vegetation cover groups and young or new Torreya plantations are mainly included in low vegetation cover class.Based on this analysis, our objectives are to identify suitable rules and variables to distinguish old Torreya forests from high vegetation cover class, and new Torreya plantations from low vegetation cover class.Based on this idea, Figure 5 illustrates the framework to extract old Torreya forests and new Torreya plantations using the expert rules-based approach.
In this framework, the critical steps are to identify the variables from remote sensing and/or ancillary data and to develop the expert rules.Since many variables such as spectral bands, vegetation indices, and textures (see Table 1) may be used for land cover classification, but not all variables are needed considering their spectral separability and correlation between the variables, it is necessary to use an approach to identify suitable variables [7].Different approaches, such as separability analysis, spectral analysis of different land covers, random forest for ranking the importance of the variables, and analyst's experiences, can be used to determine the variables for land cover classification [7].The Torreya forest is a complex land cover type that may be confused with different land covers.For example, the old Torreya forests may include the following components: Torreya forest canopy, shadows cast by tall Torreya tree crowns, and gaps between Torreya trees (the gaps could be grass, shrubs, and bare soil) in the 2 m spatial resolution images.On the other hand, the young and new Torreya plantations may mainly include small size of Torreya tree crown, and relatively large gaps between the trees (e.g., the gaps mainly include grass and shrubs or mainly bare soils, depending on the farmer's cultivation methods).Therefore, we need to separate the complex Torreya cover from others but it is a challenge due to the confusion with different land covers.Based on the training sample plots for different land covers, the variables suitable for separating different Torreya components from other land covers were examined using the separability analysis and spectral and spatial features.
suitable for separating different Torreya components from other land covers were examined using the separability analysis and spectral and spatial features.In order to explore suitable variables for extraction of Torreya forests, we examined different land covers and shadows on the variables and identify the potential thresholds to separate different land covers (Figure 6).For example, NDVI from GF-1 fused images can effectively classify all land covers into high vegetation cover and low vegetation cover classes.TWI from DEM data can separate new Torreya plantations in the mountainous regions from croplands (agriculture lands either with growing crops or without crops), and in some non-vegetation surfaces such as impervious surface and bare lands in the flat terrain.The NDVI(diff) (i.e., NDVI difference between GF-1 and ZY-3 images) can separate evergreen forests (e.g., old Torreya forests) from deciduous forests.One critical step in this approach is to determine the thresholds for the selected variables.As shown in Figure 5, the thresholds for each node were initially determined using the mean and standard deviation of the selected variables based on training samples, that is, mean ± 1 fold of standard deviation.The thresholds can be further adjusted through error-and-try approach if needed.As a comparison, MLC was also used to do the land cover classification using the same training samples based on the selected variables.
As shown in Figure 1, old Torreya forests are a combination of Torreya species, shadows cast by Torreya crown, grass/shrubs, and/or bare soils; new Torreya plantation is a combination of Torreya species, grass/shrubs, and/or bare soils, depending on the different management activities such as grass/shrub cutting or not.Therefore, post-processing is required for recoding the classified results into meaningful classes and reducing the salt-and-pepper problem.The final classification results were merged into four classes: old Torreya forests, new Torreya plantations, other high vegetation cover class and other low vegetation cover class.The post processing using majority filtering with a window size of 3 × 3 pixels was used to reduce the salt-and-pepper effects in the classified images.In order to explore suitable variables for extraction of Torreya forests, we examined different land covers and shadows on the variables and identify the potential thresholds to separate different land covers (Figure 6).For example, NDVI from GF-1 fused images can effectively classify all land covers into high vegetation cover and low vegetation cover classes.TWI from DEM data can separate new Torreya plantations in the mountainous regions from croplands (agriculture lands either with growing crops or without crops), and in some non-vegetation surfaces such as impervious surface and bare lands in the flat terrain.The NDVI(diff) (i.e., NDVI difference between GF-1 and ZY-3 images) can separate evergreen forests (e.g., old Torreya forests) from deciduous forests.One critical step in this approach is to determine the thresholds for the selected variables.As shown in Figure 5, the thresholds for each node were initially determined using the mean and standard deviation of the selected variables based on training samples, that is, mean ± 1 fold of standard deviation.The thresholds can be further adjusted through error-and-try approach if needed.As a comparison, MLC was also used to do the land cover classification using the same training samples based on the selected variables.
As shown in Figure 1, old Torreya forests are a combination of Torreya species, shadows cast by Torreya crown, grass/shrubs, and/or bare soils; new Torreya plantation is a combination of Torreya species, grass/shrubs, and/or bare soils, depending on the different management activities such as grass/shrub cutting or not.Therefore, post-processing is required for recoding the classified results into meaningful classes and reducing the salt-and-pepper problem.The final classification results were merged into four classes: old Torreya forests, new Torreya plantations, other high vegetation cover class and other low vegetation cover class.The post processing using majority filtering with a window size of 3 × 3 pixels was used to reduce the salt-and-pepper effects in the classified images.

Accuracy Assessment
For a land cover classification project, accuracy assessment is often required and the error matrix approach is commonly used.Overall classification accuracy and kappa coefficient can be calculated from the error matrix to evaluate the overall performance of the selected approach; producer's accuracy and user's accuracy can be used to evaluate each land cover classification performance [66,67].In the accuracy assessment procedure, the critical steps are to determine the number of test samples, allocate the test samples using certain sampling technique on the study area, and identify the land cover type for each sample.In this research, forty sample plots for each land cover were selected separately and they were randomly allocated into entire study area.Field survey data and Google Earth were used to identify the specific land cover for each sample plot.

Analysis of Torreya Forest Classification Results
Through the separability analysis of land covers based on different variables, Table 2 summarizes the selected variables that were used to extract old Torreya forests and new Torreya plantations.The thresholds for the selected variables were determined according to the analysis of the separability of specific land cover types using relevant variables (Figure 6).Based on the expert rules from Figure 6, four classes-old Torreya forests, new Torreya plantation, high vegetation cover class, and low vegetation cover class-were classified and are illustrated in Figure 7.We can find that old Torreya forests are dispersedly distributed in the study area with relatively small patch sizes, while some new Torreya plantations are much concentrated in limited locations with large patch sizes.Figure 7 also indicates that the MLC produced much more pixels of old Torreya forests than the expert rules-based approach.The major reason may be the spectral confusion of old Torreya forests with other high vegetation cover classes, resulting in relatively poor classification results.

Accuracy Assessment
For a land cover classification project, accuracy assessment is often required and the error matrix approach is commonly used.Overall classification accuracy and kappa coefficient can be calculated from the error matrix to evaluate the overall performance of the selected approach; producer's accuracy and user's accuracy can be used to evaluate each land cover classification performance [66,67].In the accuracy assessment procedure, the critical steps are to determine the number of test samples, allocate the test samples using certain sampling technique on the study area, and identify the land cover type for each sample.In this research, forty sample plots for each land cover were selected separately and they were randomly allocated into entire study area.Field survey data and Google Earth were used to identify the specific land cover for each sample plot.

Analysis of Torreya Forest Classification Results
Through the separability analysis of land covers based on different variables, Table 2 summarizes the selected variables that were used to extract old Torreya forests and new Torreya plantations.The thresholds for the selected variables were determined according to the analysis of the separability of specific land cover types using relevant variables (Figure 6).Based on the expert rules from Figure 6, four classes-old Torreya forests, new Torreya plantation, high vegetation cover class, and low vegetation cover class-were classified and are illustrated in Figure 7.We can find that old Torreya forests are dispersedly distributed in the study area with relatively small patch sizes, while some new Torreya plantations are much concentrated in limited locations with large patch sizes.Figure 7 also indicates that the MLC produced much more pixels of old Torreya forests than the expert rules-based approach.The major reason may be the spectral confusion of old Torreya forests with other high vegetation cover classes, resulting in relatively poor classification results.The accuracy assessment results in Table 3 indicate that the expert rules-based approach has much higher overall accuracy (84.4%) than MLC (73.1%).Old Torreya forests and new Torreya plantations have user's accuracy of 77.5 and 85% and producer's accuracy of 83.8 and 91.9% using The accuracy assessment results in Table 3 indicate that the expert rules-based approach has much higher overall accuracy (84.4%) than MLC (73.1%).Old Torreya forests and new Torreya plantations have user's accuracy of 77.5 and 85% and producer's accuracy of 83.8 and 91.9% using the expert rules-based approach, comparing with the user's accuracy of 66.7% and 84.2% and producer's accuracy of 64.9% and 75.7% using MLC.The major confusion is between old Torreya forests and high vegetation cover class and between new Torreya plantations and low vegetation cover class.This research indicates that the proposed expert rules-based approach can successfully extract old Torreya forest and new plantations in this study area.

Discussion
High spatial resolution satellite imagery is helpful in visual interpretation due to its rich spatial information resulting in obviously clear shape and boundary of different land covers.However, it also produces high spectral heterogeneity for the same land cover type, for example, different spectral signatures within the forest canopy, shadows, and gaps (see Figure 8).This situation generates a challenge using computer-based automatic approaches for land cover classification, resulting in poor accuracy and noise classification results [58].
Shadows from tall objects such as tree crowns are often a problem in remote sensing-based land cover classification [58].In medium spatial resolution images such as Landsat, shadow is mixed with tree crowns, thus the surface reflectance is a mixture of leaves, braches and shadows.In high spatial resolution images, shadows have completely different spectral signatures with leaves or braches, as shown in Figure 8a,c, resulting in high heterogeneity in the spectral signatures within the same forest cover.The shadows from different tall objects have various spectral signatures due to the impacts of different tree heights and crown sizes, topography, and sun elevation angles [68].In order to reduce the shadow problem, different approaches such as vegetation indices and textures may be used, but shadow areas from high spatial resolution images are so large that we have to select shadows as special classes to handle.As this research shown, we select different kinds of shadows-Torreya, broadleaf, and bamboo cast shadows at the initial analysis.Based on the analysis of different shadows, we identify rules to separate the shadows from other covers.Although this approach is proven effective in this research, the selection of variables and determination of thresholds (rules) are still subjective, and the shadows from different objects are difficult to be separated.
or braches, as shown in Figure 8a,c, resulting in high heterogeneity in the spectral signatures within the same forest cover.The shadows from different tall objects have various spectral signatures due to the impacts of different tree heights and crown sizes, topography, and sun elevation angles [68].In order to reduce the shadow problem, different approaches such as vegetation indices and textures may be used, but shadow areas from high spatial resolution images are so large that we have to select shadows as special classes to handle.As this research shown, we select different kinds of shadows-Torreya, broadleaf, and bamboo cast shadows at the initial analysis.Based on the analysis of different shadows, we identify rules to separate the shadows from other land covers.Although this approach is proven effective in this research, the selection of variables and determination of thresholds (rules) are still subjective, and the shadows from different objects are difficult to be separated.The proportion of grass cover and bare soil in young or new Torreya plantations (see Figure 8f,h) is an important factor influencing the classification accuracy.As Table 3 shows, new Torreya plantations may be confused with high vegetation cover or low vegetation cover and even old Torreya forests, depending on the proportion of grass and bare soil.The different compositions of grass and bare soil result in considerably different spectral signatures.Therefore, direct classification using traditional pixel-based classification such as MLC in this research is difficult (see The proportion of grass cover and bare soil in young or new Torreya plantations (see Figure 8f,h) is an important factor influencing the classification accuracy.As Table 3 shows, new Torreya plantations may be confused with high vegetation cover or low vegetation cover and even old Torreya forests, depending on the proportion of grass and bare soil.The different compositions of grass and bare soil result in considerably different spectral signatures.Therefore, direct classification using traditional pixel-based classification such as MLC in this research is difficult (see Table 3).The expert rules-based approach has been proven effective to extract new Torreya plantations in different conditions.This research also shows that remote sensing data alone cannot accurately extract certain land covers, and use of ancillary data such as DEM data is valuable in improving these land covers.For example, the new Torreya plantations are confused with croplands and bare soils, and they are not separable from the spectral signatures, but use of slope or TWI can effectively separate new Torreya plantations from croplands.This is because we obtained the knowledge from field survey that Torreya plantations are distributed in the mountainous regions with different slopes, but croplands are distributed in the flat terrains.
For the extraction of Torreya forests using high spatial resolution images, one special class is the gap between the rows or columns of Torreya trees, and the sizes of gaps vary depending on the growing stage and planting density.The gaps could be grass, shrubs, bare soils and shadow, depending on the management ways.Therefore, the gaps were also used as special classes during the land cover classification.Because of the wide spectral variation, extraction of the gaps directly from high spatial resolution is also difficult.Caution should be taken when gaps are combined into old Torreya forests or new Torreya plantations.
This research has shown the unique characteristics of Torreya forests in high spatial resolution images.Our initial analysis based on spectral signatures alone has indicated that Torreya forests cannot be effectively extracted.The Torreya forests can be confused with different land covers depending on the Torreya tree growing stages and the composition of different land covers.Therefore, use of multi-source data is necessary but produces a challenge that we need to determine which variables should be selected.This research explored the separability analysis and the relationships between different variables and land covers to find the rules to distinguish Torreya forests at different conditions with specific land covers.This approach has been proven more effective and accurate than using traditional pixel-based supervised classifier.More research should be focusing on the selection of suitable variables and determination of expert rules that can be easily used in different study areas or different satellite images in the future.
Spatial features reflect the variation of spectral signatures within a unit.Two approaches-texture and segmentation-are often used to extract spatial features for land cove classification [7].Previous studies have examined the roles of textures in improving land cover classification 36], that is, textures are indeed valuable, but the key is to identify proper textural images that can effectively extract the features of interests, which is a combination of such factors as spectral band, texture measure, and moving window size [36,57].Another problem is that no textures are helpful for all land covers, that is, some land covers could be improved, and some could be worse, depending on the textures used and the patch sizes and composition of land covers [36].Image segmentation is another way to use spatial features.It partitions the raster images into segments based on pixel, edge, and region methods [64,65] so that the pixels with similar spectral values nearby are grouped in a single segment.Different algorithms such as thresholding, edge-based and region-based can be used [59,69].Many studies have shown that the object-based classification approaches provided better accuracy than pixel-based approaches, especially when high spatial resolution images are used [58,[60][61][62][63].However, it is hard to identify the optimal parameters used in the image segmentation approaches that can fit well different patch sizes of land covers.Because of the unique characteristics of spatial features in different land covers, the selected textures or segments suitable for one study cannot be directly transferred to other study areas due to different land cover composition and complexity.However, the combination of segmentation and expert-based approach may further improve the mapping performance in this study and should be examined in the near future.
Previous studies using remote sensing data focused on land cover classification through exploration of remote sensing variables and/or classification algorithms [7,51,58].The remote sensing variables such as spectral bands, vegetation indices, and textures, and supervised classification algorithms such as MLC, neural network, and support vector machine may be satisfied for overall land cover classification, but may not provide the best result for an individual class, that is, no one approach is optimal for each type of land cover [7,51,70].We need to develop a specific approach to accurately extract a given class, and such approaches are not universal, as they depend on specific land covers such as rubber plantation [14,19,21] and hickory plantation [37].So far no one approach can be used for extracting different tree species, thus we have to develop specific approaches depending on the tree species.More research is needed to develop an approach to map tree species suitable for different locations.
The growth of tree species has different requirements on topography, soil, and moisture.Thus, incorporation of ancillary data such as DEM and soil type may be useful for land cover classification [7].This research has indicated that effective use of DEM-derived slope or topographic wetness index can reduce the confusion between Torreya forests and agricultural lands.Previous research also indicates that lidar is an important data source for extracting forest types because different forest types have their own characteristics in stand canopy sizes, heights, and shapes [31,32], but lidar is often unavailable for most study areas.More research should be undertaken on the effective use of multi-source data for improving the extraction performance of individual tree species [16,47].

Conclusions
Rapid expansion of Torreya forests in Zhejiang Province in the past three decades has produced many environmental problems, requiring mapping of spatial distribution and dynamic change in a timely way.Because of lack of suitable approaches to extract Torreya forests, especially the new Torreya plantations, this research explored the high spatial resolution satellite images to map old Torreya forests and new Torreya plantations through an expert rules-based approach.Multi-source data-spectral bands, vegetation index, texture, temporal information, and DEM-derived variables-were explored, and an expert rules-based approach was developed through analysis of these variables and specific land cover types.The results indicate that the expert rules-based approach can effectively extract old Torreya forests and new Torreya plantations with user's accuracies of 77.5% and 85% and producer's accuracies of 83.8% and 91.9%, respectively, much higher accuracies than the MLC.This approach provides the potential to effectively extract Torreya forests using high spatial resolution satellite images, which has not been examined in previous research.This research also confirmed the necessity of using multi-source data in improving Torreya forest mapping performance.

Figure 1 .
Figure 1.Location of the study area-northern Zhaojia Township, Zhuji County, Zhejiang Province.(a) represents true color composite of a Chinese Gaofen (GF-1) fused image; and (b,c) represent elevation and slope of the study area.

Figure 1 .
Figure 1.Location of the study area-northern Zhaojia Township, Zhuji County, Zhejiang Province.(a) represents true color composite of a Chinese Gaofen (GF-1) fused image; and (b,c) represent elevation and slope of the study area.

Figure 2 .
Figure 2. Photos (taken in September 2015) of Torreya forests with various ages: (a) old Torreya forest, aged hundreds of years; (b,c) young plantations between 8 and 16 years old; and (d) new Torreya plantation without grass and shrub covers.

Figure 3
Figure 3 illustrates the framework of extracting Torreya forests from a high spatial resolution image.The major steps include preparation of remote sensing and digital elevation model (DEM) data (e.g., atmospheric calibration and topographic correction for images, data fusion of multispectral and panchromatic data), extraction of variables suitable for extraction of Torreya forests, development of proper rules to distinguish Torreya forests from other land covers, and finally the accuracy assessment of classification results.The key in this approach is to identify suitable variables from high spatial resolution imagery and DEM data and to develop expert rules to separate old Torreya forests and new Torreya plantations from other land covers.

Figure 2 .
Figure 2. Photos (taken in September 2015) of Torreya forests with various ages: (a) old Torreya forest, aged hundreds of years; (b,c) young plantations between 8 and 16 years old; and (d) new Torreya plantation without grass and shrub covers.

Figure 3
Figure 3 illustrates the framework of extracting Torreya forests from a high spatial resolution image.The major steps include preparation of remote sensing and digital elevation model (DEM) data (e.g., atmospheric calibration and topographic correction for images, data fusion of multispectral and panchromatic data), extraction of variables suitable for extraction of Torreya forests, development of proper rules to distinguish Torreya forests from other land covers, and finally the accuracy assessment of classification results.The key in this approach is to identify suitable variables from high spatial resolution imagery and DEM data and to develop expert rules to separate old Torreya forests and new Torreya plantations from other land covers.

Figure 3 .
Figure 3. Framework of mapping Torreya forest distribution using high spatial resolution satellite images and ancillary data (Note: veg.represents vegetation).

Figure 3 .
Figure 3. Framework of mapping Torreya forest distribution using high spatial resolution satellite images and ancillary data (Note: veg.represents vegetation).
Spatial features TexturesGLCM (e.g., mean, variance, entropy) and skewness based on PC1 image with window sizes from 3 × 3, 5 × 5, until 13 × 13 pixels were used to calculate textural images Topographic factors DEM-based variables Elevation, slope, aspect Topographic wetness index (TWI) = ln(SCA/tanSl), where SCA = (flow accumulation + 1) × pixel size of DEM Combination NDVI(diff) NDVI difference between GF-1 and ZY-3 images, that is, NDVI(diff) = NDVI(GF)−NDVI(ZY) Note: R, G, B and NIR represent red, green, blue and near-infrared spectral bands; NDVI, NDGI, DVI, and MSAVI represent normalized difference vegetation index, normalized difference greenness index, difference vegetation index, and modified soil-adjusted vegetation index, respectively; GLCM represents grey-level co-occurrence matrix; PC1 represents the first component from the principal component analysis; DEM represents digital elevation model; SCA is specific catchment area; Sl is slope; and NIR is near infrared wavelength.

Figure 4 .
Figure 4. Spectral values of training samples of different land-cover classes.

Figure 4 .
Figure 4. Spectral values of training samples of different land-cover classes.

Figure 5 .
Figure 5. Extraction of old Torreya forests and new Torreya plantations using the expert rules-based approach (Note: NDVI(diff) = NDVI(GF) − NDVI(ZY); texture selected in this research is the textural image from PC1 using mean measure and window size of 11 x 11 pixels; NDVI and NDGI represent normalized difference vegetation index, and normalized difference greenness index, respectively; TWI represents topographic wetness index; DEM represents digital elevation model; NIR represents near infrared wavelength; and veg.represents vegetation).

Figure 5 .
Figure 5. Extraction of old Torreya forests and new Torreya plantations using the expert rules-based approach (Note: NDVI(diff) = NDVI(GF) − NDVI(ZY); texture selected in this research is the textural image from PC1 using mean measure and window size of 11 x 11 pixels; NDVI and NDGI represent normalized difference vegetation index, and normalized difference greenness index, respectively; TWI represents topographic wetness index; DEM represents digital elevation model; NIR represents near infrared wavelength; and veg.represents vegetation).

Figure 6 .
Figure 6.Analysis of variables based on different land covers for determination of thresholds to extract old Torreya forests and new Torreya plantations (Note: the subfigures show the potential discriminant boundaries between the land covers based on NDVI (normalized difference vegetation index) (a); TWI (topographic wetness index) (b); near infrared band (c); Blue band (d); NDGI (normalized difference greenness index) (e); texture (f); and NDVI(diff) (the difference of NDVI images between GF-1 and ZY-3 data) (g)).

Figure 6 .
Figure 6.Analysis of variables based on different land covers for determination of thresholds to extract old Torreya forests and new Torreya plantations (Note: the subfigures show the potential discriminant boundaries between the land covers based on NDVI (normalized difference vegetation index) (a); TWI (topographic wetness index) (b); near infrared band (c); Blue band (d); NDGI (normalized difference greenness index) (e); texture (f); and NDVI(diff) (the difference of NDVI images between GF-1 and ZY-3 data) (g)).

Figure 7 .
Figure 7. Spatial distribution of old Torreya forests and new Torreya plantations using (a) expert rules based approach and (b) maximum likelihood classifier based on multi-source data.

Figure 7 .
Figure 7. Spatial distribution of old Torreya forests and new Torreya plantations using (a) expert rules based approach and (b) maximum likelihood classifier based on multi-source data.

Figure 8 .
Figure 8.A comparison of different land-cover compositions in the GF-1 satellite color composites and corresponding photos ((a,b)-old Torreya forests; (c,d)-broadleaf forests; please note the shadows from old Torreya forests and from broadleaf forest; (e,f)-young Torreya plantations with some grass cover and bare soil; (g,h)-new Torreya plantations with abundant bare soil).

Figure 8 .
Figure 8.A comparison of different land-cover compositions in the GF-1 satellite color composites and corresponding photos ((a,b)-old Torreya forests; (c,d)-broadleaf forests; please note the shadows from old Torreya forests and from broadleaf forest; (e,f)-young Torreya plantations with some grass cover and bare soil; (g,h)-new Torreya plantations with abundant bare soil). ~40

Table 1 .
Variables used in this research.

Table 2 .
Selected variables for extraction of old Torreya forests and new Torreya plantations.: Texture * here indicates the textural images using mean measure based on PC1 and window size of 11 × 11 pixels.NDVI and NDGI represent normalized difference vegetation index, and normalized difference greenness index, respectively; TWI represents topographic wetness index; NIR represents the near-infrared spectral band; NDVI(diff) indicates the difference of NDVI images between GF-1 and ZY-3 data. Note

Table 3 .
Comparison of Torreya forest classification accuracy assessment results between the expert rules-based approach and maximum likelihood classifier.Note: OTF, NTP, HVC and LVC represent old Torreya forests, new Torreya plantations, high vegetation cover classes, and low vegetation cover classes, respectively.PA, UA, OA and KC represent producer's accuracy, user's accuracy, overall accuracy, and kappa coefficient, respectively.