Mapping Robinia Pseudoacacia Forest Health Conditions by Using Combined Spectral, Spatial, and Textural Information Extracted from IKONOS Imagery and Random Forest Classifier

Remote Sens. 2015, 7. Open Access.

The textural and spatial information extracted from very high resolution (VHR) remote sensing imagery provides complementary information for applications in which the spectral information is not sufficient for identification of spectrally similar landscape features. In this study, grey-level co-occurrence matrix (GLCM) textures and a local statistical analysis Getis statistic (Gi), computed from IKONOS multispectral (MS) imagery acquired from the Yellow River Delta in China, along with a random forest (RF) classifier, were used to discriminate Robinia pseudoacacia tree health levels. Specifically, eight GLCM texture features (mean, variance, homogeneity, dissimilarity, contrast, entropy, angular second moment, and correlation) were first calculated from the IKONOS NIR band (Band 4) to determine an optimal window size (13 × 13) and an optimal direction (45°). Then, the optimal window size and direction were applied to the three other IKONOS MS bands (blue, green, and red) for calculating the eight GLCM textures. Next, an optimal distance value (5) and an optimal neighborhood rule (Queen’s case) were determined for calculating the four Gi features from the four IKONOS MS bands. Finally, different RF classification results of the three forest health conditions were created: (1) an overall accuracy (OA) of 79.5% produced using the four MS band reflectances only; (2) an OA of 97.1% created with the eight GLCM features calculated from IKONOS Band 4 with the optimal window size of 13 × 13 and direction 45°; (3) an OA of 93.3% created with all 32 GLCM features calculated from the four IKONOS MS bands with a window size of 13 × 13 and direction of 45°; (4) an OA of 94.0% created using the four Gi features calculated from the four IKONOS MS bands with the optimal distance value of 5 and Queen’s neighborhood rule; and (5) an OA of 96.9% created with the combined 16 spectral (four), spatial (four), and textural (eight) features. The most important feature ranked by the RF classifier was the GLCM texture mean calculated from Band 4, followed by the Gi feature calculated from Band 4. The experimental results demonstrate that (a) both textural and spatial information was more useful than spectral information in determining the Robinia pseudoacacia forest health conditions; and (b) the IKONOS NIR band was more powerful than the visible bands in quantifying varying degrees of forest crown dieback.


Introduction
Recent cases of increased tree mortality and die-offs triggered by drought and/or high temperature have been documented [1][2][3]. However, the fundamental mechanisms underlying tree survival and mortality during drought remain poorly understood [4]. Based on the decline spiral model [5,6], drought can operate as a trigger that may ultimately lead to mortality of trees that are already under stress (due to causes such as old age, poor site condition, and air pollution).
Identifying the location and extent of forest at risk from damaging agents and processes assists forest managers in prioritizing their planning and operational mitigation activities [7]. Stress in forests displays a variety of symptoms, some of which may be detected by remote sensing [8,9]. Plant symptoms detectable by remote sensing may be attributed to an increase in red reflectance due to lowered chlorophyll absorption, a decrease in near infrared (NIR) reflectance due to reduced cell vigor, or shifting of a red edge position that may be invoked by different types of stress (e.g., water or nitrogen deficit) [10] or by natural development of plants (e.g., phenology changes) [11].
Several studies have focused on using Landsat Thematic Mapper/Enhanced Thematic Mapper Plus (TM/ETM+) and Satellite Pour l'Observation de la Terre (SPOT) data to map the extent and location of forest disturbances from regional to continental scales [12][13][14][15][16][17]. Although the usefulness of remote sensing technology in mapping forest health conditions is widely recognized, forest health conditions are more difficult to detect from space than other forest disturbances, such as fire or clear-cutting [18]. For example, living trees, foliated trees, and the understory are frequently observed in a tree crown dieback stand, which usually results in a mixture of reflectance from living trees and understory.
With recent very high resolution (VHR) satellite imagery, such as QuickBird and IKONOS, forest stress and disease can be detected at a crown level [19], which makes discrimination of individual healthy and diseased trees possible [20]. However, the classification of VHR images suffers from uncertainty of the spectral information because the increase of the intra-class variance and decrease of the inter-class variance lead to a reduced separability in the spectral domain, particularly for spectrally similar classes [21]. For example, in stressed/diseased forest stands, the understory plants (e.g., regeneration forest, shrubs, and grasses) present in gaps and open areas may have a similar NIR response to a closed forest canopy. Such a classification challenge may be overcome by using image spatial information, such as semivariogram, edge density, grey-level co-occurrence matrix (GLCM), or differential morphological profiles [22][23][24][25][26]. This is because, as spatial resolution increases, ground objects tend to be represented by several pixels, and thus the spatial characteristics of imagery become increasingly important with respect to spectral information [27]. The GLCM is among the most popular methods for extracting textural information that describes the spatial properties of targets [28]. Many studies using VHR sensors' data have demonstrated that the GLCM textural features are effective for mapping forest structure or discriminating forest disturbance severity [29][30][31][32].
In order to characterize local spatial distribution patterns, such as the dieback of Robinia pseudoacacia trees frequently distributed along river channels or road lines in the Yellow River Delta (YRD), China [33], the Getis statistic (Gi) [34,35], a local indicator of spatial association (LISA) method [36], may be applied. The statistic reflects the information generated with respect to local variation of tree health conditions within patterns of spatial dependence, resulting in a potential to uncover discrete spatial regimes. The potential of the Gi statistic to identify significant spatial dependency in remotely sensed imagery has been documented [37][38][39]. In addition, the change information in the Gi statistic applied to multi-date remote sensing data could be used to identify coral reef stress [40], detect sandstorms and desertification [41], map soil moisture [42], and identify a homogenous stable area used for the radiometric vicarious calibration of satellite sensors [43].
Although textural and spatial features may help reduce the uncertainty associated with the classification of forest health conditions, some commonly used parametric classification algorithms (such as the maximum likelihood classifier) may be inadequate for handling the additional information provided by VHR image data [44]. The spectral variability of an individual tree (e.g., pixels representing sunlit crown, shaded crown, and the influence of factors such as branches, cones, and tree morphology) restricts the development of a unique spectral signature for tree classification [45]. To overcome this difficulty, data mining and machine learning techniques, such as the random forest (RF) classifier [46], have been developed and have produced promising results in mapping forest health conditions and extracting forest structure parameters [47][48][49]. RF does not rely on a data distribution assumption (e.g., normality) and is robust to spectral variations caused by high intra-class variability. In recent years, the RF technique has proved to be highly successful in complex remote sensing applications, with improved classification accuracies that are comparable to or better than most of the state-of-the-art classifiers such as support vector machines, neural networks, and boosting techniques [50][51][52]. Since textural and local spatial information has the potential to improve the accuracy of class designation by minimizing intra-class variation [30,53], the overall objective of this study is to assess whether the GLCM and Gi features extracted from IKONOS imagery are effective in determining Robinia pseudoacacia forest health conditions in the YRD, China, using an RF classifier.

Study Area
The Yellow River Delta (YRD) is situated in the estuary of the Yellow River in the City of Dongying, Shandong Province of Eastern China (Figure 1). It has a warm temperate, continental monsoon climate with distinctive seasonality. The average annual temperature is 11.7-12.6 °C, and the mean annual precipitation varies between 530 and 630 mm, 70% of which falls during summer. Approximately 10.5 million tons of sand and soil discharged by the river annually are deposited in the delta, forming a vast floodplain and special wetland landscape [54]. The dominant soil types in this area are Calcaric Fluvisols, Salic Fluvisols, and Gleyic Solonchaks. A low groundwater table with a high level of dissolved salts in it and a high evaporation rate (the ratio of the average annual evaporation to precipitation is 3.5:1) make saline soil and secondary saline soil widely spread. In the YRD, the natural vegetation includes herbs (Viola philippica, Phragmites australis, Setaria viridis, Imperata cylindrica, Aeluropus littoralis, etc.) and shrubs (Salix matsudana and Tamarix chinensis). There are no natural forests in the YRD. Robinia pseudoacacia, one of the fast-growing deciduous species in the world, has a certain ability to tolerate drought and soil salinity; therefore it has been planted widely in this area since the 1970s, and these plantations have become the largest artificial forests in China [33]. However, Robinia pseudoacacia forests in the YRD have suffered continuously from dieback and mortality since the 1990s. Besides Robinia pseudoacacia, there are other forest plantations with a relatively small size, including Populus tomentosa, Fraxinus chinensis, Ulmus pumila, and Ailanthus altissima. There are four forest areas in the YRD, including Gudao, Machang, Abandoned Yellow River, and Nature Reserve, with a total area of 27.94 km² [33]. Our study area is only the part of the Robinia pseudoacacia forest covered by the IKONOS imagery in Figure 1. A two-level sampling design (plan) was adopted for a ground survey of forest crown conditions. The color dots in the color composite image show the locations of the sampling plots for three health conditions of Robinia pseudoacacia forests.

Satellite Imagery
IKONOS imagery. A scene of IKONOS satellite imagery with four multispectral (MS) bands (i.e., blue, green, red, and NIR bands) at 4 m resolution and one panchromatic (Pan) band at 1 m resolution was acquired for the study area on 9 June 2013 with an average off-nadir view angle of 8.1°. Figure 1 shows its false color composite image, which covers an area of 140 km².
ALOS imagery. ALOS (Advanced Land Observing Satellite) imagery obtained on 12 October 2010 was used as a reference for preliminary field surveys of Robinia pseudoacacia forests on 12-20 May 2012 in order to obtain the boundary of Robinia pseudoacacia forests in the study area. Pan-sharpened false color composite ALOS imagery at 2.5 m resolution was brought to the field to delineate the tree crown conditions, which was also a basis for designing the field sampling plan in May 2013.

Field Samples
Aided by the ALOS pan-sharpened imagery and a Global Navigation Satellite System (GNSS) unit with a horizontal accuracy of <1 m, a field crew of five people collected data on Robinia pseudoacacia health status from 75 plots (a plot somewhat similar to a stand in size) within homogenous patches across the study area during 15-26 May 2013 and 29 May-9 June 2014. Considering an average crown diameter of about 4-6 m and tree spacing ranging from 3 to 4 m, based on a previous field survey in 2012, we designed a two-level sampling plan (see Figure 1). Each plot represented a 30 m × 30 m area. In each plot, five 10 m × 10 m subplots were arranged, one in each of the four corners and one in the center of the plot. Consequently there were 375 subplots (75 × 5 = 375) in total. GNSS was used to measure the four-corner location of a plot. In every subplot, we first selected three trees that had an approximately average diameter at breast height (DBH) and an average crown dieback, then measured and calculated the five crown condition indicators [55], including live crown ratio (=live crown length/actual tree crown length × 100%), crown density, crown diameter, crown dieback, and foliar transparency, and finally averaged the survey data from the three trees as the subplot crown condition. For each plot (stand), the crown condition was obtained by averaging the survey data of its five subplots.

Image Preprocessing
IKONOS imagery was radiometrically corrected to ground surface reflectance by utilizing the empirical line calibration (ELC) [56]. In this study, the in situ spectral measurements were taken from targets of river (deep/clear water) and concrete ground located within image areas during 25-26 May 2013 by using an Analytical Spectral Devices (ASD) spectrometer. Although in situ spectral measurements were not collected on the same day as the acquisition of IKONOS imagery (9 June 2013), we tried to choose a measuring time close to the image acquisition and clear sky conditions to collect in situ spectra from targets. Both IKONOS MS bands and the Pan band were geometrically rectified into the Universal Transverse Mercator coordinate system using 20 ground control points and a nearest-neighbor resampling method with a root mean square error of less than 0.5 pixels. In addition, since we only focused on Robinia pseudoacacia forests, we used the forest boundary data obtained from the field survey on 12-20 May 2012 to mask out non-forest areas, such as water, urban, roads, crop areas, etc. before extracting textural/spatial information and classifying forest health conditions.
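The empirical line calibration described above can be illustrated with a minimal sketch. It assumes the usual ELC form, a per-band linear relation between image digital numbers (DN) and in situ reflectance fitted through a dark target (water) and a bright target (concrete). The function names and the DN/reflectance values are hypothetical and are not taken from the study.

```python
def fit_empirical_line(dn_values, reflectances):
    """Least-squares fit of reflectance = gain * DN + offset for one band."""
    n = len(dn_values)
    mean_dn = sum(dn_values) / n
    mean_ref = sum(reflectances) / n
    sxx = sum((d - mean_dn) ** 2 for d in dn_values)
    sxy = sum((d - mean_dn) * (r - mean_ref)
              for d, r in zip(dn_values, reflectances))
    gain = sxy / sxx
    offset = mean_ref - gain * mean_dn
    return gain, offset


def apply_empirical_line(band, gain, offset):
    """Convert a 2-D list of DN values to surface reflectance."""
    return [[gain * dn + offset for dn in row] for row in band]


# Hypothetical calibration targets for one band (not measured values):
# deep/clear water as the dark target, concrete as the bright target.
target_dn = [120.0, 620.0]
target_reflectance = [0.02, 0.32]

gain, offset = fit_empirical_line(target_dn, target_reflectance)
calibrated = apply_empirical_line([[120.0, 370.0, 620.0]], gain, offset)
```

With only two targets the fitted line passes through both points exactly; additional targets would turn the same function into a regression over all of them.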

Crown Condition Classification Standard
Within each of the 375 subplots, three average Robinia pseudoacacia trees were selected and evaluated according to the USFS (United States Forest Service) Crown Condition Classification Guide (CCCG) standard [55]. These CCCG indicator data were then averaged from the five subplots for each plot and each indicator was classified into one class from the three vigor classes based on a range of values in the CCCG (see Table 1).

The Grey-Level Co-Occurrence Matrix (GLCM)
GLCM was suggested as a mechanism for deriving texture measures [57]. Spatial co-occurrence texture analysis requires a user to identify five different control variables [29]: (1) window size; (2) the texture measure(s); (3) input channel (i.e., spectral channel to measure the texture); (4) quantization level of output channel (eight-bit, 16-bit, or 32-bit); and (5) the spatial component (i.e., the inter-pixel distance and angle during co-occurrence computation). Previous research on spruce forest structure variables (including age, top height, circumference, stand density, and basal area) using IKONOS imagery recommended the selection of textural features with consideration of the window size, the displacement, and direction [31]. We adopted this method to compute GLCM textures from IKONOS MS bands. The input pixel values were quantized to 32 discrete levels, which led to a 32 × 32 element GLCM matrix. Given a large enough window size, any offset could be used, but (1, 1) is the most commonly used offset [58]. Therefore the displacement of (1, 1) was adopted. In this study we focused on three GLCM control variables, namely, the window size, the texture measure, and the direction, based on IKONOS Band 4 (NIR band). The reasons for choosing Band 4 to test the effects of the window size and direction on classifying forest health conditions include: (1) the workload was too heavy to test all window sizes and directions for all four MS bands; (2) per statistics of training samples of Robinia pseudoacacia health conditions extracted from MS bands (Figure 2), Band 4 was the most effective in discriminating among the three health levels; and (3) Pu and Cheng [59] found that the TM NIR band was the most important band for correlating with LAI (note that the TM NIR band has the same wavelength as IKONOS Band 4).
Thus the optimal window size and direction would be determined based on IKONOS Band 4, and then the determined window size and direction would be applied to all other IKONOS MS bands. For this study, eight widely used GLCM features were selected (Table 2). Window size is an important component of a texture analysis because texture is a multi-scale phenomenon [60]. If the window size is too small, it does not contain enough information about the area to perform an accurate analysis. However, if the window is too large, it can overlap with other types of ground cover and produce erroneous results. This is referred to as the edge effect [61,62]. In this study, we calculated eight GLCM features from IKONOS Band 4 starting from a 3 × 3 window size until an optimal window size leading to the highest classification accuracy could be identified. Since the directionality of texture has the potential to discriminate between isotropic and anisotropic structures [26], we also calculated eight GLCM measures from four directions (0°, 45°, 90°, 135°) based on IKONOS Band 4 to see whether the direction had any impact on the classification accuracy. Then the determined optimal window size and direction were applied to the other IKONOS MS bands. Finally, 32 GLCM features (4 MS bands × 8 GLCM measures) extracted from the four MS bands based on the optimal window size and direction were used to classify Robinia pseudoacacia forests' health conditions with an RF classifier. The GLCM textural analysis was implemented with the ENVI software [63].
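To make the control variables concrete, the following is a minimal pure-Python sketch of how the eight texture measures could be computed for a single window. It is not the ENVI implementation: the symmetric normalisation, the mapping of angles to pixel offsets, the assumption that reflectance lies in [0, 1], and all function names are assumptions made for illustration, and the formulas are the common Haralick-style forms.

```python
import math

# (row, col) offsets for the four directions at a 1-pixel displacement.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def quantize(window, levels=32, lo=0.0, hi=1.0):
    """Map values in [lo, hi] to integer grey levels 0..levels-1."""
    scale = levels / (hi - lo)
    return [[min(levels - 1, max(0, int((v - lo) * scale))) for v in row]
            for row in window]


def glcm(q, levels=32, direction=45):
    """Symmetric, normalised co-occurrence matrix (window must be >= 2 x 2)."""
    dr, dc = OFFSETS[direction]
    rows, cols = len(q), len(q[0])
    counts = [[0.0] * levels for _ in range(levels)]
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[q[r][c]][q[r2][c2]] += 1
                counts[q[r2][c2]][q[r][c]] += 1
                total += 2
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Eight texture measures from a normalised, symmetric GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean = sum(i * v for i, j, v in cells)
    var = sum((i - mean) ** 2 * v for i, j, v in cells)
    cov = sum((i - mean) * (j - mean) * v for i, j, v in cells)
    return {
        "mean": mean,
        "variance": var,
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for i, j, v in cells if v > 0),
        "asm": sum(v * v for i, j, v in cells),
        # Correlation is undefined for a flat window; 0.0 is a placeholder.
        "correlation": cov / var if var > 0 else 0.0,
    }


# Synthetic 13 x 13 windows: a flat patch and a checkerboard.
flat = [[0.5] * 13 for _ in range(13)]
checker = [[float((r + c) % 2) for c in range(13)] for r in range(13)]

flat_features = glcm_features(glcm(quantize(flat), direction=45))
checker_0 = glcm_features(glcm(quantize(checker), direction=0))
checker_45 = glcm_features(glcm(quantize(checker), direction=45))
```

The checkerboard shows why direction can matter for anisotropic patterns: horizontally adjacent pixels always differ, while diagonal neighbours are always equal, so the contrast is maximal at 0° and zero at 45°. In a moving-window implementation, `glcm_features` would be evaluated for the window centred on every pixel.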

Local Spatial Statistics
Spatial variations of forest health conditions were assessed across the study area through the Getis statistic (Gi). From an application perspective and in consideration of remotely sensed imagery, pixels from a healthy tree crown will generate clusters that differ in intensity from pixel clusters from a dieback tree crown. The Gi [34], as a dimensionless indicator, expresses the sum of the weighted variable values within some distance d of a particular unit cell i as a proportion of the sum of the variable values over the entire study area (Table 2). In this study, the Gi was used to analyze spatial autocorrelation characteristics of forest health conditions. Toward this objective, the Gi statistics or features were computed with a series of increasing distance values starting from 1 pixel (i.e., a window size of 3 × 3 pixels) and with seven neighborhood rules (including Rook's case, Bishop's case, Queen's case, Horizontal, Vertical, Positive slope, and Negative slope) [34] from the four IKONOS MS bands, respectively, using the ENVI software. This procedure was repeated with increasing lag sizes until the distance value leading to the highest classification accuracy could be identified. With this optimal distance value, we compared and selected an optimal neighborhood rule that could lead to the highest classification accuracy.
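As an illustration of the statistic described above, here is a minimal sketch of the unstandardised Gi for one band. It assumes that Queen's case at distance d means every pixel in the (2d + 1) × (2d + 1) window except the centre, which is consistent with d = 1 corresponding to a 3 × 3 window. The software used in the study may output a standardised variant, so the function below (with a hypothetical name) should be read as a sketch rather than a reproduction.

```python
def getis_gi_queen(band, d=5):
    """Unstandardised Gi per pixel with a one/zero Queen's-case weight matrix.

    Gi(d) = sum of neighbour values within distance d (centre excluded)
            divided by the sum of all other pixel values in the image.
    """
    rows, cols = len(band), len(band[0])
    total = sum(sum(row) for row in band)
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            local = 0.0
            for r2 in range(max(0, r - d), min(rows, r + d + 1)):
                for c2 in range(max(0, c - d), min(cols, c + d + 1)):
                    if (r2, c2) != (r, c):
                        local += band[r2][c2]
            out[r][c] = local / (total - band[r][c])
    return out


# Synthetic uniform 5 x 5 band: every pixel has the same value.
uniform = [[1.0] * 5 for _ in range(5)]
gi_d1 = getis_gi_queen(uniform, d=1)
gi_d5 = getis_gi_queen(uniform, d=5)
```

On a uniform image the centre pixel at d = 1 has 8 of the 24 other pixels as neighbours, so its Gi is 1/3, while a corner pixel has only 3. High Gi values on real imagery indicate clusters of high pixel values around a location, which is what separates a closed healthy canopy from dieback patches.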
Table 2. A list of GLCM texture measures^a and the Gi statistic^b, extracted from the four IKONOS MS bands and used in this analysis.
Note: (B#) represents the band number of IKONOS MS imagery. For the GLCM texture measures, i, j are the row/column numbers in a spatial-dependence matrix; p(i, j) is the value in the cell i, j in the matrix; N is the number of rows or columns and equals the number of grey levels. For the Gi statistic, w_ij(d) is a symmetric one/zero spatial weight matrix with ones for all links defined as being within distance d of a given i; all other links are zero, including the link of point i to itself; i is a given pixel and j are the pixels (not including the ith pixel) in the image; p_i, p_j are the pixel values; n is the total number of pixels in the image. ^a Haralick et al. [57]; ^b Getis and Ord [34].
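The body of Table 2 is not reproduced in this text. For orientation only, the measures named in the note are commonly written as follows, using the same symbols (p(i, j), N, w_ij(d), p_j, n). These are the widely used Haralick-style and Getis-Ord forms and are an assumption about the table's content, not a transcription of it; indexing conventions (0 or 1 based) vary between implementations.

```latex
\begin{align*}
\text{Mean: } \mu &= \sum_{i=1}^{N}\sum_{j=1}^{N} i\, p(i,j) \\
\text{Variance: } \sigma^{2} &= \sum_{i=1}^{N}\sum_{j=1}^{N} (i-\mu)^{2}\, p(i,j) \\
\text{Homogeneity} &= \sum_{i=1}^{N}\sum_{j=1}^{N} \frac{p(i,j)}{1+(i-j)^{2}} \\
\text{Contrast} &= \sum_{i=1}^{N}\sum_{j=1}^{N} (i-j)^{2}\, p(i,j) \\
\text{Dissimilarity} &= \sum_{i=1}^{N}\sum_{j=1}^{N} |i-j|\, p(i,j) \\
\text{Entropy} &= -\sum_{i=1}^{N}\sum_{j=1}^{N} p(i,j)\,\ln p(i,j) \\
\text{Angular second moment} &= \sum_{i=1}^{N}\sum_{j=1}^{N} p(i,j)^{2} \\
\text{Correlation} &= \sum_{i=1}^{N}\sum_{j=1}^{N} \frac{(i-\mu)(j-\mu)\, p(i,j)}{\sigma^{2}} \\
G_i(d) &= \frac{\sum_{j=1}^{n} w_{ij}(d)\, p_j}{\sum_{j=1}^{n} p_j}, \qquad j \neq i
\end{align*}
```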

Random Forest Classification
Random forests [46] grow many decision trees without statistical pruning; for classification, the result is the majority vote of all the trees (for regression, it is their average). Individual trees in the forest are built using bootstrap aggregation (bagging), which involves randomly drawing, with replacement, a bootstrap sample of the original training dataset. The bootstrap sample used to grow each individual tree usually contains about two thirds of the calibration dataset. The samples that are not present in that bootstrap sample form another subset called out-of-bag (OOB). In addition to bagging, each tree split is based on a random subset of the input variables [46]. Therefore, the randomness introduced in the dataset selection and in the variable selection (mtry) makes random forests an accurate tool for prediction. Additionally, there are only two tuning parameters required for growing random forests: the number of trees to be grown (ntree), and the number of possible splitting variables (mtry) that are sampled at each node. In this study, we used the default values, i.e., mtry equal to the square root of the number of predictive variables and ntree = 500, as previous researchers have shown that sensitivity to the user-defined parameters is minimal and the default values are often a good choice [64,65].
A useful byproduct of RF is individual variable importance, which is a measure of the strength of a variable in the final model. RF variable importance is a useful output for gaining better insight into which variable or set of variables is the most relevant for the classification of the classes. To assess the importance of each feature, the RF randomly permutes the values of one input variable while keeping the rest unchanged, and it measures the resulting decrease in accuracy by means of the OOB error estimation and of the Gini index decrease [46]. We utilized the variable importance to rank the IKONOS spectral/textural/spatial features to account for their ability to discriminate amongst healthy, medium, and severe dieback Robinia pseudoacacia trees.
Although the RF algorithm does include a built-in "out-of-bag" validation scheme, a previous study showed that the OOB error was insufficient for spatial data [66]. Thus we used an independent validation data set to assess the RF classification accuracy. The ground reference dataset was randomly divided into two thirds (150 polygons) and one third (75 polygons) for training and validation, respectively. RF classification was performed using imageRF [67], which is a license-free platform that can be integrated into commercially available IDL/ENVI software and can also be run as an add-on in the EnMAP-Box, which is an open-source and platform-independent software interface for image processing. In this study, confusion matrices were used to assess the classification accuracy from independent validation samples. Kappa index, overall accuracy (OA), Producer's accuracy (PA), and User's accuracy (UA) [68,69] were used to assess accuracies of classifying forest health conditions with different spectral/textural/spatial features.
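The two sources of randomness and the default mtry can be sketched in a few lines. This is not imageRF code; the helper names are hypothetical, and flooring the square root is an assumption about how a non-integer root is rounded.

```python
import math
import random


def default_mtry(n_features):
    """Default mtry: floor of the square root of the number of predictors."""
    return max(1, math.isqrt(n_features))


def bootstrap_split(n_samples, rng):
    """Draw a bootstrap sample (with replacement) and return (in_bag, oob)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(in_bag)
    oob = [i for i in range(n_samples) if i not in chosen]
    return in_bag, oob


def candidate_features(n_features, rng):
    """Random subset of predictor indices considered at one tree node."""
    return rng.sample(range(n_features), default_mtry(n_features))


rng = random.Random(0)
in_bag, oob = bootstrap_split(150, rng)       # 150 training polygons
node_features = candidate_features(16, rng)   # 16 combined features
```

For the feature sets in this study the rule gives mtry = 2 for the four MS bands, 4 for the 16 combined features, and 5 for the 32 GLCM features. On average a bootstrap sample contains about 63% (1 − 1/e) of the distinct training samples, which is the "about two thirds" mentioned above; the remainder form the OOB set for that tree.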
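The accuracy-based importance measure can be sketched as a permutation test. The version below is a simplification: it works on any prediction function and a single labelled dataset, whereas RF applies the permutation to the OOB samples of each tree and averages over trees. The toy classifier and data are hypothetical.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows whose predicted label matches the reference label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, feature, rng):
    """Decrease in accuracy after shuffling one feature column."""
    baseline = accuracy(predict, X, y)
    column = [row[feature] for row in X]
    rng.shuffle(column)
    permuted = [row[:feature] + [v] + row[feature + 1:]
                for row, v in zip(X, column)]
    return baseline - accuracy(predict, permuted, y)


def toy_predict(row):
    """Toy classifier that only looks at feature 0; feature 1 is ignored."""
    return "healthy" if row[0] > 0.5 else "dieback"


rng = random.Random(1)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [toy_predict(row) for row in X]

importance_f0 = permutation_importance(toy_predict, X, y, 0, rng)
importance_f1 = permutation_importance(toy_predict, X, y, 1, rng)
```

Shuffling a feature that the classifier never uses leaves every prediction unchanged, so its importance is exactly zero, while shuffling an informative feature can only lower the accuracy from its baseline.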
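The four accuracy indices can be derived from a confusion matrix as sketched below. The matrix orientation (rows are classified classes, columns are reference classes) and the example counts are assumptions for illustration; the counts are not taken from Table 3.

```python
def accuracy_report(matrix):
    """OA, Kappa, Producer's and User's accuracy from a confusion matrix.

    matrix[i][j] = number of validation samples of reference class j
    that were classified as class i. Every class is assumed to occur at
    least once in both the reference and the classified data.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    diag = [matrix[i][i] for i in range(k)]
    row_totals = [sum(matrix[i]) for i in range(k)]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    oa = sum(diag) / n
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (oa - expected) / (1 - expected)
    pa = [diag[j] / col_totals[j] for j in range(k)]  # 1 - omission error
    ua = [diag[i] / row_totals[i] for i in range(k)]  # 1 - commission error
    return {"OA": oa, "Kappa": kappa, "PA": pa, "UA": ua}


# Hypothetical 3-class matrix (healthy, medium dieback, severe dieback).
example = [
    [20, 2, 0],
    [3, 22, 4],
    [0, 1, 23],
]
report = accuracy_report(example)
perfect = accuracy_report([[25, 0, 0], [0, 25, 0], [0, 0, 25]])
```

Producer's accuracy is the complement of the omission error for a reference class and User's accuracy is the complement of the commission error for a mapped class, which is how the per-class errors discussed in the results relate to these indices.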

Experimental Procedure
In this study, the experimental procedure was carried out by using the ENVI software and imageRF software. Figure 3 presents a flowchart for forest health conditions classification using IKONOS MS bands, GLCM features, Gi features, and a combination of the three datasets. In this experiment, after IKONOS imagery data were preprocessed, the procedure of RF classification could be executed with the following four main steps: (1) classifying imagery using MS spectral bands only; (2) classifying imagery using the GLCM textural features only; (3) classifying imagery using the local spatial Gi features only; and (4) classifying imagery using combined MS bands, GLCM, and Gi features, and ranking the contribution of the spectral/textural/spatial features to the classification. Step 1 was performed to provide a benchmark against which to compare the classifications using textural/spatial features only and those using combinations of spectral, textural, and spatial features. The performance of the standard (benchmark) classification technique is necessary in order to measure whether incorporating textural/spatial information can improve classification accuracy.
At Step 2, the eight GLCM textures (Table 2) were first calculated from the IKONOS NIR band to determine the control variables (i.e., window size and direction) based on the highest RF classification accuracy, then the optimal window size and direction were applied to other MS bands, and finally GLCM features were used as input to run the RF classification. This step included: (1) calculating the eight GLCM measures from IKONOS Band 4 with window sizes from 3 × 3 through 17 × 17 and a fixed direction 135° to determine an optimal window size; (2) calculating the eight GLCM measures from IKONOS Band 4 with the optimal window size and four directions of 0°, 45°, 90°, and 135° to determine an optimal direction; (3) applying the optimal window size and direction to all four IKONOS MS bands to calculate 32 textures (8 GLCM measures × 4 MS bands); and (4) performing RF classifications using the eight GLCM textures calculated from Band 4 and all 32 GLCM textures calculated from the four MS bands.
At Step 3, first, an iterative classification scheme was used to determine the optimal distance value and the best neighborhood rule for calculating local spatial Gi features. Second, the four Gi features calculated with the optimum distance value and neighborhood rule from the four MS bands were used as input variables of the RF classifier.
At Step 4, the stack of MS bands, GLCM features and Gi features, determined at steps 2 and 3, was used as input to the RF classifier in order to explore whether a combination of MS+GLCM+Gi features could improve classification accuracy compared with using MS bands, GLCM features, or Gi features only. Then all of the above predictive features were ranked to determine their differentiation ability in assessing Robinia pseudoacacia forest health conditions.

Determining the Optimal GLCM Window Size and Direction
Figure 4 shows how the classification accuracy changes as the window size increases from 3 × 3 to 17 × 17. All eight GLCM textural features for each window size were calculated from IKONOS Band 4 with direction 135°. As the figure shows, classification accuracy increases as the window size increases up to the 13 × 13 window size. Beyond that size, the OA and Kappa coefficients decrease slightly. Therefore, in this study, we chose the 13 × 13 window size as an optimal window size to compute the eight textural features to perform RF classification.
Next, with the fixed 13 × 13 window size, we further tested the effects of four directions (0°, 45°, 90°, 135°) on the RF classification result with the eight GLCM features extracted from IKONOS Band 4 (Figure 5). According to the OA, a direction of 45° generated slightly better results than the other directions. However, based on Producer's accuracy differences among the four directions, calculated from the 13 × 13 window size for the three health levels, one-way ANOVA test (SPSS V.21.0, 2012) results indicated that the differences in Producer's accuracies among the four directions were not statistically significant at α = 0.05. Therefore a direction of 45° was selected as an "optimal" direction, although its marginally higher accuracy may have arisen by chance. Moreover, the same one-way ANOVA test was also applied to three window sizes (from 11 × 11 to 15 × 15) with the fixed direction of 45°, and the test results also indicated that the differences of Producer's accuracies among the three window sizes were not statistically significant at α = 0.05. Therefore the 13 × 13 window was selected as an "optimal" window size, although its accuracy is only marginally higher than that of the other window sizes. Finally, the determined optimal window size of 13 × 13 and direction of 45° were applied to all four MS bands to extract a total of 32 GLCM features.
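The one-way ANOVA used above compares between-group and within-group variability. The sketch below computes the F statistic for four groups (directions) with three Producer's accuracy values each (one per health level). The numbers are hypothetical placeholders, not the study's accuracies, and the function name is an assumption.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical Producer's accuracies (%) for healthy, medium, and severe
# dieback classes under each direction; illustrative values only.
pa_by_direction = {
    0: [97.0, 93.0, 96.0],
    45: [98.0, 95.0, 97.0],
    90: [97.0, 94.0, 96.0],
    135: [96.0, 94.0, 97.0],
}
f_stat, df_b, df_w = one_way_anova_f(list(pa_by_direction.values()))

# Small check case with a known answer: F = 3.0 with (2, 6) degrees of freedom.
check_f, check_dfb, check_dfw = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

With four directions and three health levels the test has (3, 8) degrees of freedom, for which the critical F value at α = 0.05 is about 4.07; an F statistic below that threshold corresponds to the "not statistically significant" outcome reported above.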

Determining the Optimal Gi Distance Value and Neighborhood Rule
When comparing results of classification using Gi features with varying distance values and neighborhood rules, it is easy to see from Figure 6 that the Queen's neighborhood rule and the distance value of 5 would lead to the highest accuracy. Therefore, in this study, we adopted 5 as the optimal distance value and Queen's as the optimal neighborhood rule to calculate the four Gi features from the four IKONOS MS bands for use in the RF classification.

RF Classification Results
Table 3 lists the RF classification results based on a set of confusion matrices created with independent validation samples using spectral bands, textural features, spatial features, and spectral/textural/spatial features together. If only using spectral bands as input for RF classification, the OA was 79.5% and the Kappa coefficient was 0.6821, with lower PA and UA for dieback forests (especially for medium dieback forest sites). If using the eight GLCM features calculated from IKONOS Band 4 as input, the OA of the forest health conditions classification was improved greatly to 97.1% compared to that created with the four MS bands only. However, if using all 32 GLCM features calculated from all four IKONOS MS bands as input, the OA of the forest health conditions classification decreased to 93.3%. The omission and commission errors for severe and medium dieback conditions of Robinia pseudoacacia forests, compared with those using the eight GLCM features as RF predictive variables, increased substantially. If using the four Gi features (calculated from the four MS bands) as input, the classification accuracy (94.0%) was almost the same as that using the 32 GLCM features as input (93.3%). The difference in accuracies between these two datasets was that the commission errors for severe dieback and omission errors for medium dieback decreased substantially for the Gi features, while omission errors for healthy and commission errors for medium dieback increased slightly. In order to assess the performance of the combined features, the four MS bands, the eight GLCM features, and the four Gi features were combined and fed into the RF classifier. The reason we chose the eight GLCM features rather than the 32 GLCM features was that (1) the former yielded a higher classification accuracy and (2) the computational burden would be increased if we used all 32 GLCM features. Table 3 shows that the combined features (16) did not yield a higher classification accuracy (96.9%) than the GLCM features alone (eight) did (97.1%).

Table 3. Confusion matrices and RF classification accuracies using four spectral bands (reflectances); eight GLCM textures (calculated from Band 4 using the window size of 13 × 13 and direction 45°); 32 GLCM textures (calculated from the four MS bands using the window size of 13 × 13 and direction 45°); four Gi features (calculated from the four MS bands using distance value of 5 and Queen's neighborhood rule); and four spectral, eight GLCM, and four Gi combined features together. The number in parentheses is the number of total features used. OA%: overall accuracy; PA%: Producer's accuracy; UA%: User's accuracy; M: medium; S: severe. Accuracy indices were computed based on independent validation samples.

Contribution of All Predictive Variables
The combined features importance plot in Figure 7 presents the relative contributions of the 16 individual features for separating the three health conditions of Robinia pseudoacacia forests. The plot reveals that the most important predictive variable was GLCM texture mean, extracted from Band 4 of IKONOS imagery, followed by the Gi features calculated from Bands 4, 3, 1 and 2, and then Band 4. According to the mean decrease in accuracy values illustrated in Figure 7, the first and second ranked features, MEA (B4) and Gi (B4), are 5-6 times more important than the third ranked feature (Gi (B3)).

GLCM Feature Analysis
The moving window size of the GLCM is a key parameter in texture analysis. There are different methods for determining the window size for calculating textures. For example, Franklin et al. [70] used a range of experimental semivariograms to optimize the texture window size in remote sensing of forest inventory and forest structure characteristics. In this study, we determined the optimal window size by comparing RF classification accuracies. For the eight GLCM texture measures, the 13 × 13 window size was an ideal single window choice. Su et al. [71] found that GLCM angular second moment at a 45° angle could improve buildings classification using QuickBird image because buildings in Kuala Lumpur, Malaysia strike northwest and southeast, corresponding to the 135° direction. Kayitakire et al. [31] found that the direction parameter had minimal effects on retrieving forest structure variables in even-aged common spruce stands based on IKONOS-2 imagery. In our study area, the direction showed an insignificant impact on GLCM texture information. To compare the performance between textural and spatial features for classifying the three forest health levels, we computed the eight textural features from each IKONOS MS band. However, the eight GLCM features extracted from IKONOS Band 4 achieved a higher OA (97.1%) than that (93.3%) using all 32 GLCM features extracted from the four MS bands (Table 3). The lower classification accuracy obtained with the 32 GLCM features may be due to a large amount of redundant and inter-correlated textural information.
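To make the roles of the window and the direction concrete, the following sketch builds a normalized, symmetric GLCM for a single window and derives several of the texture measures named above. It is a minimal illustration under stated assumptions: the grey-level quantization (`levels`), the symmetric counting, and the function names are choices made here, not details confirmed by the study, and correlation is omitted because it is undefined for a window with zero variance.

```python
import math


def glcm(window, levels, offset=(-1, 1), symmetric=True):
    """Normalized grey-level co-occurrence matrix of one window.

    offset=(-1, 1) pairs each pixel with its upper-right neighbour,
    i.e. direction 45 degrees with a displacement of 1 pixel.
    Pixel values must be integers in [0, levels).
    """
    dr, dc = offset
    rows, cols = len(window), len(window[0])
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Texture measures computed from a normalized GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean = sum(i * v for i, j, v in cells)
    return {
        "mean": mean,
        "variance": sum((i - mean) ** 2 * v for i, j, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "asm": sum(v * v for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for i, j, v in cells if v > 0),
    }


# A 13 x 13 checkerboard of grey levels 0 and 3 (illustrative data).
window = [[3 * ((r + c) % 2) for c in range(13)] for r in range(13)]
diag45 = glcm_features(glcm(window, levels=4, offset=(-1, 1)))
horiz0 = glcm_features(glcm(window, levels=4, offset=(0, 1)))
print(diag45["contrast"], horiz0["contrast"])
```

The checkerboard is an extreme case in which the direction matters: diagonal neighbours always share a grey level, so contrast is zero at 45°, whereas horizontal neighbours always differ. In a forest scene without a dominant orientation, such as the one studied here, the direction would be expected to have much less influence.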

Figure 7. Combined individual features' importance plot, expressed as the mean decrease in accuracy (%) when a feature was left out of a classification. B1, B2, B3, and B4 represent IKONOS blue band, green band, red band, and NIR band, respectively. The eight GLCM features were calculated from IKONOS Band 4 with the window size of 13 × 13, direction 45°, and displacement 1 pixel. The Gi features were calculated from the four IKONOS MS bands with the distance value of 5 and Queen's neighborhood rule.
Figure 8b illustrates the GLCM MEA (B4) with the optimal window size of 13 × 13 and direction 45° covering a partial area of the study area. In this image, one can clearly distinguish many clusters of pixels of varying intensities. The majority of the imaged area was characterized by high GLCM Mean values (in blue in the image). Along the river channel in the north and along the road and drainage ditch from northeast to southwest, the values of MEA (B4) were smaller (in brown to yellow in the image).

Gi Feature Analysis
Ghimire et al. [39] found that making use of the Gi statistic with different distance values led to a substantial increase in per-class classification accuracy of heterogeneous land-cover categories. The Kappa values of the RF classifications that used a combination of spectral and Gi variables at three different distance values (1, 3, and 5 pixels) ranged from 0.85 to 0.92 (vs. 0.78 using only spectral bands). In this study, we also compared classification accuracies at different distance values (from 1 to 8 pixels) and demonstrated that a distance value of 5 pixels was optimal (Figure 6). In contrast, Myint et al. [38] reported that the Gi statistics calculated from IKONOS pan-sharpened MS bands using the shortest distance threshold (i.e., 1 pixel or 1 m) achieved the highest OA (75%) in urban land-use and land-cover classification. A small window size (or distance value) indicates that a spatial dependency is confined to a very localized region, while a large distance value indicates more spatially extensive spatial dependence [72]. Compared with heterogeneous urban areas, forest is relatively homogeneous, and thus the local spatial autocorrelation covers a larger area. In addition, the optimal distance value of 5 pixels (i.e., 11 × 11 window size for IKONOS MS bands) for Gi features was close to the optimal window size of 13 × 13 for GLCM features.
Figure 8c presents an example of an image created with Gi (B4). By visually comparing Figure 8c with the GLCM Mean image in Figure 8b, we found that the spatial structure of the two images was very similar, with a larger area of low values along river channels and road lines. Figure 7 shows that the first and second most important features were both computed from the IKONOS NIR band (Band 4). This is because, compared with the visible bands, (1) Band 4 is more effective for detecting healthy or diseased trees with reduced cell vigor (Figure 2) and (2) the NIR band has greater penetration energy through the canopy, and thus the band, reflecting more understory and ground surface information, can characterize more subtle spatial variation than the visible bands. A similar observation was reported by Treitz and Howarth [73].
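The relation between the distance value and the window size, and the meaning of high and low Gi values, can be sketched as follows. The sketch assumes the standardized Gi* form of Ord and Getis with binary weights, a Queen's-case neighborhood defined as all pixels within a Chebyshev distance d (a (2d + 1) × (2d + 1) window, so d = 5 gives 11 × 11), and windows clipped at the image border. These are assumptions made for illustration; the study's exact formulation may differ.

```python
import math


def getis_gi_star(image, d=5):
    """Standardized Gi* per pixel with binary Queen's-case weights.

    Assumed form:
        Gi* = (sum_j w_ij x_j - W_i * mean)
              / (s * sqrt((n * W_i - W_i**2) / (n - 1)))
    where W_i is the number of pixels in the (clipped) window, and
    mean and s are the global mean and standard deviation of the image.
    """
    rows, cols = len(image), len(image[0])
    values = [v for row in image for v in row]
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(max(0.0, sum(v * v for v in values) / n - mean ** 2))

    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            r0, r1 = max(0, r - d), min(rows, r + d + 1)
            c0, c1 = max(0, c - d), min(cols, c + d + 1)
            local_sum = sum(image[i][j] for i in range(r0, r1) for j in range(c0, c1))
            w = (r1 - r0) * (c1 - c0)
            denom = s * math.sqrt((n * w - w * w) / (n - 1))
            # A constant image or a window covering the whole image gives denom == 0.
            out[r][c] = (local_sum - w * mean) / denom if denom > 0 else 0.0
    return out


# Illustrative 20 x 20 image: high values on the left half, low values on the right.
image = [[10 if c < 10 else 1 for c in range(20)] for r in range(20)]
gi = getis_gi_star(image, d=2)
print(gi[10][2], gi[10][17])
```

Pixels inside the cluster of high values receive large positive Gi* values and pixels inside the cluster of low values receive large negative ones. This is the behavior that makes the statistic useful for separating homogeneous healthy stands from more variable dieback stands.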

Stability of the RF Variable Importance
An RF allows for assessment of the importance of the variables by means of the Gini index and the OOB subset [46]. Some researchers have reported that the stability of RF variable importance was dependent on the number of trees used in the ensemble (e.g., [46,74]), while other researchers have opted to calculate the variable importance after running the RF algorithm many times using different training subsamples of an original dataset [75]. To assess the stability of variable importance in this study, RF classification was performed with different numbers of trees. We chose the GLCM features (8) (Table 2) as an example of RF input, with the number of trees ranging from 100 to 800 at an interval of 100. The results demonstrated that the rank of variable importance was always the same, with a statistically insignificant change (at α = 0.05 based on a one-way ANOVA test) in the variable importance values.

Implications for Classification of Forest Health Conditions
Among the different spectral, textural, and spatial feature datasets, spectral-only data (i.e., four band reflectances) resulted in the lowest classification accuracy, with a serious salt-and-pepper phenomenon in the classification map (Figure 8d). This was especially obvious for medium or severe dieback forest stands, which had lower PA and UA in Table 3. Because understory green grass showed a similar spectral appearance to the healthy tree canopy, it was difficult to distinguish between the healthy and dieback forest crowns using image spectral information only. In contrast, the classification accuracies based on the GLCM features (eight) and on the Gi features (four) improved greatly (Table 3). The commission and omission errors for the three health conditions of forest stands decreased substantially (Table 3). This indicated that both textural and local spatial information in the Robinia pseudoacacia forests could help quantify the degree of spatial homogeneity or heterogeneity of different forest conditions. Figure 8e,f shows a very similar clustered pattern in the classification maps of forest health conditions obtained using textural and spatial features separately.
In the study area, Robinia pseudoacacia forests are plantations with tree spacing ranging from 3 to 4 m and an average crown diameter of 4-6 m. Based on field survey data, we found that the crown closure was >80%, foliar transparency was <20%, and crown diameter was >5.5 m in healthy forest stands with an understory of stunted and shade-tolerant grasses (such as Carex rigescens, Humulus scandens, etc.). In medium dieback forest stands, crown closure was between 50 and 70%, foliar transparency was between 30 and 50%, and crown diameter was between 4.5 and 5.5 m. The understory grass types changed and heliophytes (such as Viola philippica, Setaria viridis, etc.) appeared. As tree crowns continued to die back, the understory was covered by tall and dense grasses (such as Imperata cylindrica, Sonchus oleraceus, etc.), with the tree crown closure decreased to 20-40%, foliar transparency >60%, and crown diameter <4.5 m. When a severe dieback forest was near a river or lakeside, Phragmites australis was the predominant understory. Figure 9 shows the schematic representation of three conditions of forest plots. At healthy forest sites (Figure 9a), live tree crowns predominantly occupied each pixel within a 13 × 13 window of the image. Thus, for the homogeneous healthy forests, the image texture was relatively smooth and local spatial dependence was relatively high, leading to high texture MEA (B4) and Gi (B4) values (blue color in Figure 8b,c). Conversely, at dieback forest sites (see Figure 9b,c), image pixels within the 13 × 13 window might be covered by different components of dieback crown, live crown, understory grass, canopy shadow, or a mixture of the components, and thus the texture of dieback forest stands would be relatively rough, with lower texture mean values (from yellow to brown color in Figure 8b). With the increase of tree dieback, which led to a higher local spatial variation, Gi (B4) values decreased correspondingly (from yellow to brown color in Figure 8c).
Remotely sensed data can convey information regarding the distribution, structure, and condition of forests. It was obvious from Figure 8e that most medium and severe dieback forests were distributed along the river channel and road line/drainage ditch. This clustered spatial pattern indicated some underlying physical or ecological processes. Liang and Hua [75] found that there were three main factors that affected the dieback of Robinia pseudoacacia trees in this study area: soil texture, soil salt content, and ground water levels. Loam soil is suitable for Robinia pseudoacacia trees because it has good permeability for air and root growth. If the water table is less than 1 m deep, Robinia pseudoacacia trees will suffer from high mortality due to root rotting and crown dieback caused by long-term waterlogging. Robinia pseudoacacia trees can grow healthily with a soil salt content of less than 0.3%. In the YRD, soil salinity is related to ground elevation, groundwater depth, soil texture, and micro-geomorphology [76]. Zhang and Xing [77] found that higher soil salinity, lower nutrient content, and higher soil bulk density existed at dieback Robinia pseudoacacia forest stands in this study area. Thus, inclusion of ancillary data (such as soil salt content, soil texture, groundwater levels, and micro-geomorphology type) in the RF classification may help our understanding of the mechanism of tree crown dieback and improve classification accuracy for future applications. Moreover, in addition to GLCM and Gi features, other spatial features such as morphological textural features [78], which have proved promising in distinguishing red (Rhizophora mangle), white (Laguncularia racemosa), and black (Avicennia germinans) mangroves and rainforest regions with IKONOS imagery on the Caribbean coast of Panama, should be considered in future studies.

Conclusions
In this study, we used spectral, spatial, and textural information extracted from IKONOS multispectral imagery and a random forest (RF) classifier to determine three health conditions of Robinia pseudoacacia forests in the YRD, China. The experimental results demonstrated that both spatial and textural information outperformed spectral reflectance data. Nevertheless, the combination of all 16 features (four spectral, eight textural, and four spatial) did not yield better classification results than those created by the GLCM features alone (eight), which achieved the highest OA of 97.1%. To calculate the best local spatial statistical features, an optimal distance value of 5 pixels and the optimal Queen's neighborhood rule were adopted. The best GLCM texture measures were calculated with the optimal window size of 13 × 13 and direction of 45°. RF variable importance showed that the IKONOS NIR band was the most effective for textural and spatial information extraction, leading to a high separability of the health conditions of Robinia pseudoacacia forests.
Our results also indicated that the RF classifier was a useful and robust tool for identification of forest health conditions using spectral, textural, and spatial features extracted from VHR remote sensing data as input. In addition, texture measures and local spatial statistics extracted from VHR imagery (e.g., IKONOS imagery in this study) could be used to characterize the spatial structure of stressed forests and help us better understand some underlying physical or ecological processes.

Figure 1. Location of the study area. The false color composite image covering the Robinia pseudoacacia forests in the study area was made with IKONOS NIR band/Red band/Green band vs. R/G/B. A two-level sampling design (plan) was adopted for a ground survey of forest crown conditions. The color dots in the color composite image show the locations of the sampling plots for three health conditions of Robinia pseudoacacia forests.

Figure 2. Means (bars) and standard deviations (short lines) of training samples of three health conditions of Robinia pseudoacacia extracted from IKONOS MS bands. Bands 1-4 represent IKONOS blue, green, red, and NIR band, respectively.

Figure 3. A flowchart of the analysis procedure for identifying and mapping the three health conditions of Robinia pseudoacacia. See the text for the full names of those abbreviations used in the figure.

Figure 4. RF classification accuracies using the eight GLCM textural features across a range of window sizes. The eight GLCM texture measures were calculated from IKONOS Band 4 with a direction of 135° and displacement of 1 pixel.

Figure 5. Comparison of overall accuracies of RF classification across four directions using the eight GLCM features calculated from IKONOS Band 4 with the window size of 13 × 13 and displacement 1 pixel.

Figure 6. Illustration of the effects of the Gi distance thresholds (unit: pixel) and neighborhood rules on the RF classification accuracies.

Figure 8. Comparison between (a) a subset of false color IKONOS composite image (NIR band/red band/green band vs. R/G/B); (b) GLCM texture MEA (B4) calculated from IKONOS NIR band with window size of 13 × 13 and direction 45°; (c) Gi (B4) calculated from IKONOS NIR band with distance value of 5 and the Queen's neighborhood rule; (d) RF classification result created with the four IKONOS MS bands; (e) RF classification result created with the eight GLCM features calculated from IKONOS NIR band with the window size of 13 × 13 and direction 45°; and (f) RF classification result created with the four Gi features calculated from each IKONOS MS band with the distance value of 5 and Queen's neighborhood rule.


Figure 9. Schematic representation of three health conditions of forest plots, each with varying dieback and crown density: (a) a healthy forest; (b) a medium dieback forest; and (c) a severe dieback forest. To the right are photographs representing each health category.

Table 1. The CCCG (a) tree vigor indicators and classification thresholds (b).
(a) CCCG: Crown Condition Classification Guide; (b) Values of 100% were recorded as 99%, and intermediate values were upgraded to the next full 5% grouping.