Combining Binary and Post-Classification Change Analysis of Augmented ALOS Backscatter for Identifying Subtle Land Cover Changes

Abstract: This research aims to detect subtle changes by combining binary change analysis, the Iteratively Reweighted Multivariate Alteration Detection (IRMAD), over dual polarimetric Advanced Land Observing Satellite (ALOS) backscatter with augmented data for post-classification change analysis. The accuracy of change detection was iteratively evaluated based on thresholds composed of the mean and a range of constant multiples of the standard deviation. Four datasets were examined for post-classification change analysis: the dual polarimetric backscatter as the benchmark, and its augmentation with indices, entropy-alpha decomposition and selected texture features. Variable importance was then evaluated to build a best subset model employing seven classifiers: Bagged Classification and Regression Tree (CAB), Extreme Learning Machine Neural Network (ENN), Bagged Multivariate Adaptive Regression Spline (MAB), Regularised Random Forest (RFG), Original Random Forest (RFO), Support Vector Machine (SVM), and Extreme Gradient Boosting Tree (XGB). The best accuracy was 98.8%, which resulted from thresholding MAD variate-2 with a constant of 1.7. The highest improvement of classification accuracy was obtained by amending the grey level co-occurrence matrix (GLCM) texture. The identification of variable importance (VI) confirmed that selected GLCM textures (mean and variance of HH or HV) were equally superior, while the contributions of the indices and the decomposition were negligible. The best model produced similar classification accuracy, at about 90%, for both 2007 and 2010. Tree-based algorithms including RFO, RFG and XGB were more robust than SVM and ENN. Subtle changes indicated by binary change analysis were somewhat hidden in post-classification analysis. Reclassification by combining all important variables and adding five classes to include subtle changes, assisted by Google Earth, yielded an accuracy of 82%.


Introduction
Monitoring of change over conserved areas may not always identify significant differences between two observations. Subtle changes might be more relevant than abrupt changes when land uses are well-managed and there are no unexpected disturbances such as natural disasters. Vogelman explained that subtle change is a gradual change related to "within state" alteration of energy response that is commonly related to vegetation dynamics other than the normal phenological cycle [1]. A subtle change in a vegetated land cover may be a sign of vegetation damage due to diseases, insects, drought, and changes in the plant community [1], or indicating the change of vegetation

Multivariate Alteration Detection (MAD) and The Iteratively Reweighted Multivariate Alteration Detection
Multivariate Alteration Detection (MAD) is a popular binary change detection technique based on Hotelling's canonical correlation analysis. It transforms two sets of vector images of the same place, acquired at two time points, $M = (M_1, \ldots, M_p)$ and $N = (N_1, \ldots, N_p)$, into new images $Q = a^T M$ and $R = b^T N$. From the p spectral bands of the original bi-temporal images, two new images can be generated, each composed of p MAD variates. As suggested by Nielsen et al. [24], the vectors $a$ and $b$ are selected simultaneously by maximising the variance of the difference between Q and R, subject to the constraint that the variances of Q and R are both equal to 1.
With the variances of Q and R constrained in this way, the variance of Q − R ties MAD to canonical correlation analysis. Hence, maximising the variance of the difference between Q and R can be achieved by minimising the non-negative correlation ($\rho$):

$$\rho = \operatorname{Corr}(Q, R) = \operatorname{Corr}(a^T M, b^T N)$$

The variance of the difference can then be written as $\operatorname{Var}(Q - R) = 2(1 - \rho)$, where $\rho$ is the correlation between Q and R. The MAD transformation as the change result was defined by Reference [24] as:

$$\begin{bmatrix} M \\ N \end{bmatrix} \rightarrow \begin{bmatrix} a_p^T M - b_p^T N \\ \vdots \\ a_1^T M - b_1^T N \end{bmatrix}$$

The MAD variates are ordered by descending variance. MAD variate-1 is the difference between the highest-order canonical variates, while MAD variate-2 is the difference between the second-highest-order canonical variates [14]. Standardising values by computing correlation instead of covariance helps to cope with differing scales that result from different gain factors and atmospheric conditions. The improvement of MAD by iterative processing, IRMAD, is performed to improve separation among classes by adding more weight to no-change probabilities during the process [14].
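The relation between the variance of the difference and the correlation follows in one step from the unit-variance constraint. A short derivation, writing $\rho$ for the correlation between Q and R (so that $\operatorname{Cov}(Q, R) = \rho$ when both variances equal 1):

```latex
\begin{align}
\operatorname{Var}(Q - R)
  &= \operatorname{Var}(Q) + \operatorname{Var}(R) - 2\operatorname{Cov}(Q, R) \\
  &= 1 + 1 - 2\rho \\
  &= 2(1 - \rho)
\end{align}
```

The pair of canonical variates with the smallest correlation therefore yields the difference with the largest variance.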
Remote Sens. 2019, 11, 100

Compared to other binary or post-classification change analyses, MAD does not strictly demand pre-processing to produce an accurate change map. This advantage stems from a process that selects pseudo-invariant features (PIF), or invariant pixels, as non-change samples for relative normalization of images [25]. Pseudo-invariant features are usually selected from features such as buildings or constructions that are relatively constant in their reflectance over acquisitions, albeit with minor effects of seasonal conditions. Iteratively Reweighted Multivariate Alteration Detection (IRMAD) is an iterated version of MAD that excludes pixels of change detected at the preceding iteration from the next calculation [25]. The iteration is intended to reduce the adverse effect of change occurrence in the feature spaces by assigning a higher weight to unchanged pixels [25,26]. The weight is the probability of a pixel being unchanged, modeled using the chi-square distribution [17]. The iteration of MAD processing stops when criteria, such as a lack of change in the canonical correlations, are met [25]. IRMAD performed better than a single MAD transformation in analyzing multitemporal images with a dominant change [25] and provided an accurate yet more concentrated change indication by suppressing salt-and-pepper effects [25,27]. According to Marpu et al. [17], the limitations of IRMAD include the standardization of individual MAD variates to define the no-change distribution and the incorrect projection of MAD variates when the proportion of change pixels is large.
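The chi-square weighting can be illustrated for the two-variate case that dual polarisation data produce. The sketch below assumes the weight is the upper-tail probability of the sum of squared, standardised MAD variates; for two degrees of freedom this probability has the closed form exp(−z/2). The function names and pixel values are hypothetical.

```python
import math


def chi_square_statistic(mad_values, mad_stds):
    """Sum of squared, standardised MAD variates for one pixel."""
    return sum((m / s) ** 2 for m, s in zip(mad_values, mad_stds))


def no_change_weight(mad_values, mad_stds):
    """Weight for 'no change', assuming exactly two MAD variates.

    With 2 degrees of freedom the chi-square upper-tail probability is
    P(X > z) = exp(-z / 2), so no statistics library is needed.
    """
    if len(mad_values) != 2 or len(mad_stds) != 2:
        raise ValueError("the closed form used here holds for two variates only")
    z = chi_square_statistic(mad_values, mad_stds)
    return math.exp(-z / 2.0)


# Hypothetical pixels: (MAD variate-1, MAD variate-2), unit standard deviations.
stable_pixel = no_change_weight((0.1, -0.2), (1.0, 1.0))
changed_pixel = no_change_weight((3.0, -2.5), (1.0, 1.0))
print(f"stable: {stable_pixel:.3f}, changed: {changed_pixel:.6f}")
```

Pixels with large standardised MAD values receive weights near zero, so they contribute little to the statistics estimated in the next iteration.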
Multivariate Alteration Detection or its modified version is frequently used for change detection employing optical images. Examples include the application of IRMAD to characterize decadal change processes by employing Landsat images [28]; the employment of MAD to pre-process ASTER preceding classification [15]; and the use of IRMAD for reducing false detection of change with Hyperion [16]. Pre-processed or pre-transformed spectral imagery from optical sensors have been employed. A comparison of the performance of IRMAD using original surface reflectance and tasselled cap transformation of Landsat 5 images demonstrated comparable change results [29]. Nonetheless, we are not aware of change analysis that employed IRMAD using SAR data.

The Amendment of Synthetic Aperture Radar Data with Synthetic Layers for Improving Classification Accuracy
Another approach to change detection is comparative analysis, which derives change from post-classified images to provide "from-to" information on pre- and post-change class labels, enriching the information of binary change detection [30]. The quality of post-classification detection relies on the accuracy of the classification of each image in the data pair. The main challenge is to obtain adequate accuracy, such as the 80% standard for thematic maps implemented by the US National Park Service [31].
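The "from-to" information of a post-classification comparison amounts to cross-tabulating the class labels of the two dates pixel by pixel. A minimal sketch, with hypothetical label lists standing in for two classified rasters:

```python
from collections import Counter


def from_to_matrix(labels_t1, labels_t2):
    """Count pixels for every (class at date 1, class at date 2) pair."""
    if len(labels_t1) != len(labels_t2):
        raise ValueError("both classifications must cover the same pixels")
    return Counter(zip(labels_t1, labels_t2))


def changed_fraction(matrix):
    """Share of pixels whose class label differs between the two dates."""
    total = sum(matrix.values())
    if total == 0:
        return 0.0
    changed = sum(n for (before, after), n in matrix.items() if before != after)
    return changed / total


# Hypothetical per-pixel labels for the two observation years.
labels_2007 = ["forest", "forest", "crop", "crop", "tea", "oil palm"]
labels_2010 = ["forest", "crop", "crop", "oil palm", "tea", "oil palm"]

matrix = from_to_matrix(labels_2007, labels_2010)
for (before, after), count in sorted(matrix.items()):
    print(f"{before} -> {after}: {count}")
```

Because every misclassified pixel in either date appears as a spurious transition, the errors of the two classifications compound in this table.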
The use of radar data to provide "from-to" change information for investigating land cover dynamics has to deal with a limited number of data layers contributing to class separation. SAR data may have single, dual or triple layers composed of horizontal, vertical and cross-polarisation modes. More layers, i.e., four, are possible from bi-static images when the transmitter is separated from the receiver and the reciprocity of cross-polarisation does not dictate the result [32]; nonetheless, this image mode is yet to be systematically produced for monitoring. Data fusion is an alternative technique that combines two or more data sources to provide more layers for input to the classification process [33]. Fusing SAR and optical images has been used to study earthquake damage detection [34] or to assess tsunami damage [35]. However, data fusion necessitates additional data sources and adjustment techniques to integrate geometric, radiometric and other differing properties.
An alternative strategy to add variables in SAR data is by amending synthetic layers. Since the amendment is derived from the data itself, geometric correction is irrelevant. The synthetic data can be generated by simple algebra such as differencing, ratioing, or through advanced techniques such as decomposition and texture analyses. Adding synthetic data to original layers has been implemented in several investigations, for instance to monitor deforestation and land use at the Samarinda rain forest [36], to detect Phragmites [37], and to identify smallholder oil palm plantation [38].
An index is synthetic data commonly utilized to identify or differentiate features, either for optical or radar imagery. A popular index in optical images is the Normalized Difference Vegetation Index (NDVI) [39,40], while for SAR data, a comparable vegetation index is the Radar Vegetation Index (RVI). RVI may be derived from dual polarimetric or fully polarimetric SAR images. The following equation is used to derive RVI from fully polarimetric SAR [41]:

$$\mathrm{RVI} = \frac{8\,\sigma^0_{hv}}{\sigma^0_{hh} + \sigma^0_{vv} + 2\,\sigma^0_{hv}}$$

where $\sigma^0_{hh}$ refers to horizontal backscatter values, $\sigma^0_{vv}$ denotes vertical backscatter values and $\sigma^0_{hv}$ is for cross-polarization. For dual polarimetric images, RVI can be calculated by using this formula [42]:

$$\mathrm{RVI}_{hh} = \frac{4\,\sigma^0_{hv}}{\sigma^0_{hh} + \sigma^0_{hv}}$$

$\mathrm{RVI}_{hh}$ symbolizes RVI for horizontal polarization backscatter, and the RVI for a vertical polarization image can be obtained using a similar equation by replacing the horizontal backscatter component with the vertical one ($\mathrm{RVI}_{vv} \approx \mathrm{RVI}_{hh}$). The use of RVI to amend SAR data in assisting land cover classification can be found in several studies, which have mostly employed fully polarimetric backscatter, such as Ling et al. [43] or Avtar et al. [44].
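A small sketch of the index computation follows. It assumes the commonly cited forms of RVI, 8·HV / (HH + VV + 2·HV) for fully polarimetric data and 4·HV / (HH + HV) for the dual polarimetric case, evaluated on linear power rather than on decibel values. The function names and sample values are hypothetical.

```python
def db_to_linear(value_db):
    """Convert backscatter from decibels to linear power."""
    return 10.0 ** (value_db / 10.0)


def rvi_full(hh, vv, hv):
    """RVI from fully polarimetric backscatter (linear power), assumed form."""
    return 8.0 * hv / (hh + vv + 2.0 * hv)


def rvi_dual(co_pol, hv):
    """RVI from one co-polarised band plus HV (linear power), assumed form."""
    return 4.0 * hv / (co_pol + hv)


# Hypothetical pixel given in dB, converted before the ratio is taken.
hh_linear = db_to_linear(-7.0)
hv_linear = db_to_linear(-13.0)
print(f"dual-pol RVI: {rvi_dual(hh_linear, hv_linear):.3f}")
```

When HH and VV are equal, the two expressions give the same value, which is the sense in which the dual polarimetric form approximates the fully polarimetric one.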
Synthetic data can also be formed through decomposition techniques, such as the Entropy-Alpha decomposition introduced by Cloude and Pottier [45]. This decomposition generates three layers: entropy, alpha and anisotropy. A detailed explanation of the Entropy-Alpha decomposition is available in Cloude and Pottier [45]. The use of entropy and alpha to improve classification results has been implemented by Rodriguez et al. [46] and Qi et al. [47].
Another popular synthetic layer is texture, which can be derived either from optical or SAR data. The Grey Level Co-occurrence Matrix (GLCM) and the Generalized Co-occurrence Matrix (GCM) are among the available techniques to derive texture [48,49]. Many features can be generated in the texture analysis, including angular second moment (ASM), contrast, dissimilarity, energy, entropy, GLCM correlation, GLCM mean, GLCM variance, homogeneity, and maximum. Readers should refer to Haralick et al. [48] for details. The use of texture for amending SAR data in order to improve classification results has been implemented by several investigators [50][51][52]. However, it has been demonstrated that only a few texture features contributed significantly to the improvement of accuracy.
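To make the GLCM mean and variance features concrete, the sketch below builds a symmetric, normalised co-occurrence matrix for a single offset and derives both features from it. It assumes the backscatter window has already been quantised to integer grey levels; the function names and the single horizontal offset are illustrative choices.

```python
def glcm(window, levels, offset=(0, 1)):
    """Symmetric, normalised grey level co-occurrence matrix for one offset."""
    d_row, d_col = offset
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("window too small for the chosen offset")
    return [[value / total for value in row] for row in counts]


def glcm_mean(p):
    """GLCM mean: grey level weighted by co-occurrence probability."""
    n = len(p)
    return sum(i * p[i][j] for i in range(n) for j in range(n))


def glcm_variance(p):
    """GLCM variance: spread of grey levels around the GLCM mean."""
    n = len(p)
    mu = glcm_mean(p)
    return sum(p[i][j] * (i - mu) ** 2 for i in range(n) for j in range(n))


# Hypothetical 2 x 2 window quantised to two grey levels.
p = glcm([[0, 0], [1, 1]], levels=2)
print(glcm_mean(p), glcm_variance(p))
```

In practice the window would slide over the image and several offsets would usually be averaged; the single-window version only shows how the two features arise from the matrix.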

Pixel-Based Techniques for Land Cover Classification
Lu and Weng [53] highlighted the importance of employing suitable techniques as a prerequisite for successful land cover classification. Classification algorithms can be grouped into pixel-based, sub-pixel-based, per-field-based, contextual-based, and combinative ones [53]. Pixel-based classifiers have been developed and implemented to classify land cover/use employing optical and radar imagery. Notable pixel-based algorithms include the maximum likelihood classifier (MLC), SVM and decision tree (DT), of which DT does not need statistical assumptions [54]. SVM and neural networks (NN) are learning techniques with several advantages; NN, for instance, is a distribution-free analysis that easily combines multisource data and is claimed to be free from accumulative errors and less affected by atmospheric conditions, illumination and surface moisture [55,56]. Both SVM and NN are superior when dealing with spectral mixture cases [57] while also being responsive to tuning parameters [20]. Meanwhile, SVM has been shown to outperform MLC and NN when handling small training samples [58]. In contrast, NN has outperformed MLC and SVM for differentiating crop areas [59].
Improving the accuracy of classification is targeted through various strategies. Ensemble learning is an approach for improving the accuracy of classification through modifying the selection of training samples. Various strategies for training selection have been proposed, for instance bagging, a technique to construct multiple versions of a predictor for producing an aggregate predictor [60], and boosting, a general method to improve prediction capability of an algorithm by reducing error from a weak algorithm through reweighting the samples [61]. Implementing the strategy in the real classification of remote sensing images has improved the accuracy of classification by 3-6% [22] or reduced misclassification rates by 20-50% [62]. Various algorithms have been developed for implementing bagging or boosting strategies, including CAB, MAB and RF either in their original version (RFO) or modified one (RFG), and Extreme Gradient Boosting Tree (XGB). Exploring these newly developed algorithms is necessary for better understanding their potential.
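The bagging idea, several predictors trained on bootstrap resamples and combined by voting, can be sketched in a few lines. The base learner below is a one-dimensional nearest-neighbour rule chosen only for brevity; it is an assumption of this illustration and not one of the classifiers examined in this research. The training values are hypothetical.

```python
import random
from collections import Counter


def nearest_neighbour(sample, x):
    """Base learner: label of the training point closest to x."""
    return min(sample, key=lambda point: abs(point[0] - x))[1]


def bagged_predict(train, x, n_models=25, seed=0):
    """Majority vote over base learners fitted to bootstrap resamples."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        # Bootstrap: draw len(train) points with replacement.
        bootstrap = [rng.choice(train) for _ in train]
        votes[nearest_neighbour(bootstrap, x)] += 1
    return votes.most_common(1)[0][0]


# Hypothetical one-feature training set: (feature value, class label).
train = [(0.0, "forest"), (0.1, "forest"), (0.2, "forest"),
         (1.0, "crop"), (1.1, "crop"), (1.2, "crop")]
print(bagged_predict(train, 0.05), bagged_predict(train, 1.15))
```

Boosting differs in that the resampling is not uniform: samples misclassified by earlier learners receive more weight in later ones.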
The Bagged Classification and Regression Tree (CAB) originated from the Classification and Regression Tree (CART) introduced by Breiman, which produces monotone outcomes by calculating the probability of classes [63]. The classification and regression technique can classify categorical or continuous data through recursive binary partitioning [64]. The main disadvantages of CART and CAB are their sensitivity to noise and data size and their tendency to overfit [65]. Extreme learning is an algorithm to regularize learning by minimizing training error and weights that can be implemented in various classifiers such as SVM or neural networks [66].
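One step of the recursive binary partitioning used by tree classifiers can be sketched as a search for the threshold that gives the purest pair of child nodes. The sketch assumes Gini impurity as the splitting criterion and a single continuous feature; the backscatter values and labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Threshold on one feature that minimises the weighted Gini impurity."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_impurity = None, float("inf")
    for k in range(1, n):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold can separate identical values
        left = [label for _, label in pairs[:k]]
        right = [label for _, label in pairs[k:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best_impurity:
            best_threshold = (pairs[k - 1][0] + pairs[k][0]) / 2.0
            best_impurity = impurity
    return best_threshold, best_impurity


# Hypothetical HV backscatter (dB) for two classes.
hv_db = [-22.0, -21.0, -20.0, -13.0, -12.0, -11.0]
classes = ["crop", "crop", "crop", "forest", "forest", "forest"]
threshold, impurity = best_split(hv_db, classes)
print(threshold, impurity)
```

A full tree repeats this search on each child node until a stopping rule is met, which is where the tendency to overfit noted above comes from.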
The Extreme Learning Machine Neural Network (ENN) applies extreme learning to neural networks. The classifier inherits the benefits of neural networks, i.e., the ability to tease apart convoluted connections in large datasets, suitability for parametric or non-parametric variables [67], the capability to reduce false alarms [68], and the ability to overcome spectral mixtures in moderate spatial resolution images [69]. However, the technique also inherits the drawbacks of neural networks, i.e., complexity (hidden nodes) that is problematic to explain, and a likelihood of overfitting [70].
Bagged MARS originates from the Multivariate Adaptive Regression Spline (MARS), a non-parametric regression combining classical regression and spline approximation for predicting an unknown function simultaneously [63,71]. MARS outperformed CART in predicting gullies by generating a smoother estimation [64]. However, the prediction of MARS was highly affected by the local nature of the dataset [65].
Random Forest (RF) is one of the popular ensemble decision tree classifiers in remote sensing. Many researchers have demonstrated that RF performs better than traditional single-tree learning [72][73][74]. The advantages of RF include being less sensitive to overtraining and noise, and the ability to generate variable importance for eliminating less important features in order to reduce dimension and computing time [72,75]. Nonetheless, RF tended to be insensitive to mislabeled training data [76], sensitive to spatial autocorrelation [77,78], and failed to deal with imbalanced training data [78].
Extreme gradient boosting tree is the implementation of Friedman's concept on a gradient boosting machine which generates constant approximations with finer granularity [79]. A gradient boosting tree has the potential to optimize processing when the access to memory is insufficient for storing a big dataset, hence being sensitive to data modification [79,80].

Site
Taman Nasional Gunung Halimun Salak (TNGHS) is the last remaining rain forest on Java Island [81], situated approximately 65 km south of Jakarta, the capital city of Indonesia (Figure 1). The park is the last montane rain forest on the island, hosting several endangered species such as the Javan leopard (Panthera pardus melas), the Javan eagle (Spizaetus bartelsi) and the Javan gibbon (Hylobates moloch). The park is located in rugged terrain and is said not to have experienced a major disturbance in the past decades, a claim that should be assessed. Previous research reported that land cover dynamics in the surroundings of protected areas like parks may compromise the ecological function of the areas, including the conservation of protected species as well as downstream water provision [82]. Land cover dynamics of this critically important area and its surroundings threaten numerous ecosystem services. Monitoring of this site and its surroundings is needed to provide information on how resilient the park's natural values are.

Datasets
In this research, the Advanced Land Observing Satellite (ALOS) Phased Array type L-band Synthetic Aperture Radar (PALSAR) in Fine Beam Dual (FBD) mode was used to identify changes between paired datasets and to investigate the effect of synthetic data amendment on improving classification accuracy. ALOS FBD Level 1.1 data, comprising horizontal polarisation (HH) and cross-polarisation (HV), were provided by the Japan Aerospace Exploration Agency (JAXA) through the 6th Research Announcement (RA-6) for ALOS-2. The images were provided in slant range geometry (single look complex products) in ascending mode, with a swath width of 70 km and a ground resolution of about 19 m × 10 m [83]. A pair of images acquired on 20 August 2007 and 28 August 2010 was employed for change detection analysis. Other data were the Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM) at 1 arc second and ancillary data, including forest and plantation mapping from the regional state-forest company (PT Perhutani), the map of TNGHS from the Ciliwung river basin organisation (Balai Besar Daerah Aliran Sungai Ciliwung), and historical Google Earth images to assist terrain correction and sample selection.

Figure 2 describes the workflow of this research. Prior to change detection, all images were calibrated, terrain corrected, de-speckled, and converted to dB (sigma0). The Single Look Complex (SLC) ALOS PALSAR data were internally calibrated by JAXA prior to distribution. The SLC format was used to derive polarimetric decomposition features from the Cloude-Pottier theorem. Further calibration of the SLC data was performed based on Shimada et al. [84] to convert the complex numbers into the conventionally used sigma nought (sigma0) in decibels. Terrain correction was assisted by the SRTM DEM at 1 arc second, and the output was resampled with bilinear interpolation to a spatial resolution of 30 m [85]. Image speckle was filtered using the Gamma Map filter with a 5 × 5 window following Reference [86]. Gamma Map filtering is known to be simple yet time-efficient [11].
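The conversion of a complex SLC sample to sigma nought in decibels can be sketched as follows. The sketch assumes the widely quoted form for PALSAR Level 1.1 products, 10·log10(I² + Q²) + CF − A, with a calibration factor CF of −83.0 dB and a conversion factor A of 32.0 dB. These constants are assumptions of the illustration and should be checked against the product documentation for the processing version in use.

```python
import math

# Assumed constants for PALSAR Level 1.1 data (verify per product version).
CALIBRATION_FACTOR_DB = -83.0
CONVERSION_FACTOR_DB = 32.0


def sigma0_db(i, q, cf=CALIBRATION_FACTOR_DB, a=CONVERSION_FACTOR_DB):
    """Sigma nought in dB from the real (i) and imaginary (q) SLC parts."""
    power = i * i + q * q
    if power <= 0.0:
        raise ValueError("zero power cannot be expressed in decibels")
    return 10.0 * math.log10(power) + cf - a


# Hypothetical digital numbers for one SLC sample.
print(f"{sigma0_db(3000.0, 4000.0):.2f} dB")
```

In an operational chain the power would normally be averaged over several looks or filtered before the logarithm is taken, since the decibel conversion is not linear.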

An Iteratively Reweighted Multivariate Alteration Detection and Thresholding to Determine Change and Unchanged
The bi-temporal analysis was performed on the dual polarisation backscatter coefficients of the data pair from 2007 and 2010 by using IRMAD. The IRMAD processing produced three layers: MAD variate-1, MAD variate-2 and chi-square. The MAD variates are the transformed version of the dual polarisation ALOS PALSAR data, while the chi-square layer is a transformation of both MAD variates into a single layer based on the chi-square distribution [87]. These layers were examined visually to identify whether changed and unchanged pixels were distinctively indicated. Visual assessment was assisted by freely accessible high-resolution optical images provided by Google Earth. For each extent, at least two images were examined, captured at times proximate to the first and second PALSAR acquisition dates.

The delineation between changed and unchanged was determined by thresholding the MAD variate selected in the visual examination. The statistical parameters, mean (µ) and standard deviation (σ), were derived from the selected MAD variate for thresholding [88]. Various ranges of statistical parameters have been attempted to define the optimum threshold, such as between µ ± 0.5σ and 1σ [89], µ ± 0.1σ and 2.0σ [88], or µ ± 0.5σ and 3.0σ [90,91]. In this research, a range of constants between 0.1 and 2.0 was selected and iterated following the approach of Fung and Mas [30]. The accuracy of the binary change map was assessed by taking 60 samples of positive change, 60 samples of negative change and 120 samples of unchanged for a 3 × 3 window size or 2060 pixels in total. The iterated constants and overall accuracy were graphed to assess the improving or declining trend within the range. The constant that generated the highest overall accuracy was then used as the threshold for change-unchanged delineation to produce binary change data.
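The iteration over threshold constants can be sketched as a simple sweep: for each constant k, pixels above µ + kσ or below µ − kσ are labelled as change, and the overall accuracy against reference samples is recorded. The sketch uses the mean and standard deviation reported for MAD variate-2 in this research (142.8 and 52.2); the sample values, the reference labels, and the assignment of "positive" to the upper tail are hypothetical.

```python
def classify(value, mu, sigma, k):
    """Label one pixel using the thresholds mu - k*sigma and mu + k*sigma."""
    if value > mu + k * sigma:
        return "positive"  # assumed: upper tail is positive change
    if value < mu - k * sigma:
        return "negative"
    return "unchanged"


def sweep_constants(values, reference, mu, sigma, constants):
    """Overall accuracy of the change map for every constant k."""
    results = []
    for k in constants:
        predicted = [classify(v, mu, sigma, k) for v in values]
        correct = sum(p == r for p, r in zip(predicted, reference))
        results.append((k, correct / len(reference)))
    return results


# Hypothetical reference samples of MAD variate-2 values.
values = [20.0, 60.0, 140.0, 150.0, 225.0, 240.0]
reference = ["negative", "unchanged", "unchanged",
             "unchanged", "unchanged", "positive"]
constants = [round(0.1 * i, 1) for i in range(1, 21)]  # 0.1 to 2.0

results = sweep_constants(values, reference, 142.8, 52.2, constants)
best = max(results, key=lambda item: item[1])  # first maximum wins ties
print(best)
```

Small constants label too many pixels as change (false detections), while large constants miss real change, so the accuracy curve typically rises and then falls across the range.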
The binary change data were then utilised for guiding the selection of samples for post-classification detection. A set of combinations was tested to yield the highest possible accuracy, including horizontal co-polarisation, cross-polarisation and synthetic data. Table 1 summarises the data combinations used to evaluate the efficacy of each synthetic data type and lists the synthetic data layers used to improve accuracy. Synthetic data included a set of indices, the entropy-alpha decomposition and texture analysis. The set of indices comprised the difference HH − HV, the ratio HH/HV, and the dual polarimetric radar vegetation index (RVI). Dual polarisation data were the control treatment to evaluate the robustness of the synthetic data amendment. RVI for dual polarization data was determined following Reference [42]. The entropy-alpha decomposition was calculated based on Cloude and Pottier [92]. Considering the contribution of texture components to classification as suggested by Yayusman and Nagasawa [38], only selected GLCM textures [48], i.e., mean and variance, were utilised for post-classification.
The imagery was classified into seven classes, i.e., forest, rubber, tea, oil palm, crop, built-up, and waterbody. Figure 3 shows pictures of six of these seven classes. The 'waterbody' class was taken specifically from the ocean since the area has no inland water. The selection of training samples for classification was guided by Google Earth, the map of the regional state-forest company and the national park map. The training samples were taken from locations having consistent cover across the two observations. The samples were divided into 75% for model development and 25% for accuracy assessment. Seven supervised pixel-based classifiers, namely CAB, MAB, RFO, RFG, XGB, ENN, and SVM, were employed. The selected classifiers were coded in R statistical software employing the caret and raster packages [93,94], following the suggestion of Reference [20]. The classifiers were run with the default parameters of the R packages, as summarised in Table 2.

Further investigation employed all synthetic layers in order to classify land cover types and to evaluate the importance of variables. The variables (layers) and their acronyms are summarised in Table 3. Variable importance (VI) is a percentage of the variable's contribution calculated from error generation during the permutation of a variable using its out-of-bag data [68]. The measure assisted in model reduction while providing a first impression of controlling variables [83,84]. In this research, VI was used to select variables for modelling the best subset. When adding a variable did not improve the accuracy, the augmentation was stopped and the combination before that addition was considered the best subset.

A further step was assessing the classification accuracy of the best subset and all data combinations. The highest accuracy from either all variables or the best subset was used for the post-classification change evaluation. To reduce the salt-and-pepper effect, a raster sieve was used. If a change indicated in the binary analysis was not identified, a modified classification was performed by taking samples from areas identified as changed by IRMAD and adding more classes guided by historical Google Earth images. The reclassification employed the best method, composed of the important variables.
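The stopping rule for building the best subset, adding variables in descending order of importance until accuracy no longer improves, can be sketched as a short loop. The variable names, the accuracy values, and the `evaluate` callback are hypothetical stand-ins for training and testing a classifier on each candidate subset.

```python
def best_subset(ranked_variables, evaluate):
    """Add variables in ranked order; stop when accuracy stops improving."""
    selected = []
    best_accuracy = float("-inf")
    for variable in ranked_variables:
        candidate = selected + [variable]
        accuracy = evaluate(candidate)
        if accuracy <= best_accuracy:
            break  # keep the combination from before this addition
        selected, best_accuracy = candidate, accuracy
    return selected, best_accuracy


# Hypothetical ranking and accuracies keyed by the number of variables used.
ranked = ["mean_HV", "mean_HH", "var_HH", "index_diff"]
toy_accuracy = {1: 0.70, 2: 0.82, 3: 0.88, 4: 0.87}

subset, accuracy = best_subset(ranked, lambda used: toy_accuracy[len(used)])
print(subset, accuracy)
```

The rule is greedy: it stops at the first variable that fails to help and does not test whether a later variable might have improved the result.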

The Result of Iteratively Reweighted Multivariate Alteration Detection and The Determination of Changed and Unchanged
The binary change identification using IRMAD is presented in Figure 4. The figure shows the MAD variate-1, MAD variate-2, and chi-square layers and an RGB composite of those three layers. Among the gray-scale layers, MAD variate-2 displayed the clearest indication of change in the site, while the chi-square layer could not indicate changes as MAD variate-1 and MAD variate-2 did. Two labels were placed to show areas with an indication of change. MAD variate-1 yielded a similar indication for location-1, but it failed to indicate any changes in location-2. The RGB layer shows differing tones that may indicate different change types.
A qualitative assessment through visual checking was further performed, guided by the indication of change presented in the figure, to validate the indication from the analysis. The visual check was made using Google Earth images from proximate dates (see example in Figure 5), which guided the interpretation of different tones indicated in the binary change maps of differencing cross-polarisation data or filtered MAD variate-2 of IRMAD processing. As indicated by selected features (see a, b, c, d in Figure 5), there was a change in the sample areas. A dark tone indicated denser vegetation in 2010 than in 2007, such as a change from fields prepared for cultivation to land fully covered with oil palm (a) or from juvenile to maturing oil palm plantation (c). A light tone signified the inverse condition, from vegetated crop areas to semi-bare ones, as indicated in (b) and (d).

Considering that MAD variate-2 provided the clearest indication of changed and unchanged, the thresholding was determined based on MAD variate-2. The mean value of MAD variate-2 (142.8) and its standard deviation (52.2), multiplied by each constant in the selected range, defined the thresholds for iteration. The overall accuracy, false detection and missed detection from the iteration are presented in Figure 6. Figure 6 demonstrates the increasing accuracy of iterating the constant of MAD variate-2 from a value of 0.1 to 2.0. The accuracy increased up to a constant of 1.7 and declined after 1.9. The best accuracy for MAD variate-2 was 98.8%, obtained with constants between 1.7 and 1.9. The curves of false and missed detections showed a contrasting pattern over the range of constants. False detection started from 28% and missed detection from 12%, and the rates were suppressed to zero at constants of 1.7 and 1.8. A constant of 1.7 was taken as the optimum for delineating changed from unchanged.

Figure 7 shows the response of classifier accuracy to the amendment of synthetic layers. The two observations, 2007 and 2010, portrayed comparable responses. Amending dual polarization backscatter with indices did not yield a significant improvement in accuracy, while some of the classifiers responded slightly to the amendment of the entropy-alpha decomposition. Some classifiers, including MAB, SVM and CAB, generated a slightly lower accuracy (−0.5%) with index amendment compared to the standard data. Meanwhile, texture amendment substantially increased the accuracy, by 20% in two classifiers (RFO and RFG) and by about 15% and 16% in CAB and XGB, respectively. SVM and MAB followed with slightly lower improvements of around 13% and 10%, respectively. ENN yielded the smallest improvement with texture amendment, at 7%. In general, texture amendment produced the highest increase in classification accuracy, up to 20%.

Figure 8 compares variable importance for five techniques. Only tree-based techniques generated high variable importance, while similar capabilities from ENN and SVM were not evident. The figure indicates that different classifiers ordered the importance of variables differently. The random forest techniques, both the original and the regularised version, demonstrated a similar result. Several classifiers did not respond to the addition of some variables; MAB, for example, identified the order of four variables similarly to RFO and RFG but did not consider other variables that carried importance weight in RFO and RFG.

The Identification of Variable Importance for Classification
Among the variables, texture-mean, derived from either HV or HH, was consistently ranked as the most important variable, followed by texture-variance. Backscatter intensity, either horizontal or cross-polarization, was rated as an important variable by four classifiers, the exception being MAB. Seven variables consistently shared significant contributions across the four classifiers: texture-mean HV, texture-mean HH, texture-variance HH, texture-variance HV, Sigma0 HH, Sigma0 HV, and index-differencing. Figure 9 shows the increment of accuracy when each variable was added consecutively following the descending order of VI resulting from RFO. RFO was used to determine variable importance since it generated the highest accuracy when all variables were employed. Variables contributing less than 10% appear to be counterproductive, decreasing the accuracy when they were added. In this research, six appears to be an adequate number of variables for land cover classification with seven targets to develop the best subset model. An unknown error occurred during classification with certain numbers of variables on two classifiers, ENN and MAB. The ENN error took place when the data were composed of 5 or 7 to 11 variables, while the MAB error occurred when the data were composed of 9 to 11 variables.
Table 4 describes the classification accuracy obtained by employing all variables and by the best subset model defined with VI assistance. The best subset model appears to generate the same or higher accuracy than the model developed from all variables, except for MAB and ENN. For MAB, the best subset produced slightly lower accuracies than when employing all variables.
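The incremental procedure behind Figure 9 (adding variables one at a time in descending order of importance and keeping the subset with the highest accuracy) can be sketched as follows. This is a minimal illustration, not the processing chain used in this research: the function names, the `evaluate` callback (standing in for training and validating a classifier on a given subset), and all toy numbers are hypothetical.

```python
def best_subset_by_importance(importance, evaluate):
    """Add variables in descending order of importance and track accuracy.

    importance: dict mapping variable name -> importance score
    evaluate:   hypothetical callable taking a list of variable names and
                returning the accuracy of a classifier trained on them
    Returns (best_subset, best_accuracy, curve), where curve is a list of
    (number_of_variables, accuracy) pairs.
    """
    ranked = sorted(importance, key=importance.get, reverse=True)
    curve = []
    best_subset, best_accuracy = [], float("-inf")
    for k in range(1, len(ranked) + 1):
        subset = ranked[:k]
        accuracy = evaluate(subset)
        curve.append((k, accuracy))
        # Keep the smallest subset that reaches the highest accuracy.
        if accuracy > best_accuracy:
            best_subset, best_accuracy = subset, accuracy
    return best_subset, best_accuracy, curve


# Toy stand-ins (illustrative values only, not results of this research).
toy_importance = {"a": 40, "b": 30, "c": 25, "d": 5}
toy_gain = {"a": 50, "b": 20, "c": 10, "d": -5}  # percentage points


def toy_evaluate(subset):
    # Pretend accuracy is the sum of per-variable gains; "d" hurts accuracy.
    return sum(toy_gain[name] for name in subset)


subset, accuracy, curve = best_subset_by_importance(toy_importance, toy_evaluate)
```

In the toy case the least important variable lowers the accuracy when added, so the selected subset stops one variable short of the full set, which mirrors the behaviour described for low-contribution variables.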

Comparing the Accuracy from All Data Layers and the Best Subset
In general, the tree-based classifiers, including RFO, RFG, CAB and XGB, generated greater accuracy than SVM and ENN. Table 4 demonstrates that RFG and RFO were superior to the other classifiers, producing an accuracy of about 90% for both 2007 and 2010. The classification map generated from the best subset model using RFG was then employed for the years 2007 and 2010.


From-To Information of Change
Post-classification detection employing the best model could not identify the change indicated in IRMAD, and thus the subtle change was hidden. The images were therefore reclassified into 12 classes by combining the important variables, employing the best method for the data pairs, and adding classes related to subtle change. Five classes were added to represent the types of subtle change identified in IRMAD: previously vegetated crop areas to semi-fallow crop areas, juvenile to maturing oil palm, old to regenerated rubber plantation, juvenile to maturing rubber, and previously rubber to newly planted oil palm. The reclassification resulted in an accuracy of about 82%. Figure 10 provides information regarding the land cover types that were unchanged between 2007 and 2010, as well as the type of change in the hotspot locations indicated by IRMAD analysis. Water was classified more accurately than the other classes, while stable oil palm, tea and rubber were less accurate due to varying stand ages. The change from juvenile to maturing oil palm yielded high producer and user accuracies of 95% and 90%, respectively. The rubber growth stages, from juvenile to maturing plants and from old to regenerated ones, were also well classified, with producer accuracies of 82% and 61% and user accuracies of 85% and 76%, respectively. The change of rubber into oil palm plantation was identified with a producer accuracy of 70% and a user accuracy of 69%.
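For readers less familiar with the per-class measures quoted above, the sketch below shows how producer and user accuracies are conventionally derived from a confusion matrix. The row and column convention (rows as reference classes, columns as mapped classes), the function name and the toy counts are assumptions made for illustration and are not taken from the data of this research.

```python
def accuracy_report(matrix, labels):
    """Overall, producer and user accuracies from a confusion matrix.

    matrix[i][j] = number of samples whose reference class is labels[i]
                   and whose mapped (classified) class is labels[j].
    Returns (overall_accuracy, {label: (producer, user)}).
    """
    n = len(labels)
    total = sum(sum(row) for row in matrix)
    overall = sum(matrix[i][i] for i in range(n)) / total
    report = {}
    for i, label in enumerate(labels):
        reference_total = sum(matrix[i])                    # row sum
        mapped_total = sum(matrix[r][i] for r in range(n))  # column sum
        # Producer accuracy: share of reference samples mapped correctly.
        producer = matrix[i][i] / reference_total if reference_total else None
        # User accuracy: share of mapped samples that are actually correct.
        user = matrix[i][i] / mapped_total if mapped_total else None
        report[label] = (producer, user)
    return overall, report


# Toy two-class example (illustrative counts only).
toy_labels = ["water", "oil palm"]
toy_matrix = [
    [9, 1],  # reference water: 9 mapped as water, 1 as oil palm
    [3, 7],  # reference oil palm: 3 mapped as water, 7 as oil palm
]
overall, report = accuracy_report(toy_matrix, toy_labels)
```

Producer accuracy reflects omission error for a class, while user accuracy reflects commission error, which is why the two values reported for each change class can differ.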

Figure 10. The distribution of 12 land cover/change classes resulting from RFG added with producer and user accuracies.

Discussion
As suggested by previous research [88][89][90][95], the threshold that generates the highest accuracy should be selected to discriminate changed from unchanged land cover in binary change detection such as IRMAD. Previous attempts at iterating constants for binary change map generation mostly employed Landsat or other optical images [30,96]. This research demonstrated the value of dual polarimetric ALOS backscatter for IRMAD analysis and indicated that the highest accuracy was achieved within a constant range of 0.1 to 2.0. A visual check of the binary change map against Google Earth imagery may be used directly to validate the change detected by IRMAD. Nonetheless, freely accessible optical imagery with higher spatial resolution may not always be available due to cloud cover or limited recurring observations.
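The iteration of threshold constants discussed above can be illustrated with a short sketch. It assumes a two-sided threshold of the form mean ± c × standard deviation applied to the MAD variate, the population standard deviation, and false and missed detection rates expressed as fractions of all reference samples; these choices, the function names and the toy values are assumptions for illustration rather than a description of the exact procedure used here.

```python
from statistics import mean, pstdev


def binary_change(mad_values, c):
    """Flag a sample as changed when it lies outside mean +/- c * std."""
    mu = mean(mad_values)
    sd = pstdev(mad_values)
    lower, upper = mu - c * sd, mu + c * sd
    return [v < lower or v > upper for v in mad_values]


def score(predicted, reference):
    """Accuracy, false detection and missed detection as fractions."""
    n = len(reference)
    pairs = list(zip(predicted, reference))
    return {
        "accuracy": sum(p == r for p, r in pairs) / n,
        "false": sum(p and not r for p, r in pairs) / n,   # commission
        "missed": sum(r and not p for p, r in pairs) / n,  # omission
    }


def sweep_constants(mad_values, reference, constants):
    """Evaluate every candidate constant against the reference labels."""
    return [(c, score(binary_change(mad_values, c), reference))
            for c in constants]


def best_constant(results):
    """Constant with the highest accuracy; ties go to the smallest one."""
    return max(results, key=lambda item: (item[1]["accuracy"], -item[0]))[0]


# Candidate constants from 0.1 to 2.0 in steps of 0.1.
constants = [round(0.1 * k, 1) for k in range(1, 21)]

# Toy MAD values and reference change labels (illustrative only).
toy_mad = [-0.2, 0.1, 0.0, -0.1, 0.2, 0.1, -0.1, 0.0, 3.0, -3.0]
toy_reference = [False] * 8 + [True, True]

results = sweep_constants(toy_mad, toy_reference, constants)
optimum = best_constant(results)
```

The tie-breaking rule (preferring the smallest constant among equally accurate ones) is likewise an assumption; any rule could be substituted in `best_constant`.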
Post-classification change detection may complete the information of change by providing "from-to" information. However, the errors of each classification propagate into the change result. The challenge in obtaining reliable post-classification change results is to achieve an adequate accuracy rate. Following the standard acceptable accuracy for thematic mapping of the National Park Service, US Department of the Interior, the minimum expected accuracy is 80% [31]. The limited number of bands in dual polarization SAR images may result in low classification accuracy when generating the "from-to" information; thus, enriching the dual polarization images is required to obtain adequate accuracy. Synthetic data amendment has been an option to enrich available data and in turn achieve greater accuracy. Of the several types of synthetic data used to amend the original layers, texture appears to have the greatest potential to increase classification accuracy when employing SAR images. Simple indices constructed from differencing and ratioing layers appear ineffective in increasing the accuracy. Meanwhile, decomposition seems better than indices, but not as robust as texture when applied to backscatter data for investigating a tropical montane environment. The pseudo-cross-variogram, a multi-temporal texture feature derived from a geostatistical approach, was reported to yield greater accuracy compared to GLCM textures [97]. Utilizing this feature might further improve the accuracy of classification employing SAR data.
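To make the texture features concrete, the following sketch computes the GLCM mean and variance for a single small window. It assumes the backscatter has already been quantized to integer grey levels, and it uses a single horizontal one-pixel offset with a symmetric, normalized matrix. The window size, offset, quantization and function names are illustrative assumptions, since the settings used in this research are not restated in this section.

```python
def glcm(window, levels, dx=1, dy=0):
    """Symmetric, normalized grey level co-occurrence matrix.

    window: 2-D list of integer grey levels in the range [0, levels)
    dx, dy: pixel offset between the two members of each pair
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                # Count the pair in both directions to keep it symmetric.
                counts[i][j] += 1
                counts[j][i] += 1
    total = sum(sum(row) for row in counts)
    return [[value / total for value in row] for row in counts]


def glcm_mean_variance(p):
    """GLCM mean and variance of a normalized co-occurrence matrix."""
    levels = len(p)
    cells = [(i, j) for i in range(levels) for j in range(levels)]
    mu = sum(i * p[i][j] for i, j in cells)
    variance = sum((i - mu) ** 2 * p[i][j] for i, j in cells)
    return mu, variance


# Toy 3 x 3 window with two grey levels (illustrative only).
toy_window = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]
p = glcm(toy_window, levels=2)
texture_mean, texture_variance = glcm_mean_variance(p)
```

In practice the window would slide over the HH and HV layers so that each pixel receives its own mean and variance value, producing the texture layers that are stacked with the original backscatter.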
With many classifiers asserting their advantages for land cover classification, researchers need to select the one most suitable for their case. The growing use of ensemble classifiers demonstrates their potential. This research demonstrates that decision tree classifiers perform well in terms of classification accuracy. Random forest, in either the original or regularized mode, and the extreme gradient boosting tree classifier generated high accuracy of about 90%. Nonetheless, the neural network seems less responsive to data amendment and synthetic data when implemented in this research. The unknown error that occurred in the classification process employing ENN and MAB requires in-depth exploration in the future. The comparison of accuracies suggests that ensemble trees such as RF and XGB outperform other classifiers and would likely produce adequate classification accuracy.
The use of variable importance appears effective in helping to generate high classification accuracy. However, the order of variable importance differs between classifiers, which contradicts the previous claim of Reference [98] that classifiers are likely to generate a similar order of variable importance, albeit with different contributions. Employing all available layers does not always yield the highest accuracy. Variable selection is therefore essential to achieve optimum accuracy and may be even more relevant when employing hyperspectral images with hundreds of layers [99].
The strategy of taking samples from consistent land uses was practical for land cover classification; however, it was not adequate for identifying subtle change. As subtle changes are less noticeable, it is difficult to define sub classes at the initial classification with limited information about the sites. Iteratively Reweighted Multivariate Alteration Detection of dual polarization ALOS PALSAR successfully indicated subtle changes related to farming practices such as crop staging and changing vegetation type. This technique serves as an alternative to traditional change analysis employing SAR data, such as differencing of log intensity or ratioing [11]. As details of the growing stages of plants could be identified as change in the IRMAD, a more detailed class definition is required. The changes from semi-bare land to denser-vegetated paddy fields, from juvenile to mature oil palm plantation, and from rubber to juvenile oil palm plantation were unobserved when using seven general land classes, even though the accuracy was about 90%. A possible reason is the changing surface moisture within the same land cover, since the test site is situated in the humid tropics and L-band SAR has some penetration into the foliar canopy. Soil background remains an important issue when interaction with low biomass is involved [86]. Reclassification with sub classes identified at the hotspots of change indicated in IRMAD was required to better identify subtle changes related to the stage of vegetation growth and changes in vegetative use. Nonetheless, adding sub classes to the classification may reduce accuracy. Taking samples at the hotspots of change identified in IRMAD appears effective in improving classification to include the subtle change. Google Earth assisted in the interpretation of change, particularly in validating subtle changes identified in IRMAD. Identifying temporal patterns of spectra or indices may complement the information on the change process.
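The reason subtle change stays hidden under general classes can be seen in a minimal "from-to" tabulation. In the sketch below, the function names and the toy labels are hypothetical; it simply pairs the class of each sample at the two dates and counts the pairs.

```python
from collections import Counter


def from_to_table(labels_t1, labels_t2):
    """Count samples for every (class at date 1, class at date 2) pair."""
    return Counter(zip(labels_t1, labels_t2))


def changed_fraction(table):
    """Fraction of samples whose class differs between the two dates."""
    total = sum(table.values())
    changed = sum(n for (before, after), n in table.items() if before != after)
    return changed / total


# Toy labels for four samples at two dates (illustrative only).
toy_2007 = ["oil palm", "oil palm", "rubber", "water"]
toy_2010 = ["oil palm", "oil palm", "oil palm", "water"]

table = from_to_table(toy_2007, toy_2010)
fraction = changed_fraction(table)
```

With general classes, a juvenile and a maturing oil palm sample both carry the label "oil palm" and are therefore tabulated as unchanged; only by introducing sub classes for growth stages does such a pair appear as a distinct from-to entry.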

Conclusions
Employing microwave satellite images is an option for monitoring areas with poor accessibility and areas that are severely affected by persistent cloud cover. The second MAD variate produced from IRMAD processing filtered with Gamma Map successfully indicated the location of change. Post-classification change analysis provides "from-to" change information. The dual layers of ALOS FBD images may limit the capability to classify complex classes; hence, injecting synthetic images enriches the information and improves the accuracy. The increment of accuracy by data injection was demonstrated, for instance, by employing random forest. The greatest accuracy was yielded by the amendment of selected texture features comprising GLCM layers, i.e., mean and variance. Tree-based methods including RFO, RFG and XGB appear superior in classification accuracy compared to the support vector machine and neural network. The identification of variable importance assisted in defining the best subset model for reclassification to allow the identification of change indicated in IRMAD analysis. However, subtle change related to growth stages could not be identified with general land cover classes. The different tone resulting from differencing and IRMAD indicated subtle changes due to the change of vegetation type or growth phases, such as from juvenile to maturing oil palm or from semi-bare to vegetated crop fields. Combining the important variables to reclassify the change process may be adequate to improve accuracy and hence to emphasize their utility in the detection of subtle changes. Ancillary data from relevant institutions and Google Earth assist in the interpretation of binary change results. Exploring the temporal pattern of changes would likely enhance the understanding of the gradual processes.