Generation of Radiometric, Phenological Normalized Image Based on Random Forest Regression for Change Detection

Efforts have been made to detect both naturally occurring and anthropogenic changes to the Earth’s surface by using satellite remote sensing imagery. Accurate change detection requires homogeneous radiometric and phenological conditions, but time-series images that satisfy such conditions for assessing long-term changes are difficult to obtain. For this reason, image normalization is essential. In particular, normalizing these combined conditions requires nonlinear modeling, and random forest (RF) techniques can be utilized for this normalization. This study employed Landsat-5 Thematic Mapper satellite images with temporal, radiometric, and phenological differences, and obtained radiometric control set samples by selecting no-change pixels between the subject image and reference image using scattergrams. RF regression was modeled in the obtained no-change regions, and normalized images were obtained. Next, normalization performance was evaluated by comparing the results against the following conventional linear regression methods: mean-standard deviation regression, simple regression, and no-change regression. The normalization performance of RF regression was much higher. In addition, to further evaluate its usefulness, the normalization performance was compared with that of other nonlinear ensemble regressions, i.e., Adaptive Boosting regression and Stochastic Gradient Boosting regression, which confirmed that the normalization performance of RF regression was significantly higher. Finally, in change detection, normalized subject images generated by RF regression showed the highest accuracy, which indicated that the proposed method (where the image was normalized using RF regression) may be useful in change detection between multi-temporal image datasets.


Introduction
Change detection techniques using satellite remote sensing images quantitatively analyze changes occurring in a targeted area based on data obtained at two different points in time [1][2][3][4]. As multi-temporal images are often captured by different sensors under variable atmospheric conditions, degrees of solar illumination, and viewing angles, radiometric normalization is required to remove radiometric differences and make the images comparable [5][6][7]. There are two methods of radiometric normalization: the absolute method and the relative method [8]. The absolute method aims to convert the digital numbers recorded by satellite sensors to the true surface reflectance and to correct for factors such as changes in the satellite sensor, solar angle, and atmospheric influence [9][10][11]. However, absolute radiometric normalization often presents difficulties with atmospheric data collection in terms of cost and accessibility, and cannot be implemented when there are no ground-measured values from which to obtain the detection data; therefore, in most cases, relative radiometric normalization is utilized [10,12].
Relative radiometric normalization does not require complex atmospheric transfer parameterization processes and can yield information on the relative changes contained in multi-temporal image data, which are tracked and corrected so that areas with changes can be identified easily and quickly [12]. One commonly used method is a mathematical model wherein data are corrected through regression equations, assuming a linear relationship between bands in multi-temporal image data, provided that the area is the same [13,14]. However, in reality, the Earth's surface presents a complex mix of natural and man-made features that exhibit very different, and often nonlinear, thermal properties [15]. In addition, as optical images, which are the main source of remote sensing data for change detection, depend on the reflectance of targets illuminated by sunlight, optical data are constrained by the limitations of data acquisition, such as the impact of clouds, fog, or smoke. It can therefore be difficult to obtain optical images that meet the temporal requirements [16]. In this case, images with inevitably different phenological conditions must be utilized.
The nonlinear features of the Earth's surface, including radiometric and phenological features, should be normalized. One way to overcome such challenges is to apply a random forest (RF) regression algorithm, which can model both linear and nonlinear relationships [17]. In particular, RF regression identifies complex feature spaces and nonlinear relationships between the data of two images more accurately than other statistical techniques [18,19]. Furthermore, as the RF algorithm forms multiple decision trees for learning and combines them in the prediction, it reduces overfitting errors and remains stable against noise in the data [17]. Therefore, the possibility of applying RF in the field of remote sensing has been proposed [20][21][22]. In general, remote sensing employs RF to classify land cover, or to define the status of forest habitats, biomass, and trees; a few studies have also used RF regression to predict land cover, trees, and tree species [19,23,24]. Regression has been used much less often than classification [19], and no study has used RF regression to normalize two images for change detection.
In this study, a method was proposed to normalize radiometric and phenological conditions using RF regression. This normalization method was based on modeling using invariant pixels known as radiometric control set samples (RCSS) from the subject image and the reference image, which is a better approach than using global image statistics [25][26][27]. To select the RCSS, no-change pixels were selected through the scattergrams used in no-change (NC) regression, which is a conventional linear regression method [28]. Variables were acquired for the selected no-change region and modeled using RF regression, and a normalized subject image was obtained. To evaluate the performance of normalized subject images generated by RF regression, statistical values were compared with those from the conventional linear regression methods. In addition, the normalized subject image generated by RF regression was further evaluated through a comparison with other nonlinear ensemble regression techniques. Finally, the change detection results were compared to determine the applicability of this method in change detection.
The composition of this paper is as follows. Section 2 describes the background of linear radiometric normalization, RF regression, and other nonlinear ensemble regressions. Section 3 describes the images used in this study and the methodology. Sections 4 and 5 present the results and discussions of the RF regression normalization and change detection, and our conclusions are presented in Section 6.

Linear Radiometric Normalization
Conventional relative radiometric normalization methods assume that radiometric relationships between the subject image and reference image are linear. The objective of these methods is to rectify the subject image to a reference image through a linear transform [8]. The common form for linear radiometric normalization is given by Equation (1):
$$y_i^N = a_i x_i + b_i \tag{1}$$
where $x_i$ is the digital number of band $i$ in the subject image; $y_i^N$ is the digital number of band $i$ in the normalized subject image; and $a_i$, $b_i$ are normalization constants for band $i$. The relative radiometric normalization process can be divided into two steps: first, selecting the normalization targets; and, second, determining the normalization coefficients [8].

Mean-Standard Deviation (MS) Regression
This method normalizes the image so that the subject image X and reference image Y have the same mean and standard deviation in all bands. The mean-standard deviation (MS) normalization coefficients are derived using Equation (2) [25]:
$$a_i = \frac{S_{y_i}}{S_{x_i}}, \qquad b_i = \bar{y}_i - a_i \bar{x}_i \tag{2}$$
where $\bar{x}_i$ and $\bar{y}_i$ are the means of band $i$, and $S_{x_i}$ and $S_{y_i}$ are the standard deviations of band $i$, in the subject image and the reference image, respectively.
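As a minimal sketch of MS normalization for a single band, the following Python computes the gain and offset from two lists of digital numbers. The function name `ms_coefficients` and the pixel values are illustrative assumptions and are not taken from the study.

```python
from statistics import mean, pstdev

def ms_coefficients(subject, reference):
    """Gain a and offset b so that a*x + b has the reference mean and std."""
    a = pstdev(reference) / pstdev(subject)
    b = mean(reference) - a * mean(subject)
    return a, b

# Hypothetical digital numbers for one band (not from the study).
subject = [10, 20, 30, 40, 50]
reference = [25, 45, 65, 85, 105]

a, b = ms_coefficients(subject, reference)
normalized = [a * x + b for x in subject]
```

After the transform, the normalized band shares the mean and standard deviation of the reference band, which is the only property MS regression enforces.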

Simple Regression (SR)
In this method, the subject image X is regressed against the reference image Y in each band. Simple regression (SR) normalization uses least squares to derive the normalization coefficients. The SR normalization coefficients are obtained by solving Equation (3) [12,25]:
$$Q = \sum \left( y_i - a_i x_i - b_i \right)^2 = \min \tag{3}$$
where $y_i$ is the digital number of band $i$ in the reference image and the summation runs over the whole scene. Solving this equation gives the normalization coefficients expressed in Equation (4):
$$a_i = \frac{S_{x_i y_i}}{S_{x_i x_i}}, \qquad b_i = \bar{y}_i - a_i \bar{x}_i \tag{4}$$
where $S_{x_i x_i}$ is the variance of band $i$ in the subject image and $S_{x_i y_i}$ is the covariance of band $i$ in the subject and reference images.
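The least-squares coefficients can be sketched for one band as follows. The function name `sr_coefficients` and the sample values are hypothetical; the same function could be applied to a subset of pixels instead of the whole scene.

```python
from statistics import mean

def sr_coefficients(subject, reference):
    """Least-squares slope a = S_xy / S_xx and intercept b = y_bar - a * x_bar."""
    x_bar, y_bar = mean(subject), mean(reference)
    n = len(subject)
    s_xx = sum((x - x_bar) ** 2 for x in subject) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(subject, reference)) / n
    a = s_xy / s_xx
    b = y_bar - a * x_bar
    return a, b

# Hypothetical digital numbers for one band (not from the study).
subject = [10, 20, 30, 40, 50]
reference = [24, 46, 65, 84, 106]

a, b = sr_coefficients(subject, reference)
```

Unlike MS regression, the slope here depends on the covariance between the two images, so pixels that changed between dates pull the fitted line away from the no-change relationship.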

No-Change (NC) Regression
This method is based on a linear regression line and utilizes only no-change pixels in the subject and reference images to calculate the normalization coefficients [28,29]. The no-change pixels are determined based on the scattergram between the near-infrared bands of the subject image and the reference image [28]. Once the no-change set is identified, the least-squares equation can be solved to obtain the normalization coefficients, as given in Equation (5):
$$a_i = \frac{S^{NC}_{x_i y_i}}{S^{NC}_{x_i x_i}}, \qquad b_i = \bar{y}_i^{NC} - a_i \bar{x}_i^{NC} \tag{5}$$
where $\bar{x}_i^{NC}$ and $\bar{y}_i^{NC}$ are the means of the NC sets in band $i$ of the subject image and reference image, respectively; $S^{NC}_{x_i x_i}$ is the variance of the NC set in band $i$ of the subject image; and $S^{NC}_{x_i y_i}$ is the covariance of the NC sets in band $i$ of the subject and reference images.

Random Forest Regression
RF regression is a non-parametric ensemble approach that depends on classification and regression tree (CART) models [17]. In regression problems, an RF is a collection of an arbitrary number of simple trees, whose responses are combined to obtain an estimate of the dependent variable. Given an input vector $x$, the technique independently creates $K$ regression trees $h_k(x)$, $(k = 1, \ldots, K)$, and the model prediction is obtained as the mean of the predictions from the individual trees in the forest [17,24,30], as in Equation (6):
$$\hat{h}(x) = \frac{1}{K} \sum_{k=1}^{K} h_k(x) \tag{6}$$
As the results of the trees are averaged, the sample variance can be reduced [18,30]. To avoid correlation among trees, the diversity of the trees is increased by growing them from different training data subsets created through a procedure called bootstrap aggregating, or bagging [30,31]. Bagging creates each training subset by randomly resampling the original data with replacement; in other words, samples selected for one subset are not removed from the input samples when the next subset is drawn [17,32]. These characteristics render the prediction of each tree independent and, as a result, RF regression is less sensitive than other machine learning regressions to the quality of the training samples and to overfitting. It is therefore robust against data that include noise [33]. Furthermore, samples that are not selected for training the $k$th tree in the bagging process are included in another subset called the out-of-bag (OOB) data [34]. The OOB data, which comprise about one-third of the samples for each tree, are used to evaluate the performance of an RF regression. The OOB error is the mean squared error (MSE) obtained through OOB prediction and is represented by Equation (7) [35]:
$$\text{OOB-MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i^{OOB} \right)^2 \tag{7}$$
where $n$ is the number of observations, $y_i$ denotes the observed value for the $i$th observation, and $\hat{y}_i^{OOB}$ denotes the average prediction for the $i$th observation from all trees for which that observation was OOB. However, as the OOB-MSE depends on the scale of the measurement, OOB-$R^2$ was calculated using Equation (8):
$$\text{OOB-}R^2 = 1 - \frac{\text{OOB-MSE}}{\mathrm{Var}_y} \tag{8}$$
where $\mathrm{Var}_y$ is the total variance of the response variable. RF regression also produces a measure that ranks variables according to their importance [34,35]. In turn, each variable is permuted, and regression trees are grown on the modified dataset. The importance measure of each variable is then calculated as the difference in the MSE between the original OOB dataset and the modified dataset. This aspect is particularly useful for multi-source studies where data dimensionality is very high [34,36], and it is important to know how each variable influences the prediction model to select the most suitable variables [37,38].
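The bagging and OOB evaluation described above can be sketched in plain Python. This is an illustration under strong simplifying assumptions, not the RF implementation used in the study: each "tree" is a one-split regression stump on a single variable, no random variable selection (mtry) takes place, and the data are synthetic. The names `fit_stump` and `bagged_oob` are hypothetical.

```python
import random
from statistics import mean, pvariance

def fit_stump(xs, ys):
    """Fit a one-split regression tree on 1-D data; returns a predict function."""
    best = None
    ux = sorted(set(xs))
    for lo, hi in zip(ux, ux[1:]):              # candidate thresholds at midpoints
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:                            # all x identical: predict the mean
        m = mean(ys)
        return lambda x: m
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr

def bagged_oob(xs, ys, n_trees=50, seed=0):
    """Bag stumps and return (OOB-MSE, OOB-R^2, mean OOB fraction per tree)."""
    rng = random.Random(seed)
    n = len(xs)
    oob_preds = [[] for _ in range(n)]
    oob_fractions = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap: with replacement
        in_bag = set(idx)
        tree = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        oob = [i for i in range(n) if i not in in_bag]
        oob_fractions.append(len(oob) / n)
        for i in oob:
            oob_preds[i].append(tree(xs[i]))
    # Score only observations that were OOB for at least one tree.
    scored = [(ys[i], mean(p)) for i, p in enumerate(oob_preds) if p]
    mse = mean((y - y_hat) ** 2 for y, y_hat in scored)
    r2 = 1 - mse / pvariance([y for y, _ in scored])
    return mse, r2, mean(oob_fractions)

# Synthetic step-shaped data with Gaussian noise (not from the study).
noise = random.Random(1)
xs = [i / 59 for i in range(60)]
ys = [(10.0 if x < 0.5 else 20.0) + noise.gauss(0, 1) for x in xs]

oob_mse, oob_r2, oob_frac = bagged_oob(xs, ys)
```

The mean OOB fraction comes out near 1/e (roughly 0.37), which is the origin of the "about one-third" figure, and dividing the OOB-MSE by the response variance makes the score independent of the measurement scale.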

Adaptive Boosting Regression
Adaptive Boosting (AdaBoost) is a sequential ensemble method that was originally developed to enhance classification and regression trees [39]. The main idea of AdaBoost is to construct a succession of weak learners by using different training sets derived from resampling the original data. The algorithm learns a set of classifiers by using a weak learner and combines them to produce the final classifier. In AdaBoost, each training instance receives a weight that is used when learning each hypothesis; this weight indicates the relative importance of each instance and is used in computing the error of a hypothesis on the dataset. After each iteration, instances are reweighted, with those instances not correctly classified by the last hypothesis receiving larger weights. Thus, as the process continues, learning focuses on those instances that are most difficult to classify. The key to AdaBoost is the reweighting of those instances that are misclassified at each iteration. In regression problems, the output for an instance is not simply correct or incorrect, but has a real-valued error that may be arbitrarily large [40]. The prediction error is compared against a threshold to mark it as an error (or not), and the AdaBoost version for classification is used [41]. The probabilities kept by the algorithm are modified based on the magnitude of the error; instances with a large error on the previous learners have a higher probability of being chosen to train the following base learner. The median or weighted average is then applied to combine the predictions of the different base learners [42].

Stochastic Gradient Boosting Regression
Stochastic gradient boosting (SGB) is an ensemble approach related to both boosting and bagging [43,44]. Many small classification or regression trees are built sequentially from pseudo-residuals (the gradient of the loss function of the previous tree) [45]. At each iteration, a tree is built from a random subsample of the dataset (selected without replacement), producing an incremental improvement in the model. Using only a fraction of the training data increases both the computation speed and the prediction accuracy, while also helping to avoid overfitting the data. An advantage of SGB is that it is not necessary to pre-select or transform predictor variables. Furthermore, it is also resistant to outliers, as the steepest gradient algorithm emphasizes points close to their correct classification [46]. It should be noted that gradient boosting is functionally similar to RF since it creates a tree ensemble and also uses randomization during the creation of the trees. However, whereas an RF builds trees in parallel that "vote" on the prediction, gradient boosting creates a series of trees in which each tree incrementally improves the prediction.

Study Sites and Data
The study sites were located in Gimpo (126.37° E, 37.29° N), in the central-western part of South Korea. Gimpo has been a site of development since 2006. Therefore, to detect changes before and after this development began, two dates were selected: 18 March 2005 for the reference image and 27 September 2011 for the subject image. Figure 1a,b illustrates the RGB images of the subject image and reference image. Moreover, the Korean Peninsula undergoes an increasing Leaf Area Index (LAI) trend beginning in March and shows elevated LAI values in September. Thus, there are phenological variations between the two images [47]. The data used in this research are images obtained from the Landsat-5 Thematic Mapper (TM) sensor and the Global Digital Elevation Model (GDEM) V2 obtained from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) sensor. In the case of the GDEM, the image obtained on 17 October 2011, which is temporally similar to the subject image, was utilized. One of the most important preprocessing steps for accurate normalization and change detection results is image registration [48], which matches the geometric location information of two images. The Landsat-5 TM images used in this study were the Level-1 precision and terrain corrected product (L1TP), which was corrected geometrically and topographically through co-registration [49]. Moreover, only areas without cloud or fog were extracted to avoid the influence of such weather conditions in the Landsat-5 TM images. Thus, from each image, 600 × 600 pixels at the same location without cloud or fog were extracted to implement the experiment.

Automatic Detection of No-Change Pixel
To ensure that the no-change pixel selection process was independent of operator performance effects, the selection was carried out through an automatic process. In this study, we extracted the no-change set based on the scattergram used in NC regression. As mentioned above, the no-change pixels were identified in the scattergram between the near-infrared bands of the subject and reference images. This method locates the statistical centers of the water and land-surface clusters based on local maxima for the near-infrared bands in the scattergram [28]. The no-change set is defined as indicated in Equation (9) [50]:
$$NC = \left\{ \left| y_4 - a_{40} x_4 - b_{40} \right| \le HVW \right\} \tag{9}$$
where $x_4$ and $y_4$ are the near-infrared (Band 4) digital numbers of the subject and reference images; $a_{40}$ and $b_{40}$ are the initial normalization coefficients for the near-infrared band, obtained from the centers of the water and land-surface clusters; and HVW is the half vertical width of the no-change region in the scattergram, given the half perpendicular width (HPW) of the no-change region. The relationship between HVW and HPW is expressed by Equation (10) [25,28]:
$$HVW = \sqrt{1 + a_{40}^2} \; HPW \tag{10}$$
Next, it was assumed that the majority (>50%) of pixels in the images did not experience significant land-cover change between the two dates represented by the reference and subject images [25]. In addition, the obtained no-change set should have a correlation coefficient greater than 0.9 [12,13]. In this study, the center coordinate of the water cluster in the scattergram was (5, 5), and the center coordinate of the land-surface cluster was (71, 88). From these two coordinates, the initial normalization coefficients were acquired: $a_{40}$ = 1.2575 and $b_{40}$ = −1.2878. The HPW value was selected as 11 to include 52.56% of the pixels as no-change regions (189,221 out of 360,000); the value of HVW was 17.6737, obtained from Equation (10) with this $a_{40}$ value, and the correlation coefficient in the no-change set was 0.9740, which was greater than 0.9.
Figure 2a shows the scattergram and no-change lines of the subject and reference images for the near-infrared bands. Each color in the scattergram indicates point density, with the corresponding colors representing the numbers indicated in the legend. Thus, the areas that appear red have a higher point density than the areas that appear black. Figure 2b shows the no-change pixels selected based on the scattergram; the black area indicates no change, and the white area represents change.
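The reported coefficients can be checked with a short calculation. The sketch below assumes that the initial line passes through the two cluster centers, that the first coordinate of each center is the subject-image value and the second the reference-image value, and that HVW follows from HPW by the geometry of a band around a line of slope a (HVW = sqrt(1 + a²) · HPW). The function names are hypothetical.

```python
from math import sqrt

def initial_line(water_center, land_center):
    """Line through the water and land cluster centers (x = subject, y = reference)."""
    (xw, yw), (xl, yl) = water_center, land_center
    a = (yl - yw) / (xl - xw)
    b = yw - a * xw
    return a, b

def half_vertical_width(a, hpw):
    """Vertical half-width of a band whose perpendicular half-width is hpw."""
    return sqrt(1 + a * a) * hpw

def is_no_change(x, y, a, b, hvw):
    """True if the near-infrared pixel pair lies within the no-change band."""
    return abs(y - a * x - b) <= hvw

# Cluster centers and HPW as reported in the text.
a0, b0 = initial_line((5, 5), (71, 88))
hvw = half_vertical_width(a0, 11)

inside = is_no_change(71, 88, a0, b0, hvw)     # land cluster center
outside = is_no_change(71, 120, a0, b0, hvw)   # hypothetical changed pixel
```

Under these assumptions the slope, intercept, and HVW agree with the reported 1.2575, −1.2878, and 17.6737 to the precision given in the text.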

Radiometric Normalization Using Random Forest Regression
Many methods have been developed for relative radiometric normalization based on linear assumptions. However, as mentioned above, in cases where radiometric and phenological conditions differ in combination, all of the corrections should be merged, and a nonlinear transfer is necessary. For this reason, RF regression is used for nonlinear adjustment and radiometric correction using the no-change region. In this study, there were two main phases: training and prediction.
The training stage refers to the process of constructing the RF regression. For high-quality RF regression performance, the number of variables to be selected and tested for the best split when growing the trees (mtry) and the number of decision trees to be generated (ntree) should be optimized [22,23]. In this study, the mtry was set to the square root of the number of input variables, as is commonly done [36]. Table 1 shows the selected explanatory variables. The total number of variables in this study was 27, and these variables were divided into two categories: variables extracted from the subject image, and variables extracted from the ASTER GDEM, which is temporally similar to the subject image. The variables selected for the former were Bands 1-5 and Band 7, which contained the spectral characteristics of the subject image. The gray-level co-occurrence matrix (GLCM) data were obtained using statistical values between a given pixel and the neighboring pixels, which showed the spatial characteristics of the pixels as a texture feature [51]. The statistical parameters applied in this study were angular second moment (ASM), contrast, correlation, and entropy, which provide better texture feature extraction in remote sensing [52]. ASM reflects the regularity and uniformity of the image distribution; contrast reflects the depth and smoothness of the image texture structure; correlation reflects the similarity of the image texture in the horizontal or vertical direction; and entropy is a measure of image information that reflects the complexity of the texture distribution [52]. The mean and variance of the pixel values, which are the main statistical values, were also used. Next, the GLCM, mean, and variance were obtained only for the R (Band 3), G (Band 2), and B (Band 1) values to be modeled for visual interpretation. For the latter group, elevation, slope, and aspect were selected. Digital elevation models (DEMs) are increasingly being used for visual and mathematical analysis of topography, landscape, and landforms, as well as for modeling surface processes [53]. For these reasons, DEMs have become a core source of data for modeling in remote sensing, providing information about the Earth's surface and spatial distribution that cannot be obtained from images alone [54,55]. In particular, elevation, slope, and aspect are the three primary topographic variables [56]. Slope and aspect data derived from DEMs are important variables used regularly in geographical information systems for modeling purposes [54]. In the case of the ntree, a majority of studies have set the value to 500, as the errors stabilize before this number of regression trees is reached [23]. However, the sensitivity analysis in Reference [57] showed that the number of trees had no influence on the results. For this reason, 32, 64, 128, 256, 512, and 1024 were tested as the number of trees, and the optimal ntree value was derived by considering the performance of the RF regression calculated using Equation (8), as well as the training time.
The prediction stage refers to the process of modeling the radiometric and phenological normalized images. Modeling was performed using the RF regression obtained from the no-change region in the training stage. First, explanatory variables corresponding to each pixel position were acquired for all pixels of the subject image. Then, the obtained explanatory variables were used as the input values of the previously obtained RF regression. Finally, the normalized subject image was obtained from the resulting prediction values. Only the R, G, and B bands were modeled for visual interpretation.

Accuracy Assessment of Radiometric Normalization
The quality of radiometric normalization can be statistically assessed based on the root mean squared error (RMSE) and $R^2$ value between pairs of corresponding bands. The RMSE is defined as shown in Equation (11), and the $R^2$ value is defined as in Equation (12), where $N$ is the total number of pixels in the scene. A lower RMSE value indicates a better fit and therefore better normalization results. The closer the $R^2$ value is to 1, the better the radiometric normalization process.
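A minimal sketch of the two scores for one band is given below. It assumes that the RMSE is the root of the mean squared difference between the normalized subject band and the reference band, and that $R^2$ is the coefficient of determination (one minus the ratio of residual to total sum of squares); the exact forms of Equations (11) and (12) are not reproduced in this text, so the latter in particular is an assumption. Function names and pixel values are hypothetical.

```python
from math import sqrt
from statistics import mean

def rmse(normalized, reference):
    """Root mean squared error between normalized and reference pixel values."""
    return sqrt(mean((p - r) ** 2 for p, r in zip(normalized, reference)))

def r_squared(normalized, reference):
    """Coefficient of determination, treating the reference band as the target."""
    r_bar = mean(reference)
    ss_res = sum((r - p) ** 2 for p, r in zip(normalized, reference))
    ss_tot = sum((r - r_bar) ** 2 for r in reference)
    return 1 - ss_res / ss_tot

# Hypothetical pixel values for one band (not from the study).
reference = [20, 40, 60, 80]
normalized = [22, 38, 61, 79]

band_rmse = rmse(normalized, reference)
band_r2 = r_squared(normalized, reference)
```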

Change Detection
In this study, to investigate the use of the normalized subject images generated by RF regression in change detection, image differencing was performed with the reference image, and the results were compared with the change detection results of normalized subject images generated by linear regression. Image differencing is a method of subtracting the pixel values of two images with a common coordinate system to observe changes between two different points in time. If the result of the operation has a clearly positive or negative value, change has occurred. If the value is close to 0, there has been no change. The result of the operation is then converted into a positive value by adding an arbitrary constant (offset value). The process of image differencing is expressed as Equation (13) [3].
Remote Sens. 2017, 9, 1163
where $I_1$ is the pixel value at time 1; $I_2$ is the pixel value at time 2; $C$ is the arbitrary constant; and $i$, $j$, and $k$ are the row, column, and band, respectively. After the execution of image differencing, a threshold should be applied to distinguish between change and no change. In this study, Otsu's method was used as it is the most practical method to determine the threshold value based on the brightness distribution of the input image [58]. Given the characteristics of pixel-based change detection, "salt-and-pepper"-type noise can occur [59], which leads to the false detection of unchanged areas as changed areas. To address this problem, segmentation was performed to determine the minimum object size, and regions in the change or no-change areas that did not satisfy the minimum object size were identified as noise. For segmentation, the simple linear iterative clustering (SLIC) technique was used due to its relative simplicity and rapid computing speed [60]. SLIC is a segmentation method that entails establishing the cluster space size, creating k cluster centers, measuring color and spatial distance, and assigning each pixel to the closest cluster center [60,61]. Experiments were performed by setting the cluster space size to 5 × 5, 7 × 7, and 9 × 9 pixels.
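The differencing and thresholding steps can be sketched on a toy single-band image. Several choices here are assumptions rather than the study's procedure: the subtraction order (time 1 minus time 2), the offset value of 127, and the application of Otsu's threshold to the magnitude of the difference (the offset image minus C, in absolute value). The SLIC-based noise removal is not shown. All names and pixel values are hypothetical.

```python
def difference_image(img1, img2, offset):
    """Per-pixel I1 - I2 + C; the subtraction order is an assumption."""
    return [[p1 - p2 + offset for p1, p2 in zip(r1, r2)]
            for r1, r2 in zip(img1, img2)]

def otsu_threshold(values, levels=256):
    """Threshold t maximizing between-class variance; class 0 holds values <= t."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = sum0 = 0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue                      # one class is empty at this threshold
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t

C = 127  # hypothetical offset that keeps all differences non-negative
time1 = [[50, 52, 51], [49, 120, 125], [50, 51, 118]]
time2 = [[51, 50, 52], [50, 60, 62], [49, 50, 61]]

diff = difference_image(time1, time2, C)
magnitude = [[abs(d - C) for d in row] for row in diff]
t = otsu_threshold([m for row in magnitude for m in row])
change_mask = [[m > t for m in row] for row in magnitude]
```

In this toy case the three pixels whose values dropped by about 60 digital numbers are flagged as change, and the remaining pixels, which differ by one or two digital numbers, are not.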
To evaluate change detection accuracy, ground-truth data distinguishing the areas with actual change from those with no change were generated by manual digitizing. Manual digitizing generates ground-truth data by directly interpreting images from two dates through a combination of spatial properties (size, shape, texture, and pattern) and spectral properties (tone and color) [62]. The ground-truth data for this study are shown in Figure 3. Based on these ground-truth data, the user's accuracy, producer's accuracy, and overall accuracy were obtained. Overall accuracy is the percentage of samples correctly classified out of the entire sample. The user's accuracy is the complement of the error of commission, whereas the producer's accuracy is the complement of the error of omission. In other words, the user's accuracy represents the proportion of correctly classified samples among the classified samples of the acquired map, and the producer's accuracy represents the proportion of correctly classified samples among the ground-truth class samples [63].
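For the change class, the three accuracy measures can be computed from a binary change map and the ground truth as in the following sketch. The function name `change_accuracy` and the sample labels are hypothetical, and the sketch assumes both the map and the ground truth contain at least one change pixel.

```python
def change_accuracy(predicted, truth):
    """User's, producer's, and overall accuracy for the change class (1 = change)."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)          # change detected correctly
    fp = sum(1 for p, t in pairs if p and not t)      # commission errors
    fn = sum(1 for p, t in pairs if not p and t)      # omission errors
    tn = sum(1 for p, t in pairs if not p and not t)  # no change detected correctly
    user = tp / (tp + fp)          # correct among pixels mapped as change
    producer = tp / (tp + fn)      # correct among ground-truth change pixels
    overall = (tp + tn) / len(pairs)
    return user, producer, overall

# Hypothetical flattened change map and ground truth (not from the study).
predicted = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
truth     = [1, 1, 0, 0, 0, 0, 1, 1, 0, 0]

user_acc, producer_acc, overall_acc = change_accuracy(predicted, truth)
```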
Assessment of Accuracy of the Linear Regression Method
For comparison with the proposed method, R² and RMSE values were computed between the normalized subject images generated by linear regression and the reference images. The obtained R² and RMSE values are shown in Table 2. Relative to the raw subject images, the mean R² values of MS regression, SR, and NC regression improved by 0.12%, 0.01%, and 0.13%, respectively, and the mean RMSE values improved by 25.61%, 33.22%, and 24.93%, respectively. RMSE values thus improved modestly, whereas R² values barely changed. The scattergrams of Band 1, Band 2 and Band 3 against the reference images were obtained to confirm the bias, as shown in Figure 4. Compared with the scattergram of the raw subject images, the bias was slightly reduced in the case of MS and NC regression, but bias was still present; almost no bias was removed in the case of SR.
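The pattern reported above (lower RMSE, almost unchanged R²) is what a per-band gain and offset would be expected to produce. The sketch below applies the standard mean-standard deviation transform to toy values and evaluates both statistics. It assumes that R² is the squared Pearson correlation, which this section does not state, and all function names and pixel values are illustrative.

```python
import math
from statistics import mean, pstdev


def ms_regression(subject, reference):
    """Match the mean and standard deviation of subject to those of reference."""
    gain = pstdev(reference) / pstdev(subject)
    offset = mean(reference) - gain * mean(subject)
    return [gain * x + offset for x in subject]


def rmse(a, b):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(mean((x - y) ** 2 for x, y in zip(a, b)))


def r_squared(a, b):
    """Squared Pearson correlation (assumed definition of R^2)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov * cov / (var_a * var_b)


# Assumed toy pixel values for one band.
subject = [20, 30, 40, 50, 60]
reference = [52, 70, 95, 110, 133]
normalized = ms_regression(subject, reference)

rmse_before, rmse_after = rmse(subject, reference), rmse(normalized, reference)
r2_before, r2_after = r_squared(subject, reference), r_squared(normalized, reference)
```

Under this definition a linear transform with positive gain removes the offset and scale difference, so RMSE drops, but it cannot change the correlation, so R² stays the same. This is one possible reason the reported R² gains are so small.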

Assessment of Accuracy of Random Forest Regression
As mentioned in Section 3.2.2, to select the optimal ntree, six experiments were performed using 32, 64, 128, 256, 512, and 1024 trees. As shown in Table 3 and Figure 5, the R² values obtained from the out-of-bag (OOB) samples in Bands 1-3 were similar regardless of the number of trees. However, the training times increased in proportion to the number of trees. Based on the training time as well as the performance in terms of R², using 32 trees was found to be the most efficient; therefore, 32 was selected as the ntree. In addition, the variable importance was obtained for each band, and only the top five of the 27 included variables were extracted. For Band 1, the top five variables were Band 1, Band 2, Mean of Band 1, Band 3, and Mean of Band 2, with scores of 0.2023, 0.1440, 0.0825, 0.0797, and 0.0666, respectively. For Band 2, the top five variables were Mean of Band 2, Band 2, Band 1, Band 3, and Mean of Band 3, with scores of 0.1606, 0.1192, 0.1085, 0.1054, and 0.0836, respectively. For Band 3, the top five variables were Band 5, Band 3, Band 1, Mean of Band 1, and Band 2, with scores of 0.2131, 0.1376, 0.0999, 0.0736, and 0.0707, respectively. With the exception of Band 5 for Band 3, the Band 1, Band 2, Band 3, and Mean of Band variables were confirmed to have high importance. It should be noted, however, that the variable importance scores depend on the number of included variables. Removing or replacing predictors, for example, may change the importance scores, as different inter-correlated variables could act as surrogates.
The scattergrams of Band 1, Band 2 and Band 3 between the normalized subject images generated by RF regression and the reference images are illustrated in Figure 6. The bias between these data was substantially reduced when compared to the scattergrams of the normalized subject images generated by linear regression. In addition, to evaluate the performance of radiometric normalization, R² and RMSE values were computed between the normalized subject images generated by RF regression and the reference images. The obtained R² and RMSE values are shown in Table 4.
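The tree-count experiment and the variable-importance ranking described above can be sketched with scikit-learn. This is only an illustration under stated assumptions: the study's software is not named in this section, the data are synthetic (27 random predictors and a made-up nonlinear target standing in for one reference band), and only three of the six ntree values are tried to keep the run short.

```python
import time

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the no-change samples: 27 explanatory variables
# per pixel and one target band (assumed data, not the study's).
rng = np.random.default_rng(0)
n_samples, n_features = 1000, 27
X = rng.uniform(0, 255, size=(n_samples, n_features))
y = 0.5 * X[:, 0] + 20 * np.sin(X[:, 1] / 40.0) + rng.normal(0, 2.0, n_samples)

results = {}
for ntree in (32, 64, 128):
    start = time.perf_counter()
    rf = RandomForestRegressor(n_estimators=ntree, oob_score=True, random_state=0)
    rf.fit(X, y)
    # oob_score_ is the R^2 computed on the out-of-bag samples.
    results[ntree] = (rf.oob_score_, time.perf_counter() - start)

# Indices of the five most important variables for the last fitted model.
top5 = np.argsort(rf.feature_importances_)[::-1][:5]

for ntree, (oob_r2, seconds) in results.items():
    print(f"ntree={ntree:4d}  OOB R2={oob_r2:.4f}  time={seconds:.2f}s")
print("top-5 variable indices:", top5.tolist())
```

In the study one such model would be fitted per band, with the reference-image band as the target. The importance scores from `feature_importances_` sum to one, so they shift whenever predictors are added or removed, which matches the caveat above.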

Comparison of Accuracy between Random Forest Regression and Linear Regression
To visually compare the results, normalized subject images were acquired, as illustrated in Figure 7. In the normalized subject images generated by linear regression, phenological normalization was not achieved, whereas the normalized subject image generated by RF regression showed spectral characteristics similar to those of the reference image while retaining the spatial characteristics of the subject image. To compare the quantitative values, the R² and RMSE values were plotted, as shown in Figure 8. The results showed that the normalized subject images generated by RF regression performed much better than the raw subject images and the normalized subject images generated by linear regression. Applying RF regression improved the mean R² values relative to the raw image, MS regression, SR, and NC regression by 53.88%, 53.75%, 53.87%, and 53.75%, respectively. In addition, the mean RMSE values were improved by 74.76%, 66.07%, 54.69%, and 66.37%, respectively.

Assessment of Accuracy of Other Nonlinear Regression Methods and Comparison of Accuracy with Random Forest Regression
As previously mentioned, two other nonlinear ensemble regression methods, AdaBoost regression and SGB regression, were compared for further evaluation of normalization using RF regression. For comparison under the same conditions, the same explanatory variables as in RF regression were used, and the number of weak learners in AdaBoost regression and the number of trees in SGB regression were both set to 32, as in the RF regression. The scattergrams of the normalized images and the reference image obtained through each regression are shown in Figure 9. The bias was still present, and the performance was significantly lower when compared with the scattergram of the normalized subject image generated by RF regression. The normalized subject image obtained by each regression is also shown in Figure 10. The normalized subject image generated by AdaBoost regression did not contain any spectral characteristics of the reference image. The normalized subject image generated by SGB regression was closer to the spectral characteristics of the reference image than the image generated by AdaBoost regression, but its visual performance was significantly lower than that of the normalized subject image generated by RF regression.
For the quantitative comparison, the normalization performance of the two images was obtained, as shown in Table 5. The mean R² and RMSE of the normalized subject image generated by AdaBoost regression were 0.4162 and 10.0246, respectively. The mean R² and RMSE of the normalized subject image generated by SGB regression were 0.4406 and 10.1438, respectively. In comparison with the normalization performance of RF regression, the normalization performance of AdaBoost regression was lower by 49.28% for R² and 18.06% for RMSE, and that of SGB regression was lower by 46.84% for R² and 19.02% for RMSE. In other words, when compared with other nonlinear ensemble regressions, RF regression was confirmed to provide stable and good normalization performance, which further supports the usefulness of radiometric and phenological normalization using RF regression.
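The percentages above can be read in more than one way, so a short arithmetic check may help. The sketch below takes the AdaBoost and SGB values quoted above and assumes that the R² gaps are differences in percentage points while the RMSE gaps are relative reductions. The implied RF values are an inference from this section only (Table 4 is not reproduced here), not reported figures.

```python
# Values quoted in the text: mean R^2, mean RMSE, and the stated gaps to RF (%).
ada = {"r2": 0.4162, "rmse": 10.0246, "r2_gap": 49.28, "rmse_gap": 18.06}
sgb = {"r2": 0.4406, "rmse": 10.1438, "r2_gap": 46.84, "rmse_gap": 19.02}


def implied_rf(method):
    """RF mean R^2 and RMSE implied by one method's values and gaps.

    Assumption: the R^2 gap is in percentage points (absolute difference x 100),
    and the RMSE gap is a reduction relative to the method's own RMSE.
    """
    r2 = method["r2"] + method["r2_gap"] / 100
    rmse = method["rmse"] * (1 - method["rmse_gap"] / 100)
    return round(r2, 4), round(rmse, 2)


rf_from_ada = implied_rf(ada)
rf_from_sgb = implied_rf(sgb)
print(rf_from_ada, rf_from_sgb)
```

Both comparisons imply the same RF result under this reading (mean R² of about 0.909 and mean RMSE of about 8.21), which suggests the reading is self-consistent.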

Analysis of Change Detection
To evaluate the effectiveness of the proposed method in change detection, image differencing was performed between each normalized subject image and the reference image. Then, as mentioned above, SLIC was performed with cluster space sizes of 5 × 5, 7 × 7, and 9 × 9, and the noise in the change detection results was removed using the obtained minimum object sizes. The results are illustrated in Figures 11-14. When interpreted visually against the ground-truth data, the change detection results of the normalized subject images generated by linear regression overestimate change when compared with the change detection results of the normalized subject image generated by RF regression.
Quantitative comparisons of each change detection result are shown in Table 6. In terms of mean overall accuracy, change detection using the normalized subject images generated by RF regression was 2.98%, 22.95%, and 10.02% higher than the change detection results obtained through MS regression, SR, and NC regression, respectively. In addition, the user's accuracy was improved by 12.53%, 26.81%, and 20.45%, and the producer's accuracy by 2.16%, 9.61%, and 4.78%, respectively. Thus, change detection performance was significantly improved by using the normalized subject images generated by RF regression.
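The noise-removal step can be illustrated by deleting connected change regions smaller than a minimum object size. The sketch below does not reproduce SLIC itself: it assumes the minimum object size is already known, uses 4-connectivity, and cleans only the change class. The function name `remove_small_objects` and the `min_size` parameter are hypothetical.

```python
from collections import deque


def remove_small_objects(mask, min_size):
    """Set to 0 every 4-connected region of 1s with fewer than min_size pixels."""
    rows, cols = len(mask), len(mask[0])
    cleaned = [row[:] for row in mask]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search to collect one connected region.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = [(r, c)]
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols \
                            and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
                        region.append((ny, nx))
            if len(region) < min_size:
                for y, x in region:
                    cleaned[y][x] = 0
    return cleaned


# Assumed toy change mask: one isolated pixel and one 2 x 2 block.
mask = [
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
cleaned = remove_small_objects(mask, min_size=3)
```

The isolated pixel is treated as salt-and-pepper noise and removed, while the 2 × 2 block is kept. Applying the same function to the inverted mask would clean small no-change holes in the same way.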

Discussion
The objective of this paper was to evaluate the applicability of RF regression to compositive normalization of radiometric and phenological conditions, which requires nonlinear modeling. This paper assessed the performance of RF regression based on statistical values and training time according to the number of trees. Six experiments were performed, using 32, 64, 128, 256, 512, and 1024 trees. The training time increased in proportion to the number of trees, whereas the statistical values were not significantly affected by the number of trees. For this reason, it was judged that normalization performance using RF regression was not greatly influenced by the number of trees. Furthermore, in the case of variable importance, it was confirmed that the importance of the GLCM, variance, and DEM-related variables was considerably small, and that the importance of Bands 1-3 and the mean variables was generally large. In other words, the variables that had the greatest influence on the normalization were the pixel values of the RGB bands, rather than data such as texture or topography. Next, normalization performance was assessed by comparing R² and RMSE with those of conventional linear regression (MS regression, SR, and NC regression) and, as a further check of usefulness, with those of other nonlinear ensemble regressions (AdaBoost regression and SGB regression). Finally, the effectiveness of normalization using RF regression in change detection was confirmed by performing change detection.
The RF regression algorithm was credited for its nonlinear expression, stability, and satisfactory accuracy. First, RF regression (which is characterized by nonlinear regression) provided accurate radiometric and phenological normalization. The visual analysis of the normalized subject images generated by conventional linear regression revealed that phenological normalization was not achieved, and the statistical values were also unsatisfactory. Unlike conventional linear regression approaches for normalization, the normalized subject image generated by the proposed method contained the spatial characteristics of the subject image and the spectral characteristics of the reference image. Furthermore, the R² and RMSE values were confirmed to be significantly improved. Second, when compared with other nonlinear ensemble regressions, RF regression was able to minimize overfitting and handle high-dimensional data. The performance of both the AdaBoost and SGB regressions was significantly lower than that of RF regression in terms of visual and statistical values. That is, when compared with other ensemble regressions, normalization through RF regression was confirmed to be stable. Third, the normalized subject image generated by RF regression provided satisfactory accuracy in change detection. When change detection was performed on each normalized subject image, the overall accuracy and user's accuracy were significantly improved, and the producer's accuracy was slightly improved, by using the normalized subject image generated by RF regression rather than the normalized subject images generated by conventional linear regression. The producer's accuracy improved less than the overall accuracy and user's accuracy, which was due to the overestimation of change regions in the images normalized by the linear regression approaches. In other words, if phenological normalization is not performed properly, there is a potential risk of producing meaningless information in change detection. Therefore, the proposed method properly normalized the nonlinear relations, both in comparison with nonlinear ensemble regression and with conventional linear regression, and was also useful for change detection.

Conclusions
In this paper, we have addressed the issue of normalizing radiometric and phenological conditions. To achieve this goal, RF regression was selected from among the many potential techniques that can model nonlinear relationships. The comparison results based on visual and statistical values showed that normalization through RF regression produced satisfactory performance. Compared with the normalization performance of conventional linear regression, R² and RMSE were improved on average by 53.81% and 65.47%, respectively. Compared with the normalization performance of the other nonlinear ensemble regressions, R² and RMSE were better on average by 48.01% and 18.54%, respectively. In addition, when compared with the change detection results of the normalized subject images generated by conventional linear regression, the overall accuracy, user's accuracy, and producer's accuracy were improved on average by 11.98%, 19.93%, and 5.52%, respectively. Based on the findings of this study, we concluded that normalization through RF regression had been performed appropriately.
In future studies, efforts should be made to further improve RF regression performance. In particular, the effects of the variable importance and of the combinations of explanatory variables used for RF regression should be further analyzed. Moreover, the usefulness of the proposed approach should be validated by obtaining sufficient reference images for each season and time period, and by utilizing images acquired by other satellites and sensors. In addition, it is necessary to verify the applicability of this method to complex structures by applying RF regression to high-resolution images.

Figure 2. (a) Scattergram of near-infrared bands for subject image and reference image; and (b) the location of selected no-change regions.

Figure 3. Ground truth data obtained by manual digitizing.

Figure 4. Scattergram of Band 1, Band 2 and Band 3 with reference image: (a) raw subject image; (b) normalized subject image generated by MS regression; (c) normalized subject image generated by SR; and (d) normalized subject image generated by NC regression.

Figure 6. Scattergram of Band 1, Band 2 and Band 3 between normalized subject image generated by RF regression and reference image.

Figure 8. Comparison of R² and RMSE for the raw image and normalized subject image generated by different linear regressions and RF regression.

Table 1. Description of the utilized explanatory variables.

Table 2. Statistical results (R² and RMSE) of linear regression normalization.

Table 3. OOB-R² and training time of RF regression.

Table 4. Statistical results (R² and RMSE) of RF regression normalization.

Table 5. Statistical results (R² and RMSE) of AdaBoost and SGB regression normalization.

Table 6. Quantitative change detection result based on user's accuracy, producer's accuracy and overall accuracy.