kCCA Transformation-Based Radiometric Normalization of Multi-Temporal Satellite Images

: Radiation normalization is an essential pre-processing step for generating high-quality satellite sequence images. However, most radiometric normalization methods are linear, and they cannot eliminate the regular nonlinear spectral differences. Here we introduce the well-established kernel canonical correlation analysis (kCCA) into radiometric normalization for the ﬁrst time to overcome this problem, which leads to a new kernel method. It can maximally reduce the image differences among multi-temporal images regardless of the imaging conditions and the reﬂectivity difference. It also perfectly eliminates the impact of nonlinear changes caused by seasonal variation of natural objects. Comparisons with the multivariate alteration detection (CCA-based) normalization and the histogram matching, on Gaofen-1 (GF-1) data, indicate that the kCCA-based normalization can preserve more similarity and better correlation between an image-pair and effectively avoid the color error propagation. The proposed method not only builds the common scale or reference to make the radiometric consistency among GF-1 image sequences, but also highlights the interesting spectral changes while eliminates less interesting spectral changes. Our method enables the application of GF-1 data for change detection, land-use, land-cover change detection etc.


Introduction
There is an increasing demand for image sequence analysis, because of the growing use of multi-sensor and multi-temporal remote sensing data to monitor the land-use and land-cover change (LUCC) and climate change, as well as to analyze the global resource environment [1][2][3][4]. A satellite image sequence is usually constructed from multi-temporal images generated by one common sensor or multiple sensors. Radiometric normalization is a fundamental method used in the pre-processing of satellite image sequence analysis, especially in the pre-processing of change detection [5]. Radiometric normalization can directly make use of the pixel values of an image to establish the corresponding transformation equation for each multi-spectral band in multi-temporal remote sensing data, without the request of any other parameters such as the atmospheric conditions when the remote sensing data obtained [6]. In such a context, radiometric normalization is called spectral alignment [7]. Radiometric normalization builds not only the common radiometric scale/reference but also the radiometric consistency among an image sequence.
Most radiometric normalization methods make use of the pseudo-invariant feature (PIF) to obtain the common scale among an image sequence (PIF)-based [5,8,9]. It is well-known that these invariant features-the targets whose spectral reflectance does not change over time, are necessary for the atmospheric normalization of temporal image sequences, and thus they are the basis of image sequence analysis [10]. All the PIF-based methods are linear, and they minimize the image radiometric differences caused by different acquisition conditions, such as atmospheric conditions, earth-sun distance, detector calibration, illumination angles, viewing angles and sensor oscillation [5,6,[11][12][13].
The core problem of PIF-based methods is the selection of appropriate PIF points [13]. From simple to sophisticated, various PIF-based methods were developed for image normalization, such as the Ridge method [2,[14][15][16], Simple image regression [4], Dark set-Bright set (DB) [17] and Automatic Scattergram Control Regression (ASAR) [18,19], etc. Specifically, Canty and Nielsen developed a powerful and widely used method, the iteratively reweighted multivariate alteration detection (iRMAD) transformation, which is invariant to linear and affine scaling. One advantage of this method is that only one parameter needs to be specified, i.e., the chi-squared percentile [20][21][22][23].
However, such a linear radiometric normalization cannot eliminate the nonlinear spectral differences, especially the surface reflectance differences [24]. In fact, beside PIFs, there are another two kinds of features/targets in image sequences. One kind of features/targets is the feature whose spectral reflectance changes regularly or expectably over time, such as vegetation growth. The other kind of features/targets is the feature whose spectral reflectance changes irregularly over time, such as city expansion and human activity trails. When these irregularly changed features are analyzed in an image sequence, the regularly changed features are the disturbances that need to be eliminated or attenuated by radiometric normalization. The changes caused by the growth of the vegetation are the most typical non-linear changes in radiometric value of image sequence. Especially, they may induce serious disturbance when the human induced changes are detected. Therefore, different spectral changes need different radiometric normalization methods. If the change detection involves the vegetation variation which needs to be preserved and enhanced, the impact from other features' changes should be eliminated as much as possible.
Eliminating the effects of the regularly nonlinear spectral differences caused by seasonal variation of natural objects requires a nonlinear radiometric normalization method. Histogram matching is a commonly used nonlinear method, but it only uses global statistical information. Thus, it is less flexible and so difficult to adapt to specific applications [4,25].
In this paper, we propose a novel method of nonlinear radiometric normalization to extract nonlinear relationship between multi-temporal image-pairs. The proposed method is based on the kernel version of canonical correlation analysis (kCCA) [1,[26][27][28][29][30][31], that consists of two steps. First, the image-pairs are quantitatively evaluated to extract the unchanged pixels whose Features are Nonlinear Invariant in kernel space (NIFs). Second, the nonlinear transformation of image-pairs is obtained from the nonlinear regression of the NIFs extracted previously. We will analyze the characteristics of NIFs distribution, nonlinear transformations and radiometric normalization results from different temporal image-pairs and compare the results with PIFs-based (CCA-based) method and histogram matching method. The PIFs-based method is derived from CCA of the spectral alignment of linear transformations, and the histogram matching is derived from statistical distribution models.
The paper is organized as follows: In Section 2, we describe the principle of the kernel version of CCA, as well as the selection of NIFs. Section 3 presents the test data and the experimental results, Section 4 is the discussion about the results. Section 5 we give the conclusion of the paper.

Relative Radiometric Normalization Based on kCCA Transformation
The proposed method contains two steps: the extraction of NIFs using kCCA and the use of NIFs to derive radiometric normalization. In this section, we review the kCCA transformation and how to extract NIFs based on it.

kCCA Transformation and NIFs Extraction
kCCA is the kernel version of CCA [7]. CCA is a multivariate feature extraction method that aims at finding the rotation of two sets of variables that maximizes their joint correlation [32]. In principle, CCA tends to select a group of representative indicators (linear combination of variables) from the two sets of random variables. These indicators express the correlation between the two sets of variables.
Remote Sens. 2018, 10, 432 3 of 21 kCCA introduces the theory of kernel functions into CCA, which maps data from a low-dimension to a high-dimensional feature space. Each kernel function converts an n-dimensional inner product in low-dimensional space into an m-dimensional inner product in high-dimensional space. Let x, z ∈ R n , the nonlinear function φ(x) converts R n to R m : x → φ(x) (n m), using the kernel function: <, > is the inner product, and k(x, z) is a kernel matrix [33,34]. It is not necessary to know the form and parameters of the nonlinear transformation function φ(x) in advance. The kCCA method maps the target image and the reference image into high dimensional space in which their linear combination can be obtained. The canonical variable in high dimensional is defined as follows, where x represents the target image matrix. Each row of x is a sample vector, and thus x has size n × N where n is the number of bands and N is the number of sample points of the image-pair. x can be rewritten as x n×N = (x 1 , x 2 · · · x N ) by column vectors x i . z represents the reference image matrix and similarly z can be rewritten as z n×N = (z 1 , z 2 · · · z N ). c and d are constant vectors of R m . U denotes the linear combination of x in the high-dimensional feature space; and V denotes the linear combination of z in the high-dimensional feature space. The workflow of kCCA is shown in Figure 1. Similar to the solving process of linear correlation analysis, the first step is to solve c and d so that the correlation coefficients of the combined variables U and V are maximized. principle, CCA tends to select a group of representative indicators (linear combination of variables) from the two sets of random variables. These indicators express the correlation between the two sets of variables. kCCA introduces the theory of kernel functions into CCA, which maps data from a lowdimension to a high-dimensional feature space. Each kernel function converts an n-dimensional inner product in low-dimensional space into an m-dimensional inner product in high-dimensional space. Let , ∈ , the nonlinear function ϕ( ) converts to : → ϕ( ) ( ≪ ), using the kernel function: <, > is the inner product, and ( , ) is a kernel matrix [33,34]. It is not necessary to know the form and parameters of the nonlinear transformation function ( ) in advance. The kCCA method maps the target image and the reference image into high dimensional space in which their linear combination can be obtained. The canonical variable in high dimensional is defined as follows, where represents the target image matrix. Each row of is a sample vector, and thus has size × where n is the number of bands and is the number of sample points of the image-pair. can be rewritten as × = ( , ⋯ ) by column vectors . represents the reference image matrix and similarly can be rewritten as × = ( , ⋯ ). and are constant vectors of . U denotes the linear combination of in the high-dimensional feature space; and V denotes the linear combination of in the high-dimensional feature space. The workflow of kCCA is shown in Figure 1. Similar to the solving process of linear correlation analysis, the first step is to solve and so that the correlation coefficients of the combined variables and are maximized.
where × = ( , ⋯ ) , × = ( , ⋯ ) . Then the problem of solving the highdimensional vectors and is simplified as a problem of solving the low-dimensional vectors and . By substituting into Equations (2) and (3), and can be rewritten as: According to Equation (1) of the kernel function and the description followed, the kernel matrix can be defined, To solve the vectors c and d in a lower dimensional space, c and d (of length m) are projected onto the vectors α and β (of length n) with a lower dimension.
where α n×N = (α 1 , α 2 · · · α N ), β n×N = (β 1 , β 2 · · · β N ). Then the problem of solving the highdimensional vectors c and d is simplified as a problem of solving the low-dimensional vectors α and β. By substituting into Equations (2) and (3), U and V can be rewritten as: According to Equation (1) of the kernel function and the description followed, the kernel matrix can be defined, Using the definition of kernel matrix in Equations (8) and (9), we obtain that the variance and covariance of U and V are as follows: Var( ) is the variance and Cov( ) is the covariance. The key step in the kCCA transformation is to solve the corresponding α and β when the correlation coefficient is maximized. According to the above formula Equations (10)-(12), we obtain, where ρ is the Pearson correlation coefficient.
The problem can be simplified as an optimization problem, max This optimization problem has the equality constraints that can be solved by the Lagrange multiplier, By taking derivatives with respect to α and β, we obtain, ε 1 and ε 2 are both regularization coefficients. Together with the constraints, Equations (16) and (17) implies that ρ α = ρ β and we define ρ = ρ α = ρ β . The above equation can be converted into the eigenvalue problem, By solving the above equation, we get ρ and the corresponding canonical vectors U and V in Equations (2) and (3).
The non-linear NIFs can be chosen according to Nielsen's [23,35] MAD method. The MAD method uses the components of U and V, i.e., U i and V i to define the variation, and we get, Remote Sens. 2018, 10, 432

of 21
The covariance matrix of MAD variables is a diagonal matrix: The diagonal element γ j is the variance of MAD j , ordered decreasingly γ 1 ≥ γ 2 ≥ . . . ≥ γ N , which gives γ i = 2(1 − ρ). Using the concept of contribution proposed by Nielsen, the contribution of MAD i is, The formula Equation (24) indicates that the variable carries information of the ratio of the variation. Since the accumulated contribution of the first k variables (∑ k i=1 ω i ) is called the accumulative contribution ratio, the last few variables in U i − V i contain the unchanged pixels between the image-pair, and we can select them as NIFs. If we denote, the points that satisfy the following probability conditions are selected as NIFs: where P χ 2 ,N () refers to a chi-square test of the N degrees of freedom. Pr(no change) is used to select NIFs in the kernel space. Pixels above the fixed threshold τ, i.e., Pr(no change) > τ, are NIFs. Normally, the threshold τ needs to be larger than 0.9 to masked out the water, cloud pixels, shadow points in the NIFs. The value of the threshold τ affects the number of the extracted NIFs, as discussed in detail in Section 3.4.2.

Fitting Non-Linear Transformation for Radiometric Normalization
Based on NIFs, the non-linear transformation can be derived by means of the least squares regression and the key is the selection of transformation. From Section 2.1, we know that If we set the subspace The available kernels include: In this paper, we choose polynomial kernel k poly with n = 3, γ = 1, c = 2 as an example, and we use a more general representation, where x is the pixel value of NIFs in the test data. We assume thatŷ x i k is the fitted value of the above equation. y is the corresponding pixel value of the reference image. The least square method is used to minimize y −ŷ x i k 2 to solve the coefficients {a 0 , a 1 , a 2 , a 3 } of the above polynomial Equation (32).
Once the coefficients are resolved, we use Equation (32) do the radiometric normalization.

Test Data
In this work, we use the subsets of 2025 × 2205 pixels of the Gaofen-1(GF-1) data with a spatial resolution of 16m as the test set. GF-1 was launched by China on 26 April 2013. It carries two cameras with 2 m resolution (panchromatic) and 8 m resolution (multi-spectral), and four multi-spectral cameras with 16 m resolution. The band setting of the GF-1 multi-spectral sensor is similar to that of other high-resolution and wide-spectrum satellite sensors, such as QuickBird, IKONOS, and GeoEye-1, with blue, green, red, and near-infrared bands. We refer the parameters of payload of GF-1 to Table 1. To investigate the nonlinear transformation effects derived from NIFs in high dimensional space, 6 GF-1 images obtained from 2013 to 2014 each of which contains all 4 seasons (see Table 2 for details) are tested, as shown in Figure 2. The geographical area of the images is in the northern region of Beijing, China. Most of the land is covered by vegetation, man-made objects, water. geometrically, and then GF-1 and the test image are used as the image-pair to extract NIFs and do radiometric normalization.

NIFs Distribution Map
The parameters of the kCCA-based method used to extract NIFs include the regularization parameter = 0.0001 and threshold of ( ℎ ) τ = 99%.
The distribution map of the extracted NIFs is indicated in Figure 3, where the NIFs are represented by green points. The total numbers of NIFs in image 0330, image 0403, image 0705, image 0717 and image 1214 are 31027, 14839, 124071, 11466 and 10078, respectively. Most of NIFs locate in the vegetation area. The proposed method can extract much more NIFs than that of CCA-based method, which only extracts 2432, 2382, 1014, 2074 and 1069 PIFs, respectively. Here the points with a normalized vegetation index (NDVI) larger than 0.2 are considered as vegetation targets. Figure 4 shows the ratio of NIFs in vegetation area to the total number of NIFs. The blue one in pie chart represents the percentage of NIFs in vegetation area. The ratio of NIFs in vegetation area are 55.1%, 88.5%, 83.4%, 92.1% and 78.4%, respectively.
According to the distribution map of NIFs, kCCA can select more NIFs than CCA, which indicates that the spectral variation of the corresponding points between the image-pair are mostly nonlinear. Moreover, since most NIFs are distributed in the vegetation area, it indicates that the spectral changes caused by vegetation growth over time are regular and predictable, and thus they can be detected by NIFs. The preprocessing of test data includes radiometric calibration, ortho-rectification, and image registration. Here radiometric calibration converts the DN value to the top of atmosphere reflectance (TOA). The GF-1 image acquired in 2 October 2013 is cloud free and has good visibility. There are visible seasonal differences between the image and the others. It is chosen as the reference in geometric space and radiometric space. Each of other test images is at first registered to it geometrically, and then GF-1 and the test image are used as the image-pair to extract NIFs and do radiometric normalization.

NIFs Distribution Map
The parameters of the kCCA-based method used to extract NIFs include the regularization parameter ε 1 = 0.0001 and threshold of Pr(nochange)τ = 99%.
The distribution map of the extracted NIFs is indicated in Figure 3, where the NIFs are represented by green points. The total numbers of NIFs in image 0330, image 0403, image 0705, image 0717 and image 1214 are 31027, 14839, 124071, 11466 and 10078, respectively. Most of NIFs locate in the vegetation area. The proposed method can extract much more NIFs than that of CCA-based method, which only extracts 2432, 2382, 1014, 2074 and 1069 PIFs, respectively. Here the points with a normalized vegetation index (NDVI) larger than 0.2 are considered as vegetation targets. Figure 4 shows the ratio of NIFs in vegetation area to the total number of NIFs. The blue one in pie chart represents the percentage of NIFs in vegetation area. The ratio of NIFs in vegetation area are 55.1%, 88.5%, 83.4%, 92.1% and 78.4%, respectively.
According to the distribution map of NIFs, kCCA can select more NIFs than CCA, which indicates that the spectral variation of the corresponding points between the image-pair are mostly nonlinear. Moreover, since most NIFs are distributed in the vegetation area, it indicates that the spectral changes caused by vegetation growth over time are regular and predictable, and thus they can be detected by NIFs.
Remote Sens. 2017, 9, x 8 of 22      Figure 5 is the scatter plot of the density of the NIFs in blue band of test data. The horizontal axis is the TOA value of the test data that linearly stretched from 0-1 to 0-255. The vertical axis is the corresponding value of the reference image. This density map apparently shows the non-linearity of NIFs between image-pairs. Table 3 shows the fitting coefficients and regression error of Equation (27). M r is the average of the residuals and S r is the standard deviation of the residuals. As indicated in Table 3, the absolute values of the cubic coefficients of image0330, image0403 and image1214 are significantly larger than those of image0705 and image0717. Figure 6 shows the transformation function between test image-pair. The red, green, purple, black, and cyan-blue curve denotes the transformation function between the image-pair (image0330, image1002), (image0403, image1002), (image0705, image1002), (image0717, image1002), and (image1214, image1002), respectively. (a-d) are the results fitted by NIFs, (e-h) are results fitted by PIFs. From (a-h), we find that the nonlinear relationship fitted by NIFs between image-pairs increases with the time span. Similarly, the absolute value of the cubic coefficient in Table 3 is relatively larger and the slope of the linear relationship fitted by PIFs also increase with the time span.  Table 3 shows the fitting coefficients and regression error of Equation (27). Mr is the average of the residuals and Sr is the standard deviation of the residuals. As indicated in Table 3, the absolute values of the cubic coefficients of image0330, image0403 and image1214 are significantly larger than those of image0705 and image0717. Table 3. The coefficients and error ratios of regression equation a0, a1, a2 and a3 represent the coefficients of the constant term, the linear term, the quadratic term, and the cubic term, respectively. Mr is the average of the difference between the function value and the expected value, and Sr is the standard deviation.   Table 3. The coefficients and error ratios of regression equation a 0, a 1 , a 2 and a 3 represent the coefficients of the constant term, the linear term, the quadratic term, and the cubic term, respectively. M r is the average of the difference between the function value and the expected value, and S r is the standard deviation. black, and cyan-blue curve denotes the transformation function between the image-pair (image0330, image1002), (image0403, image1002), (image0705, image1002), (image0717, image1002), and (image1214, image1002), respectively. (a-d) are the results fitted by NIFs, (e-h) are results fitted by PIFs. From (a-h), we find that the nonlinear relationship fitted by NIFs between image-pairs increases with the time span. Similarly, the absolute value of the cubic coefficient in Table3 is relatively larger and the slope of the linear relationship fitted by PIFs also increase with the time span. -is the fitted function of image0330 and the reference, -is that of image0403 and the reference, -is that of image0705 and the reference, -is that of image0717 and the reference, -is that of image1214 and the reference.  -is the fitted function of image0330 and the reference, -is that of image0403 and the reference, -is that of image0705 and the reference, -is that of image0717 and the reference, -is that of image1214 and the reference.

Radiometric Normalization Results
According to the transformation given in Table 3, the image can be radiometrically normalized. Figure 8 shows the radiation normalization results of the CCA, kCCA and the histogram matching. Each image in Figure 8 is a mosaic image of the reference image (right) and the normalization result (left). Figure 8a shows the original mosaic images and the reference, in which the spectrums of the image-pairs are visually different and in particular, the spectral difference of vegetation increases with the time span between the image-pair. The normalization results of the PIFs-based method (Figure 8b) show that the spectral differences still exist, especially in the vegetation area. The normalization results of the kCCA-based method (Figure 8c) show the spectral differences are significantly reduced, even though the vegetation in spring of image0330 and in winter of image1214 shows similar spectral characteristics with the vegetation in early autumn of image1002. The results of the histogram matching (Figure 8d) show that spectral differences are well reduced.

Clouds Pixels in the Image
Towards a quantitative comparison of the proposed radiometric normalization with the CCA-based method and with the histogram matching method, we first discuss the effects of cloud pixels and threshold τ for NIFs extraction and radiometric normalization transformation. The effect caused by cloud pixels is an unavoidable problem that must be considered in remote sensing image radiometric normalization. The usual approach is to mask out the cloud pixels from an image-pair. We use the automatic threshold method to do cloud detection [36]. In fact, if cloud pixels are falsely selected as NIFs, bias will be induced into transformation. The image0330 with clouds is randomly selected as a test image. Following the NIFs extraction steps in Section 3.2, if the cloud pixels have not been masked out, some cloud pixels are selected as NIFs, as shown in the distribution map of NIFs (Figure 3a).

Clouds Pixels in the Image
Towards a quantitative comparison of the proposed radiometric normalization with the CCAbased method and with the histogram matching method, we first discuss the effects of cloud pixels and threshold τ for NIFs extraction and radiometric normalization transformation. The effect caused    -pair (image0330, image1002). The red and green curves denote the fitted curve from NIFs with and without the cloud pixels, respectively. According to the histogram, we notice that the TOA values of NIFs are mostly distributed between 0.1 and 0.4. In this interval, the fitting values of the NIFs without cloud pixels in Figure 9 are obviously larger than that of the NIFs with cloud pixels, while out of this interval, the fitting values of the NIFs without cloud pixels is smaller. The resulting normalized image of NIFs with cloud fitting is darker than that without cloud fitting.
We use the automatic threshold method to do cloud detection [36]. In fact, if cloud pixels are falsely selected as NIFs, bias will be induced into transformation. The image0330 with clouds is randomly selected as a test image. Following the NIFs extraction steps in Section 3.2, if the cloud pixels have not been masked out, some cloud pixels are selected as NIFs, as shown in the distribution map of NIFs (Figure 3a). Figure 9a,b show the radiometric normalization results of image0330, showing NIFs with and without cloud pixels. Visually, the brightness of the result including cloud pixels is relatively lower. This can be explained by the normalization transformation derived from NIFs. Figure 9c-f show the histogram of image0330, where we can see that the pixel values are mainly concentrated in [0.1, 0.4]. Figure 9g-j are the fitting curves of the four bands of the image-pair (image0330, image1002). The red and green curves denote the fitted curve from NIFs with and without the cloud pixels, respectively. According to the histogram, we notice that the TOA values of NIFs are mostly distributed between 0.1 and 0.4. In this interval, the fitting values of the NIFs without cloud pixels in Figure 9 are obviously larger than that of the NIFs with cloud pixels, while out of this interval, the fitting values of the NIFs without cloud pixels is smaller. The resulting normalized image of NIFs with cloud fitting is darker than that without cloud fitting.

The Threshold Parameters τ for NIFs Extraction
We use the formula Pr(nochange) = 1 − P χ 2 ,N (Z) > τ to extract the NIFs. The threshold τ directly determines the number of NIFs selected. The number of NIFs increases with the value of τ. In Section 3, τ = 0.99 is used. Here we set the threshold to 0.95, 0.90 and 0.85 respectively to investigate the changes of NIFs distribution map and the normalization function. Table 4 is the number of NIFs with different τ of the image pair (image0717, image1002). It shows that the total number of NIFs and the number of NIFs in the vegetation area increase as the τ decreases, and the ratio change of NIFs in vegetation area changes is <0.5%, which is relatively stable. Figure 10a Figure 10h-k are the fitting curves of the four bands using the above NIFs. It shows that the fitting curves for different thresholds τ are very similar in the corresponding interval range of the TOA values. We use the formula ( ℎ ) = 1 − , ( ) > τ to extract the NIFs. The threshold τ directly determines the number of NIFs selected. The number of NIFs increases with the value of τ. In Section 3, τ = 0.99 is used. Here we set the threshold to 0.95, 0.90 and 0.85 respectively to investigate the changes of NIFs distribution map and the normalization function. Table 4 is the number of NIFs with different τ of the image pair (image0717, image1002). It shows that the total number of NIFs and the number of NIFs in the vegetation area increase as the τ decreases, and the ratio change of NIFs in vegetation area changes is <0.5%, which is relatively stable. Figure 10a-c show the distribution maps of NIFs with different threshold τ. Figure 10d-g show the histogram of image0717. We can see that the pixel values of the red, green, and blue bands are mainly concentrated in [0.1, 0.4], and the pixel values of near infrared band are distributed between 0.1 and 0.8. Figure 10h-k are the fitting curves of the four bands using the above NIFs. It shows that the fitting curves for different thresholds τ are very similar in the corresponding interval range of the TOA values.

Quantitative Comparison of the Radiometric Normalization Results
We use the root mean square error (RMSE), Pearson correlation coefficient and histogram similarity to evaluate the similarity of the radiometric normalization results of proposed method with the CCA-based method and the histogram matching method.
RMSE is used to evaluate the similarity between the normalized image and the reference image. RMSE is defined as follows, where X i is the pixel value of the normalized data, Y i is the pixel value of the reference image, and |scene| is the total number of pixels. The smaller the value of RMSE is, the higher the similarity of the two images is. Pearson correlation coefficient is used to measure the linear correlation between the normalized image and the reference image. The Pearson correlation coefficient is defined as follows, where X i denotes pixel value of the normalized data, X denotes the average of the normalized data, Y i denotes the corresponding pixel value of the reference data, and Y denotes the average of the reference data. Histogram similarity measured by the histogram correlation which is used to calculate the linear correlation between the two histograms of the normalized image and the reference image. The histogram correlation is defined as follows, where H1 is the histogram of the normalized data, H1 is the mean of H1, H2 is the histogram of the reference data, H2 is the mean of H2. The similarity positively correlates to the histogram correlation. Tables 5 and 6 are the RMSE, Pearson correlation coefficient and histogram similarity between each multi-temporal image and the reference image before and after radiometric normalization. Here RMSE CCA is RMSE of the radiometric normalization results from the CCA-based method, RMSE kCCA is RMSE of the results from the kCCA-based transformation, and RMSE H is RMSE of the results from the histogram matching method. ρ CCA , ρ kCCA and ρ H are Pearson correlation coefficients of the radiometric normalization results from the CCA-based, kCCA-based and the histogram matching methods, respectively. ρ CCA h , ρ kCCA h and ρ H h are histogram correlation of the radiometric normalization results from the CCA-based, kCCA-based and the histogram matching methods, respectively.

Discussion
The paper focused on the radiometric normalization of satellite image sequence analysis, especially, the elimination of the spectral difference among the images caused by different image acquisition conditions and the effects of the regularly nonlinear spectral differences caused by seasonal variation of natural objects. We proposed a novel method, i.e., the kCCA, to select nonlinear radiation control points-NIFs and to derive the nonlinear transformation of image-pairs using the least squares regression method.
The total number of the selected NIFs were larger than that of the linear radiometric normalization method (CCA). This indicates that the spectral variation of the corresponding points between each image-pair is mostly nonlinear. Moreover, since most NIFs were distributed in the vegetation area, it indicates that the spectral changes caused by vegetation growth over time are regular and predictable, and thus they can be detected by NIFs. Figure 9 showed us that the effect of cloud pixels was a problem that must be considered in the extraction of NIFs and that the removal of cloud pixels can improve the accuracy of radiometric normalization. The chi-squared percentile τ directly determined the number of NIFs. Table 4 showed that the total number of NIFs and the number of NIFs in the vegetation area increased as the τ decreased. However, the ratio of NIFs in vegetation area was relatively stable and the fitting curves for different thresholds τ were similar. Therefore, we should appropriately choose the value of τ so that we can get enough invariant points to fit the nonlinear relationship. If the value of τ is too small, the numbers of NIFs will be quite large, which leads to higher computation afford.
By using of the least squares regression, we have derived relationships between image-pairs from NIFs. The scatter plot map of the NIFs density ( Figure 5) showed us that the relationships were nonlinear. Table 3 and Figure 6 showed that the nonlinear relationship positively correlated with the time span between image-pairs. Tables 5 and 6 showed that the kCCA-based nonlinear radiometric normalized images had the smallest RMSE and the largest correlation coefficient with the reference image. As the time span increased, the RMSE increased and the correlation coefficient decreased, and so was that of the normalized images based on the CCA method. Radiometric normalized images of the histogram matching method had the largest RMSE and the smallest correlation coefficient with the reference image. It means the kCCA-based normalization can preserve more similarity and better correlation between image-pairs than the CCA algorithm and histogram matching. Histogram similarity is an index that measures the color difference between an image-pair. The histograms of the radiometric normalized images using the histogram matching had the highest histogram similarity with the reference image, those of kCCA transformation had the second highest similarity, while the CCA-based linear radiometric normalized results had the lowest similarity. However, the histogram matching uses global statistical information for normalization which eliminates all the spectral difference between an image-pair, including the difference must be preserved. Compared with the CCA-based linear normalization, the proposed method can better avoid the color error propagation.
In general, the kCCA-based normalization can either preserve more similarity and correlation between each image-pair or effectively avoid the color error propagation. The proposed method not only builds the common scale or reference to make the radiometric consistency among GF-1 image sequences, but also highlights the interesting spectral changes while eliminates less interesting spectral changes. Our method enables the application of GF-1 data for change detection, land-use, land-cover change detection etc.

Conclusions
In this paper, we presented a novel method to perform nonlinear relative spectral alignment and radiometric normalization for high-resolution wide-spectrum multi-temporal satellite images. The method used the kernel canonical correlation analysis transformation (kCCA) to extract NIFs, a set of samples belonging to unchanged area in the kernel space, and to align the relative spectrum of two images using the NIFs-fitted nonlinear relationship. The proposed approach does not need full supervision. The presented approach may be used as a preprocessing technique for change detection of multi-temporal images, especially when the vegetation spectral changes of multi-temporal images is a disturbance of change detection. Furthermore, we qualitatively and quantitatively compared the radiometric normalization results of the proposed method with CCA-based PIFs method and the histogram matching method derived from statistical distribution. The comparison indicated that the radiometric normalization result of the proposed method had the best linear correlation and least bias with the reference, whereas the radiometric normalization of the histogram matching had the best histogram similarity with the reference image. Finally, in this paper we dealt only with pixel-wise approaches for both the relative spectral alignment and radiometric normalization using the polynomial kernel canonical correlation analysis transformation. In future, we will address the use of different kernel function space, mixed kernel function space and the spatial context to improve both the alignment and the radiometric normalization. This would open many applications of very high-resolution imagery, because high-resolution multi-temporal image-pairs have more complex non-linear relationship and the semantic classes cannot be modeled pixel-wise.