Color Constancy Based on Local Reflectance Differences

Abstract: Color constancy is used to determine the actual surface color of a scene affected by illumination, so that the captured image is more in line with the characteristics of human perception. The well-known Gray-Edge hypothesis states that the average edge difference in a scene is achromatic. Inspired by the Gray-Edge hypothesis, we propose a new illumination estimation method. Specifically, after analyzing three public datasets containing rich illumination conditions and scenes, we found that the ratio of the global sum of reflectance differences to the global sum of locally normalized reflectance differences is achromatic. Based on this hypothesis, we propose an accurate color constancy method. The method was tested on four datasets containing various illumination conditions (three captured under a single light source and one under multiple light sources). The results show that the proposed method outperforms state-of-the-art color constancy methods. Furthermore, we propose a new framework that incorporates current mainstream statistics-based color constancy methods (Gray-World, Max-RGB, Gray-Edge, etc.).


Introduction
Color constancy ensures that the perceived color of an object remains relatively stable under different illumination conditions; it is a characteristic of the human color perception system [1][2][3]. For example, whether a piece of white paper lies in outdoor sunlight or in dim indoor candlelight, we can always restore its original white color in our minds. With the development of optics and material technology [4][5][6], digital cameras are becoming ever more widely used. In the digital world, color constancy plays a vital role in areas such as object recognition and tracking, scene analysis, and image-based localization. For example, in the field of autonomous driving, color constancy algorithms ensure that objects in a scene captured under different illumination conditions have the same appearance, thereby improving the robustness of target (pedestrian, vehicle, etc.) recognition and tracking.
Prevailing color constancy methods fall into two main categories: learning-based methods and statistics-based methods. Learning-based color constancy methods can be further divided into (1) gamut mapping color constancy and (2) methods that learn a color constancy model from training datasets. The gamut mapping [7,8] approach is based on the observation that, under a given illuminant, humans can only observe a limited set of colors in a natural image. The canonical gamut is the set of all RGB values observable under a canonical light source (typically a white light source); in RGB space, this canonical gamut has been proved to be a convex hull [7]. The approach computes the transformation that maps the recorded color gamut into the canonical gamut, from which the color of the light source can be determined. Barnard et al. [8] demonstrated that gamut mapping outperforms Gray-World, which assumes that the average reflectance of a natural scene is a constant value close to "grey". Finlayson improved the gamut mapping algorithm by restricting the transformations to chromaticity space, so that only feasible illuminants are considered [9]. This improved algorithm, called GCIE, can be regarded as a robust improvement that removes the limitation of the diagonal model of illumination variation. The illumination estimation accuracy of gamut mapping methods is highly dependent on their underlying assumptions; once these assumptions do not hold in the actual application scenario, the color constancy performance degrades severely.
Learning-based color constancy methods obtain an illumination estimation model through iterative learning on large training datasets. They usually first extract intrinsic properties of natural images as features (e.g., edges, histograms, chromaticity of the brightest colors, and semantic information), and then model the complex relationship between features and illumination. Color Cat [10] uses a linear regression relationship between illumination and histograms to estimate the illuminant. Corrected Moments [11] also shows that color moments used as features provide satisfactory illumination estimation performance with least-squares training. Learning-based models [12,13], Bayesian color constancy [14,15], exemplar-based methods [16], biologically inspired models [17,18], high-level information-based methods [19,20], and physics-based models [21,22] are commonly used examples of learning-based color constancy. In recent years, with the rapid development of deep neural networks, the performance of color constancy methods based on convolutional neural network models has continuously improved [23]. However, their practical application is limited by the large number of parameters and redundant features.
Statistics-based methods pay greater attention to the correlation between illumination and surface reflectance. Buchsbaum [24] introduced the Gray-World hypothesis, which assumes that the average reflectance in a scene under a neutral light source is achromatic. A color constancy method based on the Gray-World assumption computes the mean value of the three RGB channels and eliminates the influence of ambient light as far as possible. It works well when the color components of the image are relatively uniform; however, once the color distribution of the image is uneven, its performance drops sharply. White-Patch [25] assumes that perfect reflections lead to a maximum response in the RGB channels, i.e., it takes the maximum value of each RGB channel as the value of white. However, White-Patch-based color constancy methods fail when the scene is flooded by a large area of monochromatic color. Gray-Edge is another popular color constancy method [26,27], which assumes that the average reflectance difference in a scene is achromatic. The low-level statistical methods listed above can all be merged into a single framework:

$$\left( \int_\Omega \left| \frac{\partial^n f_c^{\sigma}(x)}{\partial x^n} \right|^p dx \right)^{1/p} = k\, e_c \tag{1}$$

where f_c(x) is the image captured by the camera, f_c^σ(x) = f_c(x) ⊗ G_σ, and G_σ denotes a Gaussian filter with standard deviation σ. k is a scaling coefficient that varies according to the observed scene and lies between 0 and 1. Color constancy based on this equation states that the p-th Minkowski norm of the n-th order derivative of the scene is achromatic; e_c is the estimated illuminant.
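As an illustration (not code from the paper), the framework of Equation (1) can be sketched as follows; the function name, parameter defaults, and use of NumPy/SciPy are our own assumptions. Setting n = 0 with p = 1 recovers Gray-World, n = 0 with a large p approaches White-Patch, and n = 1 or n = 2 gives the Gray-Edge family:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def minkowski_framework_estimate(img, n=1, p=5, sigma=2.0):
    """Estimate the illuminant with the Gray-Edge family framework:
    the p-th Minkowski norm of the n-th order derivative per channel.
    img: float array of shape (H, W, 3). Returns a unit-norm RGB estimate."""
    e = np.zeros(3)
    for c in range(3):
        ch = gaussian_filter(img[..., c], sigma)   # smooth with G_sigma
        if n == 0:
            d = np.abs(ch)                         # Gray-World / White-Patch family
        else:
            gy, gx = np.gradient(ch)               # first-order derivatives
            for _ in range(n - 1):                 # higher orders, per axis
                gy = np.gradient(gy, axis=0)
                gx = np.gradient(gx, axis=1)
            d = np.sqrt(gx**2 + gy**2)             # derivative magnitude
        e[c] = (d**p).mean() ** (1.0 / p)          # Minkowski p-norm
    return e / np.linalg.norm(e)                   # only chromaticity matters
```

The final normalization discards the unknown scale k, since only the direction of e_c is needed for correction.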
In this paper, we propose a novel low-level statistics-based method as a further extension of the framework described in Equation (1). Inspired by locally normalized reflectance estimation [28] and the Gray-Edge hypothesis, to obtain the locally normalized reflectance differences we partition the reflectance difference image into J non-overlapping patches of equal size and divide each reflectance difference by the local maximum inside its patch. For a given color-biased image f_c(x), the proposed algorithm first determines the light source estimate e_c; color constancy is then achieved by transforming f_c(x), according to the illumination estimate e_c, into an image taken under a canonical light source.
The main contributions of the paper can be summarized as follows: (1) The relationship between the global sum of reflectance differences and the global sum of locally normalized reflectance differences is exploited. After analyzing the statistics of three datasets containing different lighting conditions and scenes, we found that the ratio of the global sum of reflectance differences to the global sum of locally normalized reflectance differences is achromatic. Based on this finding, we propose a more accurate color constancy method for recovering the true color of the scene.
(2) We propose a new framework that incorporates color constancy methods such as Gray-World, Max-RGB, and Gray-Edge. We also show that Gray-World, White-Patch, Gray-Edge, and Local-Surface [29] can all be expressed as special cases of the proposed framework.
(3) Experiments demonstrate the feasibility and effectiveness of the proposed method in scenarios with single or multiple illuminants. In particular, the experimental results on the HDR test set show that the proposed color constancy method is superior to the comparison algorithms, indicating that it can restore the actual colors of different scenes more accurately. We also incorporated a clustering algorithm to improve the results under multiple illuminants.
The rest of the paper is organized as follows: Section 2 presents the proposed algorithm in detail, Section 3 tests the performance of the proposed algorithm on four commonly used datasets, and finally, Section 4 summarizes and further discusses future research work.

Proposed Method
Assuming the scene is illuminated uniformly by a single light source I(λ), such as outdoor lighting, the image f_c(x) captured by the camera pipeline model can be represented in the following form:

$$f_c(x) = \int_\omega I(\lambda)\, R(\lambda, x)\, S_c(\lambda)\, d\lambda \tag{2}$$

where c ∈ {R, G, B} indexes the color channels of the camera sensor, x is the spatial coordinate, and λ is the wavelength of the light. R(λ, x) denotes the surface reflectance, ω is the visible spectrum, and S_c(λ) ∈ {S_R(λ), S_G(λ), S_B(λ)} is the camera sensitivity. Under the diagonal transform assumption [7], the observed light source color e_c ∈ {e_R, e_G, e_B} can be calculated as:

$$e_c = \int_\omega I(\lambda)\, S_c(\lambda)\, d\lambda \tag{3}$$

Figure 1 is a flowchart of the proposed color constancy approach, the details of which are described in the following sections.
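Under the diagonal transform assumption, correcting a color-biased image is a per-channel scaling by the estimated illuminant. A minimal sketch, assuming float RGB images in [0, 1]; the function name and the brightness-preserving scaling convention are our own illustrative choices:

```python
import numpy as np

def apply_von_kries(img, e):
    """Correct a color-biased image under the diagonal (von Kries) model:
    divide each channel by the estimated illuminant, scaled so that a
    neutral illuminant maps to (1, 1, 1) and overall brightness is kept."""
    e = np.asarray(e, dtype=float)
    e = e / e.sum() * 3.0                 # neutral illuminant -> (1, 1, 1)
    return np.clip(img / e, 0.0, 1.0)     # channel-wise diagonal correction
```

For example, a uniformly gray surface imaged under a warm illuminant comes back neutral after division by that same illuminant.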

Local Normalized Surface Reflectance Differences
This section explains the meaning of local normalized surface reflectance differences. Following [26], we can calculate the differences image f_{c,X}(x) by the formula:

$$f_{c,X}(x) = \left| \frac{\partial f_c(x)}{\partial x} \right| \tag{4}$$

The entire differences image f_{c,X}(x) is divided into J equal-sized non-overlapping patches. Let

$$L_{c,X,j}(x) = \frac{f_{c,X,j}(x)}{f_{c,X,j}(x_{j,max})} \tag{5}$$

where x_{j,max} denotes the spatial location of the pixel with the maximum intensity in the j-th local region; f_{c,X,j}(x) is the edge intensity of the pixel at position x, which is normalized by the maximum edge value in the j-th local image patch.
Under the diagonal model of Equation (2), the reflectance differences in a scene can be represented as

$$R^X_{c,j}(x) = \frac{f_{c,X,j}(x)}{e_c} \tag{6}$$

so the local normalized reflectance differences L_{c,X,j}(x) can equivalently be represented as:

$$L_{c,X,j}(x) = \frac{R^X_{c,j}(x)}{R^X_{c,j}(x_{j,max})} \tag{7}$$

i.e., the illuminant cancels in the local normalization.
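The patch-wise normalization described above can be sketched as follows (an illustrative implementation with our own function name and patch handling; border patches smaller than the nominal size are simply normalized by their own maxima):

```python
import numpy as np

def local_normalized_differences(diff, patch=8):
    """Divide a gradient-magnitude image `diff` of shape (H, W) into
    non-overlapping patch x patch regions and normalize each region by its
    maximum value, giving the locally normalized reflectance differences."""
    H, W = diff.shape
    out = np.zeros_like(diff, dtype=float)
    for y in range(0, H, patch):
        for x in range(0, W, patch):
            block = diff[y:y+patch, x:x+patch]
            m = block.max()
            # all-zero (edge-free) patches stay zero
            out[y:y+patch, x:x+patch] = block / m if m > 0 else 0.0
    return out
```

After normalization every non-empty patch has a maximum of exactly 1, which is what makes the per-patch illuminant scale cancel.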

Hypothesis Validation
In this section, the validation process of the hypothesis in this paper is presented, and we show that the illuminant estimate e_c can be accurately obtained by dividing the total of edges ∫_Ω f_{c,X}(x) dx by the sum of local normalized reflectance differences ∫_1^J ∫_ξ L_{c,X,j}(x) dx dj, where Ω denotes the overall image area, J is the total number of local regions inside the image, and ξ is the space of the j-th local area.
Let R̂_c represent the ratio of the total of edges ∫_Ω f_{c,X}(x) dx to the sum of local normalized reflectance differences ∫_1^J ∫_ξ L_{c,X,j}(x) dx dj:

$$\hat{R}_c = \frac{\int_\Omega f_{c,X}(x)\, dx}{\int_1^J \int_\xi L_{c,X,j}(x)\, dx\, dj} \tag{8}$$

where

$$RD_{c,s} = \int_\Omega R_{c,X}(x)\, dx, \qquad RD_{c,sln} = \int_1^J \int_\xi L_{c,X,j}(x)\, dx\, dj \tag{9}$$

In order to exploit the relationship between RD_{c,s} and RD_{c,sln}, we used the reprocessed Gehler-Shi dataset [30] containing 568 images, the NUS dataset [31] of 1737 images taken with eight cameras (each image containing a color checker), and the SFU HDR dataset [33], which contains 105 high dynamic range images of indoor and outdoor areas. These datasets provide the illuminant of each raw image, so the non-color-biased counterpart of each color-biased image can be obtained, and the values of RD_{c,s} and RD_{c,sln} can then be computed.
As we can see from Figure 2, it is obvious that most of the scattered points are distributed along the diagonal line. Therefore, we can get the formula:

$$RD_{c,s} = k\, RD_{c,sln} \tag{10}$$

where k is a constant. Since f_{c,X}(x) = e_c R_{c,X}(x) under the diagonal model, combining Equations (8)-(10) gives R̂_c = k e_c, so for uniform illumination the light source color can be computed by:

$$e_c = \frac{\hat{R}_c}{k} \tag{11}$$

Based on the above inference, given a color-biased image f_c(x) as input, R̂_c can be determined by Equation (8). k is a scaling coefficient that varies according to the observed scene; given that k is the same for all color channels c ∈ {R, G, B}, by Equations (11) and (3) we do not have to find the actual value of k, because it cancels when the normalized form of e_c is used as the final illumination estimate:

$$\hat{e}_c = \frac{e_c}{\sqrt{e_R^2 + e_G^2 + e_B^2}} \tag{12}$$
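Putting the pieces together, the hypothesis yields a direct estimator: per channel, sum the reflectance differences, sum their locally normalized counterparts, take the ratio, and normalize away the unknown k. A sketch under our own naming and patch conventions, taking a precomputed gradient-magnitude image as input:

```python
import numpy as np

def estimate_illuminant_ratio(diff_rgb, patch=8):
    """Estimate the illuminant per channel as (global sum of reflectance
    differences) / (global sum of locally normalized differences).
    diff_rgb: gradient-magnitude image of shape (H, W, 3). The unknown
    scale k cancels when the result is normalized to unit length."""
    H, W, _ = diff_rgb.shape
    e = np.zeros(3)
    for c in range(3):
        d = diff_rgb[..., c]
        num = d.sum()                         # global sum of differences
        den = 0.0                             # sum of locally normalized differences
        for y in range(0, H, patch):
            for x in range(0, W, patch):
                block = d[y:y+patch, x:x+patch]
                m = block.max()
                if m > 0:
                    den += (block / m).sum()
        e[c] = num / den if den > 0 else 0.0
    return e / np.linalg.norm(e)
```

Because the local normalization is invariant to a per-channel scale, feeding in a gradient image whose channels are scaled copies of one another recovers exactly that scaling as the illuminant direction.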

Expanded into a Unified Framework
In this section, we show that several important statistics-based color constancy algorithms (Gray-World, Max-RGB, Gray-Edge, etc.) can be incorporated into the proposed framework.
Like the earlier framework of Equation (1), our proposed method can be generalized with the Minkowski norm:

$$e_c^{n,p,\sigma} = \frac{1}{k} \left( \frac{\int_\Omega \left( f^{n,\sigma}_{c,X}(x) \right)^p dx}{\int_1^J \int_\xi \left( L^{n,\sigma}_{c,X,j}(x) \right)^p dx\, dj} \right)^{1/p} \tag{13}$$

where

$$f^{n,\sigma}_{c,X}(x) = \left| \frac{\partial^n f_c^{\sigma}(x)}{\partial x^n} \right| \tag{14}$$

$$L^{n,\sigma}_{c,X,j}(x) = \frac{f^{n,\sigma}_{c,X}(x)}{f^{n,\sigma}_{c,X}(x_{j,max})} \tag{15}$$

f^{n,σ}_{c,X}(x) denotes the spatial derivative of order n. The Gaussian filter G_σ with standard deviation σ is introduced in order to exploit local correlation. The Minkowski norm p determines the relative weights of the multiple measures used to estimate the final illuminant color.
Furthermore, Table 1 shows that the Gray-World, White-Patch, Gray-Edge, and Local-Surface methods are all extreme cases of Equation (13). For example, the 2nd-order Local-Edge color constancy method corresponds to a spatial derivative order of 2, where J is the total number of local regions inside the image, p is the Minkowski norm, and σ is the standard deviation of the Gaussian filter. Similarly, the remaining color constancy methods in Table 1 can also be incorporated into the proposed unified framework.

Table 1. Symbols and equations of the special cases of Equation (13): Grey-World, White-Patch, Grey-Edge, and Local-Surface.

Experimental Results
The previous section provided a generic formulation of illuminant estimation using low-level image features. In this section, the proposed approach is evaluated on four benchmark datasets: one indoor light source dataset (the SFU indoor dataset) [32], two real-world datasets (the Gehler-Shi dataset and the SFU Grey-Ball dataset) [13,30], and one HDR light source dataset [33]. The ground-truth light source color of each scene is provided for all four datasets.
The angular error is used as the color constancy error metric [34]:

$$\varepsilon = \cos^{-1}\left( \frac{e_l \cdot e_e}{\|e_l\|\, \|e_e\|} \right) \tag{16}$$

where e_l denotes the actual light source and e_e the estimated light source. Smaller angular errors indicate more accurate color constancy results. In order to evaluate the proposed algorithm more objectively, five metrics are used to measure the accuracy of color constancy: the median, mean, trimean, best-25%, and worst-25% angular error.
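The angular error between ground-truth and estimated illuminants can be computed directly from the dot product of the two vectors (an illustrative helper with our own naming; the clip guards against rounding just outside [-1, 1]):

```python
import numpy as np

def angular_error_deg(e_true, e_est):
    """Angular error in degrees between the ground-truth and estimated
    illuminant vectors; the standard color constancy error metric."""
    e_true = np.asarray(e_true, dtype=float)
    e_est = np.asarray(e_est, dtype=float)
    cos_a = np.dot(e_true, e_est) / (np.linalg.norm(e_true) * np.linalg.norm(e_est))
    return float(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))
```

Note that the metric is scale-invariant: an estimate that is correct up to a global brightness factor scores zero error.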

Parameter Setting and Analysis
As stated in Section 2, the proposed method requires selecting suitable parameter values. Our model has a total of four variables; because of computational limitations, only 1st-order and 2nd-order derivatives are considered, leaving three free variables: the Minkowski norm p, the scale σ, and the number of local regions J. We empirically traverse the indoor image dataset to determine the optimal parameters; the final selected parameters are shown in Table 2. As shown in Table 2, both the Minkowski norm p and the number of local regions J adopt the same setting in the 1st-order and 2nd-order Local-Edge methods. We set σ = 4 and σ = 6 in the 1st-order and 2nd-order Local-Edge color constancy methods, respectively.
Table 2. Parameter settings of the proposed 1st-order and 2nd-order Local-Edge color constancy methods. Note that the parameters are empirical.

Indoor Dataset
The angular error results of various models on the SFU indoor dataset are presented in Table 3. The SFU indoor dataset [32] contains a total of 321 linear images captured in the laboratory under 11 different illumination conditions. It can be seen from Table 3 that our model performs well compared to the other models on a variety of measures. Specifically, our proposed 2nd-order Local-Edge color constancy method achieves the smallest angular error on the median, mean, trimean, and best-25% metrics, and the proposed 1st-order Local-Edge method achieves better results than the comparison algorithms on the worst-25% metric. Table 3 shows that the proposed color constancy method achieves more accurate illuminant color estimates than the comparison algorithms in indoor lighting environments.

Real-World Dataset
The Gehler-Shi dataset includes 568 linear natural photos [15,30], all captured in RAW format with DSLR cameras and with no color correction. As in many prior studies, the 24-patch color checkerboard present in every image of the dataset was masked out for illuminant estimation.
Our approaches were then tested on the SFU Grey-Ball dataset [13], which comprises 11,346 non-linear photos. This dataset has been processed in camera in a complicated way, making it impossible to derive an exact illuminant estimate. Before the experiment, we masked out the grey ball in each photo for unbiased evaluation. Table 4 shows the results on the color checker dataset, and Table 5 lists the results on the SFU Grey-Ball dataset. In general, among all models, our method shows the best color constancy accuracy. As can be seen from Table 4, our proposed 2nd-order Local-Edge method achieves the lowest values on all five metrics (median, mean, trimean, best-25%, and worst-25%); lower values on these metrics indicate more accurate color constancy. Therefore, on the Gehler-Shi test set, our 2nd-order Local-Edge method is superior to the Grey-World, White-Patch, Shades of Grey, Grey-Edge, Local Surface Reflectance, Pixel-based Gamut, Edge-based Gamut, SVR Regression, Bayesian, Exemplar-based, and NIS color constancy methods. Table 5 shows that on the SFU Grey-Ball dataset our proposed 2nd-order Local-Edge method achieves optimal results on the mean, best-25%, and worst-25% metrics, and is only 0.36 higher than the best-performing Exemplar-based method on the median metric. Figure 3 shows results on sample photos from the color checker dataset. It can be seen from Figure 3 that the proposed color constancy method restores the real colors of the scenes in the Gehler-Shi test set well. For the scenes shown in the first and third rows of Figure 3, the White-Patch, Shades of Grey, and Grey-Edge methods can hardly restore the actual colors of the scene, and their results show no obvious improvement over the original input. Grey-World achieves better color constancy results than the above methods in all scenes, but still exhibits a certain degree of color cast.
Our proposed method achieves the smallest angular error on all test scenarios.

SFU HDR Dataset
Our approach was then tested on the SFU HDR dataset [33], which includes 105 high-quality images captured under indoor and outdoor light sources.
The performance statistics of several methods on the SFU HDR dataset are shown in Table 6. Compared to the other models on this dataset, our model performs well on a variety of metrics. It can be seen from Table 6 that our proposed 2nd-order Local-Edge method achieves the best results on the median, mean, and worst-25% metrics, and the proposed 1st-order Local-Edge algorithm matches the 2nd-order Local-Edge on the median metric. The proposed methods are significantly better than the second-ranked Grey-Edge algorithm. Table 6 shows that the proposed color constancy method can also accurately restore the true colors of HDR scenes with rich indoor and outdoor light sources.

Conclusions
In this paper, inspired by the Gray-Edge hypothesis, our statistical experiments on the three public datasets NUS, Gehler-Shi, and SFU show that the ratio of the global sum of reflectance differences to the global sum of locally normalized reflectance differences is achromatic. Based on this conclusion, we propose a new, more accurate color constancy method. Qualitative and quantitative results on multiple test datasets covering different lighting scenarios demonstrate its effectiveness. In particular, the experimental results on the HDR test set show that the proposed color constancy method is superior to the comparison algorithms, indicating that it can restore the actual colors of different scenes more accurately.
Additionally, we propose a new framework that incorporates current mainstream statistics-based color constancy methods (Gray-World, Max-RGB, Gray-Edge, etc.). A limitation of our work is that the four parameters of the proposed color constancy algorithm are empirical rather than adaptive. In future work, we plan to combine the statistical information of the image with convolutional neural networks to design a network structure with more accurate color constancy.

Data Availability Statement:
The data that support the findings of this study are available on request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.