Robust and Precise Matching Algorithm Combining Absent Color Indexing and Correlation Filter

Abstract: This paper presents a novel method, named ABC-CF, that draws on the strong discriminative ability of absent color indexing (ABC) to enhance sensitivity and combines it with a correlation filter (CF) to obtain higher precision. First, apparent and absent colors are introduced by separating the original color histogram. Subsequently, an automatic threshold acquisition is proposed using a mean color histogram. Next, histogram intersection is selected to calculate the similarity. Finally, CF is applied after these steps to correct the drift caused by ABC during the matching process. The proposed approach realizes robustness to distortion of target images and higher margins in fundamental matching problems, and then achieves more precise matching in position. The effectiveness of the proposed approach is evaluated in comparative experiments with other representative methods using open data.


Introduction
In the field of computer vision [1,2], color histogram-based features have been applied in various applications, including image retrieval [3], face recognition, pedestrian tracking [4], and object matching [5]. Among other features, such as grayscale, texture, gradient, and geometric features, color features are vital in providing useful cues for object detection or matching. In many applications, color features appear in the algorithms as a main or auxiliary feature. The color feature [6] of an image is an important statistical feature in histogram-based methods, which facilitates solving problems with rotation, deformation, and scale variation during the matching process. Therefore, using the histogram method to perform statistics on colors can effectively reveal the distribution characteristics of colors to achieve robust search goals. Swain et al. [7] and Stricker et al. [8] introduced color histogram-based approaches for matching, known as color indexing (CI) and the cumulative color histogram (CCH), respectively. In CI, a histogram intersection approach is used to perform matching by considering each bin in the color histogram as a type of color feature. The CCH fixes the order of the colors by an index and then recalculates each bin value to strengthen the feature of high-frequency colors. Both can manage the challenges of rotation, deformation, and scale variation; however, they cannot easily handle noise interference and illumination problems. Thereafter, a series of fuzzy color histogram-based methods [9–12] were proposed to overcome the problems of noise interference and illumination. Verma et al. [13] introduced triangular membership functions to improve fuzzy color histograms for template matching (TFCM).
However, color histogram-based methods have some disadvantages, such as the lack of position information, which reduces discrimination sensitivity. Robustness is a crucial evaluation indicator in matching, and alignment precision is another: accuracy must be improved while maintaining robustness during object matching.
Image matching technologies based on image features have been widely applied, where classical template matching methods include the sum of squared difference (SSD) and normalized cross-correlation (NCC) [14,15]. Most object detection or matching methods, such as LGCmF [16] and MGNet [17], are based on multi-feature fusion and training processes. However, in general, they tend to be time-consuming, especially in preparing sufficiently large training datasets for models designed to learn verified training signals in applications requiring high performance. In this paper, an efficient approach is proposed for robust, fast, and accurate matching that is based on color histogram matching and combination with a high precision matching scheme.
The novel approach proposed in this paper realizes robustness to distortion of target images and higher margins in fundamental matching problems, and then achieves more precise matching in position. The effectiveness of the proposed approach is evaluated in comparative experiments with other representative methods using open data. In our previous studies [18,19], the fundamental idea of absent color indexing, named ABC, i.e., the decomposition of a normal histogram into two disjoint ones using fixed parameters, was introduced to achieve good performance in feasible matching. In this paper, it is substantially modified to give a clear formalization and a new scheme for defining an important parameter $h_T$, using an original concept of the mean color histogram to obtain more effective threshold values. Furthermore, it is combined with the correlation filter, CF, which is applied after ABC to achieve a more precise target search.
The remaining sections of the paper are as follows: Section 2 introduces the concept of apparent colors and absent colors, as well as a statistical method to determine the threshold $h_T$. Section 3 describes a method to incorporate CF to improve the match precision. Section 4 shows the experimental results for real-world and open data. Section 5 presents the conclusions and future work.

Why Are Minor Colors Important?
The motivation for introducing a novel concept is to enhance color-based features in cases where an object to be searched has a few but prominent colors together with existing features. For example, when identifying an individual person, eye color is a quantitatively minor or hidden color feature, but it can provide an important feature for identification. Especially in image pattern search problems, such minor colors are expected to help separate the targets from other candidates by enhancing the identifying power of apparent or neutral color features.
In Figure 1, $I_1$ and $I_2$ are examples of the same size (100, 90) that exhibit extremely similar colors; the major color is black, which occupies a large proportion; yellow, red, and white are minor colors with relatively few pixels. In conventional color histogram-based matching, the main colors have played an essential role in existing similarity calculation methods, whereas minor colors have been ignored as trivial information for evaluating similar images.
The proposed approach focuses on low-frequency colors in any pair of histograms of the reference and target images. To evaluate histogram similarity, three conditional combinations with respect to high and low frequencies in their respective bins must be considered. If both bins include high frequencies, they may have high similarity. In contrast, if only one of the two bins has a low frequency, its contribution to the overall similarity may be much lower. The last case, where both bins have low frequencies, has conventionally been evaluated as having only a relatively small contribution to the overall similarity; however, it is the case of interest in this work, because it shows that both images include the color represented by those bins at low frequency. In this study, this case is formalized as an effective feature in histogram evaluation. However, contamination by additional noise in the histograms must be prevented when designing algorithms, because noise may easily influence such low frequencies. This problem is addressed in this paper through particular definitions of minor colors.

Color Space Selection
Many color spaces have been proposed in color management and image processing, e.g., HSV, YUV, and L*a*b*. In this study, the L*a*b* color space [20] was used for ABC because it is a perceptually uniform space in which the color distribution tends to be concentrated, and the L* channel expresses the lightness. The value of the L* channel ranges from 0 to 100. The a* and b* channels represent colors from green to red and from blue to yellow, respectively; the range of these two channels is −128 to 127. The L*a*b* color space is closer to human vision and separates the lightness independently. To avoid the effect of illumination, only the a* and b* channels, without the L* channel, are used in this study.
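As an illustration of the color representation described above, the following minimal sketch builds a relative two-dimensional histogram over the a* and b* channels only. It assumes the image has already been converted to L*a*b* and is supplied as two arrays of a* and b* values; the function name `ab_histogram` and the choice of bin edges are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ab_histogram(a, b, bins=10):
    """Relative 2D histogram over the a* and b* channels (L* is ignored)."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    # Both channels span -128..127; an upper edge of 128 keeps the value 127
    # inside the last bin (assumed binning, not specified in the text).
    counts, _, _ = np.histogram2d(
        a, b, bins=bins, range=[[-128, 128], [-128, 128]]
    )
    # Divide by the number of pixels so that the bins sum to one.
    return counts / a.size

# Toy "image" of four pixels given directly as a* and b* values.
a = np.array([[-100.0, -100.0], [50.0, 127.0]])
b = np.array([[-100.0, -100.0], [50.0, 127.0]])
H = ab_histogram(a, b)
```

Because the L* channel never enters the histogram, a uniform change in lightness leaves `H` unchanged under this sketch.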

Apparent and Absent Color Histograms
With reference to the schematic diagram in Figure 2, a detailed mathematical formulation of the proposed procedure is given. The following nomenclature may be helpful for reading the formalization: ${}^{AP}(\cdot)$ and ${}^{AB}(\cdot)$ denote a pair of apparent and absent color histograms before normalization and inversion; ${}^{AB}\overline{(\cdot)}$ is the absent color histogram after inversion; and $(\cdot)^{AP}$ and $(\cdot)^{AB}$ are the apparent and absent color histograms after normalization. Assume that each image has $N$ pixels and that the color space is quantized into $\beta_1 \times \beta_2$ bins. The reference and target images are represented by two-dimensional color histograms $H = \{h_{ij}\}$ and $G = \{g_{ij}\}$, $(i,j) = (1,1), \ldots, (\beta_1, \beta_2)$, respectively [21], where $h_{ij}$ is the frequency in pixels in the bin at position $(i, j)$ in the color space. Using $H$ as a representative in the following formalization, the relative frequencies are obtained by dividing each $h_{ij}$ by $N$, so that they sum to one. The histogram $G$ is transformed in the same manner; in what follows, $H$ and $G$ denote these relative histograms. Figure 3 shows the relative histograms $H$ and $G$ for images $I_1$ and $I_2$, respectively. An apparent color histogram ${}^{AP}H$ and an absent color histogram ${}^{AB}H$ are defined from $H$ by thresholding: bins whose relative frequency lies above the threshold $h_T$ are kept in ${}^{AP}H$, and non-empty bins whose relative frequency lies below it are kept in ${}^{AB}H$. Here $h_T$ is an important parameter that decomposes the histogram $H$ into two disjoint constituent histograms; its definition is provided in Section 2.4. ${}^{AP}H$ holds the apparent or major colors that can be easily observed in images. ${}^{AB}H$ contains minor colors, whose occurrence is not frequent in the image or whose proportion of pixels in the image is relatively small. ${}^{AP}H$ and ${}^{AB}H$ have the same structure as the two-dimensional histogram $H$, where their elements ${}^{AP}h_{ij}$ and ${}^{AB}h_{ij}$ represent the frequencies of the colors in the $(i, j)$ bin, respectively. All bins not covered by the above definitions have zero frequency at this point. This decomposition is introduced to utilize, systematically and effectively, the information contained in the low frequencies of a histogram.
Next, after decomposition, ${}^{AB}H$ must be inverted so that each value in it is converted to its complement with respect to $h_T$. If ${}^{AB}h_{ij} > 0$ for a bin at position $(i, j)$, then the frequency in the inverted histogram ${}^{AB}\overline{H} = \{{}^{AB}\overline{h}_{ij}\}$ is defined as

$${}^{AB}\overline{h}_{ij} = h_T - {}^{AB}h_{ij}.$$

Through this inversion operation, absent colors are evaluated in an inverted manner in the similarity, so that they act as distinguishing features. To complete the inversion, some auxiliary rules are necessary. If ${}^{AB}h_{ij} = 0$ and ${}^{AP}h_{ij} > 0$, then ${}^{AB}\overline{h}_{ij} = 0$, because the color represented at coordinate $(i, j)$ in the color space should be considered an apparent color. If ${}^{AB}h_{ij} = 0$, ${}^{AP}h_{ij} = 0$, and ${}^{AB}g_{ij} > 0$, then ${}^{AB}\overline{h}_{ij} = h_T$. The last rule imposes a particular condition on the component ${}^{AB}\overline{h}_{ij}$ for which ${}^{AB}g_{ij} > 0$, i.e., ${}^{AB}\overline{h}_{ij}$ holds the counterpart for comparison when evaluating absent colors as enhanced features. Consequently, the definition of absent colors is interdependent between the two images being compared. It is noteworthy that if $h_{ij} = g_{ij} = 0$ in the original histograms, no operation is performed on those bins. In the last step, both ${}^{AP}H$ and ${}^{AB}\overline{H}$ must be normalized to $H^{AP} = \{h^{AP}_{ij}\}$ and $H^{AB} = \{h^{AB}_{ij}\}$ so that all the components of each sum to one. Figure 4 shows the apparent color histograms $H^{AP}$ and $G^{AP}$ and the absent color histograms $H^{AB}$ and $G^{AB}$ for images $I_1$ and $I_2$, respectively.
The procedure proposed here yields balanced histograms for the evaluation of image similarity, which are expected to serve as effective features for solving troublesome problems by enhancing a rather small part of the images.
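The decomposition, inversion, and normalization steps described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `decompose` is hypothetical, a bin exactly equal to the threshold is assumed to count as apparent, and a bin that is empty in one image but apparent in the other is assumed to stay at zero.

```python
import numpy as np

def decompose(H, G, h_T):
    """Split relative histogram H into normalized apparent and absent parts.

    G is the histogram of the other image; it is needed because an empty bin
    of H receives the value h_T when the same bin is an absent color in G.
    """
    H = np.asarray(H, dtype=float)
    G = np.asarray(G, dtype=float)

    # Apparent colors: bins at or above the threshold (boundary is assumed).
    ap = np.where(H >= h_T, H, 0.0)
    # Absent colors before inversion: non-empty bins below the threshold.
    ab = np.where((H > 0) & (H < h_T), H, 0.0)
    absent_in_G = (G > 0) & (G < h_T)

    # Inversion: complement with respect to h_T for absent bins.
    inv = np.zeros_like(H)
    inv[ab > 0] = h_T - ab[ab > 0]
    # Empty in H but absent in G: assign h_T as the counterpart value.
    inv[(H == 0) & absent_in_G] = h_T
    # Apparent bins and bins empty in both images remain zero.

    def normalize(X):
        total = X.sum()
        return X / total if total > 0 else X

    return normalize(ap), normalize(inv)

H = np.array([[0.7, 0.2], [0.1, 0.0]])
G = np.array([[0.6, 0.0], [0.3, 0.1]])
H_ap, H_ab = decompose(H, G, h_T=0.25)
G_ap, G_ab = decompose(G, H, h_T=0.25)
```

In this toy case the rarest non-empty bin of `H` (0.1) receives a larger inverted weight than the bin holding 0.2, which is the enhancement of minor colors that the inversion is meant to provide.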

Design of Threshold
The threshold $h_T$ plays a main role in defining the apparent and absent colors. In this section, the approach to defining it is explained so that meaningful histograms and effective performance can be achieved. Since jittering just around the threshold level would hinder stable signal conversion, an algorithm using averaged histograms is proposed to solve this problem. The mean color histogram, $M$, is introduced to obtain an averaged tendency of the color distribution in the two histograms to be compared and thereby to realize a stable definition of the threshold. $M = \{m_{ij}\}$, $(i,j) = (1,1), \ldots, (\beta_1, \beta_2)$, is defined as the bin-wise average of the two histograms $H$ and $G$. Creating $M$ is a critical phase before threshold selection. It analyzes the proportion of each component color in the histograms from a statistical perspective to match images. Hence, the threshold determination becomes better founded and adapts to the image pair, which supports the accuracy of the final similarity measurement. After creating $M$, it is converted to a sorted one-dimensional histogram $M_{sorted} = \{m_i\}$, where $i$ represents the bin's index in $M_{sorted}$. The threshold value $h_T$ is then defined using an order index $s$ related to a significant rate $\alpha$, by which the set of all bins is separated into two sets of apparent colors and absent colors in consideration of rare colors in the images.
Compared with a constant threshold, using the parameter $s$ yields a stable decomposition that can be performed without any jitter near the threshold value. Because a zero frequency is important for eliminating noise effects, a removal operation for frequencies close to zero must be applied in the absent color histogram; for example, in our experiments, only bins of the original color histogram larger than $0.2 \times h_T$ are retained. Figure 5 shows the mean color histogram $M$ for histograms $H$ and $G$. Figure 6 is a Pareto chart [22] for this example; the significant rate $\alpha$ expresses the rareness of the absent colors and contributes to the design of the threshold value.
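The text above fixes the ingredients of the threshold (a mean histogram, a sorted version of it, an order index s, and a significant rate α) but not, in this section, the exact rule linking s to α. The sketch below therefore adopts one plausible reading, stated purely as an assumption: the bins of the mean histogram are sorted in descending order, as in a Pareto chart, and s is the first index at which the cumulative frequency reaches 1 − α, so that roughly a fraction α of the total frequency is left to the absent colors. The function name `mean_histogram_threshold` is hypothetical.

```python
import numpy as np

def mean_histogram_threshold(H, G, alpha=0.2):
    """Threshold h_T from the mean histogram of H and G (assumed rule)."""
    H = np.asarray(H, dtype=float)
    G = np.asarray(G, dtype=float)
    # Mean color histogram: bin-wise average of the two relative histograms.
    M = (H + G) / 2.0
    # Sorted one-dimensional histogram, largest bins first (Pareto order).
    m_sorted = np.sort(M.ravel())[::-1]
    cumulative = np.cumsum(m_sorted)
    # Assumed link between s and alpha: first index whose cumulative
    # frequency reaches 1 - alpha.
    s = int(np.searchsorted(cumulative, 1.0 - alpha))
    s = min(s, m_sorted.size - 1)
    return float(m_sorted[s])

H = np.array([[0.7, 0.2], [0.1, 0.0]])
G = np.array([[0.5, 0.3], [0.1, 0.1]])
h_T = mean_histogram_threshold(H, G, alpha=0.2)
```

Because the threshold is read from the averaged histogram rather than from either image alone, both histograms are decomposed with the same value, which is the stabilizing effect the section describes.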

Histogram Intersection
Histogram intersection [23–25] is a popular similarity index used in many studies and applications. For any two histograms $H$ and $G$ of the same size defined in a given color space, it is computed by the following simple procedure: the smaller of the two frequencies is taken in each bin, and these minima are summed over all bins,

$$\cap(H, G) = \sum_{i=1}^{\beta_1} \sum_{j=1}^{\beta_2} \min(h_{ij}, g_{ij}).$$

For the two types of histograms proposed herein, i.e., apparent and absent color histograms, the two intersections are combined using a weighting coefficient $w$ [26], where $0 < w < 1$.
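A minimal sketch of the similarity computation follows. The intersection itself is the standard sum of bin-wise minima; the way the two intersections are combined is an assumption here, namely a weighted sum in which w multiplies the apparent term and 1 − w the absent term. The function names are hypothetical.

```python
import numpy as np

def intersection(H, G):
    """Histogram intersection: sum of bin-wise minima of two histograms."""
    return float(np.minimum(H, G).sum())

def abc_similarity(H_ap, G_ap, H_ab, G_ab, w=0.6):
    """Weighted combination of apparent and absent intersections.

    Assumed form: w * apparent term + (1 - w) * absent term, with 0 < w < 1.
    """
    return w * intersection(H_ap, G_ap) + (1.0 - w) * intersection(H_ab, G_ab)

# Two normalized toy histograms with no colors in common.
A = np.array([0.5, 0.5, 0.0])
B = np.array([0.0, 0.0, 1.0])

same = abc_similarity(A, A, B, B)          # identical in both parts
apparent_only = abc_similarity(A, A, A, B) # absent parts are disjoint
```

With normalized inputs each intersection lies between 0 and 1, so the combined score also lies between 0 and 1 under this assumed form.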

Combination of ABC with CF
The proposed method, ABC, is expected to be robust against adverse conditions such as rotation, distortion, and scaling [18], but its positional precision is insufficient because pixel location information is lost. Some applications require higher sensitivity and positioning accuracy, as well as robustness against adverse conditions. In this paper, these requirements are addressed by combining ABC with another registration scheme of higher positional precision.
The correlation filter, CF, is an effective scheme for precise registration based on training and filtering in the Fourier domain [27,28]; it can produce sharp peaks in the correlation output and achieve accurate localization of matched images. By training on deformed samples, CF is expected to be less sensitive to deformations of the target image, a property that makes it suitable for use in combination with robust but rough registration schemes, such as ABC.
In this method, an optimal filter is defined with respect to a given two-dimensional peak-centered Gaussian-like distribution and is obtained through several training processes so that the maximum value in a response map indicates the best-matched position. Let a reference image be $p$ in the image domain. An output $q$ is then generated using a Gaussian-like profile, where $\sigma = 8$ for the reference image of size 110 × 80, as shown in Figure 7. The correlation operation with filter $u$ in the image domain would be performed via pixelwise calculations; it is performed efficiently in the frequency domain [29] as follows:

$$Q = P \odot U^{*},$$

where $P$, $Q$, and $U$ are the Fourier transforms of $p$, $q$, and $u$, respectively; the symbols "$*$" and "$\odot$" indicate the complex conjugate and the Hadamard product [30]. To obtain a better filter $U$, several samples $p_i$ are generated as a training set via affine transformations of the reference image, as shown in Figure 8, and the corresponding outputs $q_i$ are generated to form a two-dimensional peak. This training is a key process for achieving high and stable sensitivity in finding a precise position in spite of adverse conditions. The minimization of the output sum of squared error [31] is utilized:

$$\min_{U^{*}} \sum_{i} \left| P_i \odot U^{*} - Q_i \right|^{2},$$

where $i$ indexes the training samples, from 1 to 60 in our experiment. A closed-form expression of $U^{*}$ is obtained as follows:

$$U^{*} = \frac{\sum_{i} Q_i \odot P_i^{*}}{\sum_{i} P_i \odot P_i^{*}},$$

where the division is performed element-wise. Let an image $t$ be the input to CF, taken from the position searched by the pre-processing step, ABC, as shown in Figure 9. The optimal filter $U^{*}$ is applied to $T$, the Fourier transform of $t$, to produce its response map $R$ in the Fourier domain, in which the largest peak indicates the target position:

$$R = T \odot U^{*} \quad (16)$$

Figure 9 is an overview of the combination of ABC and CF. As shown in the upper left corner of the figure, ABC, as the first step for coarse matching, gives a searched image $t$ as the initial candidate, and CF is then performed as the next step, in which the filter $u$ in the upper right corner allows for more accurate registration. The figure shows the profile of the response map $r$ in the upper right, and the yellow bounding box shows the best-matched position, using image data from the Box dataset [32,33].
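The training and application of the filter described above can be sketched with NumPy's FFT routines. This is a minimal illustration under several assumptions: the function names are hypothetical, a small constant `eps` is added to the denominator for numerical stability (not mentioned in the text), and noise-perturbed copies of a random patch stand in for the affine-transformed training samples used in the paper.

```python
import numpy as np

def gaussian_peak(shape, sigma=8.0):
    """Desired output q: a Gaussian-like peak centered in the patch."""
    rows, cols = shape
    y, x = np.mgrid[0:rows, 0:cols]
    cy, cx = rows // 2, cols // 2
    return np.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))

def train_filter(samples, target, eps=1e-3):
    """Closed-form filter U* minimizing the output sum of squared error."""
    Q = np.fft.fft2(target)
    numerator = np.zeros_like(Q)
    denominator = np.zeros(Q.shape)
    for p in samples:
        P = np.fft.fft2(p)
        numerator += Q * np.conj(P)            # Q_i (Hadamard) conj(P_i)
        denominator += (P * np.conj(P)).real   # P_i (Hadamard) conj(P_i)
    # eps is an assumed regularizer that avoids division by zero.
    return numerator / (denominator + eps)

def response(t, U_conj):
    """Response map r in the image domain, from R = T (Hadamard) U*."""
    return np.real(np.fft.ifft2(np.fft.fft2(t) * U_conj))

rng = np.random.default_rng(0)
p = rng.random((32, 32))  # stand-in for the reference image
samples = [p + 0.01 * rng.standard_normal(p.shape) for _ in range(5)]
U_conj = train_filter(samples, gaussian_peak(p.shape, sigma=2.0))

# A circularly shifted copy of the reference should move the peak by the
# same offset away from the patch center.
t = np.roll(p, shift=(3, 5), axis=(0, 1))
r = response(t, U_conj)
peak = tuple(int(v) for v in np.unravel_index(np.argmax(r), r.shape))
```

The offset of `peak` from the patch center is the positional correction that the CF stage would add to the coarse position delivered by ABC.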

Experiments
Experiments were performed to demonstrate the performance of the proposed method by comparing it with other approaches. In Section 4.1, four color histogram-based approaches, i.e., CI, CCH, ABC, and ABC-CF, are compared using real-world images. In Section 4.2, both color histogram-based approaches and template matching approaches are compared for tracking using open data.
The same parameters are used throughout all the experiments for a fair comparison: a two-dimensional color histogram of 10 × 10 bins, α = 0.2, and w = 0.6.

Experimental Comparison with Color Histogram-Based Methods
The key feature of ABC is to complete image matching via a color histogram. Therefore, it is compared with existing color histogram-based methods, CI and CCH, to evaluate the performance of our approach; ABC-CF was also selected, as an improved version of ABC, to compare the matching results. Different challenges were tried in the experiments, namely rotation, deformation, occlusion, scale variation, and illumination variation [34–37], to demonstrate the merits of ABC and ABC-CF. The results obtained in a scene measuring 360 × 640 are shown in Figure 10. Figure 10a shows a reference image measuring 100 × 40. Figure 10b–f show the different challenges under which the reference position was searched. For the cases of rotation, deformation, and occlusion, ABC, CI, and CCH all yielded good performance in the experiments; this demonstrates the advantages of the color histogram-based approaches. In the case of scale variation, CCH showed a slight shift, whereas ABC and CI maintained the correct matching position. In the experiment pertaining to illumination variation, only ABC matched the target, although the matched position shifted upward slightly. Figure 10g–i show the similarity profiles. When the best-matched position is compared with the second-best matched position, ABC demonstrates better discrimination ability than the other two methods: the margin distance of ABC was evidently larger than those of CI and CCH. ABC-CF, which is the improved version of the original ABC, yielded more accurate search results and solved the shift problem. GT represents the ground truth for evaluating the performance of the compared methods. Table 1 shows a comparison of the location error under the different challenges for the color histogram-based methods, where the location error was calculated as the Euclidean distance between GT and the searched position.
In the cases involving rotation and deformation, CCH and CI found the best position in the experiments. The ABC matching position exhibited a slight downward shift, and this problem was mitigated using the ABC-CF method. In the other three cases, ABC-CF was the best method, as it yielded the lowest location error.

ABC-CF in Open Data
To evaluate the performance of our new approach, ABC-CF, against color histogram-based and template-based matching methods, four histogram-based algorithms, i.e., ABC, CI, CCH, and TFCM, and two template-matching algorithms, i.e., SSD and NCC, were selected for comparison using open data. Figure 12 shows a reference image of size 85 × 77 from the Tiger1 data in [32], in which many frames include severe adverse conditions such as out-of-plane rotation, occlusion, and scaling. Pixel-by-pixel scanning is done over the scene using the reference image for all the algorithms. In Figure 12, the horizontal axis represents the frame number, and the vertical axis represents the location error as the Euclidean distance between the best-matched positions and the ground truths. Five frames are extracted as examples to show some details of finding or matching the reference in the scene. For instance, Frame #31 shows the matching result under the conditions of deformation and illumination variation, where ABC, CI, CCH, TFCM, and SSD obtained good positions despite being slightly shifted, and the ABC-CF method yielded the most precise match. Similar results were observed in other frames. The precision plot is shown in Figure 13, in which the horizontal axis shows the upper limit of the location error. For example, the precision value at limit 15 is the proportion of all frames in which the location error of the detected position does not exceed 15.
The vertical axis shows the precision in the range of 0 to 1. Because both template-matching-based algorithms failed to increase their precision as the limits increased, they might have a clear bound of registration up to a distance of approximately 20 pixels, which is approximately 20% of the reference size in this case. None of the histogram-based approaches exhibited such characteristics; their precision increased gradually, albeit remaining lower than that of the template-based methods, particularly at the low limits. The ABC method demonstrated the best overall performance among all the methods, as indicated by the following two findings: the higher values around the low limits indicated more precise sensitivity in terms of registration performance, whereas the higher values at the high limits indicated more robustness in identifying the targets.
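The two evaluation quantities used in this section, the location error and the precision at a given limit, can be computed as in the following minimal sketch. The function names and the toy positions are illustrative assumptions; no values from the paper's experiments are reproduced.

```python
import numpy as np

def location_errors(predicted, ground_truth):
    """Euclidean distance between matched and ground-truth positions, per frame."""
    predicted = np.asarray(predicted, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)
    return np.linalg.norm(predicted - ground_truth, axis=1)

def precision_at(errors, limit):
    """Fraction of frames whose location error does not exceed the limit."""
    return float(np.mean(np.asarray(errors) <= limit))

# Toy sequence of three frames with (x, y) positions in pixels.
predicted = [(10, 10), (13, 14), (50, 50)]
ground_truth = [(10, 10), (10, 10), (10, 10)]
errors = location_errors(predicted, ground_truth)

# Sweeping the limit gives the points of a precision plot.
limits = np.arange(0, 51)
curve = [precision_at(errors, limit) for limit in limits]
```

A curve that rises early indicates precise registration, while a curve that keeps rising at large limits indicates robustness, matching the two readings of Figure 13 given above.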

Computation Cost
The programs for the experiments were implemented in C++ by using Visual Studio 2015 and the OpenCV 2.4.13 library, without any parallel processing or GPU acceleration. The hardware was a Windows 10 PC with a 2.81 GHz Intel Core i5-8400 CPU and 8 GB RAM. Since the approach proposed in the paper, ABC-CF, is based on pixel-by-pixel calculation in nature, the computation cost is proportional to the number of pixels in the reference images and the target scene. The matching task depicted in Figures 7a and 9 was selected as a typical example to check the computation time requirements. A reference image of 110 × 80 pixels and a scene of 480 × 640 were used in this task, and the computation time was then observed using the OpenCV timing function. Table 2 shows the computation costs for ABC-CF and the other methods. The most efficient methods were SSD and NCC, given their simplicity. Although ABC-CF exhibited a time disadvantage, this is not problematic in practical applications because the computation cost of all histogram-based methods does not differ significantly.

Conclusions
A new approach named absent color indexing was proposed for robust pattern retrieval, based on a novel segmentation of histograms that enhances colors of low or zero frequency. Using another particular structure, the mean color histogram, the threshold $h_T$ can be calculated for stable segmentation of histograms without any jitter. To compensate for the insufficient registration precision of absent color indexing, the correlation filter was combined with it. Experiments on various image data show that the proposed method can achieve better performance in image matching and robust tracking than other typical methods.
Author Contributions: All five authors contributed to this work. Methodology, Y.T. and S.K.; writing-original draft preparation, Y.T.; writing-review and editing, S.K. and M.F.; supervision, S.K.; project administration, S.S. and M.I.; resources, S.S., M.I. and M.F. All authors have read and agreed to the published version of the manuscript.