Nonlocal Means Two Dimensional Histogram-Based Image Segmentation via Minimizing Relative Entropy

Spatial correlation between pixels is important in thresholding methods, yet it is often ignored, which may lead to unsatisfactory segmentation results. To overcome this shortcoming, we propose a new image segmentation approach that takes into account not only the gray level of each pixel but also its spatial information. First, a non-local mean filter is applied to the image. Then the filtered image and the original image are used together to build a two dimensional histogram, called the non-local mean two dimensional histogram (NLMTDH). Finally, a minimum relative entropy criterion is used to select the ideal threshold vector. Because the non-local mean filter operates on the neighborhood of the current pixel, its output carries the spatial information of that pixel. Segmentation results on several images illustrate the effectiveness of the proposed thresholding method, whose segmentation accuracy is higher than that of the existing thresholding methods used for comparison.


Introduction
In computer vision, image segmentation is a primary pre-processing step. Its goal is to partition the image into several regions such that, within each region, image characteristics such as brightness, color, and texture are similar to some extent, while between different regions these characteristics are clearly different. Image segmentation techniques have been widely adopted in practical applications such as cell segmentation [1], object detection in SAR images [2], and defect detection [3,4].

Methods                      | Advantages                                      | Disadvantages
superpixel [5,6]             | reduces redundant information; low complexity   | cannot locate edges accurately
watershed [7,8]              | simple and intuitive                            | usually results in over-segmentation
active contour models [9,10] | rigorous mathematical basis                     | sensitive to noise; high computational complexity
clustering [11,12]           | only intensity values are needed; simple        | the number of clusters cannot be determined automatically; spatial information is ignored
deep learning [13,14]        | high segmentation accuracy                      | large computational burden
thresholding [15,16]         | simple; easy to implement                       | ignores spatial information

Thresholding is very popular because it is relatively simple and easy to implement. Thresholding methods assume that the gray level histogram of an image has distinct peaks and valleys, so that objects can be distinguished from the background via a threshold. Usually, an ideal threshold is determined by maximizing or minimizing an objective function constructed from the gray level histogram. For example, Otsu proposed selecting the threshold by maximizing the between-class variance [17], and Kittler proposed selecting it by minimizing the classification error [18]. Pun first introduced entropy as an objective function for image thresholding [19]: the posteriori entropies of the object and the background were calculated first, and then their upper bound was maximized. Subsequently, Wong et al. presented an improved version of Pun's approach by imposing inequality constraints on the posteriori entropy that characterize the uniformity and shape of the regions [20]. Kapur proposed another entropy-based method, the maximum entropy thresholding algorithm, which calculates the entropies of the objects and the background and then maximizes their sum [21]. In [22], Li et al. suggested selecting the threshold by minimizing the difference between the original image and the segmented image, using cross-entropy as the criterion.
The classical thresholding methods above use only the gray level information of an image and ignore the spatial information between pixels. Therefore, they often produce segmentation errors. For instance, two different images may share the same gray level histogram, in which case the same threshold is selected for both and may fail to segment one or both of them correctly. To overcome this shortcoming, the spatial information between pixels should be taken into account in the segmentation process. To this end, Abutaleb proposed the concept of the two dimensional histogram [23]. A two dimensional histogram is an L × L matrix, each element of which represents the occurrence probability of the gray level pair (i, j), where i denotes the gray level of a pixel in the original image and j the gray level of the same pixel in the neighborhood-smoothed image. Since the neighborhood-smoothed image contains spatial information among pixels, this information is integrated into the selection of the threshold. After Abutaleb's work, several authors extended the classical thresholding methods to two dimensional thresholding methods by adopting the two dimensional histogram, for example, the two dimensional Rényi entropy thresholding method [24], the two dimensional Otsu thresholding method [25], and the two dimensional Tsallis entropy thresholding method [26]. Inspired by Abutaleb's idea, many researchers constructed different two dimensional histograms using other kinds of spatial information. Xiao suggested using the similarity of a pixel to its neighboring pixels as the spatial information; the resulting histogram is named the gray level spatial correlation (GLSC) histogram [27]. Adiljan adopted edge information together with the gray level of pixels [28]: the gradient of the original image is computed first, and then the orientation histogram of the gradient image is calculated and used as edge information. The resulting two dimensional histogram is called the 2D direction histogram. Xiao also presented another method that combines the gray level of the original image with its gradient magnitude; the resulting histogram is called the GLGM histogram [29].
For all of the two dimensional histograms mentioned above, the core idea is to apply a certain filter to the original image to obtain the spatial information between pixels. Abutaleb used a mean filter restricted to a local neighborhood of size 3 × 3. However, in some situations the local mean may lose fine details of an image, such as points, lines, and edges. For this reason, a non-local mean filter [30] is adopted in this paper. Unlike the local mean filter, the non-local mean filter computes a weighted mean of all pixels in the image that are similar to the target pixel. The non-local mean filtered image and the original image are used together to construct a two dimensional histogram, and the ideal two dimensional threshold vector is obtained by minimizing the relative entropy.
The rest of this paper is organized as follows. In Section 2, the non-local mean filter is briefly reviewed and the non-local mean two dimensional histogram is constructed. In Section 3, threshold selection by minimizing relative entropy is described. In Section 4, experimental results and discussion are presented in detail. Finally, conclusions are drawn in Section 5.

Non-Local Mean Filter
Let X(i) denote the gray level of pixel i in image I. In the non-local mean filter, the estimated value of pixel i is a weighted average of the gray levels of the pixels in image I, calculated as

$$\hat{X}(i) = \sum_{j \in I} w(i, j)\, X(j), \tag{1}$$

where the weights w(i, j) reflect the similarity between pixels i and j and are calculated as

$$w(i, j) = \frac{1}{Z(i)} \exp\left(-\frac{\| X(N_i) - X(N_j) \|_{2,\sigma}^{2}}{h^{2}}\right), \tag{2}$$

where $N_k$ denotes a square neighborhood of fixed size centered at pixel k, $X(N_i)$ is the vector of gray levels inside the square neighborhood $N_i$, σ is the standard deviation of the Gaussian kernel, and h controls the degree of filtering. Z(i) is a normalizing constant given by

$$Z(i) = \sum_{j \in I} \exp\left(-\frac{\| X(N_i) - X(N_j) \|_{2,\sigma}^{2}}{h^{2}}\right). \tag{3}$$

The non-local mean filter therefore compares gray levels not only at a single point but also over the geometrical configuration of a whole neighborhood.
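
As a rough illustration of the filter described above, the following Python sketch computes a non-local mean estimate with NumPy. It makes several simplifying assumptions that are not part of the original method: the search is limited to a window around each pixel (the hypothetical parameter `search_radius`) rather than the whole image, patch distances use a uniform kernel instead of a Gaussian-weighted one, and the function name `nlm_filter` and its default values are chosen only for illustration.

```python
import numpy as np


def nlm_filter(image, patch_radius=1, search_radius=5, h=10.0):
    """Naive non-local mean filter (illustrative sketch, not optimized).

    Assumptions: windowed search instead of whole-image search, and a
    uniform (not Gaussian-weighted) patch distance.
    """
    img = np.asarray(image, dtype=np.float64)
    rows, cols = img.shape
    pad = patch_radius + search_radius
    padded = np.pad(img, pad, mode="reflect")
    n_patch = (2 * patch_radius + 1) ** 2

    value_sum = np.zeros_like(img)   # accumulates w(i, j) * X(j)
    weight_sum = np.zeros_like(img)  # accumulates Z(i)

    for dy in range(-search_radius, search_radius + 1):
        for dx in range(-search_radius, search_radius + 1):
            # Mean squared difference between the patch around i and the
            # patch around j = i + (dy, dx), for all pixels i at once.
            dist = np.zeros_like(img)
            for py in range(-patch_radius, patch_radius + 1):
                for px in range(-patch_radius, patch_radius + 1):
                    a = padded[pad + py: pad + py + rows,
                               pad + px: pad + px + cols]
                    b = padded[pad + dy + py: pad + dy + py + rows,
                               pad + dx + px: pad + dx + px + cols]
                    dist += (a - b) ** 2
            dist /= n_patch

            weight = np.exp(-dist / (h * h))
            shifted = padded[pad + dy: pad + dy + rows,
                             pad + dx: pad + dx + cols]
            value_sum += weight * shifted
            weight_sum += weight

    # The zero offset always contributes weight 1, so weight_sum >= 1.
    return value_sum / weight_sum
```

Because every output value is a convex combination of input gray levels, the result stays within the input range; larger values of `h` give stronger smoothing.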

Construction of NLMTDH
Let J be the non-local mean filtered version of the original image I. Both images have size M × N, and their gray levels belong to the set {0, 1, ..., L − 1}. Let I(x, y) and J(x, y) be the gray levels of the pixel at (x, y), where x = 1, 2, ..., M and y = 1, 2, ..., N, and let $n_{ij}$ be the total number of pixels such that I(x, y) = i and J(x, y) = j. The NLMTDH is defined as

$$p_{ij} = \frac{n_{ij}}{M \times N}. \tag{4}$$

The NLMTDH $P = \{p_{ij};\ i, j = 0, 1, \ldots, L - 1\}$ is an L × L matrix, as shown in Figure 1. Assume that a two dimensional (2D) threshold vector (s, t) divides the NLMTDH into four regions, where s is the threshold for the original image and t the threshold for the non-local mean filtered image. Since the pixels in the interior of the objects or the background are similar to one another, regions 1 and 3 contain the information of the background and the objects, respectively, while regions 2 and 4 contain the information of edges and noise.
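
The construction above amounts to a normalized joint histogram of the two images. A minimal sketch follows, assuming both images are already available as arrays of the same shape; the function name `joint_histogram` and the rounding and clipping of the filtered image to integer gray levels are assumptions made for illustration.

```python
import numpy as np


def joint_histogram(original, filtered, levels=256):
    """Return the levels x levels matrix p[i, j] = n_ij / (M * N)."""
    i = np.asarray(original).astype(np.intp).ravel()
    # The filtered image is usually real-valued, so round it to the
    # nearest gray level and clip to the valid range (an assumption).
    j = np.rint(np.asarray(filtered, dtype=np.float64))
    j = np.clip(j, 0, levels - 1).astype(np.intp).ravel()

    counts = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(counts, (i, j), 1.0)  # n_ij: count each (i, j) pair
    return counts / i.size
```

Rows index the gray level of the original image and columns the gray level of the filtered image, matching the roles of s and t in the threshold vector.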

Relative Entropy
Relative entropy, also called Kullback-Leibler (KL) divergence, measures the difference between two probability distributions P and Q. Let $P = \{p_1, p_2, \ldots, p_n\}$ and $Q = \{q_1, q_2, \ldots, q_n\}$ satisfy $\sum_{i=1}^{n} p_i = \sum_{i=1}^{n} q_i = 1$. The relative entropy between P and Q is defined as

$$D(P \| Q) = \sum_{i=1}^{n} p_i \log \frac{p_i}{q_i}. \tag{5}$$

Li and Lee [22] adopted relative entropy for threshold selection. In image thresholding, P represents the distribution of the original image and Q that of the segmented image.
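
For concreteness, the relative entropy of two discrete distributions can be evaluated as in the sketch below. The function name `relative_entropy` is hypothetical, and the handling of zero probabilities (terms with p_i = 0 contribute nothing, and the result is infinite if q_i = 0 where p_i > 0) follows the usual convention rather than anything stated in this paper.

```python
import numpy as np


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(P || Q) using the natural logarithm."""
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    support = p > 0  # terms with p_i = 0 are taken as 0
    if np.any(q[support] <= 0):
        return float("inf")
    return float(np.sum(p[support] * np.log(p[support] / q[support])))
```

Note that the measure is not symmetric: in general `relative_entropy(p, q)` differs from `relative_entropy(q, p)`.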

Threshold Selection Based on NLMTDH Using Relative Entropy
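Consider the case in which there are only two classes in the image, and let $C_0$ represent the background and $C_1$ the object. As stated before, regions 1 and 3 in Figure 1 contain the information of the background and the object, respectively. Let $P_0$ and $P_1$ be the occurrence probabilities of the background and the object at the threshold vector (s, t). They are computed as

$$P_0(s, t) = \sum_{i=0}^{s} \sum_{j=0}^{t} p_{ij} \tag{6}$$

and

$$P_1(s, t) = \sum_{i=s+1}^{L-1} \sum_{j=t+1}^{L-1} p_{ij}. \tag{7}$$

The mean vectors of the two classes are

$$\mu_0(s, t) = (\mu_{0i}, \mu_{0j})^{T} = \left( \frac{\sum_{i=0}^{s} \sum_{j=0}^{t} i\, p_{ij}}{P_0(s, t)},\ \frac{\sum_{i=0}^{s} \sum_{j=0}^{t} j\, p_{ij}}{P_0(s, t)} \right)^{T} \tag{8}$$

and

$$\mu_1(s, t) = (\mu_{1i}, \mu_{1j})^{T} = \left( \frac{\sum_{i=s+1}^{L-1} \sum_{j=t+1}^{L-1} i\, p_{ij}}{P_1(s, t)},\ \frac{\sum_{i=s+1}^{L-1} \sum_{j=t+1}^{L-1} j\, p_{ij}}{P_1(s, t)} \right)^{T}, \tag{9}$$

respectively. Similar to [22], the relative entropy between the original image and the segmented image in the NLMTDH at the threshold vector (s, t) is defined as

$$D(P, Q \mid s, t) = \sum_{i=0}^{s} \sum_{j=0}^{t} \left( i\, p_{ij} \log \frac{i}{\mu_{0i}} + j\, p_{ij} \log \frac{j}{\mu_{0j}} \right) + \sum_{i=s+1}^{L-1} \sum_{j=t+1}^{L-1} \left( i\, p_{ij} \log \frac{i}{\mu_{1i}} + j\, p_{ij} \log \frac{j}{\mu_{1j}} \right). \tag{10}$$

More details on how Equation (10) is obtained can be found in Appendix A. Substituting Equations (8) and (9) into Equation (10), after some manipulation one obtains

$$D(P, Q \mid s, t) = \sum_{i=0}^{L-1} \sum_{j=0}^{L-1} \left( i\, p_{ij} \log i + j\, p_{ij} \log j \right) - P_0 \left( \mu_{0i} \log \mu_{0i} + \mu_{0j} \log \mu_{0j} \right) - P_1 \left( \mu_{1i} \log \mu_{1i} + \mu_{1j} \log \mu_{1j} \right), \tag{11}$$

where the contribution of regions 2 and 4 is neglected, so that the first term is a constant for the entire image. An ideal threshold vector $(s^*, t^*)$ is one that minimizes $D(P, Q \mid s, t)$, i.e.,

$$(s^*, t^*) = \arg\min_{0 \le s,\, t \le L-1} D(P, Q \mid s, t). \tag{12}$$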
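
The search for the threshold vector can be sketched as an exhaustive evaluation over all (s, t). The code below is a minimal sketch under stated assumptions, not the authors' implementation: region 1 is taken as i ≤ s, j ≤ t and region 3 as i > s, j > t; the image-wide constant term is dropped, so minimizing the relative entropy is treated as maximizing the sum over both classes of P_c (μ_ci log μ_ci + μ_cj log μ_cj); terms with zero mass are taken as zero; and the names `select_threshold_vector` and `_m_log_mu` are hypothetical.

```python
import numpy as np


def _m_log_mu(m, p):
    """Elementwise m * log(m / p), which equals P * mu * log(mu); 0 if m or p is 0."""
    out = np.zeros_like(m)
    ok = (m > 0) & (p > 0)
    out[ok] = m[ok] * np.log(m[ok] / p[ok])
    return out


def _lower(a):
    """Sums of a over i <= s, j <= t for every (s, t)."""
    return a.cumsum(axis=0).cumsum(axis=1)


def _upper(a):
    """Sums of a over i > s, j > t for every (s, t)."""
    c = a[::-1, ::-1].cumsum(axis=0).cumsum(axis=1)[::-1, ::-1]
    out = np.zeros_like(a)
    out[:-1, :-1] = c[1:, 1:]
    return out


def select_threshold_vector(hist):
    """Return (s, t) maximizing the threshold-dependent part of the criterion."""
    hist = np.asarray(hist, dtype=np.float64)
    g = np.arange(hist.shape[0], dtype=np.float64)
    wi = hist * g[:, None]  # i * p_ij
    wj = hist * g[None, :]  # j * p_ij

    p0, m0i, m0j = _lower(hist), _lower(wi), _lower(wj)
    p1, m1i, m1j = _upper(hist), _upper(wi), _upper(wj)

    gain = (_m_log_mu(m0i, p0) + _m_log_mu(m0j, p0)
            + _m_log_mu(m1i, p1) + _m_log_mu(m1j, p1))
    # Only accept threshold vectors that leave both classes non-empty.
    gain = np.where((p0 > 0) & (p1 > 0), gain, -np.inf)

    s, t = np.unravel_index(np.argmax(gain), gain.shape)
    return int(s), int(t)
```

Using cumulative sums avoids recomputing the region sums for each of the L × L candidates; when several threshold vectors tie, `np.argmax` returns the first one in row-major order.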

Results
To demonstrate the effectiveness of the proposed method, several real images are used to test the algorithm. The proposed method is compared with the Otsu method, the Kapur method, the minimum cross entropy (MCE) method, and the 2D histogram-based minimum cross entropy (2DMCE) method. All methods are implemented in Matlab on an Intel Core(TM) i5-4200U 2.3 GHz platform with 8 GB of RAM. The test images are Ant, Bacteria, Block, Geometric, Junk, Mask, and two casting images.
In this paper, the misclassification error (ME) is adopted as the objective criterion for evaluating the performance of the compared methods. For two-class segmentation, ME [31] is defined as

$$ME = 1 - \frac{|B_o \cap B_T| + |F_o \cap F_T|}{|B_o| + |F_o|}, \tag{13}$$

where $B_o$ and $F_o$ are the background and foreground pixel sets of the ground-truth image, $B_T$ and $F_T$ are the corresponding sets in the thresholded image, and |·| denotes the number of elements in a set. The value of ME lies in [0, 1], where 0 indicates a perfect segmentation and 1 a completely wrong segmentation. A smaller ME value indicates better segmentation quality.

Figure 2 shows all the test images and the corresponding ground-truth images, and Figure 3 shows the binary segmentation results of the compared methods. As shown in Figure 3, for the Ant image the proposed method produces the best segmentation result, while Kapur's method fails to give a correct segmentation. The Otsu and MCE methods result in over-segmentation in some regions and under-segmentation in others, and the 2DMCE method shows some under-segmentation in the leg part. For the Bacteria image, the Otsu and MCE methods produce incorrect segmentation results. In the result of the Kapur method, many background pixels are classified as foreground, and the 2DMCE result exhibits under-segmentation. The proposed method produces the best result, with the least segmentation error. For the Block image, the Otsu and Kapur methods cannot fully extract the objects from the background, while the MCE and 2DMCE methods show some segmentation error; only the proposed method gives the correct segmentation. For the Geometric image, the Otsu and Kapur methods cannot separate the object from the background correctly, whereas the other three methods give satisfactory results. For the Junk image, the Otsu, Kapur, and MCE methods can extract the object, but their results contain much noise. The 2DMCE method and the proposed method both give satisfactory results, with the proposed method being more accurate. For the Mask image, the Otsu and Kapur methods give poor segmentation results, while the MCE method, the 2DMCE method, and the proposed method obtain acceptable results. For the two casting images, the proposed method produces the best result.

Table 2 lists the thresholds obtained by each method (threshold vectors for the two dimensional histogram-based methods) together with the ME values. The ME of the proposed method is the smallest, indicating that it obtains the best segmentation results.

Table 3 lists the execution time of every method. The one dimensional histogram-based thresholding methods (MCE, Otsu, and Kapur) take less time than the two dimensional histogram-based thresholding methods, for two reasons. First, the two dimensional methods must filter the original image and then build the two dimensional histogram. Second, the threshold search range of a two dimensional histogram is L × L, compared with L for a one dimensional histogram; the larger the search range, the more time is needed.
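
The misclassification error used as the evaluation criterion in this section can be computed from two binary masks as in the sketch below. The function name `misclassification_error` and the convention that nonzero (True) pixels mark the foreground are assumptions made for illustration.

```python
import numpy as np


def misclassification_error(ground_truth, result):
    """ME = 1 - (|Bo ∩ BT| + |Fo ∩ FT|) / (|Bo| + |Fo|)."""
    gt = np.asarray(ground_truth, dtype=bool)
    seg = np.asarray(result, dtype=bool)
    if gt.shape != seg.shape:
        raise ValueError("masks must have the same shape")
    # Pixels where both masks agree are exactly the background pixels
    # kept as background plus the foreground pixels kept as foreground.
    agree = np.count_nonzero(gt == seg)
    return 1.0 - agree / gt.size
```

Since every pixel of the ground truth is either background or foreground, the denominator |B_o| + |F_o| equals the total number of pixels.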

Discussion
Constructing a two dimensional histogram by combining the original image with a filtered version of it is a popular strategy for integrating the spatial information between pixels into the thresholding process, and it has been shown to give higher segmentation performance than a one dimensional histogram. Abutaleb's method was the first successful attempt [23]; it adopted a local mean filter. From a theoretical viewpoint, the local mean filter is a linear low-pass filter, like the Gaussian filter, and it smooths the edges and details of an image. Abutaleb's method assumes that the pixels inside the objects or the background are similar to one another, while the pixels located at edges or borders differ from those inside the objects or the background. If the local mean filter smooths the edges, some edge or noise pixels may be classified as background or object, and a higher segmentation error may occur. The results in [30] showed that the non-local mean filter is superior to the local mean filter: it preserves more edges and details because it looks for pixels similar to the current pixel in the entire image instead of in a local neighborhood, and then uses the weighted mean of these pixels as the filtered value of the current pixel. Therefore, our method improves on Abutaleb's method by reducing the segmentation error, as the experimental results demonstrate. However, since the non-local mean filter is not very effective against salt-and-pepper noise, the performance improvement will be limited if an image is corrupted by such noise. This is a limitation of the proposed method. In future work, we plan to develop combined filters for images corrupted by complex noise.
Our proposed method is a thresholding method. The performance of such methods is usually not as good as that of more sophisticated methods, such as CNN-based methods [13], mainly because thresholding uses less information from the image.

Conclusions
In this paper, a new thresholding method based on the non-local mean two dimensional histogram is proposed. First, the original image is filtered with a non-local mean filter, which incorporates the spatial information between pixels into the filtered image. Then, a non-local mean two dimensional histogram is constructed from the original image and the filtered image. Finally, the relative entropy is calculated based on the non-local mean two dimensional histogram, and the optimal threshold vector is determined by minimizing it. In the experiments, the proposed method is used to segment several real images and is compared with some existing thresholding methods. The results show that the proposed method obtains better segmentation performance.
Author Contributions: C.J. and Y.T. conceived the idea of the paper, and C.J. implemented the construction of the non-local mean two dimensional histogram. W.Y. and F.W. collected the materials. Y.G. implemented the relative entropy criterion and designed the experiments. Y.G. and Y.T. wrote the paper. All authors have read and approved the final manuscript.
Acknowledgments: This work was partially supported by the Zhejiang Province public welfare project (2017C31126) and Quzhou science and technology projects (2016Y015 and H2018007).

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Derivation of Equation (10)
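Let I be an image with N pixels whose gray levels are $F = \{f_1, f_2, \ldots, f_N\}$. Suppose a threshold t separates the image into two classes, i.e., object and background, and let $G = \{g_1, g_2, \ldots, g_N\}$ be the gray levels of the segmented image, in which each pixel takes the mean gray level of its class. The mean gray levels of the two classes are

$$\mu_1(t) = \frac{1}{N_1} \sum_{f_i < t} f_i \tag{A1}$$

and

$$\mu_2(t) = \frac{1}{N_2} \sum_{f_i \ge t} f_i, \tag{A2}$$

respectively, where $N_1$ and $N_2$ are the numbers of pixels with $f_i < t$ and $f_i \ge t$, respectively. In [22], the relative entropy between the original image and the segmented image is defined as

$$D(t) = \sum_{j=0}^{t-1} j\, h_j \log \frac{j}{\mu_1(t)} + \sum_{j=t}^{L-1} j\, h_j \log \frac{j}{\mu_2(t)}, \tag{A3}$$

where $h_j$ is the number of pixels with gray level j, and the means $\mu_1(t)$ and $\mu_2(t)$ can be calculated from the histogram as

$$\mu_1(t) = \frac{\sum_{j=0}^{t-1} j\, h_j}{\sum_{j=0}^{t-1} h_j}, \qquad \mu_2(t) = \frac{\sum_{j=t}^{L-1} j\, h_j}{\sum_{j=t}^{L-1} h_j}. \tag{A4}$$

By extending this one dimensional histogram-based definition of relative entropy to the two dimensional case, one obtains Equation (10):

$$D(P, Q \mid s, t) = \sum_{i=0}^{s} \sum_{j=0}^{t} \left( i\, p_{ij} \log \frac{i}{\mu_{0i}} + j\, p_{ij} \log \frac{j}{\mu_{0j}} \right) + \sum_{i=s+1}^{L-1} \sum_{j=t+1}^{L-1} \left( i\, p_{ij} \log \frac{i}{\mu_{1i}} + j\, p_{ij} \log \frac{j}{\mu_{1j}} \right). \tag{A5}$$

In Equation (A5), the term $i\, p_{ij}$ plays the same role as the term $j\, h_j$ in Equation (A3), which is the product of gray level j and its number of occurrences.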
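
The one dimensional criterion that this appendix starts from can be sketched as a direct search over t. The code below is an illustrative sketch rather than the implementation of [22]: it assumes gray levels 0 to L − 1, assigns pixels with gray level below t to the first class, treats terms with gray level 0 or empty bins as zero, and uses the hypothetical names `min_cross_entropy_threshold` and `_class_term`.

```python
import numpy as np


def _class_term(levels, counts, mu):
    """Sum of j * h_j * log(j / mu) over the bins of one class."""
    ok = (levels > 0) & (counts > 0)  # gray level 0 contributes nothing
    if not np.any(ok):
        return 0.0
    return float(np.sum(levels[ok] * counts[ok] * np.log(levels[ok] / mu)))


def min_cross_entropy_threshold(hist):
    """Return the threshold t minimizing the 1D relative entropy, or None."""
    h = np.asarray(hist, dtype=np.float64)
    g = np.arange(h.size, dtype=np.float64)
    best_t, best_d = None, np.inf
    for t in range(1, h.size):
        n1, n2 = h[:t].sum(), h[t:].sum()
        if n1 == 0 or n2 == 0:
            continue  # both classes must contain pixels
        mu1 = (g[:t] * h[:t]).sum() / n1  # mean of class f < t
        mu2 = (g[t:] * h[t:]).sum() / n2  # mean of class f >= t
        d = _class_term(g[:t], h[:t], mu1) + _class_term(g[t:], h[t:], mu2)
        if d < best_d:
            best_t, best_d = t, d
    return best_t
```

The two dimensional criterion replaces the single sum over gray levels with double sums over regions 1 and 3 of the two dimensional histogram, so the search runs over pairs (s, t) instead of a single t.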