Image Segmentation Based on Statistical Confidence Intervals

Image segmentation is the partitioning of an image into homogeneous regions so as to transform it into something more meaningful and easier to analyze. Although several segmentation approaches have been proposed recently, in this paper we develop a new image segmentation method based on the statistical confidence interval tool along with the well-known Otsu algorithm. According to our numerical experiments, our method performs differently from the standard Otsu algorithm, especially when processing images perturbed by speckle noise; in fact, the effect of the speckle noise entropy is almost filtered out by our algorithm. Furthermore, our approach is validated on several image samples.


Introduction
Fundamentally, the image segmentation process is defined as the partitioning of an image into homogeneous regions [1][2][3][4][5]. The main objective of this process is to transform an image into something that is more meaningful and easier to analyze by extracting important features from the original image information. Therefore, this subdivision depends on the problem being solved. Nowadays, image segmentation is a fundamental stage in many engineering image pre-processing studies such as medical applications, video retrieval, pattern recognition, manufacturing inspection, structural fault diagnosis, et cetera [1,2,4,6-12].
On the one hand, several segmentation approaches using different concepts have recently been presented in the literature; to name a few, see [4,5,8,9,13-15]. Techniques based on thresholding, however, are among the simplest [3,7-9,16]. Thresholding methods can be categorized into two main classes, global and local thresholding; see, for instance, [1,7,17,18]. In addition, there is a considerable number of thresholding algorithms, mainly based on the histogram, spatial analysis, entropy, object attributes, and clustering [7]; hence image thresholding is a well-studied field. Among these is the well-known Otsu thresholding algorithm [8], which uses discriminant analysis to find the maximum separability of classes [19]. Its principal objective is thus to minimize the average error in classifying pixels among the group classes. In summary, thresholding may be viewed as a statistical analysis for decision making. Furthermore, the operation is very simple, utilizing only the zeroth- and first-order cumulative moments obtained from the histogram of the image under study. Finally, this design extends straightforwardly to multivariate thresholding (multiple thresholds), allowing an arbitrary number of classes to be identified in a reference image.
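To make the thresholding idea concrete, the following is an illustrative pure-Python sketch of Otsu's single-threshold criterion, built only from the zeroth- and first-order cumulative moments of the histogram described above. The paper itself uses Matlab's built-in multilevel variant; the function name and structure below are ours, not the paper's code.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level maximizing the between-class variance."""
    # Build the histogram and normalize it to probabilities.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    n = len(pixels)
    prob = [h / n for h in hist]

    mu_total = sum(i * p for i, p in enumerate(prob))
    best_t, best_var = 0, -1.0
    w0 = 0.0   # zeroth-order cumulative moment (class-0 weight)
    mu0 = 0.0  # first-order cumulative moment (class 0)
    for t in range(levels - 1):
        w0 += prob[t]
        mu0 += t * prob[t]
        w1 = 1.0 - w0
        if w0 == 0.0 or w1 == 0.0:
            continue  # one class is empty; separability undefined
        m0, m1 = mu0 / w0, (mu_total - mu0) / w1
        between_var = w0 * w1 * (m0 - m1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t
```

The multilevel (multivariate) case repeats the same search over tuples of thresholds; in Matlab this is provided by the built-in multithresh command.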
On the other hand, in statistics, a confidence interval is a kind of interval estimate of a population parameter that is computed from the sample data of an experiment. The confidence level is the proportion of possible confidence intervals that contain the true value of the corresponding probabilistic model parameter. It is important to highlight, however, that the interval computed from a particular sample does not necessarily include the true value of the estimated parameter. For instance, a 95 percent confidence interval corresponds to a significance level of 0.05, and 95 percent is the most commonly used confidence level. In addition, a larger sample size normally leads to a better estimate of the related population parameter. This theory is well documented in any scientific or engineering textbook on probability and statistics; see, for instance, [20].
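As a minimal worked example of the interval just described, assuming the population standard deviation σ is known and using the usual z = 1.96 for 95 percent confidence (the function name is ours):

```python
import math

def confidence_interval_95(sample, sigma):
    """95% confidence interval for the population mean,
    assuming the population sigma is known (z = 1.96)."""
    xbar = sum(sample) / len(sample)
    half = 1.96 * sigma / math.sqrt(len(sample))
    return xbar - half, xbar + half

# For 36 identical readings of 100 and sigma = 6, the half-width
# is 1.96 * 6 / sqrt(36) = 1.96, giving roughly (98.04, 101.96).
```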
Lastly, ultrasonic and SAR images, for instance, have the main disadvantage of being contaminated by speckle noise [1,2,21,22]. Speckle noise is usually generated by the interaction of reflected waves due to the interference of the returning wave at the transducer aperture, and its existence significantly degrades the quality of an acquired image [21-23]. Thus, the development of new techniques to face this external sensor noise perturbation is still an open research topic.
The novelty of this paper lies mainly in the development of a new image segmentation method that invokes confidence interval theory together with the well-known Otsu segmentation algorithm. According to our numerical experiments, our method performs differently from the standard Otsu approach without our pre-processing stage, and this difference is especially noticeable when the image is contaminated by speckle noise. Furthermore, our approach is also validated on several image samples.
The rest of the paper is structured as follows. Section 2 presents our main contribution on image segmentation using the previously cited statistical theory. Section 3 then gives the main theoretical discussion of our approach. The materials and methods are given in Section 4. Finally, Section 5 states the conclusions.

Image Segmentation Based on Statistical Confidence Interval Approach
In this section we present our main contribution. In order to compare it against a well-established method, we proceed as follows. First, the standard image segmentation process is shown in Figure 1. It is important to note that noise perturbation affects the performance of image segmentation based on the Otsu method (see, for instance, [7]). The multivariate threshold block shown there is in fact the image segmentation stated by Otsu; the whole resulting process is here called the reference method. Second, our approach is shown in Figure 2. Compared with the previous figure, we have added our image pre-processing stage; the multivariate threshold segmentation block is exactly the same as in the standard method. This Otsu multivariate thresholding method is easily implemented using, for example, Matlab's specialized image-processing commands. In our numerical experiments, we use three-level Otsu thresholding. Our enhancement algorithm is then set as follows (see Figure 2):

1. Obtain the standard deviation σ of the entire image population, considering each pixel intensity as a point-wise datum.
2. Separate the original image into n_p sub-images. See Figure 3.
3. For each sub-image A_k(i, j), for k = 1, 2, ..., n_p, where (i, j) are the Cartesian coordinates of a pixel and A_k(i, j) is the related gray-scale pixel intensity, obtain its sample average value x̄, again considering each pixel intensity as a point-wise datum.
4. For each pixel of the sub-image, if A_k(i, j) lies inside the 95 percent confidence interval of Equation (1), keep its value; otherwise, assign the sample average x̄ to it (see Remark 1).
5. Reconstruct the pre-processed image by composing the resulting sub-images.
6. Apply the resulting image to the Otsu multivariate threshold process to obtain the final result. See Figure 2.

The above algorithm is easy to implement using, for instance, Matlab programming; in fact, we use it in our numerical image experiments. The final image segmentation was then colored to ease human interpretation. In these experiments, we apply both our method and the standard one; a discussion of the obtained results is given in the section that follows. Finally, we have assumed a confidence interval for the population mean (expected value) µ under a normal distribution with known σ², namely the 95 percent confidence interval

x̄ − 1.96 σ/√n ≤ µ ≤ x̄ + 1.96 σ/√n, (1)

where n is the sub-image sample size. This last expression was employed in our computational algorithm. Obviously, we have implicitly assumed that our sample dataset size is large enough for the Central Limit Theorem to hold. Some numerical experiments are shown in Figures 4-9. Next are some remarks.
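The pre-processing stage above can be sketched in a few lines. The following pure-Python version is our own illustration, not the paper's Matlab code, and the direction of the conditional follows our reading of Remark 1: pixels falling outside the sub-image's 95 percent confidence interval are replaced by the sample average x̄, while pixels inside it are kept.

```python
import math

def ci_filter(image, block, sigma, z=1.96):
    """Confidence-interval pre-processing sketch (our illustration).

    image: gray-scale image as a list of lists of intensities.
    block: side length of the square sub-images.
    sigma: standard deviation of the entire image population.
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            # Gather the sub-image A_k and its sample average x̄.
            vals = [image[r][c]
                    for r in range(r0, min(r0 + block, rows))
                    for c in range(c0, min(c0 + block, cols))]
            xbar = sum(vals) / len(vals)
            half = z * sigma / math.sqrt(len(vals))
            lo, hi = xbar - half, xbar + half
            # Pixels outside the CI are replaced by the
            # representative value x̄ (Remark 1); others are kept.
            for r in range(r0, min(r0 + block, rows)):
                for c in range(c0, min(c0 + block, cols)):
                    if not (lo <= image[r][c] <= hi):
                        out[r][c] = xbar
    return out
```

The filtered image would then be passed to the multilevel Otsu thresholding step as in Figure 2.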

Remark 1.
In Step 4 of our algorithm, we assign the value x̄ to the corresponding pixel when the conditional is not satisfied. Obviously, other options are possible, for instance, using max{A_k(i, j)} or min{A_k(i, j)}. However, we decided to use the average value x̄ as a representative of the studied sub-image sample.

Remark 2.
A key assumption of Otsu's method is a Gaussian noise perturbation in the image. Because we invoke statistical confidence interval theory, by the Central Limit Theorem, if X̄ is the sample mean random variable of a given dataset of size n sufficiently large (usually n ≥ 30), then the confidence interval analysis can be carried out with the standard normal distribution even though speckle noise, or another perturbation, is in fact present in our image dataset.

Discussion
In the previous section we presented our new algorithm for image segmentation based on statistical confidence interval theory. According to our numerical experiments, our method performs better than the standard process. For instance, in the first example, our processed image was classified into segments more efficiently than with the standard method. In the second example, it is clear that there is something around the object center that is almost imperceptible to the standard method. Finally, Equation (1) was invoked because, according to statistical theory, the confidence interval gives a zone where the population mean is probably located. In our philosophy, this interval reflects the location of the high-content information versus the noise entropy.
On the other hand, it is interesting to compare our approach against at least one other image segmentation technique. To this end, we programmed a binary segmentation process using the Matlab adaptive image thresholding command adaptthresh with a sensitivity of 0.4, followed by binary image segmentation with the Matlab statement imbinarize. Using the speckle-noisy image shown in Figure 10, the obtained result, along with ours, is depicted in Figure 11. Even with this alternative binary segmentation technique, the presence of the speckle noise persists. This validates the superiority of our approach. Moreover, accepting the fact that there is still no fully satisfactory performance index for an image segmentation process [13], we first use the subjective evaluation criterion [13]. However, it is important to note that, even though this criterion is widely used, an untrained subjective evaluation may lead to a mistaken conclusion [13]. Applying the subjective evaluation measure to our processed noisy images, our technique, according to our numerical experiments, presents an acceptable performance. On the other hand, the Jaccard similarity coefficient has also been extensively used for the performance evaluation of image segmentation methods; recall that the Dice similarity coefficient is related to the Jaccard one. Essentially, the Jaccard coefficient, also known as the Tanimoto coefficient, measures the overlap of two given sets, or two images in logical format, where a value of zero indicates no overlap and a value of one indicates perfect agreement. Hence, if two segmentation processes are to be compared, the one with the larger Jaccard coefficient (closer to one) has the better performance.
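The Jaccard comparison just described reduces to an intersection-over-union count. A minimal pure-Python version for two equal-size binary (logical-format) images, as our own illustration, is:

```python
def jaccard(a, b):
    """Jaccard (Tanimoto) coefficient of two same-size binary images,
    given as nested lists of 0/1 values; 1.0 means perfect agreement."""
    inter = union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            inter += pa & pb  # pixels set in both images
            union += pa | pb  # pixels set in either image
    # Two all-zero images are identical, so we return 1.0 in that case.
    return inter / union if union else 1.0
```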
Therefore, to validate the performance of our segmentation algorithm against the reference method, we proceed as follows with the Jaccard similarity coefficient. Given the original image shown in Figure 12, we add speckle noise to it; see Figure 13. Using this noisy image, the processed segmentation results are depicted in Figure 14. We then transform the clean image into a binary image (logical format) by invoking the Matlab command im2bw with a threshold level of 0.5; we call it Image A. After transforming the colored images shown in Figure 14 into gray-scale images, we also convert them to binary format with the same command and the same threshold level; we call them Image B and Image C, respectively. That is, Image B is the binary image resulting from our processed image, and Image C comes from the corresponding reference method. We then compare the Jaccard similarity between Image A and Image B against that between Image A and Image C, obtaining 0.6325 and 0.6215, respectively. Hence, our approach has the better performance. Additionally, as suggested by a reviewer, we carried out an extra experiment using Gaussian noise instead of speckle noise; the obtained results are shown in Figure 15 for the sample image given in Figure 16. Finally, and thanks to a reviewer observation, it is interesting to note that binning and averaging an image into sub-image samples reduces the resolution of each sub-image, and consequently of the whole image, which facilitates the segmentation process. This is not entirely the case in our approach, since our technique does not always produce this resolution reduction; see Step 4 of our algorithm.

Materials and Methods
In this section, we present the realization of our image segmentation algorithm using Matlab programming (2017a, MathWorks, Natick, MA, USA). Basically, we employ this software platform because it requires little programming experience beyond some familiarity with the environment. Matlab is a powerful software package with built-in functions for a diverse range of tasks, from mathematical operations to image processing, and it provides a complete set of programming routines that allows users to customize programs to their own specifications. Furthermore, because we are dealing with image processing following a well-established algorithm, the best way to describe the computational source material is to provide the programming lines employed to generate our numerical experiments. These are shown in the Appendix.
The second part of Materials and Methods consists of appropriately selecting the test images used to assess our image segmentation method. Principally, we are interested in images that have some salient visual property, especially for human visual perception. Therefore, we carefully selected images from different engineering fields: infrared, medical, military, and cosmic images.
Finally, there is the computational resource. Because we employ the Matlab software with its image processing toolbox, we require a modern computer; we used a 64-bit Windows 7 machine with Matlab version 2017a. Since our images are small, the computation time is short, which gives us the flexibility to run many numerical experiments.

Conclusions
In this paper we have proposed a new image segmentation method that uses a pre-enhancement stage based on a statistical confidence interval. According to our numerical experiments with speckle-noisy images, our approach performs better than the standard Otsu method; indeed, the effect of the speckle noise entropy is almost filtered out by our algorithm. In our design, we implicitly invoked the Central Limit Theorem, which requires a sufficiently large sample size. If the sample size is small, we can instead build the statistical confidence interval with the Student's t probabilistic model; contrasting both methods from a performance point of view may be interesting and is left as future work. On the other hand, the Central Limit Theorem requires that the population probabilistic model has a well-defined mean and variance. There are situations where this assumption is not satisfied, for instance, when the population follows the Cauchy distribution (as was observed by a reviewer), where these values are undefined; applications where this distribution is present are also an open research topic. Moreover, from the academic point of view, a programming platform competing with Matlab is Python, since Python is an open-source platform; it would therefore be interesting to port the given Matlab code to this open-source programming option. Additionally, all our numerical experiments use gray-scale images; the extension of our method to color images is obviously an open inquiry as well. Finally, exploring the main property of the Jaccard coefficient as a performance index for image segmentation processes, and because we are using random data due to the noise perturbation, the results also suggest a new performance index from the statistical point of view.