1. Introduction
Edge detection methods are widely used in image and video processing and in computer vision, in particular for object detection, classification, and recognition. This is because, from the point of view of the Human Visual System, edges are the most important features present in images [1]. There are many approaches to edge detection, ranging from simple image filtering methods, which are extremely fast, through more advanced geometrical methods, which are used in shape representation, to machine learning-based methods, especially deep convolutional neural networks, which require prior training on data [2].
The main obstacle to efficient edge detection is that existing edge detection methods work under the global assumption that an edge is a step edge, that is, a sharp change of image intensity. In practice, still images contain edges of different levels of sharpness and smoothness. The most representative case is a focused image, which has a sharp foreground and a smooth background. Treating all edges in the same global way is therefore inefficient for such an image. To overcome this problem, a method that works locally has to be applied, in the spirit of the well-known maxim “Think globally, act locally”. By working locally, such a method can adaptively adjust the definition of an edge to the local background: in sharp regions it detects sharp edges, while in smooth regions it detects smooth edges.
In this paper, we address the problem of edge detection in focused images. In such images, it is difficult to detect both kinds of edges simultaneously: the existing methods either miss smooth edges in the background or detect too much noise in the foreground. We propose an approach that detects all edges in focused images efficiently. The efficiency follows from the fact that the proposed method works locally and adaptively detects edges inside a sliding window. The method is based on the k-Means algorithm, well known in Machine Learning. Using the k-Means algorithm locally gives a good balance between the speed of the pointwise methods and the efficiency of the CNN-based methods. To apply the k-Means algorithm locally, we propose an enhanced version of image filtering in which an image is filtered not pixel by pixel but area by area. The proposed edge detection algorithm is compared to state-of-the-art methods taken from different classes of edge detection methods.
This paper is organized as follows. In Section 2, the related work is presented. In Section 3, the problem statement is posed. In Section 4, the new method is proposed. In Section 5, the experimental results of edge detection are presented. Finally, Section 6 concludes this paper.
2. Related Work
Edge detection methods have a long history. The best-known class of edge detection methods is based on the pointwise approach. Its classical representatives are the Roberts algorithm [3], the Prewitt operator [4,5], the Sobel method [6,7], and the most commonly used Canny algorithm [8]. These methods are mainly defined as filters, usually based on the first or the second derivative of the function representing an image. These derivatives are often combined with pre- or postprocessing methods or are improved in different ways [9,10,11,12,13]. These algorithms are fast but are not noise resistant.
Another class of edge detection methods is represented by the geometrical approach. The best known is the method based on the Hough transform [14]. Another approach is based on moment computation [15,16]. More recently, a new method based on multiresolution geometrical edge detection was developed [17,18,19]. All the geometrical methods treat edges as line or curve segments. As a result, they are rather slow in comparison to the pointwise methods, but they treat shapes in a geometrical way and are noise resistant. Thanks to that, these methods can be used as feature extractors in object recognition.
A quite different class of methods is based on the Machine Learning approach. The best-known algorithms are random forests [20,21,22] and the k-Means algorithm [23,24,25]. There are also different variations of such methods [26,27]. These methods are efficient and relatively fast. They are more sophisticated than pure filtering methods since they analyze the image content.
Finally, many recent algorithms are based on Deep Learning with convolutional neural networks (CNN). The first attempt to change the classical approach from bottom-up to top-down multiscale edge detection was DeepEdge [28]. Another well-known algorithm is HED [29], which is also multiscale. A number of different CNN-based approaches were then proposed [30,31,32,33,34]. Recently, a simple, lightweight, and efficient algorithm based on the Pixel Difference Network was proposed [35]. All CNN methods are based on feature detection. Thus, they can detect edges in a more intelligent way than simple pointwise algorithms. However, these methods require a huge amount of training data in advance, and the training is time-consuming.
4. Edge Detection by the Local k-Means
In this section, we propose an efficient method for edge detection in focused images, based on the local use of the k-Means algorithm. Although the idea of applying the k-Means algorithm locally through a sliding window has already been proposed [37], it was performed quite differently and was dedicated to specific tasks such as text document analysis.
As shown in the previous section, applying the k-Means algorithm to the whole image does not lead to satisfactory edge detection results. Therefore, we propose to apply k-Means locally via a filtering window that moves through the image like a typical filter, and to compute k-Means within this local window. However, unlike in the classical filtering process and in the state-of-the-art methods, each position of the filter yields a subimage instead of a single pixel.
In more detail, to compute the convolution of an image with the typical filter of size
pixels, see
Figure 4a, we apply the filter pixel by pixel to the image (in the horizontal and then the vertical direction) and compute the new pixel’s value each time, taking into account the values within this
pixels window. However, in our case, we apply the filter to the given square area, see
Figure 4b. In other words, we divide the given image into subimages of size e.g.,
pixels and for each subimage we apply the area mask (of size, e.g.,
pixels) and compute the k-Means algorithm, within this
pixels window. Then we draw the result within just the
pixels area, which is the considered subimage. As in classical filtering, the proposed method has to handle mask pixels that fall beyond the image during filtering. We solve this problem by reducing the mask’s size to the area size in border squares.
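The relation between each area and its mask can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `area_and_mask_bounds`, the `margin` parameter (the number of pixels by which the mask extends beyond the area on each side), and the sizes in the usage line are hypothetical. The border rule follows one literal reading of the text above, namely that a border square uses the area itself as its mask.

```python
def area_and_mask_bounds(height, width, area, margin):
    """Yield (area_box, mask_box) pairs covering an image of the given size.

    Boxes are (top, bottom, left, right) with exclusive bottom/right.
    `margin` is an assumed parameter: how far the mask extends past the area.
    """
    for top in range(0, height, area):
        for left in range(0, width, area):
            bottom = min(top + area, height)
            right = min(left + area, width)
            area_box = (top, bottom, left, right)
            mask_box = (top - margin, bottom + margin,
                        left - margin, right + margin)
            inside = (mask_box[0] >= 0 and mask_box[1] <= height
                      and mask_box[2] >= 0 and mask_box[3] <= width)
            # Border squares (assumed reading): the mask is reduced to the area.
            yield area_box, (mask_box if inside else area_box)


# Hypothetical sizes, chosen only for illustration.
boxes = list(area_and_mask_bounds(height=32, width=32, area=8, margin=4))
```

With these illustrative sizes, the top-left square is a border square and keeps a mask equal to its area, while an interior square receives a mask enlarged by the margin on every side. Clipping the mask to the image instead of shrinking it to the area would be an alternative border rule.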
The proposed method is summarized in Algorithm 1. The initial values of the segments and means are fixed in lines 1–2. Lines 4–12 present the classical k-Means algorithm. The segmentation is iterated according to the means (lines 6–7), and the means are updated in each step (lines 8–9). The algorithm stops when the error, measured as the Mean Square Error (MSE) between the original image and the segmented one (line 11), is small enough. The local application of this algorithm is defined in lines 13–17. The image is divided into adjacent subsquares of size (lines 13–14). Then, for each such subsquare, the accompanying mask is defined of size (line 15). Next, the k-Means algorithm is computed for the image within the mask (line 16). Finally, just the small subimage of size is drawn as segmented (line 17). Note that, depending on the image, the sizes of the filtering window and the considered area can be adjusted. When these sizes are the same, the method reduces to a simple application of the k-Means algorithm in square subimages of the considered image.
Algorithm 1: The local k-Means edge detection.
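The steps described for Algorithm 1 can be sketched in a few lines of plain Python. This is a minimal sketch under several assumptions and should not be read as the authors' listing: the image is a grayscale list of rows, the means are initialised evenly over the local intensity range, the names `kmeans_1d` and `local_kmeans` and the parameters `max_mse`, `max_iter`, and `margin` are hypothetical, and border squares use the area itself as the mask. The sketch stops at the local segmentation; how edges are extracted from the segmented image is not covered here.

```python
def kmeans_1d(values, k, max_mse=0.0, max_iter=50):
    """Classical k-Means on scalar intensities; returns the list of k means."""
    lo, hi = min(values), max(values)
    # Assumed initialisation: means spread evenly over the intensity range.
    means = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(max_iter):
        # Segmentation step: assign each value to its nearest mean.
        labels = [min(range(k), key=lambda i: abs(v - means[i])) for v in values]
        # Update step: recompute each mean from its members.
        new_means = list(means)
        for i in range(k):
            members = [v for v, j in zip(values, labels) if j == i]
            if members:
                new_means[i] = sum(members) / len(members)
        # MSE between the original values and their segmented versions.
        mse = sum((v - new_means[j]) ** 2
                  for v, j in zip(values, labels)) / len(values)
        converged = new_means == means
        means = new_means
        if mse <= max_mse or converged:
            break
    return means


def local_kmeans(image, k, area, margin):
    """Segment `image` area by area, running k-Means inside each area's mask."""
    height, width = len(image), len(image[0])
    segmented = [[0.0] * width for _ in range(height)]
    for top in range(0, height, area):
        for left in range(0, width, area):
            bottom, right = min(top + area, height), min(left + area, width)
            pad = margin
            if (top - pad < 0 or left - pad < 0
                    or bottom + pad > height or right + pad > width):
                pad = 0  # border square: mask reduced to the area (assumption)
            values = [image[y][x]
                      for y in range(top - pad, bottom + pad)
                      for x in range(left - pad, right + pad)]
            means = kmeans_1d(values, k)
            # Only the area itself is drawn, using the means found in the mask.
            for y in range(top, bottom):
                for x in range(left, right):
                    segmented[y][x] = min(
                        means, key=lambda mean: abs(image[y][x] - mean))
    return segmented


# Synthetic 24 x 24 image with a vertical step edge at column 6.
step = [[10.0 if x < 6 else 200.0 for x in range(24)] for _ in range(24)]
segmented = local_kmeans(step, k=2, area=8, margin=4)
```

Plain lists are used only to keep the sketch dependency-free; an array library would be the natural choice for real images. Setting `margin=0` corresponds to the case noted above in which the mask and the area have the same size.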
5. Experimental Results
In this section, we present the experimental results of edge detection. Figure 5 shows the tested benchmark images [38]. These images were resized to
pixels to make the computations easier. However, the proposed algorithm can be applied to images of any size.
To fix the optimal parameters used in the proposed method (i.e., the number of segments, the size of the area window, and the size of the mask window), a number of experiments were performed. We show them for a sample image “Ladybird”.
The edges detected by the proposed local k-Means algorithm for different sizes of the area window (i.e., subimage) are presented in
Figure 6. In this example, the number of segments was fixed at
. From these images, one can see that the best visual results are obtained for the size equal to
pixels. Smaller areas (especially
) produce a noise effect. So, for all our experiments, we fixed the area size as
pixels.
The edges detected by the proposed algorithm for different numbers of segments k are shown in Figure 7. From these images, one can see that the best results are obtained for three or four segments. In further experiments, we use three segments.
The final test was made to find the optimal size of the mask. The edges detected by the proposed algorithm for different mask sizes are shown in Figure 8. The size of the area was fixed as
pixels. From these images, one can observe that the optimal results are obtained for the mask’s size
or
pixels.
From the experiments performed, one can conclude that the optimal size of the subimage and the optimal number of segments can be fixed globally for all tested images. In the case of the mask’s size, however, the best results are obtained, depending on the image, for sizes oscillating somewhere between and pixels. Nevertheless, fixing this parameter for all images also gives good results.
Usually, edge detection methods are tested for noise resistance. In our case, however, this test can be skipped for two reasons. Firstly, we are interested in focused images, and this kind of image is noise-free by definition. Secondly, even if noisy images were to be considered, there are a number of methods that efficiently remove noise from k-Means clustering, e.g., [39], that can be applied.
Finally, we compared the proposed method to the state-of-the-art ones. The reference methods are Canny, wedgelets2 [19], and global k-Means. The Canny and k-Means algorithms are classical and work globally and pointwise (although the latter works in a more intelligent way than the former). The wedgelets2-based method, in turn, was proposed as a geometrical method that can be used for object detection or recognition. It is local and is based on the local window mechanism. These reference algorithms were chosen as the best methods representing different approaches (i.e., pointwise, geometrical, and ML-based). We excluded CNN-based methods from the experiments since they need a huge training dataset and time-consuming training.
In all the methods tested, the optimal parameters were fixed. For Canny, the thresholds were fixed as 100 and 150, as this is the best compromise between the lack of noise in the foreground and the accuracy of the background edges. For wedgelets2, the second-order wedgelets were used. This method works in a geometrical way, so it can detect background edges better than the Canny method. For the k-Means algorithm, the number of segments was set to 8 to find a compromise between the accuracy of background and foreground edge detection.
To show the advantages of the proposed method, we first show a simple artificial example from Figure 5f. In this example, one of the balls is in motion, so it is not focused. Figure 9 presents the results of edge detection for this image by different methods. From these results we can see that: (1) the Canny method cannot detect the object in motion properly (the edges are too smooth to be caught by this method); (2) the wedgelets2-based method produces edges in the form of small lines or curves, not necessarily connected; (3) the k-Means algorithm detects the shadows on the table, which follows from its global operation and cannot be avoided without significant degradation of the balls; (4) the proposed method seems to overcome all the drawbacks of the former methods. Indeed, it detects smooth edges as well as sharp ones, it produces nearly continuous edges, and thanks to its locality, it avoids detecting nonexistent edges such as shadows.
Figure 10 shows the results of edge detection by different methods for natural still images. From the presented results, one can see that the proposed local k-Means algorithm works visibly better than the reference algorithms. For this method, all edges are just edges, no matter how smooth they are.
In the above comparisons, we have limited ourselves to a visual assessment of the edge detection accuracy. The reason is that the compared methods come from different classes of edge detection techniques. Each class has specific characteristics and needs a different approach to the objective evaluation of edge detection efficiency, because different edge definitions are used. In the case of pointwise methods, a detected edge is a pure set of points, whereas in the case of geometrical methods, a detected edge is a segment of a line or curve. The second point is that for sharp edges it is relatively easy to decide what is an edge and what is not. Once smooth edges are introduced, it is hard to state clearly whether we still deal with an edge, a texture, or something else. It depends on the application and the user’s needs. However, to show numerically that the local use of k-Means improves edge detection, we present a comparison of the image segmentation results between the original global k-Means algorithm and the proposed local use of it. As one can see from Table 1, the proposed method gives better-quality image segmentation in the Peak-Signal-to-Noise-Ratio (PSNR) sense. Since the proposed method segments images more accurately than the k-Means algorithm, we conclude that it also represents the detected edges better than the original method.
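For reference, the PSNR between an original image and its segmented version can be computed as in the sketch below. It uses the standard definition, ten times the base-10 logarithm of the squared peak value divided by the MSE. The function name `psnr`, the default peak value of 255 (8-bit intensities), and the tiny sample images are assumptions made for illustration; the source does not state which peak value was used.

```python
import math


def psnr(original, segmented, peak=255.0):
    """Peak Signal-to-Noise Ratio (in dB) between two equally sized images."""
    count = sum(len(row) for row in original)
    mse = sum((a - b) ** 2
              for row_o, row_s in zip(original, segmented)
              for a, b in zip(row_o, row_s)) / count
    # Identical images have zero error, so the ratio is unbounded.
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Illustrative 2 x 2 images that differ in a single pixel.
reference = [[0, 255], [255, 0]]
approximation = [[0, 255], [255, 10]]
value = psnr(reference, approximation)
```

A higher value means that the segmented image is closer to the original, which is the sense in which the segmentations are compared here.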
Let us note that the time complexity of the regular k-Means algorithm for an image of size pixels is , where k is the number of segments. The local k-Means algorithm performs the same computation as k-Means, but on image subsquares with added margins. When the margin’s size is fixed as 0, the computation time for the local k-Means is the same as for the k-Means method with the same number of segments. In that case, the proposed version is faster because it uses only 3 segments instead of 8. On the other hand, when the margin’s size is fixed to be as large as the square size, the mask is 9 times larger than this square. So, the computation time is and the time complexity is , the same as for the k-Means algorithm. In practical applications, the computation times of these two methods are comparable.
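The factor of nine can be checked with a short count of the pixels visited in one pass. The sketch below is an assumption-laden illustration: the name `mask_work_ratio` and the sizes are hypothetical, the image side is assumed to be divisible by the area size, and the reduction of the mask in border squares is ignored, so the result is an upper bound.

```python
def mask_work_ratio(n, area, margin):
    """Pixels visited by all masks in one pass, relative to the n x n image."""
    squares = (n // area) ** 2               # number of areas in the image
    mask_pixels = (area + 2 * margin) ** 2   # pixels inside a single mask
    return squares * mask_pixels / (n * n)


# Hypothetical sizes: a margin as large as the area triples each mask side.
ratio_full_margin = mask_work_ratio(n=256, area=16, margin=16)
ratio_no_margin = mask_work_ratio(n=256, area=16, margin=0)
```

Under these assumptions the ratio is 9 when the margin equals the area size and 1 when the margin is 0. It is a constant that does not depend on the image size, which is why the margin does not change the order of the time complexity.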
6. Summary
Unlike real images, for which edge detection is relatively well implemented these days, focused images can be troublesome. This applies not only to truly focused images but also to images with edges of different levels of sharpness, which also occur when one or a few objects are in motion. In this paper, we presented a new algorithm that can efficiently detect both sharp and smooth edges in focused images. The commonly used methods are defined for step edges only and usually cannot cope with smooth edges properly. The proposed method was designed to overcome these limitations by applying the k-Means algorithm locally. As a result, the algorithm adapts to the image content, so it can correctly detect an edge no matter whether it is sharp or smooth. The proposed method was compared to reference methods chosen to be quite different: the classical pointwise one, the modern geometrical one, and the intelligent ML-based one. In all cases, the proposed method gave better edge detection results than the existing ones.
It is worth mentioning that, although there are plenty of edge detection methods in use, it is not so important to which methods we compare our results, since all of them are defined on the assumption that an edge is a step discontinuity. In this paper, we assumed that an edge can also be smooth and has to be detected as well. This is the main strength of the proposed method. However, this approach raises a number of open problems concerning the definition of edges with varying smoothness and the proper evaluation of the detection of such edges. Therefore, our future work is to build a model of an edge with varying smoothness and to find a way to evaluate such edge detection objectively.