Dilated filters for edge detection algorithms

Edges are a basic and fundamental feature in image processing, used directly or indirectly in a huge number of applications. Inspired by the growth of image resolution and processing power, dilated convolution techniques have appeared. Dilated convolutions have shown impressive results in machine learning; here we discuss the idea of dilating the standard filters used in edge detection algorithms. In this work we bring together our previous and current results by replacing the classical convolution filters with dilated ones. We compare the results of the edge detection algorithms using the proposed dilated filters against the original filters and custom variants. Experimental results confirm our statement that dilating the filters has a positive impact on edge detection algorithms, from simple to rather complex ones.


Introduction
An edge is the most basic feature of an image and has been intensively researched over time. A huge variety of mathematical methods have been used to identify points at which the image brightness changes sharply or has discontinuities. Edge detection is a low-level technique used for object boundary detection. It is a fundamental tool in image processing, image analysis, machine vision and computer vision, particularly in the areas of feature detection and feature extraction.
There are numerous approaches to basic feature detection, depending on the pixel properties of the image. Methods have been defined from gray-scale levels to color slicing, and from local features (lines, shapes) to global matching features (the shape of objects, meta object properties). The standard local edge detection filters are built for highlighting intensity change boundaries in the near neighborhood of image regions. As previously mentioned by Sen and Pal (2010), the problem of edge finding has no universally accepted solution, which motivates ongoing research into improving edge detection methods.
The most successful edge detection algorithms have relied on local methods, where the closest neighbourhood of a pixel is considered important for the pixel itself. Nowadays, images contain more information than in the past (due to sensor technologies), and we could say that a pixel is no longer similar only to its direct neighbours; a bigger neighbourhood could be important. The idea of extension was present from the early definition of the morphological edge detection operators Haralick et al. (1987). From another perspective, dilated convolution methods have recently proven very beneficial in many highly cited computer vision papers: for small object detection Hamaguchi et al. (2018), in dense prediction tasks Chen et al. (2018); Yazdanbakhsh and Dick (2019), for prediction without losing resolution Yu and Koltun (2016), for feature classification in time series Yazdanbakhsh and Dick (2019) and for context aggregation Zhao et al. (2017).
Merging these ideas, we propose not to increase the filter size but to simply dilate the standard edge detection filters in order to characterize a pixel through the properties of its non-direct neighbours. We define the dilation operation for filters, which consists of simply adding gaps to the well-known classical filters. This method of dilation is neither dilation in the mathematical morphology sense Haralick et al. (1987), nor the geometric extension of the kernels discussed in the literature Bogdan et al. (2020).
In our previous work we evaluated the dilation effect on classical first order edge detection algorithms and on the classical Canny (1986) algorithm, followed, naturally, by an analysis of which approach is better: to expand from a lower level or to dilate Bogdan et al. (2020); Orhei et al. (2020a). In this paper we extend our work to other edge detection algorithms that have the use of edge detection kernels as a defining step. The classical edge detection methods Sobel Sobel and Feldman (1973), Prewitt Prewitt (1970) and Scharr Scharr (2000), together with more complex boundary detection algorithms like Canny Canny (1986), Marr-Hildreth Marr and Hildreth (1980) or Shen-Castan Castan et al. (1990), are considered for testing with our proposed dilation technique.
To evaluate the dilation benefits over an edge detection filter, we do not limit our analysis to the first order derivative gradient-based edge detection filters but consider the second order filters too. Details regarding the algorithms and the steps we used to obtain the same edge map format are presented in Section 3. Section 4 highlights the results supporting our hypothesis of dilating the filters rather than expanding them. For a better comparison of the results, we used the challenging BSDS500 boundary detection benchmark tool and image sets from Arbelaez et al. (2011).
From the experiments of our previous work and from this analysis it is clear that in most cases the dilated approach brings benefits to the resulting edge map, an idea that is elaborated in Section 5.

Dilated filters
In our previous work Bogdan et al. (2020) we explored the idea of dilating the filters (kernels) of edge operators to obtain better edge maps. In order to benefit from a larger neighbourhood of a pixel when deciding whether it is an edge, we define the dilated filter as in Definition 1. When we dilate the kernels, we consider the newly added positions as gaps and ignore them by setting them to zero.
Definition 1 A dilated filter is obtained by expanding the original filter by a dilation factor/size Bogdan et al. (2020).
By dilating the kernels we increase the distance between the important pixels. We consider that this new distance will positively influence the result of the convolution. The bigger resulting region of interest can translate into stronger intensity changes in the image. In order to illustrate our definition, we use a generic kernel and represent its dilation in Figure 1. Dilating the filters, rather than extending them, helps in finding more edge pixels than the standard or extended variants of the filters. Another benefit of dilating worth mentioning is that the number of operations does not increase with the dilation factor, resulting in the same time cost for the edge detection Bogdan et al. (2020).
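To make Definition 1 concrete, a dilated kernel can be built by spreading the original coefficients over a larger grid and filling the new positions, the gaps, with zeros. The sketch below is illustrative (the helper name and the NumPy representation are ours, not taken from the EECVF implementation):

```python
import numpy as np

def dilate_kernel(kernel, factor):
    """Spread the original coefficients over a larger grid, leaving
    `factor` zero rows/columns (gaps) between neighbouring entries."""
    k = np.asarray(kernel, dtype=float)
    n = k.shape[0]
    size = n + (n - 1) * factor        # a 3x3 kernel with factor 1 becomes 5x5
    dilated = np.zeros((size, size))
    dilated[::factor + 1, ::factor + 1] = k
    return dilated

# The classical Sobel Gx kernel dilated by a factor of 1
sobel_gx = np.array([[-1, 0, 1],
                     [-2, 0, 2],
                     [-1, 0, 1]])
dilated_sobel = dilate_kernel(sobel_gx, 1)
```

Note that the number of non-zero coefficients, and hence the number of multiplications in the convolution, stays the same regardless of the dilation factor, which is why the time cost does not grow.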
Another approach was presented in Orhei et al. (2020a), where we compared and analyzed the dilation of filters defined in Bogdan et al. (2020) against reconstruction from a lower pyramid scale level. Feature extraction at a lower pyramid scale level is common practice in the domain because of the lower computation resources needed.
The resulting edge map from a dilated 3x3 filter is equivalent to an edge map calculated at a lower pyramid scale level and expanded back to the original size. Dilating by a factor of one is similar to applying the same filter at the immediately lower pyramid scale level; dilating by a factor of two is similar to applying the filter two pyramid levels lower, and so on. This hypothesis holds because in both cases the region we take into consideration to find edges is no longer a 3x3 matrix but a 5x5 matrix.
We examined the equivalence between lower level processing and dilating in our previous work Orhei et al. (2020a). As expected, we obtained similar results when dilating as when processing at lower levels. Dilating brings benefits regarding the considered neighbourhood and the computation time, but extracting features at lower levels has benefits of its own.
The goal of this work is to compare the results of the dilated filters with those of the original filters. In the next subsections we briefly present the classical operators and algorithms for edge detection that we used to show the results of our dilation technique.

Preliminaries
In this section we present all the edge detection algorithms considered in our analysis. We present the classical steps of each approach and the kernels considered in each case. In all algorithms the input images are converted to gray-scale. The algorithms are presented in chronological order.
In order to benchmark our results with BSDS500 Arbelaez et al. (2011), all the resulting edge maps need to have pixel values between 0 and 255 and a thickness of 1 pixel. To achieve this, we use different techniques for the algorithms based on first order gradient operators than for those based on second order gradients. Different approaches are needed because of the different raw edge maps that result. All the necessary steps are described in each subsection, in the pseudo-code of the corresponding algorithm. In some cases, multiple operators share the same post-processing algorithm to obtain the desired edge map.

First Order Derivative Orthogonal Gradient operators
First Order Derivative Orthogonal Gradient Operators are the most basic operators and have been extensively researched over the decades. We consider in our analysis the following edge detection operators and their extensions: the Pixel Difference operator Mlsna and Rodriguez (2009), the Separated Pixel Difference operator Mlsna and Rodriguez (2009), the Sobel operator Sobel and Feldman (1973) and its extensions to 5x5 or 7x7 kernels Bandai et al. (2003); Lateef (2008); Kekre and Gharge (2010); Levkine (2012); Gupta and Mazumdar (2013), the Prewitt operator Prewitt (1970) and its extensions to 5x5 or 7x7 kernels Lateef (2008); Levkine (2012), the Kirsch operator Kirsch (1971) and its 5x5 kernel expansion Bandai et al. (2003), the Kitchen and Malin operator Kitchen and Malin (1989), the Kayyali operator Kawalec-Latała (2014), the Scharr operator Scharr (2000) and its extension to a 5x5 kernel Levkine (2012); Chen et al. (2017), the Kroon operator Kroon (2009) and the Orhei operator Orhei et al. (2020c). All the kernel masks for the operators are presented in Figure 25, Figure 27 and Figure 30 from Appendix 5. All these operators are orthogonal discrete isotropic filters, so we represent only one of the kernels; to obtain the other kernels we just need to rotate it by a fraction of π/2. The gradient is a measure of change in a function, and an image can be considered an array of samples of some continuous function of image intensity, so the gradient is the two-dimensional equivalent of the first derivative. The magnitude is calculated using Equation 1, where f(x, y) is the image and Gx, Gy are the components on the x and y axes. The direction of the gradient is calculated using Equation 2 Haralick and Shapiro (1992).
The result of this algorithm is an edge map formed by edges that are typically not 1 pixel wide, an aspect that can distort the result of our evaluation. For a better evaluation we therefore choose to thin the resulting edges beforehand. To this algorithm we add two steps that are commonly used, see Woods (2011): smoothing and thresholding. Smoothing the images is a common practice for enhancing the results, while thresholding eliminates the "weak" edges that are found. The steps of our proposed algorithm, detailed in Algorithm 1, are: 1) apply Gaussian filter smoothing; 2) apply the convolution with the kernels: for the gradient magnitude use the orthogonal discrete isotropic kernels rotated by π/2 (Figure 25) with Equation 1, for the compass magnitude rotate the kernel by fractions of π (Figure 26) and apply Equation 3, and for Frei-Chen use the kernels from Figure 31 with Equation 4 and Equation 5; 3) calculate the gradient magnitude; 4) apply a global threshold, so that each pixel with an intensity value higher than or equal to the threshold has its value set to a maximum value (e.g. 255) and the rest to 0; 5) apply the Guo-Hall thinning algorithm Guo and Hall (1989) to remove the excess edge points caused by the convolution and threshold.
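A minimal sketch of the gradient-magnitude path of Algorithm 1 is given below; the smoothing and Guo-Hall thinning steps are omitted for brevity, and the naive convolution helper is our own illustration:

```python
import numpy as np

def convolve2d(image, kernel):
    """Minimal 'same'-size correlation with zero padding (illustrative,
    not optimized)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    out = np.zeros(image.shape, dtype=float)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            out[y, x] = np.sum(padded[y:y + kh, x:x + kw] * kernel)
    return out

def first_order_edges(image, kx, thr=50):
    """Convolve with Gx and its pi/2 rotation, take the magnitude of
    Equation 1 and apply a global threshold."""
    gx = convolve2d(image.astype(float), kx)
    gy = convolve2d(image.astype(float), kx.T)  # orthogonal kernel
    magnitude = np.hypot(gx, gy)
    return np.where(magnitude >= thr, 255, 0).astype(np.uint8)
```

The same routine accepts a dilated kernel unchanged, since only the kernel array differs.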

First Order Derivative Compass Gradient operators
Compass gradient operators are commonly used in edge detection and detect the influence of the neighbouring pixels through a set of rotated directional components. They are commonly used as an alternative to the Orthogonal Gradient Operators.
The gradient magnitude is calculated using Equation 3, where k is the number of kernels and L is the kernel size divided by 2. The resulting intensity value is normalized or thresholded to eliminate low-confidence edges. The local edge orientation is estimated as the orientation of the kernel that yields the maximum response, as in Gonzalez and Woods (1991); Szeliski (2010).
These operators are also orthogonal discrete isotropic filters, so we represent and use only the Gx kernel. To obtain the other kernels for this template gradient we need to rotate it by fractions of π, differently from the previous ones.
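The rotation that generates a compass set from one template can be sketched by shifting the outer ring of a 3x3 kernel one position clockwise, which corresponds to a 45-degree step; the helper below is our illustration, using a Sobel-like template:

```python
import numpy as np

# outer ring of a 3x3 kernel in clockwise order
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def rotate45(kernel):
    """Rotate a 3x3 compass kernel by shifting its outer ring one
    position clockwise; the centre coefficient stays in place."""
    rotated = kernel.copy()
    for (sy, sx), (dy, dx) in zip(RING, RING[1:] + RING[:1]):
        rotated[dy, dx] = kernel[sy, sx]
    return rotated

def compass_masks(template, count=8):
    """Generate the full set of directional masks from one template."""
    masks, k = [], np.asarray(template, dtype=float)
    for _ in range(count):
        masks.append(k)
        k = rotate45(k)
    return masks
```

The compass magnitude of Equation 3 is then the maximum absolute response over all masks at each pixel.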
For our analysis we found in the literature the following operators: the Prewitt compass operator Prewitt (1970), the Robinson compass operator Robinson (1977); Prewitt (1970) and the Kirsch operator Kirsch (1971). All these kernel masks can be found in Figure 26 from Appendix 5.
Similar to the Orthogonal Gradient Operators, the resulting edge map is not 1 pixel wide nor of uniform magnitude, so we threshold and thin the results before the evaluation; details are given in Algorithm 1. As with the other algorithms, to obtain the best results we apply a smoothing filter beforehand.

Frei-Chen operator
The Frei-Chen operator Chen et al. (1977); Park (1990) works on a 3x3 footprint but applies a total of nine convolution masks to the image. The Frei-Chen masks are unique masks, which contain all of the basis vectors.
The Frei-Chen masks one through nine, defined on a 3x3 window, span the edge, line and average subspaces. All the kernel masks for the operators are presented in Figure 31 from the Appendix. Kernels G1 and G2 are the isotropic average gradient basis vectors and kernels G3 and G4 are the ripple vectors; these contribute to the edge subspace. Kernels G5 and G6 are the line basis vectors, while kernels G7 and G8 are the discrete Laplacian vectors; they are used for line subspace detection. Kernel G9 is the average mask Chen et al. (1977). To use the Frei-Chen operator for edge detection we apply Equation 4 to the first four masks. To use the Frei-Chen operator for line detection, Equation 5 should be applied to the line masks.
In a sense, the Frei-Chen operator is a First Order Derivative Compass Gradient Operator, so we treat it as such during evaluation, see Algorithm 1. Even though line detection is not in the scope of our analysis, we consider the line output of this algorithm as well, since it is closely coupled with the edge result.
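The subspace projection described above can be sketched as follows; the mask values follow the commonly published Frei-Chen basis, and the energy ratio corresponds to our reading of Equations 4 and 5:

```python
import numpy as np

r2 = np.sqrt(2.0)
# The nine Frei-Chen basis masks with their usual normalisation factors
G = [np.array(m, dtype=float) * s for m, s in [
    ([[1, r2, 1], [0, 0, 0], [-1, -r2, -1]], 1 / (2 * r2)),   # G1 edge
    ([[1, 0, -1], [r2, 0, -r2], [1, 0, -1]], 1 / (2 * r2)),   # G2 edge
    ([[0, -1, r2], [1, 0, -1], [-r2, 1, 0]], 1 / (2 * r2)),   # G3 ripple
    ([[r2, -1, 0], [-1, 0, 1], [0, 1, -r2]], 1 / (2 * r2)),   # G4 ripple
    ([[0, 1, 0], [-1, 0, -1], [0, 1, 0]], 0.5),               # G5 line
    ([[-1, 0, 1], [0, 0, 0], [1, 0, -1]], 0.5),               # G6 line
    ([[1, -2, 1], [-2, 4, -2], [1, -2, 1]], 1 / 6),           # G7 Laplacian
    ([[-2, 1, -2], [1, 4, 1], [-2, 1, -2]], 1 / 6),           # G8 Laplacian
    ([[1, 1, 1], [1, 1, 1], [1, 1, 1]], 1 / 3),               # G9 average
]]

def frei_chen_strength(patch, subspace):
    """Project a 3x3 patch onto the edge (masks 1-4) or line (masks 5-8)
    subspace, returned as the square root of the energy ratio."""
    v = np.array([np.sum(patch * g) for g in G])
    total = np.sum(v ** 2)
    if total == 0:
        return 0.0
    idx = range(0, 4) if subspace == "edge" else range(4, 8)
    return np.sqrt(sum(v[k] ** 2 for k in idx) / total)
```

A horizontal step patch, for instance, projects much more strongly onto the edge subspace than onto the line subspace.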

Laplacian edge operator
The Laplacian is a 2-D isotropic measure of the second spatial derivative of an image. The Laplacian of an image highlights regions of rapid intensity change and is therefore often used for edge detection. Another difference between the Laplacian and other operators is that the Laplacian does not extract edges in any particular direction. The formula for the Laplacian Jain et al. (1995) is the second-order derivative with respect to each component of a 2D function, as presented in Equation 6.
In the literature Haralick and Shapiro (1992); Szeliski (2010); Jain et al. (1995); Davies (1992) we can find different estimations of isotropic kernels for the Laplacian operator, which we present in Figure 29 in Appendix 5. For our work we consider all of them, so we can highlight the changes that appear in the edge map when choosing different approximations of the Laplace function.
The steps we need to take to obtain the desired edge-map format are presented in Algorithm 2. We apply Equation 6 to obtain the raw edge map, but to be able to evaluate it we need to transform the result from its natural signed range of (−255, +255) to (0, 255). Afterwards the scaled edge map is thinned for the final result. We accept that this is not the normal usage of the Laplace operator, but we wanted to evaluate it separately from the Laplacian of Gaussian and Marr-Hildreth operators.
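The scaling and thresholding of Algorithm 2 can be sketched as below; the thinning step is omitted, the V1 kernel is one common Laplacian approximation, and the default threshold is chosen for illustration only:

```python
import numpy as np

LAPLACE_V1 = np.array([[0, 1, 0],
                       [1, -4, 1],
                       [0, 1, 0]], dtype=float)   # one common approximation

def laplacian_edge_map(image, kernel=LAPLACE_V1, thr=75):
    """Convolve with a 3x3 Laplacian kernel, map the absolute signed
    response onto 0..255 and apply a global threshold."""
    h, w = image.shape
    padded = np.pad(image.astype(float), 1)
    raw = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            raw[y, x] = np.sum(padded[y:y + 3, x:x + 3] * kernel)
    peak = max(np.abs(raw).max(), 1e-9)            # avoid division by zero
    scaled = np.abs(raw) / peak * 255              # signed range -> 0..255
    return np.where(scaled >= thr, 255, 0).astype(np.uint8)
```

Flat regions produce a zero response and are discarded by the threshold; only the sign-change bands around intensity steps survive.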

Laplacian of Gaussian -LoG -or Mexican Hat operator
The Laplacian of Gaussian (LoG) approach specifies that the image is convolved with a Gaussian filter to reduce the noise, followed by a Laplacian convolution to expose the edges. The three-dimensional plot of the LoG function looks like a Mexican hat, hence the name of the operator Torre and Poggio (1986); Haralick and Shapiro (1992).
The steps of Algorithm 2 are: 1) apply the edge detection operator; 2) apply a global threshold, so that each pixel whose absolute intensity value is higher than or equal to the threshold has its value set to a maximum value (e.g. 255) and the rest to 0; 3) apply the thinning algorithm presented in Guo and Hall (1989) to remove the excess edge points caused by the convolution and threshold.
We can use two mathematical variants for obtaining the LoG (see Equation 7): convolve the image with a Gaussian smoothing filter and afterwards apply the Laplacian operator, or convolve the image with the linear filter that is the Laplacian of the Gaussian (Equation 8). Similar to the Laplace operator, we need to transform the resulting edge map from the range (−255, +255) to (0, 255), so we use the steps from Algorithm 2. The determined edge map is scaled, followed by thresholding and thinning.
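The second variant amounts to sampling the LoG function of Equation 8 on a discrete grid; the zero-mean correction below is a common practical touch (our addition) so that flat regions produce no response:

```python
import numpy as np

def log_kernel(size, sigma):
    """Sample the Laplacian-of-Gaussian function on a size x size grid
    and remove the residual DC component."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x ** 2 + y ** 2
    kernel = -(1.0 / (np.pi * sigma ** 4)) \
             * (1 - r2 / (2 * sigma ** 2)) * np.exp(-r2 / (2 * sigma ** 2))
    return kernel - kernel.mean()      # zero-sum, like a discrete Laplacian
```

With the sign convention used here the centre coefficient is negative, matching the inverted "Mexican hat" shape.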

Marr-Hildreth algorithm
Another method of detecting edges in digital images is the Marr-Hildreth algorithm, which uses continuous curves where there are strong and rapid variations in image brightness.
The Marr-Hildreth edge detection method is simple and operates by convolving the image with the Laplacian of the Gaussian function or, as a fast approximation, by the difference of Gaussians. Zero crossings are detected in the filtered result to obtain the edges. A zero crossing at pixel level implies that the signs of at least two opposite neighboring pixels are different Marr and Hildreth (1980); Haralick (1987).
For our implementation of the Zero Crossing algorithm we chose to add a threshold that permits a better discrimination of the relevant zero crossings according to the difference in intensity Haralick (1987); Grimson and Hildreth (1985). This variant of Zero Crossing produces better results than the classical version, which thresholds the results at zero regardless of the intensity change.
To obtain the desired edge-map format we have to thin the zero-crossing output of the LoG algorithm. All the steps used in the simulations are presented in Algorithm 3.
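Our reading of the thresholded zero-crossing test can be sketched as follows; the choice of opposite-neighbour pairs and the threshold handling are our illustrative choices:

```python
import numpy as np

def zero_crossings(log_response, thr=5):
    """Mark a pixel when two opposite neighbours have different signs
    and their intensity difference exceeds `thr`."""
    h, w = log_response.shape
    edges = np.zeros((h, w), dtype=np.uint8)
    # opposite-neighbour pairs: horizontal, vertical and both diagonals
    pairs = [((0, -1), (0, 1)), ((-1, 0), (1, 0)),
             ((-1, -1), (1, 1)), ((-1, 1), (1, -1))]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            for (ay, ax), (by, bx) in pairs:
                a = log_response[y + ay, x + ax]
                b = log_response[y + by, x + bx]
                if a * b < 0 and abs(a - b) >= thr:
                    edges[y, x] = 255
                    break
    return edges
```

Setting `thr` to zero recovers the classical variant that accepts every sign change regardless of its strength.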

Canny algorithm
The Canny edge detection algorithm is a classical and robust method for edge detection in gray-scale images. It is widely used due to its short operation time and relatively simple calculation process. The two significant features of this method are the introduction of non-maximum suppression and the double thresholding of the gradient image, see Canny (1986).
The traditional Canny algorithm Canny (1986) has the following steps: 1) smooth the image with a Gaussian function, 2) apply a first order operator, 3) apply non-maximum suppression to the magnitude of the gradient, 4) use a double threshold for edge connection.
The steps of Algorithm 3 are: 1) apply the LoG operator, 2) apply the Zero Crossing algorithm, 3) apply the thinning algorithm presented in Guo and Hall (1989) to remove the excess edge points caused by the convolution and threshold.
Non-maximum suppression is an important step of the Canny algorithm. Its purpose is to find the local maxima of the pixel values and set the gray value of the non-maximum points to zero, so that a large part of the non-edge points can be eliminated.
After the non-maximum suppression phase, we double threshold and link the edge points by hysteresis. If an edge pixel's gradient value is higher than the high threshold, it is marked as a strong edge pixel. Similarly, if an edge pixel's gradient value is smaller than the high threshold but larger than the low threshold, it is marked as a weak edge pixel. If an edge pixel's gradient value is smaller than the low threshold, it is suppressed. At the end, the remaining "weak" and "strong" pixels are connected as long as at least one strong edge pixel is involved in the blob.
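The double threshold and hysteresis linking described above can be sketched with a simple flood fill; this is a minimal illustration, not the EECVF implementation:

```python
import numpy as np
from collections import deque

def hysteresis(magnitude, low, high):
    """Strong pixels (>= high) seed a flood fill that keeps weak
    pixels (>= low) 8-connected to them; everything else is dropped."""
    strong = magnitude >= high
    weak = magnitude >= low
    edges = np.zeros(magnitude.shape, dtype=np.uint8)
    queue = deque(zip(*np.nonzero(strong)))
    for y, x in queue:                  # mark all strong seeds first
        edges[y, x] = 255
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < edges.shape[0] and 0 <= nx < edges.shape[1]
                        and weak[ny, nx] and edges[ny, nx] == 0):
                    edges[ny, nx] = 255
                    queue.append((ny, nx))
    return edges
```

Weak pixels with no chain of weak neighbours back to a strong seed are never visited, which is exactly the suppression of isolated "noise" responses.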
The classic Canny Edge algorithm uses the Sobel Operator during the convolution step in order to compute the gradient.
Other operators such as the dilated Sobel can be also used during this step. Thus, the Canny Edge algorithm offers flexibility when choosing the filters for the convolution step.

Shen-Castan algorithm
The distinguishing feature of Shen-Castan edge detection is that it can detect the edges of an image affected by noise: by using an Infinite Symmetric Exponential Filter, the noise is eliminated.
The Infinite Symmetric Exponential Filter (ISEF) Castan et al. (1990) is described as a real continuous function by Equation 9 and as a recursive function by Equation 10, where b is the thinning factor and its value lies between 0 and 1.
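One common recursive realisation of the ISEF uses a causal and an anti-causal first-order pass whose sum reproduces the symmetric exponential h[n] = c·b^|n| with c = (1−b)/(1+b); the sketch below is our illustration and may differ in detail from the exact form of Equation 10:

```python
import numpy as np

def isef_1d(signal, b=0.9):
    """Recursive symmetric exponential smoothing: a causal pass, an
    anti-causal pass, and a correction so the centre sample is not
    counted twice. DC gain is 1."""
    c = (1.0 - b) / (1.0 + b)
    n = len(signal)
    causal = np.zeros(n)
    anticausal = np.zeros(n)
    for i in range(n):
        prev = causal[i - 1] if i > 0 else 0.0
        causal[i] = c * signal[i] + b * prev
    for i in range(n - 1, -1, -1):
        nxt = anticausal[i + 1] if i < n - 1 else 0.0
        anticausal[i] = c * signal[i] + b * nxt
    return causal + anticausal - c * signal

def isef_2d(image, b=0.9):
    """Separable application: filter rows, then columns."""
    rows = np.apply_along_axis(isef_1d, 1, image.astype(float), b)
    return np.apply_along_axis(isef_1d, 0, rows, b)
```

The two recursions make the cost per pixel constant, independent of the effective filter width controlled by b.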
The Shen-Castan edge detector has a unique step that overcomes noise in the input image by using the Infinite Symmetric Exponential Filter, see Shen and Castan (1992). The algorithm has the following steps: 1) smooth the image with the ISEF filter, 2) convolve the smoothed image with a binary Laplace operator, 3) threshold the strong edge pixels using Zero Crossing, 4) link the edge points by hysteresis and 5) thin the results at the end.
The approximation of Laplacian is computed by subtracting the original image from the smoothed one. The result is a band-limited Laplacian image. Next, a binary Laplacian image is generated by setting all the positive valued pixels to 1 and all others to 0, as in Shen and Castan (1992).
For our analysis we will not use the binary Laplace operator, which is considered optimal in terms of run time for this algorithm, but the versions of the isotropic Laplace kernels presented in subsection 3.4. The steps we use to obtain the edge map, ending with the hysteresis threshold and thinning, are detailed in Algorithm 5.

Edge Drawing algorithm
Edge Drawing (ED) is an edge detection algorithm that works by computing a set of anchor points, which are most likely to be edge elements, and linking them with a predefined set of rules called smart routing Topal and Akinlar (2012). The ED algorithm, described in Algorithm 6, can be summarized in the following steps: 1) suppress noise in the image with a Gaussian filter Aurich and Weule (1995), 2) calculate the gradient magnitude and orientation using the Sobel filter Sobel and Feldman (1973), 3) extract the anchor points, 4) connect the anchor points using the smart routing concept.
The mechanism of connecting anchors is considered the most crucial step of ED. Connecting consecutive anchors is done by passing from one anchor to the next, following the cordillera peak of the gradient map mountain. This process, as in Topal and Akinlar (2012), is guided by the computed gradient magnitude and edge direction maps: the gradient map is thresholded, the anchors are extracted (an anchor must be a local peak of the gradient map, controlled by anchor_thr and scan_interval), and during smart routing the three immediate neighbours are considered and the one with the maximum gradient value is picked. If a horizontal edge passes through the anchor, we start the connecting process by proceeding to the left and to the right. If a vertical edge passes through the anchor, we start the connection process by proceeding up and down. The process stops if we move out of the edge area or encounter a previously detected edge.
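The anchor extraction rule, a local peak perpendicular to the edge direction, can be sketched as below; the parameter names anchor_thr and scan_interval follow the text, while the boolean direction map is our simplification:

```python
import numpy as np

def extract_anchors(grad_mag, direction_is_vertical, anchor_thr=8, scan_interval=1):
    """A pixel on the scan grid is an anchor when its gradient magnitude
    exceeds both neighbours perpendicular to the local edge direction
    by at least `anchor_thr`."""
    h, w = grad_mag.shape
    anchors = []
    for y in range(1, h - 1, scan_interval):
        for x in range(1, w - 1, scan_interval):
            g = grad_mag[y, x]
            if direction_is_vertical[y, x]:     # vertical edge -> compare left/right
                a, b = grad_mag[y, x - 1], grad_mag[y, x + 1]
            else:                               # horizontal edge -> compare up/down
                a, b = grad_mag[y - 1, x], grad_mag[y + 1, x]
            if g - a >= anchor_thr and g - b >= anchor_thr:
                anchors.append((y, x))
    return anchors
```

Increasing scan_interval thins the candidate grid, which is how ED trades a few missed anchors for speed.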
Benchmarking the edge operators

BSDS500 Arbelaez et al. (2011) is a widely used dataset in the field of computer vision for benchmarking edge detection algorithms. It contains natural images that have been manually segmented and is considered the ground truth in many boundary detection comparisons. The benchmark evaluates the result images generated by a specific algorithm against the segmented images in the dataset. For edge detection evaluation we used the Corresponding Pixel Metric (CPM) algorithm, defined in Prieto and Allen (2003). This measure is used for correlating similarities with a small localization error in the detected edges. CPM first finds an optimal matching of the pixels between the edge images and then estimates the error produced by this matching.
The BSDS500 benchmark offers 500 images for testing, presented in few different sets. The images are natural images marked for the boundaries and edges of the objects and structures that they represent or contain.
For each benchmark image, three different measures are computed: precision (P), recall (R) and F-measure (F1), defined in Sasaki (2007). Precision (Equation 11) is the probability that a resulting edge/boundary pixel was labeled as a true edge/boundary pixel. Recall (Equation 12) is the probability that a true edge/boundary pixel was detected. The F-measure (Equation 13) is the accuracy measure computed as the harmonic mean of precision and recall. We also need to specify that TP (True Positive) represents the number of matched edge pixels, FP (False Positive) the number of pixels incorrectly highlighted as edge pixels and FN (False Negative) the number of pixels that have not been detected as edge pixels but are labeled as edges in the dataset.
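The three measures reduce to the standard confusion-matrix formulas; the small helper below (ours, for illustration) makes the relation of Equations 11-13 explicit:

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and the F-measure (harmonic mean of the two)
    from the confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Because F1 is a harmonic mean, it penalizes an imbalance between P and R more than an arithmetic average would, which is why a detector with many false positives cannot compensate with recall alone.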

Experimental results
In this section we present the results of our analysis for each edge operator presented in Section 3. Our presentation consists of visual results, statistical results and remarks for each subsection.
For our simulations to be reproducible and easy to use we employed the End-to-End Computer Vision Framework (EECVF) Orhei et al. (2020b). EECVF is an adaptable and dynamic framework designed for researching and testing CV concepts, which does not require the user to handle the interconnections throughout the system. All the edge operators and algorithms are present in the framework and can be reproduced by running the main_dilated_filters_for_edge_detection_algorithms module.

First Order Derivative Orthogonal Gradient operators
The first category we analyze is the First Order Derivative Orthogonal Gradient Operators that are described in Section 3.1. Using the steps described in Algorithm 1 we will compare the results we obtain when using the standard kernels, extended kernels and dilated ones.
To obtain the best results from the steps presented in Algorithm 1 we need to choose a good threshold value for the thresholding step and, of course, a sigma value for the smoothing step that will benefit all images. We know from experience and from the literature that these two aspects can completely change the results of our simulations. For this parameter tuning we chose the Sobel and Feldman (1973) operator, as it is one of the most popular in this category. As expected, a higher threshold produces fewer edge points, but with fair confidence, which results in a high P and a weak R because of the missing points. From the other perspective, if we choose a small threshold value we see a higher R but a very weak P, caused by the excess edge points generated. Looking over the results obtained in Figure 2, we can observe that the best results are obtained when using a sigma value of 2.75 and a threshold value of 50.
The next step was to change the kernel from Sobel to the kernels of the other operators. We keep the parameters found in the tuning phase so that we have an objective comparison of the results. The annotations we use to represent the results are: Thr for the gradient threshold and S for the Gaussian sigma value.
Visual comparison results of the operators are presented in Figure 4. We can observe from our simulations (by comparing the columns of Figure 4) that the dilation of the kernels does not produce a degradation of the edge map. An increase in "noise" edge pixels is a foreseeable effect of extending the convolved area. By dilation we keep the benefits of the increased region of interest but lose the excess "noise" pixels we would obtain by extending.
From Figure 4 and Table 1 we conclude that the dilated filters have similar or better results than the classical ones. Even if the F1 score is not always better, we can clearly see an improvement of P in most cases. We can also observe (through the F1 score) that dilation can bring a small degradation of the edge map, as for the Kayyali operator Kawalec-Latała (2014), but it can also bring improvements, as for the Kitchen operator Kitchen and Malin (1989).

First Order Derivative Compass Gradient operators
The following section consists of the analysis of the results of the First Order Derivative Compass or Directional Gradient operators. Similar to the Magnitude Gradient operators, we conduct our analysis using the standard and dilated filters presented in Section 3.2.
We first present some visual comparison results of the operators in Figure 6. As a first impression, we can observe that dilation cleans up artifacts in the overall image. This is an important aspect for this edge operator because it is strongly affected by "noise" in the input image. Similar to the gradient edge operators, we first looked for the best smoothing parameter and threshold value. To do so we used the classical Robinson operator Robinson (1977) with a 3x3 kernel. In Figure 6 we can observe the F1 results for thresholds between 30 and 160 and sigma values between 0.25 and 3.5, obtaining the best results with a threshold value of 50 and a sigma of 2.5. This is an important step, because choosing a wrong threshold value can eliminate useful pixels from the edge map.
In Figure 7 we can observe the F1-measure results of all the edge operators presented in Section 3.2, using the threshold and sigma values found in our experiments. For a detailed view, the results are presented in Table 2.
As we can observe, in the case of the Compass Gradient operators dilation does not always bring a better F1, but it seems to always bring a better P metric. In the case of the Prewitt operator, dilating the kernels actually brings overall worse results than the classical ones.

Frei-Chen operator
The Frei-Chen operator is a special case of the Compass Gradient operators and is presented in Section 3.3. Using Algorithm 1 we ran the experiments for the edges and lines generated by this operator. In Figure 5 we present the visual results.
We fine-tuned the parameters of the algorithm for thresholds between 30 and 160 and Gaussian blur sigma values between 0.25 and 3.2. The best tuned Frei-Chen operator uses a threshold of 50 and a sigma of 2.5, see Figure 8. In Figure 9 we can observe the F1-measure results of all the edge operators presented in Section 3.3, using the threshold and sigma values found in our experiments. For a detailed view, the results are presented in Table 2.
Even though the line detection part of the algorithm is not in the scope of this paper, we decided to add it to the simulation to observe the effects that dilation brings to it. As we can observe, similar to edge detection, line detection shows a certain improvement, see the last row of Table 2. We cannot leave this topic without remarking on the artifacts created during the line evaluation of the operator, which are not resolved by dilating the kernels.

Laplacian edge operator
The Laplace edge operator is one of the most popular edge detectors based on second order derivative discrete kernels, described in Section 3.4. Following Algorithm 2 we attempt to standardize the evaluation of the edge detection, even if thresholding a second order derivative operator is not common practice in the field.
For our simulation results and evaluation we first searched the best threshold value by using the Laplacian kernel V1, that we can find in Figure 29. We have chosen to vary the threshold from 15 till 245 and observed that we obtain the best F 1 − measure measure for the value 75, the results can be observed in Figure 10. In Figure 12 we present the visual results for our experiments, the Figure 11 and Table 3 contains the benchmark over all results. We can observe that in this case, dilation of kernel technique does not bring improvements to the edge output.
We can see a duplication of the edge points detected that seems to be caused by the a number of edge points with low value found that are being highlighted more when we threshold, normalize and thin the edge map.
Even all the kernels found in literature are different approximation of the Laplace function we would like to see what effect this different kernels bring in the edge-maps produced. For all our experimentes we will use only the positive variants of the operator, so we will produce inwards edges.
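The thresholded Laplacian step can be sketched as below, using the common 4-connected positive kernel as a stand-in for the V1 variant (the exact V1 coefficients are an assumption here), together with a naive same-size convolution:

```python
import numpy as np

def conv2_same(image, kernel):
    """Naive 'same'-size 2-D correlation with replicated borders
    (identical to convolution here because the kernel is symmetric)."""
    k = kernel.shape[0] // 2
    padded = np.pad(image.astype(float), k, mode='edge')
    h, w = image.shape
    out = np.zeros((h, w))
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            out += kernel[dy + k, dx + k] * padded[k + dy:k + dy + h,
                                                   k + dx:k + dx + w]
    return out

# Positive 4-connected Laplacian approximation (inward edges).
laplace_v1 = np.array([[ 0, -1,  0],
                       [-1,  4, -1],
                       [ 0, -1,  0]])

img = np.zeros((8, 8))
img[:, 4:] = 100.0                 # synthetic vertical step edge
edge_map = (np.abs(conv2_same(img, laplace_v1)) >= 75).astype(np.uint8)
```

On the step edge the response fires on both sides of the discontinuity, which illustrates the duplicated edge points mentioned above.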

Laplacian of Gaussian (LoG) or Mexican Hat Operator
A natural extension of the Laplace operator is the Laplacian of Gaussian, described in Section 3.5.
For our simulation results and evaluation concerning the Mexican Hat Operator described in Section 3.5, we first searched for the best combination of the sigma used to construct the LoG kernel and the threshold. The results can be observed in Figure 13. We varied the threshold from 5 to 60, avoiding bigger values: bigger thresholds would not generate better results because of the small intensities resulting after the convolution. Regarding the sigma of the Gaussian Blur kernel, we varied it between 0.2 and 2.0, as going higher would generate a lack of resulting edge points. We obtained the best F1-measure using the values of 5 for the threshold and 1.8 for sigma. For better visualization, in Figure 13 we have chosen to show only the best results.
As we can see in Figure 14 and Figure 12, dilating the filters does not bring better results for most of the kernel variants, but if we look closer in Table 3 we see cases, such as V3 and V4, where dilating the kernels brings much better results.
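Constructing a LoG kernel for a given sigma can be sketched as follows. The sign is chosen to give a positive centre, in line with the positive variants used above, and the normalization constant is dropped since the response is thresholded anyway; both choices are assumptions about the exact variant used.

```python
import numpy as np

def log_kernel(sigma, size=None):
    """Sampled Laplacian-of-Gaussian kernel; size defaults to ~6*sigma (odd)."""
    if size is None:
        size = 2 * int(np.ceil(3 * sigma)) + 1
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x * x + y * y
    # Positive centre, negative surround; proportional to -LoG(x, y).
    k = (2 * sigma ** 2 - r2) / sigma ** 4 * np.exp(-r2 / (2 * sigma ** 2))
    return k - k.mean()   # force zero sum so flat regions give zero response

lk = log_kernel(1.8)      # sigma = 1.8, the best value found above
```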

Marr-Hildreth Operator
The evaluation of the Marr-Hildreth Operator, presented in Section 3.6, is similar to that of the LoG Operator. We first searched for the best combination of the sigma used to construct the LoG kernel and the zero-crossing threshold. The results can be observed in Figure 16; for better visualization, we have chosen to show only the best results in F1 order. We tried sigma values from 0.2 to 3.0 combined with gradient thresholds from 30 to 90 on the gradient image, considering that this is the interval where these parameters give good results. We obtained the best results when sigma is 1.8 and the threshold is 85 (≈ 0.33 · 255).
As we can see in Figure 15 and Figure 16, dilating the Laplace kernels brings improvements to all variants. Table 4 contains the full Marr-Hildreth results, where the dilated filters obtained better results; see the P and F1-score of variants 1 to 4. For the V5 variant the results are close to those of the original filter.
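The zero-crossing step with a contrast threshold can be sketched as below. This is a simplified 4-neighbour sign-change test, not necessarily the exact rule used in our implementation.

```python
import numpy as np

def zero_crossings(response, threshold=0.0):
    """Mark pixels where the LoG response changes sign towards the right or
    lower neighbour and the local contrast exceeds `threshold`."""
    h, w = response.shape
    zc = np.zeros((h, w), dtype=np.uint8)
    for y in range(h - 1):
        for x in range(w - 1):
            for ny, nx in ((y + 1, x), (y, x + 1)):
                a, b = response[y, x], response[ny, nx]
                if a * b < 0 and abs(a - b) > threshold:
                    zc[y, x] = 1
    return zc

# Synthetic LoG response of a vertical step: positive / negative lobes.
resp = np.zeros((5, 5))
resp[:, :2] = 4.0
resp[:, 2:] = -4.0
zc = zero_crossings(resp, threshold=2.0)
```

The contrast threshold is what suppresses the weak zero crossings produced by noise, which is the role played by the gradient threshold in the parameter search above.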

Canny algorithm
The following section consists of the analysis of the Canny results using the first-order derivative gradient operators, extended and dilated filters presented in Section 3.7. The visual comparison of the operators can be seen in Figure 20. We can observe from our simulation that dilating the kernels does not degrade the edge map. Dilation actually brings an improvement in most cases, as already presented in Bogdan et al. (2020) and Orhei et al. (2020a).
For parameter tuning we varied the following configurations: Gaussian sigma from 0.2 to 3.0 with a step of 0.25, low threshold from 70 to 150 with a step of 10, and high threshold from 90 to 200 with a step of 10. As we can see in Figure 18, the best results are obtained when the Gaussian sigma is 1.5, the low threshold is 80 and the high threshold is 90. In the images we use the following notations: S for the Gaussian sigma, L for the low threshold and H for the high threshold. Table 1 displays the results of the Canny edge algorithm for all the operators we analyzed. In most cases, the dilation of the initial operators led to higher F1 scores. In the case of the Prewitt, Kayyali or Scharr filters, we can notice that the best F1 scores are obtained by the 7x7 dilated filters. Also, the dilated 5x5 and 7x7 filters obtain in both cases a significantly higher F1 score than the extended 5x5 and 7x7 filters. Similarly, in the case of the other operators, the dilation of the original filters obtained higher F1 scores.
All these results are also highlighted in Figure 19, where we can notice that the dilated filters detected more edges than the original filters.
When using the Pixel Difference and Separated Pixel Difference kernels in the Canny algorithm, the benchmark results were not improved by dilation. The edge pixels obtained in the magnitude calculation step of Canny are not strong enough to pass the double hysteresis thresholding, and therefore the resulting edge map is blank (as we can see in the last columns of Table 1). When the 7x7 dilated version is used, a limited number of edge pixels appears, but not enough to be relevant for the F1-score, even if the precision is good enough.
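The double hysteresis thresholding that these weak responses fail to pass can be sketched with a standard breadth-first formulation; the magnitudes and thresholds below are illustrative, not taken from our benchmark.

```python
import numpy as np
from collections import deque

def hysteresis(magnitude, low, high):
    """Keep strong pixels (>= high) and any weak pixels (>= low) that are
    8-connected to a strong pixel, as in Canny's double thresholding."""
    strong = magnitude >= high
    weak = magnitude >= low
    out = np.zeros_like(magnitude, dtype=np.uint8)
    queue = deque(zip(*np.nonzero(strong)))
    out[strong] = 1
    h, w = magnitude.shape
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and weak[ny, nx] and not out[ny, nx]:
                    out[ny, nx] = 1
                    queue.append((ny, nx))
    return out

mag = np.array([[0, 80, 0],
                [0, 95, 0],
                [0, 75, 60]], dtype=float)
edges = hysteresis(mag, low=70, high=90)
```

If no pixel reaches the high threshold, the queue starts empty and the output stays blank, which is exactly the behaviour observed with the Pixel Difference kernels.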

Shen-Castan algorithm
The results and evaluation of the Shen-Castan Operator, described in Section 3.8, are presented in this section. We searched again for the best combination of the Laplace threshold, smoothing factor, window size, thinning factor and ratio factor of the algorithm. We chose a threshold value of 40 for the Laplace edge detector and varied the rest of the values as follows: the smoothing factor of the ISEF filter (S) from 0.5 to 0.9; the adaptive zero-crossing window size (W) of 5, 7, 9; the threshold for zero crossing (R) from 0.5 to 0.9; the thinning factor (TH) of 0, 0.5, 0.9. We can observe in Figure 21 that the best results are obtained when the ISEF smoothing factor is 0.9, the ZC window is 7, the ZC threshold is 0.9 and the thinning factor is 0.5. The figure may appear not to show the correct values, but in reality the curves are overlapped, as can be seen in the legend details of the figure.
In Figure 22 we can see the results of simulating the algorithm using different kernels. We can observe that dilating the kernel of the Laplace operator does not obtain better results in any of the cases. If we look over the detailed results, see Table 4, the precision is not actually that affected by the dilation, but the recall is strongly affected.
If we look over the visual results, see Figure 17, we can see that in some cases the results look better, but we can clearly see the increase in the quantity of noise pixels.

Edge drawing algorithm
For our simulation results and evaluation of the ED algorithm, described in Section 3.9, we first searched for the best combination of Gaussian smoothing kernel size, gradient threshold, anchor threshold and scan interval. We chose to vary the parameters as follows: the Gaussian kernel (GK) between 3 and 9 with a step of 2; the gradient threshold (TG) between 10 and 150 with a step of 10; the anchor threshold (TA) between 10 and 60 with a step of 10; the scan interval in the range of 1, 3, 5. Looking over the F1 results from Figure 23, we can observe that the best results are obtained using GK = 9, TG = 50, TA = 10 and a scan interval of 1.
In Figure 20 we present the visual results of using the parameters found and the edge operators from Section 3, and in Figure 24 we can see the equivalent F1 results. We can conclude that the expanding and dilating techniques offer better results than using the classical kernels.
In Table 5 we can see the edge drawing R, P and F1 results. We can observe some interesting facts, such as that in the case of the Kayyali operator Kawalec-Latała (2014) the ED algorithm does not bring any improvements. Another observation is that, in general, a bigger kernel is equivalent to better results. In this case, dilating the kernels with a factor of 1 results in better F1 values than dilating with a factor of 2.
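The anchor selection of the ED algorithm, driven by the gradient threshold (TG), anchor threshold (TA) and scan interval, can be sketched as below. This is a simplified version that tests only horizontal neighbours on the scan rows, whereas the full algorithm picks the comparison direction from the gradient orientation.

```python
import numpy as np

def find_anchors(grad, t_grad, t_anchor, scan_interval=1):
    """Mark anchor pixels on every `scan_interval`-th row: the gradient must
    exceed t_grad and beat both horizontal neighbours by at least t_anchor."""
    h, w = grad.shape
    anchors = np.zeros((h, w), dtype=np.uint8)
    for y in range(1, h - 1, scan_interval):
        for x in range(1, w - 1):
            g = grad[y, x]
            if (g >= t_grad
                    and g - grad[y, x - 1] >= t_anchor
                    and g - grad[y, x + 1] >= t_anchor):
                anchors[y, x] = 1
    return anchors

# A vertical gradient ridge: the peak column should be picked as anchors.
grad = np.zeros((5, 5))
grad[:, 2] = 100.0
grad[:, 1] = grad[:, 3] = 20.0
anchors = find_anchors(grad, t_grad=50, t_anchor=10)
```

A larger scan interval visits fewer rows and therefore produces fewer anchors, which is the trade-off explored in the parameter search above.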

Conclusions and future work
In this paper we extend our work (see the previous papers Bogdan et al. (2020); Orhei et al. (2020a)) regarding the dilation of classical convolution edge detection filters. In Section 3 we present the theoretical background of the algorithms upon which we conducted the simulations from Section 4. The experimental results confirm our intuition that the dilation of filters has a positive impact on edge detection.
Dilating the second-order discrete approximation filters did not bring considerable improvements to the resulting edge map. As we see in Figure 4, when using the Marr-Hildreth algorithm we can spot small improvements when using the 5x5 filter. But when applying the Shen-Castan algorithm, see Table 4, we do not see improvements from dilation.
Statistically and visually we observed that by dilating the filters we find more edge pixels than with the classical operators. By dilation we obtained better precision and F1-scores, which can be observed in the results we presented. Thanks to the simple structure of the dilated filters, they are also a good choice when runtime matters, unlike the other classical filter extensions from Gupta and Mazumdar (2013). On the other hand, the dilated filters might have better results on bigger images because these contain more information; the dilated filters can use pixels from a larger distance for computing the gradient and thus obtain more edge pixels.
In our research we focused on dilating 3x3 filters, but in future work we may consider dilating bigger kernels as a starting point; this follows naturally from the definition of dilation. Another aspect worth exploring in the future would be to run experiments with automated-threshold versions of the algorithms. We focused our experiments on the classical versions of the edge detection algorithms, but looking at the combined effects of dilation and automated thresholding could enhance the resulting edge map.
In our approach, for comparison purposes, the focus was to fine-tune the classical edge detection algorithms in order to obtain the optimal threshold and sigma values. We are confident that if we had fine-tuned the algorithms for the dilated kernels, the results would have been even better.
In general, we can state that using dilated kernels brings benefits regarding both the edge map resulting from the classical edge detection algorithms and the runtime needed. We obtain similar or better results while taking into consideration a bigger neighborhood.