A New Bayesian Edge-Linking Algorithm Using Single-Target Tracking Techniques

This paper proposes novel edge-linking algorithms capable of producing a set of edge segments from a binary edge map generated by a conventional edge-detection algorithm. These proposed algorithms transform the conventional edge-linking problem into a single-target tracking problem, which is a well-known problem in object tracking. The conversion of the problem enables us to apply sophisticated Bayesian inference to connect the edge points. We test our proposed approaches on real images that are corrupted with noise.


Introduction
Edge detection is a key and fundamental step in many computer vision and image-processing applications since it produces useful features of raw images.Edge detection takes a gray-scale image as input and generates a binary image map that includes black-and-white values at each pixel.Many researchers have devoted their efforts to inventing edge-detection algorithms.
Edge detection is based on the detection of abrupt local changes in the image intensity.In practice, edge models are classified according to their intensity profiles: step, ramp and roof.Given the edge model, edge detection involves three fundamental steps: image smoothing for noise reduction, detection of edge points and edge localization [1].That is, we first reduce the noise in images by using several low-pass filters [2,3] in the image smoothing for noise reduction step.The detection of edge points step is a local operation that extracts all points that are potential candidates for edge points.In other words, the algorithms consider all of the points and then use thresholds, a histogram, gradients and zero crossing to identify promising edge points [4][5][6].The last step is edge localization, the objective of which is to select from the candidate edge points only those points that would be meaningful to build the desired features.However, it is not straightforward to obtain the complete set of desired features even after the above three steps.Although edge detection should yield sets of pixels that only coincide with edges, the actual underlying edges are seldom characterized by the detected pixels because of unwanted noise in raw images, spurious breaks in the edges and improperly-chosen threshold values.Therefore, edge-linking or boundary detection/tracing algorithms often follow edge detection to eventually recognize the simplified and salient object of interest from the image.Various algorithms have been developed to connect the detected edge points.Broken edges are extended along the directions of their respective slopes by using an (adaptive) morphological dilation operation [7,8].Multiple resolution and hierarchical approaches to edge detection were also proposed by [9,10].Another well-known approach for edge linking is the Hough transform [11].The graph search problem with the A * algorithm has also been used to link edges [12].Probabilistic relaxation techniques have been applied to label edges for boundary tracing, as well [13][14][15][16].A predictive edge-linking algorithm was also developed by considering the prediction generated by past movements [17].In addition to these, many other approaches to developing edge-linking algorithms have been proposed [18][19][20][21].However, most algorithms are not effective for general-purpose usage, and they are highly dependent on the data and applications because of the presence of noise and long interruptions (the existence of occlusions).
Therefore, we propose a new approach to connect the edge points to create meaningful longer edges even in the presence of a large amount of noise, unwanted interruptions and long occlusions.Our idea to link the detected points is to combine our linking edges with data association in the object target tracking methodology.The data association technique in target tracking is commonly used to identify detected points of objects in a cluttered environment, and it classifies observations as being either target oriented or non-target oriented.That is, such object tracking schemes can be applied to classify the detected points into desired edges or unwanted clutter.In the work presented in this paper, the target tracking method for boundary determination is designed as part of a Bayesian scheme, and the posterior distribution is the objective function with the shape prior in a Markov random field (MRF).
The contributions of our paper are fourfold: (1) introducing a new probabilistic relaxation technique using single-target tracking techniques; (2) overcoming multiple occurrences of noise and long occlusions; (3) designing three efficient statistical methodologies; and (4) application to interactive segmentation.
This paper consists of several sections.Section 3 presents background knowledge related to the techniques that are used in this paper.After first explaining the intuitive idea of our approach in Section 2, various algorithms are proposed in Section 4. Afterwards, we conclude with our results and a related discussion.

Motivation (Idea)
In this paper, we propose a new approach to linking edges based on our idea to introduce single-target tracking algorithms.Target tracking algorithms have found use in many application domains of computer vision and image processing [22][23][24][25][26].Given the raw image in Figure 1a, we have a binary image in Figure 1b after applying several detection algorithms, such as Canny and Sobel.Of course, advanced statistical detection algorithms can be applied [27,28].Here, if we want to obtain the boundary of the triangle from the detected points, we can click any point inside of the polygon in an interactive way [29].We name this point an origin.Then, we obtain T lines from the binary image in a radial way using Bresenham's line-drawing algorithm, as shown in Figure 1c.This is similar to the approach followed in the star convexity shape prior approach [30].Each line can cross the detected points that are potential edge candidates, and we set the crossed points to be the measurements of our interest in the tracking system.That is, each line in Figure 1c is the same as the time slot in the tracking domain, as shown in Figure 1d.Now, the radial distance (r) of the crossed points represents the value of the observations.Finally, we can obtain several observations (labeled 'x' in Figure 1) with T discrete time slots since T intersections are taken into account in the image.After transforming the two-dimensional image into a one-dimensional radial space, we apply a conventional single-target tracking technique since the boundary of any object will be constructed from the obtained trajectory or data association represented by the set of red circles in Figure 1d.

Mathematical Background
As described in Section 2, boundary tracing is performed with the single-target tracking technique that is popularly used in highly noisy surveillance systems.Since our tracking model is based on an off-line system, we use the idea of Markov chain Monte Carlo data association (MCMCDA) [31] to provide highly accurate data association for edge linking.MCMCDA uses data association to partition the measurements into two sets of data: target-oriented measurements and non-target oriented measurements (clutter).After this data association process, the single-target tracking problem becomes much simpler and straightforward to address [31].Let the unobserved stochastic process with hidden states {x t } for t ∈ {1, 2, • • • , T}, x t ∈ X, be modeled as a Gaussian Markov random field (GMRF) process of transition probability p(x t |x ne(t) ) for the system prior, where ne(t) is a set of the neighbors of the t-th state.

Gaussian Markov Random Field
A shape prior is now commonly used for segmenting images in image processing and computer vision [32].
A Markov random field (MRF) is often used in the segmentation of images in image processing and computer vision.For instance, MRF models the relation of neighboring pixels to the coloring foreground and background using a mathematical model, such as the Ising or Potts model [33][34][35].There are alternative uses of MRF in computer vision.MRF has been used for the object shape prior, which is used for Bayesian object recognition [36,37].Our work is also based on the use of a shape prior.
Assume that there is a graphical model that consists of a set of states with the second order Gaussian Markov random field (GMRF) [38] representation in Figure 2a.Given the GMRF model, we have an interesting mathematical form of the prior as Equation (1): where Q = κD T D, and D is a circulant matrix defined by:

Circular State Space Model
We assume the likelihood for the target-oriented measurements as follows y t = x t + v t , v t ∼ N (•; 0, τ), and the non-target-oriented measurements follow a uniform distribution since there is no prior information.We have the following state and measurement space models in a time series, as shown in Figure 2b.Given the GMRF prior and likelihood function, we have y 1:T ∼ N (•; X t , τI T×T ).We finally have a marginalized likelihood for the circular state space model with observations in the time series by:

Data Association for Single-Object Tracking
By assuming that the time series contains a single track, then each observation can be assigned to one of the following cases: the observation is an element of a set of clutter, i.e., w 0 , or the observation is a target-oriented signal, i.e., w 1 .Therefore, data association partitions the measurements into target-oriented observations (w 1 ) and clutter (w 0 ).The motivation of our proposed tracking algorithm is now to search for an appropriate single track w 1 .
Note that we simplified the notation of the observations by y = y 1:T and y T+i = y i for i > 0 since it is cyclic.Suppose that w is a configuration of y, and it is a collection of partitions of y.For w ∈ Ω, we have w = {w 0 , w 1 }, where w 0 ∪ w 1 = y and |w 0 ∩ w 1 | = 0, w 0 is a set of clutter, false alarms, where L min is the minimum length of the proposed track.The cardinality of the set is denoted by | • |.Let θ = (p d , λ f ) be the set of unknown environmental parameters (EPs), where is the transpose operator.In the EPs, p d is the detection probability, and λ f is the false alarm rate per unit time.Our interest is to generate samples from a stationary posterior distribution for the configuration w, p(w|y, τ, κ) = p(w|y, θ, τ, κ)π(θ|y, τ, κ)dθ.
The conditional posterior has the form, where p(x|`) follows a Gaussian Markov random field (GMRF) prior.
As mentioned in connection with the original MCMCDA, we can use fixed environmental parameters (EPs), θ.When they are known, we can sample only w over the conditional posterior distribution, p(w|y, θ, τ, κ) since the underlying trajectory X is marginalized as in Equation (2).Therefore, the stationary posterior distribution of our interest is: where Z = p(y 1:T |θ) and Be(), Po() denote the Bernoulli and Poisson distributions, respectively.Here, R is the maximum distance from the origin for all measured observations, and p d and λ f denote the detection probability and the false alarm occurrence rate λ f .The observations are divided into two sets: y w 0 and y w 1 , which are the sets of observations containing the clutter and targets, respectively.In this equation, the parameters, deterministically obtained when the model configuration w is proposed, are given by the number of detections at time t, d t , the number of undetected targets at time t, u t = 1 − d t , and the number of false alarms at time t, f t = n t − d t .In addition, we need to consider a special case in which there are no detected observations for certain sequential slots in order to design occlusions for the boundary.For instance, we may not have any observations at times a and b in data association; thus, let m be a set of the time indexes of the undetected observations.Now, our likelihood can be slightly changed by applying marginalization, and \m denotes 'not m', such that:

Improved Posterior with Marginalization of the EPs
The unknown parameters of our interest are both the configuration w and the environmental parameters θ.Their prior distribution p(w, θ) is decomposed by p(w|θ)p(θ) where: and p d and λ f are assumed to be independent.Given the conjugate prior distribution of the EPs and configuration, we have the marginalized posterior, p(w|y 1:T , τ, κ) ∝ p(y w 1 |τ, κ)p(y w 0 ) p(w|θ)p(θ)dθ where: and p(θ) is assumed to be a uniform distribution since we do not have any prior information thereof.The detailed derivation is shown in Appendix A.1.
Finally, the log of the marginalized posterior is: where S = τI + Q −1 \m .

Approach Based on the Metropolis-Hastings Algorithm
Markov chain Monte Carlo (MCMC) is one of the most intuitive statistical approaches to constructing a target posterior, and it is applied to image processing [39][40][41].We implemented several types of moves for the Metropolis-Hastings (MH) update for linking edges: (a) pop: an observation moves from w 1 to w 0 ; that is, an observation that was previously considered as a target-oriented one is now considered in a non-target observation; (b) push: an observation moves from w 0 to w 1 ; i.e., an observation that was previously considered as a non-target-oriented observation is re-categorized as a target-oriented observation; and (c) swap: for time t, if we have a target-oriented observation a t ∈ w 1 and we can choose a non-target-oriented observation b t ∈ w 0 for y ∈ y t and a t = b t , then, we exchange them in the sets.That is, the updated configuration has two updated sets: b t ∈ w 1 and a t ∈ w 0 .We have an acceptance probability for the Metropolis-Hastings algorithm: A = min 1, p(w |y,τ,κ)q(w;w ) p(w|y,τ,κ)q(w ;w) .

Gibbs Sampler-Based Approach
Here, given the mathematical model, we can calculate the posterior distribution of each case of data association at time t, and a Gibbs sampler can generate the samples from the pre-calculated posterior distribution.From this point of view, we first introduce an auxiliary random variable c t that identifies the index of the target-oriented observation at time t from the data association for , where N t is the number of measurements at time t.Here, c t = 0 represents that no target-oriented measurement is detected, and c t = i represents that the i-th observation is detected and is a target-oriented signal at time t.Now, our interest is to generate samples from the joint full posterior distribution of C 1:T , π(C 1:T |y, τ, κ), rather than p(w|y, τ, κ), since the configuration reconstructed by C 1:t is equivalent to the configuration, w, which has been referred to in the previous sections.We draw a sample c * i from the conditional posterior as in Algorithm 1 by c * t ∼ p(c * t |c * 1:t−1 , c t+1:T , y, τ, κ) for time t.Therefore, and it is directly estimated by: where the configuration w (t,i) is constructed with c (t,i) Algorithm 1 Gibbs sampler-based approach.end for 10: Reconstruct w using c * 1:T and find the best configuration ŵ with the MAP estimate.11: end for
We can assume that the factorized form via the mean field follows a multinomial distribution, and then, we have: where M(•; p) is the multinomial distribution with control parameter p and ∑ N t i=0 p t,i = 1 for all t ∈ {1, 2, • • • , T}.An iterative update equation for the factorized single-value form provides: Unfortunately, the right-most form of the above equation is not easily estimated, since it does not have a conjugate form; hence, we introduce the second approximated function: where c−t are the fixed values that are estimated from the previous step in a way similar to MAP or Gibbs samplers.Given this second approximation, we rewrite: log q(c t ; θ) ≈ log M(c t ; p t ) + log p(y|c t , c−t ) + log 1 Z .
Here, the second term on the right-hand side is the likelihood we already defined in Equation ( 4), such that: Thus, we can calculate the probability mass function for the likelihood directly, and we assign it as: Then, we write the marginal likelihood as a multinomial distribution by: . Therefore, we write: and: where p * t,a = p t,a + π t,a for all t ∈ {1, 2, • • • , T} and a ∈ {0, 1, • • • , N t }.Here, p * t,a is the newly-updated distribution.
The algorithm of the variational approach is decribed in Algorithm 2.
Algorithm 2 Variational Bayes approach.for t = 1 to T do 7: Calculate π t,a = p(y|c t =a,c −t ) Reconstruct w using c * 1:T , and find the best configuration ŵ using the MAP estimate.16: end while

Variance Estimation Using Markov Chain Monte Carlo
The next question is how to estimate κ and τ.Unfortunately, neither of these parameters has any closed form, so that we cannot draw the samples using the Gibbs sampler.Alternatively, we can estimate them by using two approaches: (a) the Metropolis-Hastings (MH) algorithm; and (b) prior processing in the training step using fast heuristic algorithms, such as the EMapproach.In this paper, we use MCMC to ensure consistency in the Monte Carlo approach since the Gibbs sampler is a type of Monte Carlo.

Results
We performed simulations on raw images to evaluate our proposed algorithms.We compared the performance with a popularly-used boundary tracing technique that is a built-in function, "bwtraceboundary", and our proposed three algorithms.Note that the boundary tracing approach is not used for segmenting objects in an image; instead, it is used for linking points.However, we chose this algorithm for the comparison since our approach mainly involves linking edge points in order to segment the objects.We used the following settings for our proposed algorithms: τ = 0.01, κ = 0.01, T = 50 and N iter = 300.

Noise-Free Image
The simulation results we obtained for a noise-free image are as shown in Figure 3.This figure displays three sub-figures: the detected points (left), the trajectory we estimated from the edge points using variational inference (center) and the object segmented using the three proposed approaches (right).Using the interactive segmentation scheme [42,43], we clicked the center of the square in Figure 3a.As we can see in Figure 3b, whereas the detected edge points of uninteresting objects, including "star", "circle," and "triangle", are considered as clutter, the edge points of the square are processed as significant signals with a long trajectory in our proposed approaches.Thus, for a noise-free image, we can obtain proper segmentation using our three proposed approaches as shown in Figure 3c.The elapsed times for this noise-free image are 211.69(MCMC), 211.67 (Gibbs) and 14.558 (variational Bayes).As expected, the variational Bayes approach is more efficient than the other two approaches for noise-free image segmentation.

Images with Occlusion
Image segmentation is known not to be straightforward for images in which occlusion exists.In order to evaluate the performance of our proposed approaches, we generated a synthetic example with occlusion as shown in Figure 4a.In this figure, the square in the black-and-white image includes occlusion.The red circles in the central sub-figure signify the observations associated with the restored trajectory using variational inference.This figure also shows interesting results in that observations in the corner of the square are not detected in the trajectory, even though they are well detected in the noise-free image in Figure 3a.This is because the observations at the corners are considered to be clutter since the occlusions introduce additional uncertainty in the trajectory.

Image Segmentation with Varying Noise Levels
The final aspect we need to investigate as part of our performance evaluation is to check the extent to which the proposed approaches accurately segment objects in an image with varying noise levels.Thus, we added salt and pepper noise at a noise level of σ .
Figure 5 shows (a) the raw image without any added noise σ = 0; (b) hidden true segmentation; (c) detected points obtained using the "Sobel" detector; and (d) the linked boundary.As can be seen in Figure 5c, many detected points are not connected, although the raw image does not contain any additional noise.We evaluated the performance of our algorithm by comparing two measures: the averaged values and standard deviations of overlapping rates and execution times with 10 runs.The overlapping rate is the similarity between the hidden true segmentation in Figure 5e and the region that is reconstructed from the detected trajectories.Let K and K r be the hidden true segmentation and the reconstructed polygon with a small number of detected observations.The similarity is calculated by S(K, K r ) = K∩K r K∪K r .Note that S(K, K r ) has a value between zero and one.As the similarity approaches one, the reconstructed polygon increasingly takes on the desired, hidden true segmentation.
The segmentation results obtained using four different algorithms are shown in Figure 6.As shown in this figure, the "bwtraceboundary" in MATLAB performed very poorly in all cases because it cannot overcome noise.However, our proposed approach performs well even in a severely-cluttered environment as in σ = 0.005.Another interesting finding is the ability of the MCMC-based approach to reliably construct the boundary such that the edges seem invariant.However, the Gibbs sampling and variational Bayes (VB) approaches are adversely affected by the noisy data and perform unstable segmentation.Table 1 presents a more quantitative comparison in terms of the overlapping rate and execution time.As can be seen in this table, MCMC outperforms the other approaches in all cases, and it provides stable results even for σ = 0.01.As we expected, the performance of all approaches also tends to decrease as more noise is added.Since MCMC and the Gibbs sampler are Monte Carlo simulations, they are rather slower than "bwtraceboundary" and VB.The last important finding is that VB is a practically efficient algorithm for small values of σ since it provides a high overlapping rate and its execution time is relatively short compared to that of the Monte Carlo simulations.

Application to More Complicated Raw Images
We also tested our proposed approaches on several natural images that were obtained from a few image databases [29,44].Figure 7 shows the segmentation results processed by the "bwtraceboundary" function and our proposed three approaches.We obtained the edge points from the raw image by using Gaussian convolution and a Canny detection [6].The first, second, third and fourth rows represent the natural images, edge points obtained by the Canny detection, trajectories associated by variational inference and the final segmentation using the four different approaches.
Figure 7 shows that our proposed approaches outperform "bwtraceboundary" in the segmentation procedure.In addition, the segmentation results of the MCMC simulation are more rigid than those obtained using the Gibbs and variational approaches.We also compared the elapsed times for all approaches.Table 2 indicates that 'T bwtraceboundary (Matlab) < T VariationalBayes(VB) << T MCMC ≈ T Gibbs .We also evaluated the performance of our proposed approaches by changing the detection algorithms.Figure 8 shows the segmentation results for the detected points, which were obtained by the Canny, Bergholm and inversion detection algorithms.It is obvious that our proposed approaches depend on the performance of the edge detection algorithm in use.For instance, the objects in (b), (c) and (d) are well segmented when the inversion detection algorithm is used.

Discussion
Apart from the superior performance of our proposed approach in terms of the segmentation of some real images, there are several issues that need to be improved.First, our proposed approach samples the intersecting lines in a radial way such that polygons with a concave form may not be appropriately segmented.Instead of obtaining the concave shape, our proposed approaches are likely to partition the desired concave polygon into several unwanted smaller convex polygons; Second, there are several hyper-parameters that should be carefully selected: κ, τ and u 0:2 = (u 0 , u 1 , u 2 ) for the coefficients of the second order GMRF.Unfortunately, these hyper-parameters are very sensitive in the sense that if they are poorly selected, our proposed approaches do not work.Therefore, we need to conduct research on how to automatically select the hyper-parameters used in learning or estimation frameworks; Third, the current approaches do not guarantee stability as they rely on varying origins.The location of the origin determines the velocity and acceleration of the trajectory, which means that κ and τ would have to be changed with varying location; Last, we need to improve our proposed approaches to enable them to process multiple objects since the current algorithms only have the ability to accommodate single objects using single-target tracking.

Conclusions
We introduced a new scheme capable of embedding single-target tracking algorithms, to connect edge points for longer salient edges.Conventional approaches for linking edges are mostly based on the connection of neighboring points rather than global patterns.Therefore, these approaches perform very poorly in the presence of unwanted noise, interruptions and occlusions.However, we can overcome these problems by introducing a probabilistic target-tracking scheme, such as that typically used in cluttered surveillance systems.That is, we converted the edge-linking problem into a target-tracking problem.Given this tracking scheme, we also developed three different algorithms: MCMC, a Gibbs sampler and variational Bayes.MCMC outperforms the other approaches in most cases, although the variational Bayes algorithm seems practically efficient, because it has reasonable accuracy and is computationally effective.

Figure 2 .
Figure 2. Used graphical models: (a) the second order Gaussian Markov random field; and (b) circular state space model.

Figure 3 .
Figure 3.Comparison of segmentation by three different proposed algorithms for a noise-free image.(a) Detected Points; (b) Detected trajectory via variational Bayes; (c) Segmentation.

Figure 4 .
Figure 4. Comparison of the segmentation by the three different proposed algorithms for an image with occlusion.(a) Detected points; (b) Detected trajectory via variational Bayes; (c) Segmentation.

Figure 8 .
Figure 8. Interactive segmentation via three different detection algorithms.
t = arg a max { p * t,a } N t a=0 , and update p t ← p * t .14: end for 15:

Table 1 .
Comparison of the overlapping rate, S(K, K r ) and execution time.VB, variational Bayes.

Table 2 .
Comparison of the execution times for interactive segmentation via the Canny detection.