Weighted Quasi-Interpolant Spline Approximations of Planar Curvilinear Profiles in Digital Images

The approximation of curvilinear profiles is very popular for processing digital images and leads to numerous applications such as image segmentation, compression and recognition. In this paper, we develop a novel semi-automatic method based on quasi-interpolation. The method consists of three steps: a preprocessing step exploiting an edge detection algorithm; a splitting procedure to break the just-obtained set of edge points into smaller subsets; and a final step involving the use of a local curve approximation, the Weighted Quasi Interpolant Spline Approximation (wQISA), chosen for its robustness to data perturbation. The proposed method builds a sequence of polynomial spline curves, connected C0 in correspondence of cusps, G1 otherwise. To curb underfitting and overfitting, the computation of local approximations exploits the supervised learning paradigm. The effectiveness of the method is shown with simulation on real images from various application domains.


Introduction
Approximating curvilinear profiles in digital images with piecewise polynomials is very attractive in many application domains, as it leads to more compact and less wiggly representations of borders and contours. In this work, we study the applicability of the quasi-interpolation paradigm for approximating open and closed curves that can be modeled as 1-manifolds, that is, topological spaces wherein each point has a neighborhood that is homeomorphic to the Euclidean space of dimension 1; an extension to piecewise 1-manifolds is straightforward by decomposing the curve in 1-manifolds and proceeding one segment at a time [1].
The term quasi-interpolation has been interpreted differently by different authors, depending on the context of application. We here call quasi-interpolant any linear operator L of the form: where f : Ω ⊂ R n → R is a function being approximated, µ j are linear functionals, g j : Ω ⊂ R n → R are functions at our disposal (see, for example [2,3]). The coefficients µ j ( f ) are, in general, one of the following types: linear combinations of given values of the function f to be approximated (discrete type); linear combinations of values of derivatives of f , of order at most d (differential type); linear combinations of weighted mean values of f (integral type). Equation (1) can be interpreted as a "reconstruction" formula: given some input data sampled from the true function f , it creates a tentative reconstruction Q f . The origin of quasi-interpolation is traditionally traced back to Bernstein's approximation [4], where the functions g j are Bernstein polynomials. A rather simple generalization, known as Variation Diminishing Spline Approximation (VDSA), generalizes this construction to B-splines (see, for example [5,6]). Since its inception, quasi-interpolation has been studied to obtain methods that apply to different domains and with the aim of increasing the order of convergence: recent developments include univariate and tensorproduct spaces [7][8][9], triangular meshes [10][11][12][13], quadrangulations [14] and tetrahedra partitions [15], among others.
Unlike traditional quasi interpolation spline methods, which generally focus on the approximation of functions with strong regularity conditions, Weighted Quasi Interpolant Spline Approximations (wQISAs) [16,17] aims to provide local approximations of point clouds which can be affected by artifacts such as noise and outliers. However, a paradigm to compute global approximations was, so far, missing; indeed, the only two global approximations presented in [16] are obtained by manually processing the set of edge points, thanks to the quite simple geometry of the studied profiles. In this paper, we introduce a novel method that exploits wQISAs to construct global approximations of planar curvilinear profiles, with application to digital image processing. More specifically, the main contributions of this paper are: • The introduction of a novel algorithm, based on Weighted Quasi-Interpolant Spline Approximations, to compute global approximations of planar curvilinear profiles; • The validation of the method on real images from different application domains with respect to different evaluation measures.
The smoothness of a curve is generally distinguished between parametric continuity C n and geometric continuity G n [18,19]. A curve p(t) is said to be C n continuous at t = t 0 if it is continuous at t 0 and if the first n left and right derivatives of p match at t 0 , that is, A curve p(t) is said to be G n continuous at t = t 0 if, up to regular re-parametrization, it is C n continuous at t = t 0 . Recently, the concept of fractional continuity has been proposed for generalized fractional Bézier curves [20]. In our context, the global approximation of a planar profile is computed by gluing together local approximations, by imposing a C 0 continuity in correspondence of cusps, and a G 1 continuity otherwise; while it is possible to consider the more general case of G n (or C n ) continuity, G 1 continuity has proved to be sufficient in many practical applications (see, for example [21,22]).
The remainder of the paper is organized as follows. In Section 2, we provide some notation and background knowledge about the wQISA family. In Section 3, the core of our paper, we introduce our reconstruction algorithm. To test the validity of the suggested approach, Section 4 presents several experimental results on images from different application domains and with respect to different evaluation measures. Concluding remarks end the paper.

Preliminary Concepts
This section is meant to list some basic definitions regarding the wQISA family, as well as to introduce the essential terminology and notation. Because in this paper we focus on 1-manifolds, we restrict our attention to a univariate formulation of wQISA; for more details on the more general multivariate setting and an extended analysis of the theoretical properties, we refer the reader to [16].
Given a degree p, a knot vector is said to be (p + 1)-regular if no knot occurs more than p + 1 times and each boundary knot occurs exactly p + 1 times.
Let P ⊂ R 2 be a point cloud and p ∈ N * . Let x be a (p + 1)-regular knot vector with boundary knots x 1 = · · · = x p+1 = a and x n+1 = · · · = x n+p+1 = b. A Weighted Quasi Interpolant Spline Approximation of degree p to the point cloud P over the knot vector x is defined by: where x * i := (x i + . . . + x i+p )/p are the knot averages, the expression, gives the control polygon estimator with weight function w : 1] denotes the B-spline of degree p, which is uniquely determined by the local knot vector ]. Note that, given a point cloud, its wQISA depends on the following inputs: a spline space, uniquely defined by a degree and a (global) knot vector, and a weight function (possibly, with free parameters to be tuned). An example of weight function is given by the k-Nearest Neighbours (k-NN) average: where k ∈ N * ; N k (u) denotes the neighborhood of u defined by the k closest points of the point cloud; here, k is the free parameter. Figure 1 shows the approximation of a dataset of 250 points when a 5-NN weight is considered. The point cloud is sampled from the analytic function F(x) = x sin (π/2x), and is eventually perturbed with Gaussian noise ∼ N(0, s(x)), where The approximation is computed by a quadratic spline space containing n = 20 Bsplines over a uniform knot vector.  Figure 1. Variable noise approximation. The analytic function F(x) = x sin (π/2x) is sampled and then perturbed with (non-uniform) Gaussian noise of mean 0 and the standard deviation of Equation (5). The resulting wQISA approximation, obtained by a 5-NN weight function, is displayed red.

Curvilinear Profile Approximation via a Quasi-Interpolation-Based Technique
In this section, we design an algorithm that piecewisely applies wQISA to the approximation and recognition of open and closed planar curvilinear profiles. The method consists of three main steps: • Pre-processing step (see Section 3.1); • Point cloud split (see Section 3.2); • Local approximation via wQISA (see Section 3.3).

Pre-Processing Step
Firstly, we run an edge detection algorithm to an input a (grayscale) image, hereinafter denoted by I: in our case, we used the Canny edge detection [23], as it is considered among the most effective edge detection techniques [24,25]. To give an example, when considering the mushroom of Figure 2, we note that the Canny edge detection algorithm provides a much better result than other popular methods, see Figure 2: indeed, the Canny edge detection provides a much cleaner set of edge points, and is the only one able to identify the whole curvilinear profile without wrongly decomposing it into subparts.

Canny
Sobel Prewitt Roberts log The pixels identified as discontinuity points by the edge detection algorithm are then processed to identify 8-connected components. Their centres are maintained in P. For the sake of conciseness, we will hereinafter suppose that P consists of a single 8-connected component; in case of more than one 8-connected component, the reconstruction method will be applied as many times as their number.
For every point in P, the local discrete tangent vector is computed (for example, by means of regionprops in MATLAB). All those points with a sharp change in the unit tangent vector are stored in P 0 , and will be eventually used to impose a C 0 continuity instead of G 1 continuity; to put it another way, the order of continuity will be only C 0 in correspondence with the identified cusps, as expected. A graphical illustration of the C 0 and G 1 continuities is given in Figure 3.

Point Cloud Split
Differently from the method in [26] that iteratively covered a curvilinear profile with disks of fixed radius, in our pipeline the point cloud P is split into subsets, each of which is meant to be approximated independently. The key idea behind our procedure is loosely inspired by the concept of tubular neighborhoods-see, for example [27]. A graphical illustration of two possible iterations is given in Figure 4.  3.2.1. Constructing the k-th Local Tubular Neighborhood, k ≥ 1 Given an initial point P k 0 , where k ≥ 1, and an input radius R 1 > 0, let B(P k 0 , R 1 ) denote the (2-)ball of center P k 0 and radius R 1 . We build a sequence of points P k 0 , . . . , P k such that: • For any index 1 ≤ i ≤ M k + 1, the point P k i is computed from P k i−1 as follows: -Advance step. Set Q k i to be the farthest point from P k i−1 in B(P k i−1 , R 1 ) ∩ P, among all those points that were not contained in any previously-computed balls.

-
Smoothing step. Set P k i to be the point in P which is the closest to the barycenter of the points in B(Q k i , R 2 ) ∩ P, where R 2 < R 1 is an input value that should account for the amount of noise.

•
The index i = M k is the first one for which either of the following criteria holds true: -Angle criterion. The angle between the tangent vectors in P k 0 and P k M k +1 is above the threshold α 0 .
-Point criterion. The ball B(P k M k +1 , R 1 ) ∩ P does not contain new points (i.e., points which were not contained in previously-computed balls).
, R 1 ) contains the initial point P 1 0 . When stopping because of the angle criterion, we define the k-th local tubular neighbourhood as: Its candidate edge points are E k 0 := P k 0 and E k 1 := P k M k ; finally, we set P k+1 0 := P k M k to be the initial point for the subsequent local tubular neighborhood. On the contrary, when stopping because of either the point or loop criteria, the k-th local tubular neighbourhood is defined as: and its candidate edge points are E k 0 := P k 0 and E k 1 := P k M k +1 , where P k M k +1 := P 1 0 in case the loop criterion holds.
When stopping because of the point criterion, a new search is conducted from the initial point P 1 0 backward, to take into account the possibility of open profiles. Note that the whole splitting procedure merely relies on the following input: the initial point P 1 0 , the radii R 1 and R 2 , and the threshold α 0 . A more detailed discussion on the parameter setting is left to Section 3.4.
The local tubular neighbourhoods P 1 , . . . , P N allow us to locally represent the curvilinear profile of interest. However, their construction does not take into account the points in P 0 , that is, the points having a sharp change in the tangent vector.
Suppose that a cuspP is inside the two balls B(P k i , R 1 ) and B(P k i+1 , R 1 ). Then, we split the k-th local tubular neighbourhood into two new local tubular neighbourhoods: • One will contain all balls B(P k The cuspP will be set to be a common edge point of the two new local tubular neighbourhoods.

Local Approximation via wQISA
For any P k , the local parametrization is computed as follows: 1.
Change of coordinates. Apply a roto-translation ρ : R 2 → R 2 to P k , so that the line segment < ρ(E k 0 ), ρ(E k 1 ) > lies on the x-axis. This corresponds to assuming that our local approximation can be locally flattened onto the x-axis without any overlap; note that this assumption relies on the parameter α 0 .

2.
Local spline space. Define τ k j := [τ 1 , . . . , τ j+6 ] as the j-th 3-regular knot vector where: , being E k,ρ,x i the abscissa of the roto-translation of E k i with respect to the map ρ. Note that the map ρ can be chosen so that E k,ρ,x 0 ].

3.
Local wQISA approximation. Compute a local approximation for P k by applying Equation (2) with k-NN weight function; note that the parameter k in the weight function does not necessarily equal the index k in the sequence of local tubular neighborhoods.
A discussion on the tuning of the free parameters (i.e., knot vector length and k of the k-NN weight function) is provided in Section 3.4.
Given that for any local tubular neighbourhood, the corresponding knot vector is (p + 1)-regular, the final approximation will be automatically C 0 continuous. To impose G 1 continuity, we exploit (p + 1)-regularity: the tangents to the curve at the boundaries are determined by its 2nd and (n − 1)th control points; we project the 2nd and (n − 1)th onto the straight lines determined by the tangent vectors. A similar reasoning can be applied to interpolate the left and right derivatives at cusps, up to a multiplicative constant.

Free Parameters and Tuning
The initial point P 1 0 is set to be the cusp, which is the closest to the lower-left corner. If the profile does not exhibit any cusps, we can consider the point of maximum (or minimum) curvature-the closest to the lower-left corner if more than one exists .
The radii R 1 and R 2 depend on the profiles and the noise level, and are currently empirically set by the user.
For each local approximation, the optimal global knot vector τ k opt and the optimal k opt for the weight function can be found by a model assessment and selection procedure, as in statistical learning; more precisely, we proceed by performing Leave-One-Out Cross-Validation (LOO-CV, see [28]), by considering the Root Mean Square Error as a generalization error (see Section 4). This allows us to limit under-and overfitting; an example is given in Figure 5 of a local tubular neighborhood exhibiting both noise and outliers. The use of Cross-Validation for the estimation of the unknown coefficients allows us to interpret the curve fitting problem as a supervised learning problem (see, for example [29]).
To generalize the method, one could select the continuity G k at the segment edge points by cross-validation; however, G 1 continuity has proved to be sufficient for all the examples given in Section 4.   Figure 6. In the left image, the local approximation consists of an oversimplification of the input data. In the right image, the local approximation models the training data too well, resulting in unwanted oscillations that reduce the prediction capability on new points. A good balance between underand over-fitting is displayed in the central image.

Computational Complexity
The computational complexity of the parametrization step is the sum of the single wQISA iterations. On its turn, wQISA mainly depends on the k-NN weight, whose time complexity for estimating a single coefficient is proportional to O(klog(N)), where N is the number of points. To further reduce the computational complexity of the k-NN search, we follow the k-d trees approach proposed in [30]. It is worth noting that, each coefficient and each local approximation being computed independently, this task is embarrassingly parallel.
The splitting procedure can itself rely on a k-d tree, thus it has the same computational complexity. On the other hand, it cannot be performed independently, and turns out to be the most demanding part of the algorithm.

Experimental Results
We evaluate our method on photographic images, animation images and CT scans. Although the approximation could theoretically make use of B-splines of any degree, we here focus on quadratic spline approximations, as they provide a sufficient flexibility for our purposes; nevertheless, one could consider the degree of each local approximation as an additional parameter to be tuned.

Performance Measures
To evaluate the quality of an approximation against the input point clouds, the following indicators are considered: • The Root Mean Square Error (RMSE), that is, the square root of the Mean Square Error, is defined by: Y i being the abscissa of an input point with respect to its local coordinate system,Ŷ i its approximation, and N the cardinality of the input point cloud. Similarly, the Mean Absolute Error (MAE) is given by: • The (normalized) Directed Hausdorff Distance from the points of a ∈ A ⊂ R 2 to the points of b ∈ B ⊂ R 2 is defined as: where d is the Euclidean distance and L is the diameter of the input point cloud. In our case, we set B to be the input point cloud P, while A is given by the points defining our spline approximation. | · | being the cardinality of a set. The Jaccard index has been intensively applied to measure the performance of curve recognition methods for images (see, for example [31]). Here, we compare the set of pixels corresponding to P with the set of pixels crossed by the approximation and their 1-neighbours.

Photographic Images
We start by considering a 183 × 183 image representing a flower (Image retrieved on MathWorks ® ), see Figure 6. The use of the Canny edge detection algorithm and the subsequent study of the tangent vector of the profile allows us to identify all cusps, as shown in Figure 6b. The set of edge points is then split into 13 local tubular neighborhoods, see Figure 6c. The obtained approximation, superimposed to the original image, is displayed in Figure 6d. Given that plotting the local tubular neighborhoods is not particularly informative, we just report their number for the remaining examples in Section 4.5.

Animation Images
A composite example is exhibited in Figure 8. In Figure 8a, the input 984 × 600 image is shown. In Figure 8b,   A single but more complex closed profile is provided in the 734 × 670 image from Figure 9a, obtained from the dataset used in [33]. The set of edge points and the detected cusps are given in Figure 9b, while the final approximation is shown in Figure 9c.  We then consider a coronal slice obtained from the MedPix database (MedPix database, https://medpix.nlm.nih.gov/, accessed on 1 September 2021), see Figure 11. For our testing, we focus our attention on the lumbar spine, whose edge points are shown in Figure 11b and approximated in Figure 11c. For the last simulation, we consider a 512 × 512 axial X-ray CT slice of a human lumbar vertebra, see Figure 12. The image has been obtained by the repository [34]. In Figure 12b we provide the three sets of edge points chosen to be approximated, and corresponding to the external profile of the body section; the vertebra, where it is possible to distinguish body, transverse process and spinous process; the abdominal aorta. In Figure 12c we show the superimposition of three distinct approximations to the original image.

Quantitative Evaluation and Discussions
The indicators computed on each profile of each image are provided in Tables 1-3. When considering the RMSE, MAE and d dHaus , all approximations show a quite satisfying performance. In particular, having normalized the directed Hausdorff distance, we can conclude that the error is always below 3%, with the highest values coming from the lumbar spine. Only a few profiles have a Jaccard index lower than 95%; the lowest possible is that of the body in Figure 12 (61.41% ca.). However, one should consider that this indicator is not reliable when applied to point sets containing a high number of outliers, which is the case for this specific profile, see Figure 13b: the local approximation is able to capture the underlying trend of the data and preserve the monotonicity, but due to the presence of an excessive number of erroneous points, this indicator is not fully representative of the situation. On the contrary, the edge points from Figure 9 are noisy but contain a limited number of outliers; an example of a small local tubular neighborhood and its approximation is given in Figure 13a.  Figures 6 and 7. For each profile, we report: the Root Mean Squared Error (RMSE), the Mean Absolute Error (MAE), the normalized directed Hausdorff distance d dHaus and the Jaccard index J. Here, fit1, fit2 and fit3 stand, respectively, for quadratic polynomial, piecewise cubic and smoothing spline fitting. For each figure and each indicator, the best performance is reported in bold.   All in all, it might be concluded that the method can successfully reconstruct planar curvilinear profiles, but its success is inevitably linked to the quality of the segmentation: as observed in the qualitative discussion of Figure 7, the Canny edge detection algorithm was unable to identify the stalk of the mushroom in its entirety.

Figure
We compared the generalization capability of wQISAs with that of other fitting methods implemented in MATLAB: quadratic polynomial, piecewise cubic and smoothing spline fitting. When considering RMSE, MAE and d dHaus , our methods outperforms its competitors; however, in a few cases the results are comparable. In terms of the Jaccard index, the results are more fluid: as previously observed, this is partly justified by the presence of outliers in some point clouds, which makes this indicator not always trustworthy.
To provide a complete analysis of the experimental results, Table 4 provides the number of tubular neighborhoods, the number of edge points and the execution times for all the examples shown in the paper.

Conclusions
In this work, we have proposed a novel method to recognize and approximate closed and open 1-manifolds in digital images. Based on the Weighted Quasi-Interpolant Spline Approximation family, our approach can provide a sequence of parametrizations connected G 1 continuously, except in correspondence with the identified cusps, where we impose C 0 continuity. For each local approximation, the number of knots in the knot vectors and the free parameter k are tuned by cross-validation, to limit under-and overfitting. Being based on the quasi-interpolation paradigm, the method is computationally efficient because it does not involve least-square approximation routines. Experimental results on a set of real digital images show a good performance on real data.
As a future development of the method, we intend to extend it to the case of space curves, that is, to non planar profiles embedded into the Euclidean space, and to curves with non-manifold configurations, such as knots and self-intersections. A typical solution for dealing with complicated image borders is to consider arcs and line segments at the same time [1]; in our case, our method can be iterated on the single 1-manifold arcs by decomposing a general profile into segments.