A Method of Curve Reconstruction Based on Point Cloud Clustering and PCA

Abstract: In many application fields (closed curve noise data reconstruction, time series data fitting, image edge smoothing, skeleton extraction, etc.), curve reconstruction from noisy data has always been a popular but challenging problem. Within a single domain there are many curve reconstruction methods for noisy data, but methods suitable for multi-domain curve reconstruction have received much less attention in the literature. More importantly, the existing methods are time-consuming when dealing with large data sets and high-density point cloud curve reconstruction. For this reason, we propose a curve fitting algorithm that is suitable for many fields and has low time consumption. In this paper, a curve reconstruction method based on clustering and point cloud principal component analysis is proposed. Firstly, the point cloud is clustered by the K-means++ algorithm. Secondly, a denoising method based on point cloud principal component analysis is proposed to obtain the interpolation nodes for curve subdivision. Finally, the fitting curve is obtained by the parametric curve subdivision method. Comparative experiments show that our method is superior to the classical fitting method in terms of time consumption and effect. In addition, our method is not constrained by the shape of the point cloud, and is also applicable to time series data, image thinning and edge smoothing.


Context
Curve reconstruction of planar point clouds is an important research issue in reverse engineering. Reverse engineering is a new discipline and technology that has developed alongside advances in computer technology and data measurement. Curve reconstruction has important applications in virtual cultural relic restoration, image edge smoothing, image refinement and 3D point cloud reconstruction. Curve reconstruction methods are generally divided into the reconstruction of ordered or unordered point sets. The reconstruction of ordered points refers to constructing a curve that interpolates or approximates the sampling points in turn. For this problem there are many mature methods, such as the B-spline method [1,2], rational interpolation or approximation [3,4], and the subdivision method [5,6]. However, in reverse engineering, due to the different data sampling methods, the obtained data set is often an unordered discrete point cloud. There are many methods for curve reconstruction of unordered point clouds, such as the least squares method [7-10], clustering methods [2,11-14] and principal component analysis [15-18].

Analysis of Existing Methods and Research Objectives
The least squares (LS) method is one of the most widely used methods at present, mainly because it can find the best matching function of the data by minimizing the square of the error and obtain an explicit expression. In 1998, Levin [7] proposed the moving least squares (MLS) method, applied it to the field of curve reconstruction, and used it to refine scattered point clouds. In 2017, Mustafa [10] used a dynamically weighted iterative least squares method to propose a nonlinear subdivision scheme based on univariate cubic polynomials, which deals with the fitting of scattered data with noise and outliers. MLS performs well in curve reconstruction on small data sets or open-curve data sets; however, on large data sets the method needs to calculate the coefficients of polynomials, and if the condition number of the matrix is large, the equations may be ill-conditioned. On the other hand, the MLS method is a technique based on polynomial fitting. If the values of a large data set (such as stock prices) change sharply, the fitting is prone to over-smoothing; that is, the fitting curve is too smooth, ignoring some details of the data and contributing insufficiently to preserving the shape of the curve, as shown in Figure 1. At the same time, in the fitting of large data sets, the running time is directly proportional to the number of compact support sets, so the MLS method is also deficient in fitting efficiency.

Figure 1. (a) Stock price data: the abscissa represents the time (day) and the ordinate represents the price; there are 2130 data points. (b) The fitting curve obtained by MLS: the number of nodes is 100, the weight function is the cubic spline function, and the basis function is [1, , ]. (c) The fitting curve obtained by the method proposed in this paper; the number of control points is 100.
The clustering method is a widely used and effective classification method, mainly applied in image segmentation [19-22], statistical analysis [23-25], and industrial design [16,17]. The goal of point cloud clustering is to identify clusters with the same characteristics from a group of points. The K-means algorithm is one of the oldest and most commonly used clustering algorithms; it is well suited to recovering the desired curve shape, such as the approximate optimal shape of a scanned data point set. In point cloud reconstruction, this method is mainly used to deal with outliers and has achieved good results.
In 2021, using the clustering and B-spline method, Chen [14] presented an automatic approach to generating a fitting curve for a set of unorganized points generated randomly from a closed curve; unfortunately, this method can only handle the fitting of noisy data on closed curves. In 2021, Gu [26] proposed a novel moving total least squares (MTLS)-based reconstruction method combined with K-means clustering to enhance the robustness and accuracy of MTLS in handling measurement data with outliers. The algorithm first adopts LS for pre-fitting, and then uses MTLS to generate the fitting curve. Although it can remove the outliers well, two matrix operations are performed in each support domain, which has high computational complexity and reduces the fitting efficiency. For example, when processing the point cloud curve reconstruction of more than 6000 data points, the classical MLS fitting method generally takes more than 20 s.
Principal component analysis (PCA) is a classical point cloud normal vector calculation method proposed by Hoppe [27]. The method constructs the covariance matrix from the neighborhood information of a point, so as to obtain the geometric information of the point. In 2011, Furferi [16] provided a method of using PCA to detect the local main direction of a point cloud and fit disordered data; the method was applied to image thinning and proved to be effective in preserving the original shape. In 2020, Yang [17] proposed a point cloud simplification algorithm based on PCA and clustering, whose main purpose was to obtain the normal vector, angle entropy, curvature and density information of the point cloud. In 2021, Neuville [28] proposed a method to segment tree stems in a deciduous forest stand that does not rely on any site-specific parameters; a clustering algorithm is used to segment the trunk, and the PCA method is used to extract the trunk direction. In the above works, PCA is used to determine the direction (curve) or normal vector (surface) of a point cloud. In fact, this method can also be adopted for noise processing, but this is not reflected in the above literature.
For this reason, we propose a point cloud curve reconstruction method based on point cloud clustering and principal component analysis. Firstly, the K-means++ clustering method is used to segment the point cloud into several clusters. Secondly, point cloud principal component analysis (PCPCA) is proposed on each cluster to remove the outliers and to detect the main control point and the principal direction of the cluster. Thirdly, a C^1 limit curve is obtained by non-uniform curve subdivision. The main contributions of this paper are as follows:

1. PCPCA is proposed to find the projection line and main direction of each cluster, and the σ-principle is adopted to remove the outliers.

2. For complex point clouds with self-intersection, using the main direction of each cluster, we can divide all main control points into two groups and sort the points within each group.

3. The method proposed in this paper has wider application fields than state-of-the-art methods, including planar point cloud curve reconstruction, time series data fitting, image refinement and image edge smoothing.

4. In the curve reconstruction of high-density point clouds, our method takes 20% of the time of classical methods (such as MLS).
In the following, we first review K-means clustering and K-means++ clustering in Section 2. In Section 3, PCPCA is proposed, the method of removing outliers and the determination of main control points and main directions are introduced. In Section 4, we review the non-uniform curve subdivision method. Section 5 gives a series of numerical examples to show the effectiveness and robustness of the method proposed in this paper. The conclusions and discussion about this paper are in Section 6.

K-Means Clustering
The K-means clustering algorithm was proposed independently by Steinhaus (1955), Lloyd (1957), Ball and Hall (1965), and MacQueen (1967), in different scientific fields. It has been widely studied and applied in different disciplines since it was proposed.

Classical K-Means Clustering
The K-means clustering algorithm is an iterative algorithm that divides all the points {p_i}_{i=1}^N into k clusters {C_1, C_2, . . . , C_k}. First, k points {p_1, p_2, . . . , p_k} are randomly selected as the initial clustering centers {c_1, c_2, . . . , c_k}. The distance from p_j to each clustering center c_i, i = 1, 2, . . . , k, is

d_{i,j} = ||p_j − c_i||_2.

Then, based on the minimum value min_i d_{i,j}, p_j is assigned to the cluster C_i. After all points have been assigned, forming the first clusters, a new cluster center c_i^{New}, i = 1, 2, . . . , k, is generated from the points in each cluster as

c_i^{New} = (1/|C_i|) Σ_{p_j ∈ C_i} p_j.

Next, all data points are redistributed according to the previous method to obtain the second clusters. Continuing in the same way, the K-means clustering algorithm stops when the points in each cluster no longer change.
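The iteration just described can be sketched as follows; this is a minimal illustration rather than the authors' code, and the empty-cluster guard and stopping test are our own choices:

```python
import numpy as np

def kmeans(points, k, rng=np.random.default_rng(0), max_iter=100):
    """Plain K-means: random initial centers, then alternate assignment/update."""
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        # d[j, i] = distance from point j to center i
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)  # assign each point to its nearest center
        # recompute each center as the mean of its cluster (keep it if the cluster is empty)
        new_centers = np.array([points[labels == i].mean(axis=0)
                                if np.any(labels == i) else centers[i]
                                for i in range(k)])
        if np.allclose(new_centers, centers):  # clusters no longer change
            break
        centers = new_centers
    return labels, centers
```

Running it on two well-separated blobs recovers one cluster per blob.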

Improved K-Means Clustering (K-Means++)
In 2007, D. Arthur [20] improved the K-means clustering method and proposed the K-means++ clustering method, which randomly selects a data point p_i as the initial clustering center c_1 from the initial data set {p_i}_{i=1}^N. The distance between every other point p_j and c_1 is calculated as d_{1,j} = ||p_j − c_1||_2, and the probability

D_{1,j} = d_{1,j}^2 / Σ_j d_{1,j}^2

is introduced. A point p_j is randomly selected as the second clustering center c_2 according to the probability D_{1,j}. Assuming that n initial clustering centers have been selected (0 < n < k), when selecting the (n+1)-th clustering center, a point farther away from the current n clustering centers has a higher probability of being selected. K-means++ can significantly reduce the final error of the classification results, and it is the clustering method used in this paper. Figure 2 shows the difference between the two clustering methods on the same point cloud. Under different clustering methods and different numbers of clusters, Table 1 shows the average distance from the points in each cluster to the cluster center for the point cloud in Figure 2a.

It can be seen from the two figures that the red rectangular area is an extreme region of the point cloud: the K-means clustering method divides the region into two clusters and generates two cluster centers, while the K-means++ method generates a single cluster and a single cluster center in the region, which better preserves the shape of the point cloud.
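The seeding rule just described, selecting each new center with probability proportional to its distance to the centers chosen so far, can be sketched as follows. This is our own minimal illustration, using the common squared-distance weighting:

```python
import numpy as np

def kmeanspp_seeds(points, k, rng=np.random.default_rng(0)):
    """Pick k initial centers: the first uniformly at random, each later one
    with probability proportional to the squared distance to its nearest
    already-chosen center (points far from all current centers are favored)."""
    centers = [points[rng.integers(len(points))]]
    while len(centers) < k:
        # squared distance of every point to its nearest chosen center
        d2 = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        probs = d2 / d2.sum()  # the selection probabilities D_{n,j}
        centers.append(points[rng.choice(len(points), p=probs)])
    return np.array(centers)
```

With two coincident blobs, the second seed is forced into the blob not containing the first, since points at distance zero have selection probability zero.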
Point Cloud Principal Component Analysis (PCPCA)

PCA is a statistical method that converts multiple related variables into a few independent variables (principal components) through dimension reduction, and it is one of the most important dimension reduction methods. It can reduce the dimension of high-dimensional data and remove noise in the process.
In this paper, we propose a point cloud principal component analysis (PCPCA). The main idea of this method is as follows: (1) the abscissas x and ordinates y of the points in the point cloud P = {p_i}_{i=1}^M form two M-dimensional column vectors x and y; (2) an M-order square matrix A is obtained from the formula A = x x^T + y y^T, and the eigenvalues and eigenvectors of A are found; (3) the eigenvector corresponding to the maximum eigenvalue is recorded as ω, and the projected point set Q is obtained from the formula Q = (ωω^T x, ωω^T y).
Q = (x_1, y_1) is the projected point set on a straight line l. The main control point q = (x_0, y_0) is defined as x_0 = ω^T x, y_0 = ω^T y. Since x_0 (and y_0) is 1-dimensional, a dimension reduction of the initial M-dimensional data vector x (and y) is needed, so that the idea of the PCA method can be used.
First of all, let ω be the unit transform vector for dimensionality reduction, so that

min_ω ||x − ωx_0||_2^2 + ||y − ωy_0||_2^2, subject to ω^T ω = 1. (1)

The minimization principle (1) implies that the projected points (x_1, y_1) = (ωx_0, ωy_0) are sufficiently near to the data points (x, y). Let

f(ω) = ||x − ωx_0||_2^2 + ||y − ωy_0||_2^2.

Substituting x_0 = ω^T x and y_0 = ω^T y into the above expression yields

f(ω) = ||x||_2^2 + ||y||_2^2 − ω^T (x x^T + y y^T) ω.

Constructing the Lagrange function

L(ω, λ) = −ω^T (x x^T + y y^T) ω + λ(ω^T ω − 1),

whose first term is a quadratic form, and setting the partial derivative of L with respect to ω to zero gives

(x x^T + y y^T) ω = λω,

which implies that ω is an eigenvector of the matrix x x^T + y y^T corresponding to the eigenvalue λ.
Let P = {p_i}_{i=1}^M be a point cloud set, and let ω be the eigenvector corresponding to the maximum eigenvalue λ of the matrix x x^T + y y^T, so that the projected point set is generated by Q = (ωω^T x, ωω^T y). The matrix x x^T + y y^T is symmetric and has rank at most two, so it has at most two non-zero eigenvalues, denoted λ_1 and λ_2. Suppose λ_1 > λ_2; the corresponding eigenvectors ω_1 and ω_2 are orthogonal unit vectors.
Substituting ω_1 and ω_2 into the objective function shows that the eigenvector ω_1 corresponding to the eigenvalue λ_1 minimizes the distance between the projected points (x_1, y_1) = (ω_1 ω_1^T x, ω_1 ω_1^T y) and the data (x, y). The straight line on which (x_1, y_1) lies is the main straight line of the cluster, and the direction n of this line is the principal direction, as shown in Figure 3.
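Steps (1)-(3) of PCPCA can be sketched directly with an eigendecomposition of A = x x^T + y y^T. This is a minimal sketch of our reading of the construction above; the function name and test data are ours:

```python
import numpy as np

def pcpca_project(points):
    """Project a cluster onto its main straight line using the eigenvector of
    A = x x^T + y y^T belonging to the largest eigenvalue (the PCPCA idea)."""
    x, y = points[:, 0], points[:, 1]
    A = np.outer(x, x) + np.outer(y, y)  # M x M symmetric matrix, rank <= 2
    vals, vecs = np.linalg.eigh(A)       # eigenvalues in ascending order
    w = vecs[:, -1]                      # omega: eigenvector for lambda_1
    # projected coordinates (omega omega^T x, omega omega^T y); sign of w cancels
    x1, y1 = w * (w @ x), w * (w @ y)
    return np.column_stack([x1, y1])
```

For points lying exactly on a line through the origin, the projection reproduces the points.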


Processing of Point Cloud Outliers
The main function of the PCA method in data processing includes two aspects: dimensionality reduction and noise reduction. In Section 3.1, we indirectly use the dimensionality reduction function of the PCPCA method to obtain the projection points and projection lines of noisy data. In the traditional PCA method, these projection points are treated as real points, so as to achieve the function of noise reduction. However, the noisy data may contain some outliers, which affect the projection points. Therefore, we add a step of removing outliers to the PCPCA method.

In statistical analysis, the 3σ principle is often used to denoise. That is, a group of test data contains random error, and the mean µ and standard deviation σ can be obtained from the data. The probability that the data lie in the interval (µ ± 3σ) is 0.997. Accordingly, data outside this interval are not generated by random error but are outliers, which should be eliminated. This paper adopts the σ principle to delete outliers.

σ principle: let d_i, i = 1, 2, . . . , M, be the distance from the point p_i, i = 1, 2, . . . , M, in the cluster to the projection line, and let the mean and standard deviation of the distances be d̄ and σ, respectively. If the distance d_k from a point p_k to the line satisfies the condition

|d_k − d̄| > σ,

then p_k is an outlier and will be deleted from the point cloud, where d̄ and σ are calculated by

d̄ = (1/M) Σ_{i=1}^M d_i,  σ = ( (1/M) Σ_{i=1}^M (d_i − d̄)^2 )^{1/2}.

Next, we continue to use the method in Section 3.1 to recalculate the main line and main control point of the point cloud, as shown in Figure 4: (a) presents the noisy data (blue dots), its projection points (black asterisks), and the projection line; (b) shows the projection line and projection points after removing the outliers by the σ principle. The points not connected by solid red lines represent the outliers.
Compared with (a), we find that the projection points in (b) are more concentrated, and the points far away from the projection line are removed. Figure 4 shows the point cloud and its projection line after removing outliers. In (b), some noisy points are not connected with their projection points because they are judged as outliers (13 in total). It can also be seen from the figure that these outliers obviously deviate from the projection line. Therefore, these points are removed when calculating the control points.
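The σ principle amounts to thresholding the point-to-line distances by one standard deviation around their mean. A minimal sketch follows; the cross-product distance formula and the function interface are our own reading of the text:

```python
import numpy as np

def sigma_principle(points, line_point, direction):
    """Remove outliers: keep points whose distance to the projection line
    lies within one standard deviation of the mean distance."""
    direction = direction / np.linalg.norm(direction)
    diff = points - line_point
    # perpendicular distance from each point to the line (2D cross product)
    d = np.abs(diff[:, 0] * direction[1] - diff[:, 1] * direction[0])
    keep = np.abs(d - d.mean()) <= d.std()
    return points[keep], keep
```

Points far from the line inflate both the mean and the deviation, but a single gross outlier still lands outside the one-σ band and is dropped.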

The reason for not adopting the 3σ-principle commonly used in statistics is that, when PCPCA is first applied, if there is an outlier p_k in the point cloud, the projection line has already shifted toward p_k, and the distance from the outlier to the line is reduced. If the 3σ-principle were adopted, it would be difficult to delete the outlier.

The Subdivision of the Main Control Points
In Section 3 of this paper, we introduced the generation method of the main control points of the point cloud. Next, we need to use these points as interpolation nodes to fit the curve. In the previous literature, fitting curves are mostly generated by the polynomial or B-spline method. Polynomial interpolation is prone to the Runge phenomenon, while the B-spline method needs to calculate the control vertices, which adds considerable computation. In 2013, Beccari [29] proposed a four-point binary non-uniform parametric curve subdivision method, gave a non-uniform parameterized surface subdivision method on regular quadrilateral meshes, and proved the continuity of the limit surface.
Since non-uniform curve subdivision has a good interpolation or approximation effect on control polygons with arbitrary topology, the method is used in this paper to generate the fitting curve.
Let {q_i^k} be the vertices after the kth refinement, and denote by d_i^k = ||q_{i+1}^k − q_i^k||_2 the distance between q_i^k and q_{i+1}^k. Then, by means of the parameterized Lagrange interpolation basis functions, the subdivision mask and the four-point binary non-uniform curve subdivision scheme of [29] can be obtained.
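For illustration, here is the classical uniform four-point interpolatory scheme (with fixed weights 9/16 and −1/16); the non-uniform scheme of [29] replaces these fixed weights with weights computed from the chord lengths d_i^k, which we do not reproduce here. The code and its closed-curve convention are our own sketch:

```python
import numpy as np

def four_point_step(q, closed=True):
    """One binary refinement of the uniform 4-point interpolatory scheme:
    old vertices are kept, and a new vertex is inserted between q_i and q_{i+1}
    as (9/16)(q_i + q_{i+1}) - (1/16)(q_{i-1} + q_{i+2})."""
    q = np.asarray(q, dtype=float)
    if not closed:
        raise NotImplementedError("open-curve end rules are omitted in this sketch")
    qm1, qp1, qp2 = np.roll(q, 1, 0), np.roll(q, -1, 0), np.roll(q, -2, 0)
    new = 9 / 16 * (q + qp1) - 1 / 16 * (qm1 + qp2)
    out = np.empty((2 * len(q), q.shape[1]))
    out[0::2], out[1::2] = q, new  # interleave old and inserted vertices
    return out

def refine(q, steps=4):
    """Apply several refinement steps to approach the limit curve."""
    for _ in range(steps):
        q = four_point_step(q)
    return q
```

Because the scheme is interpolatory, every refinement keeps the original control points on the curve.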

Curve Fitting of the Planar Point Clouds
The function y = sin x is taken as the benchmark for our method and MLS. The point cloud distribution after adding Gaussian noise is shown as the black points on the left in Figure 5a. In Figure 5a, there are 6284 initial data points. A large number of data points can be removed by our method: the number of retained data points is 4532, shown by the cyan dots; the removal rate is 28%, and the black dots are the removed points. From Figure 5b, we find that the limit curve generated by our method is closer to the real curve near the extreme points than that of the MLS method, as shown in the red and blue boxes in the figure. In addition, in terms of running time, due to the large number of data points, our method takes 4.0098 s, while the MLS method takes 21.6411 s. Figure 5c shows a closed-curve point cloud generated by the function x^2 + y^2 = 1. There are 6284 data points. The removal rate of the point cloud is 12%, and the cyan points are the retained points. The MLS method cannot generate a fitting curve for a closed curve. Figure 5d shows the fitting curve generated by our method, which basically coincides with the original curve. Figure 6a,c shows the curve fitting of X-type and Y-type point clouds. For the fitting of these two types of point clouds, we first cluster the point cloud into several clusters, and then find the main control points and main directions through PCPCA. Combined with the main directions, we divide the main control points into two groups and sort each group of points. Finally, the limit curves are obtained by applying the subdivision method to the two sets of main control points, as shown in Figure 6b,d; the two curves (red and blue) in the figure correspond to the curves fitted from the two main control point sets, respectively.

Curve Reconstruction of the Time Series
A time series (or dynamic series) is a series formed by arranging the values of the same statistical index in the time order of their occurrence. The main purpose of time series analysis is to predict the future from existing historical data. Most economic data are given in the form of time series. Depending on the observation time, the time in a time series can be a year, quarter, month or any other form of time, such as the stock price movements shown in Figure 1a. If regression analysis, MLS and other methods are used for data fitting, the fitting effect shown in Figure 1b is not good; the reason is that polynomials are not very sensitive to abrupt changes in the data. For this kind of data, we use the method proposed in this paper to obtain a C^1 continuous limit curve for the data points, and the trend of the curve is highly similar to the trend of the original data. Figure 1c shows the reconstruction results.

Image Refinement
Refinement refers to the process of reducing the lines of an image from multi-pixel width to unit pixel width, which is also called skeleton extraction. The refined images in this paper are from the MNIST digital image set, as shown in Figure 7a. The pixel values of the points in these images are distributed in [0, 30] or [240, 255]. The refinement method of this paper is divided into the following steps. Firstly, the pixels with values between 240 and 255 in the digital image are regarded as a point cloud: the coordinates of each such pixel are recorded as (x, y), giving the point cloud dataset P = {p_i(x_i, y_i), i = 1, 2, . . . , n}. Next, for the dataset P, the method of this paper is used to obtain the fitting curve through clustering, PCPCA and subdivision, and the curve is used as the thinning result of the image, as shown in Figure 7b.
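The first step, collecting the bright pixels as a planar point cloud, can be sketched as follows (the threshold 240 is taken from the text; the row/column-to-(x, y) convention is our assumption):

```python
import numpy as np

def image_to_point_cloud(img, threshold=240):
    """Collect the (x, y) coordinates of all pixels with value >= threshold,
    treating the column index as x and the row index as y."""
    rows, cols = np.nonzero(img >= threshold)
    return np.column_stack([cols, rows]).astype(float)
```

The resulting array can be passed directly to the clustering and PCPCA stages described above.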

Image Edge Smoothing
Image enlargement is one of the important digital image processing technologies. When an image is enlarged, sawtooth distortion appears along its edges. In order to suppress this distortion, it is necessary to smooth the edges of the image. Commonly used image edge processing methods include bilinear interpolation, spline interpolation, and so on. We use the method proposed in this paper to process the edges of images from the MNIST digital image set, as shown in the left column of Figure 8. Firstly, the Canny algorithm is used to extract the image edge, as shown in the middle column of Figure 8, and the edge pixels are converted to double-precision values: white pixels take the value 1 and black pixels the value 0. The set of pixels with value 1 is used as the point cloud. Next, the fitted point set S is generated using clustering, PCPCA and subdivision. The coordinates of the points in S are rounded to obtain new pixel coordinates with value 1, and the value of all other pixels is set to 0. The edge of the generated image is shown in the right column of Figure 8. From the experimental results, our method can effectively smooth the image edge.
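The final rounding step, writing the fitted point set S back into a binary edge image, can be sketched as follows (the bounds handling and the explicit image-size argument are our own choices):

```python
import numpy as np

def points_to_edge_image(points, shape):
    """Round fitted (x, y) points to pixel coordinates; those pixels get
    value 1, all other pixels get value 0."""
    img = np.zeros(shape, dtype=float)
    ij = np.round(points).astype(int)
    # keep only points that fall inside the image
    ok = (0 <= ij[:, 0]) & (ij[:, 0] < shape[1]) & \
         (0 <= ij[:, 1]) & (ij[:, 1] < shape[0])
    img[ij[ok, 1], ij[ok, 0]] = 1.0  # (x, y) -> (row = y, col = x)
    return img
```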


Figure 8. Image edge smoothing. The first column shows two images from the MNIST digital image set. The second column shows the edges of the two images obtained by the Canny algorithm. The third column shows the smoothed image edges.

Conclusions and Discussion
In order to further improve the performance of point cloud curve reconstruction algorithms while ensuring fitting efficiency, a point cloud curve reconstruction method based on K-means++ clustering and PCA is proposed. By clustering, the point cloud is divided into several clusters; the outliers of each cluster are then removed by PCPCA, and the projection line and main control points of the points in each cluster are found. Finally, the main control points are interpolated by the curve subdivision method to obtain the fitting curve of the point cloud. Compared with the classical curve reconstruction methods, as shown in Table 2, our method adapts well to the fitting of open, closed and self-intersecting curves. At the same time, in the curve reconstruction of high-density point clouds, our method takes 20% of the time of classical methods (such as MLS). Finally, we extend this method to image refinement and image edge smoothing, and the experimental results are also satisfactory. Table 2. Comparison of curve reconstruction methods in different types of point clouds.

Method          Open Curve   Closed Curve   Self-Intersection Curve
Levin's [7]     √            ×              ×
Furferi's [15]  √            ×              ×
Chen's [14]     ×            √              ×
Gu's [26]       √            ×              ×
Ours            √            √              √

However, the method in this paper can be further improved in the following aspects. The clustering of the point cloud uses randomly chosen starting points, so the clustering results are not unique, and the K-means++ clustering method is not sensitive to the density of the point cloud. On the other hand, extending this method to surface reconstruction is work we are currently studying. Finally, when extending our method to image processing, the image type is digital; how to extend this method to other image types is also a problem we need to consider later.