A Robust Nonrigid Point Set Registration Method Based on Collaborative Correspondences

The nonrigid point set registration is one of the bottlenecks and has the wide applications in computer vision, pattern recognition, image fusion, video processing, and so on. In a nonrigid point set registration problem, finding the point-to-point correspondences is challengeable because of the various image degradations. In this paper, a robust method is proposed to accurately determine the correspondences by fusing the two complementary structural features, including the spatial location of a point and the local structure around it. The former is used to define the absolute distance (AD), and the latter is exploited to define the relative distance (RD). The AD-correspondences and the RD-correspondences can be established based on AD and RD, respectively. The neighboring corresponding consistency is employed to assign the confidence for each RD-correspondence. The proposed heuristic method combines the AD-correspondences and the RD-correspondences to determine the corresponding relationship between two point sets, which can significantly improve the corresponding accuracy. Subsequently, the thin plate spline (TPS) is employed as the transformation function. At each step, the closed-form solutions of the affine and nonaffine parts of TPS can be independently and robustly solved. It facilitates to analyze and control the registration process. Experimental results demonstrate that our method can achieve better performance than several existing state-of-the-art methods.

The feature points that are used in the point set registration methods are extracted from the corresponding images, which may include edges [7], corners [8], SIFT [9], ORB [10], and so on. These feature point sets can well preserve the crucial structural features of images. Let the model point set and the scene point set be represented by X M×D = [x T 1 , x T 2 , . . . , x T M ] T and Y N×D = [y T 1 , y T 2 , . . . , y T N ] T , respectively, where M and N are the number of points in the point sets and D is the dimension of the feature points. The aim of the point set registration methods is to find the interpolation function f (X; θ) to recover the spatial transformation from X to Y, where θ represents the parameters of the interpolation function. Once the correspondences are determined, a set of equations can be established to solve the transformation functions. However, the correspondences are unknown in practice. Point set registration can be divided into the two closely related subproblems: determine the correspondences and estimate the transformation. Nonrigid transformation is flexible and irregular so that the complicated interpolation functions with a large number of parameters have to be adopted.

Previous Work
In recent decades, a number of good methods have been developed to deal with the nonrigid point set registration problems. Here, we give a brief review.
The method of iterative closest points (ICP) [14,15] may be the most popular iterative point registration method. This technique establishes a binary matrix to represent the corresponding relationship based on the nearest neighborhood strategy. The ICP method requires that the two point sets are close enough, otherwise, a number of false correspondences are generated to severely affect the performance. In order to improve the robustness of ICP, many famous methods are developed by relaxing the binary corresponding constraints. Chui et al. [16,17] proposed the famous thin plate spline-robust point matching (TPS-RPM) method. In this work, the authors designed a general framework to iteratively determine the fuzzy correspondences and estimate the spatial transformation based on soft assignment and deterministic annealing. Another representative work is coherent point drift (CPD) that models the point set registration as a probability density estimation problem [18,19]. This method lets one point set be taken as the centroids of Gaussian mixture model (GMM), and achieves the registration by making the GMM centroids to fit the other point set under the expectation-maximization (EM) framework. Besides, in [20], the point sets are treated as two kernel densities, and the point set registration is achieved by maximizing the kernel correlation (KC) between them. An improved version of KC work, GMMREG, can be found in [21] by taking the L2 distance to measure the similarity between two Gaussian mixtures. Later, Ma et al. [22] and Hasanbelliu et al. [23] refined the measure of the similarity between Gaussian mixtures by L2E (namely, RPM-L2E) and Cauchy-Schwarz divergence. By refining the models to capture the spatial distributions of point sets, Tao et al. [24], Wang et al. [25], and Zhou et al. [26] accomplished the point set registration using nonuniform Gaussian mixture models, asymmetric Gaussian mixture models, and Student's t mixture models, respectively. The above methods [14][15][16][17][18][19][20][21][22][23][24][25][26] mainly utilize the spatial locations of the feature points. However, they neglect the local structures around the feature points that are very important to help determine the corresponding relationship between different point sets.
Through introducing the local geometric characteristic, many good methods were developed [27][28][29][30][31][32][33][34][35][36][37]. Zheng et al. [27] proposed a robust nonrigid matching method by preserving local neighborhood structures (RPM-LNS). The local structures were interpreted as a simple graph, and were preserved by maximizing the number of matched edges between two corresponding graphs. Yang et al. [28] proposed GLMDTPS by designing a global and local mixture distance to determine the corresponding relationship. Ma et al. [29] developed a robust point method by preserving the global and local structures (PRGLS). In [30], the authors used k-connected neighbors to construct connectivity matrix, and cast the local structures preservation to minimize the weighted least square error. In [31], the authors proposed the mixture structure descriptor to define the pointwise distance, and designed two energy functions to simultaneously preserve the global and local structures, respectively. In [32,33], Ma proposed a novel method named "MR-RPM" by adopting the manifold regularization to catch the underlying structure of the point sets and help to learn the transformation. In [34], the authors achieved the nonrigid point registration by using two local descriptors of the connectivity matrix and Laplacian coordinate to preserve the geometry structures. In [35,36], Song and Fan proposed a nonrigid registration method via global-local topology preservation (GLTP). The local linear embedding (LLE) was employed to preserve the local topological structures. By taking a local geometric constraint as a regularizer, and designing a semisupervised EM framework, a feature-guided Gaussian mixture model for point set registration was presented in [37]. Next, a brief analysis of these methods is given in [27][28][29][30][31][32][33][34][35][36][37]. In conclusion, the first strategy of the methods is to estimate the correspondences by fusing various structural features, and the second is to introduce spatially constraints to preserve local structural topology. The above two strategies can be utilized together. These methods are very Sensors 2020, 20, 3248 4 of 21 constructive and notable to improve utilization efficiency of the structural information. There are still some details that need to be addressed. For the first strategy, the methods in [27,[29][30][31][32][33]37] employ multiplicative model to fuse the global and local structures. It mainly concentrates on the point pairs that have both small Euclidean distance and similar local structures. It is efficient when the point sets are compact and simple. However, this model might be not able to find enough correspondences when the point sets have complicated structures, such as Chinese characters. Besides, most of them take less account of efficient design to handle mismatches. The performance of these methods would be sensitive to the large deformations and outliers. For instance, the mixture distance in [28] can be used to search accurate correspondences without outliers. Nonetheless, the outliers can reduce the discriminative ability of the mixture distance, which leads to generate mismatches to degrade performance. For the second strategy, [27,[30][31][32][33][34][35][36][37] adopt different spatially constraints, such as the neighboring connectivity matrix and classical manifold regularization techniques, to maintain stability of local structures in the registration process, which can be treated as a regularizer. This assumption is very reasonable. But, once the accurate corresponding relationship is not established, this strategy does not perform well. Besides, in our paper, we employ TPS as the spatial transformation. Its parameters can be explicitly decomposed into the affine and nonaffine parts. Therefore, we can respectively add regularization terms to the affine and nonaffine transformation. Experimental results show that it can efficiently prevent arbitrary spatial transformation. Thus, we focus on establishing accurate corresponding relationship in this paper.
In order to handle large deformations, Du et al. [38] developed a novel method based on heuristic tree. This method first built the heuristic tree using the shape similarities that are derived by affine ICP. Then TPS and CPD can possessively accomplish the nonrigid point set registration along the tree. This method requires a set of point sets to build the tree. In [39], the local structure preservation theories were exploited to remove mismatches for improving the corresponding accuracy, which is named "LPM." This method requires preregistration to determine putative one-to-one correspondences. Graph techniques were also exploited for point set registration problems [40][41][42][43]. In [40], the graph centralities that are combined with the spatial information of point sets were regarded as priors to be embed with CPD. In [41,42], the authors developed a graph-based point registration method (namely, FGM). They factorized the large pairwise affinity matrix into smaller matrices that encode the local structure of each graph and the pairwise affinity between edges. In [43], a new third-order graph matching technique was developed to determine the correspondences. Graph-based methods provide a novel and creative way to handle with the nonrigid point set registration problems. However, graph-based methods should further improve their performance when there exist data degradations of noise and outliers.
There are also many other excellent methods for nonrigid point registration and we just cite a few here. More methods can be found in good reviews like [44,45].

Methods
In this section, we first defined the AD and AD-correspondences and the RD and RD-correspondences. Then the AD-correspondences and RD-correspondences are combined by the proposed heuristic method to determine the corresponding relationship between two point sets. Once the correspondences are obtained, the transformation estimation can be modeled as a least square problem. Fortunately, we can independently get the closed-form solutions of the affine and nonaffine parts of TPS, when the correspondences are given. Subsequently, we introduced the deterministic annealing scheme and analyzed the convergence properties. Finally, we gave the computational cost of the proposed method.

Absolute Distance and Correspondence
AD is defined as the Euclidean distance between the spatial locations of the points. Given a point f (x m ) from the model point set f (X) and a point y n from the scene point set Y as the reference points, the AD between f (x m ) and y n is denoted as Based on AD, several available forms of correspondences can be established. The ICP approach is broadly used in practice. It employs AD and the nearest strategy to define pointwise correspondences.
The correspondence between f (x m ) and y n can be expressed as where d AD , y N )} and κ is a threshold to reject the correspondences with large AD. However, ICP only utilizes the structural information between the reference point and its nearest point in the other point set. It is too rough to detect the complicated relations between point sets. If the initial positions of the point sets are not overlapping enough, a number of false correspondences are generated in the ICP approach. In addition, because the nonrigid deformations vary greatly in different parts of the same object, it is difficult to set an appropriate threshold to reject the putative correspondences.
In our paper, we use Gaussian kernel to define the AD-correspondence between f (x m ) and y n , which is written as where T is the variance or temperature of the Gaussian kernel. According to Equation (3), the corresponding relationship between f (x m ) and scene point set Y can be represented by At the beginning of the registration process, because the AD between the model point X and the scene point Y is usually large, it is difficult to establish the accurate pointwise correspondences. A robust way is to utilize a Gaussian kernel with high temperature T to preserve the sufficient correspondences as the candidates. Notably, although the point pairs with small AD initially possess a closer relationship than the point pairs with large AD, the point pair with the smallest AD is not necessarily the correct correspondence. Different from ICP arbitrarily employing 0-1 correspondence, our method not only retains the point pairs with the smallest AD to participate in the determination of the correspondences but also leaves the chance for all the point pairs to improve their roles gradually. When the temperature T decreases, Gaussian kernel pays more attention on the local structures, and the AD-correspondences are increasingly refined. The correspondences of point pairs with small AD play increasingly important roles. Specifically, when the temperature T is low enough, this approach corresponds to the nearest strategy.

Relative Distance and Correspondence
As the concise representations of the images, the point sets preserve abundant local structures. The local structures are stable and reliable, despite the point set going through severe degradations because of the physical and geometric constraints in the real world. Given the reference points f (x m ) and y n , we can detect their correspondences by comparing the similarity of their surrounding local structures, where the similarity is defined as RD. Herein, SC is employed as the local structural descriptors. SC adopts a group of well-designed bins in log-polar space to catch the spatial distributions of the remaining points with respect to the reference points. Suppose h X m (s) and h Y n (s) are the number of points in s − th bin with respect to the reference points f (x m ) and y n , respectively. The RD between two points is defined by the χ 2 test statistic, which is denoted as Based on RD, the preliminary RD-correspondences can be recovered as one-to-one mapping by bipartite graph matching techniques, such as the Hungarian method [46]. If f (x m ) is matched to y n , the RD-correspondence between two points is TRUE and c RD ( f (x m ), y n ) = 1. Otherwise, the RD-correspondence between two points is FALSE and c RD ( f (x m ), y n ) = 0.
However, it cannot be guaranteed that there is no mismatch among the preliminary RD-correspondences. Herein, we exploit neighboring corresponding consistency to assign confidence for each correspondence to restrain the adverse impacts of mismatches. The main motivation is that the neighborhoods of points that are correctly matched should also be matched. Suppose the corresponding point of the model point f (x m ) is y n . First, we find K closest points of f (x m ) in model point sets, and K closest points of y n in scene point sets, when the RD-correspondences are TRUE. The closest points of f (x m ) and y n are, respectively, denoted as f x π(m,k) and y v(n,k) , where k = 1, 2, . . . , K, π(m, k) and v(n, k) represent the serial number of the k − th closest point to f (x m ) and y n , respectively. Second, based on the one-to-one correspondences determined by the SC, we can obtain the corresponding points of f x π(m,k) in the scene point sets, which are denoted as y v(n,k) , where k = 1, 2, . . . , K and y v(n,k) is the corresponding point of f x π(m,k) . Third, we use consistent distance d C to evaluate the consistency between y v(n,k) and y v(n,k) Subsequently, the confidence between f (x m ) and y n is defined as follows The final RD-correspondences are determined as follows As illustrated by Equations (6)-(8), if y v(n,k) and y v(n,k) are highly consistent, the consistent distance is small, and the corresponding confidence between f x π(m,k) and y n is high. Otherwise, the confidence is low. Specially, if y v(n,k) and y v(n,k) are identical, the confidence η mn equals 1. Besides, the correspondences are also involved with the temperature T. We can maintain much RD-correspondences by employing high temperature T. Through reducing T, the requirement of the corresponding consistency becomes more and more rigorous. In the end, when T is low enough, only the RD-correspondences with zero or approximately zero consistent distance are preserved.

Correspondence Collaboration
As illustrated in Sections 3.1 and 3.2, AD-correspondences and RD-correspondences are derived from different structural features. The point sets contain multiple structural features in scales and patterns. In order to efficiently utilize the structural features, the methods cannot merely rely on Sensors 2020, 20, 3248 7 of 21 single type of structural feature. Here, we fuse various structural information by the collaborative correspondences as follows where ρ is a parameter to control the effects of the RD-correspondence. In the registration process, AD-correspondences and RD-correspondences are complementary and collaborative to handle various cases. Given reference points f (x m ) and y n , in one case when the AD is small, the AD-correspondence is strong and plays a fundamental role in the collaborative correspondence. Moreover, if the RD-correspondence between f (x m ) and y n is TRUE with high confidence, the collaborative correspondence is further strengthened by the RD-correspondence. Thus, this pair of points is assigned a strong corresponding relationship and plays a dominant role in the following registration process. If the RD-correspondence is FALSE or TRUE with low confidence, the collaborative correspondence can still maintain strong relation because of the AD-correspondence. In another case, the nonrigid deformations frequently make some local areas move away from their original positions, which leads to the AD being large and the AD-correspondence being weak. However, by comparing the RD of the local structures, RD-correspondences of the points in these local areas are likely to be established. The collaborative correspondences of these points are mainly determined by RD-correspondences and play an auxiliary role in the following registration process. Notably, these corresponding local areas are matched more accurately step by step, which can lead to smaller ADs and stronger AD-correspondences. Thus, an increasing number of point pairs that simultaneously have strong collaborative correspondences can help us improve the registration accuracy.

Transformation Estimation
Based on the collaborative correspondences, the corresponding relationship and confidence between f (x m ) and y n can be defined as where ς is a parameter to suppress outliers, P = {p mn } M×N is the corresponding matrix, and let N n=1 p mn = 1 by means of normalization. After the corresponding relationship is determined, the transformation estimation can be treated as a least square problem. We adopt the TPS function as the transformation function [13]. The greatest merit of TPS is that its parameters can be explicitly decomposed into the affine and nonaffine parts. Let x m = [1, x m ] and y n = [1, y n ] be the homogeneous coordinates of x m and y n , respectively. The new location of point x m after TPS transformation is represented as follows where d ∈ R (D+1)×D and w ∈ R N×D represent the affine and nonaffine parts, respectively, and φ m T and take QR decomposition of X, the affine and nonaffine warping spaces are separated, which is written as where Q 1 ∈ R M×(D+1) and Q 2 ∈ R M×(M−D−1) are orthogonal. Because the boundary constraint of TPS is given by X T w = 0, we must have Sensors 2020, 20, 3248 With the above analysis and without changing transformation, Equation (11) can be rewritten into where q 2,m is the row m of Q 2 . In order to smooth the nonaffine parts of TPS, a regularization term λ 1 w T Φw is added to the cost function. Here, λ 1 is a free parameter to control the effects of the regularization term. Based on the above analysis, the cost function can be written as where y m = N n=1 p mn y n can be treated as the newly estimated positions, (d − I) T (d − I) is a term to regularize the affine transformation, in which λ 2 is a free parameter to control it. Notably, λ 2 is much smaller than λ 1 in order to give affine transformation more freedom. If considering Equation (13), the compact form of Equation (15) can be represented as follows where According to the properties of QR decomposition, we have Insert the above expressions into Equations (17) and (18), then the final solution can be expressed as By iteratively determining the correspondences and estimating the spatial transformation, the nonrigid point set registration can be accomplished. It is worth noting that the affine part d and the nonaffine part w are independently calculated, which can avoid mutual inference between d and w, and thus, convenient for controlling and analyzing the registration process.

Annealing Scheme and Convergence Analysis
In the optimization, we adopt the deterministic annealing scheme to control the iterative process. As a heuristic and efficient strategy to escape the poor local minimum points, the deterministic annealing has been broadly employed by many nonrigid point registration methods [16,17,25,29]. The deterministic annealing plays an important role in our method, which is carefully discussed as follows: (1) Temperature T controls the search range of AD-correspondences, as illustrated by Equations (3) and (4). Temperature T is gradually reduced by a linear annealing schedule T new = τT old , where τ is the positive annealing rate less than one. At the start, it is hard to determine the accurate one-to-one correspondences so that high temperature T is employed to preserve the corresponding relations with a wide-range. The search range is reduced through gradually reducing temperature T. When T is close to zero, it equivalently uses the hard decisions to determine one-to-one correspondences.
(2) Temperature T is involved in the process of confidence assignment for each RD-correspondence according to consistent distance, as illustrated by Equations (6)- (8). Under the deterministic annealing strategy, we can gradually exclude RD-correspondences with low consistency, as discussed in Section 3.2. We can also control the roles of RD-correspondences to adjust decay parameter ρ using the deterministic annealing scheme.
(3) Regularization parameters λ 1 and λ 2 are also evolved by following deterministic annealing schedule, λ 1 = λ init 1 · T and λ 2 = λ init 2 · T, where λ init 1 and λ init 2 are the two initial values. Through reducing regularization parameters from large to small, the spatial transformation based on TPS is recovered from rigid to nonrigid.
Overall, we adopt a coarse-to-fine and rigid-to-nonrigid strategies to optimize the proposed method based on deterministic annealing techniques. As a dual update process, in each iteration, we can obtain the closed-form solutions of TPS transformation after the correspondences are determined. In next iteration, the newly estimated transformation can be used to help determine the correspondences. Besides, as the temperature decreases, the search range of AD-correspondences is reduced and RD-correspondences with high consistency are emphasized. Thus, we can determine the correspondences more accurately. Next, the newly updated correspondences are utilized to recover the spatial transformation. Given the lower temperature, the spatial transformation can have more freedom on nonrigid terms to recover local nonrigid transformation more accurately. In the end, the correspondences are approximately one-to-one, and the nonrigid transformation is recovered. In summary, the correspondence determination and the spatial transformation estimation run iteratively to gradually converge to a stable local minimum under the deterministic annealing scheme.

Computational Complexity
In each iteration, the AD-correspondences between two point sets take complexity O(MN). The complexity of preliminary RD-correspondences using Hungarian method is O(M 3 ). The corresponding confidence assignment needs to find K (K << M) closest points for each point pairs for which the preliminary RD-correspondence is TRUE. By using sequential search, it costs O(KM) complexity for one point and spends O(2KM 2 ) for the two point sets. It totally needs O(M 3 + 2KM 2 ) operations to compute the RD-correspondences. Solving the nonrigid transformation needs complexity O(M 3 ). Overall, the complexity of the proposed method takes O(M 3 ).

Results
Our method is implemented in MATLAB, and the experimental environment is an Intel Core i7-7700 CPU and 32GB RAM. We choose the methods which have publicly available codes that are provided by the authors, and keep their default parameter settings, including TPS-RPM [17], CPD [19], GLMDTPS [28], MR-RPM [33], and GLTP [36]. The root-mean-square error (RMSE) is used as the registration error, which is denoted as follows where y m is the ground truth corresponding point of x m . Parameter settings: The T init , T f inal , and annealing rate τ are used to control the deterministic annealing process. As the initial temperature, T init is high. The stopping temperature T f inal is low. However, they are not too high or too low, otherwise, they would spend much more computation to accomplish the registration. We experimentally recommend to set T init ∈ [0.1, 0.5], T f inal ∈ 10 −3 , 10 −5 , and τ ∈ [0.90, 0.98]. In our paper, we fix T init = 0.2, T f inal = T init /1500, and τ = 0.93. We keep these settings in the following experiments. In each temperature, the method runs three iterations. In order to save computation, we only calculate RD-correspondences once in each temperature.
Parameters that are related to RD-correspondences include parameters of SC, and parameters ρ and K. We preserve the default settings of SC in [11]. Parameter ρ controls the roles of RD-correspondences in collaborative correspondences. The proposed method can perform well for ρ = 0 in the outlier-free scenarios. When there are outliers in data, we set ρ = N/(4M). Parameter K is the number of neighboring points that are employed to evaluate the corresponding consistency of RD-correspondences. Large K can underline the RD-correspondences with the high corresponding consistency, in order to obtain more accurate RD-correspondences. However, if K is too large, it would be too strict to preserve sufficient RD-correspondences. Simultaneously, computational cost will increase. In our paper, by taking a balance of these factors, the number of neighboring points is set as K = 3 and K = 7 when M = N and M N, respectively.
Parameter ς is used to handle outliers. Regularization parameters λ 1 and λ 2 are employed to trade off the flexibility and smoothness of the spatial transformation. As discussed in [17], the λ init 2 is set as much smaller than λ init 1 for providing more freedom for the affine transformation. In our paper, we fix ς = 0.5 and λ init 2 = 0.01λ init 1 . We set initial value λ init 1 as 0.5 and λ init 2 = 0.01λ init 1 = 0.005.

Results on Fish Dataset
We test our algorithm on fish data in [17,27]. Each fish model point set contains 98 points. There are five kinds of degradations, including nonrigid deformation, noise, outlier, rotation, and occlusion. For each degradation, there are five levels. For each level, there are 100 samples. We give the qualitative and quantitative results. In quantitative results, the registration results of all the 100 samples for each level are used to evaluate the performance of the nonrigid point set registration algorithms. Notably, as demonstrated in [28], there are two cases of occlusion: missing points in one side and both sides.
Our method can achieve good performance in first case. In the second case, our method enforces each model point to search its corresponding point in the scene point set, while the corresponding point might not exist. Thus, our method cannot handle the second case well. In the experiments of occlusion, we present the performance of the methods with missing points in single side. Figure 1 shows qualitative results on some examples of fish dataset. The model points are represented by "+," and the scene points are denoted by "o." The results are arranged by every two rows, the upper row shows the data and the lower gives the registration results. From top to bottom, there are different degradations including deformation, noise, occlusion, outliers, and rotation. From left to right, the degradation level gradually increases. The proposed method is accurate and robust in the degradations of deformation, low noise, occlusion (single side), outliers, and rotation. When the noise level is high, the structures are severely damaged. The proposed method can roughly recover the spatial transformation. Figure 2 reports the quantitative results of seven state-of-the-art methods and ours. In the test of deformation, as illustrated in Figure 2a, when the deformation is not large, all the methods can obtain good results in the registration accuracy. Notably, our method can still keep perfect performance as the deformation becomes large. Figure 2b shows the results of the registration methods under the degradation of noise. Our method can get the best results in low noise level. When the noise level is high, the structures of the point sets are not well preserved so that the performance declines. Figure 2c shows the registration results under the degradation of occlusion. The ratios of missing points are from 0.08 to 0.40 with an interval of 0.08. We can see that the proposed method can keep better performance than other methods when some points are missing. Figure 2d presents the results of the registration methods under the degradation of outliers. CPD and GLMDTPS can perform well when the number of outliers is small, but cannot handle large number of outliers well. We observe that our method shows outstanding performance when dealing with outliers. Figure 2e shows the registration results in rotation. We rotate the data from 30 • to 180 • with an interval of 30 • . As illustrated in [27], by using the direction from a point to the mass center of a point set as the positive X-axis for the local coordinate system, it can get rotation invariant shape context (RISC). We adopt RISC to correct the rotation between two point sets at the start. This can make our method be invariant to the rotation.
Sensors 2020, 20, x FOR PEER REVIEW 11 of 21 qualitative and quantitative results. In quantitative results, the registration results of all the 100 samples for each level are used to evaluate the performance of the nonrigid point set registration algorithms. Notably, as demonstrated in [28], there are two cases of occlusion: missing points in one side and both sides. Our method can achieve good performance in first case. In the second case, our method enforces each model point to search its corresponding point in the scene point set, while the corresponding point might not exist. Thus, our method cannot handle the second case well. In the experiments of occlusion, we present the performance of the methods with missing points in single side. Figure 1 shows qualitative results on some examples of fish dataset. The model points are represented by "+," and the scene points are denoted by "o." The results are arranged by every two rows, the upper row shows the data and the lower gives the registration results. From top to bottom, there are different degradations including deformation, noise, occlusion, outliers, and rotation. From left to right, the degradation level gradually increases. The proposed method is accurate and robust in the degradations of deformation, low noise, occlusion (single side), outliers, and rotation. When the noise level is high, the structures are severely damaged. The proposed method can roughly recover the spatial transformation. Figure 2 reports the quantitative results of seven state-of-the-art methods and ours. In the test of deformation, as illustrated in Figure 2a, when the deformation is not large, all the methods can obtain good results in the registration accuracy. Notably, our method can still keep perfect performance as the deformation becomes large. Figure 2b shows the results of the registration methods under the degradation of noise. Our method can get the best results in low noise level. When the noise level is high, the structures of the point sets are not well preserved so that the performance declines. Figure  2c shows the registration results under the degradation of occlusion. The ratios of missing points are from 0.08 to 0.40 with an interval of 0.08. We can see that the proposed method can keep better performance than other methods when some points are missing. Figure 2d presents the results of the registration methods under the degradation of outliers. CPD and GLMDTPS can perform well when the number of outliers is small, but cannot handle large number of outliers well. We observe that our method shows outstanding performance when dealing with outliers. Figure 2e shows the registration results in rotation. We rotate the data from 30 to 180 with an interval of 30 . As illustrated in [27], by using the direction from a point to the mass center of a point set as the positive X-axis for the local coordinate system, it can get rotation invariant shape context (RISC). We adopt RISC to correct the rotation between two point sets at the start. This can make our method be invariant to the rotation.

Results on Chinese Character Dataset
Chinese characters are pictographic and have abundant and meaningful structures. The Chinese character dataset can be obtained from https://github.com/xdregis/complicated_chinese_characters. To evaluate the performance of the registration methods, we choose five Chinese characters with complicated structures, namely, "cake," "dim," "math," "micro," and "tree," with 148, 153, 177, 189, and 149 points, respectively. The Chinese characters are handwritten, and the feature points are manually annotated as the model point set. The nonrigid deformation versions of the model point set are made by the techniques in [28] as scene point sets. There are five levels for nonrigid deformation. For noise degradation, we add Gaussian white noise to the data with five levels. The mean of the noise is zero, and the standard deviation is changed from 0.06 to 0.18 with an interval of 0.03. For occlusion degradation, the ratios of missing points are from 0.08 to 0.40 with an interval of 0.08. For outlier degradation, we add random distributed outliers to data. The ratios of outliers to data are from 0.2 to 1.0 with an interval of 0.2. For rotation degradation, we rotate the data from 30 to 180 with an interval of 30 . For each level, each character has 100 samples. Therefore, five characters have 500 examples. Figure 3 shows the qualitative results of the proposed method on some examples of the five Chinese characters on the degradations. We can see that the structures of Chinese characters are complicated because they are formed by several relatively independent parts. The experimental results illustrate that the proposed method can accurately align model point sets to scene point sets. Figure 4 shows the quantitative results on the Chinese characters of the seven state-of-the-art methods and ours. As shown in Figure 4a, our method is most accurate in deformation. In Figure 4b, our method shows the best performance from levels 1-4 in noise. As illustrated by Figure 4c, in

Results on Chinese Character Dataset
Chinese characters are pictographic and have abundant and meaningful structures. The Chinese character dataset can be obtained from https://github.com/xdregis/complicated_chinese_characters. To evaluate the performance of the registration methods, we choose five Chinese characters with complicated structures, namely, "cake," "dim," "math," "micro," and "tree," with 148, 153, 177, 189, and 149 points, respectively. The Chinese characters are handwritten, and the feature points are manually annotated as the model point set. The nonrigid deformation versions of the model point set are made by the techniques in [28] as scene point sets. There are five levels for nonrigid deformation. For noise degradation, we add Gaussian white noise to the data with five levels. The mean of the noise is zero, and the standard deviation is changed from 0.06 to 0.18 with an interval of 0.03. For occlusion degradation, the ratios of missing points are from 0.08 to 0.40 with an interval of 0.08. For outlier degradation, we add random distributed outliers to data. The ratios of outliers to data are from 0.2 to 1.0 with an interval of 0.2. For rotation degradation, we rotate the data from 30 • to 180 • with an interval of 30 • . For each level, each character has 100 samples. Therefore, five characters have 500 examples. Figure 3 shows the qualitative results of the proposed method on some examples of the five Chinese characters on the degradations. We can see that the structures of Chinese characters are complicated because they are formed by several relatively independent parts. The experimental results illustrate that the proposed method can accurately align model point sets to scene point sets. Figure 4 shows the quantitative results on the Chinese characters of the seven state-of-the-art methods and ours. As shown in Figure 4a, our method is most accurate in deformation. In Figure 4b, our method shows the best performance from levels 1-4 in noise. As illustrated by Figure 4c, in occlusion (single side), GLMDTPS, MR-RPM, and our methods are accurate and robust in five levels. In Figure 4d, our method achieves the outstanding performance in the degradation of outliers. Figure 4e shows the registration results in rotation. Our method is invariant to the rotation.
Sensors 2020, 20, x FOR PEER REVIEW 14 of 21 occlusion (single side), GLMDTPS, MR-RPM, and our methods are accurate and robust in five levels.
In Figure 4d, our method achieves the outstanding performance in the degradation of outliers. Figure  4e shows the registration results in rotation. Our method is invariant to the rotation. Figure 3. Registration results on Chinese character dataset. From top to bottom: "cake," "dim," "math," "micro," and "tree" in every two rows; the upper row shows the data and the lower row Registration results on Chinese character dataset. From top to bottom: "cake," "dim," "math," "micro," and "tree" in every two rows; the upper row shows the data and the lower row shows the registration results. From left to right: deformation, noise, occlusion, outliers, and rotation. The model points and the scene points are marked by "+" and "o," respectively.
shows the registration results. From left to right: deformation, noise, occlusion, outliers, and rotation. The model points and the scene points are marked by "+" and "o," respectively.    Figure 5. The first row is the face of view 1. This face is employed as the model face. The second row is the faces from view 2 to view 6. These faces are employed as the scene faces. We can see that the facial expressions and angles are significantly changed from view 1 to view 3. The third row shows the point sets to be matched. The fourth row shows the registration results. The proposed method can accurately align the point sets even though the facial expressions and angles are very different. The quantitative results of five groups are provided in Figure 6. We can see that the proposed method achieves the best performance.
Sensors 2020, 20, x FOR PEER REVIEW 16 of 21 matched. The qualitative results on one of the persons are shown in Figure 5. The first row is the face of view 1. This face is employed as the model face. The second row is the faces from view 2 to view 6. These faces are employed as the scene faces. We can see that the facial expressions and angles are significantly changed from view 1 to view 3. The third row shows the point sets to be matched. The fourth row shows the registration results. The proposed method can accurately align the point sets even though the facial expressions and angles are very different. The quantitative results of five groups are provided in Figure 6. We can see that the proposed method achieves the best performance.  matched. The qualitative results on one of the persons are shown in Figure 5. The first row is the face of view 1. This face is employed as the model face. The second row is the faces from view 2 to view 6. These faces are employed as the scene faces. We can see that the facial expressions and angles are significantly changed from view 1 to view 3. The third row shows the point sets to be matched. The fourth row shows the registration results. The proposed method can accurately align the point sets even though the facial expressions and angles are very different. The quantitative results of five groups are provided in Figure 6. We can see that the proposed method achieves the best performance.

Results on IMM Hand Dataset
IMM hands dataset consists of 40 annotated monocular images with resolution pixels 800 × 600 from four different human hands, and each person provides 10 images with different hand gestures. In each group, each gesture is employed as the model data in turn. The other nine gestures are employed as the scene data. Thus, there are 90 different combinations for each group. The qualitative results of gestures from one person are shown in Figure 7. We take the first gesture as the model point set, and the other nine gestures are used as the scene point sets. Although the hand gestures have significant changes, the proposed method can accurately recover the spatial transformation. Figure 8 presents the quantitative results. As illustrated by Figure 8, the proposed method can get the highest accuracy from Group 1 to Group 4. In the Group 4, our method is still most accurate, but the accuracy is lower than previous groups, and the standard deviation is large. Although our method can obtain good registration results in most cases, several examples in Group 4 are hard to be well matched by ours. We give a failure case in Figure 9. As shown in the first and second subfigures in Figure 9, we can see that the adjacent fingers of the model gestures are very close, and the feature points that represent these fingers approximatively line up on the boarders. Although these points belong to different fingers, their structural features are similar so that they are difficult to be correctly separated when they are employed as the model points. Besides, the spatial transformation between two gestures is large. This further increases the difficulty to accurately match the point sets. Our method is not very well to handle these cases. In Group 4, the hand gestures are more than others, and thus, this results in performance degradation.  Figure 7. We take the first gesture as the model point set, and the other nine gestures are used as the scene point sets. Although the hand gestures have significant changes, the proposed method can accurately recover the spatial transformation. Figure 8 presents the quantitative results. As illustrated by Figure 8, the proposed method can get the highest accuracy from Group 1 to Group 4. In the Group 4, our method is still most accurate, but the accuracy is lower than previous groups, and the standard deviation is large. Although our method can obtain good registration results in most cases, several examples in Group 4 are hard to be well matched by ours. We give a failure case in Figure 9. As shown in the first and second subfigures in Figure 9, we can see that the adjacent fingers of the model gestures are very close, and the feature points that represent these fingers approximatively line up on the boarders. Although these points belong to different fingers, their structural features are similar so that they are difficult to be correctly separated when they are employed as the model points. Besides, the spatial transformation between two gestures is large. This further increases the difficulty to accurately match the point sets. Our method is not very well to handle these cases. In Group 4, the hand gestures are more than others, and thus, this results in performance degradation.  Finally, in Table 1, we give the comparisons of runtime of point set registration methods on fish (98 points), cake (138 points), dim (153 points), math (177 points), micro (189 points), tree (149 points), IMM face (58 points), and IMM hands (58 points). CPD, GLMDTPS, and GLTP are implemented by C and MATLAB. Although the computation complexity is in same level, they run much faster than the methods that are only implemented by MATLAB, including TPS-RPM, MR-RPM, and ours. TPS-RPM is the most relevant method to ours. Both TPS-RPM and our methods employ TPS as the transform function and adopt the deterministic annealing as the optimization strategy. TPS-RPM only needs to calculate AD-correspondences. In order to get more accurate correspondences, our method has to consume more time to calculate RD-correspondences besides AD-correspondences. In summary, our method can efficiently improve the registration accuracy, but the computation cost increases.  Finally, in Table 1, we give the comparisons of runtime of point set registration methods on fish (98 points), cake (138 points), dim (153 points), math (177 points), micro (189 points), tree (149 points), IMM face (58 points), and IMM hands (58 points). CPD, GLMDTPS, and GLTP are implemented by C and MATLAB. Although the computation complexity is in same level, they run much faster than the methods that are only implemented by MATLAB, including TPS-RPM, MR-RPM, and ours. TPS-RPM is the most relevant method to ours. Both TPS-RPM and our methods employ TPS as the transform function and adopt the deterministic annealing as the optimization strategy. TPS-RPM only needs to calculate AD-correspondences. In order to get more accurate correspondences, our method has to consume more time to calculate RD-correspondences besides AD-correspondences. In summary, our method can efficiently improve the registration accuracy, but the computation cost increases.  Finally, in Table 1, we give the comparisons of runtime of point set registration methods on fish (98 points), cake (138 points), dim (153 points), math (177 points), micro (189 points), tree (149 points), IMM face (58 points), and IMM hands (58 points). CPD, GLMDTPS, and GLTP are implemented by C and MATLAB. Although the computation complexity is in same level, they run much faster than the methods that are only implemented by MATLAB, including TPS-RPM, MR-RPM, and ours. TPS-RPM is the most relevant method to ours. Both TPS-RPM and our methods employ TPS as the transform function and adopt the deterministic annealing as the optimization strategy. TPS-RPM only needs to calculate AD-correspondences. In order to get more accurate correspondences, our method has to consume more time to calculate RD-correspondences besides AD-correspondences. In summary, our method can efficiently improve the registration accuracy, but the computation cost increases. Finally, in Table 1, we give the comparisons of runtime of point set registration methods on fish (98 points), cake (138 points), dim (153 points), math (177 points), micro (189 points), tree (149 points), IMM face (58 points), and IMM hands (58 points). CPD, GLMDTPS, and GLTP are implemented by C and MATLAB. Although the computation complexity is in same level, they run much faster than the methods that are only implemented by MATLAB, including TPS-RPM, MR-RPM, and ours. TPS-RPM is the most relevant method to ours. Both TPS-RPM and our methods employ TPS as the transform function and adopt the deterministic annealing as the optimization strategy. TPS-RPM only needs to calculate AD-correspondences. In order to get more accurate correspondences, our method has to consume more time to calculate RD-correspondences besides AD-correspondences. In summary, our method can efficiently improve the registration accuracy, but the computation cost increases.

Conclusions
In this paper, we have presented a robust and accurate method and its applications for nonrigid point set registration. The main idea of our method is to find an efficient way to combine AD-correspondences and RD-correspondences to determine the corresponding relationship between the point sets to be matched. AD-correspondences and RD-correspondences are derived from the structures with different attributes. They are complementary and are able to improve the utilization efficiency of the abundant structural information within the point sets. Moreover, TPS is adopted as the transformation function. At each iteration step, the closed-form solutions of the affine and nonaffine parts have been used, which makes it be convenient to analyze and control the registration process. Experiments illustrate that the proposed method can achieve good performance on both synthetic and real data.
Although the proposed method can achieve good performance in most scenarios of nonrigid point set registration, the proposed method cannot handle the degradations well when the miss points are on both sides. Our method has relatively heavy computation when there are a large number of points, such as the 3D point clouds that are obtained by RGB-D cameras, laser scanners, and lidar, which usually contain thousands of points. In the future, we will focus on finding efficient way to reduce the computational complexity.