SPPD: A Novel Reassembly Method for 3D Terracotta Warrior Fragments Based on Fracture Surface Information

As one of China′s most precious cultural relics, the excavation and protection of the Terracotta Warriors pose significant challenges to archaeologists. A fairly common situation in the excavation is that the Terracotta Warriors are mostly found in the form of fragments, and manual reassembly among numerous fragments is laborious and time-consuming. This work presents a fracture-surface-based reassembling method, which is composed of SiamesePointNet, principal component analysis (PCA), and deep closest point (DCP), and is named SPPD. Firstly, SiamesePointNet is proposed to determine whether a pair of point clouds of 3D Terracotta Warrior fragments can be reassembled. Then, a coarse-to-fine registration method based on PCA and DCP is proposed to register the two fragments into a reassembled one. The above two steps iterate until the termination condition is met. A series of experiments on real-world examples are conducted, and the results demonstrate that the proposed method performs better than the conventional reassembling methods. We hope this work can provide a valuable tool for the virtual restoration of three-dimension cultural heritage artifacts.


Introduction
Terracotta Warriors are the burial objects of China s first emperor. They are known for their large numbers, vivid appearances, and diverse postures, and they are of great value in history, archaeology, art, and the tourist industry. However, owing to thousands of years of weathering erosion and earth crust movement, a large number of the Terracotta Warriors have been damaged and gathered in piles of fragments [1]. To restore the original appearance, the reassembly task is essential. Nevertheless, the traditional manual reassembly is tedious and laborious for archaeologists and may induce secondary damage to the fragments [2]. Advances in computer technology have made it possible to replace manual operations with virtual-reassembly techniques. Due to the simple structure and pure information, the point cloud is utilized to represent virtual fragments. Therefore, virtual reassembly is a complex problem in the field of the point cloud, involving feature extraction, point cloud segmentation, sampling, optimization, etc. To date, some methods have been exploited to address this problem. In general, these methods mainly consist of two stages: pairwise matching of fragments and the registration of matching pairs. Firstly, we build a dataset of all the fragments to be reassembled. If the number of fragments in this dataset is more than 1, there may be a matching relationship. Secondly, we enter the matching stage where we preprocess the data, extract the fracture surfaces of fragments, and judge the similarity between fracture surfaces according to the score calculated with SiamesePointNet. Thirdly, we select the most similar fracture surfaces in the registration stage and apply them to a coarse-to-fine registration method combining PCA and DCP. Then, the dataset is updated by replacing the original matching fragments with the registration result. The above process is repeated until any one of the two termination conditions is met: either the number of fragments in the dataset is equal to 1, or the maximum similarity score is not greater than the threshold ζ, which means that there is no potential matching relationship anymore. This is a greedy strategy, taking full account of the intermediate registration products, ensuring that each round of reassembly objects is the optimal choice, and ultimately guarantees global optimization and avoids mismatch.
Our key contributions can be summarized as follows: 1. A novel reassembly method for 3D Terracotta Warrior fragments is proposed; 2. A deep network called SiamesePointNet to evaluate the similarity of fracture surfaces is proposed; 3. A coarse-to-fine registration method combining PCA and DCP is proposed; 4. A series of real-world-based experiments have been conducted to prove the validity of the proposed method.
The rest of this work is structured below. In Section II, data preprocessing and fracture surface extraction, SiamesePointNet network, and PCA-DCP based registration method are introduced. In Section III, we conduct a series of real-world-data-based experiments to demonstrate the effectiveness of our approach. Finally, we present a conclusion and outlook in Section 4. Firstly, we build a dataset of all the fragments to be reassembled. If the number of fragments in this dataset is more than 1, there may be a matching relationship. Secondly, we enter the matching stage where we preprocess the data, extract the fracture surfaces of fragments, and judge the similarity between fracture surfaces according to the score calculated with SiamesePointNet. Thirdly, we select the most similar fracture surfaces in the registration stage and apply them to a coarse-to-fine registration method combining PCA and DCP. Then, the dataset is updated by replacing the original matching fragments with the registration result. The above process is repeated until any one of the two termination conditions is met: either the number of fragments in the dataset is equal to 1, or the maximum similarity score is not greater than the threshold ζ, which means that there is no potential matching relationship anymore. This is a greedy strategy, taking full account of the intermediate registration products, ensuring that each round of reassembly objects is the optimal choice, and ultimately guarantees global optimization and avoids mismatch.
Our key contributions can be summarized as follows:

1.
A novel reassembly method for 3D Terracotta Warrior fragments is proposed; 2.
A deep network called SiamesePointNet to evaluate the similarity of fracture surfaces is proposed; 3.
A coarse-to-fine registration method combining PCA and DCP is proposed; 4.
A series of real-world-based experiments have been conducted to prove the validity of the proposed method.
The rest of this work is structured below. In Section 2, data preprocessing and fracture surface extraction, SiamesePointNet network, and PCA-DCP based registration method are introduced. In Section 3, we conduct a series of real-world-data-based experiments to demonstrate the effectiveness of our approach. Finally, we present a conclusion and outlook in Section 4.

Data Preprocessing and Fracture Surfaces Extraction
As real-world objects, Terracotta Warriors should be represented in a suitable form, and fracture surfaces should be extracted for matching and registration stages. The fragment data came from the Emperor Qinshihuang s Mausoleum Site Museum and was collected by Northwestern University with an Artec Eva handy scanner. As there is a large amount of noise in the preliminary scanned data, it is necessary to denoise it with Artec Studio Professional and Geomagic Wrap 2017 first. The fracture surface is selected as the data source because of its more extensive area, richer information, and continuous shape. This work follows Li s approach [11] to accomplish region segmentation to strip fracture surfaces from the entire fragment. Additionally, random sampling is implemented to prepare normalized data as input to the following network. These steps are shown in Figure 2.

Data Preprocessing and Fracture Surfaces Extraction
As real-world objects, Terracotta Warriors should be represented in a suitable form, and fracture surfaces should be extracted for matching and registration stages. The fragment data came from the Emperor Qinshihuang′s Mausoleum Site Museum and was collected by Northwestern University with an Artec Eva handy scanner. As there is a large amount of noise in the preliminary scanned data, it is necessary to denoise it with Artec Studio Professional and Geomagic Wrap 2017 first. The fracture surface is selected as the data source because of its more extensive area, richer information, and continuous shape. This work follows Li′s approach [11] to accomplish region segmentation to strip fracture surfaces from the entire fragment. Additionally, random sampling is implemented to prepare normalized data as input to the following network. These steps are shown in Figure  2.

Network Architecture for Matching Stage
Inspired by the Siamese network [37], which is used to judge the similarity between 2D images, we propose a modified network to calculate the similarity score between 3D point clouds, named SiamesePointNet. The network architecture is depicted in Figure 3. The main intention of SiamesePointNet is to measure similarity in a quantitative way to provide a reliable basis for determining the matching relationship and selecting registration objects. Our network takes a Siamese structure as the overall framework. The feature is extracted through two parallel PointNet networks. The weights are adjusted by matching labels so that SiamesePointNet can migrate from the original classification and segmentation task of PointNet to evaluate similarity. Intuitively, this network mainly includes two parts: a feature extractor module and a scoring module. The two inputs (point cloud A and B) are in pairs. Specifically, they are denoted as , ⋯ , , ⋯ , ⊂ ℝ , , ⋯ , , ⋯ , ⊂ ℝ , respectively. The feature extractor module is obliged to obtain a

Network Architecture for Matching Stage
Inspired by the Siamese network [37], which is used to judge the similarity between 2D images, we propose a modified network to calculate the similarity score between 3D point clouds, named SiamesePointNet. The network architecture is depicted in Figure 3. The main intention of SiamesePointNet is to measure similarity in a quantitative way to provide a reliable basis for determining the matching relationship and selecting registration objects.

Data Preprocessing and Fracture Surfaces Extraction
As real-world objects, Terracotta Warriors should be represented in a suitable form, and fracture surfaces should be extracted for matching and registration stages. The fragment data came from the Emperor Qinshihuang′s Mausoleum Site Museum and was collected by Northwestern University with an Artec Eva handy scanner. As there is a large amount of noise in the preliminary scanned data, it is necessary to denoise it with Artec Studio Professional and Geomagic Wrap 2017 first. The fracture surface is selected as the data source because of its more extensive area, richer information, and continuous shape. This work follows Li′s approach [11] to accomplish region segmentation to strip fracture surfaces from the entire fragment. Additionally, random sampling is implemented to prepare normalized data as input to the following network. These steps are shown in Figure  2.

Network Architecture for Matching Stage
Inspired by the Siamese network [37], which is used to judge the similarity between 2D images, we propose a modified network to calculate the similarity score between 3D point clouds, named SiamesePointNet. The network architecture is depicted in Figure 3. The main intention of SiamesePointNet is to measure similarity in a quantitative way to provide a reliable basis for determining the matching relationship and selecting registration objects. Our network takes a Siamese structure as the overall framework. The feature is extracted through two parallel PointNet networks. The weights are adjusted by matching labels so that SiamesePointNet can migrate from the original classification and segmentation task of PointNet to evaluate similarity. Intuitively, this network mainly includes two parts: a feature extractor module and a scoring module. The two inputs (point cloud A and B) are in pairs. Specifically, they are denoted as , ⋯ , , ⋯ , ⊂ ℝ , , ⋯ , , ⋯ , ⊂ ℝ , respectively. The feature extractor module is obliged to obtain a Our network takes a Siamese structure as the overall framework. The feature is extracted through two parallel PointNet networks. The weights are adjusted by matching labels so that SiamesePointNet can migrate from the original classification and segmentation task of PointNet to evaluate similarity. Intuitively, this network mainly includes two parts: a feature extractor module and a scoring module. The two inputs (point cloud A and B) are in pairs. Specifically, they are denoted as A = {x 1 , · · · , x i , · · · , x N } ⊂ R 3 , B = y 1 , · · · , y j , · · · , y N ⊂ R 3 , respectively. The feature extractor module is obliged to obtain a correct mapping F : R N×3 → R K from a collection of scattered points to a K-dimensional normalized feature vector, which is easy to compare. On account of the feature transformation, point-level convolution, and pooling layer, PointNet [29] can map point cloud to a meaningful feature space, regardless of the pose and coordinate system, which disturbances the judgment of the similarity. Therefore, PointNet is embedded in the feature extractor module, where A is mapped to a feature vector F(A) = (a 1 , a 2 , · · · , a k ), and B is the same. In the scoring module, an energy function E(A, B) = F(A) − F(B) 2 is used to describe the distance between two feature vectors, and the final similarity score is negatively correlated with this E (A, B). Therefore, point clouds with similar shapes will be mapped to feature vectors close to each other, and finally, lead to a high score. In other words, a high score suggests that the two fragments may be sub-parts of an original fragment and break on the fracture surface, so they can be reassembled to recover the complete fragment, i.e., they match.
To constrain the network to associate the matching relationship with the similarity score correctly, the contrastive loss is used and shown as below: where f g refers to the label in the ground truth (1 for the match, 0 for not), and f p refers to the output similarity score. This score, roughly distributed between 0-1, is a manifestation of the possibility of matching. After the network training is completed, the output score guides selecting registration objects for the registration stage, the subsequent stage of the proposed reassembly method. As shown in the example in Figure 4, given two fragments, the similarity scores among all fracture surfaces are calculated; the matching pair with the highest score is selected as the objects to be registered. correct mapping F: ℝ → ℝ from a collection of scattered points to a K-dimensional normalized feature vector, which is easy to compare. On account of the feature transformation, point-level convolution, and pooling layer, PointNet [29] can map point cloud to a meaningful feature space, regardless of the pose and coordinate system, which disturbances the judgment of the similarity. Therefore, PointNet is embedded in the feature extractor module, where A is mapped to a feature vector , , ⋯ , , and B is the same. In the scoring module, an energy function , ‖ ‖ is used to describe the distance between two feature vectors, and the final similarity score is negatively correlated with this , . Therefore, point clouds with similar shapes will be mapped to feature vectors close to each other, and finally, lead to a high score. In other words, a high score suggests that the two fragments may be sub-parts of an original fragment and break on the fracture surface, so they can be reassembled to recover the complete fragment, i.e., they match.
To constrain the network to associate the matching relationship with the similarity score correctly, the contrastive loss is used and shown as below: where refers to the label in the ground truth (1 for the match, 0 for not), and refers to the output similarity score. This score, roughly distributed between 0-1, is a manifestation of the possibility of matching. After the network training is completed, the output score guides selecting registration objects for the registration stage, the subsequent stage of the proposed reassembly method.
As shown in the example in Figure 4, given two fragments, the similarity scores among all fracture surfaces are calculated; the matching pair with the highest score is selected as the objects to be registered.

PCA-DCP Based Method for Registration Stage
The problem in the registration stage can be stated as: Given , , ⋯ , ⊂ ℝ , , , ⋯ , ⊂ ℝ , our purpose is to find apposite rigid transformation parameters , where ∈ SO 3 and ∈ ℝ which transform to approach . Transform result * is obtained by applying the parameters to . If transformation parameters correct, * and will completely coincide in position in an ideal situation.
As shown in Figure 5, our registration method follows the coarse-to-fine principle, where the PCA method obtains the initial position estimation, and DCP [35] brings the fine adjustment result on this basis.

PCA-DCP Based Method for Registration Stage
The problem in the registration stage can be stated as: our purpose is to find apposite rigid transformation parameters R xy , T xy where R xy ∈ SO(3) and T xy ∈ R 3 which transform A to approach B. Transform result A * is obtained by applying the parameters to A. If transformation parameters correct, A * and B will completely coincide in position in an ideal situation.
As shown in Figure 5, our registration method follows the coarse-to-fine principle, where the PCA method obtains the initial position estimation, and DCP [35] brings the fine adjustment result on this basis.
Inspired by [38], the main idea of the PCA coarse registration method is utilizing PCA to obtain the main axes system (MAS, a general direction descriptor, consists of three mutually orthogonal unit vectors, each called an axis) of the fracture surface, and then registering MASs instead of the whole surfaces. Since MAS is attached to the fracture surface, when MASs are registered, fracture surfaces will also be registered by applying transformation parameters for the entire fracture surfaces. Furthermore, because the fracture surface is a subset of the fragment, they share the same coordinate system. In the same way, the registration parameters of the fracture surfaces can also be applied to the whole fragments. Therefore, the registration of the two fragments is completed by registering two MASs of fracture surfaces; Figure 6 shows this process. Inspired by [38], the main idea of the PCA coarse registration method is utilizing PCA to obtain the main axes system (MAS, a general direction descriptor, consists of three mutually orthogonal unit vectors, each called an axis) of the fracture surface, and then registering MASs instead of the whole surfaces. Since MAS is attached to the fracture surface, when MASs are registered, fracture surfaces will also be registered by applying transformation parameters for the entire fracture surfaces. Furthermore, because the fracture surface is a subset of the fragment, they share the same coordinate system. In the same way, the registration parameters of the fracture surfaces can also be applied to the whole fragments. Therefore, the registration of the two fragments is completed by registering two MASs of fracture surfaces; Figure 6 shows this process. However, the PCA registration method is ambiguous. Since each axis in a MAS has two corresponding potential choices among all the axes in another MAS, there are eight possible registration schemes between them. Under the right-hand rule, half of the potential situations are excluded so that only four strategies remain. For lack of effective priori methods to determine which of the four possible schemes is correct, we can implement them and choose the best one by measuring their performances. For fracture surfaces A and B, the MAS matrix is denoted as and , and every column vector stands for the direction of an axis. The details of 4 potential schemes are described as follows and shown in Figure 7.  Inspired by [38], the main idea of the PCA coarse registration method is utilizing PCA to obtain the main axes system (MAS, a general direction descriptor, consists of three mutually orthogonal unit vectors, each called an axis) of the fracture surface, and then registering MASs instead of the whole surfaces. Since MAS is attached to the fracture surface, when MASs are registered, fracture surfaces will also be registered by applying transformation parameters for the entire fracture surfaces. Furthermore, because the fracture surface is a subset of the fragment, they share the same coordinate system. In the same way, the registration parameters of the fracture surfaces can also be applied to the whole fragments. Therefore, the registration of the two fragments is completed by registering two MASs of fracture surfaces; Figure 6 shows this process. However, the PCA registration method is ambiguous. Since each axis in a MAS has two corresponding potential choices among all the axes in another MAS, there are eight possible registration schemes between them. Under the right-hand rule, half of the potential situations are excluded so that only four strategies remain. For lack of effective priori methods to determine which of the four possible schemes is correct, we can implement them and choose the best one by measuring their performances. For fracture surfaces A and B, the MAS matrix is denoted as and , and every column vector stands for the direction of an axis. The details of 4 potential schemes are described as follows and shown in Figure 7. However, the PCA registration method is ambiguous. Since each axis in a MAS has two corresponding potential choices among all the axes in another MAS, there are eight possible registration schemes between them. Under the right-hand rule, half of the potential situations are excluded so that only four strategies remain. For lack of effective priori methods to determine which of the four possible schemes is correct, we can implement them and choose the best one by measuring their performances. For fracture surfaces A and B, the MAS matrix is denoted as MA = a 1 a 2 a 3 and MB = b 1 b 2 b 3 , and every column vector stands for the direction of an axis. The details of 4 potential schemes are described as follows and shown in Figure 7.
In any scheme, the rotation matrix is given by: The mean of coordinates should be acquired to obtain the translation vector. The translation vector T = B − A, and corresponding registration results can be obtained: The essence of the evaluating registration result is to evaluate the degree of overlap between A * and B. The shortest distance indicates the degree of overlap from point to surface.
This D with a small value indicates a good registration performance. All these four schemes are implemented, and the scheme corresponding to the smallest D is elected as the final decision. Based on PCA coarse estimation, the result of fine registration can be further obtained through DCP [35]. Avoiding the iterative strategy, point-level optimization, and risk of non-convergence, our PCA-DCP method has an excellent performance in efficiency and robustness. In any scheme, the rotation matrix is given by: * The mean of coordinates should be acquired to obtain the translation vector.
The translation vector ̅ , and corresponding registration results can be obtained: * * The essence of the evaluating registration result is to evaluate the degree of overlap between * and . The shortest distance indicates the degree of overlap from point to surface.

Results
All experiment data is digitized by the Visualization Laboratory of Northwest University, Shaanxi Province, China. SiamesePointNet network for the matching stage, the PCA-DCP-based method for the registration stage, and reassembly of multiple real-world Terracotta Warrior fragments are discussed in this section. The experiment is conducted on a computer with an Intel(R) Core i7-9700F@3.00HZ CPU and two NVIDIA GeForce RTX 2080Ti GPUs.

SiamesePointNET Network for Matching Stage
A small dataset containing 1800 matched pairs and 1560 non-matched pairs is collected and divided into training, validation, and test sets according to the ratio of 6:2:2. Throughout the experiment, we keep the feature dimension K = 1024. We follow [29] and experiment with an input point number of 1024. The accuracy achieves 96.77 and 93.69 in matched samples and non-matched samples, respectively. The average accuracy between two classes is 95.39, and the overall accuracy of all samples is 95.23. It demonstrates the effectiveness and applicability of SiamesePointNet in a small dataset. After that, the sampling rate is changed in the course of data processing, and the impact of the scale of input points is discussed on the proposed network. Intuitively, it is learned from Table 1 that the trend of accuracy gradually increases as the number of input points increases.
When the number of points reaches 2048, accuracy achieves a state of saturation. According to the principle of Occam s razor, 2048 is determined as the number of input points of our network. For comparison, in contrast to our deep learning approach, two traditional descriptors are applied to the matching stage: FPFH [39] and SHOT [40] features. Their performance on the same dataset is recorded in Table 2. The default number of points is kept at 2048. The experimental results show that the proposed method is superior to traditional methods on the same dataset. The data for the registration method is 360 matching fracture surfaces, which can be correctly registered if adequately handled. Table 3 demonstrates the effectiveness of our method and its peers. Firstly, the success rate of each method is recorded as a rough metric. Registration results that differ too much with ground truth in direction and distance are defined as failures, a contrast to success. After eliminating these failures, the MSE, RMSE, and MAE of the corresponding points of the rest are calculated to measure their performance in detail. Finally, the average number of iterations is recorded to reflect consumption. The first part is the vertical and quick comparison. As shown in Table 3, the PCA-DCP method in this paper has the highest success rate, and its MSE, RMSE, and MAE are slightly higher than the RANSAC method. Between iterative methods, the participation of the PCA method in this paper reduces the resource overhead.
The second part is the horizontal and detailed comparison and analysis. Due to the iterative computation and dependence on the initialization, traditional ICP performs the worst in success rate, error, and consumption. The RANSAC [41] method in row 3 is based on the FPFH feature, and its error performance is outstanding. However, the error performance is mainly due to the low success rate. We cannot deny that it performs very well on several successful samples, but such a success rate is unacceptable. The success rate of NDT [42] is too low to cope with real application scenarios. The results of methods using only PCA or DCP are in rows 5 and 6, respectively. The PCA method achieves a high success rate but is not excellent enough in terms of error, while the DCP method performance garners a poor success rate, but the error is smaller. In the 8th row, our PCA-DCP combination method achieves the highest success rate and slight error close to RANSAC, outperforming both PCA and DCP, which illustrates the mutual positive effects of the two ways.
Since the PCA method provides a reasonable initial estimate, it can also serve as a preprocess for other methods, such as ICP. Compared with traditional ICP, PCA-ICP has improved a lot. However, it is still inferior to PCA. Through the experimental results, we believe that PCA can provide good initialization for many cases where ICP would fail, which is reflected in the improvement of the success rate compared to ICP. Nevertheless, as a global coarse registration method, the initial position estimation provided by PCA is not accurate enough, so it is easy to make the PCA + ICP method fall into a local optimum in the process of iteratively calculating the corresponding points, which eventually leads to a decrease in performance compared to PCA. The comparison in rows 7 and 8 shows that PCA and DCP are much better than the combination of PCA and ICP.
PCA is outstanding in initial estimation but not accurate enough in detail, while DCP performs well in detail but depends on initialization. Combining them can absorb their advantages and remedy their shortcomings, showing the effect of "1 + 1 > 2".
To examine the performance of these methods on each sample, we save all the successful registration results and sort them according to MAE, as shown in Figure 8. It can be concluded that our method maintains the best performance in terms of robustness and details on different samples compared to other methods.
iterative computation and dependence on the initialization, traditional ICP performs the worst in success rate, error, and consumption. The RANSAC [41] method in row 3 is based on the FPFH feature, and its error performance is outstanding. However, the error performance is mainly due to the low success rate. We cannot deny that it performs very well on several successful samples, but such a success rate is unacceptable. The success rate of NDT [42] is too low to cope with real application scenarios. The results of methods using only PCA or DCP are in rows 5 and 6, respectively. The PCA method achieves a high success rate but is not excellent enough in terms of error, while the DCP method performance garners a poor success rate, but the error is smaller. In the 8th row, our PCA-DCP combination method achieves the highest success rate and slight error close to RANSAC, outperforming both PCA and DCP, which illustrates the mutual positive effects of the two ways.
Since the PCA method provides a reasonable initial estimate, it can also serve as a preprocess for other methods, such as ICP. Compared with traditional ICP, PCA-ICP has improved a lot. However, it is still inferior to PCA. Through the experimental results, we believe that PCA can provide good initialization for many cases where ICP would fail, which is reflected in the improvement of the success rate compared to ICP. Nevertheless, as a global coarse registration method, the initial position estimation provided by PCA is not accurate enough, so it is easy to make the PCA + ICP method fall into a local optimum in the process of iteratively calculating the corresponding points, which eventually leads to a decrease in performance compared to PCA. The comparison in rows 7 and 8 shows that PCA and DCP are much better than the combination of PCA and ICP.
PCA is outstanding in initial estimation but not accurate enough in detail, while DCP performs well in detail but depends on initialization. Combining them can absorb their advantages and remedy their shortcomings, showing the effect of "1 + 1>2".
To examine the performance of these methods on each sample, we save all the successful registration results and sort them according to MAE, as shown in Figure 8. It can be concluded that our method maintains the best performance in terms of robustness and details on different samples compared to other methods.   Figure 9 is a comparison of different methods on a specific pair of real-world data. In this example, PCA obtains a coarse initial pose estimation but is not good enough in detail. When no initial estimate is given, the DCP method leads to failure: there is a significant deviation between the two fracture surfaces in spatial position. The PCA-DCP method proposed in this work achieves the best result. Figure 10 shows the registration visualization of our method on several samples. Figure 9 is a comparison of different methods on a specific pair of real-world data. In this example, PCA obtains a coarse initial pose estimation but is not good enough in detail. When no initial estimate is given, the DCP method leads to failure: there is a significant deviation between the two fracture surfaces in spatial position. The PCA-DCP method proposed in this work achieves the best result. Figure 10 shows the registration visualization of our method on several samples.

Robustness Experiments
The robustness of our registration method is verified concerning the initial angle, initial distance, and noise, and the average MAE over all samples is shown in Figure 11. In the angle-robustness experiment, different upper limits for the initial angle (15°, 30°, 45°, 60°, 75°, 90°, 105°, and 120°) are set, and a specific fracture surface from ground truth is randomly rotated within the upper limit. The distance-robustness experiment is similar, and the maximum initial distance (10 mm, 20 mm, 30 mm, 40 mm, 50 mm, 60 mm, 70 mm, 80 mm, 90 mm, and 100 mm) specifies the translation distance between two surfaces. In the noise-robustness experiment, Gaussian noise is added to ground truth with gradually attenuated signal-to-noise ratios (SNR). The initial value of SNR is set to 100 db. As the attenuation rate increases from 0 to 70%, samples in ground truth gradually become noisier and noisier. The average MAE of the RANSAC method over all samples is significant, reaching more than 30, although its performance is stable on datasets with different angles, distances, and noise. Similarly, unlike distance-based methods, the results of NDT often show serious deviations in the registration direction, resulting in a higher mean MAE. Therefore, their MAE are not shown in the figure.  Figure 9 is a comparison of different methods on a specific pair of real-world data. In this example, PCA obtains a coarse initial pose estimation but is not good enough in detail. When no initial estimate is given, the DCP method leads to failure: there is a significant deviation between the two fracture surfaces in spatial position. The PCA-DCP method proposed in this work achieves the best result. Figure 10 shows the registration visualization of our method on several samples.

Robustness Experiments
The robustness of our registration method is verified concerning the initial angle, initial distance, and noise, and the average MAE over all samples is shown in Figure 11. In the angle-robustness experiment, different upper limits for the initial angle (15°, 30°, 45°, 60°, 75°, 90°, 105°, and 120°) are set, and a specific fracture surface from ground truth is randomly rotated within the upper limit. The distance-robustness experiment is similar, and the maximum initial distance (10 mm, 20 mm, 30 mm, 40 mm, 50 mm, 60 mm, 70 mm, 80 mm, 90 mm, and 100 mm) specifies the translation distance between two surfaces. In the noise-robustness experiment, Gaussian noise is added to ground truth with gradually attenuated signal-to-noise ratios (SNR). The initial value of SNR is set to 100 db. As the attenuation rate increases from 0 to 70%, samples in ground truth gradually become noisier and noisier. The average MAE of the RANSAC method over all samples is significant, reaching more than 30, although its performance is stable on datasets with different angles, distances, and noise. Similarly, unlike distance-based methods, the results of NDT often show serious deviations in the registration direction, resulting in a higher mean MAE. Therefore, their MAE are not shown in the figure.

Robustness Experiments
The robustness of our registration method is verified concerning the initial angle, initial distance, and noise, and the average MAE over all samples is shown in Figure 11. In the angle-robustness experiment, different upper limits for the initial angle (15 • , 30 • , 45 • , 60 • , 75 • , 90 • , 105 • , and 120 • ) are set, and a specific fracture surface from ground truth is randomly rotated within the upper limit. The distance-robustness experiment is similar, and the maximum initial distance (10 mm, 20 mm, 30 mm, 40 mm, 50 mm, 60 mm, 70 mm, 80 mm, 90 mm, and 100 mm) specifies the translation distance between two surfaces. In the noise-robustness experiment, Gaussian noise is added to ground truth with gradually attenuated signal-to-noise ratios (SNR). The initial value of SNR is set to 100 db. As the attenuation rate increases from 0 to 70%, samples in ground truth gradually become noisier and noisier. The average MAE of the RANSAC method over all samples is significant, reaching more than 30, although its performance is stable on datasets with different angles, distances, and noise. Similarly, unlike distance-based methods, the results of NDT often show serious deviations in the registration direction, resulting in a higher mean MAE. Therefore, their MAE are not shown in the figure.  It is worth noting that all non-convergent samples are filtered. In ICP-participated methods (traditional ICP and PCA + ICP), there is no statistical result for MAE if it does not converge. A threshold with a value of 200 is set, and once the number of iterations exceeds this threshold, it is considered non-convergence.
Due to the randomness in data manufacturing, we should pay more attention to the overall trend rather than the value of each sample. Our method exhibits the most stable and robust properties no matter the initial angle, distance, or noise. Figure 12 shows the registration results of several noisy fracture surfaces. It is worth noting that all non-convergent samples are filtered. In ICP-participated methods (traditional ICP and PCA + ICP), there is no statistical result for MAE if it does not converge. A threshold with a value of 200 is set, and once the number of iterations exceeds this threshold, it is considered non-convergence.
Due to the randomness in data manufacturing, we should pay more attention to the overall trend rather than the value of each sample. Our method exhibits the most stable and robust properties no matter the initial angle, distance, or noise. Figure 12 shows the registration results of several noisy fracture surfaces.

Reassembling Feasibility on Multiple Real-World Terracotta Warrior Fragments
To investigate the effectiveness of our framework, several sets of experiments are conducted on multiple real-world Terracotta Warrior fragments. Due to the diversity and uncertainty of the real-world Terracotta Warrior fragments reassembling scene, the performances of the proposed framework are different in specific cases. This section is arranged to demonstrate the feasibility of this framework through a complete reassembly process of a typical sample. A further discussion of the limitations of the proposed method will be mentioned in Section 4.
The results are satisfactory when fracture surfaces are sufficiently similar, as shown in Figure 13. Figure 13a shows the five pieces we are faced with. In this scenario, the largest fragments reach a volume of about 2000 cm 2 , and the smallest fragments are less than a tenth of that. However, since the data concerned by the proposed method is the fracture surface, the area difference becomes small after the fracture surfaces are extracted. In the beginning, a dataset for five fragments is created. Then, we calculate the similarity among fracture surfaces, select the most similar pair "3-4" to register and update the dataset, as shown in Figure 13b. At this stage, only four fragments are left. The original fragments "3" and "4" are replaced by a merged fragment "3-4", and the fracture surface where the registration was implemented was blocked and will not be considered in the next round of matching and registration steps. As with the above process, after several steps, we obtain the result shown in Figure 13e. At this point, there is only one fragment left, which

Reassembling Feasibility on Multiple Real-World Terracotta Warrior Fragments
To investigate the effectiveness of our framework, several sets of experiments are conducted on multiple real-world Terracotta Warrior fragments. Due to the diversity and uncertainty of the real-world Terracotta Warrior fragments reassembling scene, the performances of the proposed framework are different in specific cases. This section is arranged to demonstrate the feasibility of this framework through a complete reassembly process of a typical sample. A further discussion of the limitations of the proposed method will be mentioned in Section 4.
The results are satisfactory when fracture surfaces are sufficiently similar, as shown in Figure 13. Figure 13a shows the five pieces we are faced with. In this scenario, the largest fragments reach a volume of about 2000 cm 2 , and the smallest fragments are less than a tenth of that. However, since the data concerned by the proposed method is the fracture surface, the area difference becomes small after the fracture surfaces are extracted. In the beginning, a dataset for five fragments is created. Then, we calculate the similarity among fracture surfaces, select the most similar pair "3-4" to register and update the dataset, as shown in Figure 13b. At this stage, only four fragments are left. The original fragments "3" and "4" are replaced by a merged fragment "3-4", and the fracture surface where the registration was implemented was blocked and will not be considered in the next round of matching and registration steps. As with the above process, after several steps, we obtain the result shown in Figure 13e. At this point, there is only one fragment left, which suggests that all the fragments have been reassembled together, and our termination condition is met so that the reassembly process ends. suggests that all the fragments have been reassembled together, and our termination condition is met so that the reassembly process ends.

Discussion
This article proposes a 3D fragments reassembling method named SPPD based on fracture surface similarity, which mainly includes a similarity measurement network called SiamesePointNet and a PCA-DCP registration method. PointNet is embedded into

Discussion
This article proposes a 3D fragments reassembling method named SPPD based on fracture surface similarity, which mainly includes a similarity measurement network called SiamesePointNet and a PCA-DCP registration method. PointNet is embedded into a Siamese structure to quantitatively analyze the similarity of fracture surfaces in the matching stage. PCA is used to obtain a rough initial position estimate in the registration stage, and DCP is used to refine the results further. Extensive experiments prove the accuracy of SiamesePointNet on small-scale samples, the efficiency and robustness of the PCA-DCP method, and the effectiveness of the proposed method on real-world Terracotta Warriors.
Given the richness and reliability of the information on the fracture surface, the core of our method lies in the fracture surface. Once the information on the fracture surface is unreliable, for example, if the fracture surfaces are severely damaged or partially similar, problems will arise in our method. Figure 14 reflects the situation of severe damage, and our SPPD method judges that the two fragments do not match. This is a wrong judgment contrary to the ground truth.

Discussion
This article proposes a 3D fragments reassembling method named SPPD based on fracture surface similarity, which mainly includes a similarity measurement network called SiamesePointNet and a PCA-DCP registration method. PointNet is embedded into a Siamese structure to quantitatively analyze the similarity of fracture surfaces in the matching stage. PCA is used to obtain a rough initial position estimate in the registration stage, and DCP is used to refine the results further. Extensive experiments prove the accuracy of SiamesePointNet on small-scale samples, the efficiency and robustness of the PCA-DCP method, and the effectiveness of the proposed method on real-world Terracotta Warriors.
Given the richness and reliability of the information on the fracture surface, the core of our method lies in the fracture surface. Once the information on the fracture surface is unreliable, for example, if the fracture surfaces are severely damaged or partially similar, problems will arise in our method. Figure 14 reflects the situation of severe damage, and our SPPD method judges that the two fragments do not match. This is a wrong judgment contrary to the ground truth. Although our registration method can obtain excellent registration results, there is a premise that the similarity of the fracture surfaces is sufficiently high. As shown in Figure  15, when given fragments with low similarity, our registration result will be unsatisfactory in detail. Therefore, how to complete the reassembling task with insufficient information on the fracture surface is a problem to be overcome in our later work. Although our registration method can obtain excellent registration results, there is a premise that the similarity of the fracture surfaces is sufficiently high. As shown in Figure 15, when given fragments with low similarity, our registration result will be unsatisfactory in detail. Therefore, how to complete the reassembling task with insufficient information on the fracture surface is a problem to be overcome in our later work. In the reassembly process of entire Terracotta warriors, severe damage to the fracture surfaces inevitably occurs. At present, the proposed method undertakes the reassembly of parts with high integrity. After the reassembly of these parts is completed, the rest are difficult parts requiring manual guidance from experts. Figure 16 shows the reassembling result of an entire Terracotta warrior. The number of fragments to be processed is 49, and seven intermediate results were successfully obtained, leaving nine individual fragments without matching objects. These are then artificially reassembled together to obtain the final result. It is believed that the problem of incomplete fracture surface will be overcome in the future, and the dependence on manual operation will be gradually reduced. In the reassembly process of entire Terracotta warriors, severe damage to the fracture surfaces inevitably occurs. At present, the proposed method undertakes the reassembly of parts with high integrity. After the reassembly of these parts is completed, the rest are difficult parts requiring manual guidance from experts. Figure 16 shows the reassembling result of an entire Terracotta warrior. The number of fragments to be processed is 49, and seven intermediate results were successfully obtained, leaving nine individual fragments without matching objects. These are then artificially reassembled together to obtain the final result. It is believed that the problem of incomplete fracture surface will be overcome in the future, and the dependence on manual operation will be gradually reduced.
of parts with high integrity. After the reassembly of these parts is completed, the rest are difficult parts requiring manual guidance from experts. Figure 16 shows the reassembling result of an entire Terracotta warrior. The number of fragments to be processed is 49, and seven intermediate results were successfully obtained, leaving nine individual fragments without matching objects. These are then artificially reassembled together to obtain the final result. It is believed that the problem of incomplete fracture surface will be overcome in the future, and the dependence on manual operation will be gradually reduced. In this framework, the computational overhead caused by the rising number of input fragments cannot be ignored. Fortunately, however, this number will not keep increasing endlessly. In the excavation and preservation of the fragments, there is a weak spatial continuity. That is to say, the fragments from the same Terracotta Warrior are close to each other in space. Therefore, the maximum number of fragments that our system needs to deal with at one time is usually not more than 100, which is an acceptable range. Even so, the search for a faster solution is a potential future research direction.

Conclusions
In conclusion, the purpose of the current study is to reassemble the 3D fragments correctly and effectively. The experiment demonstrates the practicability of the real-world Terracotta Warriors of the proposed method. We hope this work can provide a valuable tool for the virtual restoration of three-dimension cultural heritage artifacts.
Author Contributions: Conceptualization, W.Y. and X.C.; methodology, X.C. and K.L.; software, W.Y.; validation, X.C. and K.L.; investigation, W.Y. and W.T.; resources, W.Y.; data curation, T.C.; writing-original draft preparation, W.Y.; writing-review and editing, X.C.; visualization, F.Z. and In this framework, the computational overhead caused by the rising number of input fragments cannot be ignored. Fortunately, however, this number will not keep increasing endlessly. In the excavation and preservation of the fragments, there is a weak spatial continuity. That is to say, the fragments from the same Terracotta Warrior are close to each other in space. Therefore, the maximum number of fragments that our system needs to deal with at one time is usually not more than 100, which is an acceptable range. Even so, the search for a faster solution is a potential future research direction.

Conclusions
In conclusion, the purpose of the current study is to reassemble the 3D fragments correctly and effectively. The experiment demonstrates the practicability of the real-world Terracotta Warriors of the proposed method. We hope this work can provide a valuable tool for the virtual restoration of three-dimension cultural heritage artifacts.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to that the Terracotta Warriors involve a policy of secrecy over cultural heritage.