HoloLens-Based AR System with a Robust Point Set Registration Algorithm

By the standard of today’s image-guided surgery (IGS) technology, in order to check and verify the progress of the surgery, the surgeons still require divert their attention from the patients occasionally to check against the display. In this paper, a mixed-reality system for medical use is proposed that combines an Intel RealSense sensor with Microsoft’s Hololens head-mounted display system, for superimposing medical data onto the physical surface of a patient, so the surgeons do not need to divert their attention from their patients. The main idea of our proposed system is to display the 3D medical images of the patients on the actual patients themselves by placing the medical images and the patients in the same coordinate space. However, the virtual medical data may contain noises and outliers, so the transformation mapping function must be able to handle these problems. The transform function in our system is performed by the use of our proposed Denoised-Resampled-Weighted-and-Perturbed-Iterative Closest Points (DRWP-ICP) algorithm, which performs denoising and removal of outliers before aligning the pre-operative medical image data points to the patient’s physical surface position before displaying the result using the Microsoft HoloLens display system. The experimental results shows that our proposed mixed-reality system using DRWP-ICP is capable of performing accurate and robust mapping despite the presence of noise and outliers.


Introduction
AR/MR (Augmented reality/mixed reality) is a field of study that is rapidly catching the interests of researchers of medical and image processing technologies [1,2]. This study is on the use of AR in the field of image-guided surgery [3], which seeks to reduce invasiveness by showing the doctor the exact locations of the lesions on the skin surface of the patient. An important part of any AR-based research is the problem of data registration; i.e., how to register two sets of data points, e.g., the image data point and the physical object's surface data points, in the same coordinate space using alignment so as to superimpose two sets of data points accurately. The traditional method used for registration is the ICP (iterative closest points) algorithm, introduced by Besl and Zhang [4]. It uses non-elastic transformations to convert 3D data points from one coordinate system to another. The problems with it include that it requires a good starting position to start the alignment in order to get a good result, and any noise in the data could potentially corrupt the final alignment result. Later, Andriy et al. proposed the coherent point drift (CPD) [5], the main idea of which is to initialize a Gaussian mixture model and use the expectation-maximization calculation to move the centroid of the mixture. The CPD uses a uniform distribution component to account for the outliers, and could also handle some noise in the

The DRWP-ICP Procedure
Step 1. Data Input. The pre-operative medical/CT image was the set of reference points R, and the patient's surface data captured by the RGB-Depth sensor was the set of floating points, F1 and F2.
Step 2. Floating Data Processing by Denoising and Resampling. The main idea in the denoising stage was to use a statistical filter on the neighborhood of each data point and remove points that did not meet certain criteria. The sparse outlier removal method was based on the calculation of the distance distribution from the point in the input data to the adjacent point. The statistical filter assumed that the average distance between all points in the point cloud and the nearest K neighbors satisfies the Gaussian distribution. According to the mean variance, a distance threshold can be determined. When the average distance between a point and the nearest K points was greater than this the threshold, the points beyond this threshold were then removed. First the average distance between each point and its nearest K neighbors was calculated, and then the mean μ and standard deviation σ of all the average distances were calculated. The distance threshold can be defined as: where α is a constant known as the scale factor and its values depend on the number of neighbors. Finally each point whose distance was greater than dmax was removed. In the resampling stage, the main idea was to use a moving least squares surface reconstruction method [13] to smooth and resample the noisy data. This algorithm attempted to reconstruct the missing portion of the surface by high-order polynomial interpolation between surrounding data points. First, the corresponding surface was obtained from the acquired point cloud data set, and then the surface normal was calculated from the model. The solution for estimating the surface normal was simplified by analyzing the eigenvectors and eigenvalues (or PCA-principal component analysis [14]) of the covariance matrix from the nearest neighbor of the query point. The purpose of the surface normal estimation was to first obtain the nearest neighbor element of point p and then calculate the surface normal n of point p. Then we checked whether the direction of n was consistently pointing to the viewpoint. If not, it was flipped. The viewpoint coordinates were preset to (0,0,0). The least squares method is a global method, which cannot meet the requirements for localized processing, and it can cause difficulty in model setting and computational instability for a large number of point data. Therefore, this method used the moving least squares method to solve the registration problem. The scatter adaptability also has the advantages of local curve fitting, being adaptable to distributed difference characteristics, as well as having a high precision. The moving least squares method  The DRWP-ICP Procedure Step 1. Data Input. The pre-operative medical/CT image was the set of reference points R, and the patient's surface data captured by the RGB-Depth sensor was the set of floating points, F 1 and F 2 .
Step 2. Floating Data Processing by Denoising and Resampling. The main idea in the denoising stage was to use a statistical filter on the neighborhood of each data point and remove points that did not meet certain criteria. The sparse outlier removal method was based on the calculation of the distance distribution from the point in the input data to the adjacent point. The statistical filter assumed that the average distance between all points in the point cloud and the nearest K neighbors satisfies the Gaussian distribution. According to the mean variance, a distance threshold can be determined. When the average distance between a point and the nearest K points was greater than this the threshold, the points beyond this threshold were then removed. First the average distance between each point and its nearest K neighbors was calculated, and then the mean µ and standard deviation σ of all the average distances were calculated. The distance threshold can be defined as: where α is a constant known as the scale factor and its values depend on the number of neighbors. Finally each point whose distance was greater than d max was removed. In the resampling stage, the main idea was to use a moving least squares surface reconstruction method [13] to smooth and resample the noisy data. This algorithm attempted to reconstruct the missing portion of the surface by high-order polynomial interpolation between surrounding data points. First, the corresponding surface was obtained from the acquired point cloud data set, and then the surface normal was calculated from the model. The solution for estimating the surface normal was simplified by analyzing the eigenvectors and eigenvalues (or PCA-principal component analysis [14]) of the covariance matrix from the nearest neighbor of the query point. The purpose of the surface normal estimation was to first obtain the nearest neighbor element of point p and then calculate the surface normal n of point p.
Then we checked whether the direction of n was consistently pointing to the viewpoint. If not, it was flipped. The viewpoint coordinates were preset to (0,0,0). The least squares method is a global method, which cannot meet the requirements for localized processing, and it can cause difficulty in model setting and computational instability for a large number of point data. Therefore, this method used the moving least squares method to solve the registration problem. The scatter adaptability also has the advantages of local curve fitting, being adaptable to distributed difference characteristics, as well as having a high precision. The moving least squares method divided the region into meshes, then looped through each point in the mesh to compute the shape function at each node, then finally connected the nodes to form a fitted curve surface. This was done for both floating datasets, then the data were resampled from both datasets to generate a smoother, more accurate floating dataset, F.
Step 3. ICP Alignment. For any given point f in F, we found its nearest corresponding point r in R by calculating the minimum distance between f and each possible r, d. Then weight values were assigned to each pair in the same group of data that matches the same reference point, then the min value was calculated according to the following: where N B is the total number of points in the floating data set, and N C is the number of points to be excluded. This number was determined after sorting d i in order to reduce the overall complexity.
Then the weight was assigned as follows: After the assignments were completed, then the objective function which seeks to minimize the root mean square (RMS) value, was calculated. The way to evaluate the objective function was similar to gradient descent, so in order to speed up the search, the line search method was implemented, which searched in the nonlinear space of the objective function, F, in order to arrive at a minimum faster. If the solution was accepted, then a transformation matrix T' was obtained by the ICP algorithm, and if the RMS value calculated was less than the previous iteration, then T' replaced the currently optimal transformation matrix T, then automatically jumped to the perturbation stage. Otherwise, whether the stop condition had been reached had to be determined first.
Step 4. Stop Condition. The stop condition was so that if the RMS value is less than a preset threshold, or if the number of iterations so far was greater than a preset value, then the stop condition was determined to be reached, and the algorithm stopped.
Step 5. Perturbation. If we let T init be the initialization transformation T before the ICP begins and let the T temp be the transformation matrix for the converged solution, then the range to explore for T temp would be in the range of r, where since T init and T temp are both 4 × 4 transformation matrices, the range of r represents the range of possible discrete 4 × 4 transforms. Now suppose T temp only reached a local minimum, then the transformation for perturbation was chosen from the range of r, where probability of being selected was: where α is the scaling factor to help expand the perturbation range if the current search range was not sufficient to escape from a local solution.
The illustrations for the effects of denoising and resampling can be seen below in Figure 2.

Experimental Results
In the experiments, a specially-designed marker was placed next to the patient's head (dummy head), so when the camera detected the marker plate, the built-in Vuforia image tracking SDK (Software Development Kit) [15] would know the spatial position of the patient's head so as to map the HoloLens' coordinate system to the marker's coordinate system, then the medical image data points were aligned to the patient's surface information via different ICP-based algorithms, including our proposed method. The RMS (root-mean-squared) errors and alignment times were recorded as the performance of DRWP-ICP was compared to other well-known ICP-based algorithms proposed in the literature. The final results were then displayed in the HoloLens output. In this setup, each floating dataset was composed of 48,596 points.
The hardware setup included a desktop running Windows with Intel ® Core™ i5-4460 CPU, an Intel ® RealSense RGB-Depth sensor (Santa Clara, CA, USA), and the Microsoft HoloLens MR system (Redmond, WA, USA). The software and user interface was developed using Unity [16] and OpenCV [17]. In order to compare the DRWP-ICP's accuracy with other ICP-based algorithms, positional values of five references points were first obtained by using the MicroScribe G2X digitizer coordinate tracking device [18]. The number of pre-measured reference points was chosen based on the experimental setup in the Improved-ICP paper [8]. Figure 3 shows possible alignment results, where Figure 3a is one of the alignment results through the use of the Improved-ICP algorithm (blue points represent the reference dataset, the pink points represent the floating dataset (that are different from the values of the reference dataset), and yellow points represent the five reference points whose positional values are pre-measured); Figure 3b is one of the alignment results through the use of our proposed DRWP-ICP algorithm (the blue points represent the reference dataset, the pink points represent the floating dataset (that are different from the values of the reference dataset), and yellow dots represents the five pre-measured reference points); and Figure 3c is the output taken using a regular camcorder, without Hololens, with the areas representing floating data points marked in gray; and Figure 3d is the outputs on the Hololens display, with the reference data points (in pink) overlaying the dummy. As it can be seen, HoloLens provided an augmented display over the reality of the dummy head.

Experimental Results
In the experiments, a specially-designed marker was placed next to the patient's head (dummy head), so when the camera detected the marker plate, the built-in Vuforia image tracking SDK (Software Development Kit) [15] would know the spatial position of the patient's head so as to map the HoloLens' coordinate system to the marker's coordinate system, then the medical image data points were aligned to the patient's surface information via different ICP-based algorithms, including our proposed method. The RMS (root-mean-squared) errors and alignment times were recorded as the performance of DRWP-ICP was compared to other well-known ICP-based algorithms proposed in the literature. The final results were then displayed in the HoloLens output. In this setup, each floating dataset was composed of 48,596 points.
The hardware setup included a desktop running Windows with Intel ® Core™ i5-4460 CPU, an Intel ® RealSense RGB-Depth sensor (Santa Clara, CA, USA), and the Microsoft HoloLens MR system (Redmond, WA, USA). The software and user interface was developed using Unity [16] and OpenCV [17]. In order to compare the DRWP-ICP's accuracy with other ICP-based algorithms, positional values of five references points were first obtained by using the MicroScribe G2X digitizer coordinate tracking device [18]. The number of pre-measured reference points was chosen based on the experimental setup in the Improved-ICP paper [8]. Figure 3 shows possible alignment results, where Figure 3a is one of the alignment results through the use of the Improved-ICP algorithm (blue points represent the reference dataset, the pink points represent the floating dataset (that are different from the values of the reference dataset), and yellow points represent the five reference points whose positional values are pre-measured); Figure 3b is one of the alignment results through the use of our proposed DRWP-ICP algorithm (the blue points represent the reference dataset, the pink points represent the floating dataset (that are different from the values of the reference dataset), and yellow dots represents the five pre-measured reference points); and Figure 3c is the output taken using a regular camcorder, without Hololens, with the areas representing floating data points marked in gray; and Figure 3d is the outputs on the Hololens display, with the reference data points (in pink) overlaying the dummy. As it can be seen, HoloLens provided an augmented display over the reality of the dummy head.
The experiments were designed for two groups of tests, identical except for the pre-alignment process. In the first group, no initial coarse pre-alignment was performed for each ICP algorithm under testing for their effectiveness under non-ideal conditions. In the second group of experiments, initial coarse pre-alignments were performed for each ICP algorithm (except Improved-ICP and DRWP-ICP algorithms) before further testing. Each of these groups were divided into two subgroups of experiments. Each of the subgroups had several sets of experiments to be performed: The first set of experiments tested the efficacy of each ICP-based algorithm in the presence of sparse floating data.
The second set of experiments tested the efficacy of each ICP-based algorithm in the presence of additive Gaussian noises. The third and final set of experiment tested the efficacy of each ICP-based algorithm in the presence of outliers. Both averaged alignment time (excluding coarse alignment), and the average RMS errors of five pre-measured reference points were recorded after multiple repeated experiments. The ICP-based algorithms to be tested included the PCL-ICP, the CPD, the CC-ICP, the CC-ICP with its denoise function enabled (CC-ICP+D), the Improved-ICP, and our proposed DRWP-ICP.  The experiments were designed for two groups of tests, identical except for the pre-alignment process. In the first group, no initial coarse pre-alignment was performed for each ICP algorithm under testing for their effectiveness under non-ideal conditions. In the second group of experiments, initial coarse pre-alignments were performed for each ICP algorithm (except Improved-ICP and DRWP-ICP algorithms) before further testing. Each of these groups were divided into two subgroups of experiments. Each of the subgroups had several sets of experiments to be performed: The first set of experiments tested the efficacy of each ICP-based algorithm in the presence of sparse floating data.
The second set of experiments tested the efficacy of each ICP-based algorithm in the presence of additive Gaussian noises. The third and final set of experiment tested the efficacy of each ICP-based algorithm in the presence of outliers. Both averaged alignment time (excluding coarse alignment), and the average RMS errors of five pre-measured reference points were recorded after multiple repeated experiments. The ICP-based algorithms to be tested included the PCL-ICP, the CPD, the CC-ICP, the CC-ICP with its denoise function enabled (CC-ICP+D), the Improved-ICP, and our proposed DRWP-ICP.

Sparse Data
In order to test the effect of sparse data, the cases included testing 100% of floating data used in the alignment as reference, as well when only 70%, 50%, 25%, and 10% of the floating data points were used. The searches all started from random positions, including random distance and orientation. The following figure, Figure 4, shows examples of the floating data points in white and the reference data points in gold. Tables 1 and 2 show the average, maximum and minimum errors; and alignment times, respectively. Since the alignment times for CC-ICP and CC-ICP+D were almost the same, no extra column was used for the CC-ICP+D alignment times.

Sparse Data
In order to test the effect of sparse data, the cases included testing 100% of floating data used in the alignment as reference, as well when only 70%, 50%, 25%, and 10% of the floating data points were used. The searches all started from random positions, including random distance and orientation. The following figure, Figure 4, shows examples of the floating data points in white and the reference data points in gold. Tables 1 and 2 show the average, maximum and minimum errors; and alignment times, respectively. Since the alignment times for CC-ICP and CC-ICP+D were almost the same, no extra column was used for the CC-ICP+D alignment times.

Noise
In this set of experiments, Gaussian noises of different variances, i.e., 1.0, 2.0, 5.0, and 7.0, were added to the floating datasets. The following figure, Figure 5, shows examples of the floating data points in green and the reference data points in gold. Tables 3 and 4 show the average, minimum, maximum errors; and the alignment times, respectively.

Outliers
In this set of experiments, different numbers of outliers, i.e., 600, 1800, 3000, and 6000, were deliberately added to the floating dataset. The following figure, Figure 6, shows examples of the floating data points in pink and the reference data points in gold. Tables 5 and 6 show the average, minimum, maximum errors; and alignment times, respectively.

Outliers
In this set of experiments, different numbers of outliers, i.e., 600, 1800, 3000, and 6000, were deliberately added to the floating dataset. The following figure, Figure 6, shows examples of the floating data points in pink and the reference data points in gold. Tables 5 and 6 show the average, minimum, maximum errors; and alignment times, respectively.

With SAC-IA Coarse Pre-Alignment
In this subgroup of experiment a preliminary coarse pre-alignment, the sampling consensus initial alignment (SAC-IA) [19], was first performed for each ICP-based algorithm. However, no coarse alignments were performed for the Improved-ICP algorithm, because its authors indicated no coarse alignment is necessary. Also, no coarse alignment was performed for our proposed DRWP-ICP, because we also believe that it is not necessary. Their values in the tables below are exactly the same as their values in the first group of experiments and are listed for reference purposes only. The figure below, Figure 7, illustrates the process of coarse alignment, where the green points represents the floating data and the red points represent the reference data. Table 7 displays the average RMS errors after the coarse alignment for the five reference points. coarse alignment is necessary. Also, no coarse alignment was performed for our proposed DRWP-ICP, because we also believe that it is not necessary. Their values in the tables below are exactly the same as their values in the first group of experiments and are listed for reference purposes only. The figure below, Figure 7, illustrates the process of coarse alignment, where the green points represents the floating data and the red points represent the reference data. Table 7 displays the average RMS errors after the coarse alignment for the five reference points.  Again, in order to test the effects of different ICP-based algorithm in the presence of sparse data, only 70%, 50%, 25%, and 10% of the floating data points were used. The original 100% of the data was also used for reference purposes. Tables 8 and 9 show the average, minimum, maximum errors; and alignment times, respectively (not including the times taken for coarse pre-alignments).

Sparse
Again, in order to test the effects of different ICP-based algorithm in the presence of sparse data, only 70%, 50%, 25%, and 10% of the floating data points were used. The original 100% of the data was also used for reference purposes. Tables 8 and 9 show the average, minimum, maximum errors; and alignment times, respectively (not including the times taken for coarse pre-alignments).

Noise
In this set of experiments, Gaussian noises of different variances, i.e., 1.0, 2.0, 5.0, and 7.0, were added to the floating datasets. The mathematical expression of Gaussian distribution is: where µ is the sample mean and the σ is the sample standard deviation. Tables 10 and 11 show the average, minimum, maximum errors; and alignment times, respectively.

Outliers
In this set of experiments, different numbers of outliers, i.e., 600, 1800, 3000, and 6000, were deliberately added to the floating dataset. Tables 12 and 13 show the average, minimum, maximum errors; and alignment times, respectively. A final experiment using floating datasets with sparse data (10%), Gaussian noise (Variance = 7.0), and outliers (6000) was tested in order to test the robustness of our proposed DRWP-ICP when all three conditions exist. Table 14 below shows the average, maximum, and minimum alignment errors of the five registration methods. Coarse alignments were performed for PCL-ICP, CPD, and CC-ICP.

Discussion
The same observation as the Improved-ICP paper [8] can be drawn from the differences between the RMS values of experiments with and without coarse alignment: that the performances of the PCL-ICP, CPD and CC-ICP algorithms can all be improved by a coarse pre-alignment. The results from the experiments without alignments, seem to partially validate the claims by CPD, that is more robust under noisy and outliers, because all of the RMS errors of CPD were less than those of the PCL-ICP and the CC-ICP. However, in the experiments with coarse alignment, the performance of CPD dropped below the other ICP-base methods when that data became too noisy, or had too many outliers. Also in all the experiments the CPD took longer to terminate. Also, an obvious observation can also be made: noisier data, or data that have more outliers, will take longer to align. In the case of sparse data, the sparser the data, the less time it takes to perform alignment. Another interesting observation concerns the effects of the additional coarse pre-alignment for PCL-ICP, CPD, and CC-ICP algorithms. For example, by first performing coarse alignment, the alignment time for the PCL-ICP, CPD, and CC-ICP algorithms took shorter times to achieve their respective, and better alignment results, including sparse, noisy, and in most of the outliers cases. We believe that this is due to the fact that CPD and the other ICP-based methods, except Improved-ICP, did not fully consider being trapped into local minima, so when the floating data started from a random starting position, they were trapped into sub-optimal solutions. Our study proposes a method to escape from local minima even when the search appears to have converged. The coarse alignment reduced the search-space for the other methods and allowing them to reach better solutions.
Another observation is that by including the pre-alignment operation, and without counting the time for the pre-alignment, most of the alignment times for the PCL-ICP, CPD, and CC-ICP algorithms actually decreased, even though they achieved better results with pre-alignments. This is an interesting phenomena, however, because their performances still lag behind DRWP-ICP, it remains just a mildly interesting phenomena.
The entries in the tables above show that in almost all of the test cases, including RMS errors and running times, the proposed DRWP-ICP exhibited much better performances than the other algorithms in the test group, even the Improved-ICP algorithm. However, it is also observed that the CC-ICP with denoise function turned on can perform better. This is even more obvious with our coarse alignment, CC-ICP with denoise function was able to also reach near optimal solutions. This shows the importance of a good coarse alignment for the CC-ICP algorithm. So, these results combined show that the operations of denoising and resampling do affect the final results of alignment, and implies that DRWP-ICP algorithm is not only more accurate but also more robust than the other algorithms tested in the experiment for point set registration/alignment. This observation is valid since the original coarse alignment algorithm included in CC-ICP was not sufficient for it to reach near optimal solutions. The results in Table 14 show that it is superior even when all three conditions exists; i.e., noisy, sparse, and having outliers. It is also clear that the DRWP-ICP algorithm also does not require pre-alignment in order to reach near globally optimal solutions, so there is no need to consider pre-alignment in order to improve DRWP-ICP, which is makes it more efficient than the other algorithms that require pre-alignments in order to achieve better results.

Conclusions
In this work, a mixed-reality system for aiding the surgeons during image-guided surgeries is proposed. The pre-operative medical images are first taken and then superimposed on the patient through the use of the DRWP-ICP algorithm, which was shown by experiments to be superior both in performance and robustness when compared to other ICP-based algorithms in the presence of sparse data, noisy data, outlier data, or a combination of them. The aligned results are then displayed on the Microsoft's HoloLens display as the medical data are superimposed on the patient. The use of the proposed MR system implies that the surgeon does not need to remove his focus from the patient during the operation in order to ascertain the progress of the operation, and that the surgeon can have full confidence that the medical data will be accurately displayed over the patient.