I-VoxICP: A Fast Point Cloud Registration Method for Unmanned Surface Vessels

Jing, Qianfeng; Bai, Mingwang; Yin, Yong; Guo, Dongdong

doi:10.3390/jmse13101854

¹ Navigation College, Dalian Maritime University, Dalian 116026, China

² State Key Laboratory of Maritime Technology and Safety, Dalian Maritime University, Dalian 116026, China

* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2025, 13(10), 1854; https://doi.org/10.3390/jmse13101854

Submission received: 1 August 2025 / Revised: 23 September 2025 / Accepted: 24 September 2025 / Published: 25 September 2025

(This article belongs to the Special Issue Unmanned Surface Vessels (USVs): Technology, Applications and Regulatory Landscapes)

Abstract

The accurate positioning and state estimation of surface vessels are prerequisites to autonomous navigation. Recently, the rapid development of 3D LiDARs has promoted the autonomy of both land and aerial vehicles, which has attracted the interest of researchers in the maritime community. However, in traditional multi-scenario maritime surface applications, LiDAR scan matching suffers from low matching efficiency and insufficient positional accuracy when dealing with large-scale point clouds, so it has difficulty meeting the real-time demands of low-computing-power platforms. In this paper, we use ICP-SVD for point cloud alignment on the Stanford dataset and in outdoor dock scenarios and propose an optimization scheme (iVox + ICP-SVD) that incorporates the voxel structure iVox. Experiments show that the average search time of iVox is 72.23% and 96.8% lower than that of ikd-tree and kd-tree, respectively. Executed on an NVIDIA Jetson Nano (four ARM Cortex-A57 cores @ 1.43 GHz), the algorithm processes 18 k downsampled points in 56 ms on average and 65 ms in the worst case, i.e., at a rate of at least 15 Hz, so every scan is completed before the next 10–20 Hz LiDAR sweep arrives. During a 73 min continuous harbor trial, the CPU temperature stabilized at 68 °C without thermal throttling, confirming that the reported latency is a sustainable, field-proven upper bound rather than a laboratory best case. The voxel index thus dramatically improves retrieval efficiency while maintaining matching accuracy. As a result, the overall alignment process is significantly accelerated, providing an efficient and reliable solution for real-time point cloud processing.

Keywords: LiDAR scan matching; point cloud alignment; voxel structure

1. Introduction

Precise positioning and state estimation of surface vessels are vital for realizing autonomous navigation. In the marine environment, unmanned surface vehicles (USVs) rely on high-precision localization techniques to ensure their autonomous navigation and obstacle avoidance capabilities [1]. Typically, the position and heading angles of a ship are obtained through Global Navigation Satellite Systems (GNSS), such as GPS-based positioning and GPS compassing. Japan’s Quasi-Zenith Satellite System (QZSS) has also been developed, incorporating a Centimeter-Level Augmentation Service (CLAS). In open-water environments, GNSS positioning accuracy is generally sufficient to meet operational requirements. However, in port areas, GNSS signals are highly susceptible to obstruction, reflection, and refraction caused by tall structures such as containers, cranes, and buildings. These environmental interferences significantly degrade positioning accuracy. The resulting signal unreliability presents a major challenge for unmanned surface vehicles (USVs) that depend on precise positioning data, particularly during critical operations such as autonomous mooring. To ensure the operational safety of USVs, sensor redundancy must be implemented to provide a reliable alternative positioning solution when GNSS signals are unavailable. Although traditional inertial navigation systems (INSs) can offer short-term positioning during GNSS outages, their inherent error accumulation limits their effectiveness during prolonged signal loss.

To enhance USV safety through sensor redundancy, LiDAR is employed for simultaneous localization and mapping (SLAM) to construct a coastal point cloud map. This map can support autonomous navigation in GNSS-challenged environments, especially in obstacle-dense areas such as during automatic ship mooring. This capability is primarily due to LiDAR’s independence from GNSS signals, insensitivity to lighting conditions, and ability to provide an omnidirectional field of view, enabling scale-aware six-degree-of-freedom (6-DOF) pose estimation and accurate three-dimensional scene reconstruction. Furthermore, LiDAR can be integrated with other sensors, such as inertial measurement units (IMUs), to improve positioning accuracy and reliability through sensor fusion techniques.

In the domain of USV positioning within complex dynamic port environments, studies have explored several ways to enhance robustness and accuracy. To tackle the insufficient robustness of traditional LiDAR SLAM systems in dynamic berthing scenarios, Zhaozheng et al. [2] proposed integrating semantic information into the algorithmic framework. Specifically, a semantically weighted odometry was constructed to guide the algorithm toward prioritizing static and stable features (e.g., docks and buildings) while mitigating the interference caused by dynamic objects such as moving ships and vehicles. This approach ultimately achieved centimeter-level positioning, improving positioning reliability in dynamic scenarios.

Building upon this work, Sawada et al. [3] further improved the redundancy and stability of the positioning system by introducing a scheme that employs LiDAR SLAM as a GNSS redundancy component. The combination of IMU pre-integration and coordinate transformation constraints optimized the point cloud matching process, significantly enhancing the stability and accuracy of USV positioning under wave-induced heaving conditions in marine environments.

With respect to the optimization of perception capabilities, relevant studies have focused on addressing the challenges associated with marine target detection and satisfying real-time performance requirements. Ponzini et al. [4] developed a hybrid learning approach incorporating a supervised module into an unsupervised LiDAR detection framework. This innovation effectively overcame the technical bottlenecks in target detection and classification within marine environments. For scenarios with constrained computing and energy resources, Eustache et al. [5] established a dedicated USV LiDAR point cloud dataset to validate the efficiency of the PointPillars network, providing a feasible solution for fast and reliable object detection in lightweight environments.

Furthermore, to balance real-time performance and accuracy in obstacle detection, Wan et al. [6] proposed an improved RANSAC-based online obstacle detection scheme specifically designed for USVs. In this scheme, point cloud data were preprocessed via voxel filtering, followed by the application of the improved RANSAC algorithm to achieve ground segmentation and the classification of static/dynamic point clouds. Finally, obstacle marking was implemented through grid clustering. Experimental validations demonstrated that this scheme effectively reduced the missed detection rate of static obstacles and met the real-time performance requirements.

These studies provide multidimensional insights that allow us to optimize the efficiency of point cloud scan matching. The core of this research work is inseparable from point cloud matching technology, and its efficiency is crucial for real-time matching during the autonomous berthing process.

Point cloud matching, as the core aspect of LiDAR scan matching, is the basis of simultaneous localization and mapping (SLAM) [7] for unmanned vessels and ensures efficient and accurate state estimation in dynamic environments.

In LiDAR scan-matching algorithms, the two main approaches are direct matching and feature-based matching. Direct matching deals with the original point cloud data directly, while feature-based matching utilizes the local features in the point cloud. These two methods have their own advantages and disadvantages in practical applications, especially in the point cloud storage structure and search speed.

Direct LiDAR scan-matching methods need to align a large number of data points when processing raw point clouds, which requires efficient point cloud storage structures to support fast search and matching operations. Since direct methods do not distinguish between feature points, they need to iterate according to the nearest-neighbor criterion when establishing correspondences, which may lead to a large amount of computation. To improve the efficiency, spatial index structures such as kd-tree [8] and ikd-tree [9,10] are often used to accelerate the nearest-neighbor search. However, in large-scale data processing, there are problems such as reduced query efficiency, excessive memory usage, and difficulty in meeting real-time requirements. These problems are especially prominent in maritime environments. In addition, the direct method is more dependent on the initial guess, and if the initial position is inaccurately estimated, more iterations may be required to converge to the correct transformation. In the case of partial overlap and noisy scan data, the performance of the direct method may suffer more as the noise points are also incorporated into the matching process, which increases the computational complexity and may lead to matching errors.

Feature-based LiDAR scan-matching methods, on the other hand, reduce the amount of data by extracting local features (e.g., edges, corner points) from the point cloud, thus improving the search speed and matching efficiency. This method usually requires a more complex point cloud storage structure to effectively store and retrieve feature information. The advantage of feature matching is that it can handle partial overlaps and noisy scans more efficiently because only distinctive feature points are considered, which reduces the interference of irrelevant data. However, the feature extraction process itself may introduce additional computational overhead, especially when the number of feature points is large or the feature description is complex. In such cases, a point cloud data structure that supports efficient search becomes a primary concern.

To cope with these challenges, researchers have proposed various optimization strategies. For instance, improving the traditional tree-based spatial index makes it possible to achieve faster approximate nearest-neighbor search in large-scale point cloud data while reducing the memory footprint. Adopting a voxel-based nearest-neighbor structure, which is simpler than a tree, can also effectively reduce the time consumed by point cloud alignment without degrading its accuracy.

In practical applications, to further improve the efficiency and accuracy of point cloud alignment, we present the I-VoxICP framework that integrates the incremental sparse voxel index iVox with iterative closest point with singular value decomposition (ICP-SVD) [11]. After downsampling and voxel filtering the target point cloud, an iVox hash table is constructed; in each iteration, the table is queried for fast nearest-neighbor correspondences, while the rigid transformation is recovered in closed form via SVD, enabling the efficient and accurate registration of large-scale, density-heterogeneous point clouds. The contributions of this paper are as follows:

(1) I-VoxICP is proposed by integrating incremental sparse voxels (iVox) with singular value decomposition (SVD)-based pose estimation. Tailored for nearshore berthing scenarios, it enables robust and real-time registration of large-scale, non-uniform-density point clouds.
(2) Search efficiency is improved by almost an order of magnitude: iVox lowers average query time by 72.23% versus ikd-tree and 96.8% versus kd-tree, with a 99% reduction on some models.
(3) Under noisy and partially overlapping conditions, I-VoxICP achieves lower RMSE and RMD than conventional methods, maintaining high alignment accuracy on complex structures such as ship hulls and quays. It consequently furnishes the SLAM front end with superior initial poses and effectively curbs cumulative drift.
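
As a reading aid for contribution (2), the reported percentages can be converted into speed-up factors. This is only a worked conversion, assuming each percentage is a relative reduction $r$ of the mean query time with respect to the baseline structure, so that the speed-up is $s = 1/(1 - r)$:

$s = \frac{1}{1 - r}: \quad r = 0.7223 \Rightarrow s \approx 3.6, \quad r = 0.968 \Rightarrow s = 31.25, \quad r = 0.99 \Rightarrow s = 100$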

The rest of this article is organized as follows. In Section 2, related work and recent advances in LiDAR-based scan matching and point cloud data structures are covered. In Section 3, the proposed methodology is detailed, focusing on the integration of the iVox point cloud data structure with the ICP-SVD algorithm for refined local pose estimation. Section 4 presents the experimental results comparing I-VoxICP with traditional point cloud storage structures, along with a detailed analysis of the findings. Finally, the conclusion and future research directions are discussed in Section 5.

2. Related Work

In recent decades, significant advancements have been made in six-degree-of-freedom pose estimation based on 3D LiDAR scan matching, with accuracies reaching the centimeter level. To precisely align two temporally consecutive point clouds, researchers have adopted algorithms based on the iterative closest point (ICP) method [12] as the standard way to estimate the relative transformation between LiDAR scans. The ICP algorithm identifies the optimal transformation matrix by minimizing the distance between the two point clouds, thus facilitating their alignment.

2.1. Direct Scan Matching

The direct iterative closest point (ICP) scan-matching algorithm was first proposed by Besl et al. [12]. The ICP algorithm iteratively searches for nearest neighbors and establishes correspondences in a target point cloud, refining the optimal relative transformation through iterations. In the original formulation, point-to-point distances were utilized for nearest-neighbor associations. However, this approach can be overly dependent on viewpoints, potentially leading to suboptimal results when significant variations are present between the viewpoints captured by two scans. To address this challenge, the authors of [13] propose the use of point-to-plane distances for robust data association. The point-to-plane distance metric is more accommodating of scans that overlap partially, as it does not demand strict point correspondence. To unify these distance metrics within a probabilistic framework, the generalized ICP (G-ICP) approach has been proposed [14], which emphasizes the alignment of the LiDAR scan surface as a whole, rather than solely based on point-to-point correspondences. This strategy exhibits enhanced tolerance to erroneous correspondences and substantially mitigates the risk of partially overlapping LiDAR point clouds falling into local minima during the alignment process. Furthermore, the G-ICP framework addresses the alignment challenge posed by LiDAR scans by introducing a probabilistic model, thereby enhancing its robustness to measurement noise from manufacturing defects or extreme operating conditions.
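
The difference between the two distance metrics can be written compactly. The following is the standard textbook form rather than notation taken from [12] or [13]; the symbol $n_{i}$, the unit surface normal estimated at the matched point $q_{i}$, is an assumption introduced here for illustration:

$E_{\text{point}}(R, t) = \sum_{i = 1}^{n} ‖R p_{i} + t - q_{i}‖^{2}, \qquad E_{\text{plane}}(R, t) = \sum_{i = 1}^{n} \left( n_{i}^{T} (R p_{i} + t - q_{i}) \right)^{2}$

The point-to-plane residual penalizes only the offset along the normal, so a source point may slide along the local surface without cost, which is why it tolerates scans that sample the same surface at different locations.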

2.2. Integration of Point Cloud Structure and Registration

Shi hua Li et al. [15] proposed an advanced iterative closest point (ICP) algorithm that utilizes kd-trees, thereby enhancing the existing ICP algorithm. Diego Caro et al. [16] proposed a new data structure, the compressed kd-tree (ckd-tree), for efficiently storing and querying temporal maps with a space complexity that approaches the information-theoretic lower bound. However, its performance is contingent upon the data distribution. Bereczky et al. [17] proposed the Quad-kd-tree (QKd-tree), a generalization of the point quadtree and the kd-tree, which balances time and space costs by employing insertion heuristics. By setting thresholds for the Euclidean distance, the point-to-point distance difference, and the normal vector, the efficiency and accuracy of point cloud alignment are enhanced. Wei Wang et al. [18] proposed a spatio-temporal indexing structure based on a hybrid of an octree and a 3D R*-tree to optimize the performance of spatial indexing and nearest-neighbor search for large-scale point cloud data. Furthermore, Holanda et al. [19] proposed a workload-adaptive indexing technique for multidimensional query predicates, the Cracking kd-tree, which dynamically generates multidimensional indexes in response to data access patterns and does so three times faster than building the full kd-tree. However, the index construction overhead of the Cracking kd-tree is higher when there are many query dimensions.

To limit the impact of large-scale point clouds on search speed while retaining the kd-tree point cloud structure, Wei Guan et al. [20] proposed a point cloud alignment method based on an improved iterative closest point (ICP) algorithm, which resamples the point cloud data with voxel grids and uses the kd-tree to optimize the computation of normal vectors. In a similar vein, Cao et al. [21] proposed an algorithm for constructing a balanced k-dimensional (kd) tree based on pre-sorting results. This algorithm reduces the complexity of the construction process; however, it does not support parallel construction of kd-trees. Hou et al. [22] proposed an improved k-nearest-neighbor classification algorithm, KNN-kd-tree, based on kd-trees. This algorithm optimizes the search process through the kd-tree structure, significantly reduces time complexity, and improves search performance. Ji et al. [23] combined the improved KN-4PCS algorithm with two-way kd-tree optimization. Zhang et al. [24] reported improvements in the accuracy, efficiency, and robustness of point cloud alignment based on an improved ICP algorithm that uses surface curvature as a feature descriptor together with the kd-tree search mechanism, although the approach still demonstrated limited resilience. Despite the enhancement of alignment accuracy and speed, these algorithms still fall short of the stringent real-time requirements of certain applications.

In the context of mobile robot navigation or real-time 3D reconstruction, algorithms may require constant updating and adaptation. This can augment the computational burden and impact real-time performance.

In previous studies, the kd-tree structure [25], a prevalent method in robotics, was found to be “static,” relying on the exhaustive utilization of all points to construct the tree from the beginning. This approach stands in direct opposition to the prevailing reality in real-world robotics applications, where data collection is typically sequential in nature. Consequently, integrating a new frame of data into an existing data structure by rebuilding the entire tree from the beginning is often inefficient and time-consuming. To address this issue, Yixi Cai et al. [9,10] proposed a dynamic k-d tree structure, termed the ikd-tree, which constructs and incrementally updates the k-d tree with only new points while downsampling them to a desired resolution.

Building on the ikd-tree, Zhang et al. [26] proposed I-HS4PCS, a six-dimensional pose estimation method based on Harris3D feature extraction and ikd-tree optimization, which improves the efficiency of the algorithm while maintaining the estimation accuracy.

Chunge Bai et al. [27] sought to enhance tracking speed by using incremental voxels (iVox) as the point cloud spatial data structure. Its incremental updates proved more efficient than those of the ikd-tree, further reducing the time spent maintaining the local map and querying the nearest neighbors. This indicates that an efficient point cloud storage structure can improve the efficiency of point cloud alignment.

In the present study, we propose the I-VoxICP point cloud alignment algorithm to improve registration efficiency. It first applies voxel-grid downsampling to reduce data volume and create a more uniform spatial distribution, followed by statistical outlier removal to eliminate sparse isolated outliers. Subsequently, the ICP-SVD algorithm is employed to align the two point clouds to a suitable initial position. The algorithmic framework is delineated in Figure 1. ICP-SVD solves the rigid transformation parameters in closed form through singular value decomposition (SVD), which gives it higher numerical stability and robustness than the traditional ICP algorithm. The traditional ICP algorithm is susceptible to interference from initial matching errors and noise during the iterative process, which can result in convergence to a local optimum. In contrast, ICP-SVD quickly and accurately calculates the optimal rotation matrix and translation vector, thereby accelerating convergence and enhancing matching accuracy. In the fine local position estimation stage, I-VoxICP uses the incremental voxel index structure iVox [27] for nearest-neighbor search, which improves search efficiency and effectively reduces the time consumed by point cloud alignment.
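
The two preprocessing steps can be illustrated with a small sketch. It is not the authors' implementation: the function names, the `leaf`, `k`, and `std_ratio` parameters, and the outlier criterion (mean distance to the `k` nearest neighbors compared against the global mean plus a multiple of the standard deviation) are assumptions that follow a common formulation of these filters. The neighbor search is brute force, so the sketch is only suitable for small inputs.

```python
from collections import defaultdict
from math import dist, floor
from statistics import mean, pstdev

def voxel_downsample(points, leaf):
    """Replace all points in each occupied voxel of edge `leaf` by their centroid."""
    buckets = defaultdict(list)
    for p in points:
        buckets[tuple(floor(c / leaf) for c in p)].append(p)
    return [tuple(mean(axis) for axis in zip(*pts)) for pts in buckets.values()]

def remove_statistical_outliers(points, k=3, std_ratio=1.0):
    """Drop points whose mean distance to their k nearest neighbours is unusually large."""
    mean_knn = []
    for i, p in enumerate(points):
        # Brute-force distances to every other point (O(n^2), illustration only).
        d = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        mean_knn.append(mean(d[:k]))
    threshold = mean(mean_knn) + std_ratio * pstdev(mean_knn)
    return [p for p, m in zip(points, mean_knn) if m <= threshold]

# A dense 10 x 10 planar patch plus one isolated far-away point.
raw = [(x * 0.1, y * 0.1, 0.0) for x in range(10) for y in range(10)]
raw.append((50.0, 50.0, 50.0))

down = voxel_downsample(raw, leaf=0.25)
clean = remove_statistical_outliers(down, k=3, std_ratio=1.0)
print(len(raw), len(down), len(clean))
```

Downsampling first keeps the quadratic outlier pass cheap and evens out the point density before the neighbor statistics are computed.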

3. Point Cloud Indexing and Fine Local Position Estimation

The fundamental challenge of point cloud alignment is finding, for a given point, its nearest neighbor in a historical point cloud [28]. Several nearest-neighbor data structures are used for this purpose; they can be broadly categorized into two types: tree-like and voxel-like structures.

3.1. Kd-Tree: K-Dimensional Tree

The kd-tree (Figure 2) is a classical multidimensional data structure that constructs a binary tree by recursively dividing data points according to the median of a specific dimension. Each node of the tree represents a hyper-rectangular region containing a data point and a partition plane. It performs well in operations such as nearest-neighbor search in multidimensional spaces and can quickly find the nearest neighbors of a target point, thus accelerating the correspondence search in point cloud alignment algorithms such as ICP. However, practical applications often need to query and modify dynamic data, a task for which the traditional kd-tree is not well suited.
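
The median-split construction and the pruned nearest-neighbor query can be sketched in a few lines. This is a minimal static kd-tree for illustration, assuming points are tuples of equal length; the dictionary node layout and the function names `build_kdtree` and `nearest` are choices made here, not an API from the cited works.

```python
from math import dist

def build_kdtree(points, depth=0):
    """Recursively split on the median of one axis per tree level."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {
        "point": pts[mid],  # the node stores the median point
        "axis": axis,       # and the axis of its partition plane
        "left": build_kdtree(pts[:mid], depth + 1),
        "right": build_kdtree(pts[mid + 1:], depth + 1),
    }

def nearest(node, query, best=None):
    """Return the stored point closest to `query` (exact search with pruning)."""
    if node is None:
        return best
    if best is None or dist(query, node["point"]) < dist(query, best):
        best = node["point"]
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Cross the partition plane only if a closer point could lie beyond it.
    if abs(diff) < dist(query, best):
        best = nearest(far, query, best)
    return best

cloud = [(2.0, 3.0, 1.0), (5.0, 4.0, 0.0), (9.0, 6.0, 2.0),
         (4.0, 7.0, 3.0), (8.0, 1.0, 1.0), (7.0, 2.0, 4.0)]
tree = build_kdtree(cloud)
print(nearest(tree, (9.0, 2.0, 1.0)))
```

Because the tree is built from the complete point set in one pass, adding a new scan means rebuilding it, which is the static behavior the text refers to.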

3.2. Ikd-Tree: Incremental Kd-Tree

To overcome the static nature of the kd-tree, researchers proposed the ikd-tree (illustrated in Figure 3), an incremental kd-tree. It supports dynamic insertion, deletion, and updating operations while automatically rebalancing to ensure efficient query performance. It is particularly well suited to processing dynamically changing point cloud data, as evidenced by its applications in robot navigation and real-time map building, and it can efficiently process incremental data while maintaining fast nearest-neighbor search capability.

The traditional kd-tree structure supports exact K-nearest-neighbor and range queries and allows approximate nearest-neighbor (ANN) searches by specifying the maximum distance. However, it lacks native incremental capabilities. The ikd-tree supports incremental insertion but requires ongoing maintenance of the tree structure, which reduces efficiency. Furthermore, point cloud alignment does not require strictly exact K-nearest neighbors: it tolerates slightly more distant neighbors, as long as the computation is not excessively affected.

Conversely, the sparse-voxel-based nearest-neighbor structure is more suitable for alignment needs. It possesses a natural query range restriction, supports flexible addition and deletion operations, and has low construction and update overhead. Users can adjust the search granularity according to their needs, taking into account the efficiency and accuracy, and the structure is easy to parallelize and suitable for efficient implementation. Consequently, the voxel structure exhibits higher practical application value in environments with limited computing resources.

3.3. IVox: Incremental Sparse Voxels

The development of iVox was motivated by the necessity to address the challenges of managing and processing point cloud data. iVox employs an incremental sparse voxel structure, which facilitates the efficient organization and processing of such data. The underlying principle of iVox’s sparse voxel representation is to construct and maintain voxel structures exclusively in regions where points exist in space. This representation builds and maintains the voxel structure uniformly through a hash table (Figure 4), with the coordinates of the spatial points as keys and the indices generated by the hash function as values. When adding new data, the index is computed from the key and the corresponding value at that index is saved. When searching, the index is computed from the key and the value at that index is returned. This approach circumvents the need for voxelization of the entire space, a strategy that conserves memory and computational resources.
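
The sparse representation can be illustrated with an ordinary hash map. The sketch below is an assumption-laden simplification rather than the iVox implementation: it quantizes each point with the inverse resolution to obtain an integer voxel index, uses that index as the dictionary key (Python's built-in tuple hashing stands in for the hash function), and stores the raw points per voxel. The class and method names are hypothetical.

```python
from collections import defaultdict
from math import floor

class SparseVoxelMap:
    """Hash map from integer voxel index to the points inside that voxel."""

    def __init__(self, resolution=0.2):
        self.inv_resolution = 1.0 / resolution
        self.voxels = defaultdict(list)  # only occupied voxels get an entry

    def key(self, point):
        # Quantize metric coordinates to an integer grid index.
        return tuple(floor(c * self.inv_resolution) for c in point)

    def add_points(self, points):
        for p in points:
            self.voxels[self.key(p)].append(p)

    def points_in_voxel_of(self, point):
        # .get avoids creating an empty entry for an unoccupied voxel.
        return self.voxels.get(self.key(point), [])

vmap = SparseVoxelMap(resolution=0.2)
vmap.add_points([(0.05, 0.05, 0.05), (0.15, 0.10, 0.02), (1.01, 0.00, 0.00)])
print(len(vmap.voxels))
print(vmap.points_in_voxel_of((0.1, 0.1, 0.1)))
```

Memory grows with the number of occupied voxels rather than with the volume of the bounding box, and both insertion and lookup take expected constant time per point.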

iVox has two underlying structures: linear iVox and iVox-PHC. Linear iVox is suitable for matching a small number of points, while iVox-PHC utilizes pseudo-Hilbert curves (PHCs) to optimize the alignment efficiency for a large number of points. iVox-PHC reduces the complexity of the kNN search by dividing the voxels into smaller cubes and storing them using the PHC indexes. To address the challenges of pose estimation from point cloud data, this algorithm employs a pseudo-Hilbert curve (PHC) (illustrated in Figure 5) to preprocess the point cloud data, thereby optimizing the data structure and enhancing retrieval efficiency. In conjunction with this structure, the SVD-based rigid-body transformation estimation of the ICP-SVD algorithm is employed to further enhance the accuracy and computational efficiency of the pose estimation.

3.4. ICP-SVD Algorithm

In the domain of point cloud alignment, ICP-SVD is a methodology that utilizes the least-squares method and integrates the conventional ICP (iterative closest point) algorithm and the SVD (singular value decomposition) technique. This integration facilitates the iterative identification of the nearest corresponding points between two sets of point clouds. Furthermore, it enables the efficient computation of the optimal transformation matrix through SVD. This process is undertaken to minimize the distance error between the two sets of point clouds. The outcome of this process is rapid and precise alignment of point cloud data. The two fundamental steps in this process are re-finding the nearest corresponding point pairs and reconstructing the least-squares problem by performing SVD of the covariance matrix to obtain the new optimal rotation matrix and translation vector.

The following is the fundamental sequence of operations underlying the ICP-SVD algorithm, pertaining to the alignment of the reference point cloud and target point cloud:

(1) For each point in the target point cloud, its nearest neighbor is identified in the reference point cloud to form a set of point pairs. This step typically employs spatial indexing structures (e.g., ikd-tree, iVox) to expedite the search.
(2) A least-squares problem is then constructed over the identified point pairs to minimize the distance error between the two point clouds. Specifically, the rotation matrix $R$ and translation vector $t$ are solved such that the objective function is minimized:

$(R, t) = \arg \min_{R \in SO(d), t \in ℝ^{d}} \sum_{i = 1}^{n} ‖R p_{i} + t - q_{i}‖^{2}$

(1)

where $p_{i}$ and $q_{i}$ are the corresponding point pairs in the target and reference point clouds, respectively.
(3) The centers of mass $μ_{p}$ and $μ_{q}$ of the paired target and reference points are computed.
(4) The point sets are centered, giving $p'_{i} = p_{i} - μ_{p}$ and $q'_{i} = q_{i} - μ_{q}$; stacking these as rows yields the matrices $P'$ and $Q'$.
(5) The covariance matrix is constructed:

$H = {P'}^{T} Q' = \sum_{i = 1}^{n} p'_{i} {q'_{i}}^{T}$

(2)
(6) The singular value decomposition (SVD) of $H$ is computed:

$H = U Σ V^{T}$

(3)
(7) The rotation matrix is given by $R = V U^{T}$ and the translation vector by $t = μ_{q} - R μ_{p}$.
(8) The target point cloud is transformed using the derived rotation and translation.
(9) Check whether the error satisfies the convergence condition (e.g., the error is less than a certain threshold or the maximum number of iterations is reached) and stop the iteration if it does; otherwise, return to step 1 to continue the iteration.
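
The closed-form part of one iteration (centroids, covariance matrix, SVD, rotation, and translation) can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the function name `solve_rigid_svd` is hypothetical, the correspondences are taken as already known (row i of `P` pairs with row i of `Q`), and a determinant check is added so that a reflection is never returned.

```python
import numpy as np

def solve_rigid_svd(P, Q):
    """Closed-form (R, t) minimizing sum ||R p_i + t - q_i||^2 for paired rows of P and Q."""
    mu_p, mu_q = P.mean(axis=0), Q.mean(axis=0)
    H = (P - mu_p).T @ (Q - mu_q)      # covariance of the centered point sets
    U, _, Vt = np.linalg.svd(H)        # H = U S V^T
    V = Vt.T
    if np.linalg.det(V @ U.T) < 0:     # guard against an improper rotation (reflection)
        V[:, -1] *= -1
    R = V @ U.T
    t = mu_q - R @ mu_p
    return R, t

# Synthetic check: rotate a random cloud by 30 degrees about z and shift it.
rng = np.random.default_rng(0)
P = rng.normal(size=(50, 3))
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -1.0, 2.0])
Q = P @ R_true.T + t_true

R_est, t_est = solve_rigid_svd(P, Q)
print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))
```

With exact correspondences a single solve recovers the transformation; the iteration in ICP is needed only because the nearest-neighbor pairs change as the clouds move closer together.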

3.5. Improved ICP-SVD Algorithm Based on iVox

The traditional ICP-SVD algorithm has shown excellent accuracy and efficiency. However, it has some limitations when dealing with large-scale dynamic point cloud data. In order to address these issues, an improved ICP-SVD algorithm based on iVox is proposed (illustrated in Figure 6). iVox organizes point cloud data by sparse voxelization, building and maintaining voxel structures only where points exist in the space. iVox provides a choice of versions; of these, the PHC version has demonstrated both excellent efficiency and robustness in handling large-scale point cloud data. Consequently, in the implementation of our algorithm, we have selected the PHC version of iVox as the fundamental point cloud data structure.

In the PHC version, iVox divides each voxel into $(2^{K})^{3}$ spatial cells, and the value of K is related to the size of the voxel. The algorithm takes K = 6 for most datasets and K = 18 for some datasets, corresponding to the 6th-order PHCs and 18th-order PHCs, respectively. Each spatial cell retains the center of mass of all points falling within that spatial cell, with the center of mass coordinates being updated each time a new point is added. The number of neighboring spatial cells that must be searched before and after using the PHC is determined by the order of K.
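
The per-cell centroid does not require storing the points themselves, because the mean can be updated incrementally. The sketch below illustrates that update together with the cell count for K = 6; the class name `CellCentroid` and its fields are hypothetical and are not taken from the iVox source.

```python
class CellCentroid:
    """Running center of mass for one spatial cell."""

    def __init__(self):
        self.count = 0
        self.centroid = (0.0, 0.0, 0.0)

    def add(self, point):
        # Incremental mean: c_new = c_old + (p - c_old) / n
        self.count += 1
        self.centroid = tuple(
            c + (x - c) / self.count for c, x in zip(self.centroid, point)
        )

K = 6
cells_per_voxel = (2 ** K) ** 3  # number of spatial cells in one voxel for order K

cell = CellCentroid()
for p in [(1.0, 0.0, 0.0), (3.0, 2.0, 0.0), (2.0, 4.0, 3.0)]:
    cell.add(p)
print(cells_per_voxel, cell.count, cell.centroid)
```

Each insertion costs a constant number of operations and the memory per cell stays fixed regardless of how many points fall into it.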

The iVox data structure (Table 1) uses voxels as the basic unit to discretize continuous point cloud data into a fixed-size grid, controls the edge length of voxels by the parameter resolution, and utilizes the inverse resolution inv_resolution for quantization mapping. The default parameter sets the resolution to 0.2, but when combined with ICP-SVD, different resolutions are selected depending on the size of the dataset point cloud.

During the fine local position estimation process, the point cloud is organized into iVox for processing. ICP-SVD minimizes the error between the source and target point clouds by iteratively optimizing the transformation matrix. In each iteration, the source point cloud is first transformed using the current transformation matrix, and nearest-neighbor point pairs are then searched in the target point cloud. For each query point, the key of the voxel to which it belongs is first identified, and this key is then used to find the keys of all the neighboring voxels. The nearest-neighbor search function is invoked on each of these neighboring voxels in turn; the candidates found in them are gathered, and the k closest are retained. The algorithmic process is delineated in Algorithm 1.
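
The neighbor-voxel query described above can be sketched as follows. This is an illustration under assumptions, not the iVox API: the helper names are hypothetical, each voxel stores its points in a plain list, and the neighborhood is fixed to the query voxel plus its 26 surrounding voxels (the source only states that a nearby type is configurable).

```python
from collections import defaultdict
from itertools import product
from math import dist, floor

def voxel_key(point, resolution):
    """Integer grid index of the voxel containing `point`."""
    return tuple(floor(c / resolution) for c in point)

def build_voxel_map(points, resolution):
    voxels = defaultdict(list)
    for p in points:
        voxels[voxel_key(p, resolution)].append(p)
    return voxels

# Assumed neighborhood: the center voxel and its 26 surrounding voxels.
NEARBY_OFFSETS = list(product((-1, 0, 1), repeat=3))

def knn_search(voxels, query, resolution, k=1, max_dist=float("inf")):
    """Gather candidates from nearby voxels, then keep the k closest within max_dist."""
    cx, cy, cz = voxel_key(query, resolution)
    candidates = []
    for dx, dy, dz in NEARBY_OFFSETS:
        for p in voxels.get((cx + dx, cy + dy, cz + dz), []):
            d = dist(p, query)
            if d <= max_dist:
                candidates.append((d, p))
    candidates.sort()
    return [p for _, p in candidates[:k]]

target = [(0.1, 0.1, 0.1), (0.9, 0.2, 0.1), (1.2, 0.1, 0.0), (5.0, 5.0, 5.0)]
voxels = build_voxel_map(target, resolution=1.0)
print(knn_search(voxels, (1.0, 0.0, 0.0), resolution=1.0, k=2))
```

The search is approximate by construction: a true nearest neighbor lying outside the examined block of voxels is never returned, which is the built-in range restriction that keeps distant mismatches out of the correspondence set.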

The I-VoxICP algorithm efficiently organizes the target point cloud into a regular 3D voxel grid, leveraging spatial division to quickly locate the nearest neighbor for each point in the source cloud. This approach avoids the exhaustive point-by-point search of traditional ICP algorithms, significantly reducing computational complexity. Additionally, the iVox structure’s neighborhood filtering mechanism ensures accurate matching within a specified range, avoiding distant mismatches and improving alignment precision. Furthermore, the iVox structure stores local point distribution information within each voxel, enabling the algorithm to better understand the target point cloud’s local structure. This enhances feature alignment accuracy and robustness during the iterative process.

Algorithm 1: I-VoxICP
Input:
    Source point cloud: cloud_source
    Target point cloud: cloud_target
    iVox: resolution (resolution), nearby type (nearby_type)
    ICP: maximum number of iterations (max_iter), maximum matching distance (max_dist)
Output:
    Transformation matrix: T
    Registered point cloud: cloud_registered
    RMSE: rmse
Voxel filter cloud_target → cloud_target_filtered
iVox ← Initialize iVox (resolution, nearby_type)
iVox.AddPoints(cloud_target_filtered)
T ← Identity matrix
matches ← Empty list
for each point p in cloud_source_transformed do
    closest_points ← iVox.GetClosestPoint(p, max_dist)
    if closest_points is not empty then
        matches.add((p, closest_points[0]))
    end if
end for
U, Σ, V^T ← SVD(H)
R ← U * V^T
if det(R) < 0 then
    V[:,3] ← −V[:,3]
    R ← U * V^T
end if
t ← mu_t − R * mu_s
T ← compose(R, t) * T
End
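Algorithm 1 uses H, mu_s, and mu_t without defining them. One reading that is consistent with the update R ← U * V^T, given here as an assumption in the spirit of the least-squares solution of [11] rather than as the authors' exact formulation, takes (p_i, q_i) as the m matched source and target points and writes the alignment step as:

```latex
\mu_{s} = \frac{1}{m} \sum_{i = 1}^{m} p_{i}, \qquad \mu_{t} = \frac{1}{m} \sum_{i = 1}^{m} q_{i}

H = \sum_{i = 1}^{m} (q_{i} - \mu_{t}) (p_{i} - \mu_{s})^{T} = U \Sigma V^{T}

R = U \, \mathrm{diag}\left(1, 1, \det(U V^{T})\right) V^{T}, \qquad t = \mu_{t} - R \mu_{s}
```

The diagonal factor plays the same role as the sign flip in the listing: it changes R only when det(U V^T) < 0, that is, when the unconstrained solution would be a reflection. If H is instead accumulated as the sum of (p_i − mu_s)(q_i − mu_t)^T, the roles of U and V swap and the rotation becomes V U^T.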

4. Experimental Results

4.1. Real-Time Performance Experiment

In order to quantitatively assess the accuracy, robustness, and real-time performance of the proposed algorithm, we established a unified experimental platform for data recording and conducted multiple comparative experiments. Table 2 summarizes the hardware devices, programming languages, and operating environments used in the experiments, Table 3 consolidates the parameters of the datasets, and Table 4 presents key specifications such as the model of the LiDAR used for data recording. The experiments were divided into real-time performance testing and computational efficiency testing. Both testing environments operated on Ubuntu 20.04 LTS, with real-time performance tests conducted on an unmanned surface vehicle (USV) during actual vessel trials, while efficiency tests were performed on a desktop computer using both publicly available datasets and self-recorded data. The experimental conditions and information about the self-recorded dataset are listed in Table 5.

The real-time experimental testing was conducted at the Xiaoping Island Wharf in the Ganjingzi District of Dalian City, focusing on berthing perception experiments. The geographical coordinates of Xiaoping Island Wharf are 39°N latitude and 121°E longitude. The wharf features artificially constructed breakwaters measuring 317 m and 680 m in length, with a channel width of 220 m. A top view is illustrated in Figure 8.

The components used in the experiments (Intel Core i9-13900K, NVIDIA GeForce RTX 3090, Kingston FURY DDR4-3200, Canonical Ubuntu 20.04 LTS, ARM Cortex-A57, and NVIDIA Maxwell) were all sourced from Dalian Electronics Market, China.

The unmanned surface vehicle (USV) conducted experiments on-site and captured the vessel’s movement process (Figure 7). The USV performed docking scans at points A and B indicated in Figure 8. The weather and experimental conditions on the day of the berthing scan are shown in Figure 9.

In this experiment, the USV was equipped with an R-Fans-16 LiDAR as the core sensing device and an NVIDIA Jetson Nano onboard computer for data processing. To make the experiment more comprehensive and accurate, the USV was also fitted with auxiliary sensors, including a Global Navigation Satellite System (GNSS) receiver and an inertial measurement unit (IMU).

During the docking scan tests conducted at points A and B, key operational parameters of the sensors were strictly set and maintained. The scanning frequency of the R-Fans-16 LiDAR was adjusted between 5 and 20 Hz to adapt to different testing scenarios, while the IMU operated at a stable frequency of 100 Hz to ensure high-precision real-time motion state monitoring. In contrast, the GNSS, mainly used for positioning reference, operated at a frequency of 1 Hz.

The real-time capability of the proposed iVox-ICP-SVD algorithm is evaluated in Table 6. Mean per-frame computation time (Proc. Time) is 56 ms with a worst-case value of 65 ms (56 ± 9 ms). These measurements were collected on a Jetson Nano (four ARM Cortex-A57 cores at 1.43 GHz) while processing ≈ 18 k downsampled points per 3D scan; the iVox hash index supplies the initial guess and ICP-SVD refinement converges within 20 iterations.

The LiDAR used in the harbor trial delivers 10–20 frames per second (frame period 50–100 ms). The algorithm’s worst-case latency of 65 ms corresponds to a sustainable rate of about 15 Hz: every scan is processed before the next one arrives at sweep rates up to roughly 15 Hz, whereas at 20 Hz the 50 ms frame period is shorter than the worst-case latency.
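The timing budget can be checked with simple arithmetic. The following C++14 sketch, whose function names are hypothetical, converts a worst-case latency into the highest scan rate it can keep up with and tests a given scan rate against that budget. For the reported 65 ms worst case the limit is 1000/65 ≈ 15.4 Hz, in line with the ≤15 Hz figure quoted in the abstract.

```cpp
#include <cassert>

// Highest scan rate (Hz) at which one frame is finished before the next arrives,
// given the worst-case per-frame latency in milliseconds.
inline double MaxSustainableRateHz(double worst_case_latency_ms) {
    assert(worst_case_latency_ms > 0.0);
    return 1000.0 / worst_case_latency_ms;
}

// True if the worst-case latency fits within one frame period at scan_rate_hz.
inline bool MeetsDeadline(double worst_case_latency_ms, double scan_rate_hz) {
    assert(scan_rate_hz > 0.0);
    const double frame_period_ms = 1000.0 / scan_rate_hz;
    return worst_case_latency_ms <= frame_period_ms;
}
```

With these definitions, MeetsDeadline(65.0, 10.0) and MeetsDeadline(65.0, 15.0) hold, while MeetsDeadline(65.0, 20.0) does not, because the frame period at 20 Hz is only 50 ms.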

During the continuous 73 min experiment the CPU temperature stabilized at 68 °C without thermal throttling, confirming that the reported 56 ms average/65 ms worst-case latency is sustainable under field conditions rather than an ideal-laboratory optimum.

4.2. Offline Point Cloud Registration Experiment

The hardware environment consists of an Intel Core i9-13900K processor running at 5.4 GHz and 32 GB of RAM. To validate the proposed algorithm, point cloud models were selected from the publicly available Stanford point cloud dataset, the Pohang canal dataset [29], and the self-recorded dataset from Xiaoping Island, Dalian City (Figure 8). From the Pohang canal dataset, frames 792, 794, and 802 of the dock scene in sequence 03 were selected. Our self-recorded dataset was acquired with the RS-Ruby-80 LiDAR, which has 80 lines and produces a denser point cloud than the HDL-64E. The point cloud models are numbered to facilitate the subsequent presentation, and the iVox parameters chosen for the proposed method and the experimental dataset parameters are summarized in Table 7 and Table 8, respectively.

In this study, we meticulously configure the iVox parameters by taking into account the characteristics of the point cloud data, processing requirements, and computational resource constraints. The experimental results demonstrate that smaller resolution values (e.g., 0.2 and 0.5) are effective in retaining details, making them suitable for complex models requiring high accuracy; in contrast, larger resolutions (e.g., 5) enhance processing efficiency for large-scale data. The neighborhood type is uniformly set to 6, meeting common search range requirements while avoiding the computational overhead and noise interference associated with settings of 18 or 26. The capacity is set to 300,000 to pre-allocate approximately 11% of the largest model size, ensuring an average occupancy of ≤10 points per voxel and eliminating runtime re-hashing. This single-shot allocation keeps memory consumption around 10 MB (≈0.03% of system RAM) and guarantees temporal determinism for real-time marine berthing applications.

Table 8 summarizes the ICP parameter configuration for various datasets. The parameter Icp_n is set to 50 to balance convergence speed and alignment accuracy, while Max_cor_dis is set to 5 to filter out spurious point pairs, ensuring accurate matching. The setLeafSize parameter is not applied to models 1, 2, and 3 due to their low point counts, in order to avoid reducing available correspondences. To ensure a fair comparison of search speeds, the configuration maintains a high point cloud density while effectively reducing data size. The subsequent section details the ICP parameter configurations across datasets, with the test data categorized into small-scale models (approximately 50,000 points) and large-scale models (over 50,000 points) after downsampling, as shown in Table 9 and Table 10.

With the chosen downsampling parameter, a substantial number of points remain after downsampling. This is consistent with our objective of preserving as much detail of the original point clouds as possible. Without downsampling, the kd-tree search is excessively time-consuming, which impedes an effective comparison of search times across the point cloud data structures. Our strategy keeps the number of points after downsampling close to the data size encountered in actual processing and therefore gives a more representative picture of the algorithm’s performance in real-world applications.
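For readers unfamiliar with leaf-size downsampling, the idea can be sketched as follows: every point is assigned to a cubic leaf of edge length leaf_size, and all points in one leaf are replaced by their centroid. The C++14 code below is a stand-alone illustration with hypothetical names (Pt, VoxelDownsample); it is not the PCL filter used in the experiments and makes no claim about how that filter is implemented internally.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

struct Pt { double x, y, z; };

// Replace all points that fall into the same cubic leaf by their centroid.
// A larger leaf_size merges more points and therefore yields a smaller cloud.
inline std::vector<Pt> VoxelDownsample(const std::vector<Pt>& cloud, double leaf_size) {
    struct Acc { double sx = 0.0, sy = 0.0, sz = 0.0; std::size_t n = 0; };
    std::map<std::array<std::int64_t, 3>, Acc> leaves;  // ordered by leaf index
    const double inv = 1.0 / leaf_size;
    for (const Pt& p : cloud) {
        const std::array<std::int64_t, 3> key{{
            static_cast<std::int64_t>(std::floor(p.x * inv)),
            static_cast<std::int64_t>(std::floor(p.y * inv)),
            static_cast<std::int64_t>(std::floor(p.z * inv))}};
        Acc& a = leaves[key];  // creates a zeroed accumulator on first use
        a.sx += p.x;
        a.sy += p.y;
        a.sz += p.z;
        ++a.n;
    }
    std::vector<Pt> out;
    out.reserve(leaves.size());
    for (const auto& kv : leaves) {
        const Acc& a = kv.second;
        const double n = static_cast<double>(a.n);
        out.push_back(Pt{a.sx / n, a.sy / n, a.sz / n});
    }
    return out;
}
```

Skipping this step for the smallest models, as described for models 1, 2, and 3, leaves every original point available as a potential correspondence.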

4.3. Fine Local Pose Estimation Experiment

In the fine local pose estimation experiments, we select three representative point cloud data structures (kd-tree, ikd-tree, and iVox) and integrate each of them with the classical ICP-SVD algorithm. As illustrated in Figure 10, the comparative analysis focuses on the nearest-neighbor search performance of these structures during point cloud alignment. In the figure, the source point cloud is depicted in gray, the target point cloud in red, and the aligned result in blue. Subfigure (a) presents the alignment based on nearest-neighbor search using the traditional kd-tree, (b) shows the results obtained with ikd-tree-based matching, and (c) illustrates the matching performance using the iVox structure.

The evaluation of point cloud alignment accuracy is based on two metrics: the root mean square error (RMSE) and the relative mean distance (RMD). RMSE serves as an indicator of pose estimation accuracy; it is the square root of the mean of the squared differences between the ground truth and the estimated values in a rigid transformation. It is mathematically defined as follows:

\mathrm{RMSE} = \sqrt{\frac{1}{m} \sum_{i = 1}^{m} {(y_{i} - {\hat{y}}_{i})}^{2}}

(4)

In this context, $m$ signifies the number of corresponding point pairs, $y_{i}$ denotes the true Euclidean distance between corresponding point pairs, and ${\hat{y}}_{i}$ denotes the Euclidean distance between corresponding point pairs after the estimation of the finely localized pose. Ideally, the distance between corresponding points after alignment should be 0, which means that a lower RMSE value indicates a higher alignment accuracy. RMD indicates the geometrical relationship between the estimated and the true poses and is a measure of the average distance from each point in the aligned point cloud to its nearest neighbor in the target point cloud. The RMD can be expressed as follows:

\mathrm{RMD} = \frac{1}{N} \sum_{i = 1}^{N} \mathrm{dist}(P_{reg, i}, P_{target, i})

(5)

Here, $N$ is the number of points in the post-alignment point cloud $P_{reg}$; $P_{reg, i}$ is the i-th point in the post-alignment point cloud; $P_{target, i}$ is the nearest point to $P_{reg, i}$ in the target point cloud; and $\mathrm{dist}(P_{reg, i}, P_{target, i})$ is the Euclidean distance between the point $P_{reg, i}$ and its nearest neighbor $P_{target, i}$.

Of the two evaluation metrics, RMSE is more sensitive to large errors, because the squaring operation amplifies their contribution. RMD, in contrast, reflects the mean error and is better suited to assessing the overall alignment quality.
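The different sensitivities of the two metrics can be seen in a small numerical sketch. The C++14 functions below follow Equations (4) and (5) under one simplifying assumption: the inputs are already the per-pair errors (for RMSE) and the nearest-neighbor distances (for RMD). The function names and the sample values in the note that follows are hypothetical and are not measurements from the experiments.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// RMSE over per-pair errors e_i = y_i - y_hat_i, as in Equation (4).
inline double Rmse(const std::vector<double>& errors) {
    if (errors.empty()) return 0.0;
    double sum_sq = 0.0;
    for (double e : errors) sum_sq += e * e;
    return std::sqrt(sum_sq / static_cast<double>(errors.size()));
}

// RMD: arithmetic mean of the nearest-neighbor distances, as in Equation (5).
inline double Rmd(const std::vector<double>& distances) {
    if (distances.empty()) return 0.0;
    double sum = 0.0;
    for (double d : distances) sum += d;
    return sum / static_cast<double>(distances.size());
}
```

For four residuals of 0.1 each, both functions return 0.1. For the residuals 0, 0, 0, and 0.4 the mean is still 0.1, but the RMSE rises to 0.2, which shows how a single large error dominates RMSE while leaving RMD unchanged.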

To comprehensively evaluate the scanning alignment performance of the three point cloud data structures in different scenarios, the point cloud search time is selected as the core metric in this paper. The test data encompass the Stanford point cloud model, the outdoor self-recorded dataset, and the Pohang (03) dataset.

The Stanford point cloud model is characterized by its density and absence of noise, encompassing a diverse range of 3D models from small-scale (e.g., Bunny) to large-scale (e.g., Statuette) with discernible scale-dependent variations. This feature facilitates the analysis of the relationship between search time and data size.

In contrast, the outdoor self-recorded dataset and the Pohang (03) dataset exhibit higher sparsity and are more akin to the actual maritime environment. The former simulates point cloud alignment scenarios of neighboring ships at sea, while the latter provides point clouds underneath piers to test adaptability in typical maritime environments. The incorporation of these diverse datasets augments the practical significance and representativeness of the evaluation.

Figure 11 shows the point cloud search performance of ICP-SVD combined with three point cloud data structures under 14 different models. It is worth noting that iVox has the shortest point search time.

In this study, we systematically compared the performance of iVox, ikd-tree, and traditional kd-tree in the search for corresponding points. The experimental results demonstrate that iVox offers a substantial performance improvement over the other two structures for both small- and large-scale point cloud data. Specifically, in the point cloud models numbered 1, 2, and 6, the search time for iVox was reduced by 98.47%, 98.86%, and 99.07%, respectively, compared to kd-tree, and by 85.46%, 84.47%, and 90.14%, respectively, compared to ikd-tree.

In the small-scale point cloud experiment, the mean search time of iVox was 0.1833 s, which is 73.49% and 97.51% lower than the mean search times of ikd-tree (0.6913 s) and kd-tree (7.3687 s), respectively. Similarly, in the large-scale experiment, the mean search time of iVox was 1.7022 s, which is 71.06% and 96.10% lower than those of ikd-tree (5.8808 s) and kd-tree (43.662 s), respectively. Overall, iVox reduces the average search time by 72.23% compared to ikd-tree and by 96.8% compared to kd-tree.
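The percentages above are relative reductions in mean search time, computed as 100 × (1 − t_iVox / t_baseline). A short C++14 sketch, with hypothetical function names, makes the arithmetic explicit and also gives the equivalent speedup factor.

```cpp
// Percentage by which candidate_time is shorter than baseline_time.
inline double ReductionPercent(double candidate_time, double baseline_time) {
    return 100.0 * (1.0 - candidate_time / baseline_time);
}

// How many times faster the candidate is than the baseline.
inline double SpeedupFactor(double candidate_time, double baseline_time) {
    return baseline_time / candidate_time;
}
```

Applied to the reported small-scale means, ReductionPercent(0.1833, 0.6913) gives about 73.5% and ReductionPercent(0.1833, 7.3687) about 97.5%, which correspond to speedup factors of roughly 3.8 and 40, respectively.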

Notably, in point cloud model 8, the kd-tree exhibits reduced search time despite the increased data volume. This phenomenon is attributed to the highly homogeneous spatial distribution of the point cloud, characterized by a mean density of 1 point/area and a standard deviation of 0.07, which aligns well with the structural assumptions of the kd-tree.

In contrast, model 10 presents a significantly higher and more heterogeneous density distribution (mean density of 230.53 points/area and standard deviation of 43.02), leading to a marked decline in kd-tree performance. This contrast highlights the robustness of iVox in handling complex and uneven point cloud distributions, thereby demonstrating its superior adaptability and efficiency across diverse scenarios.

In summary, iVox exhibits superior performance in terms of search time when compared to both ikd-tree and traditional kd-tree. This outcome is not merely coincidental but rather a consequence of the distinctive strengths of iVox in algorithm design and data structures. The efficacy of iVox is evident in its ability to markedly reduce the computational demands and temporal requirements of the search process by optimizing data organization and search strategy. This substantial enhancement in point cloud search efficiency leads to considerable acceleration in fine localized position estimation.

4.4. Registration Result Benchmarking

To compare the performance of the proposed I-VoxICP algorithm, a comprehensive quantitative evaluation was performed on the point cloud model.

As shown in Table 11, for most of the models the RMD and RMSE of I-VoxICP are lower than those of the Ikd-ICP and Kd-ICP algorithms, indicating higher pose estimation accuracy. Notably, for large-scale point clouds, I-VoxICP achieves a substantial reduction in error. The point cloud alignment results in real-world scenarios, such as outdoor docks and berthing, are shown in Figure 12.

As illustrated in point cloud alignment result number 4 (Figure 13a), a distinct hull structure is discernible. The aligned blue point cloud exhibits a substantial overlap with the target red point cloud at the edge of the hull. Point cloud alignment result number 12 (Figure 13b) reveals a complex building structure at the pier and demonstrates consistent point cloud alignment.

The RMD and RMSE of I-VoxICP are superior to those of Kd-ICP and Ikd-ICP in point cloud models 3, 4, and 5 with small-scale points. Furthermore, the RMD and RMSE of the three algorithms are nearly equivalent in point cloud models 1, 2, 6, and 7 with uniform point density.

This suggests that, for simple point cloud models with uniform point density, the choice among the three point cloud data structures does not significantly affect the pose accuracy of ICP-SVD; the resulting alignment accuracy is comparable. For large-scale point clouds, the large number of points and the relatively uneven point density create discernible differences. ICP-SVD augmented with iVox achieves high pose estimation accuracy while also shortening the point cloud search time.

The experimental results suggest that I-VoxICP can expedite point cloud searches while sustaining optimal position estimation accuracy. Notably, in complex point cloud data analysis, I-VoxICP exhibits a remarkable capacity to attenuate errors, thereby enhancing alignment precision.

5. Discussion

The core design of the I-VoxICP algorithm integrates a sparse voxel index (iVox) with pose refinement based on singular value decomposition (SVD), forming the iVox-ICP-SVD framework. This approach addresses the computational bottleneck encountered by traditional iterative closest point (ICP) algorithms when processing large-scale, density-heterogeneous marine point clouds. The integrated method demonstrates significant advancements in efficiency, accuracy, and robustness. A comparison with I-HS4PCS [26] further supports the performance advantages of this approach, as illustrated in the runtime comparison between iVox-ICP-SVD and I-HS4PCS presented in Figure 14.

Figure 14 shows the single-threaded execution times of iVox-ICP-SVD and I-HS4PCS across six typical marine point cloud models, VI to XI. iVox-ICP-SVD has the lower processing time in all test scenarios, with an average runtime of 1.30 s, approximately 40% shorter than the 2.17 s of I-HS4PCS. As the model size increases, the execution time of I-HS4PCS grows almost linearly (slope: 0.34 s/model), whereas that of iVox-ICP-SVD fluctuates only within 0.8–1.5 s, with a smaller slope of 0.12 s/model. This indicates that the proposed method is less sensitive to increases in data volume.

Notably, in Model X, which has the highest point cloud density and structural complexity, the execution time of iVox-ICP-SVD is 1.47 s, only 49% of that required by I-HS4PCS (3.00 s). This finding supports the efficiency and stability of sparse voxel indexing combined with SVD optimization under demanding conditions.

Although the hash-based indexing and pre-voxelization strategy of iVox demonstrably boosts both efficiency and robustness in point cloud registration, these advantages—validated across multiple scenarios—must be weighed against its inherent limitations.

On the one hand, by replacing conventional tree-type kNN queries with constant-time hash lookups, I-VoxICP raises the registration rate from 2–3 Hz to over 10 Hz on a desktop CPU while cutting RMSE by up to 94%. In challenging maritime settings such as the Pohang canal, it maintains centimeter-level accuracy, whereas kd-tree-ICP diverges in 12% of runs. The sparse voxel index, coupled with SVD regularization, enlarges the convergence basin, and the hash grid retains O(1) average query cost for scenes with 35 k to 3.6 M points. Memory consumption scales with the number of non-empty voxels, eliminating the need for per-dataset parameter retuning and easing deployment on vessels that encounter widely varying scene sizes during a single mission.

This disparity stems from a fundamental difference in matching paradigms. I-HS4PCS first extracts a 4PCS key tetrad set for coarse registration, then refines the alignment via ICP iterations. Guo et al. [30] adopt a similar two-stage strategy: PCFC-based filtering/compression and an improved FPFH descriptor feed SAC-IA for coarse registration, after which Small_GICP performs the final refinement. In contrast, I-VoxICP establishes point-to-point correspondences in a single shot by exploiting sparse voxel hashing, collapsing the coarse-to-fine pipeline into one step and eliminating redundant candidate evaluations and secondary iterative computations.

These gains come with structural drawbacks. First, although a hash lookup is constant-time, the iVox construction still triggers many redundant searches: neighboring blocks overlap spatially, so the same correspondence may be retrieved repeatedly, wasting computation time and shrinking the feasible optimization space. Second, the method remains grounded in the rigid-alignment assumption; its objectives and regularizers only model rigid transformations, so convergence is easily lost in dynamic, flexible scenes (e.g., hawser motion or hull loading). Finally, the datasets were collected under clear skies; rain, snow, or fog will increase outliers from spray, refraction, and absorption, likely overwhelming the pre-voxelization outlier cap and eroding robustness margins.

In summary, I-VoxICP delivers a step change in speed and accuracy over traditional ICP variants via hash indexing and voxel-wise pre-filtering and is especially attractive for resource-constrained maritime systems whose scene sizes vary drastically. However, the limitations of redundant computation, rigid-body assumption, and weather sensitivity indicate that future work must pursue neighborhood reuse mechanisms, elastic deformation extensions, and adverse-weather degradation models before long-term, stable, fully autonomous point cloud registration can be guaranteed in real-world ocean environments.

6. Conclusions

In this study, we propose an ICP-SVD point cloud alignment method based on the point cloud storage structure of iVox. The method has demonstrated particular effectiveness in scenarios involving large-scale point clouds and instances where point cloud density is sparse. Experimental comparisons between the proposed I-VoxICP method and traditional kd-tree and ikd-tree-based methods reveal that iVox significantly reduces search time while maintaining or improving alignment accuracy. For instance, in small-scale tests, iVox achieved up to 97.51% faster search times, and in large-scale scenarios, up to 96.10% faster. Furthermore, even in sparse point cloud environments, iVox maintained a comparable or superior level of alignment accuracy.

In summary, the I-VoxICP point cloud alignment converges in large-scale point clouds and in complex environments, such as those with multiple ships traveling nearby during USV berthing. Notwithstanding its limitations, this method paves the way for further research on the integration of point cloud storage structures and point cloud alignment algorithms. Furthermore, it lays a foundation for the future development of LiDAR SLAM front-end odometry.

Author Contributions

Writing—original draft preparation, Q.J.; writing—review and editing, Y.Y. and Q.J.; methodology, Q.J. and Y.Y.; software, Q.J. and M.B.; formal analysis, Q.J.; validation, Q.J.; investigation, M.B. and Q.J.; resources, Y.Y. and Q.J.; data curation, M.B., D.G. and Q.J.; visualization, D.G.; funding acquisition, Y.Y. and Q.J.; supervision, Y.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Natural Science Foundation of Liaoning Province of China under Grant 2024-BS-013/84240088, in part by the Dalian City Science and Technology Plan (Key) Project (2024JB11PT007), in part by the Fundamental Research Funds for the Central Universities (3132025138), and in part by the Maritime Safety Administration of Liaoning Province Fund (80825013).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Zhou, B.; He, Y.; Qian, K.; Ma, X.; Li, X. S4-SLAM: A Real-Time 3D LIDAR SLAM System for Ground/Watersurface Multi-Scene Outdoor Applications. Auton. Robot. 2021, 45, 77–98.
2. Hu, Z.; Zhang, M.; Meng, J.; Xiao, H.; Shi, X.; Long, Z. Semantic Map-Based Localization of USV Using LiDAR in Berthing and Departing Scene. In Proceedings of the 2023 7th International Conference on Transportation Information and Safety (ICTIS), Xi’an, China, 4–6 August 2023; pp. 583–589.
3. Sawada, R.; Hirata, K. Mapping and Localization for Autonomous Ship Using LiDAR SLAM on the Sea. J. Mar. Sci. Technol. 2023, 28, 410–421.
4. Ponzini, F.; Zaccone, R.; Martelli, M. LiDAR Target Detection and Classification for Ship Situational Awareness: A Hybrid Learning Approach. Appl. Ocean. Res. 2025, 158, 104552.
5. Eustache, Y.; Seguin, C.; Pecout, A.; Foucher, A.; Laurent, J.; Heller, D. Marine Object Detection Using LiDAR on an Unmanned Surface Vehicle. IEEE Access 2025, 13, 121658–121669.
6. Wan, C.; Lv, X.; Mao, Z.; Wang, Z.; Li, Y.; Ni, C. Online Obstacle Detection for USV Based on Improved RANSAC Algorithm. In Proceedings of the 2023 6th International Symposium on Autonomous Systems (ISAS), Nanjing, China, 23–25 June 2023; pp. 1–6.
7. Smith, R.; Self, M.; Cheeseman, P. Estimating Uncertain Spatial Relationships in Robotics. In Proceedings of the 1987 IEEE International Conference on Robotics and Automation, Raleigh, NC, USA, 31 March–3 April 1987; Volume 4, p. 850.
8. Bentley, J.L. Multidimensional Binary Search Trees Used for Associative Searching. Commun. ACM 1975, 18, 509–517.
9. Cai, Y.; Xu, W.; Zhang, F. Ikd-Tree: An Incremental K-D Tree for Robotic Applications. arXiv 2021, arXiv:2102.10808.
10. Xu, W.; Cai, Y.; He, D.; Lin, J.; Zhang, F. FAST-LIO2: Fast Direct LiDAR-Inertial Odometry. IEEE Trans. Robot. 2022, 38, 2053–2073.
11. Arun, K.S.; Huang, T.S.; Blostein, S.D. Least-Squares Fitting of Two 3-D Point Sets. IEEE Trans. Pattern Anal. Mach. Intell. 1987, PAMI-9, 698–700.
12. Besl, P.J.; McKay, N.D. A Method for Registration of 3-D Shapes. IEEE Trans. Pattern Anal. Mach. Intell. 1992, 14, 239–256.
13. Chen, Y.; Medioni, G. Object Modeling by Registration of Multiple Range Images. In Proceedings of the 1991 IEEE International Conference on Robotics and Automation, Sacramento, CA, USA, 9–11 April 1991; Volume 3, pp. 2724–2729.
14. Wang, X.; Li, Y.; Peng, Y.; Ying, S. A Coarse-to-Fine Generalized-ICP Algorithm With Trimmed Strategy. IEEE Access 2020, 8, 40692–40703.
15. Li, S.; Wang, J.; Liang, Z.; Su, L. Tree Point Clouds Registration Using an Improved ICP Algorithm Based on Kd-Tree. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 10–15 July 2016; pp. 4545–4548.
16. Caro, D.; Rodriguez, A.; Brisaboa, N.R.; Fariña, A. Compressed k D-Tree for Temporal Graphs. Knowl. Inf. Syst. 2016, 49, 553–595.
17. Bereczky, N.; Duch, A.; Nemeth, K.; Roura, S. Quad-Kd Trees: A General Framework for Kd Trees and Quad Trees. Theor. Comput. Sci. 2016, 616, 126–140.
18. Wang, W.; Zhang, Y.; Ge, G.; Jiang, Q.; Wang, Y.; Hu, L. A Hybrid Spatial Indexing Structure of Massive Point Cloud Based on Octree and 3D R*-Tree. Appl. Sci. 2021, 11, 9581.
19. Holanda, P.; Nerone, M.; Almeida, E.C.d.; Manegold, S. Cracking KD-Tree: The First Multidimensional Adaptive Indexing (Position Paper). In Proceedings of the 7th International Conference on Data Science, Technology and Applications, Porto, Portugal, 26 July 2018; SCITEPRESS-Science and Technology Publications, Lda.: Setubal, Portugal, 2018; pp. 393–399.
20. Guan, W.; Li, W.; Ren, Y. Point Cloud Registration Based on Improved ICP Algorithm. In Proceedings of the 2018 Chinese Control And Decision Conference (CCDC), Shenyang, China, 9–11 June 2018; pp. 1461–1465.
21. Cao, Y.; Wang, H.; Zhao, W.; Duan, B.; Zhang, X.; Cheng, S. A New Method to Construct the KD Tree Based on Presorted Results. Complexity 2020, 2020, 8883945.
22. Hou, W.; Li, D.; Xu, C.; Zhang, H.; Li, T. An Advanced k Nearest Neighbor Classification Algorithm Based on KD-Tree. In Proceedings of the 2018 IEEE International Conference of Safety Produce Informatization (IICSPI), Chongqing, China, 10–12 December 2018; pp. 902–905.
23. Ji, S.; Ren, Y.; Ji, Z.; Liu, X.; Hong, G. An Improved Method for Registration of Point Cloud. Optik 2017, 140, 451–458.
24. Zhang, Y.; Jia, T.; Chen, Y.; Tan, Z. A 3D Point Cloud Reconstruction Method. In Proceedings of the 2019 IEEE 9th Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER), Suzhou, China, 29 July–2 August 2019; pp. 1310–1315.
25. Rusu, R.B.; Cousins, S. 3D Is Here: Point Cloud Library (PCL). In Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 1–4.
26. Zhang, Z.; Li, H. I-HS4PCS: Object 6D Pose Estimation Method Based on Harris3D-ikdTree Optimization. IEEE Access 2024, 12, 138018–138026.
27. Bai, C.; Xiao, T.; Chen, Y.; Wang, H.; Zhang, F.; Gao, X. Faster-LIO: Lightweight Tightly Coupled Lidar-Inertial Odometry Using Parallel Sparse Incremental Voxels. IEEE Robot. Autom. Lett. 2022, 7, 4861–4868.
28. Wang, H.; Yin, Y.; Jing, Q. Comparative Analysis of 3D LiDAR Scan-Matching Methods for State Estimation of Autonomous Surface Vessel. J. Mar. Sci. Eng. 2023, 11, 840.
29. Chung, D.; Kim, J.; Lee, C.; Kim, J. Pohang Canal Dataset: A Multimodal Maritime Dataset for Autonomous Navigation in Restricted Waters. Int. J. Robot. Res. 2023, 42, 1104–1114.
30. Guo, D.; Jing, Q.; Yin, Y.; Xu, H. An Efficient Laser Point Cloud Registration Method for Autonomous Surface Vehicle. J. Mar. Sci. Eng. 2025, 13, 1720.

Figure 1. iVox point cloud search and local pose estimation framework.

Figure 2. Illustration of a kd-tree structure for efficient querying. It is evident that traditional kd-tree structures exhibit suboptimal performance in dynamic data scenarios, characterized by frequent insertions and deletions.

Figure 3. The ikd-tree structure, designed to facilitate dynamic self-balancing and rebalancing processes.

Figure 4. Hash key–value mapping diagram.

Figure 5. A Hilbert curve of order N divides the space into $2^{N} \times 2^{N} \times 2^{N}$ cells. The centers of all cells are connected according to a certain alignment rule.

Figure 6. Voxel-based ICP-SVD point cloud registration algorithm.

Figure 7. Experimental trials of the USV.

Figure 8. Xiaopingtou Wharf top-down view. Point A is the left dock, and Point B is the right dock.

Figure 9. The weather on the day of the berthing experiments.

Figure 10. Comparison of point cloud registration pipelines using (a) kd-tree, (b) ikd-tree, and (c) iVox data structures for nearest-neighbor search. The source cloud is gray, the target is red, and the aligned result is blue.

Figure 11. Search time analysis of three data structures under small point cloud model (a) and large point cloud model (b).

Figure 12. Point cloud registration result of model 4 (a); point cloud registration result of model 12 (b).

Figure 13. Photograph of model 4 hull (a); photograph of model 12 dock (b).

Figure 14. Comparison of runtimes between the iVox-ICP-SVD algorithm and the I-HS4PCS algorithm.

Table 1. Data structure of iVox.

No.	iVox Structure	Description
1	HashTable voxel_map;	Hash table storing voxel nodes, using 3D coordinates as keys.
2	List<VoxelNode> voxel_cache;	List for caching voxel nodes, possibly for quick access or lifecycle management of voxels.
3	float resolution;	The resolution of the voxel map, defining the size of each voxel.
4	float inv_resolution;	The inverse of the resolution, used for quick voxel index calculation.
5	std::size_t capacity;	The maximum capacity of the voxel map, i.e., the maximum number of voxels it can store.
6	NearbyType nearby_type;	Defines the type of voxel range considered when searching for nearest neighbors.
7	bool GetClosestPoint;	Function to find the nearest point to a given point, returning whether the operation was successful.
8	void AddPoints;	Member function to add point cloud data to the voxel map.
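The members listed in Table 1 can be illustrated with a minimal C++14 sketch. It is not the authors' implementation: the class name `IVoxSketch`, the hash function in `VoxelKeyHash`, and the reading of nearby type 6 as "center voxel plus its six face neighbors" are assumptions made for illustration, and the capacity limit and `voxel_cache` bookkeeping are omitted.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <unordered_map>
#include <vector>

struct Point3 { float x, y, z; };

// Integer voxel index used as the hash-table key.
struct VoxelKey {
    int x, y, z;
    bool operator==(const VoxelKey& o) const { return x == o.x && y == o.y && z == o.z; }
};

// Hypothetical spatial hash (prime multiply + XOR); the paper's hash may differ.
struct VoxelKeyHash {
    std::size_t operator()(const VoxelKey& k) const {
        return (static_cast<std::size_t>(k.x) * 73856093u) ^
               (static_cast<std::size_t>(k.y) * 19349663u) ^
               (static_cast<std::size_t>(k.z) * 83492791u);
    }
};

class IVoxSketch {
public:
    explicit IVoxSketch(float resolution)
        : resolution_(resolution), inv_resolution_(1.0f / resolution) {
        assert(resolution > 0.0f);
    }

    // Insert each point into the voxel that contains it.
    void AddPoints(const std::vector<Point3>& pts) {
        for (const Point3& p : pts) voxel_map_[KeyOf(p)].push_back(p);
    }

    // Search the query's voxel and its six face neighbors (assumed meaning of
    // nearby type 6). Returns false when all seven voxels are empty.
    bool GetClosestPoint(const Point3& q, Point3& closest) const {
        static const std::array<VoxelKey, 7> kOffsets = {{
            {0, 0, 0}, {1, 0, 0}, {-1, 0, 0}, {0, 1, 0},
            {0, -1, 0}, {0, 0, 1}, {0, 0, -1}}};
        const VoxelKey c = KeyOf(q);
        float best = std::numeric_limits<float>::max();
        bool found = false;
        for (const VoxelKey& d : kOffsets) {
            auto it = voxel_map_.find(VoxelKey{c.x + d.x, c.y + d.y, c.z + d.z});
            if (it == voxel_map_.end()) continue;
            for (const Point3& p : it->second) {
                const float dx = p.x - q.x, dy = p.y - q.y, dz = p.z - q.z;
                const float d2 = dx * dx + dy * dy + dz * dz;
                if (d2 < best) { best = d2; closest = p; found = true; }
            }
        }
        return found;
    }

private:
    // Multiply by the inverse resolution, then floor to get the voxel index.
    VoxelKey KeyOf(const Point3& p) const {
        return VoxelKey{static_cast<int>(std::floor(p.x * inv_resolution_)),
                        static_cast<int>(std::floor(p.y * inv_resolution_)),
                        static_cast<int>(std::floor(p.z * inv_resolution_))};
    }

    float resolution_;
    float inv_resolution_;
    std::unordered_map<VoxelKey, std::vector<Point3>, VoxelKeyHash> voxel_map_;
};
```

Because a query touches only a fixed number of voxels, its cost depends on the local point density rather than on the total map size. The trade-off is that a neighbor farther away than the searched voxels is not found.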

Table 2. Experiment platform configuration overview.

Platform	Component	Model/Version	Key Parameters
Desktop	OS	Ubuntu 20.04 LTS	64-bit, kernel 5.4
	CPU	Intel i9-13900K	5.4 GHz
	GPU	NVIDIA RTX 3090	24 GB GDDR6X
	RAM	DDR4-3200	32 GB
USV	OS	Ubuntu 20.04 LTS	64-bit, kernel 5.4
	CPU	ARM Cortex-A57	1.43 GHz
	GPU	NVIDIA Maxwell	921 MHz
	RAM	LPDDR4	4 GB, 64-bit
Common	Language	C++14	——
	PCL	1.10	——
	VTK	9.1	——

Table 3. Dataset parameters.

Category	Dataset	LiDAR	Frequency
Self-recorded Dataset	XPD_ship	RS-Ruby-80	10 Hz
Self-recorded Dataset	Xpd_cloud	RS-Ruby-80	10–20 Hz
Public Dataset	Stanford	——	——
Public Dataset	Pohang Canal	Velodyne HDL-64E S2	5–20 Hz

Table 4. LiDAR parameters.

LiDAR	Beam	Frequency	Resolution	HFOV
RS-Ruby-80	80	10–20 Hz	0.4 m~3 m: Up to ± 5 cm	360°
RS-Ruby-80	80	10–20 Hz	3 m~200 m: Up to ± 3 cm	360°
R-Fans-16	16	5–20 Hz	±2 cm	360°
Velodyne HDL-64E S2	32	5–20 Hz	±2 cm	360°

Table 5. Dataset acquisition specifications.

Category	XPD_ship	Xpd_cloud
Duration (min)	73	58
LiDAR	RS-Ruby-80	RS-Ruby-80
Installation Height (m)	——	5
Frame Rate (Hz)	10	20
Data Size (GB)	70	55
Weather	Sunny, light clouds	Sunny, light clouds
Sea Conditions	BF 1–2; waves 0.1–0.2 m	BF 1–2; waves 0.1–0.2 m
Data Integrity	>95%	>92%
Applicability	Ship	Berthing and Ship

Table 6. Real-time performance evaluation of point cloud registration.

Trial Site	Speed (kn)	LiDAR	Frequency (Hz)	Proc. Time (s/frame)
A	0.5	R-Fans-16	5–20	0.056
B	0.3	R-Fans-16	5–20	0.083
A	0.2	RS-Ruby-80	10–20	0.097
B	0.1	RS-Ruby-80	10–20	0.096
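The per-frame processing times in Table 6 can be related to the LiDAR scan period with a short check. The helper names `MaxSustainableRateHz` and `MeetsRealTime` are hypothetical, and the criterion (one frame finished before the next sweep arrives) is the one stated in the abstract.

```cpp
#include <cassert>

// Highest scan rate (Hz) that a given per-frame processing time can sustain.
inline double MaxSustainableRateHz(double proc_time_s) {
    assert(proc_time_s > 0.0);
    return 1.0 / proc_time_s;
}

// True when a frame is finished before the next sweep arrives,
// i.e. the processing time is shorter than the scan period 1 / rate.
inline bool MeetsRealTime(double proc_time_s, double lidar_rate_hz) {
    assert(lidar_rate_hz > 0.0);
    return proc_time_s < 1.0 / lidar_rate_hz;
}
```

With the table's values, all four rows fit inside the 0.1 s period of a 10 Hz sweep. The fastest row (0.056 s) corresponds to roughly 17.9 Hz, so the 0.05 s period of a 20 Hz sweep would not be met by any of them.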

Table 7. iVox parameters.

Model	Num	Res	Nby_type	Cap
bunny	I	5	6	300,000
horse	II	5	6	300,000
XPD_ship	III	1	6	300,000
Xpd_cloud	IV	1	6	300,000
Armadillo	V	0.5	6	300,000
happy buddha	VI	5	6	300,000
China dragon	VII	5	6	300,000
manuscript	VIII	0.2	6	300,000
dragon	IX	5	6	300,000
hand	X	5	6	300,000
statuette	XI	5	6	300,000
Pohang_03_792	XII	0.5	6	300,000
Pohang_03_794	XIII	0.5	6	300,000
Pohang_03_802	XIV	0.5	6	300,000

Table 8. Dataset-specific parameterizations of the ICP algorithm.

Num	Icp_n	Max_cor_dis (m)	setLeafSize (m)
I	50	5	——
II	50	5	——
III	50	5	——
IV	50	5	[0.2f]
V	50	5	[1.0f]
VI	50	5	[0.001f]
VII	50	5	[0.001f]
VIII	50	5	[0.5f]
IX	50	5	[0.5f]
X	50	5	[0.01f]
XI	50	5	[1.0f]
XII	50	5	[0.1f]
XIII	50	5	[0.1f]
XIV	50	5	[0.2f]
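The setLeafSize column presumably refers to the leaf size of a voxel-grid downsampling filter such as PCL's `pcl::VoxelGrid`, which replaces all points falling in one leaf by their centroid. The sketch below reproduces that idea in plain C++14 without PCL; the function name `VoxelDownsample` and the use of an ordered map are illustrative choices, not the authors' code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <tuple>
#include <vector>

struct Pt { float x, y, z; };

// Running sum of the points that fall into one leaf.
struct LeafAcc {
    double sx = 0.0, sy = 0.0, sz = 0.0;
    std::size_t n = 0;
};

// Replace all points inside each cubic leaf of edge length `leaf`
// by their centroid (one output point per occupied leaf).
std::vector<Pt> VoxelDownsample(const std::vector<Pt>& cloud, float leaf) {
    assert(leaf > 0.0f);
    std::map<std::tuple<long, long, long>, LeafAcc> cells;
    for (const Pt& p : cloud) {
        const auto key = std::make_tuple(
            static_cast<long>(std::floor(p.x / leaf)),
            static_cast<long>(std::floor(p.y / leaf)),
            static_cast<long>(std::floor(p.z / leaf)));
        LeafAcc& a = cells[key];
        a.sx += p.x;
        a.sy += p.y;
        a.sz += p.z;
        ++a.n;
    }
    std::vector<Pt> out;
    out.reserve(cells.size());
    for (const auto& kv : cells) {
        const LeafAcc& a = kv.second;
        const double inv_n = 1.0 / static_cast<double>(a.n);
        out.push_back(Pt{static_cast<float>(a.sx * inv_n),
                         static_cast<float>(a.sy * inv_n),
                         static_cast<float>(a.sz * inv_n)});
    }
    return out;
}
```

A larger leaf merges more points per cell, so the downsampled cloud shrinks and registration runs faster at the cost of geometric detail.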

Table 9. Point cloud dataset (small scale).

Number	Original Size	Downsampled Size
I	35,947	35,947
II	48,485	48,485
III	30,048	30,048
IV	144,000	39,919
V	172,974	46,228
XII	131,072	15,954
XIII	131,072	15,899
XIV	131,072	9703

Table 10. Point cloud dataset (large scale).

Number	Original Size	Downsampled Size
VI	543,652	65,549
VII	437,645	84,972
VIII	2,155,617	85,026
IX	3,609,600	218,049
X	327,323	243,290
XI	4,999,996	277,597

Table 11. Registration result benchmarking with metrics of RMD ($e^{-10}$ mm) and RMSE ($e^{-5}$ mm).

	Kd-ICP		Ikd-ICP		I-VoxICP
Number	RMD	RMSE	RMD	RMSE	RMD	RMSE
1	0.0176	4.19	0.0176	4.19	0.0176	4.19
2	0.0688	8.29	0.0688	8.29	0.0688	8.29
3	137	370.0	15.1	123	1.04	32.3
4	528	726.0	42.8	207	27.9	167
5	2140	1460	77.9	279	18.9	180
6	0.0373	6.11	0.0373	6.11	0.0373	6.11
7	0.146	12.1	0.146	12.1	0.146	12.1
8	682	826.0	233	483	159	399
9	17,400	4170	844	919	682	891
10	25.8	161	13.2	115	8.7	70
11	308	1760	151	389	140	301
12	29.9	547	398	631	262	750
13	58.9	767	398	242	101	117
14	515	2270	365	604	166	408
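For reference, the RMSE column can be read as the root mean square of the distances between corresponding points after alignment. The sketch below assumes that correspondences are already paired index by index; how the paper pairs points, and how it defines RMD, is not restated in this table, so the function `RegistrationRmse` is only an illustration of the common definition.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct P3 { double x, y, z; };

// RMSE over paired points: sqrt( (1/n) * sum_i ||aligned_i - target_i||^2 ).
double RegistrationRmse(const std::vector<P3>& aligned,
                        const std::vector<P3>& target) {
    assert(!aligned.empty() && aligned.size() == target.size());
    double sum_sq = 0.0;
    for (std::size_t i = 0; i < aligned.size(); ++i) {
        const double dx = aligned[i].x - target[i].x;
        const double dy = aligned[i].y - target[i].y;
        const double dz = aligned[i].z - target[i].z;
        sum_sq += dx * dx + dy * dy + dz * dz;
    }
    return std::sqrt(sum_sq / static_cast<double>(aligned.size()));
}
```

A lower value means the aligned source lies closer to the target on average, which is how the columns of Table 11 are compared across the three search structures.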

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Jing, Q.; Bai, M.; Yin, Y.; Guo, D. I-VoxICP: A Fast Point Cloud Registration Method for Unmanned Surface Vessels. J. Mar. Sci. Eng. 2025, 13, 1854. https://doi.org/10.3390/jmse13101854

