1. Introduction
The growth in transportation demand and the expanded utilization of underground space have driven the rapid development of tunnel engineering, leading to the continuous expansion of tunnel scale worldwide [
1]. Shield tunnels have become the predominant form in most urban subway systems due to their advantages, such as construction convenience [
2]. However, the growth in tunnel scale has also brought about a series of issues. Due to factors such as the aging of tunnel structures and the mechanical effects of the surrounding soil on the tunnel, structural deformation and damage inevitably occur. These issues ultimately affect the load-bearing capacity of the tunnel structure and pose safety hazards.
At the same time, defects such as cracks, spalling, and lining water leakage are widely present in operational tunnels [
3,
4], which also impact the load-bearing performance of tunnel structures. Therefore, structural inspections of tunnels play a crucial role in tunnel operations [
5].
Traditional tunnel inspections primarily rely on manual methods, using tools such as measuring tapes, rangefinders, ground-penetrating radars, and total stations to collect data [
6,
7,
8]. However, these measurement methods require significant time, manpower, and material resources during preparation and measurement. Additionally, the measurement accuracy for tunnel deformation is greatly affected by factors such as human intervention and the arrangement of measurement points [
9]. Therefore, there is a significant demand for rapid and high-accuracy detection methods for structural deformation in the field of tunnel inspections.
Light Detection And Ranging (LiDAR) has been widely applied in structural inspections due to its advantages, such as non-contact measurement and fast data acquisition. Laser scanning systems, including mobile laser scanning systems and tripod-based laser scanning systems, are also increasingly being utilized in tunnel inspections [
10,
11].
However, laser scanning systems acquire surface point-cloud data of tunnels during the scanning process, and these raw point-cloud data contain a large number of non-structural facilities, such as pipelines and maintenance walkways. While these non-structural facilities have relatively little impact on the mechanical performance of the shield tunnel itself, the effectiveness of removing non-structural facility point clouds significantly affects the health inspection of the tunnel lining structure [
12].
Numerous researchers have conducted extensive studies on point-cloud denoising, integrating various computer science theories into the denoising process. This has led to the development of a range of point-cloud processing techniques, including filter-based point-cloud denoising, optimization-based point-cloud denoising, and machine learning-based point-cloud denoising [
13,
14,
15].
Filter-based point-cloud denoising primarily originates from the concept of filters in signal processing, where appropriately designed filters are used to remove unwanted noise components. This method typically smooths the data directly through weighted averaging or similar operations, eliminating noise while striving to preserve the characteristics of the original signal. For example, Wen et al. [
16] considered the impact of three parameters—the number of nearest neighbor points (N), the Euclidean distance weight, and the normal vector direction weight—on denoising performance and applied bilateral filtering to denoise point-cloud data in autonomous driving scenarios. Similarly, Wang et al. [
17] denoised forest point-cloud data using statistical filtering, effectively segmenting ground point-cloud data and individual tree point-cloud data.
In addition, filters such as statistical outlier removal (SOR) [
18], geometry-based filtering [
19,
20], and radius filtering [
21] have also been applied to point-cloud denoising in various scenarios. Filter-based point-cloud denoising achieves good results when targeting specific point-cloud distributions. In filter-based point-cloud denoising methods for tunnel structures, filtering techniques such as cloth simulation filtering (CSF) [
22] and wavelet transform [
23] have achieved promising results.
Optimization-based point-cloud denoising methods formulate the denoising process as an optimization problem. By defining objective functions based on geometric properties and noise distributions, these methods seek a denoised point cloud that best fits the objective function derived from the input noisy point cloud.
Fitting methods, represented by the least-squares method [
24], have achieved notable results in point-cloud denoising. This approach introduces locally approximated surface polynomials to adjust point set density, enabling the smooth reconstruction of complex surfaces while minimizing geometric errors. The least-squares method and its improved variants have also been widely applied in the denoising of point-cloud data [
25,
26,
27,
28]. Meanwhile, the least-squares method, combined with other denoising techniques such as outlier filtering [
29] and local kernel regression (LKR) [
30], has also been applied to point-cloud data denoising. In addition to the least-squares method, other fitting optimization techniques, such as local optimal projection (LOP) [
31], weighted local optimal projection (WLOP) [
32], and moving robust principal component analysis (MRPCA) [
33], have also been widely applied.
Some of the aforementioned point-cloud denoising techniques have also been applied to the processing of tunnel-structure point-cloud data. Bao et al. [
34] conducted research on denoising tunnel cross-sectional point clouds by leveraging the geometric features and intensity information of the point-cloud data. Their method effectively removes most noise points in the tunnel point cloud, thereby improving data quality and accuracy. However, because this method requires manually transforming the coordinate system and extracting the cross-section based on the point cloud’s geometric and intensity features, the denoising accuracy is significantly affected by factors such as equipment, environment, and the precision of manual data preprocessing. Consequently, its robustness under complex tunnel conditions is relatively poor.
Since the tunnel cross-sectional structure exhibits minimal variation along the tunnel’s mileage, some scholars have extracted tunnel cross-sections along the mileage direction based on this structural characteristic [
35] and performed denoising on individual cross-section point clouds. Moreover, Han et al. [
36] projected the three-dimensional point cloud onto a two-dimensional plane and extracted the tunnel axis by obtaining the planar skeleton, thereby achieving tunnel cross-section extraction.
In addition, some researchers employ ellipse-fitting methods to model the boundary lines of tunnel point clouds and, based on the fitting results, remove outliers to complete both the denoising of tunnel cross-sectional point clouds and the extraction of the cross-section. Among various ellipse-fitting methods, the least-squares method [
37,
38,
39,
40] is widely used. The least-squares method can effectively fit and denoise tunnel point clouds that contain relatively few noise points and where the noise causes only minimal dispersion of the fitted curve. However, some tunnel point clouds contain a significant amount of discrete noise, such as those generated by tunnel pipelines, making it difficult for the least-squares method to perform effective fitting and denoising.
In order to improve the robustness of the fitting method against noise, some researchers have employed RANSAC [
41,
42,
43] to fit tunnel cross-sectional curves. RANSAC is a highly robust method capable of extracting valid data even in the presence of a large amount of noise and outliers, making it suitable for complex point-cloud distributions with unevenly distributed noise or numerous outliers. However, the selection of parameters for this method (such as the inlier threshold and the number of iterations) relies on experience, and the tuning process is quite cumbersome. Moreover, because this method is based on random sampling, the ellipse-fitting results obtained with RANSAC tend to be unstable and lack repeatability. In addition, B-spline curves [
44] have also been applied to the curve fitting of tunnel cross-sectional point clouds. This method can generate smooth, continuous curves with excellent local control, making it suitable for describing complex tunnel cross-sectional shapes. However, it is sensitive to the selection of initial control points and parameter settings, and inappropriate settings may lead to fitting bias.
Machine learning-based point-cloud denoising does not rely on manually defined filters or objective functions. Instead, it denoises through a data-driven approach, leveraging the powerful computational capabilities of computers to uncover the inherent patterns and features within the data. Machine learning primarily includes two types: supervised learning and unsupervised learning.
In the field of point-cloud denoising, unsupervised learning denoising methods are often based on clustering techniques. These methods group similar data points into clusters and identify and remove outliers or noise points to achieve denoising. Density-based clustering algorithms such as DBSCAN [
45]; PCA-based adaptive clustering [
46]; the DTSCAN clustering method based on Delaunay triangulation and the DBSCAN mechanism [
47]; clustering based on spatial distance, normal similarity, and logarithmic Euclidean Riemannian metrics [
48]; and K-means clustering based on local statistical features of point clouds [
49] have been applied to point-cloud denoising.
These methods have achieved good denoising results for point-cloud data with distinct geometric distribution features. These unsupervised learning models typically do not require explicit labels. Instead, they learn the inherent features of the data by identifying structures, patterns, or correlations within the data.
In contrast, supervised learning involves learning from labeled training data to predict outcomes for new, unseen data. In the field of point-cloud denoising, supervised learning-based denoising is often achieved using neural networks. Network architectures such as PointNet and convolutional neural networks (CNNs) have been applied to point-cloud denoising.
The PointNet [
50] and PointNet++ [
51] architectures, which can directly process point-cloud data, have opened new directions in the field of 3D data processing and have driven the development of point-cloud processing networks. Improved versions of the PointNet network, such as PointCleanNet [
52], which focuses on the fusion of local and global features of point clouds, and PCPNet [
53], which emphasizes the estimation of local geometric features like normal vectors and curvature, have demonstrated good performance in point-cloud denoising tasks.
PointNet has relatively weak local feature extraction capabilities and lacks spatial structure awareness of point-cloud data. To address this, some researchers have employed improved convolutional neural networks (CNNs) for point-cloud denoising tasks [
54,
55,
56,
57]. For existing convolutional neural networks (CNNs), the unstructured and sparse nature of point-cloud data does not perfectly match the input data requirements of CNNs. Therefore, the use of CNNs for point-cloud denoising still has room for improvement. In machine learning-based point-cloud denoising, DBSCAN clustering [
58] has been applied to tunnel point-cloud denoising.
In the aforementioned research on point-cloud denoising, various models of laser scanning devices have been applied for point-cloud acquisition. For instance, the Faro X130, which excels in high-precision detail capture, features a scan angular resolution of 0.036° and a beam divergence of 0.19 mrad, a measuring range of 0.6 m to 130 m, and a single measurement error of ±2 mm; a single scan can generate a high-density point cloud containing approximately 1371 million points. Additionally, the Robosense Helios 1615 laser scanner, known for its fast measurement speed and suitability for mobile mapping and large-scale three-dimensional laser scanning, employs a 32-line scanning mechanism, offers a 360° horizontal field of view and a 70° vertical field of view (covering −55° to +15°), and can achieve a point-cloud acquisition rate of 576,000 pts/s in single-return mode.
In addition, the Optech LYNX Mobile Mapper, which is suitable for engineering surveys and large-scale infrastructure measurements (such as tunnels), features a maximum measuring range of 200 m, a range precision of 8 mm, and a range accuracy of ±10 mm (1σ). This device has a laser measurement rate of 75–500 kHz (with multiple measurements possible per laser pulse), a scanning frequency of 80–200 Hz, uses a 1550 nm (near-infrared) laser, and offers a full 360° scanning field of view. Besides laser scanning devices, other data acquisition devices, such as the Intel RealSense D435i, are also used for data collection tasks. The Intel RealSense D435i is a measurement device that generates depth images based on infrared stereo vision and related technologies. It features an RGB frame resolution of 1920 × 1080 and a field of view of approximately 69° × 42°. This device is capable of capturing both RGB and depth information simultaneously. Together, these devices provide a reliable data foundation for the aforementioned denoising methods.
However, in point-cloud denoising methods for tunnel data, filter-based denoising techniques often focus on the distribution of individual point clouds and tend to have poor robustness when dealing with complex point-cloud distributions. Machine learning-based point-cloud denoising methods, represented by clustering and neural networks, often involve complex parameter tuning or require large amounts of data collection and labeling. As a result, these methods typically have long preparation periods and substantial workloads. Optimization-based denoising methods, such as point-cloud fitting, often require manual data preprocessing. The quality of the preprocessing directly affects the ellipse-fitting results, which in turn impacts the denoising performance. However, for large-scale point-cloud data in practical inspection tasks, performing manual preprocessing on all the data would consume substantial resources.
Therefore, this study proposes an ellipse-fitting denoising method based on the Huber loss function. This method achieves good fitting results and effectively denoises tunnel-lining-structure point-cloud data without the need for preprocessing. Additionally, it offers good computational efficiency.
2. Methods
The process of ellipse-fitting and point-cloud data denoising for subway shield-tunnel sections based on the Huber loss function is shown in
Figure 1. Raw point-cloud data of the subway tunnel are obtained through a mobile laser scanning system. The raw point-cloud data undergo principal component analysis (PCA) to determine the principal component direction, and the Huber loss function is used to fit the cross-section's symmetry axis. Based on the principal component direction and the fitted symmetry axis, the spatial coordinate system of the point-cloud data is transformed, so that the mileage direction (tunnel axis direction) of the tunnel point cloud is parallel to the Cartesian z-axis, the symmetry axis of the tunnel cross-section is parallel to the y-axis, and the x-axis remains orthogonal to both.
After transforming the coordinate system, tunnel point-cloud data are extracted along the z-axis direction at fixed distances to obtain cross-sectional point-cloud data. For each individual cross-section point cloud, the Huber loss function is used to fit an ellipse equation. The original point cloud is then projected onto the fitted ellipse by calculating the radial shortest distance, resulting in the denoised point cloud.
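As a concrete illustration of this slicing step, the sketch below shows one way to extract cross-sections at a fixed spacing along the z-axis. This is an assumed implementation rather than the authors' code; the function name, the 8 m interval, and the slab thickness are illustrative values only.

```python
# Illustrative sketch (not the authors' code): slicing the coordinate-transformed
# tunnel point cloud into cross-sections at a fixed spacing along the z-axis.
# The function name, interval, and slab thickness are hypothetical values.
import numpy as np

def extract_sections(points: np.ndarray, interval: float = 8.0,
                     thickness: float = 0.05) -> list:
    """points: (N, 3) array after coordinate transformation; returns a list of slices."""
    z = points[:, 2]
    sections = []
    for z0 in np.arange(z.min(), z.max(), interval):
        # Keep points inside a thin slab centered on the section plane z = z0
        mask = np.abs(z - z0) <= thickness / 2.0
        if mask.any():
            sections.append(points[mask])
    return sections
```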
Finally, evaluation metrics are derived by calculating the differences in radial distance between before and after projection. These metrics include mean squared error (MSE), root-mean-squared error (RMSE), and mean absolute error (MAE), which are used to comprehensively assess the effectiveness of tunnel point-cloud data denoising.
2.1. PCA-Based Point-Cloud Data Coordinate System Transformation
Because the actual tunnel alignment varies significantly with engineering requirements, geological conditions, and other factors, the principal feature direction of the tunnel point cloud scanned by the laser scanning system is generally not aligned with the XYZ axes of the Cartesian coordinate system. To collect data, this study used the Amberg Clearance GRP 5000 with the Profiler 6012 scanner, a high-precision mobile 3D laser scanning system designed for tunnel and rail infrastructure inspection. The scanning system advances rapidly along the tunnel axis to achieve continuous point-cloud data collection. The actual collected tunnel point-cloud data are shown in
Figure 2. Because the tunnel point cloud is a three-dimensional object, rendering it in a single color makes its geometry difficult to perceive. This study therefore displays the point cloud with a color scale distributed evenly along the data; the color scale in the figure has no geometric or physical meaning.
However, during point-cloud data processing, coordinate system transformation is often involved. To achieve better computational results and facilitate calculations, this study proposes a PCA-based tunnel point-cloud coordinate system transformation method. This method rotates the principal feature direction of the original point-cloud data to align with the z-axis of the Cartesian coordinate system. The results show that this method effectively completes the tunnel point-cloud coordinate system transformation and reduces the errors introduced by manual operations.
PCA is a commonly used data feature extraction method that represents the main variation trends of the data by finding the direction with the largest variance (i.e., the principal components). A larger variance indicates that the differences in the data are more pronounced. PCA was first proposed by Karl Pearson [
59] in 1901. It uses linear combinations to reduce the dimensionality of high-dimensional data to process multivariate data, revealing the correlation between variables and the inherent structure of the data. Later, it was further developed and improved by Harold Hotelling [
60] and named “principal component analysis”. This method has been widely used in the field of statistical data analysis. Li et al. [61] successfully extracted the principal axis direction of a turbine-blade point cloud with the help of PCA, demonstrating the effectiveness of PCA in extracting the principal component direction of point-cloud data. Based on the above PCA principles and references, this study adapts the PCA formulation for tunnel point-cloud data processing and applies it to tunnel point-cloud coordinate system transformation. For tunnel point-cloud data with a narrow distribution, this method can effectively extract the principal component direction.
Because an actual tunnel point cloud is very large, usually containing millions of points, it is difficult to illustrate its principal component directions directly. To better demonstrate the PCA principle, this paper therefore visualizes it using sample point-cloud data generated in Python, as shown in
Figure 3. The two arrows in the figure represent the first and second principal component directions of the data. The gray dots are sample data generated by the program, representing the point-cloud data of the actual tunnel. The coordinate axes in the figure are for the quantitative display of point-cloud data and do not have actual units.
To reduce the errors introduced by manually selecting the tunnel axis direction, this study uses PCA to calculate the eigenvector of the direction with the maximum variance of the tunnel point cloud (the first principal component direction) to represent the tunnel point cloud’s axis direction. Based on this, the tunnel point cloud is rotated, as shown in
Figure 4.
For the original tunnel point-cloud dataset $P = \{p_1, p_2, \ldots, p_n\}$, where each point has coordinates $p_i = (x_i, y_i, z_i)$, the mean of the point-cloud data must first be calculated according to Equation (1) in order to eliminate the effect of the data's position in the coordinate space (i.e., the offset) on the data's feature vectors (i.e., the principal component directions); all points in the dataset are then centered by subtracting the mean, as shown in Equation (2).
The original tunnel point-cloud data are centered by subtracting the mean, resulting in the centered tunnel point-cloud dataset $P'$. The covariance matrix $C$ of $P'$ is then calculated according to Equation (3), and $C$ undergoes eigenvalue decomposition, as shown in Equation (4). The principal component corresponds to the largest eigenvalue $\lambda_1$, and the eigenvector $v_1$ associated with $\lambda_1$ represents the axis direction of the tunnel point-cloud data.
In the equations, $\lambda_i$ denotes the i-th eigenvalue of the covariance matrix, and $v_i$ is the eigenvector corresponding to $\lambda_i$. The eigenvalues are sorted in descending order, such that $\lambda_1 \geq \lambda_2 \geq \lambda_3$.
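The PCA steps of Equations (1)-(4) can be summarized in a short NumPy sketch. This is an assumed implementation, not the authors' code; the function and variable names are illustrative.

```python
# Illustrative sketch (not the authors' code) of Equations (1)-(4): extracting the
# first principal component (tunnel axis direction) of a point cloud with NumPy.
import numpy as np

def principal_axis(points: np.ndarray) -> np.ndarray:
    """points: (N, 3) tunnel point-cloud coordinates; returns the unit axis direction."""
    mean = points.mean(axis=0)              # Equation (1): point-cloud mean
    centered = points - mean                # Equation (2): centering
    cov = np.cov(centered, rowvar=False)    # Equation (3): 3 x 3 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # Equation (4): eigenvalue decomposition
    # eigh returns eigenvalues in ascending order, so the last column is the
    # eigenvector of the largest eigenvalue, i.e., the tunnel axis direction.
    return eigvecs[:, -1]
```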
To align the obtained principal axis direction of the point cloud with the z-axis, the rotation axis $r$ and the rotation angle $\theta$ need to be calculated; from these, the rotation matrix $R$ is computed. The rotation vector $r$ is calculated as shown in Equation (5), and the normalized rotation vector $\hat{r}$, which represents the direction of rotation, is calculated as shown in Equation (6); its components along the x, y, and z axes are $r_x$, $r_y$, and $r_z$. The rotation angle $\theta$ can be computed from the dot product between the principal axis and the z-axis, as shown in Equation (7).
In the equation, $|v_1|$ and $|z|$ represent the magnitudes of the unit vectors along the principal axis and the z-axis, respectively.
In three-dimensional space, the rotation matrix can be generated using the axis–angle representation. For any point $p$ in space, the coordinates of the rotated point can be obtained directly by multiplying it with the rotation matrix $R$. Given a rotation vector $\hat{r}$ and a rotation angle $\theta$, the rotation matrix $R$ can be generated using Rodrigues' formula, as shown in Equation (8), with the skew-symmetric matrix $K$ defined in Equation (9).
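A minimal sketch of Equations (5)-(9), assuming standard NumPy operations, is given below; it builds the axis-angle rotation matrix via Rodrigues' formula and is illustrative rather than the authors' implementation.

```python
# Illustrative sketch (not the authors' code) of Equations (5)-(9): building the
# axis-angle rotation matrix with Rodrigues' formula and aligning the tunnel axis
# with the z-axis. Assumes the axis is not anti-parallel to z.
import numpy as np

def rotation_to_z(axis_dir: np.ndarray) -> np.ndarray:
    z = np.array([0.0, 0.0, 1.0])
    r = np.cross(axis_dir, z)                       # Equation (5): rotation axis
    r_norm = np.linalg.norm(r)
    if r_norm < 1e-12:                              # already aligned with z
        return np.eye(3)
    r_hat = r / r_norm                              # Equation (6): unit rotation vector
    theta = np.arccos(np.dot(axis_dir, z) / np.linalg.norm(axis_dir))  # Equation (7)
    K = np.array([[0.0, -r_hat[2], r_hat[1]],
                  [r_hat[2], 0.0, -r_hat[0]],
                  [-r_hat[1], r_hat[0], 0.0]])      # Equation (9): skew-symmetric matrix
    # Equation (8): Rodrigues' formula R = I + sin(theta) K + (1 - cos(theta)) K^2
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

# Example usage: rotated_points = points @ rotation_to_z(principal_axis(points)).T
```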
The original point cloud is rotated along the tunnel axis, and the entire-tunnel point cloud is rotated to align with the z-axis direction, as shown in
Figure 5.
2.2. Point-Cloud Coordinate Transformation and Section Fitting Denoising Based on Huber Loss Function
The PCA-based point-cloud coordinate transformation can align the tunnel point cloud's main axis direction with the target axis (the z-axis), so that the tunnel sections become parallel to the XY plane of the Cartesian coordinate system. However, for the tunnel point cloud after coordinate transformation, the actual tunnel section is still not orthogonal to the axes of the XY plane, and the point cloud contains a large number of noise points generated by tunnel ancillary facilities. To address this issue, this study proposes a method for tunnel-section coordinate system transformation and section fitting denoising based on the Huber loss function.
The Huber loss function was proposed by the statistician Peter J. Huber in 1964 [
62]. It combines the square error (for small deviations) and the linear error (for large deviations), as shown in Equation (10). Therefore, this fitting method is more robust to outliers. Based on the principle of the Huber loss function and some vector algebra and Euclidean geometry principles, this study proposes the following method and formula.
The principle of the Huber loss function is illustrated in
Figure 6. The objective of fitting with the Huber loss is to minimize the loss between the fitting function and all the points, as shown in Equation (11). In this study, the fitting result obtained with the Huber loss function is a circular equation. This equation describes the distribution and geometric parameters of the tunnel cross-sectional point cloud, laying the foundation for the subsequent point-cloud coordinate transformation and denoising. In this circular equation, the circle's center is defined as $(h, k)$, and the radius is $r$. By minimizing this loss function, the circle's center $(h, k)$ and radius $r$ can be determined.
In this study, the geometric parameters of the tunnel cross-section are fitted using the Huber loss function, which serves as the basis for the coordinate system transformation and point-cloud denoising.
In Equation (10), $\delta$ is a parameter used to control the transition of the function from squared residuals to linear residuals. When the absolute value of the residual $|e_i|$ is less than or equal to $\delta$, the loss function behaves as a squared residual, making the loss smoother for small residuals. When $|e_i|$ exceeds $\delta$, the loss function transitions to a linear increase, reducing the influence of outliers on the fitting result. Here, $|e_i|$ represents the absolute residual between the fitted value and the actual value, and $w_i$ is the weight parameter that controls the impact of each point's error on the total loss.
In Equation (11), the subscripts h, k, and r denote the parameters optimized in the minimization: the x-coordinate of the circle center, the y-coordinate of the circle center, and the radius of the circle, respectively. The algorithm continuously adjusts these three parameters to minimize the Huber loss function, and the resulting equation is the circle that best fits the tunnel cross-section point cloud.
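One way to realize the Huber-loss circle fit of Equations (10)-(12) is sketched below using SciPy's robust least-squares solver, where loss='huber' and f_scale play roles analogous to the Huber loss and the parameter $\delta$; this is an assumed implementation, not the authors' code, and the default delta value is illustrative.

```python
# Illustrative sketch (not the authors' code): fitting the circle parameters
# (h, k, r) of Equations (10)-(12) with a Huber loss, using SciPy's robust
# least-squares solver. The argument f_scale plays a role analogous to delta.
import numpy as np
from scipy.optimize import least_squares

def fit_circle_huber(xy: np.ndarray, delta: float = 0.05):
    """xy: (N, 2) cross-section coordinates; returns the fitted (h, k, r)."""
    def residuals(params):
        h, k, r = params
        # Radial residuals: distance to the candidate center minus the radius
        return np.hypot(xy[:, 0] - h, xy[:, 1] - k) - r

    h0, k0 = xy.mean(axis=0)                             # initial center guess
    r0 = np.hypot(xy[:, 0] - h0, xy[:, 1] - k0).mean()   # initial radius guess
    result = least_squares(residuals, x0=[h0, k0, r0],
                           loss='huber', f_scale=delta)
    return result.x
```

Under these assumptions, a natural design choice is to set delta on the order of the lining's surface roughness, so that points belonging to pipelines and brackets fall into the linear, down-weighted regime of the loss.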
The Huber loss function can fit the center, axis of symmetry, and other geometric information of the tunnel cross-section. Using this geometric information, the symmetry-axis direction $s_j$ of each cross-sectional point set $P_j$ can be computed. By calculating the average symmetry-axis direction $\bar{s}$ over all cross-sections and the angle $\alpha$ between it and the corresponding axis of the XY plane, the point-cloud data can be rotated around the z-axis by $\alpha$, thus obtaining the tunnel point cloud that is orthogonal to the XY plane, as shown in Figure 7.
For a single tunnel cross-section point cloud, in order to improve computational efficiency and make the point cloud easier to fit, a simpler circular model, defined only by its center and radius, is selected for the fitting, as shown in Equation (12). In this equation, $(h, k)$ represents the center of the fitted circle, and $r$ is its radius. The goal is to minimize the residuals between the fitting function and all points, i.e., to make all points fall as close as possible to the fitted circle, as illustrated in Figure 8.
For the fitted circle center $(h, k)$, the direction vector $u_i$ from each point on the cross-section to the circle center can be calculated using Equation (13). After normalizing the direction vectors using Equation (14), the unit direction vectors are used to compute the symmetry-axis direction $s$ using Equation (15). The angle $\alpha$ by which the point-cloud data must be rotated around the z-axis to align the symmetry-axis direction $s$ with the coordinate axis can then be calculated using the inverse trigonometric function in Equation (16); this angle represents the angle between the symmetry axis and the coordinate axis.
In the equation, $\hat{u}_i$ represents the normalized direction vector from each point in the point cloud to the center of the circle.
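The symmetry-axis estimation and z-axis rotation of Equations (13)-(16) could be sketched as follows. This is one plausible reading rather than the authors' exact formulation: the symmetry axis is taken as the mean of the unit vectors between the fitted center and the points, which is non-zero because the scanned cross-section arc is incomplete, and the reference axis chosen for the angle is an assumption.

```python
# Illustrative sketch of one plausible reading of Equations (13)-(16), not the
# authors' exact formulation: the symmetry axis is estimated as the mean of the
# unit vectors from the fitted circle center to the points, and the cloud is
# rotated about the z-axis to align that axis with the y-axis.
import numpy as np

def symmetry_axis_angle(xy: np.ndarray, center: np.ndarray) -> float:
    vectors = xy - center                                             # Equation (13)
    units = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)  # Equation (14)
    axis_dir = units.mean(axis=0)                                     # Equation (15)
    axis_dir /= np.linalg.norm(axis_dir)
    # Equation (16): angle between the symmetry axis and the y-axis (illustrative choice)
    return np.arctan2(axis_dir[0], axis_dir[1])

def rotate_about_z(points: np.ndarray, angle: float) -> np.ndarray:
    c, s = np.cos(angle), np.sin(angle)
    Rz = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return points @ Rz.T

# Example usage:
# rotated = rotate_about_z(points, symmetry_axis_angle(section_xy, np.array([h, k])))
```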
After the tunnel point cloud is rotated along the axis line based on PCA and the tunnel cross-section is rotated around the z-axis based on the Huber loss function, the resulting tunnel point-cloud data are aligned parallel (and orthogonal) to the axes of the three-dimensional coordinate system.
For the point-cloud data in the transformed coordinate system, this study also applies the Huber loss function to fit the tunnel cross-sections and perform point-cloud denoising. In operational tunnels, the lining may undergo some deformation due to soil pressure; to better approximate the real cross-section, an elliptical model is therefore used for fitting the cross-sections during the denoising process. The fitted ellipse equation can be expressed as Equation (17), in which $(x_0, y_0)$ represents the center of the fitted ellipse, and $a$ and $b$ represent the lengths of the semi-major and semi-minor axes, respectively.
For the original point cloud, this study uses the shortest distance projection method to project each point radially onto the boundary of the fitted ellipse. The projection direction is determined by calculating the shortest path from each point to the ellipse. The principle is illustrated in
Figure 9.
For the XY plane coordinates $(x_i, y_i)$ of each original tunnel point, the distance between the point and the center of the ellipse is calculated and normalized, as shown in Equations (18) and (19). Based on the normalized relative distance, the directional factor for the point-cloud projection can be computed, as shown in Equation (20). Using Equations (21) and (22), the final projected point-cloud coordinates $(x_i', y_i')$ are obtained, while the z-coordinate of each point remains unchanged throughout the process.
In the equations, $\Delta x_i$ and $\Delta y_i$ represent the coordinate differences between each point and the center of the fitted ellipse $(x_0, y_0)$, and $a$ and $b$ denote the lengths of the semi-major and semi-minor axes of the fitted ellipse, respectively.
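A minimal sketch of the radial projection in Equations (18)-(22) is shown below, under the assumption that each point is scaled along the ray from the ellipse center until it satisfies the ellipse equation; the function name and edge-case handling are illustrative, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code) of Equations (18)-(22), assuming that
# each point is scaled along the ray from the ellipse center until it satisfies the
# ellipse equation; the z-coordinate is left unchanged. Points coinciding with the
# center are assumed not to occur.
import numpy as np

def project_to_ellipse(points: np.ndarray, center, a: float, b: float) -> np.ndarray:
    """points: (N, 3); center: (x0, y0); a, b: semi-major/semi-minor axis lengths."""
    x0, y0 = center
    dx = points[:, 0] - x0                          # Equation (18): offsets from center
    dy = points[:, 1] - y0
    # Equations (19)-(20): normalized radial distance and its reciprocal scaling factor
    scale = 1.0 / np.sqrt((dx / a) ** 2 + (dy / b) ** 2)
    projected = points.copy()
    projected[:, 0] = x0 + scale * dx               # Equation (21)
    projected[:, 1] = y0 + scale * dy               # Equation (22)
    return projected                                # z-coordinate is unchanged
```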
The point cloud after the coordinate system transformation and cross-section fitting denoising based on the Huber loss function can effectively remove noise points, such as pipeline and other non-structural facility point clouds, while retaining the geometric shape of the tunnel lining structure.
2.3. Evaluation Metrics Calculation
Due to the huge volume of point-cloud data, it is difficult to evaluate the denoising effect of a denoised point cloud manually. Some researchers therefore evaluate the denoising effect by calculating the difference in the point-cloud distribution before and after denoising. Bao et al. [34] quantitatively evaluated the denoising effect by calculating the maximum relative deviation (MRD), average deviation (AD), and mean squared error (MSE). Wang et al. [
35] evaluated the denoising effect of the point cloud by calculating the maximum positive displacement and maximum negative displacement of the point-cloud projection before and after denoising.
To quantitatively evaluate the effectiveness of the Huber loss function-based point-cloud denoising method, this study, based on the aforementioned literature and mathematical statistical principles, employs three evaluation metrics—mean squared error (MSE), root-mean-squared error (RMSE), and mean absolute error (MAE)—to assess whether the fitted ellipse accurately matches the distribution of the original point cloud, and thereby evaluates whether the denoised point cloud retains the geometric features of the tunnel lining.
However, due to the presence of non-structural facility point clouds (such as noise points from auxiliary structures) in the original point cloud, these non-structural point clouds introduce calculation errors when evaluating the entire section. This ultimately affects the judgment of the denoising effect for the section. Therefore, directly calculating the three metrics for the entire section point cloud cannot accurately reflect the fitting and denoising performance.
In addition to calculating the evaluation metrics for the entire section, this study also selects the local point cloud at the tunnel crown (from 60° to 120° of the tunnel cross-section, with the positive x-axis as the 0° direction) to compute the three evaluation metrics. This part of the tunnel point cloud, from the 60° to 120° range at the top of the tunnel, effectively reflects the denoising results of the tunnel lining structure since there are no non-structural facilities (noise points). As shown in
Figure 10, the red area represents the selected partial point cloud.
Due to the presence of noise points (outliers), calculating the error separately from the x and y coordinates of each point in the Cartesian coordinate system may produce abrupt value changes that distort the evaluation of the point-cloud denoising effect.
To mitigate the influence of outliers and to evaluate the denoising effect of the method comprehensively in both the x and y coordinate directions, this study uses the distance $d_i$ from each point to the center of the fitted circle before and after projection, instead of calculating the evaluation metrics from the x and y coordinates, as shown in Equation (23).
This equation evaluates the difference between the fitted, denoised point cloud and the tunnel-lining-structure point cloud by comparing the radius from each point to the center of the fitted circle before and after projection. The evaluation metrics MSE, RMSE, and MAE are therefore calculated from the radial distances of the original and projected point clouds as follows:
The MSE is the average of the squared differences between the fitted values and the original values, used to quantify the overall magnitude of the error, as shown in Equation (24). The RMSE is the square root of the MSE, and it has the same units as the original data, making it more intuitive when interpreting the error magnitude. As shown in Equation (25), the lower the value of RMSE, the smaller the error.
The MAE is the average of the absolute errors of all data points, and it provides an intuitive measure of the magnitude of the error. As shown in Equation (26), the smaller the MAE, the smaller the error in the results.
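The metrics of Equations (23)-(26) can be computed directly from the radial distances, as in the following sketch. This is an assumed implementation, not the authors' code; the crown selection of 60°-120° is included as an optional mask.

```python
# Illustrative sketch (not the authors' code) of Equations (23)-(26): MSE, RMSE,
# and MAE computed on the radial distances before and after projection, with an
# optional mask for the tunnel crown (60-120 degrees from the positive x-axis).
import numpy as np

def radial_metrics(original: np.ndarray, projected: np.ndarray, center,
                   crown_only: bool = False):
    x0, y0 = center
    d_orig = np.hypot(original[:, 0] - x0, original[:, 1] - y0)   # Equation (23)
    d_proj = np.hypot(projected[:, 0] - x0, projected[:, 1] - y0)
    if crown_only:
        angles = np.degrees(np.arctan2(original[:, 1] - y0, original[:, 0] - x0))
        mask = (angles >= 60.0) & (angles <= 120.0)               # crown selection
        d_orig, d_proj = d_orig[mask], d_proj[mask]
    err = d_orig - d_proj
    mse = np.mean(err ** 2)                                        # Equation (24)
    rmse = np.sqrt(mse)                                            # Equation (25)
    mae = np.mean(np.abs(err))                                     # Equation (26)
    return mse, rmse, mae
```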
This study calculates and evaluates the point-cloud denoising effect based on the above three metrics for both the entire cross-sectional point cloud and the tunnel-top point cloud, which contains fewer noise points.
4. Discussion
The experimental results reported in the previous section show that the point-cloud denoising method based on the Huber loss function can effectively denoise the point-cloud data of shield-tunnel sections, with the highest accuracy reaching the millimeter level. The method demonstrates good denoising performance for both individual cross-sectional point clouds and the entire-tunnel point cloud, indicating its robustness to complex noise distributions.
This robustness is due to the fact that the Huber loss function uses squared error for small deviations—making it highly sensitive to minor errors—and linear error for large deviations, thereby reducing the influence of outliers or anomalous points on the overall fitting result. This characteristic ensures that noise from non-structural facilities such as pipelines does not excessively interfere with the final fitting, thus allowing for the accurate extraction and denoising of the main tunnel structure (tunnel lining structure).
Moreover, the Huber loss function exhibits smoother error variations, combining the high precision of mean squared error when the data are relatively clean with the robustness of absolute error when handling noisy data, thereby enabling effective denoising even for tunnel-top point clouds that contain fewer noise points.
In order to evaluate the merits and drawbacks of the denoising performance of the proposed method, this study compares the denoising effect of the Huber loss-based elliptical-fitting denoising method with that of a widely used free point-cloud processing software package. Taking into account factors such as the user base and whether the software is open source, this study selected CloudCompare version 2.10 as the benchmark for the denoising performance comparison.
CloudCompare is an open-source 3D point-cloud processing software that focuses on handling high-density point-cloud data and 3D meshes. Its primary functions include point-cloud alignment, registration, segmentation, denoising, resampling, and geometric feature extraction, among others. This software is widely used in point-cloud data processing tasks in fields such as engineering surveying and archaeological mapping.
In this study, tunnel point-cloud data were denoised using CloudCompare software version 2.10, with the software interface shown in
Figure 16. CloudCompare allows for denoising by manually selecting and removing points from the cloud. However, due to the enormous volume of tunnel point-cloud data, manually denoising the data consumes a considerable amount of time and manpower. Moreover, manual denoising often introduces human errors, which can affect the accuracy of the denoising process.
In addition, CloudCompare software also includes a statistical outlier removal (SOR) filtering denoising function that can automatically denoise point clouds. The denoising effect is shown in
Figure 17, where blue represents the original point cloud and red represents the denoised point cloud. As can be seen from the results, point-cloud denoising based on the SOR filter in CloudCompare is capable of removing some outlier points from tunnel point clouds; however, it is difficult to remove noise with a continuous spatial distribution, such as pipeline point clouds.
To further discuss the effectiveness of the ellipse-fitting and point-cloud denoising method proposed in this study, we compare it with two widely used fitting and denoising methods: the least-squares method and random sample consensus (RANSAC) method.
The least-squares method is a fitting approach based on the sum of squared errors. It minimizes the squared differences between the actual data values and the fitted function to obtain the optimal fitting ellipse. By projecting the point cloud onto the fitted ellipse, denoising is achieved. This method is widely used in various data fitting tasks due to its simple algorithmic structure and other advantages. As described in the previous section, this study extracted five sections from a 40 m long tunnel for denoising. In order to more intuitively and scientifically compare the denoising effects of the least-squares method and the method proposed in this study, this study performed denoising on the same section (
Section 1 in Section 3, Results). The denoising result is shown in
Figure 18, where the blue points represent the original point cloud and the red points represent the denoised point cloud after fitting.
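For reference, a simplified least-squares baseline is sketched below. It fits an axis-aligned ellipse by linearizing the ellipse equation, a common textbook formulation; it is not necessarily the exact implementation compared in this study, and the function name is hypothetical.

```python
# Illustrative sketch (not the exact baseline used in this study): a linear
# least-squares fit of an axis-aligned ellipse, obtained by linearizing the
# ellipse equation A*x^2 + B*y^2 + C*x + D*y = 1.
import numpy as np

def lsq_ellipse_axis_aligned(xy: np.ndarray):
    x, y = xy[:, 0], xy[:, 1]
    M = np.column_stack([x ** 2, y ** 2, x, y])
    A, B, C, D = np.linalg.lstsq(M, np.ones(len(x)), rcond=None)[0]
    x0, y0 = -C / (2 * A), -D / (2 * B)                 # ellipse center
    F = 1 + C ** 2 / (4 * A) + D ** 2 / (4 * B)
    a, b = np.sqrt(F / A), np.sqrt(F / B)               # semi-axis lengths
    return (x0, y0), a, b
```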
Random sample consensus (RANSAC) is an iterative fitting method based on random sampling. It randomly selects a subset of data, treating points that are close to the fitted ellipse as inliers and points that are far from the fitted ellipse as outliers (noise points). Through iterative sampling and estimation of model parameters, RANSAC ultimately obtains the optimal fitted ellipse. This method is widely used in fitting and denoising tasks due to its robustness to noise, ability to handle various fitting models, and other advantages.
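Likewise, the RANSAC baseline can be sketched as a random-sampling loop. For brevity, the sketch below uses a circle model (three-point samples) rather than a full five-point ellipse estimator, and the inlier threshold and iteration count are hypothetical values, not those used in this study.

```python
# Illustrative sketch (not the exact baseline used in this study): a RANSAC loop,
# simplified to a circle model (three-point samples) for brevity; the inlier
# threshold and iteration count are hypothetical values.
import numpy as np

def ransac_circle(xy: np.ndarray, threshold: float = 0.05,
                  iterations: int = 1000, seed: int = 0):
    rng = np.random.default_rng(seed)
    best_params, best_count = None, -1
    for _ in range(iterations):
        sample = xy[rng.choice(len(xy), size=3, replace=False)]
        # Circle through three points via the linearized system 2hx + 2ky + c = x^2 + y^2
        A = np.column_stack([2 * sample[:, 0], 2 * sample[:, 1], np.ones(3)])
        b = (sample ** 2).sum(axis=1)
        try:
            h, k, c = np.linalg.solve(A, b)
        except np.linalg.LinAlgError:            # nearly collinear sample, skip
            continue
        r = np.sqrt(c + h ** 2 + k ** 2)
        inliers = np.abs(np.hypot(xy[:, 0] - h, xy[:, 1] - k) - r) < threshold
        if inliers.sum() > best_count:
            best_count, best_params = int(inliers.sum()), (h, k, r)
    return best_params
```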
In order to compare with the denoising effect of the method proposed in this study, the RANSAC method is used to denoise Section 1, which was extracted in the previous steps of the work. The denoising results are shown in
Figure 19, where the blue points represent the original point cloud and the red points represent the denoised point cloud after fitting.
As shown in the above figures, both the least-squares method and RANSAC method can perform fitting and denoising on a single subway tunnel section. To quantitatively assess the denoising effectiveness of these two methods in comparison with the method proposed in this study, the evaluation metrics are calculated for the point cloud of all cross-sections of the entire tunnel as well as the selected top tunnel point cloud (60°–120°) of each section. The average values of the evaluation metrics for all cross-sections are calculated, and the results are compared with those of the Huber loss-based method proposed in this study. The calculation results are shown in
Table 6 and
Table 7.
In addition, this study also recorded the time required for these three methods to process the same data (the entire 40 m long tunnel point cloud) under the same experimental conditions, including computer performance and system environment. The results are shown in
Table 8.
Based on the actual fitting effect images and the calculated evaluation metrics, it can be observed that both the least-squares method and RANSAC can fit ellipses that approximately match the tunnel section. However, these two methods are significantly affected by noise, resulting in fitting outcomes that deviate considerably from the actual geometric characteristics of the tunnel lining structure.
Additionally, due to the iterative nature of the RANSAC method in searching for the optimal model, its computational efficiency is relatively lower. Furthermore, both the least-squares method and RANSAC exhibit poor fitting results for the lower part of the tunnel lining structure. In contrast, the ellipse-fitting method based on the Huber loss function can effectively perform fitting and denoising for tunnel point clouds containing noise. The fitting results align closely with the tunnel lining structure, and the method maintains good computational efficiency.