1. Introduction
In recent years, the rapid advancement of 3D scanning technologies has led to an explosion of point cloud data in various fields, including computer vision, robotics, and autonomous systems [1,2,3,4]. As raw digitizations of physical structures, however, point clouds frequently contain inherent imperfections such as noise and outliers that degrade the accuracy of downstream applications, including segmentation and 3D reconstruction. This data quality challenge makes robust denoising algorithms a fundamental preprocessing requirement for reliable 3D analysis [5,6,7,8,9,10].
Among the various denoising techniques, clustering-based methods have shown promising results due to their ability to group similar points together and distinguish between signal and noise [11,12]. While conventional K-means clustering is effective in regular data spaces, its dependence on an isotropic distance measure creates fundamental limitations when processing irregular 3D structures. Specifically, the Euclidean metric fails to adequately characterize the manifold geometry inherent to point cloud data, particularly under spatially varying noise conditions.
Our methodological contribution resolves this through a density-aware clustering framework that adapts to local topological characteristics [13,14]. Given the differences in local geometric structure between noise and valid data on the manifold, points with similar local features should be grouped into the same cluster. Previous studies [15,16] have employed the K-means clustering algorithm for point cloud denoising, but they relied on only a limited number of metrics to measure distances on the manifold. Furthermore, these studies overlooked the influence function, which is crucial for evaluating the robustness of the mean derived from each metric in the presence of outliers.
Conversely, [17] proposed a point cloud denoising method leveraging the geometric structure of the Gaussian distribution manifold, employing five distinct measures: Euclidean distance, the Affine-Invariant Riemannian Metric (AIRM), the Log-Euclidean Metric (LEM), the Kullback–Leibler Divergence (KLD), and its symmetric variant (SKLD). That study evaluated metric robustness by deriving influence functions for the mean estimators under outlier conditions. Experimental results revealed that geometric metrics outperformed Euclidean measurements in denoising quality, with geometric means showing greater outlier resistance than arithmetic means. Notably, KLD and SKLD demonstrated computational advantages over AIRM and LEM through reduced complexity.
However, [17] had three key limitations: a lack of theoretical bounds for its influence functions, high computational complexity arising from iterative mean calculations, and the absence of quantitative measures for local geometric variations.
More recently, Xu et al. [10] introduced TDNet, a deep learning approach achieving state-of-the-art results but requiring substantial GPU resources that limit real-time applications. Parallel developments in Bregman divergences show promising theoretical properties yet remain underexplored for 3D data: Liu et al. [18] established the Total Bregman Divergence (TBD) for shape retrieval tasks, while Hua et al. [19] adapted TBD for radar detection without extending it to 3D point processing. To address these gaps, we introduce TBD, a divergence class that measures the orthogonal deviation between a convex differentiable function's value at one point and its tangent approximation at another. To our knowledge, this work presents the first unified framework that integrates TBD with K-means clustering for manifold-aware point cloud denoising, bridging geometric robustness with computational efficiency. As shown in [18], TBD's orthogonal projection property offers fundamental advantages over the framework of [17]. These include closed-form mean solutions that avoid iterative optimization (Section 2.2), anisotropy indices for quantifying local geometric distortions (Section 4.1), and bounded influence functions supported by strict theoretical guarantees (Section 4.2).
Building on [17], our TBD-K-means framework provides three fundamental advances:
The TBD framework introduces closed-form mean solutions derived through orthogonal projection. This fundamental innovation eliminates the need for iterative optimization in mean estimation procedures.
Our method establishes theoretically bounded influence functions with strict mathematical guarantees. These bounded functions provide inherent robustness against outliers and data perturbations.
TBD develops novel anisotropy indices for quantifying local geometric distortions. These indices enable precise characterization of manifold structures in complex data spaces.
This paper is structured into six main sections.
Section 1 introduces the importance of point cloud denoising and highlights the limitations of traditional K-means clustering approaches based on Euclidean distance.
Section 2 delves into the theoretical foundations, defining TBD and deriving the means of several positive-definite matrices.
Section 3 proposes a novel K-means clustering algorithm that leverages TBD for point cloud denoising.
Section 4 provides an in-depth analysis of the anisotropy indices of Bregman divergence and the influence functions of the TBD means.
Section 5 presents simulation results and performance comparisons, demonstrating the effectiveness of the proposed algorithm.
Section 6 concludes by summarizing the main contributions and discussing potential applications and future research directions.
2. Geometry on the Manifold of Symmetric Positive-Definite Matrices
The collection of $n \times n$ real matrices is denoted by $\mathbb{R}^{n \times n}$, whereas the subset comprising all invertible $n \times n$ real matrices forms the general linear group, denoted as $GL(n, \mathbb{R})$. Notably, $GL(n, \mathbb{R})$ possesses a differentiable manifold structure, with $\mathbb{R}^{n \times n}$ functioning as its Lie algebra, denoted as $\mathfrak{gl}(n)$. The exchange of information between $GL(n, \mathbb{R})$ and $\mathfrak{gl}(n)$ is facilitated by exponential and logarithmic mappings. In particular, the exponential map, given by
$$\exp(X) = \sum_{k=0}^{\infty} \frac{X^{k}}{k!},$$
converts a matrix $X$ in $\mathfrak{gl}(n)$ to an element of $GL(n, \mathbb{R})$. Conversely, for an invertible matrix $A$ devoid of eigenvalues on the closed negative real axis, there exists a unique logarithm whose eigenvalues lie within the strip $\{z \in \mathbb{C} : -\pi < \operatorname{Im}(z) < \pi\}$. This logarithm, serving as the inverse of the exponential map, is termed the principal logarithm and is denoted by $\log(A)$.
Let $\mathrm{Sym}(n)$ denote the space of real symmetric $n \times n$ matrices:
$$\mathrm{Sym}(n) = \{A \in \mathbb{R}^{n \times n} : A^{T} = A\}.$$
The subset of symmetric positive-definite (SPD) matrices forms the Riemannian manifold
$$\mathrm{SPD}(n) = \{A \in \mathrm{Sym}(n) : x^{T} A x > 0 \ \text{for all} \ x \in \mathbb{R}^{n} \setminus \{0\}\}.$$
Three fundamental metric structures on $\mathrm{SPD}(n)$ are considered:
- (i) Euclidean (Frobenius) framework: the canonical inner product on $\mathrm{Sym}(n)$ is defined as
$$\langle A, B \rangle = \operatorname{tr}(A^{T} B),$$
inducing the norm $\|A\|_{F} = \sqrt{\operatorname{tr}(A^{T} A)}$ and the metric distance
$$d_{E}(A, B) = \|A - B\|_{F}.$$
The tangent space at any $A \in \mathrm{SPD}(n)$ coincides with $\mathrm{Sym}(n)$ due to $\mathrm{SPD}(n)$'s open submanifold structure in $\mathrm{Sym}(n)$.
- (ii) Affine-Invariant Riemannian Metric (AIRM): the Riemannian metric at $A \in \mathrm{SPD}(n)$ is given as follows:
$$\langle X, Y \rangle_{A} = \operatorname{tr}(A^{-1} X A^{-1} Y), \qquad X, Y \in \mathrm{Sym}(n).$$
This induces the geodesic distance [20,21]:
$$d_{\mathrm{AIRM}}(A, B) = \|\log(A^{-1/2} B A^{-1/2})\|_{F}.$$
- (iii) Log-Euclidean Metric (LEM): through the logarithmic group operation $A \odot B = \exp(\log A + \log B)$, $\mathrm{SPD}(n)$ becomes a Lie group. The metric at $A$ is defined via differential mappings:
$$\langle X, Y \rangle_{A} = \langle D_{A}\log(X), D_{A}\log(Y) \rangle,$$
where $D_{A}\log$ denotes the differential of the matrix logarithm at $A$. The corresponding distance becomes
$$d_{\mathrm{LEM}}(A, B) = \|\log A - \log B\|_{F},$$
effectively Euclideanizing the manifold geometry through logarithmic coordinates.
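For concreteness, the two geometric distances above can be evaluated directly in MATLAB; the following is a minimal sketch with illustrative inputs (the variable names are not from the paper).

```matlab
% AIRM and LEM distances between two SPD matrices (illustrative inputs).
A = [2 0.5; 0.5 1];
B = [1 0.2; 0.2 3];

S = sqrtm(A);                                  % A^{1/2}
d_airm = norm(logm(S \ B / S), 'fro');         % affine-invariant geodesic distance
d_lem  = norm(logm(A) - logm(B), 'fro');       % Log-Euclidean distance
fprintf('AIRM: %.4f, LEM: %.4f\n', d_airm, d_lem);
```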
2.1. Bregman Divergence on Manifold
For matrices $A, B \in \mathrm{Sym}(n)$, the Bregman matrix divergence associated with a strictly convex differentiable potential function $\varphi$ is given by
$$D_{\varphi}(A, B) = \varphi(A) - \varphi(B) - \langle \nabla\varphi(B), A - B \rangle,$$
where $\langle \cdot, \cdot \rangle$ denotes the Frobenius inner product [22].
This divergence can be systematically extended to a Total Bregman Divergence (TBD) through the following formulation. For invertible matrices $A, B$, the TBD is defined as follows [23]:
$$\delta_{\varphi}(A, B) = \frac{\varphi(A) - \varphi(B) - \langle \nabla\varphi(B), A - B \rangle}{\sqrt{1 + \|\nabla\varphi(B)\|_{F}^{2}}}.$$
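The normalization by the gradient norm is what distinguishes the TBD from the classical Bregman divergence. A minimal MATLAB sketch of both quantities follows; the log-determinant potential used here is only an illustrative strictly convex choice, not necessarily the potential adopted in this paper.

```matlab
% Bregman divergence and total Bregman divergence for two SPD matrices.
A = [2 0.3; 0.3 1];
B = eye(2);

phi     = @(X) -log(det(X));   % illustrative strictly convex potential
gradPhi = @(X) -inv(X);        % its gradient under the Frobenius inner product

G   = gradPhi(B);
bd  = phi(A) - phi(B) - trace(G' * (A - B));   % classical Bregman divergence
tbd = bd / sqrt(1 + norm(G, 'fro')^2);         % total Bregman divergence
fprintf('BD = %.4f, TBD = %.4f\n', bd, tbd);
```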
When calculating the divergence associated with a certain convex function $\varphi$, the Riemannian gradient of $\varphi$ is often needed. It can be obtained from the covariant (directional) derivative related to the Riemannian metric as follows:
$$\langle \nabla\varphi(A), X \rangle_{A} = \frac{d}{dt}\Big|_{t=0} \varphi(\gamma(t)),$$
with the curve $\gamma(t)$ satisfying $\gamma(0) = A$ and $\dot{\gamma}(0) = X$. After linearizing the curve as $\gamma(t) = A + tX$, (12) can be rewritten as follows:
$$\langle \nabla\varphi(A), X \rangle = \frac{d}{dt}\Big|_{t=0} \varphi(A + tX).$$
Proposition 1.
For any nonsingular matrix satisfying the spectral exclusion condition (no eigenvalues on the closed negative real axis), there exists a well-defined matrix entropy functional, which generalizes the corresponding scalar function to the matrix setting through trace-theoretic extension [23]. Then, the Riemannian gradient of this functional is given by (15). Furthermore, the logarithm divergence (LD) is expressed by (16), and the total logarithm divergence (TLD) is provided accordingly.
Proof. From (13), the Riemannian gradient is calculated by differentiating along the linearized curve; the remaining task is to evaluate the second term of the resulting expression, namely the differential term. Using Lemmas 3 and 4 in [17], this differential term is obtained, and thus (15) has been proven. Substituting the result into Equation (10), the proof of Equation (16) is complete. □
Similar to the proof methodology employed in Proposition 1, we can derive Propositions 2 and 3 through analogous arguments.
Proposition 2.
Consider the divergence induced from the corresponding convex potential function [22]. Then, the Riemannian gradient of this function is given in closed form; furthermore, the exponential divergence (ED) and the total exponential divergence (TED) are provided accordingly.
Proposition 3.
When A is invertible, consider the divergence induced from the corresponding convex potential function [22]. Then, the Riemannian gradient of this function is given in closed form; furthermore, the inverse divergence (ID) and the total inverse divergence (TID) are provided accordingly.
To analyze the differences among the TBD divergences defined on the positive-definite matrix manifold,
Figure 1 shows three-dimensional isosurfaces centered at the identity matrix, which are induced by TLD, TED, and TID, respectively. All of these isosurfaces are convex balls with non-spherical shapes, differing completely from the spherical isosurfaces in the context of the Euclidean metric.
2.2. TBD Means on $\mathrm{SPD}(n)$
In this subsection, we study the geometric mean of several symmetric positive-definite matrices induced by the TBD, by considering the minimization problem of an objective function.
For m positive real numbers $a_1, \dots, a_m$, the arithmetic mean is often denoted as $\bar{a}$ and expressed as
$$\bar{a} = \arg\min_{a} \sum_{i=1}^{m} |a - a_i|^{2},$$
where $|a - a_i|$ represents the absolute difference between $a$ and $a_i$, which signifies the distance separating the two points on the real number line. Analogously, the mean of m SPD matrices $A_1, \dots, A_m$ is the solution to
$$\bar{A} = \arg\min_{A \in \mathrm{SPD}(n)} \sum_{i=1}^{m} d^{2}(A, A_i).$$
Thus, the arithmetic mean of $A_1, \dots, A_m$ endowed with the Euclidean metric can be represented as follows:
$$\bar{A}_{E} = \frac{1}{m} \sum_{i=1}^{m} A_i.$$
And the mean of $A_1, \dots, A_m$ with the LEM is as follows [16]:
$$\bar{A}_{\mathrm{LEM}} = \exp\left(\frac{1}{m} \sum_{i=1}^{m} \log A_i\right).$$
The result indicates that the Log-Euclidean mean has Euclidean characteristics when considered in the logarithmic domain, whereas the AIRM mean lacks these properties [17].
The concept extends naturally to the TBD mean. For a strictly convex differentiable function $\varphi$ and matrices $A_1, \dots, A_m \in \mathrm{SPD}(n)$, the TBD mean is defined as follows:
$$\bar{A} = \arg\min_{A} \sum_{i=1}^{m} \delta_{\varphi}(A, A_i).$$
The strict convexity of $\varphi$ ensures that the Bregman divergence retains strict convexity in the variable $A$. Consequently, uniqueness of the TBD mean (30) follows directly from convex optimization principles, provided such a mean exists. To guarantee existence, the minimization must operate within a compact subset of $\mathrm{SPD}(n)$. Compactness ensures adherence to the Weierstrass extreme value theorem, which mandates the attainment of minima for continuous functions over closed and bounded domains, conditions inherently satisfied by the convex functional on this Riemannian manifold. Let us define the objective function $F(A)$ as follows:
$$F(A) = \sum_{i=1}^{m} \delta_{\varphi}(A, A_i).$$
According to (13), the Riemannian gradient of $F$ can be obtained. Then, by solving $\nabla F(\bar{A}) = 0$, the TBD mean can be expressed as
$$\nabla\varphi(\bar{A}) = \sum_{i=1}^{m} w_i \, \nabla\varphi(A_i), \quad \text{with the weight} \quad w_i = \frac{1 / \sqrt{1 + \|\nabla\varphi(A_i)\|_{F}^{2}}}{\sum_{j=1}^{m} 1 / \sqrt{1 + \|\nabla\varphi(A_j)\|_{F}^{2}}}.$$
Next, by substituting (15), (20), and (23) into (33), respectively, it is straightforward to obtain explicit expressions for the means corresponding to the TLD, TED, and TID.
Proposition 4.
The TLD mean of m SPD matrices is provided by (35), together with the corresponding weights. By comparing (29) and (35), it can be seen that the TLD mean is essentially a weighted version of the LEM mean.
Proposition 5.
The TED mean of m SPD matrices is provided by (37), together with the corresponding weights.
Proposition 6.
The TID mean of m SPD matrices is provided by (39), together with the corresponding weights.
3. K-Means Clustering Algorithm with TBDs
In the n-dimensional Euclidean space $\mathbb{R}^{n}$, we denote a point cloud of size $N$ as follows:
$$P = \{p_1, p_2, \dots, p_N\} \subset \mathbb{R}^{n}.$$
For each point $p_i \in P$, we first identify its local neighborhood $\mathcal{N}_i$ using the k-nearest neighbors algorithm. Subsequently, the intrinsic geometry of $\mathcal{N}_i$ is characterized by computing two statistical descriptors:
1. The centered mean vector $\mu_i = \mathbb{E}[p], \ p \in \mathcal{N}_i$, where $\mathbb{E}[\cdot]$ denotes empirical expectation;
2. The covariance operator $\Sigma_i = \mathbb{E}\left[(p - \mu_i)(p - \mu_i)^{T}\right]$, quantifying pairwise positional deviations.
These descriptors induce a parameterization of each neighborhood as a point on the statistical manifold of n-dimensional Gaussian distributions, where $f(x; \mu, \Sigma)$ denotes the probability density function. This geometric embedding facilitates subsequent analysis within the framework of information geometry. The point cloud $P$ undergoes local statistical encoding through this operator, generating its parametric representation in symmetric matrix space. This mapped ensemble, termed the statistical parameter point cloud, preserves neighborhood geometry through covariance descriptors while enabling manifold–coordinate analysis.
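The encoding step can be sketched in MATLAB as follows; the neighborhood size and the small diagonal jitter are illustrative choices, not values taken from the paper.

```matlab
% Encode each point of a cloud by the covariance matrix of its k-NN patch.
pc = pcread('teapot.ply');            % MATLAB's built-in teapot point cloud
P  = double(pc.Location);             % N x 3 coordinate matrix
k  = 30;                              % neighborhood size (illustrative)

idx = knnsearch(P, P, 'K', k);        % indices of the k nearest neighbors
N   = size(P, 1);
Sigma = zeros(3, 3, N);               % statistical parameter point cloud
for i = 1:N
    patch = P(idx(i, :), :);          % local neighborhood of point i
    Sigma(:, :, i) = cov(patch) + 1e-8 * eye(3);   % jitter keeps the matrix SPD
end
```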
Due to the topological homeomorphism between the statistical manifold of Gaussian distributions and the manifold of symmetric positive-definite matrices [24], the geometric structure on the former can be induced by assigning metrics on $\mathrm{SPD}(n)$. Accordingly, the distance between two encoded points is measured through the divergence between their parameter matrices, and the mean of a subset of the parameter point cloud is the corresponding matrix mean. The definitions of both the distance and the mean operators are contingent upon the specific metric structure imposed on $\mathrm{SPD}(n)$. In our proposed algorithm, these fundamental statistical measures are directly computed through TBDs, which play a crucial role in establishing the geometric framework for subsequent computations.
The intrinsic statistical properties of valid and stochastic noise components exhibit fundamental divergences in their local structural organizations. This statistical separation forms the basis for our implementation of a K-means clustering framework to partition the complete dataset
into distinct phenomenological categories: structured information carriers and unstructured random perturbations. The formal procedure for this discriminative clustering operation is systematically outlined in Algorithm 1.
Algorithm 1: Signal–Noise Discriminative Clustering Framework
1. Point Cloud Parametrization: encode the point cloud into its statistical parameter point cloud of covariance descriptors.
2. Applying the K-means Algorithm:
- Step a: Barycenters Setup. Assign initial barycenters for the two clusters using the statistical descriptors of any two distinct points of the encoded point cloud (the two initial indices are arbitrary).
- Step b: Grouping with K-means. Assign each encoded point to the cluster whose barycenter is nearest under the selected TBD.
- Step c: Updating Barycenters. Recalculate the barycenters for the two clusters based on the new clustering results and update them accordingly.
- Step d: Convergence Check. Set a convergence threshold and a convergence condition. Upon satisfying the convergence criteria, project the partitioned clusters back to the original point cloud, then finalize the computation with categorical assignments. If the convergence condition is not met, repeat Steps b, c, and d.
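A minimal MATLAB sketch of this two-cluster procedure is given below. The function name, the choice of the two initial descriptors, and the log-Euclidean-style barycenter update are illustrative assumptions; in the actual algorithm the closed-form TBD means of Propositions 4–6 and the corresponding TBD distances would be used.

```matlab
function labels = tbdKmeansSketch(Sigma, tbdDist, tol, maxIter)
% Two-cluster K-means over SPD descriptors (save as tbdKmeansSketch.m).
%   Sigma   : n x n x N array of SPD descriptors (encoded point cloud)
%   tbdDist : function handle d(X, Y) for the chosen divergence
N = size(Sigma, 3);
B = {Sigma(:, :, 1), Sigma(:, :, round(N / 2))};   % Step a: initial barycenters
for it = 1:maxIter
    labels = zeros(N, 1);                          % Step b: nearest-barycenter assignment
    for i = 1:N
        d1 = tbdDist(Sigma(:, :, i), B{1});
        d2 = tbdDist(Sigma(:, :, i), B{2});
        labels(i) = 1 + (d2 < d1);
    end
    Bnew = cell(1, 2);                             % Step c: barycenter update
    for c = 1:2
        idx = find(labels == c);
        L = zeros(size(Sigma, 1));
        for j = idx'
            L = L + logm(Sigma(:, :, j));
        end
        Bnew{c} = expm(L / max(numel(idx), 1));    % log-domain mean as a stand-in
    end
    shift = norm(Bnew{1} - B{1}, 'fro') + norm(Bnew{2} - B{2}, 'fro');
    B = Bnew;
    if shift < tol, break; end                     % Step d: convergence check
end
end
```

For example, calling labels = tbdKmeansSketch(Sigma, @(X, Y) norm(logm(X) - logm(Y), 'fro')^2, 1e-6, 50) clusters the descriptors produced by the previous sketch under a log-Euclidean surrogate distance.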
4. Anisotropy Index and Influence Functions
This section establishes a unified analytical framework for evaluating geometric sensitivity and algorithmic robustness on $\mathrm{SPD}(n)$.
Section 4.1 introduces the anisotropy index as a key geometric descriptor, formally defining its relationship with fundamental metrics including the Euclidean metric, AIRM, and the Bregman divergences. Through variational optimization, we derive closed-form expressions for these indices, revealing their intrinsic connections to matrix spectral properties.
Section 4.2 advances robustness analysis through influence function theory, developing perturbation models for three central tensor means: TLD, TED, and TID. By quantifying sensitivity bounds under outlier contamination, we establish theoretical guarantees for each mean operator.
4.1. Anisotropy Index Related to Various Metrics
The discriminatory capacity of weighted positive-definite matrices manifests through their associated anisotropy measures. Defined intrinsically on the matrix manifold $\mathrm{SPD}(n)$, the anisotropy index quantifies local geometric distortion relative to isotropic configurations. For $A \in \mathrm{SPD}(n)$, the anisotropy measure relative to a given metric or divergence $d$ is as follows:
$$\mathrm{AI}(A) = \min_{\beta > 0} d^{2}(A, \beta I_n).$$
This index corresponds to the squared minimal projection distance from A to the subspace of scalar matrices $\{\beta I_n : \beta > 0\}$. Larger values indicate stronger anisotropic characteristics. For explicit computation, we minimize the metric-specific functional over the scalar parameter $\beta$.
Next, we systematically investigate the anisotropy indices induced by three fundamental geometries: the Frobenius metric representing Euclidean structure, AIRM characterizing curved manifold topology, and Bregman divergences rooted in information geometry. This trichotomy reveals how metric selection governs directional sensitivity analysis on $\mathrm{SPD}(n)$.
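As a simple illustration of the definition above in the Euclidean case (a sketch under the reconstructed form of (45); Proposition 7 states the precise result), the minimization over scalar matrices can be carried out explicitly:
$$\|A - \beta I_n\|_{F}^{2} = \operatorname{tr}(A^{2}) - 2\beta \operatorname{tr}(A) + n\beta^{2}, \qquad \beta^{\ast} = \frac{\operatorname{tr}(A)}{n}, \qquad \mathrm{AI}_{E}(A) = \operatorname{tr}(A^{2}) - \frac{\operatorname{tr}(A)^{2}}{n}.$$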
Following Equations (3) and (45), the anisotropy index associated with the Euclidean metric can be derived through direct computation.
Proposition 7.
The anisotropy index according to the Euclidean metric (3) at a point $A$ is given in closed form, together with the corresponding optimal scalar factor.
Proposition 8.
The anisotropy index associated with the AIRM at a point $A$ is formulated as (49), together with the corresponding optimal scalar factor.
Proof. Following (6) and (45), we form the AIRM functional measuring the distance from $A$ to the scalar matrices. Differentiating this functional with respect to the scalar parameter yields a stationarity condition. Applying Lemmas 14 and 15 from Reference [19], and denoting the eigenvalues of A by $\lambda_1, \dots, \lambda_n$, Equation (54) can be written in terms of these eigenvalues. Solving the stationarity condition completes the proof of Equation (49). □
Proposition 9.
The anisotropy index according to the Bregman divergence (16) at a point $A$ is formulated as (56), together with the corresponding optimal scalar factor.
Proof. Via (16) and (45), the associated functional and its derivative with respect to the scalar parameter are obtained; solving the resulting stationarity condition completes the proof of Equation (56). □
Proposition 10.
The anisotropy index according to the Bregman divergence (21) at a point $A$ is formulated as (60), together with the corresponding optimal scalar factor.
Proof. Via (21) and (45), the associated functional and its derivative with respect to the scalar parameter are obtained; solving the resulting stationarity condition completes the proof of Equation (60). □
Proposition 11.
The anisotropy index associated with the Bregman divergence (24) at a point $A$ is given by (64), together with the corresponding optimal scalar factor.
Proof. Via (24) and (45), the associated functional and its derivative with respect to the scalar parameter are obtained; solving the resulting stationarity condition completes the proof of Equation (64). □
4.2. Influence Functions
This subsection develops a robustness analysis framework through influence functions for symmetric positive-definite matrix-valued data. We systematically quantify the susceptibility of the TBD mean estimators under outlier contamination by deriving closed-form expressions of influence functions. Furthermore, we establish operator norm bounds of influence functions, thereby characterizing their stability margins in perturbed manifold learning scenarios.
Let $\bar{A}$ denote the TBD mean of the m SPD matrices $A_1, \dots, A_m$, and let $\tilde{A}$ denote the mean after adding a set of l outliers $O_1, \dots, O_l$ with a weight $\varepsilon$ to the original sample [25]. Therefore, $\tilde{A}$ can be written as
$$\tilde{A} = \bar{A} + \varepsilon H(\bar{A}) + O(\varepsilon^{2}),$$
which shows that $\tilde{A}$ is a perturbation of $\bar{A}$, and $H(\bar{A})$ is defined as the influence function. Let $G(A)$ denote the objective function to be minimized over the contaminated set of SPD matrices, formulated as the $\varepsilon$-weighted sum of divergences to the valid samples and to the outliers. Given that $\tilde{A}$ denotes the mean of the contaminated SPD set, the optimality condition requires $\nabla G(\tilde{A}) = 0$. By taking the derivative of this equation with respect to $\varepsilon$ and evaluating it at $\varepsilon = 0$, we obtain the equation (71) that determines the influence function.
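Schematically, and under one common weighting convention (an assumption; the paper's exact normalization of the contaminated objective is not reproduced here), the construction reads
$$G_{\varepsilon}(A) = \frac{1-\varepsilon}{m} \sum_{i=1}^{m} \delta_{\varphi}(A, A_i) + \frac{\varepsilon}{l} \sum_{j=1}^{l} \delta_{\varphi}(A, O_j), \qquad \nabla G_{\varepsilon}\big(\tilde{A}_{\varepsilon}\big) = 0,$$
and differentiating the optimality condition with respect to $\varepsilon$ at $\varepsilon = 0$ yields a linear equation for $H(\bar{A}) = \frac{d}{d\varepsilon} \tilde{A}_{\varepsilon} \big|_{\varepsilon = 0}$.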
Next, the influence functions of the TLD mean, the TED mean, and the TID mean are given by the following propositions.
Proposition 12.
The TLD mean of m SPD matrices and l outliers admits the perturbation expansion (72); furthermore, its influence function satisfies the bound (73).
Proof. Following from (15), we derive the perturbed optimality condition (74). Then, substituting (74) into (71), computing the trace on both sides, and considering the arbitrariness of the perturbation direction, we derive (72) for the TLD mean. Furthermore, it can be deduced that the influence function has an upper bound (73) that is independent of the outliers. □
Proposition 13.
The TED mean of m SPD matrices and l outliers admits the perturbation expansion (76); furthermore, its influence function satisfies the bound (77).
Proof. Following from (20), we derive the perturbed optimality condition (78). Then, substituting (78) into (71), computing the trace on both sides, and considering the arbitrariness of the perturbation direction, we derive (76) for the TED mean. Furthermore, it can be deduced that the influence function has an upper bound (77) that is independent of the outliers. □
Proposition 14.
The TID mean of m SPD matrices and l outliers admits the perturbation expansion (80); furthermore, its influence function satisfies the bound (81).
Proof. Following from (23), we derive the perturbed optimality condition (82). Then, substituting (82) into (71), computing the trace on both sides, and considering the arbitrariness of the perturbation direction, we derive (80) for the TID mean. Furthermore, it can be deduced that the influence function has an upper bound (81) that is independent of the outliers. □
While the influence function of the AIRM mean demonstrates unboundedness with respect to its input matrices [19], all of the TBD means exhibit bounded influence functions under equivalent conditions.
5. Simulations and Analysis
In the following simulations, the SPD matrices used in Algorithm 1 are generated according to (84), where the underlying factor is a square matrix randomly generated by MATLAB R2024a.
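One common way to produce such matrices is shown below; this is an assumption for illustration and is not necessarily the paper's exact generation rule (84).

```matlab
% Random SPD matrix built from a random square factor.
n = 3;
M = randn(n);                  % random square matrix
A = M * M' + n * eye(n);       % symmetric positive-definite by construction
```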
5.1. Simulations and Results
In this example, we apply Algorithm 1 to denoise the point cloud of a teapot, employing the TBDs and the metrics in [17]. The signal-to-noise ratio (SNR) is specified as 4148:1000. The experiments utilize MATLAB's built-in Teapot.ply dataset, a standard PLY-format 3D point cloud that serves as a benchmark resource for validating graphics processing algorithms and visualization techniques. Synthetic noise was injected following a hybrid uniform distribution over the X, Y, and Z coordinate ranges, with bounds corresponding to the coordinate limits. This explicitly violates Gaussian assumptions through bounded support, distributional asymmetry (non-negative in the Z-dimension), and multimodal density, comprising 41.8% of the teapot data.
The experimental results in Table 1 confirm the selected neighborhood size as optimal, achieving peak SNRG and minimum FPR. Performance degrades significantly away from this value: in one direction the FPR surges to 0.6430, while in the other the SNRG drops noticeably at the cost of a longer runtime. This data-driven analysis confirms that the chosen neighborhood size balances precision (max SNRG 1.439) and reliability (min FPR 0.410) for most scenarios.
To optimize indicator weights and enhance data visualization, we utilize Principal Component Analysis (PCA) to provide a holistic view of the covariance structure of all data. Figure 2a displays the initial distribution of valid data and noise prior to denoising, projected by PCA. Figure 2b then exhibits the transformed data distribution after applying TLD for denoising. In Figure 2c, the raw teapot point cloud before denoising is displayed. Figure 2d–f show the denoised results using TLD, TED, and TID, respectively, via Algorithm 1. Figure 3a–e present the denoising outcomes of the teapot point cloud employing the metrics from prior work [17]: Euclidean, AIRM, LEM, KLD, and SKLD. In these figures, red dots signify noise data, whereas blue dots indicate valid data. Figure 2 demonstrates that Algorithm 1 effectively partitions data points into two discrete clusters, achieving explicit separation between valid signals and noise components. The MATLAB implementation directly interfaces with industry-standard PLY/PCD formats via pcread/pcwrite and exports denoising metrics (TPR/FPR/SNRG) for pipeline integration.
5.2. Comparative Analysis of Influence Function Bounds
To rigorously validate the bounded sensitivity of TBD-induced means against Riemannian alternatives, we conducted extensive simulations under pathological outlier conditions. The experimental setup consisted of randomly generated SPD matrices as valid samples and intentionally malformed outliers. These outliers were generated through spectral decomposition with controlled eigenvalues, $O = Q \operatorname{diag}(\lambda_1, \dots, \lambda_n) Q^{T}$, with Q a random orthogonal matrix and the eigenvalues spread over many orders of magnitude. This created severely ill-conditioned matrices with very large condition numbers, simulating challenging scenarios encountered in real-world point clouds. For robust statistical analysis, 100 independent trials were performed, with influence function norms computed according to Propositions 12–14 for the TBD means and comparable derivations for the Riemannian baselines.
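A minimal MATLAB sketch of this outlier construction follows; the particular eigenvalue spread is an illustrative assumption rather than the paper's exact setting.

```matlab
% Severely ill-conditioned SPD outlier via spectral decomposition.
n = 3;
[Q, ~] = qr(randn(n));            % random orthogonal eigenbasis
lambda = [1e-6, 1, 1e6];          % widely spread spectrum (condition number ~1e12)
O = Q * diag(lambda) * Q';        % ill-conditioned SPD outlier
```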
As demonstrated in Figure 4, TBD-based means consistently outperformed Riemannian alternatives in outlier resistance. The AIRM mean exhibited unbounded sensitivity with significantly higher influence norms, while LEM showed moderately reduced but still considerable sensitivity. In stark contrast, all TBD means maintained the strict theoretical bounds established in Propositions 12–14. Specifically, TLD achieved approximately 60% lower influence norms than AIRM, with TED showing intermediate robustness. TID demonstrated the strongest outlier resistance with the tightest bounds, nearly three times more constrained than AIRM, and the lowest variance across trials. This superior performance stems from TBD's intrinsic weight attenuation mechanism, which dynamically suppresses pathological outliers while preserving the geometry of valid data.
5.3. Analysis of TBD Mean Stability
In Figure 5, we compare loss functions across eight metrics (Euclidean, AIRM, LEM, KLD, and SKLD from prior work [17], together with the proposed TLD/TED/TID) using 100 trials of 10 randomly generated SPD matrices from (84). Three critical observations emerge:
- (1)
TLD and TID achieve 60–80% lower loss than Euclidean means, confirming their enhanced stability against gradient pathologies;
- (2)
TED shows marginally higher loss than TLD/TID but still outperforms the geometric metrics (KLD/SKLD) from [17];
- (3)
The proposed TBD variants exhibit minimal variance across trials, indicating superior robustness to initialization compared to Riemannian metrics like AIRM/LEM.
These results validate that TBD-induced means are geometrically better suited to $\mathrm{SPD}(n)$ than both Euclidean and prior manifold metrics, enabling faster convergence to optimal cluster centroids.
The superior convergence of the loss functions for Total Bregman Divergence (TBD) versus traditional Bregman Divergence (BD), as demonstrated in Figure 6, stems fundamentally from TBD's orthogonal projection property defined in (11). This formulation measures the orthogonal distance between function values and tangent approximations, granting TBD invariance to coordinate transformations (rotations and scalings). In contrast, traditional BD (10) exhibits only translational invariance. This geometric distinction enables TBD to maintain stable loss minimization, with consistently lower loss values than BD in Figure 6, by adaptively weighting divergences according to local manifold curvature (the gradient-norm term in the denominator), thereby accelerating convergence while resisting noise-induced perturbations.
We evaluate denoising performance using three key metrics: True Positive Rate (TPR), False Positive Rate (FPR), and Signal-to-Noise Ratio Growth (SNRG), defined in terms of the counts of True Positives (TPs), False Positives (FPs), False Negatives (FNs), and True Negatives (TNs), together with the original valid point count and the original noise point count. SNRG quantifies the relative enhancement in signal purity, with positive values indicating improved separation.
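The classification-style metrics can be computed directly from the ground-truth and predicted validity flags, as in the MATLAB sketch below; the SNRG expression shown is one plausible reading of relative SNR growth and may differ in detail from the paper's formal definition.

```matlab
% Denoising metrics from ground-truth and predicted validity flags.
truth = rand(1000, 1) > 0.3;               % illustrative ground truth (true = valid)
pred  = truth;  pred(1:80) = ~pred(1:80);  % a predicted labeling with some errors

TP = sum( truth &  pred);  FN = sum( truth & ~pred);
FP = sum(~truth &  pred);  TN = sum(~truth & ~pred);

TPR = TP / (TP + FN);                      % true positive rate
FPR = FP / (FP + TN);                      % false positive rate
snr_before = sum(truth) / sum(~truth);     % original valid/noise ratio
snr_after  = TP / max(FP, 1);              % valid/noise ratio among retained points
SNRG = (snr_after - snr_before) / snr_before;   % assumed relative SNR growth
fprintf('TPR = %.3f, FPR = %.3f, SNRG = %.3f\n', TPR, FPR, SNRG);
```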
Table 2 benchmarks the denoising efficacy of Algorithm 1 under varying noise levels. At SNR = 10, KLD achieves the optimal FPR (44.69%) and SNRG (123.78%), while the proposed TID ranks second in FPR (46.22%) and outperforms AIRM/LEM by 18.3% in SNRG. Notably, at SNR = 2, TLD and TED achieve perfect signal preservation (100% TPR), unmatched by any baseline, though KLD maintains superior SNRG (243.43% vs. TID's 232.53%).
Under extreme noise (SNR = 1), TID demonstrates dominant performance with the lowest FPR (23.02%) and highest SNRG (279.90%), exceeding AIRM/LEM's SNRG by 146%. Concurrently, TED achieves the best TPR (96.87%), outperforming SKLD by 18.1%. Crucially, while KLD and SKLD collapse (SNRG < 37%), all proposed TBD variants maintain SNRG > 168%, highlighting their robustness in pathological conditions.
Figure 7 demonstrates the denoising performance of the TBD variants across noise levels, with quantitative metrics aligning with Table 2. At SNR = 10, all methods achieve near-complete signal preservation (100% TPR) while effectively reducing noise, though TID's superior SNRG (114.51%) corresponds to marginally cleaner outputs. Under SNR = 2 conditions, TLD and TED maintain perfect signal recovery (100% TPR) despite residual noise, while TID's balanced performance (232.53% SNRG) preserves structural integrity. In extreme SNR = 1 scenarios, TID's noise rejection (23.02% FPR) maintains global coherence, whereas TED achieves optimal signal retention (96.87% TPR) despite increased false positives. Visual results consistently reflect each method's quantitative trade-offs between signal preservation and noise suppression.
5.4. Analysis of Computational Complexity
We first analyze the computational complexity of the mean operators induced by four metrics: Euclidean, TLD, TED, and TID, mathematically defined in Equations (28), (35), (37), and (39). Subsequently, we quantify the computational load of the corresponding influence functions, with formulations specified in (72), (76), (80), and Proposition 1 of [17].
Under the framework of m SPD matrices and l outliers, we systematically evaluate computational costs. Considering single-element operations as O(1) and matrices of size $n \times n$, fundamental operations reveal critical patterns: matrix inversion and matrix logarithms both require $O(n^{3})$ computations. Operations involving matrix exponentials and matrix roots of SPD matrices necessitate eigenvalue decomposition, consequently also maintaining $O(n^{3})$ complexity.
Detailed analysis reveals that the arithmetic mean (28) exhibits $O(mn^{2})$ complexity, since it involves only entrywise summation of m matrices. For influence functions, Euclidean-based estimation requires correspondingly fewer operations. This establishes a critical trade-off: while Algorithm 1's Euclidean metric demonstrates inferior denoising efficacy compared to the TLD/TED/TID variants, its computational economy in mean matrix calculation and influence function estimation surpasses the geometric counterparts. The complexity disparity stems from TLD/TED/TID's intrinsic reliance on matrix decompositions (logarithms, exponentials, and inverses) that are absent in the Euclidean framework.
6. Conclusions
This study introduces a novel K-means clustering algorithm based on Total Bregman Divergence for robust point cloud denoising. Traditional Euclidean-based K-means methods often fail to address non-uniform noise distributions due to their limited geometric sensitivity. To overcome this, TBDs—Total Logarithm Divergence, Total Exponential Divergence, and Total Inverse Divergence—are proposed on the manifold of symmetric positive-definite matrices. These divergences are designed to model distinct local geometric structures, enabling more effective separation of noise from valid data. Theoretical contributions include the derivation of anisotropy indices to quantify structural variations and the analysis of influence functions, which demonstrate the bounded sensitivity of TBD-induced means to outliers.
Numerical experiments on synthetic and real-world datasets (e.g., 3D teapot point clouds) validate the algorithm's superiority over Euclidean-based approaches. Results highlight improved noise separation, enhanced stability, and adaptability to complex noise patterns. The proposed framework bridges geometric insights from information geometry with practical clustering techniques, offering a scalable and robust preprocessing solution for applications in computer vision, robotics, and autonomous systems. This work underscores the potential of manifold-aware metrics in advancing point cloud processing and opens avenues for further exploration of divergence-based methods in high-dimensional data analysis. While the current implementation handles mid-scale clouds (<10 k points), scaling to massive point clouds (>1 M points) requires parallel neighborhood computation and spatial partitioning techniques, which constitute important future work.