1. Introduction
In recent years, the rapid advancement of 3D scanning technologies has led to an explosion of point cloud data in various fields, including computer vision, robotics, and autonomous systems [1,2,3,4]. As raw digitizations of physical structures, however, point clouds frequently contain inherent imperfections such as noise and outliers that degrade the accuracy of downstream applications, including segmentation and 3D reconstruction. This data quality challenge makes robust denoising algorithms a fundamental preprocessing requirement for reliable 3D analysis [5,6,7,8,9,10].
Among the various denoising techniques, clustering-based methods have shown promising results due to their ability to group similar points together and distinguish between signal and noise [11,12]. While conventional K-means clustering is effective in regular data spaces, its dependence on an isotropic distance measure creates fundamental limitations when processing irregular 3D structures. Specifically, the Euclidean metric fails to adequately characterize the manifold geometry inherent to point cloud data, particularly under spatially varying noise conditions.
Our methodological contribution resolves this through a density-aware clustering framework that adapts to local topological characteristics [13,14]. Given the differences in local geometric structure between noise and valid data on the manifold, points with similar local features should be grouped into the same cluster. Previous studies [15,16] have employed the K-means clustering algorithm for point cloud denoising, but they relied on only a limited number of metrics to measure distances on the manifold. Furthermore, these studies overlooked the influence function, which is crucial for evaluating the robustness of the mean derived from each metric in the presence of outliers.
Conversely, [17] proposed a point cloud denoising method leveraging the geometric structure of the Gaussian distribution manifold, employing five distinct measures: Euclidean distance, the Affine-Invariant Riemannian Metric (AIRM), the Log-Euclidean Metric (LEM), the Kullback–Leibler Divergence (KLD), and its symmetric variant (SKLD). That study evaluated metric robustness by deriving influence functions for the mean estimators under outlier conditions. Experimental results revealed that geometric metrics outperformed Euclidean measurements in denoising quality, with geometric means showing greater outlier resistance than arithmetic means. Notably, KLD and SKLD demonstrated computational advantages over AIRM and LEM through reduced complexity.
However, [17] had three key limitations: a lack of theoretical bounds for its influence functions, high computational complexity arising from iterative mean calculations, and the absence of quantitative measures for local geometric variations.
More recently, Xu et al. [10] introduced TDNet, a deep learning approach achieving state-of-the-art results but requiring substantial GPU resources that limit real-time applications. Parallel developments in Bregman divergences show promising theoretical properties yet remain underexplored for 3D data: Liu et al. [18] established the Total Bregman Divergence (TBD) for shape retrieval tasks, while Hua et al. [19] adapted TBD for radar detection without extending it to 3D point processing. To address these gaps, we introduce TBD, a divergence class that measures the orthogonal deviation between a convex differentiable function's value at one point and its tangent approximation at another. To our knowledge, this work presents the first unified framework that integrates TBD with K-means clustering for manifold-aware point cloud denoising, bridging geometric robustness with computational efficiency. As shown in [18], TBD's orthogonal projection property offers fundamental advantages over the framework of [17]. These include closed-form mean solutions that avoid iterative optimization (Section 2.2), anisotropy indices for quantifying local geometric distortions (Section 4.1), and bounded influence functions supported by strict theoretical guarantees (Section 4.2).
Building on [17], our TBD-K-means framework provides three fundamental advances:
The TBD framework introduces closed-form mean solutions derived through orthogonal projection. This fundamental innovation eliminates the need for iterative optimization in mean estimation procedures.
Our method establishes theoretically bounded influence functions with strict mathematical guarantees. These bounded functions provide inherent robustness against outliers and data perturbations.
TBD develops novel anisotropy indices for quantifying local geometric distortions. These indices enable precise characterization of manifold structures in complex data spaces.
This paper is structured into six main sections.
Section 1 introduces the importance of point cloud denoising and highlights the limitations of traditional K-means clustering approaches based on Euclidean distance.
Section 2 delves into the theoretical foundations, defining TBD and deriving the means of several positive-definite matrices.
Section 3 proposes a novel K-means clustering algorithm that leverages TBD for point cloud denoising.
Section 4 provides an in-depth analysis of the anisotropy indices of Bregman divergence and the influence functions of the TBD means.
Section 5 presents simulation results and performance comparisons, demonstrating the effectiveness of the proposed algorithm.
Section 6 concludes by summarizing the main contributions and discussing potential applications and future research directions.
2. Geometry on the Manifold of Symmetric Positive-Definite Matrices
The collection of $n \times n$ real matrices is denoted by $\mathbb{R}^{n \times n}$, whereas the subset comprising all invertible $n \times n$ real matrices forms the general linear group, denoted as $GL(n, \mathbb{R})$. Notably, $GL(n, \mathbb{R})$ possesses a differentiable manifold structure, with $\mathbb{R}^{n \times n}$ functioning as its Lie algebra, denoted as $\mathfrak{gl}(n)$. The exchange of information between $GL(n, \mathbb{R})$ and $\mathfrak{gl}(n)$ is facilitated by exponential and logarithmic mappings. In particular, the exponential map, given by
$$\exp(X) = \sum_{k=0}^{\infty} \frac{X^{k}}{k!},$$
converts a matrix $X$ in $\mathfrak{gl}(n)$ to an element of $GL(n, \mathbb{R})$. Conversely, for an invertible matrix $A$ devoid of eigenvalues on the closed negative real axis, there exists a unique logarithm whose eigenvalues lie within the strip $\{z \in \mathbb{C} : -\pi < \operatorname{Im}(z) < \pi\}$. This logarithm, serving as the inverse of the exponential map, is termed the principal logarithm and is denoted by $\log(A)$.
Let $\mathrm{Sym}(n)$ denote the space of real symmetric $n \times n$ matrices:
$$\mathrm{Sym}(n) = \{A \in \mathbb{R}^{n \times n} : A^{T} = A\}.$$
The subset of symmetric positive-definite (SPD) matrices forms the Riemannian manifold
$$\mathrm{SPD}(n) = \{A \in \mathrm{Sym}(n) : x^{T} A x > 0 \ \text{for all} \ x \in \mathbb{R}^{n} \setminus \{0\}\}.$$
Three fundamental metric structures on $\mathrm{SPD}(n)$ are considered:
- (i) Euclidean (Frobenius) framework: the canonical inner product on $\mathrm{Sym}(n)$ is defined as
$$\langle A, B \rangle = \operatorname{tr}(A^{T} B),$$
inducing the norm $\|A\|_{F} = \sqrt{\operatorname{tr}(A^{T} A)}$ and the metric distance
$$d_{E}(A, B) = \|A - B\|_{F}.$$
The tangent space at any $A \in \mathrm{SPD}(n)$ coincides with $\mathrm{Sym}(n)$ due to $\mathrm{SPD}(n)$'s open submanifold structure in $\mathrm{Sym}(n)$.
- (ii) Affine-Invariant Riemannian Metric (AIRM): the Riemannian metric at $A \in \mathrm{SPD}(n)$ is given as follows:
$$\langle X, Y \rangle_{A} = \operatorname{tr}(A^{-1} X A^{-1} Y), \qquad X, Y \in \mathrm{Sym}(n).$$
This induces the geodesic distance [20,21]:
$$d_{\mathrm{AIRM}}(A, B) = \|\log(A^{-1/2} B A^{-1/2})\|_{F}.$$
- (iii) Log-Euclidean Metric (LEM): through the logarithmic group operation $A \odot B = \exp(\log A + \log B)$, $\mathrm{SPD}(n)$ becomes a Lie group. The metric at $A$ is defined via differential mappings:
$$\langle X, Y \rangle_{A} = \langle D_{A}\log(X), D_{A}\log(Y) \rangle,$$
where $D_{A}\log$ denotes the differential of the matrix logarithm at $A$. The corresponding distance becomes
$$d_{\mathrm{LEM}}(A, B) = \|\log A - \log B\|_{F},$$
effectively Euclideanizing the manifold geometry through logarithmic coordinates.
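For concreteness, the two geometric distances above can be evaluated directly in MATLAB; the following is a minimal sketch with illustrative inputs (the variable names are not from the paper).

```matlab
% AIRM and LEM distances between two SPD matrices (illustrative inputs).
A = [2 0.5; 0.5 1];
B = [1 0.2; 0.2 3];

S = sqrtm(A);                                  % A^{1/2}
d_airm = norm(logm(S \ B / S), 'fro');         % affine-invariant geodesic distance
d_lem  = norm(logm(A) - logm(B), 'fro');       % Log-Euclidean distance
fprintf('AIRM: %.4f, LEM: %.4f\n', d_airm, d_lem);
```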
2.1. Bregman Divergence on Manifold
For matrices $A, B \in \mathrm{Sym}(n)$, the Bregman matrix divergence associated with a strictly convex differentiable potential function $\varphi$ is given by
$$D_{\varphi}(A, B) = \varphi(A) - \varphi(B) - \langle \nabla\varphi(B), A - B \rangle,$$
where $\langle \cdot, \cdot \rangle$ denotes the Frobenius inner product [22].
This divergence can be systematically extended to a Total Bregman Divergence (TBD) through the following formulation. For invertible matrices $A, B$, the TBD is defined as follows [23]:
$$\delta_{\varphi}(A, B) = \frac{\varphi(A) - \varphi(B) - \langle \nabla\varphi(B), A - B \rangle}{\sqrt{1 + \|\nabla\varphi(B)\|_{F}^{2}}}.$$
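The normalization by the gradient norm is what distinguishes the TBD from the classical Bregman divergence. A minimal MATLAB sketch of both quantities follows; the log-determinant potential used here is only an illustrative strictly convex choice, not necessarily the potential adopted in this paper.

```matlab
% Bregman divergence and total Bregman divergence for two SPD matrices.
A = [2 0.3; 0.3 1];
B = eye(2);

phi     = @(X) -log(det(X));   % illustrative strictly convex potential
gradPhi = @(X) -inv(X);        % its gradient under the Frobenius inner product

G   = gradPhi(B);
bd  = phi(A) - phi(B) - trace(G' * (A - B));   % classical Bregman divergence
tbd = bd / sqrt(1 + norm(G, 'fro')^2);         % total Bregman divergence
fprintf('BD = %.4f, TBD = %.4f\n', bd, tbd);
```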
When calculating the divergence associated with a certain convex function $\varphi$, the Riemannian gradient of $\varphi$ is often needed. It can be obtained from the covariant (directional) derivative related to the Riemannian metric as follows:
$$\langle \nabla\varphi(A), X \rangle_{A} = \frac{d}{dt}\Big|_{t=0} \varphi(\gamma(t)),$$
with the curve $\gamma(t)$ satisfying $\gamma(0) = A$ and $\dot{\gamma}(0) = X$. After linearizing the curve as $\gamma(t) = A + tX$, (12) can be rewritten as follows:
$$\langle \nabla\varphi(A), X \rangle = \frac{d}{dt}\Big|_{t=0} \varphi(A + tX).$$
Proposition 1.
For any nonsingular matrix satisfying the spectral exclusion condition (no eigenvalues on the closed negative real axis), there exists a well-defined matrix entropy functional, which generalizes the corresponding scalar function to the matrix setting through trace-theoretic extension [23]. Then, the Riemannian gradient of this functional is given by (15). Furthermore, the logarithm divergence (LD) is expressed by (16), and the total logarithm divergence (TLD) is provided accordingly.
Proof. From (13), the Riemannian gradient is calculated by differentiating along the linearized curve; the remaining task is to evaluate the second term of the resulting expression, namely the differential term. Using Lemmas 3 and 4 in [17], this differential term is obtained, and thus (15) has been proven. Substituting the result into Equation (10), the proof of Equation (16) is complete. □
Similar to the proof methodology employed in Proposition 1, we can derive Propositions 2 and 3 through analogous arguments.
Proposition 2.
Consider the divergence induced from the corresponding convex potential function [22]. Then, the Riemannian gradient of this function is given in closed form; furthermore, the exponential divergence (ED) and the total exponential divergence (TED) are provided accordingly.
Proposition 3.
When A is invertible, consider the divergence induced from the corresponding convex potential function [22]. Then, the Riemannian gradient of this function is given in closed form; furthermore, the inverse divergence (ID) and the total inverse divergence (TID) are provided accordingly.
To analyze the differences among the TBD divergences defined on the positive-definite matrix manifold,
Figure 1 shows three-dimensional isosurfaces centered at the identity matrix, which are induced by TLD, TED, and TID, respectively. All of these isosurfaces are convex balls with non-spherical shapes, differing completely from the spherical isosurfaces in the context of the Euclidean metric.
2.2. TBD Means on $\mathrm{SPD}(n)$
In this subsection, we study the geometric mean of several symmetric positive-definite matrices induced by the TBD, by considering the minimization problem of an objective function.
For m positive real numbers $a_1, \dots, a_m$, the arithmetic mean is often denoted as $\bar{a}$ and expressed as
$$\bar{a} = \arg\min_{a} \sum_{i=1}^{m} |a - a_i|^{2},$$
where $|a - a_i|$ represents the absolute difference between $a$ and $a_i$, which signifies the distance separating the two points on the real number line. Analogously, the mean of m SPD matrices $A_1, \dots, A_m$ is the solution to
$$\bar{A} = \arg\min_{A \in \mathrm{SPD}(n)} \sum_{i=1}^{m} d^{2}(A, A_i).$$
Thus, the arithmetic mean of $A_1, \dots, A_m$ endowed with the Euclidean metric can be represented as follows:
$$\bar{A}_{E} = \frac{1}{m} \sum_{i=1}^{m} A_i.$$
And the mean of $A_1, \dots, A_m$ with the LEM is as follows [16]:
$$\bar{A}_{\mathrm{LEM}} = \exp\left(\frac{1}{m} \sum_{i=1}^{m} \log A_i\right).$$
The result indicates that the Log-Euclidean mean has Euclidean characteristics when considered in the logarithmic domain, whereas the AIRM mean lacks these properties [17].
The concept extends naturally to the TBD mean. For a strictly convex differentiable function $\varphi$ and matrices $A_1, \dots, A_m \in \mathrm{SPD}(n)$, the TBD mean is defined as follows:
$$\bar{A} = \arg\min_{A} \sum_{i=1}^{m} \delta_{\varphi}(A, A_i).$$
The strict convexity of $\varphi$ ensures that the Bregman divergence retains strict convexity in the variable $A$. Consequently, uniqueness of the TBD mean (30) follows directly from convex optimization principles, provided such a mean exists. To guarantee existence, the minimization must operate within a compact subset of $\mathrm{SPD}(n)$. Compactness ensures adherence to the Weierstrass extreme value theorem, which mandates the attainment of minima for continuous functions over closed and bounded domains, conditions inherently satisfied by the convex functional on this Riemannian manifold. Let us define the objective function $F(A)$ as follows:
$$F(A) = \sum_{i=1}^{m} \delta_{\varphi}(A, A_i).$$
According to (13), the Riemannian gradient of $F$ can be obtained. Then, by solving $\nabla F(\bar{A}) = 0$, the TBD mean can be expressed as
$$\nabla\varphi(\bar{A}) = \sum_{i=1}^{m} w_i \, \nabla\varphi(A_i), \quad \text{with the weight} \quad w_i = \frac{1 / \sqrt{1 + \|\nabla\varphi(A_i)\|_{F}^{2}}}{\sum_{j=1}^{m} 1 / \sqrt{1 + \|\nabla\varphi(A_j)\|_{F}^{2}}}.$$
Next, by substituting (15), (20), and (23) into (33), respectively, it is straightforward to obtain explicit expressions for the means corresponding to the TLD, TED, and TID.
Proposition 4.
The TLD mean of m SPD matrices is provided by (35), together with the corresponding weights. By comparing (29) and (35), it can be seen that the TLD mean is essentially a weighted version of the LEM mean.
Proposition 5.
The TED mean of m SPD matrices is provided by (37), together with the corresponding weights.
Proposition 6.
The TID mean of m SPD matrices is provided by (39), together with the corresponding weights.
3. K-Means Clustering Algorithm with TBDs
In the n-dimensional Euclidean space $\mathbb{R}^{n}$, we denote a point cloud of size $N$ as follows:
$$P = \{p_1, p_2, \dots, p_N\} \subset \mathbb{R}^{n}.$$
For each point $p_i \in P$, we first identify its local neighborhood $\mathcal{N}_i$ using the k-nearest neighbors algorithm. Subsequently, the intrinsic geometry of $\mathcal{N}_i$ is characterized by computing two statistical descriptors:
1. The centered mean vector $\mu_i = \mathbb{E}[p], \ p \in \mathcal{N}_i$, where $\mathbb{E}[\cdot]$ denotes empirical expectation;
2. The covariance operator $\Sigma_i = \mathbb{E}\left[(p - \mu_i)(p - \mu_i)^{T}\right]$, quantifying pairwise positional deviations.
These descriptors induce a parameterization of each neighborhood as a point on the statistical manifold of n-dimensional Gaussian distributions, where $f(x; \mu, \Sigma)$ denotes the probability density function. This geometric embedding facilitates subsequent analysis within the framework of information geometry. The point cloud $P$ undergoes local statistical encoding through this operator, generating its parametric representation in symmetric matrix space. This mapped ensemble, termed the statistical parameter point cloud, preserves neighborhood geometry through covariance descriptors while enabling manifold–coordinate analysis.
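The encoding step can be sketched in MATLAB as follows; the neighborhood size and the small diagonal jitter are illustrative choices, not values taken from the paper.

```matlab
% Encode each point of a cloud by the covariance matrix of its k-NN patch.
pc = pcread('teapot.ply');            % MATLAB's built-in teapot point cloud
P  = double(pc.Location);             % N x 3 coordinate matrix
k  = 30;                              % neighborhood size (illustrative)

idx = knnsearch(P, P, 'K', k);        % indices of the k nearest neighbors
N   = size(P, 1);
Sigma = zeros(3, 3, N);               % statistical parameter point cloud
for i = 1:N
    patch = P(idx(i, :), :);          % local neighborhood of point i
    Sigma(:, :, i) = cov(patch) + 1e-8 * eye(3);   % jitter keeps the matrix SPD
end
```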
Due to the topological homeomorphism between the statistical manifold of Gaussian distributions and the manifold of symmetric positive-definite matrices [24], the geometric structure on the former can be induced by assigning metrics on $\mathrm{SPD}(n)$. Accordingly, the distance between two encoded points is measured through the divergence between their parameter matrices, and the mean of a subset of the parameter point cloud is the corresponding matrix mean. The definitions of both the distance and the mean operators are contingent upon the specific metric structure imposed on $\mathrm{SPD}(n)$. In our proposed algorithm, these fundamental statistical measures are directly computed through TBDs, which play a crucial role in establishing the geometric framework for subsequent computations.
The intrinsic statistical properties of valid and stochastic noise components exhibit fundamental divergences in their local structural organizations. This statistical separation forms the basis for our implementation of a K-means clustering framework to partition the complete dataset
into distinct phenomenological categories: structured information carriers and unstructured random perturbations. The formal procedure for this discriminative clustering operation is systematically outlined in Algorithm 1.
Algorithm 1: Signal–Noise Discriminative Clustering Framework
1. Point Cloud Parametrization: encode the point cloud into its statistical parameter point cloud of covariance descriptors.
2. Applying the K-means Algorithm:
- Step a: Barycenters Setup. Assign initial barycenters for the two clusters using the statistical descriptors of any two distinct points of the encoded point cloud (the two initial indices are arbitrary).
- Step b: Grouping with K-means. Assign each encoded point to the cluster whose barycenter is nearest under the selected TBD.
- Step c: Updating Barycenters. Recalculate the barycenters for the two clusters based on the new clustering results and update them accordingly.
- Step d: Convergence Check. Set a convergence threshold and a convergence condition. Upon satisfying the convergence criteria, project the partitioned clusters back to the original point cloud, then finalize the computation with categorical assignments. If the convergence condition is not met, repeat Steps b, c, and d.
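A minimal MATLAB sketch of this two-cluster procedure is given below. The function name, the choice of the two initial descriptors, and the log-Euclidean-style barycenter update are illustrative assumptions; in the actual algorithm the closed-form TBD means of Propositions 4–6 and the corresponding TBD distances would be used.

```matlab
function labels = tbdKmeansSketch(Sigma, tbdDist, tol, maxIter)
% Two-cluster K-means over SPD descriptors (save as tbdKmeansSketch.m).
%   Sigma   : n x n x N array of SPD descriptors (encoded point cloud)
%   tbdDist : function handle d(X, Y) for the chosen divergence
N = size(Sigma, 3);
B = {Sigma(:, :, 1), Sigma(:, :, round(N / 2))};   % Step a: initial barycenters
for it = 1:maxIter
    labels = zeros(N, 1);                          % Step b: nearest-barycenter assignment
    for i = 1:N
        d1 = tbdDist(Sigma(:, :, i), B{1});
        d2 = tbdDist(Sigma(:, :, i), B{2});
        labels(i) = 1 + (d2 < d1);
    end
    Bnew = cell(1, 2);                             % Step c: barycenter update
    for c = 1:2
        idx = find(labels == c);
        L = zeros(size(Sigma, 1));
        for j = idx'
            L = L + logm(Sigma(:, :, j));
        end
        Bnew{c} = expm(L / max(numel(idx), 1));    % log-domain mean as a stand-in
    end
    shift = norm(Bnew{1} - B{1}, 'fro') + norm(Bnew{2} - B{2}, 'fro');
    B = Bnew;
    if shift < tol, break; end                     % Step d: convergence check
end
end
```

For example, calling labels = tbdKmeansSketch(Sigma, @(X, Y) norm(logm(X) - logm(Y), 'fro')^2, 1e-6, 50) clusters the descriptors produced by the previous sketch under a log-Euclidean surrogate distance.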
4. Anisotropy Index and Influence Functions
This section establishes a unified analytical framework for evaluating geometric sensitivity and algorithmic robustness on $\mathrm{SPD}(n)$.
Section 4.1 introduces the anisotropy index as a key geometric descriptor, formally defining its relationship with fundamental metrics including the Euclidean metric, AIRM, and the Bregman divergences. Through variational optimization, we derive closed-form expressions for these indices, revealing their intrinsic connections to matrix spectral properties.
Section 4.2 advances robustness analysis through influence function theory, developing perturbation models for three central tensor means: TLD, TED, and TID. By quantifying sensitivity bounds under outlier contamination, we establish theoretical guarantees for each mean operator.
4.1. Anisotropy Index Related to Various Metrics
The discriminatory capacity of weighted positive-definite matrices manifests through their associated anisotropy measures. Defined intrinsically on the matrix manifold $\mathrm{SPD}(n)$, the anisotropy index quantifies local geometric distortion relative to isotropic configurations. For $A \in \mathrm{SPD}(n)$, the anisotropy measure relative to a given metric or divergence $d$ is as follows:
$$\mathrm{AI}(A) = \min_{\beta > 0} d^{2}(A, \beta I_n).$$
This index corresponds to the squared minimal projection distance from A to the subspace of scalar matrices $\{\beta I_n : \beta > 0\}$. Larger values indicate stronger anisotropic characteristics. For explicit computation, we minimize the metric-specific functional over the scalar parameter $\beta$.
Next, we systematically investigate the anisotropy indices induced by three fundamental geometries: the Frobenius metric representing Euclidean structure, AIRM characterizing curved manifold topology, and Bregman divergences rooted in information geometry. This trichotomy reveals how metric selection governs directional sensitivity analysis on $\mathrm{SPD}(n)$.
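As a simple illustration of the definition above in the Euclidean case (a sketch under the reconstructed form of (45); Proposition 7 states the precise result), the minimization over scalar matrices can be carried out explicitly:
$$\|A - \beta I_n\|_{F}^{2} = \operatorname{tr}(A^{2}) - 2\beta \operatorname{tr}(A) + n\beta^{2}, \qquad \beta^{\ast} = \frac{\operatorname{tr}(A)}{n}, \qquad \mathrm{AI}_{E}(A) = \operatorname{tr}(A^{2}) - \frac{\operatorname{tr}(A)^{2}}{n}.$$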
Following Equations (3) and (45), the anisotropy index associated with the Euclidean metric can be derived through direct computation.
Proposition 7.
The anisotropy index according to the Euclidean metric (3) at a point $A$ is given in closed form, together with the corresponding optimal scalar factor.
Proposition 8.
The anisotropy index associated with the AIRM at a point $A$ is formulated as (49), together with the corresponding optimal scalar factor.
Proof. Following (6) and (45), we form the AIRM functional measuring the distance from $A$ to the scalar matrices. Differentiating this functional with respect to the scalar parameter yields a stationarity condition. Applying Lemmas 14 and 15 from Reference [19], and denoting the eigenvalues of A by $\lambda_1, \dots, \lambda_n$, Equation (54) can be written in terms of these eigenvalues. Solving the stationarity condition completes the proof of Equation (49). □
Proposition 9.
The anisotropy index according to the Bregman divergence (16) at a point $A$ is formulated as (56), together with the corresponding optimal scalar factor.
Proof. Via (16) and (45), the associated functional and its derivative with respect to the scalar parameter are obtained; solving the resulting stationarity condition completes the proof of Equation (56). □
Proposition 10.
The anisotropy index according to the Bregman divergence (21) at a point $A$ is formulated as (60), together with the corresponding optimal scalar factor.
Proof. Via (21) and (45), the associated functional and its derivative with respect to the scalar parameter are obtained; solving the resulting stationarity condition completes the proof of Equation (60). □
Proposition 11.
The anisotropy index associated with the Bregman divergence (24) at a point $A$ is given by (64), together with the corresponding optimal scalar factor.
Proof. Via (24) and (45), the associated functional and its derivative with respect to the scalar parameter are obtained; solving the resulting stationarity condition completes the proof of Equation (64). □
4.2. Influence Functions
This subsection develops a robustness analysis framework through influence functions for symmetric positive-definite matrix-valued data. We systematically quantify the susceptibility of the TBD mean estimators under outlier contamination by deriving closed-form expressions of influence functions. Furthermore, we establish operator norm bounds of influence functions, thereby characterizing their stability margins in perturbed manifold learning scenarios.
Let $\bar{A}$ denote the TBD mean of the m SPD matrices $A_1, \dots, A_m$, and let $\tilde{A}$ denote the mean after adding a set of l outliers $O_1, \dots, O_l$ with a weight $\varepsilon$ to the original sample [25]. Therefore, $\tilde{A}$ can be written as
$$\tilde{A} = \bar{A} + \varepsilon H(\bar{A}) + O(\varepsilon^{2}),$$
which shows that $\tilde{A}$ is a perturbation of $\bar{A}$, and $H(\bar{A})$ is defined as the influence function. Let $G(A)$ denote the objective function to be minimized over the contaminated set of SPD matrices, formulated as the $\varepsilon$-weighted sum of divergences to the valid samples and to the outliers. Given that $\tilde{A}$ denotes the mean of the contaminated SPD set, the optimality condition requires $\nabla G(\tilde{A}) = 0$. By taking the derivative of this equation with respect to $\varepsilon$ and evaluating it at $\varepsilon = 0$, we obtain the equation (71) that determines the influence function.
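Schematically, and under one common weighting convention (an assumption; the paper's exact normalization of the contaminated objective is not reproduced here), the construction reads
$$G_{\varepsilon}(A) = \frac{1-\varepsilon}{m} \sum_{i=1}^{m} \delta_{\varphi}(A, A_i) + \frac{\varepsilon}{l} \sum_{j=1}^{l} \delta_{\varphi}(A, O_j), \qquad \nabla G_{\varepsilon}\big(\tilde{A}_{\varepsilon}\big) = 0,$$
and differentiating the optimality condition with respect to $\varepsilon$ at $\varepsilon = 0$ yields a linear equation for $H(\bar{A}) = \frac{d}{d\varepsilon} \tilde{A}_{\varepsilon} \big|_{\varepsilon = 0}$.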
Next, the influence functions of the TLD mean, the TED mean, and the TID mean are given by the following propositions.
Proposition 12.
The TLD mean of m SPD matrices and l outliers admits the perturbation expansion (72); furthermore, its influence function satisfies the bound (73).
Proof. Following from (15), we derive the perturbed optimality condition (74). Then, substituting (74) into (71), computing the trace on both sides, and considering the arbitrariness of the perturbation direction, we derive (72) for the TLD mean. Furthermore, it can be deduced that the influence function has an upper bound (73) that is independent of the outliers. □
Proposition 13.
The TED mean of m SPD matrices and l outliers admits the perturbation expansion (76); furthermore, its influence function satisfies the bound (77).
Proof. Following from (20), we derive the perturbed optimality condition (78). Then, substituting (78) into (71), computing the trace on both sides, and considering the arbitrariness of the perturbation direction, we derive (76) for the TED mean. Furthermore, it can be deduced that the influence function has an upper bound (77) that is independent of the outliers. □
Proposition 14.
The TID mean of m SPD matrices and l outliers admits the perturbation expansion (80); furthermore, its influence function satisfies the bound (81).
Proof. Following from (23), we derive the perturbed optimality condition (82). Then, substituting (82) into (71), computing the trace on both sides, and considering the arbitrariness of the perturbation direction, we derive (80) for the TID mean. Furthermore, it can be deduced that the influence function has an upper bound (81) that is independent of the outliers. □
While the influence function of the AIRM mean demonstrates unboundedness with respect to its input matrices [19], all of the TBD means exhibit bounded influence functions under equivalent conditions.
5. Simulations and Analysis
In the following simulations, the SPD matrices used in Algorithm 1 are generated according to (84), where the underlying factor is a square matrix randomly generated by MATLAB R2024a.
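One common way to produce such matrices is shown below; this is an assumption for illustration and is not necessarily the paper's exact generation rule (84).

```matlab
% Random SPD matrix built from a random square factor.
n = 3;
M = randn(n);                  % random square matrix
A = M * M' + n * eye(n);       % symmetric positive-definite by construction
```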
5.1. Simulations and Results
In this example, we apply Algorithm 1 to denoise the point cloud of a teapot, employing the TBDs and the metrics in [17]. The signal-to-noise ratio (SNR) is specified as 4148:1000. The experiments utilize MATLAB's built-in Teapot.ply dataset, a standard PLY-format 3D point cloud that serves as a benchmark resource for validating graphics processing algorithms and visualization techniques. Synthetic noise was injected following a hybrid uniform distribution over the X, Y, and Z coordinate ranges, with bounds corresponding to the coordinate limits. This explicitly violates Gaussian assumptions through bounded support, distributional asymmetry (non-negative in the Z-dimension), and multimodal density, comprising 41.8% of the teapot data.
The experimental results in Table 1 confirm the selected neighborhood size as optimal, achieving peak SNRG and minimum FPR. Performance degrades significantly away from this value: in one direction the FPR surges to 0.6430, while in the other the SNRG drops noticeably at the cost of a longer runtime. This data-driven analysis confirms that the chosen neighborhood size balances precision (max SNRG 1.439) and reliability (min FPR 0.410) for most scenarios.
To optimize indicator weights and enhance data visualization, we utilize Principal Component Analysis (PCA) to provide a holistic view of the covariance structure of all data. Figure 2a displays the initial distribution of valid data and noise prior to denoising, projected by PCA. Figure 2b then exhibits the transformed data distribution after applying TLD for denoising. In Figure 2c, the raw teapot point cloud before denoising is displayed. Figure 2d–f show the denoised results using TLD, TED, and TID, respectively, via Algorithm 1. Figure 3a–e present the denoising outcomes of the teapot point cloud employing the metrics from prior work [17]: Euclidean, AIRM, LEM, KLD, and SKLD. In these figures, red dots signify noise data, whereas blue dots indicate valid data. Figure 2 demonstrates that Algorithm 1 effectively partitions data points into two discrete clusters, achieving explicit separation between valid signals and noise components. The MATLAB implementation directly interfaces with industry-standard PLY/PCD formats via pcread/pcwrite and exports denoising metrics (TPR/FPR/SNRG) for pipeline integration.
5.2. Comparative Analysis of Influence Function Bounds
To rigorously validate the bounded sensitivity of TBD-induced means against Riemannian alternatives, we conducted extensive simulations under pathological outlier conditions. The experimental setup consisted of randomly generated SPD matrices as valid samples and intentionally malformed outliers. These outliers were generated through spectral decomposition with controlled eigenvalues, $O = Q \operatorname{diag}(\lambda_1, \dots, \lambda_n) Q^{T}$, with Q a random orthogonal matrix and the eigenvalues spread over many orders of magnitude. This created severely ill-conditioned matrices with very large condition numbers, simulating challenging scenarios encountered in real-world point clouds. For robust statistical analysis, 100 independent trials were performed, with influence function norms computed according to Propositions 12–14 for the TBD means and comparable derivations for the Riemannian baselines.
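A minimal MATLAB sketch of this outlier construction follows; the particular eigenvalue spread is an illustrative assumption rather than the paper's exact setting.

```matlab
% Severely ill-conditioned SPD outlier via spectral decomposition.
n = 3;
[Q, ~] = qr(randn(n));            % random orthogonal eigenbasis
lambda = [1e-6, 1, 1e6];          % widely spread spectrum (condition number ~1e12)
O = Q * diag(lambda) * Q';        % ill-conditioned SPD outlier
```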
As demonstrated in Figure 4, TBD-based means consistently outperformed Riemannian alternatives in outlier resistance. The AIRM mean exhibited unbounded sensitivity with significantly higher influence norms, while LEM showed moderately reduced but still considerable sensitivity. In stark contrast, all TBD means maintained the strict theoretical bounds established in Propositions 12–14. Specifically, TLD achieved approximately 60% lower influence norms than AIRM, with TED showing intermediate robustness. TID demonstrated the strongest outlier resistance with the tightest bounds, nearly three times more constrained than AIRM, and the lowest variance across trials. This superior performance stems from TBD's intrinsic weight attenuation mechanism, which dynamically suppresses pathological outliers while preserving the geometry of valid data.
5.3. Analysis of TBD Mean Stability
In Figure 5, we compare loss functions across eight metrics (Euclidean, AIRM, LEM, KLD, and SKLD from prior work [17], together with the proposed TLD/TED/TID) using 100 trials of 10 randomly generated SPD matrices from (84). Three critical observations emerge:
- (1)
TLD and TID achieve 60–80% lower loss than Euclidean means, confirming their enhanced stability against gradient pathologies;
- (2)
TED shows marginally higher loss than TLD/TID but still outperforms the geometric metrics (KLD/SKLD) from [17];
- (3)
The proposed TBD variants exhibit minimal variance across trials, indicating superior robustness to initialization compared to Riemannian metrics like AIRM/LEM.
These results validate that TBD-induced means are geometrically better suited to $\mathrm{SPD}(n)$ than both Euclidean and prior manifold metrics, enabling faster convergence to optimal cluster centroids.
The superior convergence of the loss functions for Total Bregman Divergence (TBD) versus traditional Bregman Divergence (BD), as demonstrated in Figure 6, stems fundamentally from TBD's orthogonal projection property defined in (11). This formulation measures the orthogonal distance between function values and tangent approximations, granting TBD invariance to coordinate transformations (rotations and scalings). In contrast, traditional BD (10) exhibits only translational invariance. This geometric distinction enables TBD to maintain stable loss minimization, with consistently lower loss values than BD in Figure 6, by adaptively weighting divergences according to local manifold curvature (the gradient-norm term in the denominator), thereby accelerating convergence while resisting noise-induced perturbations.
We evaluate denoising performance using three key metrics: True Positive Rate (TPR), False Positive Rate (FPR), and Signal-to-Noise Ratio Growth (SNRG), defined in terms of the counts of True Positives (TPs), False Positives (FPs), False Negatives (FNs), and True Negatives (TNs), together with the original valid point count and the original noise point count. SNRG quantifies the relative enhancement in signal purity, with positive values indicating improved separation.
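The classification-style metrics can be computed directly from the ground-truth and predicted validity flags, as in the MATLAB sketch below; the SNRG expression shown is one plausible reading of relative SNR growth and may differ in detail from the paper's formal definition.

```matlab
% Denoising metrics from ground-truth and predicted validity flags.
truth = rand(1000, 1) > 0.3;               % illustrative ground truth (true = valid)
pred  = truth;  pred(1:80) = ~pred(1:80);  % a predicted labeling with some errors

TP = sum( truth &  pred);  FN = sum( truth & ~pred);
FP = sum(~truth &  pred);  TN = sum(~truth & ~pred);

TPR = TP / (TP + FN);                      % true positive rate
FPR = FP / (FP + TN);                      % false positive rate
snr_before = sum(truth) / sum(~truth);     % original valid/noise ratio
snr_after  = TP / max(FP, 1);              % valid/noise ratio among retained points
SNRG = (snr_after - snr_before) / snr_before;   % assumed relative SNR growth
fprintf('TPR = %.3f, FPR = %.3f, SNRG = %.3f\n', TPR, FPR, SNRG);
```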
Table 2 benchmarks the denoising efficacy of Algorithm 1 under varying noise levels. At SNR = 10, KLD achieves the optimal FPR (44.69%) and SNRG (123.78%), while the proposed TID ranks second in FPR (46.22%) and outperforms AIRM/LEM by 18.3% in SNRG. Notably, at SNR = 2, TLD and TED achieve perfect signal preservation (100% TPR), unmatched by any baseline, though KLD maintains superior SNRG (243.43% vs. TID's 232.53%).
Under extreme noise (SNR = 1), TID demonstrates dominant performance with the lowest FPR (23.02%) and highest SNRG (279.90%), exceeding AIRM/LEM's SNRG by 146%. Concurrently, TED achieves the best TPR (96.87%), outperforming SKLD by 18.1%. Crucially, while KLD and SKLD collapse (SNRG < 37%), all proposed TBD variants maintain SNRG > 168%, highlighting their robustness in pathological conditions.
Figure 7 demonstrates the denoising performance of the TBD variants across noise levels, with quantitative metrics aligning with Table 2. At SNR = 10, all methods achieve near-complete signal preservation (100% TPR) while effectively reducing noise, though TID's superior SNRG (114.51%) corresponds to marginally cleaner outputs. Under SNR = 2 conditions, TLD and TED maintain perfect signal recovery (100% TPR) despite residual noise, while TID's balanced performance (232.53% SNRG) preserves structural integrity. In extreme SNR = 1 scenarios, TID's noise rejection (23.02% FPR) maintains global coherence, whereas TED achieves optimal signal retention (96.87% TPR) despite increased false positives. Visual results consistently reflect each method's quantitative trade-offs between signal preservation and noise suppression.
5.4. Analysis of Computational Complexity
We first analyze the computational complexity of the mean operators induced by four metrics: Euclidean, TLD, TED, and TID, mathematically defined in Equations (28), (35), (37), and (39). Subsequently, we quantify the computational load of the corresponding influence functions, with formulations specified in (72), (76), (80), and Proposition 1 of [17].
Under the framework of m SPD matrices and l outliers, we systematically evaluate computational costs. Considering single-element operations as O(1) and matrices of size $n \times n$, fundamental operations reveal critical patterns: matrix inversion and matrix logarithms both require $O(n^{3})$ computations. Operations involving matrix exponentials and matrix roots of SPD matrices necessitate eigenvalue decomposition, consequently also maintaining $O(n^{3})$ complexity.
Detailed analysis reveals that the arithmetic mean (28) exhibits $O(mn^{2})$ complexity, since it involves only entrywise summation of m matrices. For influence functions, Euclidean-based estimation requires correspondingly fewer operations. This establishes a critical trade-off: while Algorithm 1's Euclidean metric demonstrates inferior denoising efficacy compared to the TLD/TED/TID variants, its computational economy in mean matrix calculation and influence function estimation surpasses the geometric counterparts. The complexity disparity stems from TLD/TED/TID's intrinsic reliance on matrix decompositions (logarithms, exponentials, and inverses) that are absent in the Euclidean framework.
6. Conclusions
This study introduces a novel K-means clustering algorithm based on Total Bregman Divergence for robust point cloud denoising. Traditional Euclidean-based K-means methods often fail to address non-uniform noise distributions due to their limited geometric sensitivity. To overcome this, TBDs—Total Logarithm Divergence, Total Exponential Divergence, and Total Inverse Divergence—are proposed on the manifold of symmetric positive-definite matrices. These divergences are designed to model distinct local geometric structures, enabling more effective separation of noise from valid data. Theoretical contributions include the derivation of anisotropy indices to quantify structural variations and the analysis of influence functions, which demonstrate the bounded sensitivity of TBD-induced means to outliers.
Numerical experiments on synthetic and real-world datasets (e.g., 3D teapot point clouds) validate the algorithm's superiority over Euclidean-based approaches. Results highlight improved noise separation, enhanced stability, and adaptability to complex noise patterns. The proposed framework bridges geometric insights from information geometry with practical clustering techniques, offering a scalable and robust preprocessing solution for applications in computer vision, robotics, and autonomous systems. This work underscores the potential of manifold-aware metrics in advancing point cloud processing and opens avenues for further exploration of divergence-based methods in high-dimensional data analysis. While the current implementation handles mid-scale clouds (<10 k points), scaling to massive point clouds (>1 M points) requires parallel neighborhood computation and spatial partitioning techniques, which constitute important future work.