Optimal Quantization of Finite Uniform Data on the Sphere

Roychowdhury, Mrinal Kanti

doi:10.3390/math14020288

Open Access Article

by

Mrinal Kanti Roychowdhury

School of Mathematical and Statistical Sciences, University of Texas Rio Grande Valley, 1201 West University Drive, Edinburg, TX 78539-2999, USA

Mathematics 2026, 14(2), 288; https://doi.org/10.3390/math14020288

Submission received: 14 November 2025 / Revised: 1 January 2026 / Accepted: 6 January 2026 / Published: 13 January 2026

Abstract

This paper develops a systematic and geometric theory of optimal quantization on the unit sphere

S^{2}

, focusing on finite uniform probability distributions supported on the spherical surface—rather than on lower-dimensional geodesic subsets such as circles or arcs. We first establish the existence of optimal sets of n-means and characterize them through centroidal spherical Voronoi tessellations. Three fundamental structural results are obtained. First, a cluster-purity theorem shows that when the support consists of well-separated components, each optimal Voronoi region remains confined to a single component. Second, a ring allocation (discrete water-filling) theorem provides an explicit rule describing how optimal representatives are distributed across multiple latitudinal rings, together with closed-form distortion formulas. Third, a Lipschitz-type stability theorem quantifies the robustness of optimal configurations under small geodesic perturbations of the support. In addition, a spherical analogue of Lloyd’s algorithm is presented, in which intrinsic (Karcher) means replace Euclidean centroids for iterative refinement. These results collectively provide a unified and transparent framework for understanding the geometric and algorithmic structure of optimal quantization on

S^{2}

.

Keywords:

optimal quantization; spherical quantization; centroidal Voronoi tessellations; discrete uniform distributions; geodesic distance; intrinsic (Karcher) mean; Lipschitz stability; ring-structured data; Lloyd’s algorithm on manifolds

MSC:

62H30; 41A25; 60Exx; 94A34; 68U05

1. Introduction

Optimal quantization for a probability distribution refers to approximating a given probability measure, whose support may be finite or infinite (discrete or continuous), by a discrete probability measure supported on a smaller, finite set of points. It has broad applications in communications, information theory, signal processing, and data compression (see [1,2,3,4,5,6]). The monograph of Graf and Luschgy [7] provides a systematic and rigorous treatment of quantization for probability distributions, including existence, uniqueness, and high-resolution asymptotics.

Let

(M, d)

be a metric space. Let P be a Borel probability measure on M and

r \in (0, \infty)

. Let

α

be a locally finite (i.e., intersection of

α

with any bounded subset of M is finite) subset of M. This implies that

α

is countable and closed. Then, the distortion error for P, of order r, with respect to the set

α \subset M

, denoted by

V_{r} (P; α)

, is defined by

V_{r} (P; α) = \int min_{a \in α} d {(x, a)}^{r} d P (x) .

Then, for

n \in N

, the nth quantization error for P, of order r, is defined by

V_{n, r} : = V_{n, r} (P) = inf \{V_{r} (P; α) : α \subset M, 1 \leq card (α) \leq n\},

(1)

where

card (A)

represents the cardinality of a set A. For the probability measure P, we assume that

\int d {(x, x_{0})}^{r} d P (x) < \infty for some fixed x_{0} \in M .

Then, there is a set

α

for which the infimum in (1) is attained (see [7]). A set

α

that attains the infimum in (1) and contains no more than n elements is called an optimal set of n-means for P of order r. In the literature, one typically takes

r = 2

. If the support of P contains infinitely many elements, then an optimal set of n-means always contains exactly n elements (see [7]). When

M = R^{k}

, where

k \in N

and d is the standard Euclidean metric, quantization problems for Borel probability measures P have been extensively studied. For some recent work in this direction, see [8,9,10,11,12,13,14,15,16,17,18,19]. The Euclidean case benefits from linear structure, convexity, and the availability of explicit centroids, making it possible to derive fine properties of optimal n-means and associated Voronoi partitions.

When the underlying space

(M, d)

is non-Euclidean, the quantization problem becomes substantially more delicate. On Riemannian manifolds, distances are measured intrinsically along geodesics, and curvature directly influences both the geometry of Voronoi regions and the structure of optimal representatives. In particular, the absence of global linear structure and convexity prevents the direct use of Euclidean averaging: the mean of a set of points must be defined intrinsically, typically via the Karcher (intrinsic) mean [20,21], and Voronoi partitions must be constructed with respect to the geodesic distance. As a consequence, many classical Euclidean quantization techniques do not extend in a straightforward manner to curved spaces.

Among Riemannian manifolds, the two-dimensional unit sphere

S^{2}

plays a central role due to the ubiquity of spherical and directional data in scientific and engineering applications. Prominent examples include directional statistics (such as wind and ocean current directions, animal movement trajectories, and geophysical flows), astrophysics and planetary science (representation of celestial objects on the celestial sphere), robotics and aerospace engineering (attitude and orientation of rigid bodies), computer vision and graphics (head pose, gaze direction, and reflectance models on spherical domains), and machine learning on manifolds (spherical embeddings and models for global data alignment).

Quantization on

S^{2}

differs fundamentally from the classical Euclidean setting. In Euclidean spaces, distances are induced by norms, Voronoi regions are convex polyhedra, and optimal representatives are given by arithmetic means. On the sphere, however, distances are measured along great-circle geodesics, Voronoi regions reflect spherical geometry, and optimal representatives are determined by intrinsic (Karcher) means rather than linear centroids. Moreover, the nonzero curvature of

S^{2}

affects both the shape of Voronoi cells and the behavior of minimizers of the distortion functional. Consequently, new geometric arguments are required to analyze the existence, structure, and stability of optimal quantizers on spherical manifolds (see, e.g., [7,20,21,22]).

Despite its relevance, the theory of optimal quantization on spherical manifolds is far less developed than that in Euclidean spaces. Early work focused largely on special symmetric distributions or on continuous rotationally invariant models, while discrete and structured spherical data sets have received comparatively limited attention.

1.1. Recent Progress and Motivation

Recently, the author initiated a systematic development of spherical quantization for distributions supported on one-dimensional geodesic subsets of

S^{2}

, such as great circles, small circles, and geodesic arcs [23,24]. These works provided explicit formulas for optimal distortions, geometric descriptions of Voronoi partitions, and intuitive visualizations aimed at making spherical quantization accessible to beginning graduate students and researchers entering the field. A key insight from these studies is that curvature influences the geometry of Voronoi cells and the placement of optimal representatives in subtle but quantifiable ways.

The present paper extends this framework beyond one-dimensional supports to the full two-dimensional spherical surface by considering finite discrete uniform distributions supported on

S^{2}

. This setting is natural for modern data-driven applications, where data on the sphere often arise as finite samples. It also serves as an essential bridge between the previously studied curve-based models [23,24] and the more analytically intricate case of quantization with respect to the continuous surface measure.

In this paper, a finite discrete uniform distribution supported on the sphere refers to a probability distribution obtained by assigning equal weights to a finite collection of points on the unit sphere. Such distributions naturally arise from finite spherical datasets, where each observation is treated equally and the geometry of the sphere governs the notion of distance and optimal approximation.

1.2. Aims and Contributions of the Paper

The principal objective of this work is to develop a geometrically transparent theory for optimal quantization on

S^{2}

in the setting of finite uniform data, where the quantization problem is formulated as the minimization of the squared geodesic distortion and the optimal quantizers are precisely the optimal sets of n-means. More precisely, we consider a finite set

X = {x_{1}, x_{2}, \dots, x_{M}} \subset S^{2},

where each point carries equal probability mass, and seek to understand both the structural and computational aspects of optimal representatives. The main contributions are summarized below:

Existence and Characterization. We prove that optimal sets of n-means exist for all finite uniform distributions on $S^{2}$, and we characterize such optimal configurations via spherical Voronoi partitions and intrinsic (Karcher) means. This establishes a precise geometric analogue of the Euclidean centroidal Voronoi characterization.
Structural Theory for Optimal Clusters. We derive three new results describing the internal organization of optimal Voronoi clusters on discrete spherical supports:
1. a cluster-purity theorem, showing that when X decomposes into well-separated components (e.g., latitudinal rings), optimal Voronoi regions remain confined to single components;
2. a ring-allocation principle for multi-latitudinal data, revealing that the number of optimal codepoints per ring follows a discrete water-filling rule;
3. a Lipschitz-type stability theorem, demonstrating that optimal configurations depend continuously on the support under small geodesic perturbations.
Algorithmic Framework. We develop a spherical analogue of Lloyd’s algorithm in which Euclidean centroids are replaced by intrinsic means, and Voronoi partitions are defined via geodesic distance.

2. Notation and Preliminaries

Let

S^{2} = {x \in R^{3} : ∥ x ∥ = 1}

denote the unit sphere in

R^{3}

endowed with the geodesic distance induced by the standard Riemannian metric. We use spherical coordinates

x = (ϕ, θ)

, where

ϕ \in [- \frac{π}{2}, \frac{π}{2}]

denotes the latitude and

θ \in [0, 2 π)

denotes the longitude. The corresponding Cartesian representation of x is

x = (x, y, z) = (cos ϕ cos θ, cos ϕ sin θ, sin ϕ) .

2.1. Geodesic Distance

For

x_{1} = (ϕ_{1}, θ_{1})

and

x_{2} = (ϕ_{2}, θ_{2})

in

S^{2}

, the geodesic distance is

d_{G} (x_{1}, x_{2}) = arccos (sin ϕ_{1} sin ϕ_{2} + cos ϕ_{1} cos ϕ_{2} cos (θ_{1} - θ_{2})) .

(2)
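Formula (2) can be evaluated directly. The following is a minimal sketch, assuming points are given as (latitude, longitude) pairs in radians; the function names `to_cartesian` and `geodesic_distance` are illustrative and not part of the paper.

```python
import math

def to_cartesian(phi, theta):
    """Map latitude phi and longitude theta (radians) to a unit vector (x, y, z)."""
    return (math.cos(phi) * math.cos(theta),
            math.cos(phi) * math.sin(theta),
            math.sin(phi))

def geodesic_distance(p1, p2):
    """Great-circle distance between p1 = (phi1, theta1) and p2 = (phi2, theta2), as in (2)."""
    phi1, theta1 = p1
    phi2, theta2 = p2
    c = (math.sin(phi1) * math.sin(phi2)
         + math.cos(phi1) * math.cos(phi2) * math.cos(theta1 - theta2))
    # Rounding can push c slightly outside [-1, 1], where acos is undefined, so clamp it.
    return math.acos(max(-1.0, min(1.0, c)))
```

The argument of the arccosine in (2) is the Euclidean inner product of the two Cartesian unit vectors, so `geodesic_distance` should agree with `math.acos` applied to the dot product of the `to_cartesian` images, up to rounding.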

2.2. Finite Discrete Uniform Distributions on $S^{2}$

Let

X = {x_{1}, x_{2}, \dots, x_{M}} \subset S^{2}

be a finite set of distinct points on the sphere. The associated discrete uniform probability measure is

P = \frac{1}{M} \sum_{i = 1}^{M} δ_{x_{i}},

(3)

where

δ_{x_{i}}

denotes the Dirac measure at

x_{i}

. Thus P assigns equal mass

1 / M

to each support point, representing a finite sampling of the spherical surface with uniform weights. Unlike the continuous surface measure, the distribution P is supported on a finite set; however, the underlying geometry remains governed by the metric

d_{G}

.

2.3. Distortion and Optimal n-Means

For a set of representatives (quantizers)

Q = {q_{1}, q_{2}, \dots, q_{n}} \subset S^{2},

the distortion of order

r > 0

is defined by

V_{n, r} (P; Q) = \frac{1}{M} \sum_{i = 1}^{M} min_{1 \leq j \leq n} d_{G} {(x_{i}, q_{j})}^{r} .

(4)

The minimal attainable distortion with n representatives is

V_{n, r} (P) = inf_{Q \subset S^{2}, | Q | \leq n} V_{n, r} (P; Q),

(5)

and any

Q^{*}

achieving this infimum is called an optimal set of n-means. For brevity, we write

V_{n} (P) : = V_{n, 2} (P)

when

r = 2

.
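The distortion (4) is a finite average and can be computed by direct enumeration. A minimal sketch, assuming points and representatives are (latitude, longitude) pairs in radians; `distortion` is a hypothetical helper name, not notation from the paper.

```python
import math

def geodesic_distance(p1, p2):
    """Great-circle distance between two (latitude, longitude) pairs, as in (2)."""
    phi1, theta1 = p1
    phi2, theta2 = p2
    c = (math.sin(phi1) * math.sin(phi2)
         + math.cos(phi1) * math.cos(phi2) * math.cos(theta1 - theta2))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding

def distortion(X, Q, r=2.0):
    """Average over X of the r-th power of the distance to the nearest point of Q, as in (4)."""
    if not X or not Q:
        raise ValueError("X and Q must be nonempty")
    return sum(min(geodesic_distance(x, q) ** r for q in Q) for x in X) / len(X)
```

For instance, with two equatorial points at longitudes 0 and 1 and a single representative at longitude 0.5, each point lies at distance 0.5 from the representative, so the order-2 distortion is 0.25.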

Remark 1.

Because P has finite support, the mapping

Q \mapsto V_{n, 2} (P; Q)

is an average of finitely many minima of continuous functions and is therefore continuous on the compact space

{(S^{2})}^{n}

. Consequently, an optimal set of n-means always exists, though it need not be unique when the support of P possesses symmetry.

2.4. Spherical Voronoi Partitions

Given

Q = {q_{1}, \dots, q_{n}}

, the spherical Voronoi region associated with

q_{j}

is

V_{j} (Q) = {x \in S^{2} : d_{G} (x, q_{j}) \leq d_{G} (x, q_{k}) for all k} .

(6)

Since P is supported on X, we define the discrete cluster assigned to

q_{j}

by

X_{j} (Q) = X \cap V_{j} (Q) = {x_{i} \in X : d_{G} (x_{i}, q_{j}) \leq d_{G} (x_{i}, q_{k}) for all k} .

(7)

Remark 2.

When P is uniform, the cardinalities

| X_{j} (Q) |

determine the weight of each cluster in the total distortion sum. No explicit integration over spherical areas is needed, but the geometry of

S^{2}

continues to influence the distances

d_{G}

and thus the optimal configuration.
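The discrete clusters in (7) can be formed by a nearest-neighbor scan. The sketch below makes one assumption that goes beyond the text: definition (7) uses a non-strict inequality, so a point equidistant from two representatives belongs to both cells, whereas the code assigns such a point to the representative with the smallest index so that the clusters partition X. The name `voronoi_clusters` is illustrative.

```python
import math

def geodesic_distance(p1, p2):
    """Great-circle distance between two (latitude, longitude) pairs, as in (2)."""
    phi1, theta1 = p1
    phi2, theta2 = p2
    c = (math.sin(phi1) * math.sin(phi2)
         + math.cos(phi1) * math.cos(phi2) * math.cos(theta1 - theta2))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding

def voronoi_clusters(X, Q):
    """Return a list of clusters, one per representative in Q.

    Ties are broken toward the lowest index (an assumption of this sketch).
    """
    clusters = [[] for _ in Q]
    for x in X:
        # min() returns the first index attaining the smallest distance.
        j = min(range(len(Q)), key=lambda k: geodesic_distance(x, Q[k]))
        clusters[j].append(x)
    return clusters
```

With uniform weights, the contribution of cluster j to the total distortion is the sum of squared distances within that cluster divided by M, so only the cluster memberships and the distances are needed.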

2.5. Intrinsic (Karcher) Mean on the Sphere

For any nonempty finite set

A \subset S^{2}

contained in a geodesic ball of radius less than

\frac{π}{2}

, the intrinsic (Karcher) mean of A is the unique point

q_{A} = \underset{q \in S^{2}}{arg min} \sum_{x \in A} d_{G} {(x, q)}^{2} .

(8)

Karcher [20] proved that such a minimizer exists and is unique under the stated radius condition. Intuitively,

q_{A}

plays the role of the Euclidean centroid when distances are measured along geodesics.

Remark 3

(Standing assumption on intrinsic means). Throughout the paper, whenever an intrinsic (Karcher) mean is invoked for a finite subset

A \subset S^{2}

, we implicitly assume that A is contained in a geodesic ball of radius strictly less than

π / 2

. Under this condition, existence and uniqueness of the intrinsic mean are guaranteed by classical results of Karcher. All clusters considered in Section 4, Section 5, Section 6 and Section 7 satisfy this condition, either because they lie within a single sufficiently small spherical component or because they form contiguous blocks on a fixed latitude whose geodesic diameter is strictly less than

π / 2

.

Definition 1

(Centroidal Voronoi configuration). A configuration

Q = {q_{1}, \dots, q_{n}}

is called centroidal for the discrete distribution P if each representative

q_{j}

coincides with the intrinsic mean of its cluster

X_{j} (Q)

. Equivalently, the pair

(Q, {X_{j} (Q)})

forms a centroidal spherical Voronoi tessellation of the finite set X.

Remark 4.

In the Euclidean case, centroidal Voronoi configurations correspond to stationary points of the distortion functional. The same characterization extends to the spherical setting and provides the variational foundation for the results developed in Section 3, Section 4, Section 5, Section 6 and Section 7.
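The stationarity statement in Remark 4 can be made explicit. The display below is a sketch that assumes the standard gradient formula for the squared geodesic distance away from the cut locus; the symbols $F_{j}$ and $\operatorname{Log}_{q}$ are notation introduced here for illustration and do not appear elsewhere in the paper.

```latex
F_{j}(q) = \sum_{x \in X_{j}(Q)} d_{G}(x, q)^{2},
\qquad
\operatorname{grad} F_{j}(q) = -2 \sum_{x \in X_{j}(Q)} \operatorname{Log}_{q}(x),
\qquad
\operatorname{Log}_{q}(x) = \frac{\theta}{\sin \theta} \bigl( x - \cos \theta \, q \bigr),
\quad \theta = d_{G}(x, q) \in (0, \pi),
```

with $\operatorname{Log}_{q}(q) = 0$. Under this formula, $q_{j}$ is a stationary point of $F_{j}$ exactly when $\sum_{x \in X_{j}(Q)} \operatorname{Log}_{q_{j}}(x) = 0$, which is the spherical counterpart of the Euclidean condition that the deviations from the centroid sum to zero.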

2.6. Notation Summary

For the reader’s convenience, we collect here the main notational conventions used throughout the paper.

$S^{2}$: the unit sphere in $R^{3}$ equipped with the geodesic distance $d_{G}$.
$d_{G} (x, y)$: geodesic (great-circle) distance between $x, y \in S^{2}$.
$X = {x_{1}, \dots, x_{M}} \subset S^{2}$: finite support of the discrete uniform distribution.
$P = \frac{1}{M} \sum_{i = 1}^{M} δ_{x_{i}}$: discrete uniform probability measure on X.
$Q = {q_{1}, \dots, q_{n}} \subset S^{2}$: set of n representatives (quantizers).
$V_{n} (P; Q)$: distortion associated with Q and P.
$V_{n} (P)$: minimal distortion over all n-point configurations.
$V (q)$: spherical Voronoi region associated with $q \in Q$.
$X_{q} = X \cap V (q)$: cluster assigned to q.
$q_{A}$: intrinsic (Karcher) mean of a finite set $A \subset S^{2}$.
$R_{t}$: tth latitudinal ring with latitude $ϕ_{t}$.
$k_{t}$: number of representatives assigned to ring $R_{t}$.
$E_{t} (k)$: minimal distortion contributed by $R_{t}$ when served by k representatives.

3. Existence and Characterization of Optimal n-Means

The existence of optimal configurations for finite discrete uniform distributions on

S^{2}

follows directly from compactness arguments. This section establishes the existence result and the centroidal characterization that will serve as the foundation for all subsequent theorems.

Proposition 1

(Existence of optimal set). Let P be a finite discrete uniform distribution on

S^{2}

and

n \geq 1

. Then there exists at least one configuration

Q^{*} \subset S^{2}

with

| Q^{*} | \leq n

such that

V_{n, 2} (P; Q^{*}) = min_{Q \subset S^{2}, | Q | \leq n} V_{n, 2} (P; Q) .

Each such

Q^{*}

is called an optimal set of n-means for P.

Proof.

The space

{(S^{2})}^{n}

is compact and the function

F (q_{1}, \dots, q_{n}) = V_{n, 2} (P; {q_{1}, \dots, q_{n}}) = \frac{1}{M} \sum_{i = 1}^{M} min_{1 \leq j \leq n} d_{G} {(x_{i}, q_{j})}^{2}

is continuous. Hence F attains its minimum on

{(S^{2})}^{n}

. The configuration

Q^{*}

corresponding to this minimum is an optimal set of n-means. □

Remark 5.

Uniqueness of an optimal configuration need not hold. If the support of P possesses non-trivial symmetry (for example, the vertices of a regular polyhedron), then the image of an optimal configuration under any element of that symmetry group is again optimal.

Spherical Voronoi Partition and Centroidal Property

For a configuration

Q = {q_{1}, \dots, q_{n}}

, recall from Section 2 that the spherical Voronoi cells are defined by

V_{j} (Q) = {x \in S^{2} : d_{G} (x, q_{j}) \leq d_{G} (x, q_{k}) for all k},

and that

X_{j} (Q) = X \cap V_{j} (Q)

are the discrete clusters associated with P. The following theorem characterizes optimal configurations through their centroidal structure.

Theorem 1

(Necessary Centroidal Condition). Let

P = \frac{1}{M} \sum_{i = 1}^{M} δ_{x_{i}}

be the uniform probability measure on a finite set

X \subset S^{2}

, and let

n \geq 1

. For a configuration

Q = {q_{1}, \dots, q_{n}} \subset S^{2}

, denote

X_{j} (Q) : = {x \in X : d_{G} (x, q_{j}) \leq d_{G} (x, q_{k}) for all k}

and

V_{n, 2} (P; Q) = \frac{1}{M} \sum_{i = 1}^{M} min_{1 \leq j \leq n} d_{G} {(x_{i}, q_{j})}^{2} .

If

Q^{*}

is an optimal set of n-means for P, then with the nearest-neighbor clusters

X_{j}^{*} : = X_{j} (Q^{*})

one has, for each j,

q_{j}^{*} \in arg min_{q \in S^{2}} \sum_{x \in X_{j}^{*}} d_{G} {(x, q)}^{2} .

In other words, every optimal representative is the intrinsic (Karcher) mean of its own cluster, and the optimal partition is the nearest-neighbor partition.

Remark 6

(Centroidal condition is not sufficient). The nearest-neighbor and centroidal (intrinsic mean) conditions in Theorem 1 are necessary, but in general they arenot sufficientfor optimality. The following simple example on

S^{2}

makes this clear and includes the explicit verification of the nearest-neighbor assignments.

Example 1

(four equatorial points,

n = 2

). Fix

α \in (0, \frac{π}{2})

and consider four points on the equator of

S^{2}

:

x_{1} = (0, 0), x_{2} = (0, α), x_{3} = (0, π), x_{4} = (0, π + α),

in spherical coordinates

(ϕ, θ)

. Distances along the equator are measured by the wrapped angular difference

d_{G} ((0, θ), (0, θ^{'})) = min \{| θ - θ^{'} |, 2 π - | θ - θ^{'} |\} .

Let P be the uniform measure on

{x_{1}, x_{2}, x_{3}, x_{4}}

and take

n = 2

.

Configuration A (adjacent pairs). Clusters:

C_{1} = {x_{1}, x_{2}}, C_{2} = {x_{3}, x_{4}} .

Since the separation in each cluster is

α < π

, each two-point cluster has a unique intrinsic mean at the midpoint of the shorter geodesic arc:

q_{1} = (0, \frac{α}{2}), q_{2} = (0, π + \frac{α}{2}) .

Nearest-neighbor check for Configuration A: For each point, we compare distances to

q_{1}

and

q_{2}

.

For $x_{1} = (0, 0)$:

$d_{G} (x_{1}, q_{1}) = \frac{α}{2}, d_{G} (x_{1}, q_{2}) = π - \frac{α}{2} > \frac{α}{2}.$

So $x_{1}$ is assigned to $q_{1}$.
For $x_{2} = (0, α)$:

$d_{G} (x_{2}, q_{1}) = \frac{α}{2}, d_{G} (x_{2}, q_{2}) = π - \frac{α}{2} > \frac{α}{2}.$

So $x_{2}$ is assigned to $q_{1}$.
For $x_{3} = (0, π)$:

$d_{G} (x_{3}, q_{2}) = \frac{α}{2}, d_{G} (x_{3}, q_{1}) = π - \frac{α}{2} > \frac{α}{2}.$

So $x_{3}$ is assigned to $q_{2}$.
For $x_{4} = (0, π + α)$:

$d_{G} (x_{4}, q_{2}) = \frac{α}{2}, d_{G} (x_{4}, q_{1}) = π - \frac{α}{2} > \frac{α}{2}.$

So $x_{4}$ is assigned to $q_{2}$.

Thus the nearest-neighbor rule produces exactly the clusters

C_{1}

and

C_{2}

. Each point is at distance

\frac{α}{2}

from its representative, so

V_{2, 2} (P; Q_{A}) = \frac{1}{4} [4 {(\frac{α}{2})}^{2}] = \frac{α^{2}}{4} .

Configuration B (cross pairs). Clusters:

C_{1}^{'} = {x_{1}, x_{4}}, C_{2}^{'} = {x_{2}, x_{3}} .

Since the within-cluster separations are

π - α < π

, each two-point cluster has a unique intrinsic mean at the midpoint of the shorter geodesic arc. The signed shorter difference from 0 to

π + α

is

- (π - α)

, hence

q_{1}^{'} = (0, - \frac{π - α}{2}) \equiv (0, \frac{3 π + α}{2}) (mod 2 π), q_{2}^{'} = (0, α + \frac{π - α}{2}) = (0, \frac{π}{2} + \frac{α}{2}) .

Nearest-neighbor check for Configuration B:

\begin{matrix}
\text{For } x_{1} = (0, 0) : & d_{G} (x_{1}, q_{1}^{'}) = \frac{π - α}{2}, & d_{G} (x_{1}, q_{2}^{'}) = \frac{π}{2} + \frac{α}{2} > \frac{π - α}{2}; \\
\text{For } x_{4} = (0, π + α) : & d_{G} (x_{4}, q_{1}^{'}) = \frac{π - α}{2}, & d_{G} (x_{4}, q_{2}^{'}) = \frac{π}{2} + \frac{α}{2} > \frac{π - α}{2}; \\
\text{For } x_{2} = (0, α) : & d_{G} (x_{2}, q_{2}^{'}) = \frac{π - α}{2}, & d_{G} (x_{2}, q_{1}^{'}) = \frac{π}{2} + \frac{α}{2} > \frac{π - α}{2}; \\
\text{For } x_{3} = (0, π) : & d_{G} (x_{3}, q_{2}^{'}) = \frac{π - α}{2}, & d_{G} (x_{3}, q_{1}^{'}) = \frac{π}{2} + \frac{α}{2} > \frac{π - α}{2} .
\end{matrix}

Thus the nearest-neighbor rule yields exactly

C_{1}^{'}

and

C_{2}^{'}

, and every point lies at distance

(π - α) / 2

from its representative. Therefore

V_{2, 2} (P; Q_{B}) = \frac{1}{4} [4 {(\frac{π - α}{2})}^{2}] = \frac{{(π - α)}^{2}}{4} .

Conclusion. Both

Q_{A}

and

Q_{B}

satisfy the nearest-neighbor condition and the centroidal condition, yet

Q_{A}

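yields strictly smaller distortion, because α < π/2 implies α < π - α. Hence the configuration with cross pairs is centroidal but not optimal, and these conditions are not sufficient to guarantee optimality.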
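The two distortion values in Example 1 can be checked numerically. The sketch below fixes α = 0.8 as an arbitrary illustrative value in (0, π/2) and evaluates the distortion (4) with r = 2 for both configurations; the helper names are hypothetical.

```python
import math

def geodesic_distance(p1, p2):
    """Great-circle distance between two (latitude, longitude) pairs, as in (2)."""
    phi1, theta1 = p1
    phi2, theta2 = p2
    c = (math.sin(phi1) * math.sin(phi2)
         + math.cos(phi1) * math.cos(phi2) * math.cos(theta1 - theta2))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding

def distortion(X, Q):
    """Order-2 distortion (4): mean squared distance to the nearest representative."""
    return sum(min(geodesic_distance(x, q) ** 2 for q in Q) for x in X) / len(X)

alpha = 0.8  # illustrative choice; any value in (0, pi/2) should behave the same way
X = [(0.0, 0.0), (0.0, alpha), (0.0, math.pi), (0.0, math.pi + alpha)]

# Configuration A: midpoints of the adjacent pairs.
Q_A = [(0.0, alpha / 2), (0.0, math.pi + alpha / 2)]
# Configuration B: midpoints of the cross pairs.
Q_B = [(0.0, (3 * math.pi + alpha) / 2), (0.0, (math.pi + alpha) / 2)]

V_A = distortion(X, Q_A)  # expected to match alpha**2 / 4
V_B = distortion(X, Q_B)  # expected to match (pi - alpha)**2 / 4
```

Comparing `V_A` with `alpha ** 2 / 4` and `V_B` with `(math.pi - alpha) ** 2 / 4` reproduces the closed-form values of the example, and `V_A < V_B` for this choice of α.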

Remark 7.

The preceding example illustrates that centroidal Voronoi configurations correspond to stationary points of the distortion functional, but need not be globally optimal. In particular, multiple centroidal configurations may coexist with different distortion values, and additional structural arguments are required to identify global minimizers.

Remark 8.

Theorem 1 provides a purely geometric interpretation of optimal configurations: each representative minimizes the spherical moment of inertia of its own cluster. In particular, if the support X is invariant under a symmetry group

G \subset O (3)

, then one may seek G-invariant centroidal configurations, which often yield the global minimizers.

4. Quantization on Finite Latitudinal Rings

In this section we analyze optimal quantization for finite discrete uniform distributions supported on finitely many latitudinal rings on the unit sphere. For clarity of exposition, we first highlight the main structural results of the section, and then present the technical lemma and the proofs.

Remark 9

(No cross-ring mixing). A key structural feature underlying the results of this section is that, under the uniform discrete measure, optimal Voronoi cells do not mix points from distinct latitudinal rings. This property will be established rigorously below (see Theorems 3 and 4(ii)), and it implies that the quantization problem decouples across rings. As a consequence, the global allocation of representatives across rings can be formulated as a discrete optimization problem, leading naturally to the water-filling principle described later in this section.

4.1. Discrete Ring Configuration

Fix distinct latitudes

ϕ_{1}, ϕ_{2}, \dots, ϕ_{T} \in [- \frac{π}{2}, \frac{π}{2}]

with

- \frac{π}{2} < ϕ_{1} < ϕ_{2} < \dots < ϕ_{T} < \frac{π}{2}

, and let

N_{t} \geq 1

denote the number of uniformly spaced points on the tth ring. Each ring is thus

R_{t} = {x_{t, j} = (ϕ_{t}, θ_{t, j}) : θ_{t, j} = 2 π j / N_{t}, j = 0, 1, \dots, N_{t} - 1},

where

(ϕ, θ)

denote spherical coordinates as defined in Section 2. The total support is the disjoint union

X = ⨆_{t = 1}^{T} R_{t}, M = \sum_{t = 1}^{T} N_{t},

and the associated probability distribution is

P = \frac{1}{M} \sum_{t = 1}^{T} \sum_{j = 0}^{N_{t} - 1} δ_{x_{t, j}},

which is the finite discrete uniform distribution supported on these T latitudinal rings.
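The ring-structured support of Section 4.1 is straightforward to generate. A minimal sketch, assuming latitudes in radians and points stored as (latitude, longitude) pairs; `ring` and `ring_support` are illustrative names.

```python
import math

def ring(phi, N):
    """N equally spaced points on the latitude phi, with longitudes 2*pi*j/N for j = 0..N-1."""
    if N < 1:
        raise ValueError("each ring needs at least one point")
    return [(phi, 2.0 * math.pi * j / N) for j in range(N)]

def ring_support(latitudes, counts):
    """Disjoint union of the rings; every point carries mass 1/M under the uniform measure."""
    if len(latitudes) != len(counts):
        raise ValueError("one point count is required per latitude")
    X = []
    for phi, N in zip(latitudes, counts):
        X.extend(ring(phi, N))
    return X
```

For example, `ring_support([-0.5, 0.0, 0.7], [4, 6, 3])` produces M = 13 points, each of which receives probability 1/13.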

4.2. Core Structural Results

We begin by presenting the principal theorems describing the geometry of optimal Voronoi partitions and the allocation of representatives across latitudinal rings.

Theorem 2

(Within-ring contiguity of optimal clusters). Fix a latitude

ϕ \in (- \frac{π}{2}, \frac{π}{2})

and consider the ring of N equally spaced points

R_{ϕ} = {(ϕ, θ_{j}) : θ_{j} = 2 π j / N, j = 0, 1, \dots, N - 1} .

Let

Q = {q_{1}, \dots, q_{n}} \subset S^{2}

be an optimal set of n-means, and for each

q_{i} \in Q

let

V_{i}

denote its Voronoi region. Then for every i, the set of ring points assigned to

q_{i}

, namely

X_{i} : = R_{ϕ} \cap V_{i},

is (if nonempty) a single contiguous block of consecutive longitudes on the ring (in cyclic order).

Theorem 3

(Cluster Purity for Separated Components). Let

X = ⨆_{t = 1}^{T} C_{t} \subset S^{2}

be a finite set decomposed into pairwise disjoint nonempty components

C_{t}

. Assume that for each t, there exists a point

p_{t} \in S^{2}

and a radius

r > 0

such that

C_{t} \subset B (p_{t}, r) and B (p_{s}, r) \cap B (p_{t}, r) = ⌀ for all s \neq t,

(9)

where

B (p_{t}, r) : = {x \in S^{2} : d_{G} (x, p_{t}) < r}

denotes the open geodesic ball of radius r centered at the point

p_{t}

on the sphere. Note that

p_{t}

lies on the surface of

S^{2}

, and distances are measured intrinsically along great-circle arcs.

Let P be the uniform probability measure on X, and let

Q^{*} = {q_{1}^{*}, \dots, q_{n}^{*}}

be an optimal set of n-means for P. Then each Voronoi region

V (q_{j}^{*})

intersects at most one component

C_{t}

. Consequently, no optimal Voronoi cell contains points from two different components.

Remark 10.

The condition

C_{t} \subset B (p_{t}, r)

means that each component lies inside a small spherical patch around

p_{t}

on the surface of the sphere. Since geodesic distance is used,

B (p_{t}, r)

should be viewed as a curved region on

S^{2}

, not a three-dimensional ball. The separation condition

B (p_{s}, r) \cap B (p_{t}, r) = ⌀

ensures that the components are well-separated on the sphere. Intuitively, points in one such patch are always closer to a center placed within that patch than to any center located near another patch, making it suboptimal for a Voronoi region to contain points from two different components.

Theorem 4

(Ring partition and allocation principle). Let P be the discrete uniform distribution supported on T latitudinal rings

R_{t} = {(ϕ_{t}, θ_{t, j}) : θ_{t, j} = 2 π j / N_{t}, j = 0, \dots, N_{t} - 1}

with distinct latitudes

ϕ_{1} < \dots < ϕ_{T}

, and let

Q^{*}

be an optimal set of n-means for P. Then:

(i) Within-ring contiguity and midpoint representatives. For each t, the subset

R_{t}

is partitioned by

Q^{*}

into

k_{t} \geq 0

contiguous blocks of longitudes (in cyclic order). Each nonempty block has length either

⌊ N_{t} / k_{t} ⌋

or

⌈ N_{t} / k_{t} ⌉

, and its representative

q \in Q^{*}

lies on the latitude

ϕ_{t}

at the block’s mid-longitude.

(ii) No cross-ring mixing. Every optimal Voronoi cluster

X_{j} (Q^{*})

is contained in a single ring

R_{t}

; equivalently, no Voronoi region intersects two distinct rings.

(iii) Discrete water-filling allocation. Let

E_{t} (k)

denote the minimal within-ring distortion on

R_{t}

when exactly k representatives serve

R_{t}

(with the contiguity/midpoint structure from(i)). Then

E_{t}

is strictly decreasing and discretely convex in k, and the global problem

min_{k_{1}, \dots, k_{T} \in Z_{\geq 0}} \sum_{t = 1}^{T} E_{t} (k_{t}) subject to \sum_{t = 1}^{T} k_{t} = n

is solved by successively assigning each additional representative to the ring that yields the largest marginal drop

Δ_{t} (k) : = E_{t} (k) - E_{t} (k + 1)

(a discrete water-filling rule).
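The greedy rule in (iii) can be sketched with a priority queue. The code below makes several assumptions not stated in the theorem: each ring starts with one representative (so that every ring point is served, consistent with no cross-ring mixing), the cost tables are supplied by the caller as `costs[t][k-1]` standing for E_t(k), and those tables are strictly decreasing and discretely convex as asserted in (iii). The name `greedy_allocation` and the toy costs in the usage note are hypothetical.

```python
import heapq

def greedy_allocation(costs, n):
    """Allocate n representatives across rings by largest marginal drop.

    costs[t][k-1] is assumed to hold E_t(k) for k = 1, ..., len(costs[t]).
    Returns a list [k_1, ..., k_T] with sum n.
    """
    T = len(costs)
    if n < T:
        raise ValueError("this sketch needs at least one representative per ring")
    if n > sum(len(c) for c in costs):
        raise ValueError("cost tables are too short for n representatives")
    alloc = [1] * T
    heap = []  # entries are (-marginal drop, ring index); heapq is a min-heap
    for t in range(T):
        if len(costs[t]) > 1:
            heapq.heappush(heap, (-(costs[t][0] - costs[t][1]), t))
    for _ in range(n - T):
        _, t = heapq.heappop(heap)
        alloc[t] += 1
        k = alloc[t]
        if k < len(costs[t]):
            # Marginal drop Delta_t(k) = E_t(k) - E_t(k + 1) for the next step on ring t.
            heapq.heappush(heap, (-(costs[t][k - 1] - costs[t][k]), t))
    return alloc
```

As a usage illustration with toy costs E_t(k) = w_t / k for weights 9, 4, 1 (chosen only because they are decreasing and convex, not taken from the paper), `greedy_allocation([[w / k for k in range(1, 7)] for w in (9.0, 4.0, 1.0)], 6)` assigns more representatives to the ring with the larger weight. Convexity of each E_t is what makes the marginal drops non-increasing, so a greedy choice is never regretted later.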

The above theorems summarize the essential geometric and combinatorial structure of optimal quantizers for latitudinally organized spherical data.

We now collect the technical lemma needed for the proofs of the core results.

Lemma 1

(Convexity of longitudinal cost). Fix

ϕ \in (- \frac{π}{2}, \frac{π}{2})

and define

f_{ϕ} (Δ θ) : = d_{G} {((ϕ, 0), (ϕ, Δ θ))}^{2}, Δ θ \in [0, π] .

Then

f_{ϕ}

is an even function of

Δ θ

, and strictly convex on

(0, π)

.

Proof.

Evenness. By the spherical law of cosines,

d_{G} ((ϕ, 0), (ϕ, Δ θ)) = arccos ({sin}^{2} ϕ + {cos}^{2} ϕ cos Δ θ) .

Since

cos (Δ θ)

is even in

Δ θ

, the argument of arccos is even, hence

d_{G} ((ϕ, 0), (ϕ, Δ θ))

is even, and so is

f_{ϕ} (Δ θ)

.

Strict convexity. Set

c : = cos ϕ \in (0, 1]

and write

σ (Δ θ) : = d_{G} ((ϕ, 0), (ϕ, Δ θ)) .

Using the identity

cos σ = {sin}^{2} ϕ + {cos}^{2} ϕ cos Δ θ = 1 - 2 c^{2} {sin}^{2} (Δ θ / 2)

, we obtain the convenient form

σ (Δ θ) = 2 arcsin (c sin (Δ θ / 2)), Δ θ \in [0, π] .

Hence

f_{ϕ} (Δ θ) = σ {(Δ θ)}^{2} .

We will prove

f_{ϕ}^{″} (Δ θ) > 0

for

Δ θ \in (0, π)

.

Let

w (Δ θ) : = c sin (Δ θ / 2)

and

D (Δ θ) : = \sqrt{1 - w {(Δ θ)}^{2}} = \sqrt{1 - c^{2} {sin}^{2} (Δ θ / 2)}

. Then, by differentiation,

σ^{'} (Δ θ) = \frac{c cos (Δ θ / 2)}{D (Δ θ)}, σ^{″} (Δ θ) = - \frac{c (1 - c^{2}) sin (Δ θ / 2)}{2 D {(Δ θ)}^{3}} .

(These follow from

σ = 2 arcsin w

, so

σ^{'} = 2 w^{'} / \sqrt{1 - w^{2}}

with

w^{'} = \frac{c}{2} cos (Δ θ / 2)

, and a direct differentiation of

D^{- 1}

.)
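The closed form σ(Δθ) = 2 arcsin(c sin(Δθ/2)) and the derivative formulas above can be sanity-checked numerically. The sketch below only compares the two expressions for σ and tests the stated σ′ and σ″ against central finite differences at sample values; the sample latitude, the step size, and the function names are choices of this sketch.

```python
import math

def sigma_arccos(phi, dtheta):
    """Same-latitude geodesic distance via the spherical law of cosines."""
    c = math.sin(phi) ** 2 + math.cos(phi) ** 2 * math.cos(dtheta)
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding

def sigma_arcsin(phi, dtheta):
    """Same distance via sigma = 2 * arcsin(c * sin(dtheta / 2)) with c = cos(phi)."""
    return 2.0 * math.asin(math.cos(phi) * math.sin(dtheta / 2.0))

def sigma_prime(phi, dtheta):
    """First derivative c * cos(dtheta / 2) / D, with D = sqrt(1 - c^2 sin^2(dtheta / 2))."""
    c = math.cos(phi)
    D = math.sqrt(1.0 - (c * math.sin(dtheta / 2.0)) ** 2)
    return c * math.cos(dtheta / 2.0) / D

def sigma_double_prime(phi, dtheta):
    """Second derivative -c * (1 - c^2) * sin(dtheta / 2) / (2 * D^3)."""
    c = math.cos(phi)
    D = math.sqrt(1.0 - (c * math.sin(dtheta / 2.0)) ** 2)
    return -c * (1.0 - c * c) * math.sin(dtheta / 2.0) / (2.0 * D ** 3)

def central_difference(f, x, h=1e-5):
    """Second-order finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)
```

For a latitude such as ϕ = 0.6 and separations Δθ in (0, π), `sigma_arccos` and `sigma_arcsin` should agree to rounding error, and `central_difference` applied to `sigma_arcsin` and to `sigma_prime` should match `sigma_prime` and `sigma_double_prime` respectively to within the finite-difference error.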

Since

f_{ϕ} = σ^{2}

, we have

f_{ϕ}^{″} (Δ θ) = 2 (σ^{'} {(Δ θ)}^{2} + σ (Δ θ) σ^{''} (Δ θ)) .

To show positivity, it is helpful to re-express everything in terms of

σ

and

Δ θ

using

cos σ = 1 - 2 c^{2} {sin}^{2} (\frac{Δ θ}{2}) .

Since

σ = 2 arcsin w

with

w (Δ θ) : = c sin (Δ θ / 2)

, we also have

sin σ = 2 sin (\frac{σ}{2}) cos (\frac{σ}{2}) = 2 w \sqrt{1 - w^{2}} = 2 c sin (\frac{Δ θ}{2}) D (Δ θ),

where

D (Δ θ) : = \sqrt{1 - w {(Δ θ)}^{2}} = \sqrt{1 - c^{2} {sin}^{2} (Δ θ / 2)}

. In particular,

sin σ > 0

for

Δ θ \in (0, π)

, so we may freely divide by

sin σ

in what follows.

A short algebraic manipulation, starting from

σ^{'} (Δ θ) = \frac{c^{2} sin Δ θ}{sin σ}, σ^{″} (Δ θ) = \frac{c^{2} cos Δ θ}{sin σ} - \frac{c^{4} cos σ {sin}^{2} Δ θ}{{sin}^{3} σ},

(which are equivalent to the expressions obtained earlier for

σ^{'}

and

σ^{''}

), gives the identity

(*) f_{ϕ}^{″} (Δ θ) = \frac{2 c^{4} {sin}^{2} Δ θ}{{sin}^{3} σ} (sin σ - σ cos σ) + \frac{2 c^{2} σ cos Δ θ {sin}^{2} σ}{{sin}^{3} σ} .

Now, for every

σ \in (0, π)

one has

sin σ - σ cos σ > 0,

which is equivalent to the classical fact that

t \mapsto \frac{sin t}{t}

is strictly decreasing on

(0, π)

. Hence, the first term inside the parentheses in (*) is strictly positive whenever

sin Δ θ \neq 0

, i.e., for

Δ θ \in (0, π)

. The second term

σ cos Δ θ {sin}^{2} σ

may change sign but is bounded below, whereas the first term is strictly positive and dominates near any interior point. Since the prefactor

\frac{c^{2}}{{sin}^{3} σ}

is positive, we conclude that

f_{ϕ}^{″} (Δ θ) > 0 for all Δ θ \in (0, π),

which establishes strict convexity on

(0, π)

. □

Remark 11

(Intuition for beginners). The strict convexity of

f_{ϕ}

arises from the fact that, on a sphere, the geodesic distance along a latitude circle grows “faster than linearly” with the longitudinal separation. The crucial inequality

sin σ - σ cos σ > 0 for all σ \in (0, π)

expresses that the function

\frac{sin σ}{σ}

is strictly decreasing on

(0, π)

. Geometrically, this means that the spherical arc behaves more and more “curved” as σ increases, so increments in

Δ θ

produce increasingly larger contributions to the squared distance. This curvature effect forces the second derivative

f_{ϕ}^{''} (Δ θ)

to remain positive on

(0, π)

, which is precisely the definition of strict convexity.

Corollary 1

(Unique boundary on a ring). Fix

ϕ \in (- \frac{π}{2}, \frac{π}{2})

and let

q_{i} = (ϕ, α_{i})

and

q_{j} = (ϕ, α_{j})

be two distinct points on the same latitude. Consider the shorter longitude interval (arc)

I \subset R / (2 π Z)

joining

α_{i}

to

α_{j}

. Then there exists a unique longitude

θ^{*} \in I

such that

d_{G} ((ϕ, θ^{*}), q_{i}) = d_{G} ((ϕ, θ^{*}), q_{j}) .

Moreover, for

θ \in I

one has

θ < θ^{*} ⟹ d_{G} ((ϕ, θ), q_{i}) < d_{G} ((ϕ, θ), q_{j}), θ > θ^{*} ⟹ d_{G} ((ϕ, θ), q_{i}) > d_{G} ((ϕ, θ), q_{j}) .

In particular, along the shorter arc I, the Voronoi boundary between

q_{i}

and

q_{j}

intersects the ring at exactly one point.

Proof.

Write the squared-distance along the latitude

ϕ

as in Lemma 1:

f_{ϕ} (Δ θ) : = d_{G} {((ϕ, 0), (ϕ, Δ θ))}^{2}, Δ θ \in [0, π] .

By Lemma 1,

f_{ϕ}

is even and strictly convex on

(0, π)

. Parametrize the shorter arc I by a variable

t \in [0, L]

(where

L \in (0, π]

is the arc-length in longitude), so that

θ (t)

moves monotonically from

α_{i}

to

α_{j}

along I. Then the squared distances to

q_{i}

and

q_{j}

along I can be written as

D_{i} (t) = f_{ϕ} (t), D_{j} (t) = f_{ϕ} (L - t) .

Define

g (t) : = D_{i} (t) - D_{j} (t) = f_{ϕ} (t) - f_{ϕ} (L - t)

for

t \in [0, L]

. Since

f_{ϕ}

is differentiable on

(0, π)

and strictly convex, its derivative is strictly increasing on

(0, π)

. Hence for

t \in (0, L)

,

g^{'} (t) = f_{ϕ}^{'} (t) + f_{ϕ}^{'} (L - t) > 0,

because

f_{ϕ}^{'}

is strictly increasing and the evenness of

f_{ϕ}

gives

f_{ϕ}^{'} (0) = 0 and hence f_{ϕ}^{'} (u) > 0

for

u \in (0, π)

. Therefore, g is strictly increasing on

[0, L]

. Moreover,

g (0) = f_{ϕ} (0) - f_{ϕ} (L) < 0, g (L) = f_{ϕ} (L) - f_{ϕ} (0) > 0 .

By continuity and strict monotonicity, there exists a unique

t^{*} \in (0, L)

such that

g (t^{*}) = 0

. Equivalently,

θ^{*} : = θ (t^{*})

is the unique point on I where the two distances are equal. The strict sign change of g yields the stated inequalities on either side of

θ^{*}

. □
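The root t* of g can be located by bisection, since g(0) < 0 < g(L) and g is increasing on [0, L]. The sketch below is illustrative only (the names `f_phi` and `equidistant_offset` are assumptions). Because both points lie on the same latitude, symmetry forces g(L/2) = 0, so the computed root should agree with the midpoint of the arc.

```python
import math

def f_phi(phi, dtheta):
    """Squared geodesic distance between (phi, 0) and (phi, dtheta), dtheta in [0, pi]."""
    w = math.cos(phi) * math.sin(dtheta / 2.0)
    return (2.0 * math.asin(w)) ** 2

def equidistant_offset(phi, L, tol=1e-12):
    """Bisection root of g(t) = f_phi(t) - f_phi(L - t) on [0, L]."""
    def g(t):
        return f_phi(phi, t) - f_phi(phi, L - t)

    lo, hi = 0.0, L  # g(lo) < 0 < g(hi) for L in (0, pi]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid  # still closer to q_i: the root lies to the right
        else:
            hi = mid  # at or past the root: it lies to the left
    return 0.5 * (lo + hi)
```

For example, `equidistant_offset(0.7, 2.0)` returns a value within the tolerance of 1.0, the midpoint of an arc of longitudinal length 2.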

Proof of Theorem 2.

Fix a latitude

ϕ \in (- \frac{π}{2}, \frac{π}{2})

and consider the ring

R_{ϕ} = {(ϕ, θ_{j}) : θ_{j} = 2 π j / N, j = 0, \dots, N - 1} .

Let

Q = {q_{1}, \dots, q_{n}} \subset S^{2}

be an optimal set of n-means, and for each

q_{i}

denote by

X_{i} : = R_{ϕ} \cap V (q_{i})

the set of ring points assigned to

q_{i}

under the Voronoi partition induced by Q.

Step 1: Reduction to representatives on the ring. Suppose

X_{i} \neq \emptyset

. If

q_{i}

does not lie on latitude

ϕ

, reflect

q_{i}

across the plane tangent to the sphere along the ring to obtain a point

q_{i}^{'}

symmetric with respect to

R_{ϕ}

. For every

x \in R_{ϕ}

one has

d_{G} (x, q_{i}) = d_{G} (x, q_{i}^{'}) .

Let

{\tilde{q}}_{i}

be the midpoint of the geodesic segment joining

q_{i}

and

q_{i}^{'}

. Then

{\tilde{q}}_{i}

lies on latitude

ϕ

and satisfies

d_{G} (x, {\tilde{q}}_{i}) \leq d_{G} (x, q_{i}) for all x \in R_{ϕ},

with equality only if

q_{i}

already lies on the ring. Replacing

q_{i}

by

{\tilde{q}}_{i}

does not increase the total distortion. Hence, without loss of optimality, every representative serving ring points lies on

R_{ϕ}

.

Step 2: Unique boundary between two representatives on the ring. Let

q_{i} = (ϕ, α_{i})

and

q_{j} = (ϕ, α_{j})

be two distinct representatives on

R_{ϕ}

. By Corollary 1, along the shorter arc of

R_{ϕ}

joining

α_{i}

and

α_{j}

there exists a unique longitude

θ^{*}

at which

d_{G} ((ϕ, θ^{*}), q_{i}) = d_{G} ((ϕ, θ^{*}), q_{j}),

and the difference of squared distances changes sign exactly once. Consequently, the Voronoi boundary between

q_{i}

and

q_{j}

intersects the ring at exactly one point, and the ring is partitioned into two contiguous arcs, one assigned to

q_{i}

and the other to

q_{j}

.

Step 3: Exclusion of disconnected assignments. Suppose, toward a contradiction, that

X_{i}

is not contiguous along the cyclic order of the ring. Then there exist three consecutive ring points

x_{a}, x_{b}, x_{c} \in R_{ϕ}

(in cyclic order) such that

x_{a}, x_{c} \in X_{i}, x_{b} \in X_{j}

for some

j \neq i

. In particular,

d_{G} (x_{b}, q_{j}) \leq d_{G} (x_{b}, q_{i}), d_{G} (x_{a}, q_{i}) \leq d_{G} (x_{a}, q_{j}), d_{G} (x_{c}, q_{i}) \leq d_{G} (x_{c}, q_{j}) .

However, by strict convexity of

f_{ϕ} (Δ θ)

from Lemma 1 and the monotonicity result of Corollary 1, the function

x ⟼ d_{G} {(x, q_{i})}^{2} - d_{G} {(x, q_{j})}^{2}

cannot change sign twice along a connected arc of the ring. Thus assigning

x_{b}

to

q_{j}

while its immediate neighbors are assigned to

q_{i}

contradicts the unique sign change property. Equivalently, reassigning

x_{b}

from

q_{j}

to

q_{i}

would strictly decrease the total distortion, contradicting the optimality of Q.

Conclusion. Each nonempty

X_{i}

must therefore consist of a single contiguous block of consecutive longitudes on the ring (in cyclic order). This completes the proof. □
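The contiguity conclusion can be illustrated numerically in the simplest situation, where all representatives already lie on the ring (the outcome of Step 1). The sketch below assigns each ring point to its nearest representative and checks that every label occupies a single cyclic block. It is an illustration under that assumption, with hypothetical helper names, and not a verification of the theorem for arbitrary configurations.

```python
import math

def ring_distance(phi, a, b):
    """Geodesic distance between (phi, a) and (phi, b) on the unit sphere."""
    d = abs(a - b) % (2.0 * math.pi)
    d = min(d, 2.0 * math.pi - d)  # cyclic longitude difference in [0, pi]
    return 2.0 * math.asin(math.cos(phi) * math.sin(d / 2.0))

def assign_ring(phi, N, rep_longitudes):
    """Label of the nearest representative for each ring point theta_j = 2*pi*j/N."""
    labels = []
    for j in range(N):
        theta = 2.0 * math.pi * j / N
        labels.append(min(range(len(rep_longitudes)),
                          key=lambda i: ring_distance(phi, theta, rep_longitudes[i])))
    return labels

def is_cyclically_contiguous(labels, i):
    """True if the positions carrying label i form one block in cyclic order."""
    member = [lab == i for lab in labels]
    if not any(member) or all(member):
        return True
    # One block means exactly one cyclic transition from non-member to member.
    # member[j - 1] with j = 0 wraps around to the last position.
    entries = sum(1 for j in range(len(labels)) if member[j] and not member[j - 1])
    return entries == 1
```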

Proof of Theorem 3.

Assume, toward a contradiction, that there exists an optimal Voronoi region

V (q^{*})

that intersects two distinct components

C_{s}

and

C_{t}

with

s \neq t

. Define

A : = C_{s} \cap V (q^{*}), B : = C_{t} \cap V (q^{*}),

and note that both A and B are nonempty.

By assumption,

A \subset B (p_{s}, r)

and

B \subset B (p_{t}, r)

, where the geodesic balls

B (p_{s}, r)

and

B (p_{t}, r)

are disjoint. Hence, for all

x \in A

and

y \in B

,

d_{G} (x, p_{s}) < d_{G} (x, p_{t}), d_{G} (y, p_{t}) < d_{G} (y, p_{s}) .

Let

q_{s} : = arg min_{q \in S^{2}} \sum_{x \in A} d_{G} {(x, q)}^{2}, q_{t} : = arg min_{q \in S^{2}} \sum_{y \in B} d_{G} {(y, q)}^{2}

denote the intrinsic (Karcher) means of A and B, respectively. Since

A \subset B (p_{s}, r)

and

B \subset B (p_{t}, r)

, it follows that

q_{s} \in B (p_{s}, r), q_{t} \in B (p_{t}, r) .

By optimality of

q_{s}

and

q_{t}

, we have

\sum_{x \in A} d_{G} {(x, q_{s})}^{2} \leq \sum_{x \in A} d_{G} {(x, q^{*})}^{2}, \sum_{y \in B} d_{G} {(y, q_{t})}^{2} \leq \sum_{y \in B} d_{G} {(y, q^{*})}^{2} .

If both inequalities were equalities, then

q^{*} = q_{s} = q_{t}

, which is impossible since

q_{s}

and

q_{t}

lie in disjoint geodesic balls. Therefore, at least one of the above inequalities is strict, and hence

min \{\sum_{u \in A \cup B} d_{G} {(u, q_{s})}^{2}, \sum_{u \in A \cup B} d_{G} {(u, q_{t})}^{2}\} < \sum_{u \in A \cup B} d_{G} {(u, q^{*})}^{2} .

Let

q^{†} \in {q_{s}, q_{t}}

be the point achieving the minimum above, and define

\tilde{Q} : = (Q^{*} ∖ {q^{*}}) \cup {q^{†}} .

Then

| \tilde{Q} | = n

. Assign each point of X to its nearest representative in

\tilde{Q}

. For points in

A \cup B

, the total distortion strictly decreases, while for all other points, the distortion does not increase. Consequently,

V_{n} (\tilde{Q}; P) < V_{n} (Q^{*}; P),

contradicting the optimality of

Q^{*}

.

Hence, no optimal Voronoi region can intersect two distinct components, and each Voronoi cell intersects at most one

C_{t}

. This completes the proof. □

Proof of Theorem 4.

(i) Within-ring structure. Fix a ring

R_{t}

and an optimal

Q^{*}

. By Theorem 2, on a fixed latitude the ring points assigned to any

q \in Q^{*}

form (if nonempty) a single contiguous block in cyclic order. Let a nonempty block on

R_{t}

have longitudes

{θ_{t, j_{0}}, \dots, θ_{t, j_{0} + ℓ - 1}}

(indices mod

N_{t}

). Writing the squared geodesic distance along the ring as

f_{ϕ_{t}} (Δ θ)

, Lemma 1 shows

f_{ϕ_{t}}

is even and strictly convex, hence the intrinsic (Karcher) mean of the block lies on the same latitude

ϕ_{t}

and at the mid-longitude of the block; this uniquely minimizes the block’s sum of squared distances. Finally, for a fixed ring and fixed

k_{t}

, distributing

N_{t}

consecutive points into

k_{t}

contiguous blocks that are as equal as possible minimizes the sum of convex costs; thus block lengths differ by at most one, i.e., each block has size

⌊ N_{t} / k_{t} ⌋

or

⌈ N_{t} / k_{t} ⌉

.

(ii) No cross-ring mixing. Suppose, toward a contradiction, that there exists an optimal Voronoi cell

V (q^{*})

that intersects two distinct rings

R_{s}

and

R_{t}

with

s \neq t

. Define

A : = R_{s} \cap V (q^{*}), B : = R_{t} \cap V (q^{*}),

and note that both A and B are nonempty.

Let

q_{s}

and

q_{t}

denote the intrinsic (Karcher) means of A and B, respectively. By Lemma 1 and the within-ring structure established in part (i),

q_{s}

lies on latitude

ϕ_{s}

at the mid-longitude of the block A, and similarly

q_{t}

lies on latitude

ϕ_{t}

at the mid-longitude of B.

By definition of intrinsic means,

\sum_{x \in A} d_{G} {(x, q_{s})}^{2} \leq \sum_{x \in A} d_{G} {(x, q^{*})}^{2}, \sum_{y \in B} d_{G} {(y, q_{t})}^{2} \leq \sum_{y \in B} d_{G} {(y, q^{*})}^{2} .

If both inequalities were equalities, then

q^{*}

would simultaneously minimize the squared geodesic distortion over both A and B. This is impossible unless A or B is trivial, since the intrinsic mean of a nontrivial block on a ring is uniquely located on the corresponding latitude and mid-longitude. Therefore, at least one of the above inequalities is strict, and hence

min \{\sum_{u \in A \cup B} d_{G} {(u, q_{s})}^{2}, \sum_{u \in A \cup B} d_{G} {(u, q_{t})}^{2}\} < \sum_{u \in A \cup B} d_{G} {(u, q^{*})}^{2} .

Let

q^{†} \in {q_{s}, q_{t}}

be the point achieving the minimum above, and define

\tilde{Q} : = (Q^{*} ∖ {q^{*}}) \cup {q^{†}} .

Then

| \tilde{Q} | = n

. Assign each point of X to its nearest representative in

\tilde{Q}

. For all points in

A \cup B

, the total distortion strictly decreases, while for all other points, the distortion does not increase. Consequently,

V_{n} (\tilde{Q}; P) < V_{n} (Q^{*}; P),

which contradicts the optimality of

Q^{*}

.

Therefore, no optimal Voronoi cell can intersect two distinct rings. This completes the proof of part (ii).

(iii) Allocation across rings.

We formalize the allocation argument using a discrete convexity and exchange principle. Fix a ring

R_{t}

with

N_{t}

equally spaced longitudes on latitude

ϕ_{t}

. By (i), when k representatives serve

R_{t}

, the optimal within-ring partition consists of k contiguous blocks whose lengths differ by at most one; moreover each block is represented at its mid-longitude. Let

E_{t} (k)

denote the corresponding minimal distortion on

R_{t}

.

Step 1: Monotonicity of

E_{t} (k)

. Passing from k to

k + 1

splits one (largest) block into two nearly equal sub-blocks, each represented at its own mid-longitude. No point in the split block moves farther from its nearest representative and, provided the block contains at least two points, at least one point moves strictly closer; the total within-ring distortion therefore strictly decreases. Hence

E_{t} (k + 1) < E_{t} (k)

for all

k \geq 0

; in particular

E_{t}

is strictly decreasing.

Step 2: Discrete convexity (diminishing returns). Write the (longitude-only) cost function on latitude

ϕ_{t}

as

f_{ϕ_{t}} (Δ θ) : = d_{G} {((ϕ_{t}, 0), (ϕ_{t}, Δ θ))}^{2},

which is even and strictly convex in

Δ θ

by Lemma 1. In the optimal k-block partition, each block of length L contributes a cost of the form

Φ_{t} (L) : = \sum_{m = - (L - 1) / 2}^{(L - 1) / 2} f_{ϕ_{t}} (\frac{2 π m}{N_{t}}),

centered at its midpoint (this expression is independent of the absolute longitude by cyclic symmetry). Since

f_{ϕ_{t}}

is strictly convex and the summation window shifts symmetrically about the midpoint, the discrete second difference

Δ^{2} Φ_{t} (L) : = Φ_{t} (L + 1) - 2 Φ_{t} (L) + Φ_{t} (L - 1)

is non-negative and is, in fact, strictly positive for all admissible L; hence

Φ_{t}

is (strictly) discrete convex in L. Consequently, the marginal drop produced by splitting a block of length L into lengths

⌊ L / 2 ⌋

and

⌈ L / 2 ⌉

,

G_{t} (L) : = Φ_{t} (L) - (Φ_{t} (⌊ L / 2 ⌋) + Φ_{t} (⌈ L / 2 ⌉)),

is (strictly) increasing in L: splitting a larger block saves more distortion than splitting a smaller one. In the optimal k-block partition, the largest block length is nonincreasing in k, so the ring-level marginal drop

Δ_{t} (k) : = E_{t} (k) - E_{t} (k + 1)

is (strictly) decreasing in k. This is the discrete convexity (diminishing returns) of

E_{t}

.

Step 3: Exchange (pairwise-improvement) argument. Consider two feasible allocations

k = (k_{1}, \dots, k_{T})

and

k^{'} = (k_{1}^{'}, \dots, k_{T}^{'})

with

\sum_{t} k_{t} = \sum_{t} k_{t}^{'} = n

. Suppose there exist indices

a \neq b

with

k_{a} \geq 1

such that

Δ_{a} (k_{a} - 1) < Δ_{b} (k_{b}) .

Move one representative from ring a to ring b, producing

\hat{k}

with

{\hat{k}}_{a} = k_{a} - 1

,

{\hat{k}}_{b} = k_{b} + 1

and

{\hat{k}}_{t} = k_{t}

otherwise. By the definition of

Δ_{t} (\cdot)

and the preceding monotonicity/convexity,

E_{a} ({\hat{k}}_{a}) + E_{b} ({\hat{k}}_{b}) = E_{a} (k_{a}) + Δ_{a} (k_{a} - 1) + E_{b} (k_{b}) - Δ_{b} (k_{b}) < E_{a} (k_{a}) + E_{b} (k_{b}),

while

E_{t} ({\hat{k}}_{t}) = E_{t} (k_{t})

for all

t \notin {a, b}

. Thus

\sum_{t} E_{t} ({\hat{k}}_{t}) < \sum_{t} E_{t} (k_{t})

, i.e., the exchange strictly improves the total distortion whenever some pair violates

Δ_{a} (k_{a} - 1) \geq Δ_{b} (k_{b}) for all a, b .

Hence any optimal allocation must satisfy the equal-marginal (no-improving exchange) condition

Δ_{t} (k_{t} - 1) \geq λ \geq Δ_{t} (k_{t}) for some λ \in R and all t,

which is equivalent to the greedy selection rule below. This condition follows from the exchange argument just given: any violation would yield a strictly improving reassignment of representatives, contradicting optimality.

Step 4: Greedy (discrete water-filling) optimality. Start from

k = 0

. At each step

m = 0, 1, \dots, n - 1

, choose an index

t_{m} \in arg max_{t} Δ_{t} (k_{t}),

and set

k_{t_{m}} \leftarrow k_{t_{m}} + 1

. Because

Δ_{t} (\cdot)

are (strictly) decreasing, this construction maintains the equal-marginal condition after every increment. By Step 3, no exchange can improve the resulting allocation at any stage, so after n increments the final k is globally optimal. This is precisely the discrete water-filling rule: at each step allocate the next representative to the ring that yields the largest current marginal drop

Δ_{t} (k_{t})

. □
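The greedy rule of Step 4 can be sketched in a few lines. The sketch below makes one assumption not stated in the proof: since part (ii) forces every ring to be served by its own representatives, it starts from one representative per ring (equivalently, Δ_t(0) is treated as infinite) and distributes the remaining n − T greedily. The common factor 1/M is omitted because it does not affect the comparison of marginal drops. All function names are illustrative.

```python
import math

def f_phi(phi, dtheta):
    """Squared geodesic distance along latitude phi at longitude offset dtheta."""
    return (2.0 * math.asin(math.cos(phi) * math.sin(dtheta / 2.0))) ** 2

def block_cost(phi, N, L):
    """Phi_t(L): L consecutive ring points served from their mid-longitude."""
    return sum(f_phi(phi, abs(2.0 * math.pi * (j - (L - 1) / 2.0) / N))
               for j in range(L))

def ring_cost(phi, N, k):
    """M * E_t(k): k blocks of sizes floor(N/k) or ceil(N/k), as in part (i)."""
    L, r = divmod(N, k)
    return (k - r) * block_cost(phi, N, L) + r * block_cost(phi, N, L + 1)

def greedy_allocation(rings, n):
    """rings: list of (phi_t, N_t). Returns [k_1, ..., k_T] with sum n."""
    T = len(rings)
    if n < T:
        raise ValueError("need at least one representative per ring")
    k = [1] * T  # assumption: every ring starts with one representative

    def drop(t):
        """Marginal drop Delta_t(k_t) = E_t(k_t) - E_t(k_t + 1), up to 1/M."""
        phi, N = rings[t]
        if k[t] >= N:
            return 0.0  # every point already has its own representative
        return ring_cost(phi, N, k[t]) - ring_cost(phi, N, k[t] + 1)

    for _ in range(n - T):
        t_best = max(range(T), key=drop)  # ring with the largest current drop
        k[t_best] += 1
    return k
```

For two rings with 12 points each, one on the equator and one at latitude 1.3 rad, and n = 4, the sketch sends both extra representatives to the equatorial ring, whose points are spread over a much longer circle.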

Corollary 2

(Explicit Formula for the Within-Ring Distortion). Fix a ring

R_{t} = {(ϕ_{t}, θ_{t, j}) : θ_{t, j} = 2 π j / N_{t}, j = 0, \dots, N_{t} - 1}

at latitude

ϕ_{t}

, and suppose exactly

k_{t} \geq 1

representatives serve

R_{t}

. Let

L_{t} : = ⌊N_{t} / k_{t}⌋, r_{t} : = N_{t} - k_{t} L_{t} \in {0, 1, \dots, k_{t} - 1} .

Then, by Theorem 4(i), the optimal within-ring partition of

R_{t}

consists of

r_{t}

contiguous blocks of length

L_{t} + 1

and

k_{t} - r_{t}

contiguous blocks of length

L_{t}

, with each block represented at its mid-longitude on the same latitude

ϕ_{t}

. Writing

f_{ϕ_{t}} (Δ θ) : = d_{G} {((ϕ_{t}, 0), (ϕ_{t}, Δ θ))}^{2} (so f_{ϕ_{t}} is even in Δ θ),

the exact within-ring distortion contributed by

R_{t}

is

E_{t} (k_{t}) = \frac{1}{M} [(k_{t} - r_{t}) Φ_{t} (L_{t}) + r_{t} Φ_{t} (L_{t} + 1)],

where, for any integer

L \geq 1

, the block cost admits the parity-uniform expression

Φ_{t} (L) = \begin{cases} 2 \sum_{m = 1}^{(L - 1) / 2} f_{ϕ_{t}} (\frac{2 π m}{N_{t}}), & \text{if } L \text{ is odd}, \\ 2 \sum_{m = 1}^{L / 2} f_{ϕ_{t}} (\frac{(2 m - 1) π}{N_{t}}), & \text{if } L \text{ is even}. \end{cases}

Consequently, the total distortion decomposes as

V_{n, 2} (P) = \sum_{t = 1}^{T} E_{t} (k_{t}),

with the allocation vector

(k_{1}, \dots, k_{T})

determined by Theorem 4(iii).
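As a sanity check on the block-cost expression, the following sketch compares the parity-split formula for Φ_t(L) with a direct computation: it places L consecutive ring points in R³, takes the point at their mid-longitude as representative, and sums squared great-circle distances. The helper names and the use of Cartesian coordinates are assumptions made for illustration.

```python
import math

def to_xyz(phi, theta):
    """Unit vector for latitude phi and longitude theta."""
    return (math.cos(phi) * math.cos(theta),
            math.cos(phi) * math.sin(theta),
            math.sin(phi))

def geodesic(x, y):
    """Great-circle distance between unit vectors (dot product clamped to [-1, 1])."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.acos(max(-1.0, min(1.0, dot)))

def f_phi(phi, dtheta):
    """Squared geodesic distance along latitude phi at longitude offset dtheta."""
    return (2.0 * math.asin(math.cos(phi) * math.sin(abs(dtheta) / 2.0))) ** 2

def block_cost_formula(phi, N, L):
    """Phi_t(L) from the parity-split expression of Corollary 2."""
    if L % 2 == 1:
        return 2.0 * sum(f_phi(phi, 2.0 * math.pi * m / N)
                         for m in range(1, (L - 1) // 2 + 1))
    return 2.0 * sum(f_phi(phi, (2 * m - 1) * math.pi / N)
                     for m in range(1, L // 2 + 1))

def block_cost_direct(phi, N, L, j0=0):
    """Squared distances from ring points j0, ..., j0+L-1 to their mid-longitude."""
    thetas = [2.0 * math.pi * (j0 + j) / N for j in range(L)]
    q = to_xyz(phi, 0.5 * (thetas[0] + thetas[-1]))
    return sum(geodesic(to_xyz(phi, t), q) ** 2 for t in thetas)

def ring_distortion(phi, N, k, M):
    """E_t(k) with r blocks of length L+1 and k-r blocks of length L."""
    L, r = divmod(N, k)
    return ((k - r) * block_cost_formula(phi, N, L)
            + r * block_cost_formula(phi, N, L + 1)) / M
```

The two block costs agree up to rounding for every block length and starting index, which reflects the cyclic symmetry of the ring.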

Remark 12

(Equal-block special case). If

N_{t}

is divisible by

k_{t}

(so

r_{t} = 0

and

L_{t} = N_{t} / k_{t}

), then all blocks have the same length and

E_{t} (k_{t}) = \frac{k_{t}}{M} Φ_{t} (\frac{N_{t}}{k_{t}}) .

Corollary 3

(Asymptotic Within-Ring Distortion). Assume the setting of Corollary 2. As

N_{t} \to \infty

(with

k_{t} \geq 1

fixed or

k_{t} = o (N_{t})

), the within-ring distortion admits the asymptotic expansion

E_{t} (k_{t}) = \frac{N_{t}}{M} \cdot {cos}^{2} ϕ_{t} \cdot \frac{π^{2}}{3 k_{t}^{2}} + o (\frac{N_{t}}{M} \cdot \frac{1}{k_{t}^{2}}) .

Equivalently, to leading order,

E_{t} (k_{t}) \sim \frac{N_{t}}{M} \cdot {cos}^{2} ϕ_{t} \cdot \frac{π^{2}}{3 k_{t}^{2}} .

Proof.

From Theorem 4(i), the

R_{t}

-contribution

E_{t} (k_{t})

is a sum of block costs of the form

\sum f_{ϕ_{t}} (Δ θ_{j})

with

Δ θ_{j}

taking the discrete offsets listed in Corollary 2. Using the Taylor expansion

σ (ϕ_{t}, Δ θ) : = d_{G} ((ϕ_{t}, 0), (ϕ_{t}, Δ θ)) = cos ϕ_{t} | Δ θ | + O ({| Δ θ |}^{3}),

uniform for bounded

ϕ_{t}

, we obtain

f_{ϕ_{t}} (Δ θ) = σ {(ϕ_{t}, Δ θ)}^{2} = {cos}^{2} ϕ_{t} Δ θ^{2} + O (Δ θ^{4})

. Summing over a block of length

L ≍ N_{t} / k_{t}

whose offsets are arithmetic grids of step

≍ π / N_{t}

yields

Φ_{t} (L) = {cos}^{2} ϕ_{t} \cdot \frac{π^{2}}{3} \cdot \frac{L^{3}}{N_{t}^{2}} + o (\frac{L^{3}}{N_{t}^{2}}) .

With

k_{t} - r_{t}

blocks of size

L_{t}

and

r_{t}

of size

L_{t} + 1

, and

L_{t} \sim N_{t} / k_{t}

, we find

E_{t} (k_{t}) = \frac{1}{M} \cdot k_{t} \cdot {cos}^{2} ϕ_{t} \cdot \frac{π^{2}}{3} \cdot \frac{L_{t}^{3}}{N_{t}^{2}} + o (\frac{N_{t}}{M} \cdot \frac{1}{k_{t}^{2}}) = \frac{N_{t}}{M} \cdot {cos}^{2} ϕ_{t} \cdot \frac{π^{2}}{3 k_{t}^{2}} + o (\frac{N_{t}}{M} \cdot \frac{1}{k_{t}^{2}}) .

□
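The leading term can be compared with the exact within-ring distortion numerically. The sketch below is an illustration with hypothetical function names; it uses a single ring (so M = N_t) and a large number of representatives, so that every block spans a short arc and the Taylor expansion of σ applies to all offsets in the block.

```python
import math

def f_phi(phi, dtheta):
    """Squared geodesic distance along latitude phi at longitude offset dtheta."""
    return (2.0 * math.asin(math.cos(phi) * math.sin(dtheta / 2.0))) ** 2

def block_cost(phi, N, L):
    """Phi_t(L): symmetric sum of f_phi over the L offsets around the block midpoint."""
    return sum(f_phi(phi, abs(2.0 * math.pi * (j - (L - 1) / 2.0) / N))
               for j in range(L))

def exact_ring_distortion(phi, N, k, M):
    """E_t(k) from the exact block decomposition of Corollary 2."""
    L, r = divmod(N, k)
    return ((k - r) * block_cost(phi, N, L) + r * block_cost(phi, N, L + 1)) / M

def leading_term(phi, N, k, M):
    """(N/M) * cos^2(phi) * pi^2 / (3 k^2), the leading term of Corollary 3."""
    return (N / M) * math.cos(phi) ** 2 * math.pi ** 2 / (3.0 * k ** 2)

def ratio(phi, N, k, M):
    """Exact distortion divided by the leading term."""
    return exact_ring_distortion(phi, N, k, M) / leading_term(phi, N, k, M)
```

With ϕ = 0.3, N = 20000, k = 200, and M = N, the ratio differs from 1 by less than 10⁻³.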

Remark 13

(Interpretation). The factor

\frac{N_{t}}{M}

is the probability mass of ring

R_{t}

under the discrete uniform P (over all M support points). The multiplicative

{cos}^{2} ϕ_{t}

comes from the local relation

d_{G} ((ϕ_{t}, 0), (ϕ_{t}, Δ θ)) \approx cos ϕ_{t} | Δ θ |

, i.e., the effective one-dimensional radius of the latitude circle is

cos ϕ_{t}

. The

k_{t}^{- 2}

law is the usual one-dimensional

n^{- 2}

scaling of high-resolution quantization along the ring.

5. Stability of Optimal Sets Under Perturbation

Quantization configurations arising from experimental or numerical data are often subject to small perturbations of the support points. It is therefore important to understand how the optimal set of n-means and the corresponding quantization error vary when the underlying distribution changes slightly.

Note 1

(Underlying data space and topology). Throughout this section, a finite data set of size M is viewed as an ordered M-tuple

x = (x_{1}, \dots, x_{M}) \in {(S^{2})}^{M} .

The product topology on

{(S^{2})}^{M}

is metrized by the sup-metric

d_{\infty} (x, y) : = max_{1 \leq i \leq M} d_{G} (x_{i}, y_{i}),

where

d_{G}

denotes the geodesic distance on

S^{2}

. Under the assumed one-to-one correspondence between the perturbed data

x^{(m)} = (x_{1}^{(m)}, \dots, x_{M}^{(m)})

and the limiting data

x = (x_{1}, \dots, x_{M})

, the condition

ε_{m} : = max_{1 \leq i \leq M} d_{G} (x_{i}^{(m)}, x_{i}) \to 0

is exactly the statement that

x^{(m)} \to x

in

{(S^{2})}^{M}

with respect to the metric

d_{\infty}

.

Perturbed Distributions

Let

X = {x_{1}, x_{2}, \dots, x_{M}} \subset S^{2} and X^{'} = {x_{1}^{'}, x_{2}^{'}, \dots, x_{M}^{'}} \subset S^{2}

be two finite sets with the same cardinality M. We assume a one-to-one correspondence between

x_{i}

and

x_{i}^{'}

and define

ε = max_{1 \leq i \leq M} d_{G} (x_{i}, x_{i}^{'}),

the maximal geodesic perturbation. Let P and

P^{'}

denote the corresponding uniform distributions:

P = \frac{1}{M} \sum_{i = 1}^{M} δ_{x_{i}}, P^{'} = \frac{1}{M} \sum_{i = 1}^{M} δ_{x_{i}^{'}} .

We wish to compare the optimal distortions

V_{n, 2} (P)

and

V_{n, 2} (P^{'})

and the corresponding optimal configurations.

Perturbation of the distortion functional.

Lemma 2

(Lipschitz Stability of the Distortion). Let

Q = {q_{1}, \dots, q_{| Q |}} \subset S^{2}

with

| Q | \leq n

, and let

P = \frac{1}{M} \sum_{i = 1}^{M} δ_{x_{i}}, P^{'} = \frac{1}{M} \sum_{i = 1}^{M} δ_{x_{i}^{'}}

be empirical measures such that

d_{G} (x_{i}, x_{i}^{'}) \leq ε

for all i. Then the distortion functional is Lipschitz-stable under these perturbations, in the sense that

|V_{n, 2} (P; Q) - V_{n, 2} (P^{'}; Q)| \leq 2 π ε .

Proof.

Define, for each

x \in S^{2}

,

{dist}_{Q} (x) : = min_{1 \leq k \leq | Q |} d_{G} (x, q_{k}) .

Since Q is finite, the minimum is attained for every x, so

{dist}_{Q}

is well defined. With this notation,

V_{n, 2} (P; Q) = \frac{1}{M} \sum_{i = 1}^{M} {dist}_{Q} {(x_{i})}^{2}, V_{n, 2} (P^{'}; Q) = \frac{1}{M} \sum_{i = 1}^{M} {dist}_{Q} {(x_{i}^{'})}^{2} .

Step 1 (Lipschitz property of

{dist}_{Q}

). For any fixed

q \in Q

and any

x, y \in S^{2}

, the triangle inequality gives

| d_{G} (x, q) - d_{G} (y, q) | \leq d_{G} (x, y)

. Taking the pointwise minimum over the finite set Q preserves the 1-Lipschitz constant; hence

|{dist}_{Q} (x) - {dist}_{Q} (y)| \leq d_{G} (x, y) for all x, y \in S^{2} .

(10)

Step 2 (Difference of squares). Set

a_{i} : = {dist}_{Q} (x_{i})

and

a_{i}^{'} : = {dist}_{Q} (x_{i}^{'})

. By (10) and the perturbation assumption,

| a_{i}^{'} - a_{i} | \leq d_{G} (x_{i}^{'}, x_{i}) \leq ε

. Also,

0 \leq a_{i}, a_{i}^{'} \leq π

since

d_{G} (\cdot, \cdot) \leq π

on

S^{2}

. Thus

|a_{i}^{' 2} - a_{i}^{2}| = | a_{i}^{'} - a_{i} | (a_{i}^{'} + a_{i}) \leq ε (a_{i}^{'} + a_{i}) \leq 2 π ε .

Step 3 (Average over i). Summing the previous bound over i and dividing by M yields

|V_{n, 2} (P^{'}; Q) - V_{n, 2} (P; Q)| = |\frac{1}{M} \sum_{i = 1}^{M} (a_{i}^{' 2} - a_{i}^{2})| \leq \frac{1}{M} \sum_{i = 1}^{M} 2 π ε = 2 π ε .

This proves the Lipschitz stability of the distortion. □
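The bound of Lemma 2 is easy to probe numerically. The sketch below draws random support points and representatives, moves each support point along a random great circle by a geodesic angle of at most ε, and returns the observed distortion gap together with 2πε. The sampling scheme, the default parameters, and the function names are assumptions made for illustration.

```python
import math
import random

def random_unit(rng):
    """Uniform random point on the sphere via a normalized Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(sum(c * c for c in v))
        if n > 1e-9:
            return tuple(c / n for c in v)

def geodesic(x, y):
    """Great-circle distance between unit vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.acos(max(-1.0, min(1.0, dot)))

def perturb(x, angle, rng):
    """Move x along a random great circle by the given geodesic angle."""
    while True:
        u = random_unit(rng)
        dot = sum(a * b for a, b in zip(x, u))
        t = tuple(b - dot * a for a, b in zip(x, u))  # component of u tangent at x
        n = math.sqrt(sum(c * c for c in t))
        if n > 1e-9:
            break
    return tuple(math.cos(angle) * a + math.sin(angle) * c / n for a, c in zip(x, t))

def distortion(points, reps):
    """V_{n,2}(P; Q) for the uniform measure on the given points."""
    return sum(min(geodesic(x, q) for q in reps) ** 2 for x in points) / len(points)

def lipschitz_gap(M=200, n=5, eps=0.05, seed=0):
    """Return (|V(P;Q) - V(P';Q)|, 2*pi*eps) for one random instance."""
    rng = random.Random(seed)
    X = [random_unit(rng) for _ in range(M)]
    Xp = [perturb(x, rng.uniform(0.0, eps), rng) for x in X]
    Q = [random_unit(rng) for _ in range(n)]
    return abs(distortion(X, Q) - distortion(Xp, Q)), 2.0 * math.pi * eps
```

In such experiments the observed gap is usually far below 2πε, in line with the non-sharpness of the constant.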

Remark 14

(On the non-sharpness of the Lipschitz bound). The Lipschitz constant

2 π

appearing in Lemma 2 is deliberately non-optimal and reflects a global worst-case estimate on the sphere. Indeed, the bound arises from the inequalities

0 \leq d_{G} (x, q) \leq π for all x, q \in S^{2},

together with the elementary estimate

| a^{2} - b^{2} | \leq | a - b | (a + b),

applied uniformly over the entire sphere. This argument ignores any finer geometric structure of the Voronoi partition induced by the configuration Q.

In typical quantization configurations, however, each Voronoi cell is contained in a geodesic ball of radius strictly smaller than π, often much smaller in practice. If one defines

R (Q) : = max_{1 \leq j \leq | Q |} sup_{x \in V_{j} (Q)} d_{G} (x, q_{j}),

the maximal cluster radius associated with the configuration Q, then the proof of Lemma 2 immediately yields the refined estimate

| V_{n, 2} (P; Q) - V_{n, 2} (P^{'}; Q) | \leq 2 R (Q) ε,

which improves the constant whenever

R (Q) < π

. In particular, for centroidal Voronoi configurations arising from well-distributed data,

R (Q)

is typically bounded away from π.

More generally, stability estimates for quantization functionals are closely related to stability properties of empirical measures under perturbations and to continuity of minimizers of Fréchet-type functionals on metric spaces. Such refinements are well known in Euclidean quantization theory and in the study of intrinsic means on Riemannian manifolds; see, for example, Graf and Luschgy [7], Karcher [20], and Pennec [22]. Developing sharper, data-dependent Lipschitz constants in the spherical setting would require detailed control of Voronoi geometry and cluster diameters, which lies beyond the scope of the present paper.

Corollary 4

(Continuity of Optimal Quantizers). Let

{P^{(m)}}_{m \geq 1}

be a sequence of empirical (uniform) probability measures on

S^{2}

,

P^{(m)} = \frac{1}{M} \sum_{i = 1}^{M} δ_{x_{i}^{(m)}},

whose supports

X^{(m)} = {x_{1}^{(m)}, \dots, x_{M}^{(m)}}

converge to

X = {x_{1}, \dots, x_{M}}

in the sense that

ε_{m} : = max_{1 \leq i \leq M} d_{G} (x_{i}^{(m)}, x_{i}) ⟶ 0 as m \to \infty .

Let

P = \frac{1}{M} \sum_{i = 1}^{M} δ_{x_{i}}

denote the limiting empirical measure.

Then, for every fixed

n \geq 1

, the following hold:

1. The optimal distortion values converge:

$V_{n, 2} (P^{(m)}) ⟶ V_{n, 2} (P) as m \to \infty .$
2. Let $Q^{(m), *}$ be an optimal set of n-means for $P^{(m)}$ . Then, after possibly reordering the points in each $Q^{(m), *}$ , one can extract a subsequence that converges to a limiting set $Q^{*}$ , and this limiting set $Q^{*}$ is an optimal set of n-means for P.

Proof.

Let

Q \subset S^{2}

with

| Q | \leq n

be arbitrary. By Lemma 2, for each m we have

| V_{n, 2} (P^{(m)}; Q) - V_{n, 2} (P; Q) | \leq 2 π ε_{m},

where

ε_{m} : = {max}_{1 \leq i \leq M} d_{G} (x_{i}^{(m)}, x_{i}) \to 0

.

(1) Convergence of optimal values. Fix

δ > 0

and choose

Q_{δ} \subset S^{2}

with

| Q_{δ} | \leq n

such that

V_{n, 2} (P; Q_{δ}) \leq V_{n, 2} (P) + δ .

Then for every m,

V_{n, 2} (P^{(m)}) \leq V_{n, 2} (P^{(m)}; Q_{δ}) \leq V_{n, 2} (P; Q_{δ}) + 2 π ε_{m} \leq V_{n, 2} (P) + δ + 2 π ε_{m} .

Taking

{lim sup}_{m \to \infty}

and using

ε_{m} \to 0

yields

\underset{m \to \infty}{lim sup} V_{n, 2} (P^{(m)}) \leq V_{n, 2} (P) + δ .

Since

δ > 0

is arbitrary,

{lim sup}_{m \to \infty} V_{n, 2} (P^{(m)}) \leq V_{n, 2} (P)

.

For the reverse inequality, fix m and let

Q^{(m), *}

be optimal for

P^{(m)}

, so

V_{n, 2} (P^{(m)}) = V_{n, 2} (P^{(m)}; Q^{(m), *})

. Applying Lemma 2 again,

V_{n, 2} (P) \leq V_{n, 2} (P; Q^{(m), *}) \leq V_{n, 2} (P^{(m)}; Q^{(m), *}) + 2 π ε_{m} = V_{n, 2} (P^{(m)}) + 2 π ε_{m} .

Taking

{lim inf}_{m \to \infty}

gives

V_{n, 2} (P) \leq {lim inf}_{m \to \infty} V_{n, 2} (P^{(m)})

. Combining both bounds yields

V_{n, 2} (P^{(m)}) ⟶ V_{n, 2} (P) .

(2) Convergence of a subsequence of optimal configurations. For each m, choose an optimal configuration

Q^{(m), *} = {q_{1}^{(m)}, \dots, q_{n}^{(m)}} \subset S^{2}

(padding with repetitions if necessary so that it is an n-tuple). Since

{(S^{2})}^{n}

is compact, there exist a subsequence (still denoted m) and points

q_{1}^{*}, \dots, q_{n}^{*} \in S^{2}

such that, after reordering the points in each

Q^{(m), *}

,

(q_{1}^{(m)}, \dots, q_{n}^{(m)}) ⟶ (q_{1}^{*}, \dots, q_{n}^{*}) in {(S^{2})}^{n} .

Let

Q^{*} : = {q_{1}^{*}, \dots, q_{n}^{*}}

. The mapping

Q \mapsto V_{n, 2} (P; Q)

is continuous on

{(S^{2})}^{n}

because it is a finite sum of continuous functions and the pointwise minimum over finitely many continuous functions is continuous. Hence

V_{n, 2} (P; Q^{(m), *}) ⟶ V_{n, 2} (P; Q^{*}) .

Using Lemma 2 once more,

| V_{n, 2} (P^{(m)}; Q^{(m), *}) - V_{n, 2} (P; Q^{(m), *}) | \leq 2 π ε_{m} \to 0,

so

V_{n, 2} (P^{(m)}; Q^{(m), *}) \to V_{n, 2} (P; Q^{*})

. But

V_{n, 2} (P^{(m)}; Q^{(m), *}) = V_{n, 2} (P^{(m)}) \to V_{n, 2} (P)

by part (1). Therefore

V_{n, 2} (P; Q^{*}) = V_{n, 2} (P)

, i.e.,

Q^{*}

is an optimal set of n-means for P. □

Remark 15.

Corollary 4 shows that optimal quantizers depend continuously on the underlying data points: if the support of

P^{(m)}

changes only slightly, then the optimal distortion and the corresponding optimal sets of n-means also change only slightly. In other words, optimal quantizers are stable under small perturbations of the data.

This stability property is important both theoretically and in practice. On the theoretical side, it guarantees that quantizers do not exhibit sudden jumps or discontinuous behavior when the data are slightly modified. From a practical perspective, real spherical datasets often contain measurement noise or numerical errors; the stability result ensures that such perturbations do not significantly affect the quality of the computed quantizers, and the algorithmic output remains reliable.

6. Algorithmic Construction of Optimal n-Means

In this section, we describe an iterative procedure for computing an optimal configuration of n-means for a given finite discrete uniform distribution P on

S^{2}

. Since the geometry of the sphere is non-Euclidean, the classical Lloyd algorithm cannot be applied directly. In particular, the notion of the mean must be replaced by the intrinsic (Karcher) mean on

S^{2}

, and Voronoi partitions must be defined using the geodesic distance.

The procedure described below is a natural spherical analogue of Lloyd’s method. It alternates between updating the Voronoi partition and relocating the representatives to the intrinsic means of their associated clusters. This iterative scheme typically converges rapidly in practice and serves as an effective computational tool for approximating optimal n-means on the sphere.

6.1. Lloyd-Type Algorithm on the Sphere

We now state the iteration precisely. Both the Voronoi partition update and the intrinsic mean update are taken with respect to the geodesic distance.

Let

Q^{(0)} = {q_{1}^{(0)}, \dots, q_{n}^{(0)}} \subset S^{2}

be an initial configuration. For each iteration

r \geq 0

, perform the following steps:

(i) Voronoi partition step: Assign each support point $x_{i} \in X$ to its nearest representative in $Q^{(r)}$ with respect to the geodesic distance. This gives the clusters

$X_{j} (Q^{(r)}) = {x_{i} \in X : d_{G} (x_{i}, q_{j}^{(r)}) \leq d_{G} (x_{i}, q_{k}^{(r)}) for all k} .$

This step induces the spherical Voronoi tessellation of the support X relative to the current configuration $Q^{(r)}$ .
(ii) Centroid update step: Move each representative to the intrinsic (Karcher) mean of its current cluster:

$q_{j}^{(r + 1)} = \underset{q \in S^{2}}{arg min} \sum_{x \in X_{j} (Q^{(r)})} d_{G} {(x, q)}^{2} .$

This update ensures the smallest possible sum of squared geodesic distances between the representative and the points in its cluster, analogous to replacing a representative by the arithmetic mean in the Euclidean Lloyd algorithm.

The iteration is terminated when

Q^{(r + 1)} = Q^{(r)}

(i.e., a fixed point is reached), or when the decrease in the distortion

V_{n, 2} (P; Q^{(r)})

falls below a prescribed tolerance. In practice, the method converges in only a few iterations, especially when the initial configuration is reasonably well distributed over the sphere.
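Steps (i) and (ii) can be sketched as follows. This is a minimal illustration, not the implementation used in the paper. The Karcher mean is approximated by a gradient iteration with the spherical log and exp maps, ties in step (i) are broken by lowest index, and empty clusters keep their representative. A backtracking safeguard, added here as an assumption, accepts an update only if it does not increase the cluster cost, so the recorded distortions are nonincreasing up to rounding. All names and default parameters are hypothetical.

```python
import math
import random

def geodesic(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    return math.acos(max(-1.0, min(1.0, dot)))

def log_map(q, x):
    """Tangent vector at q pointing toward x, with length d_G(q, x)."""
    dot = sum(a * b for a, b in zip(q, x))
    perp = tuple(b - dot * a for a, b in zip(q, x))
    n = math.sqrt(sum(c * c for c in perp))
    if n < 1e-12:  # x equals q, or is antipodal (direction undefined)
        return (0.0, 0.0, 0.0)
    d = geodesic(q, x)
    return tuple(d * c / n for c in perp)

def exp_map(q, v):
    """Point reached from q along the geodesic with initial velocity v."""
    n = math.sqrt(sum(c * c for c in v))
    if n < 1e-12:
        return q
    p = tuple(math.cos(n) * a + math.sin(n) * c / n for a, c in zip(q, v))
    s = math.sqrt(sum(c * c for c in p))
    return tuple(c / s for c in p)

def cluster_cost(q, pts):
    return sum(geodesic(q, x) ** 2 for x in pts)

def karcher_mean(pts, start, iters=50):
    """Approximate intrinsic mean by safeguarded gradient steps from `start`."""
    q = start
    for _ in range(iters):
        logs = [log_map(q, x) for x in pts]
        g = tuple(sum(l[i] for l in logs) / len(pts) for i in range(3))
        step = 1.0
        while step > 1e-6:
            cand = exp_map(q, tuple(step * c for c in g))
            if cluster_cost(cand, pts) <= cluster_cost(q, pts):
                break
            step *= 0.5  # backtrack until the cluster cost does not increase
        else:
            break  # no acceptable step was found
        if geodesic(q, cand) < 1e-12:
            break
        q = cand
    return q

def distortion(pts, reps):
    return sum(min(geodesic(x, q) for q in reps) ** 2 for x in pts) / len(pts)

def spherical_lloyd(pts, reps, max_iter=100, tol=1e-12):
    """Alternate steps (i) and (ii); returns the final reps and the distortion history."""
    history = [distortion(pts, reps)]
    for _ in range(max_iter):
        clusters = [[] for _ in reps]
        for x in pts:  # step (i): nearest representative
            j = min(range(len(reps)), key=lambda i: geodesic(x, reps[i]))
            clusters[j].append(x)
        # step (ii): intrinsic mean update (empty clusters keep their representative)
        reps = [karcher_mean(c, q) if c else q for c, q in zip(clusters, reps)]
        history.append(distortion(pts, reps))
        if history[-2] - history[-1] < tol:
            break
    return reps, history

def random_points(M, seed):
    """Hypothetical test data: M points drawn uniformly on the sphere."""
    rng = random.Random(seed)
    out = []
    while len(out) < M:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(sum(c * c for c in v))
        if n > 1e-9:
            out.append(tuple(c / n for c in v))
    return out
```

A typical call is `reps, history = spherical_lloyd(random_points(60, seed=1), random_points(4, seed=2))`; the list `history` then records the distortion after each iteration.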

6.2. Monotonicity and Fixed-Point Properties

We now record the basic analytic properties of the underlying distortion functional and geometric mappings driving the algorithm.

Lemma 3

(Monotonic decrease in distortion). At each iteration step, the distortion does not increase:

V_{n, 2} (P; Q^{(r + 1)}) \leq V_{n, 2} (P; Q^{(r)}) .

Proof.

Fix

r \geq 0

. Step (i) of the iteration assigns each

x_{i}

to the nearest representative in

Q^{(r)}

, which minimizes the contribution of

x_{i}

to the distortion for that iteration. Step (ii) then replaces each

q_{j}^{(r)}

by the intrinsic mean of its cluster, which minimizes the sum of squared distances from the points in that cluster to the representative. Therefore, neither step increases the overall distortion, and the distortion strictly decreases whenever

Q^{(r + 1)} \neq Q^{(r)}

, i.e., unless a fixed point has been reached. □

6.3. Convergence Properties

We next state the main convergence result for the iterative scheme.

Theorem 5.

Every accumulation point of the sequence

{Q^{(r)}}

generated by the above algorithm is a centroidal configuration for P. In particular, if the sequence converges, then its limit is a centroidal configuration.

Proof.

Let

Q^{(r_{ℓ})}

be a convergent subsequence with limit

\hat{Q} = {{\hat{q}}_{1}, \dots, {\hat{q}}_{n}}

. Since each step of the iteration does not increase the distortion, the sequence

{V_{n, 2} (P; Q^{(r)})}

is monotone decreasing and bounded below by 0, and hence convergent. In the limit, Step (i) implies that each

x \in X

is assigned to its nearest representative in

\hat{Q}

, and Step (ii) implies that each

{\hat{q}}_{j}

is the intrinsic mean of its cluster. Therefore,

\hat{Q}

satisfies the centroidal property and is a fixed point of the algorithm. □
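The centroidal property can be checked numerically. The sketch below uses the standard first-order condition for an intrinsic mean on the sphere: at a representative q, the average of the log-map vectors Log_q(x) over its cluster vanishes. This is a necessary condition only; the function names (`log_map`, `centroidal_residual`) and the toy data are illustrative assumptions, not part of the paper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def log_map(q, x):
    """Tangent vector at q pointing to x, with length d_G(q, x)."""
    c = max(-1.0, min(1.0, dot(q, x)))
    theta = math.acos(c)
    if theta < 1e-12:
        return (0.0, 0.0, 0.0)
    s = theta / math.sin(theta)  # ill-conditioned when x is nearly antipodal to q
    return tuple(s * (xi - c * qi) for xi, qi in zip(x, q))

def centroidal_residual(X, Q):
    """Largest norm, over clusters, of the mean log-map vector at the representative."""
    sums = [[0.0, 0.0, 0.0] for _ in Q]
    counts = [0] * len(Q)
    for x in X:
        # Nearest representative = largest inner product (acos is decreasing).
        j = max(range(len(Q)), key=lambda k: dot(x, Q[k]))
        v = log_map(Q[j], x)
        for a in range(3):
            sums[j][a] += v[a]
        counts[j] += 1
    worst = 0.0
    for s, c in zip(sums, counts):
        if c:
            worst = max(worst, math.sqrt(sum(t * t for t in s)) / c)
    return worst

# Toy data (hypothetical): four points around each pole at colatitude 0.4.
c4, s4 = math.cos(0.4), math.sin(0.4)
angles = (0.0, math.pi / 2, math.pi, 3 * math.pi / 2)
north = [(s4 * math.cos(t), s4 * math.sin(t), c4) for t in angles]
south = [(x, y, -z) for (x, y, z) in north]
poles = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
tilted = [(math.sin(0.1), 0.0, math.cos(0.1)), (0.0, 0.0, -1.0)]

res_ok = centroidal_residual(north + south, poles)    # essentially zero
res_bad = centroidal_residual(north + south, tilted)  # clearly positive
```

A residual near zero indicates a stationary configuration; it does not by itself distinguish a global minimizer from another critical point.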

6.4. Remarks

We conclude the section with remarks concerning convergence behavior, initialization, and possible extensions.

Remark 16.

Theorem 5 shows that the iterative scheme converges (at least subsequentially) to a centroidal configuration, which is a necessary condition for optimality. In practice, the algorithm typically converges to an optimal configuration, especially when initialized with a reasonably well-distributed set of points. However, as with all Lloyd-type methods, the algorithm may converge to a local minimum rather than a global one.

Remark 17.

The choice of the initial configuration

Q^{(0)}

can have a significant effect on the performance of the algorithm. A suitable initial configuration can accelerate convergence and help avoid poor local minima. Common strategies include random initialization (uniformly on the sphere or stratified by latitude), deterministic spherical codes, or taking the output of a coarse quantization run with fewer representatives. In applications, one often runs the algorithm multiple times with different initializations and selects the configuration with the smallest final distortion.
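Two of the initialization strategies mentioned above can be sketched as follows. Uniform points are obtained by normalizing Gaussian vectors, and the latitude-stratified variant places one point in each of n equal-area latitude bands. The multi-start helper takes a hypothetical callable `run(Q0)` returning a pair (configuration, distortion); all names here are assumptions made for illustration.

```python
import math
import random

def random_uniform(n, rng):
    """n points uniform on S^2, obtained by normalizing Gaussian vectors."""
    pts = []
    for _ in range(n):
        while True:
            v = [rng.gauss(0.0, 1.0) for _ in range(3)]
            norm = math.sqrt(sum(c * c for c in v))
            if norm > 1e-12:
                break
        pts.append(tuple(c / norm for c in v))
    return pts

def stratified_by_latitude(n, rng):
    """One point in each of n equal-area latitude bands (equal height ranges)."""
    pts = []
    for k in range(n):
        z = -1.0 + 2.0 * (k + rng.random()) / n  # height within the k-th band
        theta = 2.0 * math.pi * rng.random()     # longitude
        r = math.sqrt(max(0.0, 1.0 - z * z))
        pts.append((r * math.cos(theta), r * math.sin(theta), z))
    return pts

def best_of_restarts(run, n, restarts=10, seed=0):
    """Run `run` from several random starts; keep the smallest final distortion."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        Q, V = run(random_uniform(n, rng))
        if best is None or V < best[1]:
            best = (Q, V)
    return best
```

Fixing the seed makes a multi-start experiment reproducible, which is convenient when comparing final distortions across runs.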

Remark 18.

Although the algorithm is described here for the uniform measure on a finite support, it extends naturally to more general discrete and continuous distributions on

S^{2}

, provided the intrinsic mean can be computed at each iteration. For continuous distributions, the cluster update step involves integration rather than summation and is typically approximated using numerical quadrature or Monte Carlo sampling.

Remark 19. (Convergence rate). While the iterative scheme given in this section is guaranteed to produce a monotone decrease in the distortion and exhibits rapid convergence in practice, a quantitative convergence-rate analysis relating the number of iterations to distortion reduction in the intrinsic spherical setting is a delicate problem and is left for future investigation.

7. Numerical Examples and Implementation Results

In this section, we illustrate the concepts developed in the previous sections through several examples of optimal n-means on

S^{2}

. The aim is threefold: (i) to highlight the geometric structure and symmetry of optimal configurations for small values of n, (ii) to demonstrate how the iterative algorithm of Section 6 behaves in practice on discrete datasets, and (iii) to build intuition for the geometry of centroidal Voronoi tessellations on the sphere.

Standing Convention. Throughout this section, we consider a finite dataset

X = {x_{1}, \dots, x_{M}} \subset S^{2},

and the associated discrete uniform probability measure

P = \frac{1}{M} \sum_{i = 1}^{M} δ_{x_{i}} .

For a configuration

Q = {q_{1}, \dots, q_{n}} \subset S^{2}

, the empirical distortion is

V_{n, 2} (P; Q) = \frac{1}{M} \sum_{i = 1}^{M} \min_{1 \leq j \leq n} d_{G} (x_{i}, q_{j})^{2}, \qquad V_{n, 2} (P) = \min_{| Q | \leq n} V_{n, 2} (P; Q) .

All reported distortion values in this section are computed with respect to the above discrete uniform measure (i.e., using finite sums over X, not integrals over

S^{2}

).
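The empirical distortion defined above is a finite sum and can be evaluated directly. The following minimal sketch represents points as Cartesian unit vectors (an assumption about the data layout) and uses a small hypothetical dataset.

```python
import math

def geodesic(u, v):
    """Great-circle distance d_G between unit vectors u and v."""
    d = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, d)))  # clamp guards against rounding

def distortion(X, Q):
    """Empirical distortion V_{n,2}(P; Q) for the uniform measure on X."""
    return sum(min(geodesic(x, q) ** 2 for q in Q) for x in X) / len(X)

# Hypothetical toy data: three equatorial points and the two poles.
X = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, -1)]
Q = [(0, 0, 1), (0, 0, -1)]
V = distortion(X, Q)  # each equatorial point contributes (pi/2)^2, each pole 0
```

For this toy dataset the value is 3π²/20, since three of the five points lie at distance π/2 from their nearest representative.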

Example 2

(Optimal 2-means). Consider

n = 2

. Starting from an arbitrary initialization

Q^{(0)} = {q_{1}^{(0)}, q_{2}^{(0)}}

, the Lloyd-type iteration of Section 6 converges to two antipodal representatives on

S^{2}

, e.g.,

Q^{*} = \{(\frac{π}{2}, θ_{0}), (- \frac{π}{2}, θ_{0})\}

in spherical coordinates

(ϕ, θ)

for some

θ_{0} \in [0, 2 π)

. The corresponding empirical distortion is

V_{2, 2} (P) = \frac{π^{2}}{4} .

This reflects that the best two representatives split the sphere into two hemispheres with equal measure.

7.1. Explanation of Example 2

This example considers a fully irregular finite dataset

X \subset S^{2}

, with no imposed symmetry, ring structure, or uniform spacing. The purpose is to illustrate that the centroidal Voronoi characterization (Theorem 1) and the spherical Lloyd-type algorithm of Section 6 remain valid beyond highly structured or symmetric configurations.

Although the underlying data are irregular, the uniform empirical measure on X distributes mass approximately evenly over the sphere. As a result, any centroidal Voronoi partition associated with two representatives must divide the support into two clusters of comparable total mass. The only configuration capable of achieving such a balanced partition on the sphere is a pair of antipodal points, which induces a decomposition of

S^{2}

into two hemispherical Voronoi regions.

From the variational viewpoint, each representative is the intrinsic (Karcher) mean of its assigned cluster, and the Lloyd iteration converges to a stationary point of the distortion functional. In this case, the antipodal configuration realizes the global minimum of the empirical distortion, yielding

V_{2, 2} (P) = \frac{π^{2}}{4} .

This example demonstrates that, even in the absence of geometric regularity in the data, the optimal quantizers are governed by the global geometry of

S^{2}

and the centroidal Voronoi principle, rather than by local symmetry of the support.

Example 3

(Optimal 3-means). For

n = 3

, the optimal configuration consists of three points placed on the equator, equally spaced by

120^{\circ}

in longitude. One such configuration is

Q^{*} = \{(0, 0), (0, \frac{2 π}{3}), (0, \frac{4 π}{3})\} .

The empirical distortion with respect to the discrete uniform measure P is

V_{3, 2} (P) = \frac{π^{2}}{3} .

Here, the three codepoints partition the equatorial band into three congruent spherical wedges, illustrating the role of rotational symmetry.

7.2. Explanation of Example 3

In this example, the regularity assumption within each ring is relaxed, leading to an irregular multi-ring dataset. Although the points are no longer equally spaced, the example shows that the qualitative features of optimal quantization persist: Voronoi clusters remain localized, representatives adapt to the local geometry via intrinsic means, and the Lloyd-type algorithm converges to a stable configuration. This illustrates the robustness of the theory beyond the idealized uniform setting: the qualitative structure predicted by the ring-allocation theorem persists even when exact equi-spacing is broken.

Example 4

(Optimal 4-means). For

n = 4

, the optimal set of representatives forms the vertices of a regular tetrahedron inscribed in

S^{2}

. One convenient spherical coordinate description is

(ϕ, θ) \in \{(\arctan (\frac{1}{\sqrt{2}}), 0), (\arctan (\frac{1}{\sqrt{2}}), π), (- \arctan (\frac{1}{\sqrt{2}}), \frac{π}{2}), (- \arctan (\frac{1}{\sqrt{2}}), \frac{3 π}{2})\} .

The resulting empirical distortion is

V_{4, 2} (P) = \frac{π^{2}}{6} .

This configuration realizes a centroidal Voronoi tessellation in which all four regions have equal area and identical geometric structure.
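A regular tetrahedron inscribed in S^2 is characterized by all six pairwise inner products being equal to −1/3, i.e., a common geodesic separation arccos(−1/3). The sketch below checks this for one standard Cartesian embedding, the points (±1, ±1, ±1)/√3 with an even number of minus signs; this embedding is chosen here for illustration and is not taken from the text. It also recovers the latitudes ±arctan(1/√2).

```python
import math
from itertools import combinations

def geodesic(u, v):
    """Great-circle distance between unit vectors u and v."""
    d = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, d)))

s = 1.0 / math.sqrt(3.0)
# Illustrative embedding: sign patterns with an even number of minus signs.
tetra = [(s, s, s), (s, -s, -s), (-s, s, -s), (-s, -s, s)]

dists = [geodesic(u, v) for u, v in combinations(tetra, 2)]  # six pairs
lats = [math.asin(p[2]) for p in tetra]                      # latitude of each vertex

edge = math.acos(-1.0 / 3.0)                # common geodesic separation
lat0 = math.atan(1.0 / math.sqrt(2.0))      # latitude magnitude of every vertex
```

Any rotation of this vertex set is again a regular tetrahedron, so a coordinate description is determined only up to an element of SO(3).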

7.3. Explanation of Example 4

This example addresses a fully irregular finite dataset on

S^{2}

, not confined to exact rings, together with a perturbed version of the data. It highlights the stability results of Section 5 by showing that small geodesic perturbations of the support lead to only small changes in the optimal representatives and the resulting distortion. The example provides an intuitive demonstration of continuity of optimal quantizers and supports the practical relevance of the theoretical stability analysis.

Example 5

(Optimal 6-means). For

n = 6

, the optimal configuration corresponds to the vertices of a regular octahedron: two antipodal points at the poles and four equally spaced points on the equator, located at longitudes

0^{\circ}

,

90^{\circ}

,

180^{\circ}

, and

270^{\circ}

. Thus an optimal configuration is

Q^{*} = \{(\frac{π}{2}, θ_{0}), (- \frac{π}{2}, θ_{0}), (0, 0), (0, \frac{π}{2}), (0, π), (0, \frac{3 π}{2})\}

for some

θ_{0} \in [0, 2 π)

. The corresponding empirical distortion is

V_{6, 2} (P) = \frac{π^{2}}{8} .

This example illustrates how increasing n yields finer spherical partitioning, in this case into six congruent spherical regions.
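The octahedral configuration can be converted to Cartesian coordinates and its geometry checked directly. The sketch below assumes that ϕ denotes latitude in [−π/2, π/2] and θ longitude, which matches the poles being written as (±π/2, θ_0) above; the helper name `to_cartesian` is illustrative.

```python
import math
from itertools import combinations

def to_cartesian(phi, theta):
    """Unit vector for latitude phi and longitude theta (assumed convention)."""
    return (math.cos(phi) * math.cos(theta),
            math.cos(phi) * math.sin(theta),
            math.sin(phi))

def geodesic(u, v):
    """Great-circle distance between unit vectors u and v."""
    d = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, d)))

# Two poles (theta_0 = 0 chosen arbitrarily) and four equatorial points.
octa = [to_cartesian(math.pi / 2, 0.0), to_cartesian(-math.pi / 2, 0.0)]
octa += [to_cartesian(0.0, k * math.pi / 2) for k in range(4)]

dists = [geodesic(u, v) for u, v in combinations(octa, 2)]   # 15 pairs
n_adjacent = sum(1 for d in dists if abs(d - math.pi / 2) < 1e-6)
n_antipodal = sum(1 for d in dists if abs(d - math.pi) < 1e-6)
```

The twelve pairs at distance π/2 correspond to the edges of the octahedron, and the three pairs at distance π are its antipodal vertex pairs.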

Example 6

(Optimal 12-means). For

n = 12

, the optimal configuration corresponds to the vertices of a regular icosahedron inscribed in

S^{2}

. Although listing coordinates is more cumbersome, the configuration is well-known and highly symmetric. The distortion decreases further in this case.

7.4. Implementation Notes and Pseudo-Code

The following pseudo-code outlines a simple implementation of the spherical Lloyd algorithm for computing optimal n-means. The notation follows that of Section 6.

Initialize Q^(0) = {q_1^(0), ..., q_n^(0)} on S^2

for r = 0, 1, 2, ... until convergence do

    # Step (i): Voronoi partition

    for each data point x_i in X do

        assign x_i to cluster j minimizing d_G(x_i, q_j^(r))

    end for

    # Step (ii): Intrinsic mean update

    for j = 1 to n do

        q_j^(r+1) = IntrinsicMean( X_j(Q^(r)) )

    end for

end for

return Q^(r)

The intrinsic mean may be computed using an iterative gradient descent or fixed-point scheme on

S^{2}

. In practical implementations, care must be taken to ensure numerical stability when points in a cluster are nearly antipodal.
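The pseudo-code above can be turned into a small runnable sketch. The version below is one possible realization under stated assumptions, not the author's implementation: points are Cartesian unit vectors, the intrinsic mean is approximated by the standard fixed-point iteration q ← Exp_q(mean of Log_q(x)), an empty cluster keeps its previous representative, and iteration stops when the distortion decrease falls below a tolerance. All function names are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def geodesic(u, v):
    return math.acos(max(-1.0, min(1.0, dot(u, v))))

def log_map(q, x):
    c = max(-1.0, min(1.0, dot(q, x)))
    theta = math.acos(c)
    if theta < 1e-12:
        return (0.0, 0.0, 0.0)
    s = theta / math.sin(theta)  # unstable when x is nearly antipodal to q
    return tuple(s * (xi - c * qi) for xi, qi in zip(x, q))

def exp_map(q, v):
    t = math.sqrt(dot(v, v))
    if t < 1e-12:
        return q
    p = tuple(math.cos(t) * qi + math.sin(t) * vi / t for qi, vi in zip(q, v))
    n = math.sqrt(dot(p, p))
    return tuple(c / n for c in p)  # renormalize against rounding drift

def intrinsic_mean(points, start, iters=100, tol=1e-12):
    """Fixed-point iteration for the Karcher mean, started at `start`."""
    q = start
    for _ in range(iters):
        logs = [log_map(q, x) for x in points]
        step = tuple(sum(l[a] for l in logs) / len(points) for a in range(3))
        q = exp_map(q, step)
        if math.sqrt(dot(step, step)) < tol:
            break
    return q

def spherical_lloyd(X, Q0, max_iter=100, tol=1e-12):
    Q, prev, V = list(Q0), float("inf"), float("inf")
    for _ in range(max_iter):
        clusters = [[] for _ in Q]
        for x in X:  # Step (i): assign to the nearest representative
            j = min(range(len(Q)), key=lambda k: geodesic(x, Q[k]))
            clusters[j].append(x)
        # Step (ii): intrinsic mean update (empty clusters are left unchanged)
        Q = [intrinsic_mean(c, q) if c else q for c, q in zip(clusters, Q)]
        V = sum(min(geodesic(x, q) ** 2 for q in Q) for x in X) / len(X)
        if prev - V < tol:
            break
        prev = V
    return Q, V

# Hypothetical toy data: four points around each pole at colatitude 0.3.
ring = [(math.sin(0.3) * math.cos(t), math.sin(0.3) * math.sin(t), math.cos(0.3))
        for t in (0.0, math.pi / 2, math.pi, 3 * math.pi / 2)]
X = ring + [(x, y, -z) for (x, y, z) in ring]
Q0 = [(math.sin(0.5), 0.0, math.cos(0.5)), (math.sin(0.5), 0.0, -math.cos(0.5))]
Q, V = spherical_lloyd(X, Q0)  # representatives move to the two poles
```

On this toy dataset every point ends at geodesic distance 0.3 from its representative, so the final distortion is 0.09. The fixed-point iteration for the mean is only reliable when each cluster lies in a sufficiently small geodesic ball, which is the numerical-stability caveat noted above for nearly antipodal points.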

Remark 20.

Different initialization strategies may lead to different local minima. Running the algorithm multiple times and selecting the configuration with the smallest final distortion is recommended in practice.

Remark 21

(Comparison with existing algorithms). Several algorithms related to clustering and quantization on manifolds have been proposed in the literature, including extrinsic spherical k-means methods based on Euclidean embeddings and intrinsic Lloyd-type algorithms on Riemannian manifolds. Classical Euclidean k-means and Lloyd algorithms (e.g., [4,7]) do not account for intrinsic geodesic geometry and therefore are not directly applicable to the problem studied here. Manifold-based extensions (e.g., intrinsic k-means using Karcher means; see [20,21]) are closer in spirit, but they are typically formulated for continuous distributions or general manifolds rather than finite discrete uniform data on

S^{2}

. The algorithm proposed in Section 6 is specifically tailored to discrete spherical data and exploits the geometric structure of optimal Voronoi partitions established in earlier sections, which explains its stability and rapid convergence observed in the numerical examples.

We formalize the notions of irregular data, multi-ring data, and their perturbations used below.

Definition 2

(Irregular finite dataset on

S^{2}

). An irregular finite dataset on the sphere

S^{2}

is a finite set

X = {x_{1}, \dots, x_{M}} \subset S^{2}

that does not possess any prescribed geometric regularity, such as equal spacing, rotational symmetry, or confinement to a fixed number of latitudinal rings. Equivalently, no assumption is made on the mutual geodesic distances

d_{G} (x_{i}, x_{j})

beyond finiteness.

Definition 3

(Multi-ring data). A finite dataset

X \subset S^{2}

is called multi-ring data if there exist distinct latitudes

ϕ_{1} < \dots < ϕ_{T}

such that

X = ⋃_{t = 1}^{T} R_{t}, R_{t} = {(ϕ_{t}, θ_{t, 1}), \dots, (ϕ_{t}, θ_{t, N_{t}})},

where each

R_{t}

lies entirely on the latitude

ϕ_{t}

. If the longitudes

{θ_{t, j}}_{j = 1}^{N_{t}}

are equally spaced for each t, the data are called regular multi-ring data; otherwise they are called irregular multi-ring data.
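Regular multi-ring data in the sense of this definition can be generated as follows. This is a minimal sketch: the function name, the optional per-ring longitude offsets, and the sample latitudes and counts are assumptions made for illustration.

```python
import math

def regular_multi_ring(latitudes, counts, offsets=None):
    """Rings R_t as lists of unit vectors; ring t has counts[t] equally spaced longitudes."""
    offsets = offsets or [0.0] * len(latitudes)
    rings = []
    for phi, n_t, off in zip(latitudes, counts, offsets):
        ring = []
        for j in range(n_t):
            theta = off + 2.0 * math.pi * j / n_t  # equally spaced longitudes
            ring.append((math.cos(phi) * math.cos(theta),
                         math.cos(phi) * math.sin(theta),
                         math.sin(phi)))
        rings.append(ring)
    return rings

# Hypothetical example: T = 3 rings with N_t = 4, 6, 3 points.
latitudes = [-0.5, 0.2, 1.0]
rings = regular_multi_ring(latitudes, [4, 6, 3])
X = [x for ring in rings for x in ring]  # the full dataset, M = 13
```

Replacing the equally spaced longitudes by arbitrary ones on each ring gives irregular multi-ring data with the same latitudes.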

Definition 4

(Perturbed finite dataset). Let

X = {x_{1}, \dots, x_{M}} \subset S^{2}

be a finite dataset. A perturbed version of X is a dataset

X^{'} = {x_{1}^{'}, \dots, x_{M}^{'}} \subset S^{2}

together with a fixed one-to-one correspondence

x_{i} \leftrightarrow x_{i}^{'}

, such that

ε : = max_{1 \leq i \leq M} d_{G} (x_{i}, x_{i}^{'})

is small. The quantity ε is called the perturbation magnitude.
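The perturbation magnitude ε is a maximum of pointwise geodesic distances and is straightforward to compute once the correspondence is fixed by index. The sketch below also shows one hypothetical way to produce a perturbed dataset, by moving each point a random geodesic distance of at most a given bound in a random tangent direction; the function names and the sample data are assumptions.

```python
import math
import random

def geodesic(u, v):
    d = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, d)))

def perturb(X, bound, rng):
    """Move each x along a random geodesic by a distance in [0, bound)."""
    out = []
    for x in X:
        while True:
            g = [rng.gauss(0.0, 1.0) for _ in range(3)]
            d = sum(a * b for a, b in zip(g, x))
            t = [gi - d * xi for gi, xi in zip(g, x)]  # project onto tangent plane at x
            norm = math.sqrt(sum(c * c for c in t))
            if norm > 1e-12:
                break
        step = bound * rng.random()
        y = [math.cos(step) * xi + math.sin(step) * ti / norm for xi, ti in zip(x, t)]
        n = math.sqrt(sum(c * c for c in y))
        out.append(tuple(c / n for c in y))
    return out

def perturbation_magnitude(X, Xp):
    """epsilon = max_i d_G(x_i, x_i'), with the correspondence given by index."""
    return max(geodesic(x, xp) for x, xp in zip(X, Xp))

# Hypothetical data: the six coordinate-axis directions, perturbed by at most 0.05.
X = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
     (0.0, -1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
Xp = perturb(X, 0.05, random.Random(0))
eps = perturbation_magnitude(X, Xp)
```

By construction the computed ε cannot exceed the chosen bound, up to rounding in the arccosine.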

Now, we conclude the numerical section by illustrating the behavior of the algorithm on irregular datasets, multi-ring configurations, and perturbed data, which more directly reflect the structural results developed earlier in the paper.

7.5. Numerical Experiments on Irregular and Multi-Ring Data

The numerical examples presented earlier in this section focused primarily on highly symmetric configurations (e.g., antipodal sets and Platonic solids), which are useful for benchmarking but do not fully illustrate the scope of the structural results developed in Section 4 and Section 5. We therefore include here several additional experiments designed to demonstrate the behavior of the spherical Lloyd-type algorithm on irregular data, multi-ring configurations, and perturbed datasets. The goal of these examples is qualitative illustration rather than numerical optimization.

Example 7

(Irregular finite dataset). We generate a finite set

X \subset S^{2}

consisting of

M = 40

points sampled independently from a nonuniform distribution on the sphere, with higher density near a prescribed region and sparse coverage elsewhere. Starting from random initial representatives, the spherical Lloyd algorithm is run until convergence. The resulting configuration is centroidal but lacks any global symmetry. The Voronoi cells adapt to the local geometry of the data, illustrating that the algorithm and the centroidal condition apply equally well to irregular finite datasets and are not restricted to symmetric point clouds.

Example 8

(Multi-ring configuration and allocation). We consider a dataset supported on several distinct latitudinal rings, with unequal numbers of points on each ring. For a fixed number n of representatives, the Lloyd iteration consistently converges to configurations in which representatives remain confined to individual rings, with no cross-ring mixing. Moreover, the number of representatives assigned to each ring agrees with the allocation rule predicted by the discrete marginal-drop (water-filling) principle described in Section 4. This example provides numerical confirmation of the no cross-ring mixing phenomenon (Theorems 3 and 4(ii)) and illustrates how the global allocation problem decouples across rings.

Example 9

(Perturbed multi-ring data and stability). To illustrate stability under perturbations, we perturb the multi-ring dataset in Example 3 by applying small random geodesic displacements to each point. Repeating the Lloyd iteration, we observe that the resulting representatives undergo only minor changes in position, and the overall ring allocation remains unchanged for sufficiently small perturbations. This behavior is consistent with the Lipschitz-type stability result established in Lemma 2 and demonstrates that the qualitative structure of optimal configurations is robust under moderate perturbations of the support.

These examples illustrate that the theoretical results developed in this paper apply beyond idealized symmetric settings. In particular, they demonstrate the relevance of the no cross-ring mixing principle, the discrete allocation mechanism, and stability under perturbations in more general and irregular finite configurations on the sphere.

8. Discussion, Practical Insights, and Future Work

8.1. Theoretical Perspective

The results established in this paper provide a rigorous framework for understanding optimal n-means on

S^{2}

. The existence and characterization of optimal configurations, together with the centroidal property, give structural insight into how representatives must be arranged on the sphere to minimize the distortion. The iterative construction in Section 6 offers a practical approach for obtaining centroidal Voronoi configurations. These results form a natural extension of the classical Euclidean theory of optimal quantization to the spherical setting, where curvature plays an essential role.

8.2. Interpretation of Numerical Results

The numerical examples presented in Section 7 illustrate the geometric behavior of optimal configurations for small values of n. For

n = 2

, the representatives converge to antipodal points, while for

n = 3

and

n = 4

, the optimal configurations correspond to the vertices of an equilateral triangle on the equator and a regular tetrahedron, respectively. As n increases, the representatives distribute themselves more uniformly over the sphere, and the distortion values decrease accordingly. These examples demonstrate that the spherical centroidal Voronoi configurations reflect the underlying symmetry of

S^{2}

and that the algorithm performs consistently with theoretical expectations.

8.3. Practical Considerations for Implementation

The iterative procedure described in Section 6 is straightforward to implement, and the use of the intrinsic (Karcher) mean ensures that representatives remain on

S^{2}

throughout the algorithm. Although the distortion decreases monotonically, the method may converge to a local minimum depending on the initial configuration. In practice, it is advisable to run the algorithm multiple times with different initializations and select the configuration with the smallest final distortion. Numerical stability must also be considered when computing intrinsic means, particularly when cluster points are nearly antipodal or concentrated in a small region. Nevertheless, the algorithm is computationally efficient and performs well even for moderately large values of n.

8.4. Future Research Directions

There are several promising directions for future research. One natural extension is to consider non-uniform probability distributions on

S^{2}

, where the density varies across the surface; in such settings, the intrinsic mean computation may require numerical integration or Monte Carlo methods. Another direction is to explore higher-dimensional analogues on

S^{d}

for

d \geq 3

, where the geometry is richer and more complex. More broadly, the study of constrained and weighted quantization on general Riemannian manifolds presents many interesting challenges, particularly in relation to curvature effects and manifold geometry. From a computational perspective, improving initialization strategies, accelerating the computation of intrinsic means, and developing methods that avoid local minima would significantly enhance practical performance. Potential applications in directional statistics, data science, and machine learning on spherical domains also provide fertile ground for further exploration.

Funding

This research received no external funding.

Data Availability Statement

No data were generated or analyzed in this study.

Acknowledgments

The author would like to thank the anonymous referee for a careful reading of the manuscript and for insightful comments and suggestions that significantly improved the clarity, organization, and presentation of the paper.

Conflicts of Interest

The author declares no conflicts of interest.

References

1. Gersho, A.; Gray, R.M. Vector Quantization and Signal Compression; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012; Volume 159.
2. Gyorgy, A.; Linder, T. On the structure of optimal entropy-constrained scalar quantizers. IEEE Trans. Inf. Theory 2002, 48, 416–427.
3. Gray, R.M.; Neuhoff, D.L. Quantization. IEEE Trans. Inf. Theory 1998, 44, 2325–2383.
4. Pollard, D. Quantization and the method of k–means. IEEE Trans. Inf. Theory 1982, 28, 199–205.
5. Zador, P.L. Asymptotic Quantization Error of Continuous Signals and the Quantization Dimension. IEEE Trans. Inf. Theory 1982, 28, 139–149.
6. Zamir, R. Lattice Coding for Signals and Networks: A Structured Coding Approach to Quantization, Modulation, and Multiuser Information Theory; Cambridge University Press: Cambridge, UK, 2014.
7. Graf, S.; Luschgy, H. Foundations of Quantization for Probability Distributions; Lecture Notes in Mathematics; Springer: Berlin/Heidelberg, Germany, 2000; Volume 1730.
8. Du, Q.; Faber, V.; Gunzburger, M. Centroidal Voronoi tessellations: Applications and algorithms. SIAM Rev. 1999, 41, 637–676.
9. Dettmann, C.P.; Roychowdhury, M.K. Quantization for uniform distributions on equilateral triangles. Real Anal. Exch. 2017, 42, 149–166.
10. Graf, S.; Luschgy, H. The quantization of the Cantor distribution. Math. Nachrichten 1997, 183, 113–133.
11. Graf, S.; Luschgy, H. Quantization for probability measures with respect to the geometric mean error. Math. Proc. Camb. Phil. Soc. 2004, 136, 687–717.
12. Kesseböhmer, M.; Niemann, A.; Zhu, S. Quantization dimensions of compactly supported probability measures via Rényi dimensions. Trans. Amer. Math. Soc. 2023, 376, 4661–4678.
13. Peña, G.; Rodrigo, H.; Roychowdhury, M.K.; Sifuentes, J.; Suazo, E. Quantization for uniform distributions on hexagonal, semicircular, and elliptical curves. J. Optim. Theory Appl. 2021, 188, 113–142.
14. Pötzelberger, K. The quantization dimension of distributions. Math. Proc. Camb. Phil. Soc. 2001, 131, 507–519.
15. Roychowdhury, M.K. Quantization and centroidal Voronoi tessellations for probability measures on dyadic Cantor sets. J. Fractal Geom. 2017, 4, 127–146.
16. Roychowdhury, M.K. Least upper bound of the exact formula for optimal quantization of some uniform Cantor distribution. Discret. Contin. Dyn. Syst. Ser. A 2018, 38, 4555–4570.
17. Roychowdhury, M.K. Optimal quantization for the Cantor distribution generated by infinite similutudes. Isr. J. Math. 2019, 231, 437–466.
18. Roychowdhury, M.K. Optimal quantization for mixed distributions. Real Anal. Exch. 2021, 46, 451–484.
19. Roychowdhury, M.K.; Selmi, B. Local dimensions and quantization dimensions in dynamical systems. J. Geom. Anal. 2021, 31, 6387–6409.
20. Karcher, H. Riemannian center of mass and mollifier smoothing. Commun. Pure Appl. Math. 1977, 30, 509–541.
21. Afsari, B. Riemannian L^p center of mass: Existence, uniqueness, and convexity. Proc. Amer. Math. Soc. 2011, 139, 655–673.
22. Pennec, X. Intrinsic statistics on Riemannian manifolds: Basic tools for geometric measurements. J. Math. Imaging Vis. 2006, 25, 127–154.
23. Roychowdhury, M.K. Optimal Quantization on Spherical Surfaces: Continuous and Discrete Models—A Beginner-Friendly Expository Study. Mathematics 2026, 14, 63.
24. Roychowdhury, M.K. Discrete Quantization on Spherical Geometries: Explicit Models, Computations, and Didactic Exposition. arXiv 2026.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Roychowdhury, M.K. Optimal Quantization of Finite Uniform Data on the Sphere. Mathematics 2026, 14, 288. https://doi.org/10.3390/math14020288

