Abstract
This paper examines the Wasserstein metric between the empirical probability measure of n discrete random variables and a continuous uniform measure in the d-dimensional ball, providing an asymptotic estimate of its expectation as n approaches infinity. Furthermore, we investigate this problem within a mixed process framework, where the discrete random variables are generated by a Poisson process.
MSC:
60B10; 60G57
1. Introduction
Article [1] investigates the Ollivier curvature of random geometric graphs, with a key step being the estimation of Wasserstein metrics between the empirical probability measure of n discrete random variables and a continuous uniform one in a d-dimensional ball. The authors applied results from [2], which are based on the interval , whereas the Ollivier curvature is built on balls. To address this discrepancy, we refine the proof so that it is carried out on balls, in order to enhance the robustness and accuracy of the argument described in [1] and to make it suitable for our purposes. Furthermore, since [2] requires , we extend our upper and lower bound estimates to include the case , aligning our research objectives with the broader scope of the study.
Additionally, lattice methods used in statistical mechanical approaches [3] often involve similar notations and convergence from discrete physical quantities to continuous ones, suggesting potential connections with convergences from discrete probabilities to continuous ones. For instance, consider a collection of point charges denoted as and their corresponding locations represented by the independent and uniformly distributed random variables , where represents a bounded region within three-dimensional space with its volume defined as . Assuming an ideal scenario in implicit solvation models for biological molecules, it can be postulated that each charge has the value , thereby establishing a discrete charge density expressed as . On the other hand, in an ideal scenario we consider a continuum charge density represented by a uniform measure. Thus, transitioning from discrete (i.e., point) charges to a continuum charge density can be viewed as a pathway from a discrete probability measure to a continuous one in Wasserstein metrics. Consequently, we may contemplate convergences of the corresponding discrete electrostatic energies and related quantities in terms of Wasserstein metrics or within Wasserstein spaces.
The authors in [4] have provided estimates of the convergence rate in Wasserstein metrics of empirical measures in quite general settings, involving numerous asymptotic calculations. Our findings are consistent with their results in corresponding scenarios. In comparison, our proof primarily relies on estimating the expectations of optimal matching problems to obtain upper bounds on the expectations of Wasserstein metrics. We chose this technique because, as mentioned in [5], (i) the definition of Wasserstein metrics makes them convenient for problems involving optimal transport, such as those arising from partial differential equations; (ii) Wasserstein metrics possess a rich duality property which is particularly useful when considering (2) (in contrast to bounded Lipschitz distances), and passing back and forth from the original to the dual definition is often technically convenient; (iii) being defined by an infimum, it is often relatively straightforward to bound Wasserstein metrics from above by constructing couplings between and ; and (iv) Wasserstein metrics incorporate a lot of the geometry of the space. For instance, the mapping is an isometric embedding of into (Wasserstein space of order p), but there are much deeper links. This partly explains why a Wasserstein space of is often very well adapted to statements that combine weak convergence and geometry.
Motivated by the virtues of Wasserstein spaces and these inspirations, we aim to bridge the gap between discrete probabilities and their continuous counterparts using Wasserstein metrics as measures in our study. Recently, significant advances have been made concerning the rate of convergence in Wasserstein metrics. For instance, in [6], the authors investigated the precise rate of convergence of the quadratic Wasserstein metric between empirical measures and uniform distributions on by employing well-known techniques from partial differential equations. Additionally, in [7], researchers explored upper bounds for the mean Wasserstein metric between two probabilities on where , using Fourier-analytic techniques, and subsequently applied these findings to estimate the mean Wasserstein metric between two empirical measures under certain assumptions. Furthermore, in [8], the author examined upper bounds for the mean convergence rate in the quadratic Wasserstein metric on a d-dimensional compact Riemannian manifold where . Notably, there are also ongoing studies focusing on higher-order (p-th order) Wasserstein metrics; however, we refrain from listing them here.
2. Preliminary Estimation
Definition 1.
Let be independent and uniformly distributed random variables in a d-dimensional ball where represents the Euclidean metric in The random variable represents the optimal matching cost between and , with σ ranging over all permutations of
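For a concrete illustration, this optimal matching cost can be computed exactly for moderate n with the Hungarian algorithm [9], as implemented in SciPy. The following Python sketch is only illustrative and is not part of the argument; the helper names sample_ball and matching_cost, the sample size, and the choice of the unit ball are assumptions introduced here for the example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def sample_ball(n, d, radius=1.0, rng=None):
    """Draw n points uniformly from the d-dimensional ball of the given radius."""
    rng = np.random.default_rng(rng)
    directions = rng.standard_normal((n, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = radius * rng.random(n) ** (1.0 / d)   # radius ~ R * U^(1/d) gives the uniform law
    return directions * radii[:, None]

def matching_cost(X, Y):
    """Minimal total Euclidean cost over all permutations sigma matching X_i to Y_sigma(i)."""
    cost = cdist(X, Y)                          # pairwise Euclidean distances
    rows, cols = linear_sum_assignment(cost)    # Hungarian / assignment algorithm [9]
    return cost[rows, cols].sum()

if __name__ == "__main__":
    n, d = 500, 3
    X, Y = sample_ball(n, d, rng=0), sample_ball(n, d, rng=1)
    print(matching_cost(X, Y) / n)              # per-point optimal matching cost
```

Since an optimal coupling between two empirical measures with n equally weighted atoms can always be realized by a permutation, the per-point cost printed above coincides with the first-order Wasserstein metric between the two empirical measures.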
By applying the dual principle [9,10], or referring to the proof process of Lemma 1 in [2], we have
where the set of Lipschitz functions It is worth noting that every Lipschitz function in can be extended to one in
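In the notation assumed here (the symbols $L_n$, $X_i$, $Y_i$, and $\mathrm{Lip}_1$ are introduced only for concreteness), this dual identity takes the standard Kantorovich–Rubinstein form
\[
L_n \;=\; \min_{\sigma}\sum_{i=1}^{n}\bigl|X_i-Y_{\sigma(i)}\bigr|
\;=\; \sup_{f\in\mathrm{Lip}_1}\sum_{i=1}^{n}\bigl(f(X_i)-f(Y_i)\bigr),
\]
where $\mathrm{Lip}_1$ denotes the $1$-Lipschitz functions on the ball; by the extension property just mentioned, the supremum may equivalently be taken over $1$-Lipschitz functions on all of $\mathbb{R}^d$.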
Therefore, we have The following Lemma 1 gives upper and lower bound estimates for the expectation
Lemma 1
(Optimal matching). For the above optimal matching problem, we have
in dimension and
in dimension
Proof.
We provide a detailed proof, referring to Equation (A1) in Appendix A and Equation (A3) in Appendix B. The method employed here is essentially based on the work of [2], with several improvements and modifications made to extend its applicability to random variables within balls.
3. Main Results and Proofs
The following results present Wasserstein metrics between empirical and uniform measures in d-dimensional balls. Generally, a Wasserstein metric between two probability measures is defined as follows:
Definition 2.
Let and be Borel probability measures in a compact metric space and let denote the set of all couplings of and , i.e.,
A Wasserstein metric is defined as
By applying the duality principle (Kantorovich Dual Theorem) in Chapter 6, Remark 6.5 of [5], we can express the Wasserstein metric as
where denotes the set of Lipschitz functions based on the metric of with a coefficient of 1. From the duality formula, we can further assume that any function satisfying
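In the first-order case used throughout this paper, and writing $(M,d)$ for the compact metric space of Definition 2 (a labeling assumed here), the two displays above take the standard form
\[
W_1(\mu,\nu)\;=\;\inf_{\pi\in\Pi(\mu,\nu)}\int_{M\times M} d(x,y)\,\mathrm{d}\pi(x,y)
\;=\;\sup_{\|f\|_{\mathrm{Lip}}\le 1}\left(\int_M f\,\mathrm{d}\mu-\int_M f\,\mathrm{d}\nu\right),
\]
which is consistent with Chapter 6 and Remark 6.5 of [5].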
Notice: In all subsequent discussions, the metric under consideration is the Euclidean metric. Additionally, we will adopt the notation for a sequence , where is a constant. This notation means that there exist positive constants C such that for all sufficiently large n.
Theorem 1.
Let be independent and uniformly distributed random variables in a d-dimensional ball The empirical measure is given by
which represents the proportion of points in the sample that lie in a measurable subset A of . denotes the uniform measure in As the sample size n tends to infinity, it can be shown that an expected Wasserstein metric between and , denoted as , decays at a rate
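In the notation assumed here ($\mu_n$ and $X_1,\dots,X_n$ are symbols introduced only for concreteness), the empirical measure of Theorem 1 can be written as
\[
\mu_n(A)\;=\;\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}_{A}(X_i),
\]
so that $\mu_n(A)$ is exactly the proportion of sample points falling in $A$.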
Proof.
Now, we consider a Wasserstein metric in with and then
Let be independent uniformly distributed random variables in and then
So
where and hence, from Lemma 1, we have
□
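As a numerical sanity check of Theorem 1, the expected metric can be bounded from above by the expected per-point matching cost against a second independent uniform sample (one common route, via the dual formula and convexity, though not necessarily the exact argument above), and the decay exponent can then be read off a log–log fit. The sketch below reuses the hypothetical helpers sample_ball and matching_cost from the sketch in Section 2 and is purely illustrative; for $d\ge 3$ the fitted slope should be close to the classical $-1/d$ exponent (cf. [2,4]).

```python
import numpy as np
# Reuses sample_ball and matching_cost from the sketch in Section 2 (requires numpy, scipy).

d = 3
ns = [200, 400, 800, 1600]
reps = 10
means = []
for n in ns:
    # Per-point matching cost between two independent uniform samples of size n:
    # an upper-bound proxy for E[W_1(mu_n, mu)].
    costs = [matching_cost(sample_ball(n, d), sample_ball(n, d)) / n for _ in range(reps)]
    means.append(np.mean(costs))

# Fit log(mean cost) against log(n); for d >= 3 the slope should be near -1/d.
slope = np.polyfit(np.log(ns), np.log(means), 1)[0]
print("fitted decay exponent:", round(slope, 3))
```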
Next, we consider an empirical measure and a uniform measure in a ball
Corollary 1.
In general, let be independent random variables uniformly distributed in the d-dimensional ball with radius Consider an empirical measure and a uniform measure in , where represents the empirical measure defined as
for a measurable subset A of , and denotes the uniform measure in Then, it follows that
Proof.
Consider the map , defined by where Thus, and correspond to the empirical measure and the uniform measure in , respectively. This establishes a one-to-one correspondence between empirical measures in and those in , as well as between uniform measures in and those in . In particular, we can write
Therefore, from Theorem 1 we obtain
□
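The scaling step used in this proof can be stated in the following standard form: if $T_R(x)=Rx$ maps the unit ball onto the ball of radius $R$, then for any Borel probability measures $\mu,\nu$ on the unit ball,
\[
W_1\bigl((T_R)_{\#}\mu,\,(T_R)_{\#}\nu\bigr)\;=\;R\,W_1(\mu,\nu),
\]
since every coupling of $\mu$ and $\nu$ corresponds to a coupling of the pushforward measures whose transport distances are all multiplied by $R$.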
We next generalize Theorem 1 to the case where the number of random variables, denoted by N, follows a Poisson distribution with a parameter and is independent of these random variables. This case actually corresponds to a specific spatial Poisson process in [1] with intensity measure which describes a spatial configuration of points in the ball Here, denotes the volume measure in d-dimensional Euclidean space. Moreover, in [1], it is also stated that representing the number of random points in , called the size, and the parameter of the Poisson distribution is equivalent to , derived from the corresponding Poisson point process. The notation represents slight perturbations of n in order to observe how an expected Wasserstein metric gradually changes as n approaches infinity.
Theorem 2.
Let denote the empirical random measure with respect to independent and uniformly distributed random variables in , defined as
for a measurable subset A of . N follows a Poisson distribution with a parameter and is independent of the random variables . Let represent the uniform measure in the aforementioned ball. Then, it follows that
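With notation assumed here ($\bar{\mu}_N$ and $X_1,X_2,\dots$ are symbols introduced only for concreteness), the empirical random measure of Theorem 2 can be written as
\[
\bar{\mu}_N(A)\;=\;\frac{1}{N}\sum_{i=1}^{N}\mathbf{1}_{A}(X_i)\qquad (N\ge 1),
\]
with the event $\{N=0\}$ handled by a separate convention.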
Proof.
Since the number N follows a Poisson distribution with the mean and are uniformly distributed random variables in , which are independent of N, we have
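With $\bar{\mu}_N$ as written after the statement of Theorem 2 and writing $\lambda$ for the Poisson parameter (both symbols assumed here), one standard way to express this conditioning step is
\[
\mathbb{E}\bigl[W_1(\bar{\mu}_N,\mu)\bigr]
\;=\;\sum_{n\ge 1}\mathbb{P}(N=n)\,\mathbb{E}\bigl[W_1(\mu_n,\mu)\bigr]
\;=\;\sum_{n\ge 1}\frac{e^{-\lambda}\lambda^{n}}{n!}\,\mathbb{E}\bigl[W_1(\mu_n,\mu)\bigr],
\]
up to the contribution of the event $\{N=0\}$, which is at most $e^{-\lambda}$ times the diameter of the ball under any convention that assigns a probability measure on the ball to the empty sample.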
According to Theorem 1, it follows that
On the other hand, from Lemma 1.2 in [11], one can obtain
where is a constant, and
Let us denote
Then, we obtain an expression for the expected value as follows:
We further estimate these three terms and find that
and
For term it is bounded as follows:
Since c was arbitrary, after a suitable adjustment of the constant c, we conclude that
□
Now, we consider a d-dimensional ball . The number of random variables in , still denoted as N, follows a Poisson distribution with a parameter , and N is independent of these random variables. This actually corresponds to a spatial Poisson process with intensity measure in , as discussed in [1], and the parameter of the Poisson distribution is equivalent to , derived from the corresponding Poisson point process.
Corollary 2.
Let and . We denote by the empirical measure with respect to independent and uniformly distributed random variables in , i.e.,
for a measurable subset A of . N follows a Poisson distribution with a parameter and is independent of the random variables . Let be the uniform measure in Then, we have
Proof.
Combining the proofs of Theorem 2 and Corollary 1, we first note that N follows a Poisson distribution in with mean value . Therefore, we can obtain
□
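For the Poissonized setting of Theorem 2 and Corollary 2, the expected metric can likewise be explored by simulation: draw $N\sim\mathrm{Poisson}(\lambda)$, place N uniform points in the ball of radius R, and bound the Wasserstein cost by matching against an independent uniform sample of the same size. The sketch below reuses the hypothetical helpers sample_ball and matching_cost from the sketch in Section 2; the choices of lam, d, and R are illustrative, and the per-point matching cost is only an upper-bound proxy.

```python
import numpy as np
# Reuses sample_ball and matching_cost from the sketch in Section 2 (requires numpy, scipy).

def poissonized_matching_bound(lam, d, radius=1.0, rng=None):
    """Draw N ~ Poisson(lam) uniform points in the ball of the given radius and return
    the per-point matching cost against an independent uniform sample of the same
    (random) size; returns None on the event {N = 0}."""
    rng = np.random.default_rng(rng)
    N = rng.poisson(lam)
    if N == 0:
        return None
    X = sample_ball(N, d, radius, rng)
    Y = sample_ball(N, d, radius, rng)
    return matching_cost(X, Y) / N

d, R, lam = 3, 2.0, 1000          # illustrative choices only
vals = [poissonized_matching_bound(lam, d, R, seed) for seed in range(20)]
vals = [v for v in vals if v is not None]
print(np.mean(vals))              # Monte Carlo proxy for the Poissonized expected metric
```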
4. Conclusions
The result in Corollary 2 can be directly applied to the argument in Appendix A.3 of [1]. We have refined the proof so that it is carried out on balls, thereby enhancing the robustness and accuracy of the argument described in [1]. Furthermore, our study bridges the gap between discrete probabilities and their continuous counterparts by using Wasserstein metrics as the measure of approximation. Moving forward, we aim to apply our methodology to lattice problems in statistical mechanical approaches that involve similar notation and convergence from discrete physical quantities to continuous ones, such as electrostatic problems.
We derived an upper bound for the convergence rate of the Wasserstein distance between a uniform distribution and its empirical distribution using the dual method. Our result is consistent with the order of the convergence rate in [4], but we provide a specific constant term. Furthermore, we extended this analysis to estimate the convergence rate of random multinomial empirical distributions towards uniform distributions, yielding similar results. However, our approach does not apply to the case when , and we did not obtain a lower bound estimate for the convergence rate. In real-world scenarios, connections between the discrete and continuous worlds can be established through random graphs by extending mathematical concepts from manifolds to graphs. For instance, in [1], the authors generalize the Ollivier graph curvature definition to enhance its versatility and prove that the Ollivier curvature of random geometric graphs in Riemannian manifolds converges to the Ricci curvature of the manifold. Additionally, Appendix C3 in [1] provides methods for computing Wasserstein metrics through simulations.
Author Contributions
Conceptualization, W.Y., X.Z. and X.W.; methodology, W.Y. and X.Z.; validation, W.Y., X.W. and X.Z.; formal analysis, W.Y., X.W. and X.Z.; writing—original draft preparation, W.Y.; writing—review and editing, W.Y. and X.Z.; project administration, W.Y. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by a school–enterprise cooperation project: Application of hyperbolic network model in data analysis.
Data Availability Statement
No new data were created or analyzed in this study. Data sharing is not applicable to this article.
Conflicts of Interest
The authors declare no conflict of interest. The funder had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
Appendix A. The Lower Bound of
Since
it follows that
Let the set of points and then
and
Thus, we have
Finally, by Fatou’s lemma, one has
Appendix B. The Upper Bound of
Let so that as , and where Define
Define
, and then we have if First decompose , as follows:
so one has the inequality
Further, we will estimate two parts of in Appendix B.1 and in Appendix B.2, respectively. Combining them yields the following bound for the expectation of :
Appendix B.1. The Estimation of
Since f is Lipschitz, we have
It should be noted that a sharper estimate for can be found in [2].
Appendix B.2. The Estimation of
Decompose as follows:
We will estimate these three parts separately in the following subsections to obtain
Appendix B.2.1. The Estimation of
According to and the value (A2) of in , we have
Consequently, we can obtain
Appendix B.2.2. The Estimations of and
Estimating this part is challenging, and one may employ a convolution decomposition to localize f to small regions. Consequently, the following estimate holds:
First, we assume that f is the indicator function of a set where A is a measurable subset of and estimate Thus, we have
By considering different cases of and we obtain the inequality
Furthermore, let us set By using Formula (A4), we obtain
Finally, we decompose f into the sum of some well-defined convolutions to estimate these components. Since a Lipschitz function f in with can be extended to a Lipschitz function in the entire space with , we consider the function defined as (1). We then decompose it as follows, where ⋯,
and q denoted by Therefore, we have
For the first expectation mentioned above, we have
since and
For the expectation about with since , we have
and on the other hand, we have
For the last expectation about , the above argument still works. Since , we have
If one has If we may obtain , and hence
References
1. van der Hoorn, P.; Lippner, G.; Trugenberger, C.; Krioukov, D. Ollivier-Ricci curvature convergence in random geometric graphs. Phys. Rev. Res. 2021, 3, 013211.
2. Talagrand, M. Matching random samples in many dimensions. Ann. Appl. Probab. 1992, 2, 846–856.
3. Kralj-Iglič, V.; Iglič, A. A simple statistical mechanical approach to the free energy of the electric double layer including the excluded volume effect. J. Phys. II 1996, 6, 477–491.
4. Fournier, N.; Guillin, A. On the rate of convergence in Wasserstein distance of the empirical measure. Probab. Theory Relat. Fields 2015, 162, 707–738.
5. Villani, C. Optimal Transport: Old and New; Grundlehren der Mathematischen Wissenschaften 338; Springer: Berlin/Heidelberg, Germany, 2009.
6. Ambrosio, L.; Stra, F.; Trevisan, D. A PDE approach to a 2-dimensional matching problem. Probab. Theory Relat. Fields 2019, 173, 433–477.
7. Bobkov, S.G.; Ledoux, M. A simple Fourier analytic proof of the AKT optimal matching theorem. Ann. Appl. Probab. 2021, 31, 2567–2584.
8. Borda, B. Empirical measures and random walks on compact spaces in the quadratic Wasserstein metric. Ann. Inst. Henri Poincaré Probab. Stat. 2023, 59, 2017–2035.
9. Kuhn, H.W. The Hungarian method for the assignment problem. Nav. Res. Logist. Q. 1955, 2, 83–97.
10. Papadimitriou, C.H.; Steiglitz, K. Combinatorial Optimization: Algorithms and Complexity; Prentice-Hall: Englewood Cliffs, NJ, USA, 1982.
11. Penrose, M. Random Geometric Graphs; Oxford University Press: Oxford, UK, 2003.