# On Wasserstein Two-Sample Testing and Related Families of Nonparametric Tests

## Abstract

## 1. Introduction

#### 1.1. Contributions

#### 1.2. Paper Outline

## 2. Nonparametric Two-Sample Testing

#### 2.1. Three Ways to Compare Distributions

## 3. Entropy Smoothed Wasserstein Distances

#### 3.1. Wasserstein Distance

**Definition 1** (Wasserstein Distances).

#### 3.2. Entropic Smoothing

#### 3.3. Two Extremes of Smoothing: Wasserstein and Energy Distance

#### 3.4. From Energy Distance to Kernel Maximum Mean Discrepancy

## 4. Univariate Wasserstein Distance and PP/QQ Tests

#### 4.1. Comparing CDFs (PP)

#### 4.2. Comparing QFs (QQ)

#### 4.3. Wasserstein Is a QQ Test

**Proposition 1.**

## 5. Distribution-Free Wasserstein Tests and ROC/ODC Curves

#### 5.1. Relating Wasserstein Distance to ROC and ODC Curves

- The $ROC$ curve is increasing, with $ROC(0)=0$ and $ROC(1)=1$.
- If $G(t)\ge F(t)$ for all $t$, then $ROC(t)\ge t$ for all $t$.
- If $F$ and $G$ have densities with monotone likelihood ratio, then the $ROC$ curve is concave.
- The area under the $ROC$ curve is equal to $\mathbb{P}(Y<X)$, where $Y\sim Q$ and $X\sim P$.
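
The properties above can be checked numerically on simulated data. The sketch below is illustrative only and rests on assumptions that are not stated in the list: it takes $F$ and $G$ to be the CDFs of $P$ and $Q$, and it uses the threshold-sweep convention $t = 1 - G(c)$, $ROC = 1 - F(c)$, which is one convention consistent with the four properties. The helper names (`roc_staircase`, `auc_trapezoid`, `auc_pairs`) and the Gaussian example are hypothetical choices made for this illustration.

```python
import numpy as np

def roc_staircase(x, y):
    """Empirical ROC points (t, ROC(t)) from sweeping a threshold c.

    Assumed convention: t = 1 - G(c) and ROC = 1 - F(c), where F and G are
    the empirical CDFs of x (drawn from P) and y (drawn from Q).
    """
    xs, ys = np.sort(x), np.sort(y)
    # Thresholds: every pooled observation, plus -inf and +inf for the endpoints.
    c = np.concatenate(([-np.inf], np.sort(np.concatenate([x, y])), [np.inf]))
    t = 1.0 - np.searchsorted(ys, c, side="right") / len(ys)
    roc = 1.0 - np.searchsorted(xs, c, side="right") / len(xs)
    # Reverse so that t runs from 0 to 1.
    return t[::-1], roc[::-1]

def auc_trapezoid(t, roc):
    """Area under the staircase, computed with the trapezoid rule."""
    return float(np.sum(np.diff(t) * (roc[1:] + roc[:-1]) / 2.0))

def auc_pairs(x, y):
    """Fraction of pairs (x_i, y_j) with y_j < x_i, an estimate of P(Y < X)."""
    return float(np.mean(y[None, :] < x[:, None]))

# Hypothetical example: P = N(1, 1) and Q = N(0, 1), so that G(t) >= F(t).
rng = np.random.default_rng(0)
x = rng.normal(1.0, 1.0, size=2000)
y = rng.normal(0.0, 1.0, size=2000)

t, roc = roc_staircase(x, y)
print("ROC(0), ROC(1):", roc[0], roc[-1])
print("increasing:", bool(np.all(np.diff(roc) >= 0)))
print("AUC (trapezoid):", auc_trapezoid(t, roc))
print("AUC (pair count):", auc_pairs(x, y))
```

With continuous data (no ties between the two samples), the trapezoid area under the empirical staircase and the pair-count estimate of $\mathbb{P}(Y<X)$ should agree up to floating-point error. Concavity under monotone likelihood ratio is a population-level property and is not checked here.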

**Lemma 1** (Reduction to uniform distribution).

**Proof.**

**Theorem 1.**

## 6. Experiments

- Beta(2,2) versus Beta(1.8,2.16);
- Exponential(1), equivalently Gamma(1,1), versus Gamma(2,0.5);
- Standard Normal versus Student’s t;
- Generalized extreme value versus Generalized Pareto.
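
As a rough illustration of how such a comparison might be set up, the sketch below draws samples from the first two pairs in the list and computes the empirical univariate Wasserstein distance in its sorted-sample (quantile–quantile) form. Several things here are assumptions rather than the experimental protocol itself: Gamma(2,0.5) is read as shape 2 and scale 0.5 (which gives the same mean as Exponential(1)), the sample size and number of permutations are arbitrary, and the permutation calibration is a generic stand-in, not necessarily the procedure used for the reported curves. The last two pairs are omitted because their parameters are not specified in the list. The function names are hypothetical.

```python
import numpy as np

def wasserstein_1d(x, y, p=2):
    """Empirical p-Wasserstein distance between two equal-size 1-D samples.

    For equal sample sizes the optimal coupling matches order statistics,
    so the distance is an L_p norm of the difference of sorted samples.
    """
    x, y = np.sort(x), np.sort(y)
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))

def permutation_pvalue(x, y, n_perm=500, p=2, seed=0):
    """Permutation p-value for H0: P = Q (a generic calibration, assumed here)."""
    rng = np.random.default_rng(seed)
    observed = wasserstein_1d(x, y, p)
    pooled = np.concatenate([x, y])
    n = len(x)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if wasserstein_1d(perm[:n], perm[n:], p) >= observed:
            count += 1
    # Add-one correction keeps the p-value strictly positive.
    return (count + 1) / (n_perm + 1)

rng = np.random.default_rng(1)
n = 500  # arbitrary sample size chosen for this illustration
pairs = {
    "Beta(2,2) vs Beta(1.8,2.16)": (rng.beta(2.0, 2.0, n), rng.beta(1.8, 2.16, n)),
    # Gamma(2,0.5) interpreted as shape=2, scale=0.5 (an assumption).
    "Exponential(1) vs Gamma(2,0.5)": (rng.exponential(1.0, n), rng.gamma(2.0, 0.5, n)),
}
for name, (a, b) in pairs.items():
    print(name, "W2 =", wasserstein_1d(a, b), "p =", permutation_pvalue(a, b))
```

Repeating such draws many times under the null and under the alternative would give the rejection rates needed to compare tests; that outer loop is left out to keep the sketch short.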

## 7. Conclusions

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## Appendix A. Proof of Proposition 1

**Proof.**


**Figure 1.** The **left** panel contains the two PDFs used for the simulation, and the **right** panel contains the resulting precision–recall curve for several tests. From **top** to **bottom**: distributions differing in their first, second, third and fourth moments.

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Ramdas, A.; Trillos, N.G.; Cuturi, M.
On Wasserstein Two-Sample Testing and Related Families of Nonparametric Tests. *Entropy* **2017**, *19*, 47.
https://doi.org/10.3390/e19020047