Open Access Article

Using Ramsey Theory to Measure Unavoidable Spurious Correlations in Big Data

Department of Mathematics and Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada
Department of Applied Physics and Applied Mathematics, Columbia University, New York, NY 10027, USA
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Axioms 2019, 8(1), 29
Received: 18 January 2019 / Revised: 8 February 2019 / Accepted: 11 February 2019 / Published: 5 March 2019
(This article belongs to the Special Issue Perspectives on Big Data and Data Sciences)


Given a dataset, we quantify the size of patterns that must always exist in the dataset. This is done formally through the lens of Ramsey theory of graphs, and a quantitative bound known as Goodman’s theorem. By combining statistical tools with Ramsey theory of graphs, we give a nuanced understanding of how far a dataset is from being correlated, and what qualifies as a meaningful pattern. This method is applicable to a wide range of datasets. As examples, we analyze two very different datasets. The first is a dataset of repeated voters (n = 435) in the 1984 US Congress, and we quantify how homogeneous a subset of congressional voters is. We also measure how transitive a subset of voters is. Statistical Ramsey theory is also used with global economic trading data (n = 214) to provide evidence that global markets are quite transitive. While these datasets are small relative to Big Data, they illustrate the new applications we are proposing. We end with specific calls to strengthen the connections between Ramsey theory and statistical methods.
Keywords: statistics; data analysis; Ramsey theory; graph theory; transitivity

Figure 1

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

MDPI and ACS Style

Pawliuk, M.; Waddell, M.A. Using Ramsey Theory to Measure Unavoidable Spurious Correlations in Big Data. Axioms 2019, 8, 29.


Axioms, EISSN 2075-1680, published by MDPI AG, Basel, Switzerland