Perspectives on Big Data and Data Sciences

A special issue of Axioms (ISSN 2075-1680).

Deadline for manuscript submissions: closed (31 January 2020)

Special Issue Editors


Prof. Dr. Cristian S. Calude
Guest Editor
Department of Computer Science, University of Auckland, Auckland, New Zealand
Interests: algorithmic information theory; quantum computing

Prof. Dr. Giuseppe Longo
Guest Editor
1. Centre Cavaillès, République des Savoirs, CNRS USR3608, Collège de France et Ecole Normale Supérieure, Paris, France
2. School of Medicine, Tufts University, Boston, MA 02111, USA
Interests: logic and theory of computation; denotational semantics and lambda-calculus; type theory, category theory and their applications to computer science; cognitive foundations of mathematics; interfaces between mathematics, physics and biology

Prof. Dr. Maël Montévil
Guest Editor
Institut de recherche et d'Innovation (IRI), Centre Pompidou, 75004 Paris, France
Interests: theoretical biology; morphogenesis; philosophy of biology; developmental biology

Special Issue Information

Dear Colleagues,

It all started with the explosion of data generated since the dawn of the digital age in the 1970s. By some estimates, every two days we now create as much data as we did from the beginning of time until 2000. Big data and data science have emerged from the rise of computers, the Internet and the technologies capable of generating, storing, organising and processing data about the world we live in.

Data science is based on the principle that the more data one has about anything, the more reliably one can gain new insights and predict what will happen in the future. Data science projects use artificial intelligence and machine learning to teach computers to identify what various data represent and to spot patterns much more quickly and reliably than humans can. Data science has been applied in health care (data-driven medicine), in predicting and responding to natural and man-made disasters, in preventing crime, in developing autonomous devices (e.g., self-driving cars), and in many other social, business and learning projects. It would be difficult to name an area where data science cannot, in principle, be used.

Data science may give unprecedented insights and opportunities, but it also raises concerns and questions related to ethics, privacy, security and knowledge. These concerns pertain to working exclusively with correlations instead of understanding, to the source (and origin) of data, to bias in the choice of observables and metrics, and to the difficulty, or even impossibility, of understanding and verifying the correctness of solutions and decisions. Since some regularities are due merely to the large size of the data, are there critical sizes beyond which the validity of predictions becomes questionable, or, conversely, ascertained?
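
To make the last point concrete, here is a minimal sketch in Python (the sample size, number of variables and 0.3 cut-off are our own illustrative assumptions, not figures from any contribution): columns of pure, independent noise still produce pairs whose sample correlation looks "strong", simply because a large number of pairs is being compared.

    # A minimal sketch of how spurious correlations emerge from sheer size:
    # every column below is independent noise, yet some pairs still appear
    # "strongly" correlated. All numerical choices are arbitrary.
    import numpy as np

    rng = np.random.default_rng(42)
    n_samples, n_vars = 100, 500
    data = rng.normal(size=(n_samples, n_vars))   # pure noise, no real structure

    corr = np.corrcoef(data, rowvar=False)        # all pairwise sample correlations
    upper = np.triu_indices(n_vars, k=1)          # each pair counted once
    strong = np.abs(corr[upper]) > 0.3            # "strong" is an arbitrary cut-off

    print(f"{strong.sum()} of {len(strong)} pairs exceed |r| > 0.3 by chance alone")

Shrinking the sample or adding variables only increases that count, which is one way of phrasing the question about critical sizes raised above.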

Data science does not build on invariance properties in the way physics makes inert phenomena intelligible mathematically. The collection of data in biology is a particularly delicate issue: the phylo- and onto-genetic specification of organisms makes biological in vitro/in vivo measurements sensitive to synchronic as well as diachronic events. How can data science help enrich these practices, instead of blurring the complexity of the life sciences under the sheer size of databases?

This Special Issue aims to present technical results in mathematics, physics, biology and other areas that can further the understanding of data science, as well as of its limits and potential dangers.

We look forward to your contributions to this Special Issue, 

Prof. Dr. Cristian S. Calude
Prof. Dr. Giuseppe Longo
Prof. Dr. Maël Montévil
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles and short communications are invited. For planned papers, a title and a short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Axioms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Big data
  • Data science
  • Spurious correlation
  • Limits
  • Data science in physics, biology and beyond

Published Papers (1 paper)

Research

17 pages, 1121 KiB  
Article
Using Ramsey Theory to Measure Unavoidable Spurious Correlations in Big Data
by Micheal Pawliuk and Michael Alexander Waddell
Axioms 2019, 8(1), 29; https://doi.org/10.3390/axioms8010029 - 05 Mar 2019
Cited by 1 | Viewed by 4038
Abstract
Given a dataset, we quantify the size of patterns that must always exist in the dataset. This is done formally through the lens of Ramsey theory of graphs, and a quantitative bound known as Goodman’s theorem. By combining statistical tools with Ramsey theory of graphs, we give a nuanced understanding of how far away a dataset is from correlated, and what qualifies as a meaningful pattern. This method is applicable to a wide range of datasets. As examples, we analyze two very different datasets. The first is a dataset of repeated voters (n = 435) in the 1984 US Congress, and we quantify how homogeneous a subset of congressional voters is. We also measure how transitive a subset of voters is. Statistical Ramsey theory is also used with global economic trading data (n = 214) to provide evidence that global markets are quite transitive. While these datasets are small relative to Big Data, they illustrate the new applications we are proposing. We end with specific calls to strengthen the connections between Ramsey theory and statistical methods.
(This article belongs to the Special Issue Perspectives on Big Data and Data Sciences)
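
The abstract above rests on Goodman’s theorem, which lower-bounds the number of monochromatic triangles that any two-colouring of the edges of a complete graph must contain, and hence the amount of pattern that is unavoidable at a given size. The sketch below is our own illustration of that bound, not the authors' code: the synthetic data, the 0.5 correlation threshold and the variable count are assumptions made purely for the example.

    # Goodman's bound: in any red/blue colouring of the edges of K_n there are at
    # least C(n,3) - floor(n * floor((n-1)/2) * floor(n/2) / 2) monochromatic
    # triangles. Here the "colouring" marks variable pairs as strongly correlated
    # or not; every numerical choice below is illustrative only.
    from itertools import combinations
    from math import comb

    import numpy as np

    def goodman_lower_bound(n: int) -> int:
        """Minimum number of monochromatic triangles in any 2-colouring of K_n."""
        # Each non-monochromatic triangle is counted twice by the mixed-colour
        # edge pairs at its vertices, and each vertex contributes at most
        # floor((n-1)/2) * ceil((n-1)/2) such pairs.
        max_bichromatic = (n * ((n - 1) // 2) * (n // 2)) // 2
        return comb(n, 3) - max_bichromatic

    def monochromatic_triangles(colour: np.ndarray) -> int:
        """Count triangles whose three edges share one colour (boolean n x n matrix)."""
        n = colour.shape[0]
        return sum(
            1
            for i, j, k in combinations(range(n), 3)
            if colour[i, j] == colour[j, k] == colour[i, k]
        )

    rng = np.random.default_rng(0)
    data = rng.normal(size=(200, 12))       # 200 samples, 12 variables (arbitrary)
    corr = np.corrcoef(data, rowvar=False)
    colour = np.abs(corr) >= 0.5            # edge colour: "strongly correlated" or not
    print("observed monochromatic triangles:", monochromatic_triangles(colour))
    print("unavoidable by Goodman's bound:  ", goodman_lower_bound(corr.shape[0]))

On this reading, triangle counts close to the printed bound are explainable by Ramsey-type unavoidability alone, while a large excess is a candidate for genuine structure.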
