Open Access Review

Large-Scale Simultaneous Inference with Hypothesis Testing: Multiple Testing Procedures in Practice

1 Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, 33100 Tampere, Finland
2 Institute of Biosciences and Medical Technology, 33520 Tampere, Finland
3 Institute for Intelligent Production, Faculty for Management, University of Applied Sciences Upper Austria, Steyr Campus, Steyr 4400, Austria
4 Department of Mechatronics and Biomedical Computer Science, University for Health Sciences, Medical Informatics and Technology, Tirol 6060, Austria
5 College of Computer and Control Engineering, Nankai University, Tianjin 300071, China
* Author to whom correspondence should be addressed.
Mach. Learn. Knowl. Extr. 2019, 1(2), 653-683; https://doi.org/10.3390/make1020039
Received: 29 March 2019 / Revised: 4 May 2019 / Accepted: 6 May 2019 / Published: 15 May 2019
(This article belongs to the Section Learning)
Statistical hypothesis testing is one of the most prominent methods in statistics. Its pivotal role stems from the wide range of practical problems to which it can be applied and from its modest data requirements. Being an unsupervised method makes it very flexible in adapting to real-world situations. With high-dimensional data, such hypothesis tests must be applied simultaneously to the test statistics of the underlying covariates. However, if applied without correction, this inevitably increases the number of Type 1 errors. To counteract this effect, multiple testing procedures have been introduced to control various types of errors, most notably the Type 1 error. In this paper, we review modern multiple testing procedures for controlling either the family-wise error rate (FWER) or the false discovery rate (FDR). We emphasize their underlying principles, which allow them to be categorized as (1) single-step vs. stepwise approaches, (2) adaptive vs. non-adaptive approaches, and (3) marginal vs. joint multiple testing procedures. We place particular focus on procedures that can deal with data with a (strong) correlation structure, because real-world data are rarely uncorrelated. Furthermore, we provide background information that makes the often technically intricate methods accessible to interdisciplinary data scientists.
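To make the inflation of Type 1 errors concrete, the following is a minimal sketch in Python, not taken from the article itself. It assumes independent tests of true null hypotheses for the uncorrected FWER formula, and it uses the Bonferroni correction (a single-step FWER procedure) and the Benjamini-Hochberg procedure (a step-up FDR procedure) as two common illustrations. The function names and the example p-values are hypothetical.

```python
def fwer_uncorrected(m, alpha=0.05):
    """Probability of at least one Type 1 error among m tests.

    Assumes all m null hypotheses are true and the tests are independent,
    which rarely holds for real-world (correlated) data.
    """
    return 1.0 - (1.0 - alpha) ** m


def bonferroni(p_values, alpha=0.05):
    """Single-step FWER control: reject H_i if p_i <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Step-up FDR control: reject the k smallest p-values, where k is
    the largest rank with p_(k) <= (k / m) * alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Hypothetical p-values, for illustration only.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

print(round(fwer_uncorrected(100), 3))    # FWER for 100 uncorrected tests
print(sum(bonferroni(p_values)))          # number of rejections (FWER control)
print(sum(benjamini_hochberg(p_values)))  # number of rejections (FDR control)
```

With these example values, the Benjamini-Hochberg procedure rejects more hypotheses than Bonferroni. This reflects the usual trade-off: controlling the FDR is less conservative than controlling the FWER.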
Keywords: hypothesis testing; machine learning; statistics; multiple testing correction; multiple comparisons; high-dimensional data; data science
MDPI and ACS Style

Emmert-Streib, F.; Dehmer, M. Large-Scale Simultaneous Inference with Hypothesis Testing: Multiple Testing Procedures in Practice. Mach. Learn. Knowl. Extr. 2019, 1, 653-683.
