In traditional research, repeated measurements lead to a sample of results, and inferential statistics can be used to not only estimate parameters, but also to test statistical hypotheses concerning these parameters. In many cases, the standard error of the estimates decreases (asymptotically) with the square root of the sample size, which provides a stimulus to probe large samples. In simulation models, the situation is entirely different. When probability distribution functions for model features are specified, the probability distribution function of the model output can be approached using numerical techniques, such as bootstrapping or Monte Carlo sampling. Given the computational power of most PCs today, the sample size can be increased almost without bounds. The result is that standard errors of parameters are vanishingly small, and that almost all significance tests will lead to a rejected null hypothesis. Clearly, another approach to statistical significance is needed. This paper analyzes the situation and connects the discussion to other domains in which the null hypothesis significance test (NHST) paradigm is challenged. In particular, the notions of effect size and Cohen’s d
provide promising alternatives for the establishment of a new indicator of statistical significance. This indicator attempts to cover significance (precision) and effect size (relevance) in one measure. Although in the end more fundamental changes are called for, our approach has the attractiveness of requiring only a minimal change to the practice of statistics. The analysis is not only relevant for artificial samples, but also for present-day huge samples, associated with the availability of big data.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited