Article
Peer-Review Record

Revisiting the Large n (Sample Size) Problem: How to Avert Spurious Significance Results

Stats 2023, 6(4), 1323-1338; https://doi.org/10.3390/stats6040081
by Aris Spanos
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 9 November 2023 / Revised: 26 November 2023 / Accepted: 30 November 2023 / Published: 5 December 2023
(This article belongs to the Section Statistical Methods)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Report on the manuscript: stats-2729962
"Revisiting the Large n (sample size) Problem: How to Avert Spurious Significance Results"


The manuscript is well written, in clear, simple, and scientifically reliable language. It is a manuscript that is pleasant to read and even interesting but there are some comments that need to be taken into consideration:

1. The abstract must remain in the introduction section; the author's task is to craft compelling opening sentences that effectively convey the main idea of the work.
2. Keywords should not exceed five parts and are useful for quickly expressing what is inside the manuscript.
3. The author must add a paragraph explaining the main aim of the manuscript. I think it could be added in the introduction section after Line 100.
4. In Equation (2), define ℕ.
5. It is much more effective, in my opinion, for the authors to make a few key, well-stated points when commenting on Figures 4 and 5 (Comment on the figures in the form of bright spots).
6. Line 309. "48 of which have tiny p-values, <.0000". It needs detailing and correction.
7. In addition to the conclusions section, I think it would be better to include the most notable results as bright spots at the end of the section.

Comments for author File: Comments.pdf

Author Response

Author's response to reviewer 1

The manuscript is well written, in clear, simple, and scientifically reliable language. It is a manuscript that is pleasant to read and even interesting but there are some comments that need to be taken into consideration:

1. The abstract must remain in the introduction section; the author's task is to craft compelling opening sentences that effectively convey the main idea of the work.

Author's response: I greatly appreciate the comments and suggestions by this reviewer! With the help of reviewer 2, the abstract has been thoroughly revised to address the issues raised by this reviewer.   


2. Keywords should not exceed five parts and are useful for quickly expressing what is inside the manuscript.

Author's response: done!

3. The author must add a paragraph explaining the main aim of the manuscript. I think it could be added in the introduction section after Line 100.

Author's response: A paragraph has been added to the revised version. It reads:

"The main objective of the paper is to make a case that the large n problem can be addressed using a principled argument based on the post-data severity (SEV) evaluation of the unduly data-specific accept/reject results. This is achieved by accounting for the inference-related uncertainty to provide an evidential interpretation of such results that revolves around the discrepancy γ≠0 from the null value warranted by the particular test and data x0 with high enough post-data error probability. This provides an inductive generalization of the accept/reject results that enhances learning from data."

4. In Equation (2), define ℕ.

Author's response: ℕ is defined, line 131

5. It is much more effective, in my opinion, for the authors to make a few key, well-stated points when commenting on Figures 4 and 5 (Comment on the figures in the form of bright spots).

Author's response: More delineating comments have been added to explain figures 4 and 5 in more detail.

6. Line 309. "48 of which have tiny p-values, <.0000". It needs detailing and correction.

Author's response: More explanatory details have been added.


7. In addition to the conclusions section, I think it would be better to include the most notable results as bright spots at the end of the section.

Author's response: A short paragraph has been added to the end. It reads:

"The SEV evaluation was illustrated above using two empirical results from Abouk et al. (2022). Example 2A concerns an estimated coefficient, βk=.004, SE(β̂k)=.002, in a LR model, which is declared statistically significant at α=.05 with n=24732966. The SEV evaluation yields a discrepancy γ†≤.0000001 from βk=0 warranted by data z0 and the t-test with probability .977. Example 2B concerns the difference between two means whose estimates are x̄n1=2.51, ȳn2=2.51, but the t-test outputted τ(z0)=2.77 with N=6108194. The SEV evaluation yields a discrepancy γ†≤.00000068 from (μ1−μ2)=0 warranted by data z0 and the t-test with probability .97. Both empirical examples represent cases of spurious statistical significance results stemming from exceptionally large sample sizes, n = 24732966 and N = 6108194, respectively."
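
The arithmetic behind Example 2A in the paragraph quoted above can be checked with a short sketch. It rests on assumptions that are not stated in this record: that the post-rejection severity of the claim "βk > γ" is evaluated as Φ((estimate − γ)/SE), and that the Normal distribution is an adequate stand-in for the Student's t at this sample size. The function names are hypothetical and are not taken from the paper.

```python
import math


def normal_cdf(z):
    """Standard Normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def severity_after_reject(estimate, std_err, gamma):
    """Assumed post-rejection severity of the claim 'parameter > gamma'.

    Evaluates P(test statistic <= observed value; parameter = gamma)
    under a Normal approximation. This formula is an assumption of the
    sketch, not a quotation from the manuscript.
    """
    return normal_cdf((estimate - gamma) / std_err)


# Figures quoted for Example 2A: estimate .004, SE .002, gamma .0000001.
sev_2a = severity_after_reject(0.004, 0.002, 1e-7)
print(f"SEV for Example 2A: {sev_2a:.3f}")
```

Under these assumptions the result agrees with the probability .977 reported for Example 2A. Example 2B is not reproduced because the record does not give the standard error of the difference between the two means.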

Author's thanks: I'm most grateful to this reviewer who took the time to read the paper carefully and make the most helpful comments I could hope for! 

Reviewer 2 Report

Comments and Suggestions for Authors

While large data sets are generally viewed as advantageous to provide more precise and reliable evidence, it is often overlooked that these benefits are contingent upon certain conditions being met. The primary condition is the approximate validity (statistical adequacy) of the probabilistic assumptions inherent in the statistical model Mθ(x) applied to the data. In the case of a statistically adequate Mθ(x) and a given α, as n increases, the power of a test rises, and the p-value diminishes due to the inherent trade-off between type I and type II error probabilities in frequentist testing.
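
The behaviour described in the paragraph above, where the p-value shrinks and the power rises as n grows for a fixed α, can be illustrated with a minimal sketch. It assumes a one-sided test of a Normal mean with known σ, which is a simplification chosen for illustration and not the model used in the manuscript. The function names and the values 0.01 and σ = 1 are hypothetical.

```python
import math
from statistics import NormalDist


def upper_tail(z):
    """P(Z > z) for a standard Normal Z, accurate in the far tail."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def p_value(diff, sigma, n):
    """One-sided p-value when the observed mean exceeds the null by diff."""
    return upper_tail(math.sqrt(n) * diff / sigma)


def power(gamma, sigma, n, alpha=0.05):
    """Power of the one-sided test against a true discrepancy gamma."""
    c_alpha = NormalDist().inv_cdf(1.0 - alpha)  # rejection threshold
    return upper_tail(c_alpha - math.sqrt(n) * gamma / sigma)


# Hold the observed difference and the discrepancy fixed; vary only n.
sample_sizes = [100, 10_000, 1_000_000]
p_values = [p_value(0.01, 1.0, n) for n in sample_sizes]
powers = [power(0.01, 1.0, n) for n in sample_sizes]

for n, p, pw in zip(sample_sizes, p_values, powers):
    print(f"n={n:>9,}  p-value={p:.3g}  power={pw:.3f}")
```

In this sketch the same tiny difference of 0.01 is far from significant at n = 100 and overwhelmingly significant at n = 1,000,000, which is the pattern the large n problem refers to.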

This phenomenon raises concerns about the reliability of declaring 'statistical significance' based on conventional significance levels when n is exceptionally large. To address this issue, the author proposed that a principled approach, in the form of post-data severity (SEV) evaluation, be employed. The SEV evaluation represents a post-data error probability that transforms specific data-driven results into evidence either supporting or contradicting inferential claims regarding the parameters of interest. This approach offers a more nuanced and robust perspective in navigating the challenges posed by the large n problem.

Author Response

Author's response to reviewer 2

I cannot adequately express my heartfelt gratitude to reviewer 2, who took the time to suggest a possible abstract that agrees with reviewer 1's suggestion. This has never happened to me in 45 years of publishing academic papers! Thank you!!

 
