Diffusion-Based Parameters for Stock Clustering: Sector Separation and Out-of-Sample Evidence
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This study estimates key distribution parameters of stock prices using two approaches, based on the Black-Scholes model and on log returns, respectively. The study then conducts a battery of statistical tests to compare the performance of k-means clustering using the two distinct sets of parameters. It finds that the model-based approach leads to better k-means clustering during turbulent months. The paper is very well written and structured. I provide the following comments and suggestions for the authors to consider.
Firstly, if the goal is to explore model-based approaches to estimating the distribution parameters of stock prices, as a complement to the primitive approach of directly using log returns, I’m wondering why the authors only report results for the Black-Scholes model. As the findings suggest, model-based approaches are superior during months with larger price fluctuations. My take is that the drift-diffusion process of the Black-Scholes model better captures unexpected price movements. If this is the case, the authors may try other stochastic models such as stochastic volatility, jump-diffusion, stochastic volatility with jumps, etc. In fact, the marginal benefit of the model approach in the paper points to the weakness of Black-Scholes in capturing features of stock prices such as fat tails, volatility clustering, jumps, and mean reversion.
Secondly, the authors conduct a wide array of statistical tests to compare the performance of the model approach and the return approach. However, a research paper should also examine the causes of the empirical findings and what they can be attributed to.
Thirdly, the conclusion of the article suggests that model-based parameters can complement traditional return-based approaches for market forecasting. However, as the superiority of the model-based approach is not persistent, the authors need to contemplate an executable framework or develop a synthetic approach for readers to incorporate the model-based outputs.
Overall, more effort is needed to enhance the scientific quality of the article and to convey the importance of the study. So far, the authors present models and evidence to answer the “what” question, but answers to “why” and “so what” remain to be developed.
Author Response
Please find the revised manuscript attached.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Please refer to the review report.
Comments for author File:
Comments.pdf
Author Response
Please find the revised manuscript attached.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The study analyzes stock clustering based on diffusion parameters, more specifically drift and volatility, using a dataset from Thailand's stock exchange.
- The language must be revised to make the paper more readable. For instance, the passage "Applying clustering applies clustering techniques to stock groups or stock market indices that exhibit" is not clear.
- Some passages should be accompanied by more influential references. For instance, in the passage "Numerous financial studies have developed analytical or closed-form expressions applied within stochastic differential equations (SDEs) to assess possible outcomes outside the classical use of SDEs in derivatives pricing", references seem to be hand-picked from works of the third author. Instead, more influential papers should be cited.
- I am not sure whether the description of some algorithms is relevant. For instance, Algorithm 2 seems to be just the estimation of mu and sigma based on the arithmetic mean and sample standard deviation of historical log return data, which is a common procedure.
- In this context, formulas for estimates of mu and sigma in subsection 2.3.2 would also be less relevant.
- It is important to explain more fully the rationale for estimating parameters using the Black-Scholes formula.
- Isn't it necessary to have call or put prices? Or is just the price of the stocks needed? In that case, wouldn't the assumption of a Brownian diffusion process suffice, instead of all the other Black-Scholes assumptions?
- There are unnecessary tables, such as those in Appendix A.
- The theoretical and practical implications of the study are not clear. This issue should be stressed.
- For instance, how does the paper contribute to the theoretical discussion of clustering of parameters?
- In addition, how would the results impact practitioners? Since stocks seem to move between different clusters depending on the time horizon studied, how would the results help identify "early warning signals of market stress, inform sector rotation strategies, and strengthen risk management framework"?
- It is paramount to discuss how the specific results of the clustering study would help traders, decision makers, regulators, etc.
- Is there any trading strategy derived from the results of the study that could lead to abnormal results? What is this strategy and what is its risk-adjusted return?
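To make the points above about Algorithm 2 and the Brownian-diffusion assumption concrete, the following is a minimal sketch of how drift and volatility could be recovered from stock prices alone, with no option prices. It assumes geometric Brownian motion, under which log returns are i.i.d. normal with mean (mu - sigma^2/2)·dt and variance sigma^2·dt. The function name `estimate_gbm_params`, the 252-day annualisation, and the price series are illustrative assumptions, not the authors' implementation.

```python
import math
import statistics


def estimate_gbm_params(prices, dt=1.0 / 252.0):
    """Estimate annualised drift (mu) and volatility (sigma) from prices.

    Assumes geometric Brownian motion, so log returns are i.i.d. normal
    with mean (mu - sigma^2 / 2) * dt and variance sigma^2 * dt.
    dt = 1/252 (daily data, 252 trading days) is an assumption.
    """
    if len(prices) < 3:
        raise ValueError("need at least three prices (two log returns)")
    # Log returns between consecutive prices.
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    mean_r = statistics.fmean(log_returns)
    std_r = statistics.stdev(log_returns)  # sample standard deviation (n - 1)
    sigma = std_r / math.sqrt(dt)
    # Add back the Ito correction sigma^2 / 2 to recover the drift.
    mu = mean_r / dt + 0.5 * sigma ** 2
    return mu, sigma


# Made-up prices, for illustration only.
prices = [100.0, 101.0, 99.5, 102.0, 103.2, 102.1]
mu_hat, sigma_hat = estimate_gbm_params(prices)
print(f"mu = {mu_hat:.4f}, sigma = {sigma_hat:.4f}")
```

Under this assumption the only difference from the plain sample mean of log returns is the sigma^2/2 correction in the drift, which is why the rationale for the model-based estimate deserves a fuller explanation in the manuscript.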
Author Response
Please find the revised manuscript attached.
Author Response File:
Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The quality of the article has improved significantly, and it should be ready to publish.
Author Response
We sincerely thank the reviewer for the positive evaluation and encouraging remark. We greatly appreciate the reviewer’s time and constructive feedback throughout the review process.
Reviewer 2 Report
Comments and Suggestions for Authors
I have reviewed the revised manuscript "Diffusion-Based Parameters for Stock Clustering: Sector Separation and Out-of-Sample Evidence" and the authors' response document. All concerns from my previous review have been satisfactorily addressed.
Critical Issues Resolution (4/4 Resolved)
1. Logical Inconsistency - The authors clarified the distinction between individual-stock lognormality (required for valid MLE) and aggregate market turbulence (heightened volatility across sectors). The revision in Section 2.2 (lines 121-127) frames BS parameters as structural filters that isolate persistent volatility dynamics from transient noise. This resolves the circular reasoning concern.
2. Out-of-Sample Validation - The addition of Sharpe ratios (Table 3), clarified methodology (lines 361-381), and transparent acknowledgment of single-month validation limitations address my concerns. The authors reasonably justify why transaction costs are negligible for liquid SET100 constituents.
3. Sample Size Reduction - The authors justify the Anderson-Darling test as necessary for theoretically sound parameter estimation (lines 122-128) while acknowledging in Section 4.8 (lines 553-556) that months with reduced samples require cautious interpretation. They recommend future research with relaxed screening thresholds.
4. Multiple Testing - The Bonferroni correction (α_adj ≈ 0.0042) was correctly applied to 12 one-sided tests. Both August and November remained significant, actually strengthening the original conclusions. Section 4.2 (lines 427-443) provides appropriate discussion of the statistical framework.
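As a quick check on the arithmetic in item 4, the sketch below reproduces the adjusted threshold. The number of tests (12) comes from the report; the family-wise level of 0.05 is an assumption, since the report does not state it, and `bonferroni_threshold` is a hypothetical helper name.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


ALPHA = 0.05   # assumed family-wise level; not stated in the report
N_TESTS = 12   # number of one-sided tests, as stated in the report

alpha_adj = bonferroni_threshold(ALPHA, N_TESTS)
print(round(alpha_adj, 4))  # 0.0042
```

A monthly p-value would then count as significant only if it falls below this per-test threshold rather than below the unadjusted level.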
Important Improvements (4/4 Addressed)
The single-year limitation is acknowledged in Section 4.8 (lines 539-543). The K=8 selection is now transparently framed as a case study balancing statistical cohesion and sectoral interpretability (lines 342-346). The ARI interpretation confusion is resolved through clear explanation that high ARI indicates boundary agreement while BS features provide tighter within-cluster cohesion (lines 312-317). The impact of multiple testing correction receives appropriate discussion in Section 4.2.
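The ARI point can be illustrated with a small sketch. The adjusted Rand index compares only the cluster labels of two partitions, so it measures agreement on cluster boundaries and carries no information about how tight the clusters are in feature space. The implementation below is the standard pair-counting formula in plain Python; the label vectors are made up, and nothing here is taken from the authors' code.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items")
    # Pairs of items placed together in both partitions.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs of items placed together within each partition separately.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are a single cluster
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical labels: the same grouping under different cluster names.
bs_labels = [0, 0, 1, 1, 2, 2]
lr_labels = [2, 2, 0, 0, 1, 1]
print(adjusted_rand_index(bs_labels, lr_labels))  # 1.0
```

Two feature sets can therefore yield an ARI near 1 while still differing in within-cluster cohesion, which a feature-space measure such as the silhouette score captures instead.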
Marginal Issues (7/7 Corrected)
All minor issues have been corrected: awkward phrasing revised, Figure 3 heatmap added for better visualization, Tables A1/A2 replaced with informative summary statistics (Table 1), software specified (scipy.stats.anderson), computational costs discussed (Subsection 4.5), economic intuition explained (Subsection 4.2), and derivatives pricing connection expanded (Subsection 4.7).
Notable Revision Strengths
The Market-Regime Decision Framework in Section 4.5 adds practical value by specifying when BS versus LR clustering should be prioritized. The restructured discussion with new subsections strengthens the paper's contribution. The authors demonstrate transparency through detailed methodology, software specification, and balanced acknowledgment of limitations in Section 4.8.
Recommendation
I recommend acceptance for publication.
The manuscript makes a solid contribution by introducing diffusion-based features for stock clustering, validating their performance during market stress through multiple metrics (silhouette scores, ARI, Wilcoxon tests), and providing out-of-sample portfolio evidence. The work connects equity clustering to derivatives pricing in a novel way while maintaining methodological soundness and transparent reporting.
The authors have addressed all methodological concerns without compromising the paper's core contribution. No additional revisions are required.
Confirmation:
- All 4 critical issues resolved
- All 4 important improvements implemented
- All 7 marginal issues corrected
- Manuscript meets publication standards
Thank you for the opportunity to review this revision. The authors' responsiveness has resulted in a much stronger manuscript.
Author Response
We sincerely thank the reviewer for the thorough re-evaluation and highly encouraging feedback. We greatly appreciate the reviewer’s detailed and constructive guidance throughout the review process, which has significantly strengthened the quality and clarity of the manuscript.
Reviewer 3 Report
Comments and Suggestions for Authors
The new version of the manuscript addresses the relevant issues raised in the review.
I suggest including in the abstract the specific (not generic) implications of the study, as indicated in the authors' reply.
Author Response
Please find the revised manuscript attached.
Author Response File:
Author Response.pdf
