Review Reports - Stock Market Bubble Warning: A Restricted Boltzmann Machine Approach Using Volatility–Return Sequences

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper presents an approach for the early prediction of stock bubbles and of the states that follow a stock market's bubble-bursting point. The approach is based on Restricted Boltzmann Machines, Random Forest and Feedforward Neural Networks, which are used in two different phases. The paper uses a training set based on the S&P 500 and reports an AUC of 70% and a false positive rate of 20%.

The problem the paper attempts to address is interesting and certainly challenging. It is also very interesting that the proposed solution attempts to use a variety of machine learning algorithms to compensate for the complexity of the problem and the different phases it seems to have. In terms of the algorithms used, the paper is definitely on the right track.

However, there are a number of serious issues that make it hard for the reader to fully comprehend or relate to the importance of the approach or the results produced. This makes it difficult to see where exactly the contribution of the paper lies. Here are some more details on this:

1) The research question is not precisely stated. While the paper vaguely states some concerns, it does not clearly state what it attempts to do: to increase the accuracy of such predictions? To identify periods of instability unambiguously? To assess abrupt drops and whether these are warnings of bubbles? Are the algorithms the Achilles' heel of such warning systems, since the proper algorithms have yet to be found? Without such a clearly stated question, (a) the results reported in the paper cannot be assessed and (b) the paper's contribution is not clear.

2) The background work needs to be substantially improved. Some of the background work is in the introduction (e.g. from "Machine learning approaches emerge and practical models..." and onwards) and this should be placed in the background work. The introduction needs to be rewritten to clearly explain why today an accurate prediction is still an open problem.

3) The paper does not compare the adopted approach with other existing approaches to predicting bubbles or with other bubble early warning systems. For example, what are the metrics (AUC, false positive rates) of other approaches? In contemporary research there are similar approaches that are based on (a) simple machine learning models (e.g. logistic regression, decision trees), (b) more complicated models such as deep learning and ensemble methods, and (c) macroeconomic warning systems. How does the paper's approach compare to these systems? Which gaps in such systems does the proposed approach attempt to cover?

4) The presented metrics (an AUC of 70% and a false positive rate of 20%) are not bad, but not good either; they are average at best. There are approaches that achieve higher performance. A false positive rate of 20% is also rather high. Again, the paper must clearly show that these numbers are within today's state of the art.

5) The contribution is not exactly clear. The paper reports an AUC of 70% and a false positive rate of 20%. This is not bad, as I said. Yet other than that, I could not see any other novel remark. This is not to say that there is none, but rather that the way this research is presented does not highlight some interesting aspects and findings that I'm sure are lurking behind these numbers.
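
To illustrate the kind of side-by-side comparison asked for in points 3 and 4, the following is a minimal sketch of how AUC and false positive rate could be computed for a proposed warning score and a baseline on the same labelled periods. The toy labels, the two score lists, the names `proposed` and `baseline`, and the 0.5 threshold are all hypothetical and are not taken from the manuscript.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive period is scored
    above a random negative period (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def false_positive_rate(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> float:
    """Share of negative periods whose score reaches the warning threshold."""
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not neg:
        raise ValueError("need at least one negative label")
    return sum(s >= threshold for s in neg) / len(neg)


# Hypothetical toy data: 1 marks a pre-burst period, 0 a normal period.
labels = [0, 0, 0, 0, 0, 1, 1, 1]
proposed = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.5, 0.9]  # assumed model scores
baseline = [0.3, 0.1, 0.6, 0.7, 0.2, 0.4, 0.5, 0.8]  # assumed baseline scores

for name, scores in (("proposed", proposed), ("baseline", baseline)):
    auc = roc_auc(labels, scores)
    fpr = false_positive_rate(labels, scores, threshold=0.5)
    print(f"{name}: AUC={auc:.3f}, FPR={fpr:.3f}")
```

Reporting both numbers for every method on identical data and at a stated threshold is what allows a reader to judge whether an AUC of 70% and a false positive rate of 20% are competitive.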

Author Response

Please, see the pdf file. Changes in the manuscript are in red color.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Please check my detailed comments in the attachment.

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Comment 1 – Table 1: The columns “Median of Volatility Percentil” and “Median of Return Percentil” contain typos; they should be “Percentile”.
Comment 2 – Line 561: “An impeding tipping point” should be “impending”.

Author Response

Please, see the pdf file. Changes in the manuscript are in red color.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The authors present a combined algorithm and methodology by integrating un-supervised and supervised learning approaches to propose novel quantitative results by analyzing volatility of a market index and daily returns. The methodology and approach presented are interesting and provide an interesting insight for a general readership.

The manuscript could be strengthened by addressing a few additional points:

The introduction section is full of details but does not appear to orient a general readership to understand the comparative and quantitative approach for stock index and market volatility. This could be more streamlined.
I don't understand why the information shown in Table 1 merits its own table.
Figure 1 panels (C) and (D) are not explained clearly. What is the significance of the variations shown in each individual panel?
Equation (1) is explained naively. Please provide more details and the different limits of the variables shown in this equation.
It would be useful to quantitatively describe the statistical significance of pairwise consecutive years shown in Figure 4.
How are the different panels shown in Figure 8 different? This is presented in a convoluted manner and needs to be quantitated.
It is confusing to read Table 2. Again, I don't see the necessity to describe the information shown here in the form of a Table.
The majority of the figures are of very low quality, and the figure captions are difficult to read. This needs to be overhauled.

### Additional comment ###

• What is the main question addressed by the research?
The authors present a combined algorithm and methodology by integrating un-supervised and supervised learning approaches to propose novel quantitative results by analyzing volatility of a market index and daily returns.

• Do you consider the topic original or relevant to the field? Does it address a specific gap in the field? Please also explain why this is/is not the case.
The methodology and approach presented are interesting and provide an interesting insight for a general readership.

• What does it add to the subject area compared with other published material?
The integrated methodology presented to predict daily returns and market volatility is unique and provides interesting insights.
• What specific improvements should the authors consider regarding the methodology?
Figure 1 panels (C) and (D) are not explained clearly. What is the significance of the variations shown in each individual panel?
Equation (1) is explained naively. Please provide more details and the different limits of the variables shown in this equation.
It would be useful to quantitatively describe the statistical significance of pairwise consecutive years shown in Figure 4. The authors could incorporate T-test (p-value), as an example.
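
As a hedged illustration of the significance check suggested above, the sketch below compares two consecutive years with Welch's t statistic. To stay within the Python standard library it obtains the p-value from a permutation test, not from the t distribution, so it is a stand-in for a textbook t-test and not the same procedure. The sample values, the names `year_a` and `year_b`, and the resample count are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch's t statistic for two samples with possibly unequal variances."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        raise ValueError("both samples are constant; t is undefined")
    return (mean(a) - mean(b)) / se


def permutation_p_value(
    a: Sequence[float],
    b: Sequence[float],
    n_resamples: int = 5000,
    seed: int = 0,
) -> float:
    """Two-sided p-value: share of random relabelings whose |t| is at
    least as large as the observed |t| (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = abs(welch_t(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        t = abs(welch_t(pooled[: len(a)], pooled[len(a):]))
        if t >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)


# Hypothetical yearly samples (e.g. a per-window statistic for two years).
year_a = [0.10, 0.25, 0.18, 0.31, 0.22, 0.15, 0.27, 0.20]
year_b = [1.12, 1.30, 1.21, 1.05, 1.26, 1.17, 1.34, 1.09]

print("Welch t:", round(welch_t(year_a, year_b), 3))
print("permutation p-value:", permutation_p_value(year_a, year_b))
```

The permutation step assumes observations are exchangeable between the two years, which daily financial series may violate because of serial dependence; any reported p-value should be read with that caveat.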

• Are the conclusions consistent with the evidence and arguments presented and do they address the main question posed? Please also explain why this is/is not the case.

The methodology presented in the manuscript addresses the main question proposed in the research by combining supervised and unsupervised learning approaches which is an interesting approach taken to decipher the research question proposed. As such, the model hyperparameters obtained have been quantitated in the appropriate context.
• Are the references appropriate?
While sufficient references have been provided, the manuscript would benefit from more references focused on benchmarking on individual supervised and unsupervised learning methodologies.
• Any additional comments on the tables and figures.
How are the different panels shown in Figure 8 different? This is presented in a convoluted manner and needs to be quantitated.
It is confusing to read Table 2. Again, I don't see the necessity to describe the information shown here in the form of a Table.
The majority of the figures are of very low quality, and the figure captions are difficult to read. This needs to be overhauled.

Author Response

Please, see the pdf file. Changes in the manuscript are in red color.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Author Response

The authors have addressed all important points in the first review and have substantially improved the aspects that had been pointed out. The research objective is clear and the contribution has now been well carved out. There are some minor grammar errors, but a careful reading will spot them.

Response:
We are grateful for the reviewer's comments. Thank you for the opportunity to improve the manuscript. We have carefully reviewed the manuscript for grammar errors and corrected them. Attached to this message is the latest version of the manuscript, in which editorial and grammatical improvements are marked in blue.

Reviewer 2 Report

Comments and Suggestions for Authors

I think the revised paper is an astonishing improvement over the original submission. The authors did a remarkable job and have offered a thorough discussion of their approach. Specifically, I am truly impressed by the great effort made to solve the black-box character of the RBM method using generative sensitivity analysis, which provides a creative alternative to direct feature interpretation.

I would like to emphasize that the extra discussions on drift and recalibration needs are particularly insightful. In these conversations, the authors recognize the practical difficulties of implementing such models in real-world financial environments. I also appreciate the authors’ openness regarding the limitations of their approach while effectively highlighting its advantages.

I also observed a number of notable enhancements in comparison to the original manuscript.

In the revised manuscript, the authors present a remarkable solution through generative sensitivity analysis. This approach provides significant insights without necessitating direct interpretation of hidden units, aligning seamlessly with the latest trends in explainable AI.

Once again, I appreciate the authors’ open discussion about the challenges in benchmarking bubble detection models due to the lack of standardized frameworks.

A special thanks to the authors for sharing insights on RBM training methodologies, especially their reasoning for using Persistent Contrastive Divergence and their approach to managing variance parameters.

The analysis of forward-testing and the discussion on model drift are valuable enhancements that improve the practical applicability of the approach.

In my view, the authors have effectively addressed all of my inquiries. The revised paper contributes to financial bubble detection using machine learning methods. I recommend it for publication.

Author Response

Firstly, I sincerely appreciate the authors’ thorough and careful replies. They put a great deal of effort into revising the manuscript and addressed all of my questions in the first round of review. I think the revised paper is an astonishing improvement over the original submission. The authors did a remarkable job and have offered a thorough discussion of their approach. Specifically, I am truly impressed by the great effort made to solve the black-box character of the RBM method using generative sensitivity analysis, which provides a creative alternative to direct feature interpretation.

I would like to emphasize that the extra discussions on drift and recalibration needs are particularly insightful. In these conversations, the authors recognize the practical difficulties of implementing such models in real-world financial environments. I also appreciate the authors’ openness regarding the limitations of their approach while effectively highlighting its advantages.

I also observed a number of notable enhancements in comparison to the original manuscript.

In the revised manuscript, the authors present a remarkable solution through generative sensitivity analysis. This approach provides significant insights without necessitating direct interpretation of hidden units, aligning seamlessly with the latest trends in explainable AI.

Once again, I appreciate the authors’ open discussion about the challenges in benchmarking bubble detection models due to the lack of standardized frameworks.

A special thanks to the authors for sharing insights on RBM training methodologies, especially their reasoning for using Persistent Contrastive Divergence and their approach to managing variance parameters.

The analysis of forward-testing and the discussion on model drift are valuable enhancements that improve the practical applicability of the approach.

In my view, the authors have effectively addressed all of my inquiries. The revised paper contributes to financial bubble detection using machine learning methods. I recommend it for publication.

Response:
We are grateful for the reviewer's comments. Thank you for the opportunity to improve the manuscript. The observations raised by the reviewer have allowed us to identify several opportunities for future research and generated a very interesting discussion regarding the problem of bubble detection in complex systems.

Reviewer 3 Report

Comments and Suggestions for Authors

The revised manuscript addresses the questions raised by this reviewer.

Author Response

The revised manuscript addresses the questions raised by this reviewer.

Response:
We are grateful for the reviewer's comments. Thank you for the opportunity to improve the manuscript.