Article
Peer-Review Record

Enhancing Bitcoin Price Prediction with Deep Learning: Integrating Social Media Sentiment and Historical Data

Appl. Sci. 2025, 15(3), 1554; https://doi.org/10.3390/app15031554
by Hla Soe Htay, Mani Ghahremani * and Stavros Shiaeles
Reviewer 1:
Reviewer 2:
Reviewer 3: Anonymous
Reviewer 4:
Submission received: 20 December 2024 / Revised: 28 January 2025 / Accepted: 31 January 2025 / Published: 4 February 2025
(This article belongs to the Section Computing and Artificial Intelligence)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

In this paper, the authors develop a time-series model to predict the Bitcoin price, combining multivariate regression models with a long short-term memory (LSTM) network.

 

The study is interesting, given the considerable attention Bitcoin is receiving these days. However, the authors need to address the following questions and concerns.

 

1. Define LSTM, including an example of its usage.

 

2. Figure 8 contains data only up to 2019, and, as the authors correctly indicate, Bitcoin's volatility is such that its value has gone through tremendous fluctuations over the last 3-5 years. The authors should include recent data or, failing that, explain the reasons for not including it; otherwise, the study may be meaningless.

Author Response

Comment 1: Define LSTM, including an example of its usage.

Response 1: Thank you. This has been added (second paragraph of the Introduction chapter). We have also updated the first five paragraphs of the Related Work to provide more detail and examples.
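For context on what an LSTM computes, a single cell step can be sketched in plain NumPy. This is an illustrative sketch with made-up random weights, not the paper's implementation: the gates decide what the cell state forgets, stores, and exposes.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step for input x given the previous hidden
    and cell states; W, U, b stack the four gate parameter sets."""
    z = W @ x + U @ h_prev + b      # stacked pre-activations, shape (4*hidden,)
    n = h_prev.shape[0]
    f = sigmoid(z[0*n:1*n])         # forget gate: what to drop from c_prev
    i = sigmoid(z[1*n:2*n])         # input gate: what new information to store
    o = sigmoid(z[2*n:3*n])         # output gate: what to expose as h
    g = np.tanh(z[3*n:4*n])         # candidate cell update
    c = f * c_prev + i * g          # new cell state
    h = o * np.tanh(c)              # new hidden state
    return h, c

# Usage: input size 3, hidden size 4, run over a short random sequence.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.normal(0, 0.1, (4 * n_hid, n_in))
U = rng.normal(0, 0.1, (4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h = np.zeros(n_hid)
c = np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):    # 5 time steps
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)  # (4,)
```

Because the cell state is updated additively rather than by repeated multiplication, gradients survive over long sequences, which is why LSTMs suit long-horizon price series.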

Comment 2: Figure 8 contains data only up to 2019, and, as the authors correctly indicate, Bitcoin's volatility is such that its value has gone through tremendous fluctuations over the last 3-5 years. The authors should include recent data or, failing that, explain the reasons for not including it; otherwise, the study may be meaningless.

Response 2: 

We base our work on Zuvela et al. (2022) and aim to extend their results; hence, we have used the same dataset as that paper. This means the models were trained on historical data covering the period from August 1, 2017, to January 21, 2019, and no attempt was made to predict the future price because of this dataset limitation.

We will, however, attempt to address this in future work. We would like to use the X API (formerly the Twitter API) so that we can test our model's performance with real-time data, which has not yet been evaluated. Additionally, processing tweet data requires considerable computational power and time, which were not available to the researchers at the time of writing.

Incidentally, this has now been added to the Conclusions and Future Work chapter of the new draft (see the second-to-last paragraph).

Reviewer 2 Report

Comments and Suggestions for Authors

This manuscript develops a multivariate Long Short-Term Memory model incorporating an emotional index to predict Bitcoin prices. However, there is a plethora of existing literature that considers emotional indicators in Bitcoin price forecasting.

The article in question does not sufficiently demonstrate its unique strengths and innovative aspects. It is recommended that the manuscript includes a comparative analysis of predictive accuracy with previously established models. Furthermore, in the third chapter, when evaluating the model's predictive performance, the manuscript solely employs the MSE, which is insufficient to ascertain the robustness of the model.

The predictive accuracy of LSTM models is susceptible to the manner in which data is partitioned. The current partitioning strategy in this study is divided into training (60%), validation (20%), and test (20%) subsets. It is suggested that altering the data partition ratios could enhance the robustness of the study.

Finally, the references lack the most recent literature.

Comments for author File: Comments.pdf

Author Response

Comment 1: This manuscript develops a multivariate Long Short-Term Memory model incorporating an emotional index to predict Bitcoin prices. However, there is a plethora of existing literature that considers emotional indicators in Bitcoin price forecasting.

Response 1: Thank you for all your comments. We have now introduced two subsections within the Literature Review section. One of them focuses on similar works that use sentiment analysis for their forecasting.

Comment 2: The article in question does not sufficiently demonstrate its unique strengths and innovative aspects. It is recommended that the manuscript includes a comparative analysis of predictive accuracy with previously established models.

Response 2: This has now been addressed in the final subsection within the Related Work section.

Comment 3: Furthermore, in the third chapter, when evaluating the model's predictive performance, the manuscript solely employs the MSE, which is insufficient to ascertain the robustness of the model.

Response 3: We now report MAE and MAPE for all of our models during training (in the Model Training section), alongside figures for each of these metrics.
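The three metrics discussed here are standard and easy to state precisely. A minimal sketch (toy numbers, not the paper's results):

```python
import numpy as np

def mse(y, yhat):
    """Mean squared error: penalizes large errors quadratically."""
    return float(np.mean((y - yhat) ** 2))

def mae(y, yhat):
    """Mean absolute error: average error in price units."""
    return float(np.mean(np.abs(y - yhat)))

def mape(y, yhat):
    """Mean absolute percentage error: scale-free, in percent.
    Undefined when any true value is zero."""
    return float(np.mean(np.abs((y - yhat) / y)) * 100)

y    = np.array([100.0, 200.0, 400.0])   # toy "true" prices
yhat = np.array([110.0, 190.0, 380.0])   # toy predictions
print(mse(y, yhat))   # 200.0
print(mae(y, yhat))   # ~13.33
print(mape(y, yhat))  # ~6.67 (%)
```

Reporting all three is informative for a volatile series: MSE is dominated by the largest misses, while MAPE normalizes away the price level.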

Comment 4: The predictive accuracy of LSTM models is susceptible to the manner in which data is partitioned. The current partitioning strategy in this study is divided into training (60%), validation (20%), and test (20%) subsets. It is suggested that altering the data partition ratios could enhance the robustness of the study.

Response 4: We have addressed this concern by testing our models with a new data partitioning strategy (70%-15%-15%) and have included an in-depth analysis in the Discussion section (and its Limitations subsection). This includes figures and observations highlighting performance under the different splits, enhancing the robustness of the study.
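For time-series data, both partitioning strategies must split chronologically rather than randomly, so no future observation leaks into training. A minimal sketch of such a split (the variable names are illustrative, not from the paper):

```python
def chronological_split(series, train=0.70, val=0.15):
    """Split an ordered series into train/val/test without shuffling,
    preserving temporal order so the test set is strictly 'the future'."""
    n = len(series)
    i = round(n * train)
    j = round(n * (train + val))
    return series[:i], series[i:j], series[j:]

prices = list(range(100))            # stand-in for the ordered price series
tr, va, te = chronological_split(prices)
print(len(tr), len(va), len(te))     # 70 15 15
```

Re-running the same models under a second ratio such as 60/20/20 and comparing the resulting test metrics is exactly the robustness check the reviewer describes.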

Comment 5: Finally, the references lack the most recent literature.

Response 5: We have reviewed the literature and added more recent, related sources.

Reviewer 3 Report

Comments and Suggestions for Authors

I recommend the following aspects to strengthen the quality of the article.

In the methodology section, while you discuss model architectures extensively, there is a lack of focus on why each hyperparameter or structure was chosen. Add rationales for layer counts, neuron numbers, and dropout rates to strengthen the methodology.

Regarding the proposed model, the Hybrid LSTM-Volume models show lower accuracy compared to sentiment-based models. I believe it is necessary to discuss whether additional features, such as news sentiment or economic indices, could be integrated to enhance the hybrid models.

In the results section, the graphs (Figures 7–10) effectively show trends but do not include confidence intervals. Adding confidence intervals or error bounds would help illustrate the model's reliability during volatile periods.

Author Response

Comment 1: I recommend the following aspects to strengthen the quality of the article. In the methodology section, while you discuss model architectures extensively, there is a lack of focus on why each hyperparameter or structure was chosen. Add rationales for layer counts, neuron numbers, and dropout rates to strengthen the methodology.

Response 1: Thanks for your recommendations. This has now been added to the Methodology chapter.

Comment 2: Regarding the proposed model, the Hybrid LSTM-Volume models show lower accuracy compared to sentiment-based models. I believe it is necessary to discuss whether additional features, such as news sentiment or economic indices, could be integrated to enhance the hybrid models. 

Response 2: This has now been added to our Future Work. We also plan to purchase access to the X API so that we can experiment with live data in a future paper.

Comment 3: In the results section, the graphs (Figures 7–10) effectively show trends but do not include confidence intervals. Adding confidence intervals or error bounds would help illustrate the model's reliability during volatile periods.

Response 3: Confidence intervals have now been added to all relevant graphs in the Results section to visually illustrate the model's reliability, particularly during volatile periods.
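One common way to add such bands, when the model itself does not output uncertainty, is to widen the point forecast by a multiple of the residual standard deviation. The sketch below is one such illustrative approach with toy numbers (it assumes roughly Gaussian, homoscedastic errors), not necessarily the method used in the paper:

```python
import numpy as np

def residual_band(y_true, y_pred, z=1.96):
    """Approximate pointwise 95% band: point forecast plus/minus
    z times the standard deviation of the residuals."""
    sd = np.std(y_true - y_pred)
    return y_pred - z * sd, y_pred + z * sd

y_true = np.array([10.0, 12.0, 11.0, 13.0])   # toy observed prices
y_pred = np.array([10.5, 11.5, 11.5, 12.5])   # toy model predictions
lo, hi = residual_band(y_true, y_pred)
print(np.all(lo < y_pred) and np.all(y_pred < hi))  # True
```

During volatile periods the Gaussian, constant-variance assumption is the weakest link, which is precisely why plotting the band against the realized price is informative.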

Reviewer 4 Report

Comments and Suggestions for Authors

The paper's work is very interesting, but the paper is not well organized.

1. Section 3.3 should describe the methodology in more detail instead of reporting the results of the training.

2. There are a particularly large number of models for sequence prediction, which are neglected in the related work section of the paper.

3. The experimental part lacks a convincing comparison with the SOTA sequence models.

4. The experiments are not sufficient. It is recommended to add more discussion experiments, including ablation experiments, to further demonstrate the model's robustness.

5. The methodology involves many steps and a variety of data. It is recommended that the data be shared to improve the reproducibility of the method.

 

Author Response

Comment 1: Section 3.3 should describe the methodology in more detail instead of reporting the results of the training.

Response 1: Thank you for your recommendations. This was also suggested by another reviewer and has now been added; you can also find the justification and rationale behind our hyperparameter values.

Comment 2: There are a particularly large number of models for sequence prediction, which are neglected in the related work section of the paper.

Response 2: Again, this was pointed out by another reviewer. We have added a section to the Related Work chapter that discusses similar models in the literature in more detail.

Comment 3: The experimental part lacks a convincing comparison with the SOTA sequence models.

Response 3: We do agree that incorporating a comparison with state-of-the-art models such as Transformers and GRUs would strengthen the paper. However, due to time constraints, we were unable to implement these comparisons for the current submission. We have updated the Conclusions and Future Work section to explicitly include plans for benchmarking our models against SOTA sequence models in our next submission.

Comment 4: The experiments are not sufficient. It is recommended to add more discussion experiments, including ablation experiments, to further demonstrate the model's robustness.

Response 4: We have received several similar comments from other reviewers. We chose to expand the experiments by testing other partition sizes. In a future paper, we would also like to incorporate additional features and data into the hybrid model and to conduct an ablation experiment. The Conclusions and Future Work chapter has been updated to reflect this.

Comment 5: The methodology involves many steps and a variety of data. It is recommended that the data be shared to improve the reproducibility of the method.

Response 5: A link has been provided at the end of the paper (next to the text "Data Availability Statement").

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The revision is satisfactory. 

Author Response

Comment 1: The revision is satisfactory. 

Response 1: Thank you.

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have made satisfactory revisions in response to the previously raised concerns. The revised manuscript can be considered for acceptance.

Author Response

Comment 1: The authors have made satisfactory revisions in response to the previously raised concerns. The revised manuscript can be considered for acceptance.

Response 1: Thank you for your help.

Reviewer 4 Report

Comments and Suggestions for Authors

Although the authors have made many changes, the two core issues have not been resolved:

1. The experimental part lacks a convincing comparison with the SOTA sequence models.

2. The experiments are not sufficient. It is recommended to add more discussion experiments, including ablation experiments, to further demonstrate the model's robustness.

Author Response

Comment 2: The experiments are not sufficient. It is recommended to add more discussion experiments, including ablation experiments, to further demonstrate the model's robustness.

Response 2: Thank you for your valuable feedback. As requested, an ablation study subsection has been added to the Discussion section. This study evaluates the robustness of our best model under architectural modifications and reduced features, providing insights into its performance and limitations. Please also note that we have another subsection in the same section (added at the request of another reviewer) that further evaluates the robustness of our models.
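The logic of such an ablation (fit the full model, drop one feature, compare error) can be shown with a toy stand-in. The sketch below uses a linear least-squares model on synthetic data purely to illustrate the procedure; the feature names and numbers are invented, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
price_lag = rng.normal(size=n)    # stand-in feature: lagged price
sentiment = rng.normal(size=n)    # stand-in feature: sentiment score
# Synthetic target where both features genuinely carry signal.
y = 2.0 * price_lag + 0.5 * sentiment + rng.normal(0, 0.1, n)

def fit_mse(X, y):
    """Least-squares fit on a feature subset; return in-sample MSE."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.mean((X @ w - y) ** 2))

full    = fit_mse(np.column_stack([price_lag, sentiment]), y)
ablated = fit_mse(price_lag[:, None], y)   # ablation: drop sentiment
print(ablated > full)  # removing an informative feature hurts the fit: True
```

The same loop over feature subsets (or over removed layers) applied to the actual LSTM models yields the per-configuration metrics an ablation table reports.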

Comment 1: The experimental part lacks a convincing comparison with the SOTA sequence models.

Response 1: Thank you again for this comment. We acknowledge the importance of comparing with state-of-the-art models; however, we believe that adding such comparisons would shift the focus away from the primary aim of this paper, which is to systematically build upon the work of Zuvela et al. (2022). This aim has been clearly stated in the revised abstract and introduction. We have, however, included a discussion of SOTA models in the future work section and plan to address this aspect in a subsequent publication.

Round 3

Reviewer 4 Report

Comments and Suggestions for Authors

The revised manuscript has addressed most of my concerns.
