A Prediction Method with Data Leakage Suppression for Time Series
Round 1
Reviewer 1 Report
electronics-1995630
A prediction method with data leakage suppression for time series
1). The paper is interesting but poorly written and offers little evidence. The application section is limited to three daily stock price series from Chinese stock exchanges.
2). For instance, the Introduction section does not convince the reader, and I suggest that the authors rewrite it to focus on what they propose in their paper.
3). References and previous literature should be updated. For instance, what is the relevance of papers [1] and [2] to the paper? There are thousands of related papers on time series forecasting in finance, macroeconomics, and all other disciplines. Why do those two in particular deserve to be mentioned? The same applies to reference [3].
4). Introduction, lines 31-34. “Most of the early time series …”. I do not agree with this statement. It needs to be supported by more references and discussion. Reference [5] alone is not enough and is misleading. What about artificial neural networks? Why [8, 9]? ANNs are a very old story in almost all fields.
5). Why these particular three stock price time series? Why only from China? How can the authors draw inferences? Please state how the limitations of your dataset and experiments affect the validity of your proposed technique. It would be more interesting if the authors used high-frequency data.
6). Section 3, line 339, “At the same time, the …”. This is a strong claim given my comment 4). The same applies to line 342: “Thus, the proposed prediction method has the best performance on three dataset”.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
The authors propose a new method, data leakage suppression (DLS), for noise reduction and data prediction in time series. The method is based on variational mode decomposition and noise reduction on the high-frequency part of overlapping slices of data. The smeared data are then used to train an LSTM neural network, which is used to predict future data. The method is tested on synthetic data as well as stock market data from three different indexes.
The original motivation, data leakage, is not very convincing: one can denoise the training and test data sets separately, thus avoiding cross influence between the two data series. Nevertheless, the DLS method can be a good denoising and data prediction method.
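The point about separate denoising can be illustrated with a minimal sketch. It is not the authors' DLS method: the centered moving average below is only an assumed stand-in for the VMD-based denoiser, and the synthetic series, the window length, and the split index are all hypothetical choices made for illustration.

```python
import numpy as np

def moving_average(x, window=5):
    """Centered moving-average smoother (stand-in for a real denoiser).

    Assumes an odd window so the output has the same length as the input.
    """
    kernel = np.ones(window) / window
    padded = np.pad(x, window // 2, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

# Hypothetical noisy series: a sine wave plus Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(200)
series = np.sin(0.1 * t) + 0.3 * rng.standard_normal(t.size)

split = 160  # hypothetical train/test boundary
train, test = series[:split], series[split:]

# Leaky variant: denoise the whole series first, then split.
# The last training points are smoothed using test values.
leaky_train = moving_average(series)[:split]

# Leak-free variant: denoise each partition on its own.
clean_train = moving_average(train)
clean_test = moving_average(test)

# Only the training points near the boundary differ between the two variants.
changed = np.flatnonzero(~np.isclose(leaky_train, clean_train))
print("training indices affected by leakage:", changed)
```

With a centered window, the smoothed test values still depend on later test points, so a forecasting setup would additionally need a causal filter; the sketch only shows that separate denoising removes the cross influence between the two partitions.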
Unfortunately, the presentation is very poor, and it makes the method description practically intractable. Eqs. (2)-(3) are not explained appropriately (what is the parameter w? what is the VMD function? what is the meaning of a matrix in the superscript?). In Eq. (6) a kanji character remains. Eq. (9) seems to be very ad hoc; sign is not the symbol function, and no median function appears in the formula. In line 177 one finds “step 4” referring to some unknown list.
The validation of the method is also unsatisfactory. The denoising of the mock data coming from the sum of two sine functions is fine as a proof of concept. But the analysis of the stock market predictions is not systematic; it demonstrates only three examples on five arbitrary days. The statement on the quality of the DLS method must be based on statistical analysis, and not on singled-out data.
Summarizing: I may accept that the method has some novelty as compared to other methods. But the motivation is not adequate, the presentation is not appropriate, and the validation is not thorough enough. I do not recommend accepting the paper in its present form.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
Referee Report on “A prediction method with data leakage suppression for time series” by Fang Liu et al.
The paper addresses data leakage in time series training/validation/testing partitions, when the same data are used in all processes. In my opinion, data leakage should certainly be avoided, but such avoidance may or may not lead to better quality of actual predictions.
The Authors should concentrate on proving that suppressing leakage can lead to systematic improvement of actual predictions. To this end, it is highly desirable to apply their Data Leakage Suppression (DLS) to a much broader group of time series, including hundreds (or more) of financial time series.
Such series can be found (and borrowed) in the recent competitions organized by Makridakis et al.
See:
1. International Journal of Forecasting, Volume 38, Issue 4, October–December 2022, Pages 1346-1362.
2. International Journal of Forecasting, Volume 36, Issue 1, January–March 2020, Pages 54-74.
The data could be adopted for testing. Say, the last points of each series are treated as unknown and then “predicted”.
The Authors can compare their DLS with other techniques from their paper using the same measures as in the Makridakis competitions.
Such an analysis could be beneficial for the general public interested in and involved with time series prediction. With such an analysis, the paper can be published.
Comments for author File: Comments.pdf
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Thank you for revising.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
I think the authors have improved the manuscript considerably, and in accordance with my requests. I think the current version is suitable for publication in Electronics.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
The paper was improved. It can be published after minor corrections.
1. In the testing part, please add a percentage (or scaled) error measure such as MAPE, SMAPE, or MASE.
2. Please check that all datasets are properly referenced. Readers would appreciate the information.
3. In the Conclusion, please speculate on why data leakage can lead to worsened accuracy.
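For reference, the error measures named in item 1 can be sketched as follows. This is a minimal illustration, not code from the manuscript: the function names and the sample numbers are hypothetical, the SMAPE shown is one of several variants in use, and the MASE scaling assumes the in-sample one-step naive forecast with seasonal period m.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (undefined if y_true has zeros)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * np.mean(np.abs(y_true - y_pred) / np.abs(y_true))

def smape(y_true, y_pred):
    """Symmetric MAPE, in percent, using the 2|e| / (|y| + |y_hat|) variant."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    denom = np.abs(y_true) + np.abs(y_pred)
    return 100.0 * np.mean(2.0 * np.abs(y_true - y_pred) / denom)

def mase(y_true, y_pred, y_train, m=1):
    """Mean absolute scaled error: test MAE divided by the in-sample MAE
    of the naive forecast that repeats the value from m steps earlier."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    y_train = np.asarray(y_train, float)
    naive_mae = np.mean(np.abs(y_train[m:] - y_train[:-m]))
    return np.mean(np.abs(y_true - y_pred)) / naive_mae

# Made-up numbers, for illustration only.
y_train = [100.0, 150.0, 200.0, 250.0]
y_true = [100.0, 200.0, 300.0]
y_pred = [110.0, 190.0, 330.0]

print(f"MAPE  = {mape(y_true, y_pred):.2f}%")
print(f"SMAPE = {smape(y_true, y_pred):.2f}%")
print(f"MASE  = {mase(y_true, y_pred, y_train):.3f}")
```

A MASE below 1 means the forecast beats the in-sample naive forecast on average, which makes it comparable across series with different scales.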
Comments for author File: Comments.pdf
Author Response
Please see the attachment.
Author Response File: Author Response.pdf