Article
Peer-Review Record

Conditional Temporal Aggregation for Time Series Forecasting Using Feature-Based Meta-Learning

Algorithms 2023, 16(4), 206; https://doi.org/10.3390/a16040206
by Anastasios Kaltsounis, Evangelos Spiliotis * and Vassilios Assimakopoulos
Reviewer 1:
Reviewer 2:
Submission received: 23 January 2023 / Revised: 4 April 2023 / Accepted: 11 April 2023 / Published: 12 April 2023
(This article belongs to the Special Issue Algorithms and Optimization Models for Forecasting and Prediction)

Round 1

Reviewer 1 Report

This paper proposes a machine learning method to aid with multiple aggregation of time series forecasting. The authors proposed a method to derive relevant features of a time series to be able to train a classifier that picks (or weighs) the best "base prediction" for maximum accuracy.

Overall, the method seems sound and the results support that. However, there are some issues with the paper and the study itself that need to be addressed.

I understand if the concept of multiple levels of aggregation is an established concept in the forecasting community, but the readers of Algorithms will need a fundamental introduction. Specifically, the authors need a new section to cover all the concepts necessary to make the paper self-contained for this audience.

Moreover, the task of using multiple predictions to obtain a final prediction is called "ensemble," and is a well-known concept in machine learning. The authors must directly address this similarity between ensemble and aggregation and clarify the differences.

The paper does not properly introduce the work. The intro section is dominated by the review of relevant literature. While a summary of the literature review is always needed in the introduction, the intro section should be more than that. The authors need to rewrite the intro to cover the following missing content:

- Motivation: Why do we need time series forecasting? What are some applications of time series forecasting that make it important? Why should we, or when should we, pursue aggregation in time series forecasting? What makes aggregation better than other approaches?

- Challenges: What makes aggregation so difficult that we need all these different methods? And what makes the proposed method difficult that we need the proposed algorithm?

- Solution: What is the solution to address these challenges? This should be specified in addition to the contribution list, to summarize how the method (or each part of it) addresses a specific challenge.

- Results: A summary of results should be mentioned in the text of the introduction as evidence of the solution's validity.

It is good that the authors included an itemized list of contributions in the intro section. However, only the first item seems to be an actual contribution and the other two bullet points are just descriptions of it. The last two points should be moved to the text and the authors should break down the first point to correctly match their contributions. For instance, the contribution list could include items such as a learner algorithm and a predictor algorithm (corresponding to the offline and online phases).

In addition to contributions, the authors need to be explicit about their novelty. It appears that there are a lot of methods that are used from the literature. In the intro section, the authors need to have a summary of the literature review only to the extent that allows them to specify the gap they are filling. Currently, the majority of the intro section is a review of the literature and the only novelty explicitly mentioned is that no one has used classification in temporal aggregation. But that is not enough. Just because no one has done it, does not mean that it should be done. The authors need to clarify the novelty and offer a reason/benefit for it.

The author chose to use only one classification method. The reasons offered for this choice in line 94 are not convincing. If there are no more convincing reasons, the authors should try other methods as well, report the results, and compare them to LightGBM.

Author Response

Reviewer #1

This paper proposes a machine learning method to aid with multiple aggregation of time series forecasting. The authors proposed a method to derive relevant features of a time series to be able to train a classifier that picks (or weighs) the best "base prediction" for maximum accuracy.

Overall, the method seems sound and the results support that. However, there are some issues with the paper and the study itself that need to be addressed.

Thank you for your overall positive feedback as well as for your suggestions, which we have addressed as shown below.

I understand if the concept of multiple levels of aggregation is an established concept in the forecasting community, but the readers of Algorithms will need a fundamental introduction. Specifically, the authors need a new section to cover all the concepts necessary to make the paper self-contained for this audience.

Multiple temporal aggregation is indeed a well-established concept in the field of time series forecasting. Nevertheless, we appreciate that the topic is relatively niche and that some of the readers may not be familiar with it. Therefore, at the beginning of our paper we first establish the concept of temporal aggregation (explaining its logic and potential benefits) and then proceed by introducing the concept of multiple temporal aggregation (inspired by the field of forecast combination). This is effectively done in paragraphs 2, 3 and 4 of the introduction section, where we have also added a new figure to visualize the effect of temporal aggregation using a fast-moving and a slow-moving quarterly series as examples. Note also that section 2 offers a rich overview of the work done in the field, including introductory concepts and definitions, as well as the mathematical framework currently used by the forecasting community for applying (multiple) temporal aggregation in a structured way. We believe that this information, combined with the references listed, should be sufficient for the readers to follow the rest of the paper.

Moreover, the task of using multiple predictions to obtain a final prediction is called "ensemble," and is a well-known concept in machine learning. The authors must directly address this similarity between ensemble and aggregation and clarify the differences.

Thank you for your comment. Indeed, there is a close connection between standard ensemble strategies and temporal aggregation. Effectively, the main difference between the two approaches is that in the former case, different models are trained on the same series (sampled on the original frequency) in order for the produced forecasts to be combined, while in the latter case a single model is used to produce forecasts at different data frequencies, which are then being combined using proper weights. That said, we now start our paper by discussing the value of combining in general (1st paragraph) and then introduce multiple temporal aggregation as a special case of ensembling. We also state in paragraph 5 that “Multiple temporal aggregation, which is effectively a forecast combination subject to linear constraints [17], has several advantages over focusing on a single aggregation level [18]. Similar to standard ensembling strategies, widely used in the literature to blend forecasts produced by different models or versions of them, it avoids selecting a single “best” aggregation level, which is challenging to do in practice [19], and mitigates model and parameter uncertainty by exploiting the merits of combining [for a non-systematic review on temporal aggregation, please refer to 6]. Effectively, the main difference between standard forecast combination and multiple temporal aggregation is that in the former case the forecasts to be ensembled are reported at the original frequency of the series, while in the latter they are reported at various data frequencies, thus exploiting the potential benefits of temporal aggregation”. We believe that these statements establish the connection discussed and clarify things.
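For readers unfamiliar with the mechanics of the distinction drawn above, the combination step can be sketched as follows. This is a minimal illustration only: the toy monthly series, the naive base forecaster, and the equal weights are hypothetical placeholders (the paper uses ETS base forecasts and learned weights).

```python
import numpy as np

def aggregate(series, window):
    """Non-overlapping temporal aggregation: sum consecutive blocks of `window` points."""
    n = (len(series) // window) * window          # drop leftover observations
    return series[:n].reshape(-1, window).sum(axis=1)

def naive_forecast(series):
    """Placeholder base forecaster: repeat the last observation."""
    return series[-1]

def mta_forecast(series, windows, weights):
    """Forecast at several aggregation levels with a single base model, express
    each forecast at the original frequency, and blend with combination weights."""
    per_level = []
    for w in windows:
        agg = aggregate(series, w)
        # Disaggregate: spread the aggregate-level forecast evenly across periods.
        per_level.append(naive_forecast(agg) / w)
    return float(np.dot(weights, per_level))

monthly = np.arange(1.0, 25.0)                    # 24 months of toy data
# Levels 1 (monthly), 3 (quarterly), 12 (yearly) with equal weights, MTA-EW style.
f = mta_forecast(monthly, windows=[1, 3, 12], weights=[1/3, 1/3, 1/3])
print(round(f, 2))
```

In contrast, a standard ensemble would fit several different models to `monthly` at its original frequency and average their forecasts directly, with no aggregation step.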

The paper does not properly introduce the work. The intro section is dominated by the review of relevant literature. While a summary of the literature review is always needed in the introduction, the intro section should be more than that. The authors need to rewrite the intro to cover the following missing content:

We have restructured the introduction to address your comments. Below we list the changes made in more detail. 

- Motivation: Why do we need time series forecasting? What are some applications of time series forecasting that make it important? Why should we, or when should we, pursue aggregation in time series forecasting? What makes aggregation better than other approaches?

We have rewritten the first paragraph of our paper to motivate the use of time series forecasting and the utilization of forecast combinations (a form of which is multiple temporal aggregation). As explained, temporal aggregation can help us identify and analyze patterns that are difficult to observe at the original frequency of the data, while at the same time mitigating model and parameter uncertainty.

- Challenges: What makes aggregation so difficult that we need all these different methods? And what makes the proposed method difficult that we need the proposed algorithm?

Following your suggestions, paragraph 6 is now completely dedicated to the challenges present when it comes to implementing multiple temporal aggregation. These can be summarized as follows: (a) multiple temporal aggregation can lead to unreasonable shrinkage of seasonality, (b) forecasts produced at different levels may need to be combined using unequal weights, (c) combination weights should be based on out-of-sample errors to better simulate forecast uncertainty, and (d) nonlinear combinations of forecasts should be considered to allow for more flexibility.

- Solution: What is the solution to address these challenges? This should be specified in addition to the contribution list, to summarize how the method (or each part of it) addresses a specific challenge.

The proposed solution to the challenges listed is the creation of a meta-learner that can properly determine the temporal level(s) for which the forecasts should be generated. Effectively, this is also the major contribution of our study. Following your suggestion, we now clarify the value added by the proposed solution in more detail.

- Results: A summary of results should be mentioned in the text of the introduction as evidence of the solution's validity.

We have extended the paragraph of the introduction that listed some of our results to provide more information on our findings, as suggested.

It is good that the authors included an itemized list of contributions in the intro section. However, only the first item seems to be an actual contribution and the other two bullet points are just descriptions of it. The last two points should be moved to the text and the authors should break down the first point to correctly match their contributions. For instance, the contribution list could include items such as a learner algorithm and a predictor algorithm (corresponding to the offline and online phases).

We have adjusted the contributions provided as bullet points per your suggestions.

In addition to contributions, the authors need to be explicit about their novelty. It appears that there are a lot of methods that are used from the literature. In the intro section, the authors need to have a summary of the literature review only to the extent that allows them to specify the gap they are filling. Currently, the majority of the intro section is a review of the literature and the only novelty explicitly mentioned is that no one has used classification in temporal aggregation. But that is not enough. Just because no one has done it, does not mean that it should be done. The authors need to clarify the novelty and offer a reason/benefit for it.

Thank you for your comment. The literature review included in the introduction section is currently limited to a single paragraph and mostly used to discuss the challenges present in the field. Based on the challenges mentioned and the contributions of the present paper (currently highlighted following your recommendations), the novelty of our work can be sufficiently justified. The discussions made establish the value of temporal aggregation and justify the efforts made towards the introduction of more effective multiple temporal aggregation forecasting methods.

The author chose to use only one classification method. The reasons offered for this choice in line 94 are not convincing. If there are no more convincing reasons, the authors should try other methods as well, report the results, and compare them to LightGBM.

We appreciate your comment, but we have found that gradient boosting is indeed the best algorithm to use for this task. NNs may sound promising, but they require conducting (several) thousands of forecast simulations (used to measure accuracy per case and create the train set of the meta-learner) to be effectively trained. This is impossible to do in practice, especially when the series available are relatively short, as is the case with the M4 series (and many of the M5 series). We have tested logistic regression with no success and also experimented with GBM, which provided similar results to LightGBM, although slightly worse. Nevertheless, given that the lack of sufficient data is what constrains the performance of NNs in practice, we currently discuss this option in the conclusions section to motivate future work. Moreover, in the introduction we now clarify that LightGBM has been successfully used for designing meta-learners in the field of forecast selection and combination, thus motivating its use in the present study as well.
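As a rough illustration of the offline phase referred to above, the sketch below builds a meta-learning train set in which each series is labeled with the aggregation level that minimizes a holdout forecast error. The single feature (coefficient of variation), the naive forecaster, and the toy series are hypothetical stand-ins for the richer feature set, the ETS base forecasts, and the LightGBM classifier used in the paper.

```python
import numpy as np

def aggregate(series, window):
    """Non-overlapping temporal aggregation by summing blocks of `window` points."""
    n = (len(series) // window) * window
    return series[:n].reshape(-1, window).sum(axis=1)

def level_error(series, window):
    """Hold out the last aggregated block, forecast it naively at the given
    level, and return the absolute error at the original frequency."""
    agg = aggregate(series, window)
    forecast = agg[-2] / window               # naive forecast of the held-out block
    actual = agg[-1] / window
    return abs(forecast - actual)

def build_meta_train_set(collection, windows):
    """Offline phase: one (feature, best-level) pair per series.  The feature
    vector and labels would then be fed to a classifier such as LightGBM."""
    X, y = [], []
    for series in collection:
        X.append(np.std(series) / np.mean(series))   # coefficient of variation
        errors = [level_error(series, w) for w in windows]
        y.append(int(np.argmin(errors)))             # label = best aggregation level
    return np.array(X), np.array(y)

rng = np.random.default_rng(0)
smooth = [np.linspace(10, 20, 24) + rng.normal(0, 0.1, 24) for _ in range(5)]
noisy = [10 + rng.normal(0, 5, 24) for _ in range(5)]
X, y = build_meta_train_set(smooth + noisy, windows=[1, 3, 12])
print(X.shape, y.shape)
```

In the online phase, the fitted classifier would map the features of an unseen series to the aggregation level(s) at which its base forecasts should be produced.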

Reviewer 2 Report

The paper proposes a new approach to time series forecasting using multiple temporal resolutions. The paper is well structured, and the description of the proposed approach is detailed and clear. The empirical results are also provided in a proper way.

My major concern is related to the comparison of the proposed approach with the existing ones. The Authors compared the obtained forecasting accuracy of the proposed method with three benchmarks (CON, MTA-EW, and MTA-STR), and found a very small difference in MASE and RMSSE. Probably, it would be better to compare the obtained accuracy with the benchmark methods, provided by M4/M5 competition, and report the improvement (similar to https://www.sciencedirect.com/science/article/pii/S0169207021001874). It will allow comparing the results not only with the selected alternatives but with other forecasting methods.

Also, I suggest discussing the application of the proposed methodology for other temporal aggregations: e.g., high-frequency time series with temporal resolutions of several seconds or minutes.

 

Author Response

The paper proposes a new approach to time series forecasting using multiple temporal resolutions. The paper is well structured, and the description of the proposed approach is detailed and clear. The empirical results are also provided in a proper way.

Thank you for your overall positive feedback.

My major concern is related to the comparison of the proposed approach with the existing ones. The Authors compared the obtained forecasting accuracy of the proposed method with three benchmarks (CON, MTA-EW, and MTA-STR), and found a very small difference in MASE and RMSSE. Probably, it would be better to compare the obtained accuracy with the benchmark methods, provided by M4/M5 competition, and report the improvement (similar to https://www.sciencedirect.com/science/article/pii/S0169207021001874). It will allow comparing the results not only with the selected alternatives but with other forecasting methods.

Thank you for your suggestion. Since the proposed method examines the effect of selecting or combining base forecasts produced at different aggregation levels, the benchmarks initially used facilitated relevant comparisons. For instance, since ETS is used for producing the base forecasts in all setups, CON (conventional forecasting or, simply put, running ETS at the original frequency of the data) served as a representative benchmark in the sense that if the proposed method cannot outperform this approach, then there is no point in applying (multiple) temporal aggregation in the first place. Similarly, MTA-EW and MTA-STR are essential benchmarks since they report the performance of an equally weighted combination of forecasts, produced at different aggregation levels, and a state-of-the-art approach for combining such forecasts, respectively. Nevertheless, since the proposed approach should also be able to outperform standard time series forecasting methods, in the revised version of the paper we consider two additional benchmarks per data set. In the M4, we used Theta and Comb (equally weighted combination of simple, Holt, and damped exponential smoothing) since these methods were among the top performing benchmarks considered by the organizers of the competition (ETS was another top performing benchmark but this is already implemented and reported as CON). In the M5, we used Com_b and SBA for similar reasons. Our results indicate that our approach works better than the additional benchmarks set, demonstrating the potential of the proposed method.
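For reference, the two accuracy measures used in the comparisons above (MASE and RMSSE) are scale-independent: each scales the out-of-sample error by the in-sample error of the seasonal naive method. A minimal sketch of their standard definitions follows; the train/actual/forecast values are hypothetical toy numbers, not results from the paper.

```python
import numpy as np

def mase(train, actual, forecast, m=1):
    """Mean Absolute Scaled Error: out-of-sample MAE scaled by the in-sample
    MAE of the seasonal naive method (season length m)."""
    scale = np.mean(np.abs(train[m:] - train[:-m]))
    return np.mean(np.abs(actual - forecast)) / scale

def rmsse(train, actual, forecast, m=1):
    """Root Mean Squared Scaled Error, the squared-error analogue of MASE
    used as the accuracy measure of the M5 competition."""
    scale = np.mean((train[m:] - train[:-m]) ** 2)
    return np.sqrt(np.mean((actual - forecast) ** 2) / scale)

train = np.array([10.0, 12.0, 11.0, 13.0, 12.0, 14.0])
actual = np.array([13.0, 15.0])
forecast = np.array([14.0, 14.0])             # hypothetical forecasts
print(round(mase(train, actual, forecast), 3),
      round(rmsse(train, actual, forecast), 3))
```

Values below 1 indicate that the forecasts beat the in-sample naive benchmark on average, which makes even small differences between methods meaningful when averaged over many series.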

Also, I suggest discussing the application of the proposed methodology for other temporal aggregations: e.g., high-frequency time series with temporal resolutions of several seconds or minutes.

Thank you for your comment. Indeed, temporal aggregation hasn’t been widely applied to high-frequency data, although it could help filter out the noise present at hourly/minute intervals, while at the same time preserving the multiple seasonal patterns typically found in said applications. Following your suggestion, we now close the conclusion section as follows:

“Moreover, since the potential benefits of (multiple) temporal aggregation seem to magnify for time series of relatively higher frequencies, the proposed meta-learner could be tested for data that are sampled on an hourly or even minute basis. To the best of our knowledge, the work done in the field of temporal aggregation has insufficiently covered said applications, despite the fact that methods like the one proposed in the present paper could improve forecasting accuracy through a conditional aggregation of time series patterns observed at different temporal aggregation levels.”

Round 2

Reviewer 1 Report

The authors have addressed my comments thoroughly. The quality of the paper has improved substantially, and I am OK with publication.
