Generalisation Bounds of Zero-Shot Economic Forecasting Using Time Series Foundation Models
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The paper is generally well-written, yet the novelty is limited due to the focus on testing existing models rather than proposing new models to solve the problem. More importantly, the work does not address a significant problem/challenge. Questions like "how does the issue pose a challenge to researchers if the paper's work does not exist?" should be answered, and the paper's contributions would diminish if other researchers can also address the research gap without much effort.
I am not quite convinced why testing "out-of-the-box, pre-trained TSFMs" is so important. If we know we could most likely get a better solution based on the out-of-the-box version, why stick to the out-of-the-box? Why would this work be that important in this case?
A similar question is about the "zero-shot setting" of transfer learning. If few-shot or other settings are more practical, then why is the zero-shot setting specifically of interest?
The work has used "LSBoost" and "Factor Model" as state-of-the-art baselines. However, they were proposed around 4 years ago. I wonder if more recent baseline methods exist.
Author Response
Dear Reviewer,
Thank you for the thoughtful and constructive reviews of our manuscript.
Please find a point-by-point response in the attached PDF file.
Best regards
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This paper investigates the zero-shot generalization capabilities of Time Series Foundation Models (TSFMs), specifically TimeGPT, Chronos, and Moirai, in forecasting macroeconomic indicators like GDP without any fine-tuning. Using New Zealand's economic data, the study benchmarks these models against traditional baselines (Persistence, ARIMA, LSBoost, and Factor models) and finds that TSFMs, particularly Moirai, often outperform classical models even during volatile periods like the COVID-19 shock. The paper presents promising work, but several issues remain to be addressed:
- The rationale behind selecting specific versions of Chronos and Moirai (e.g., small/base/large) is unclear. Additionally, it is not stated whether hyperparameter settings or input formatting choices were fixed or optimized. This obscures the reproducibility and fairness of the comparisons.
- The baseline models (ARIMA, LSBoost, Factor model) are reasonable but somewhat dated. Not including more modern benchmarks such as LSTM, Prophet, or hybrid models weakens the empirical rigor of the comparison.
- Although the study claims to offer actionable insights for policy analysts, there is no detailed discussion on how forecast outputs (e.g., uncertainty bounds, tail risks) would be interpreted or trusted by policymakers.
- The models are said to offer well-behaved uncertainty estimates, but no metrics (e.g., coverage probability, sharpness, CRPS) are provided to evaluate these probabilistic forecasts. This is a critical omission given the study's emphasis on forecasting under uncertainty.
- The manuscript, especially the methodology section, is overly long and at times repetitive (e.g., descriptions of Moirai architecture). The structure would benefit from condensing and clearly separating conceptual descriptions from experimental setup.
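As an illustration of the probabilistic metrics named in the comments above (coverage probability, sharpness, CRPS), a minimal Python sketch follows. It is not taken from the manuscript or from the review: the function names, the array layout (one row of forecast samples per time step), and the choice of the sample-based CRPS estimator are assumptions made for illustration only.

```python
import numpy as np


def interval_coverage(y_true, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    y_true, lower, upper = (np.asarray(a, dtype=float) for a in (y_true, lower, upper))
    return float(np.mean((y_true >= lower) & (y_true <= upper)))


def sharpness(lower, upper):
    """Mean width of the prediction intervals (narrower means sharper)."""
    return float(np.mean(np.asarray(upper, dtype=float) - np.asarray(lower, dtype=float)))


def crps_from_samples(y_true, samples):
    """Sample-based CRPS estimate, averaged over time steps.

    Uses the identity CRPS = E|X - y| - 0.5 * E|X - X'|, where X and X'
    are independent draws from the forecast distribution.
    `samples` is assumed to have shape (n_steps, n_samples).
    """
    y = np.asarray(y_true, dtype=float)
    s = np.asarray(samples, dtype=float)
    # Mean absolute distance between each sample and the observation.
    term1 = np.mean(np.abs(s - y[:, None]), axis=1)
    # Mean absolute distance between all pairs of samples at each step.
    term2 = np.mean(np.abs(s[:, :, None] - s[:, None, :]), axis=(1, 2))
    return float(np.mean(term1 - 0.5 * term2))
```

Under these assumptions, a nominal 80% interval would be judged well calibrated when `interval_coverage` is close to 0.8, and among calibrated forecasts the one with the smaller `sharpness` and lower `crps_from_samples` would be preferred.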
Author Response
Dear Reviewer,
Thank you for the thoughtful and constructive reviews of our manuscript.
Please find a point-by-point response in the attached PDF file.
Best regards
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
This is a solid research demonstration. I recommend acceptance as is.
Author Response
Thank you for your acceptance.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The submission is acceptable given that concerns over novelty and contributions have been appropriately addressed. The paper also has a clearer motivation now for the problem setting.
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have addressed the previous concerns effectively, and the overall quality of the manuscript has been greatly enhanced. I recommend the paper for acceptance.
