Article
Peer-Review Record

Gross Domestic Product Forecasting Using Deep Learning Models with a Phase-Adaptive Attention Mechanism

Electronics 2025, 14(11), 2132; https://doi.org/10.3390/electronics14112132
by Lan Dong Thi Ngoc 1,2, Nguyen Dinh Hoan 3 and Ha-Nam Nguyen 4,*
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 5: Anonymous
Submission received: 16 April 2025 / Revised: 20 May 2025 / Accepted: 21 May 2025 / Published: 23 May 2025
(This article belongs to the Special Issue Advances in Data Analysis and Visualization)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript is of interest to researchers due to its use of the GDP base variable.
The order presented in the manuscript is appropriate. However, there are deficiencies in the number of references provided, as those related to the models compared should be added.
In the methodological development, only the model presented as the best performer is presented, but there should be an explanation for each model compared. 
The graphs could be presented with better visualization based on what each model does, rather than just in columns.
It would be interesting to see the predictions for the window presented by the algorithm.
The error metrics used in the manuscript should be explained. 

Comments for author File: Comments.pdf

Author Response

We sincerely thank you for the detailed and constructive feedback. We have carefully addressed each of the reviewer’s comments to improve the quality and clarity of the manuscript. Below is our point-by-point response:

Comment 1: “There are deficiencies in the number of references provided, as those related to the models compared should be added.”

Response:
Thank you for this important suggestion. In the revised manuscript (Section 2 - Related Work), we have added several references specifically related to the models used in our comparison, including ARIMA, XGBoost, LSTM, Bi-LSTM, Transformer, and attention-based models. These additions help to provide a more comprehensive overview and strengthen the background and justification for our model selection. 

Comment 2: “In the methodological development, only the model presented as the best performer is presented, but there should be an explanation for each model compared.”

Response:
We agree that the methodological section should provide a better understanding of all benchmark models. We have revised Section 4.2 (Experimental Results) and expanded the discussion in Section 4.3 (Discussion) to include a brief explanation of each baseline model - ARIMA, XGBoost, LSTM, Bi-LSTM, Transformer - and their roles in the comparison. This makes the context of their performance clearer to the reader.

 Comment 3: “The graphs could be presented with better visualization based on what each model does, rather than just in columns.”

Response:
We appreciate this observation. In the revised manuscript, we have updated several figures to improve the visualization quality and clarity. Specifically:

  • We added complementary line plots (Figures 3, 5, 7, 9, 11, and 13) that compare actual vs. predicted values for each model.
  • Each model’s trend is now more clearly visualized over time, which allows readers to see not only performance metrics but also model behavior.
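For readers who wish to reproduce this kind of actual-vs-predicted comparison, a minimal matplotlib sketch is shown below. The data here are synthetic placeholders; the figure numbering, series, and styling in the manuscript may differ.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

years = np.arange(2000, 2020)
actual = 3.0 + np.sin(years / 3.0)                      # synthetic GDP growth series
predicted = actual + np.random.default_rng(0).normal(0.0, 0.2, years.size)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(years, actual, marker="o", label="Actual")
ax.plot(years, predicted, marker="x", linestyle="--", label="Predicted")
ax.set_xlabel("Year")
ax.set_ylabel("GDP growth (%)")
ax.set_title("Actual vs. predicted GDP growth (synthetic data)")
ax.legend()
fig.savefig("actual_vs_predicted.png")
```

A line plot of this form makes each model's temporal behavior visible in a way that bar columns of aggregate metrics cannot.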

 Comment 4: “It would be interesting to see the predictions for the window presented by the algorithm.”

Response:
Thank you for your insightful comment. To enhance the transparency of the forecasting process, we have added a detailed pseudocode representation of the proposed model in Section 3.2 (Phase-Adaptive Attention Representation in Economic Cycles). This pseudocode outlines the logic of how the model dynamically applies phase-specific attention weights across different economic phases. It helps clarify how the model processes input sequences and generates predictions based on phase-labeled data. We believe this addition significantly improves the reproducibility and interpretability of our method.
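Since the pseudocode itself is not reproduced in this response, the following NumPy sketch illustrates the general idea of phase-conditioned attention weighting. The function name, integer phase encoding, and per-phase scoring vectors are illustrative assumptions, not the manuscript's exact mechanism.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def phase_adaptive_attention(hidden, phases, phase_params):
    """hidden: (T, d) LSTM outputs; phases: (T,) integer phase label per step;
    phase_params: dict mapping phase label -> (d,) scoring vector."""
    T, _ = hidden.shape
    # score each time step with the vector belonging to that step's economic phase
    scores = np.array([hidden[t] @ phase_params[phases[t]] for t in range(T)])
    weights = softmax(scores)          # attention distribution over time steps
    context = weights @ hidden         # (d,) phase-weighted context vector
    return context, weights

rng = np.random.default_rng(0)
H = rng.normal(size=(8, 4))                         # 8 time steps, 4 hidden units
phase_labels = np.array([0, 0, 1, 1, 2, 2, 3, 3])   # e.g. recession/recovery/expansion/slowdown
params = {p: rng.normal(size=4) for p in range(4)}
ctx, w = phase_adaptive_attention(H, phase_labels, params)
```

The softmax guarantees the weights form a proper distribution over time steps, so the context vector is a convex combination of the hidden states, with the mixing governed by each step's phase.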

Comment 5: “The error metrics used in the manuscript should be explained.”

Response:
We have addressed this by explicitly including the formula and explanation for RMSE in Section 3.4 (Procedure). This provides readers with a complete understanding of how model accuracy is measured and compared.
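For reference, RMSE can be stated and computed as follows; this is the standard definition, independent of the manuscript's notation.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared residual."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

print(rmse([2.0, 3.0, 4.0], [2.5, 2.5, 4.0]))  # ≈ 0.408
```

Because the residuals are squared before averaging, RMSE penalizes large forecast errors more heavily than MAE, and it is expressed in the same units as the target variable.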

Once again, we sincerely thank you for the valuable comments. These suggestions have significantly contributed to enhancing the rigor and clarity of our manuscript.

Best regards,

Lan Dong Thi Ngoc (on behalf of all authors)

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This article proposes a novel deep learning model named PAA-LSTM (Phase-Aware Adaptive LSTM) to improve the accuracy of GDP growth forecasting. The model integrates Long Short-Term Memory (LSTM) networks with a phase-adaptive attention mechanism based on economic cycles, aiming to better capture the nonlinearity and structural changes in macroeconomic data.

While the PAA-LSTM model introduced in the paper is innovative in several aspects and demonstrates superiority in experiments, there are still some issues and potential areas for improvement. Below are the possible problems with the methodology:

1. The paper uses macroeconomic data from six countries, which, although covering different economic groups, is a relatively small sample size. To validate the model’s generalizability and robustness, testing on a larger and more diverse dataset may be necessary.
   
2. The division of economic cycles is based on theoretical assumptions and data labels, which may introduce subjectivity and errors. Economic cycles can vary across countries and regions, and how to objectively and accurately delineate these cycles remains a challenge.

3. The PAA-LSTM model combines multiple LSTM layers with a phase-adaptive attention mechanism, increasing its complexity. While the model performs well in experiments, it may face high computational costs and resource demands in practical applications, especially when handling large-scale data.

4. The paper mentions that hyperparameters were optimized for each phase, but it does not provide detailed explanations of the specific tuning methods and processes. The complex model structure may lead to overfitting, particularly with limited data.

5. Although the PAA-LSTM model excels in prediction accuracy, the interpretability of its internal mechanisms, particularly the phase-adaptive attention mechanism, may be insufficient. Improving the model’s transparency and interpretability to make it more practical for economic policy-making is a noteworthy concern.

6. The model performs well on historical data, but its predictive ability in the face of future economic environment changes remains unclear. Especially during new economic phenomena or extreme events, the model’s adaptability and robustness require further validation.

7. The paper compares the PAA-LSTM model with various traditional and deep learning models. However, in some cases, such as in stable economies like the U.S. and Canada, PAA-LSTM does not significantly outperform LSTM or Transformer models. This suggests that in certain contexts, simpler models may suffice, and complex models are not always the optimal choice.

It is hoped that the authors will address the above issues to meet the requirements for paper publication.

Comments for author File: Comments.pdf

Author Response

We sincerely thank you for the insightful and detailed feedback. Your constructive comments have been extremely valuable in helping us improve the clarity, rigor, and applicability of our manuscript. We provide below a point-by-point response to each of your comments and have incorporated corresponding revisions in the manuscript where appropriate.

 Comment 1: The paper uses macroeconomic data from six countries, which, although covering different economic groups, is a relatively small sample size. To validate the model’s generalizability and robustness, testing on a larger and more diverse dataset may be necessary.

Response:
We agree that extending the model to a larger set of countries could further validate its robustness. In this study, we intentionally selected six countries to represent three economic groups (developed, emerging, and developing) as a balanced initial testbed. We have added suggestions for future work involving broader cross-country validation using datasets from more regions and different economic systems. Future research could explore directions such as incorporating additional data sources (e.g., Google Trends, social media, satellite imagery) to enhance the richness of the data.

 Comment 2: The division of economic cycles is based on theoretical assumptions and data labels, which may introduce subjectivity and errors. Economic cycles can vary across countries and regions, and how to objectively and accurately delineate these cycles remains a challenge.

Response:
Thank you for raising this important point. In the revised version (Section 3.2), we have clarified how economic phases were identified using a combination of classical business cycle theory (Burns & Mitchell, 1946) and data-driven heuristics based on GDP growth rates. While we recognize that subjectivity cannot be fully eliminated, we took care to use consistent rules across countries and included references to business cycle detection literature to support our approach. We also acknowledge in the Limitations section that incorporating real-time phase detection models could enhance objectivity in future work.
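As an illustration of such data-driven heuristics, consecutive annual growth rates could be mapped to phase labels roughly as follows. The thresholds and decision rules here are hypothetical and stand in for, rather than reproduce, the rules used in the manuscript.

```python
def label_phase(prev_growth, growth, threshold=0.0):
    """Heuristic phase label from two consecutive annual GDP growth rates.
    Illustrative thresholds only, not the manuscript's exact rules."""
    if growth < threshold:
        return "recession"            # growth below the cutoff
    if prev_growth < threshold <= growth:
        return "recovery"             # growth turned positive after a downturn
    if growth >= prev_growth:
        return "expansion"            # positive and accelerating
    return "slowdown"                 # positive but decelerating

growth_rates = [2.1, 3.0, -0.5, 1.2, 2.5, 1.8]
phases = [label_phase(p, g) for p, g in zip(growth_rates[:-1], growth_rates[1:])]
```

Applying one fixed rule set across all countries, as sketched here, is what keeps the labeling consistent even though the cutoff values themselves remain a modeling choice.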

 Comment 3: The PAA-LSTM model combines multiple LSTM layers with a phase-adaptive attention mechanism, increasing its complexity. While the model performs well in experiments, it may face high computational costs and resource demands in practical applications, especially when handling large-scale data.

Response:
We appreciate this concern. In Section 4.2 of the revised manuscript, we have added a description of the hardware and software environments used for training (including CPU, GPU, and memory specifications), as well as training time benchmarks. Although the model does introduce computational overhead due to the phase-adaptive attention layer, we show that it is still feasible to train on a moderately equipped workstation. We also note this as a trade-off between accuracy and efficiency, to be balanced according to the deployment context, since macroeconomic series often exhibit uncertainty and complex regularities that may justify the added model capacity.

 Comment 4: The paper mentions that hyperparameters were optimized for each phase, but it does not provide detailed explanations of the specific tuning methods and processes. The complex model structure may lead to overfitting, particularly with limited data.

Response:
We thank the reviewer for this helpful observation. In the revised Section 3.3 and 4.2, we now provide more detailed information about the hyperparameter tuning process. Specifically, we used grid search with time-series aware validation folds and applied early stopping to mitigate overfitting. We have also included a note on how fine-tuning was applied only to phase-specific components (attention and dense layers), while LSTM layers remained fixed during this stage, reducing overfitting risk.
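A compact sketch of grid search over time-series-aware (expanding-window) validation folds is given below. The hyperparameter grid is hypothetical, and a naive last-value forecast stands in for actual model training, which would be far too heavy for an illustration.

```python
import numpy as np

def expanding_window_splits(n, n_folds=3, min_train=4):
    """Each fold trains on an expanding prefix of the series and validates
    on the block that immediately follows it (no look-ahead leakage)."""
    fold_size = (n - min_train) // n_folds
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        yield np.arange(train_end), np.arange(train_end, train_end + fold_size)

# hypothetical hyperparameter grid
grid = [{"units": u, "lr": lr} for u in (32, 64) for lr in (1e-3, 1e-2)]
y = np.sin(np.arange(16) / 2.0)   # toy series

def cv_score(params):
    errs = []
    for tr, va in expanding_window_splits(len(y)):
        pred = np.full(len(va), y[tr[-1]])   # placeholder "last value" forecast
        errs.append(np.mean((y[va] - pred) ** 2))
    return float(np.mean(errs))

best = min(grid, key=cv_score)
```

The key property is that every validation block lies strictly after its training window, which is what distinguishes time-series-aware folds from ordinary shuffled cross-validation.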

 Comment 5: Although the PAA-LSTM model excels in prediction accuracy, the interpretability of its internal mechanisms, particularly the phase-adaptive attention mechanism, may be insufficient. Improving the model’s transparency and interpretability to make it more practical for economic policy-making is a noteworthy concern.

Response:
We acknowledge this concern and have addressed it in several ways. In Section 3.2, we now include a pseudocode block to clearly illustrate how attention weights are computed based on each economic phase. In addition, we have added a discussion in Section 5 suggesting future work on integrating explainable AI (XAI) techniques such as SHAP or attention heatmaps to improve the interpretability of the phase-adaptive mechanism, especially for use in policy-related contexts.

 Comment 6: The model performs well on historical data, but its predictive ability in the face of future economic environment changes remains unclear. Especially during new economic phenomena or extreme events, the model’s adaptability and robustness require further validation.

Response:
We agree that real-world deployment requires robust adaptability to unexpected changes. In Section 4.3 and Section 5 (Discussion and Conclusion), we now emphasize this limitation and propose possible directions for future research, including real-time model retraining, online learning techniques, and scenario-based stress testing to assess the model’s performance under economic shocks and extreme events.

 Comment 7: The paper compares the PAA-LSTM model with various traditional and deep learning models. However, in some cases, such as in stable economies like the U.S. and Canada, PAA-LSTM does not significantly outperform LSTM or Transformer models. This suggests that in certain contexts, simpler models may suffice, and complex models are not always the optimal choice.

Response:
Thank you for this valuable insight. We now explicitly acknowledge in the Discussion section that PAA-LSTM’s performance advantage is more apparent in volatile or cyclically complex economies. In contrast, for stable economies with less structural fluctuation, simpler models such as standard LSTM or Transformer may suffice. We interpret this as a signal for the potential development of adaptive model selection strategies based on country-specific characteristics, which we propose as an avenue for future work.

Once again, we greatly appreciate your comments, which have substantially strengthened the manuscript. We hope the revised version addresses your concerns satisfactorily.

Best regards,

Lan Dong Thi Ngoc (on behalf of all authors)

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

Thank you for the opportunity to review your manuscript. Kindly find below some suggestions and comments for your consideration:

  1. Introduction: Reduce overlap with the abstract and sharpen the research gap by contrasting your phase-adaptive attention model with key existing approaches.
  2. Literature Review: Reorganize thematically (e.g., traditional vs. deep-learning forecasting) to improve flow and highlight comparative strengths and weaknesses.
  3. Methodology: Add a schematic or pseudocode for your phase-aware attention computation to enhance clarity and reproducibility.
  4. Data and Scope: Specify the exact time spans and frequency (annual vs. quarterly) for each country and detail how you handled missing values.
  5. Phase-Wise Training: Present the phase-weighted loss formula and indicate dataset balances or hyperparameter ranges used.
  6. Experimental Setup: Report hardware/software details (GPU specs, libraries) and clarify your cross-validation or train/test split strategy.
  7. Results and Discussion: Include statistical significance tests for performance gains and discuss why the model underperforms on certain countries.
  8. Conclusion & Future Work: Focus on key takeaways and elaborate on one or two specific avenues (e.g., real-time phase detection).
Comments on the Quality of English Language

Proofread for typos and ensure reference formatting is consistent.

Author Response

Response to Reviewer 3

We would like to sincerely thank you for your thorough reading of our manuscript and for your thoughtful, constructive suggestions. Your feedback has significantly contributed to improving the structure, clarity, and rigor of our paper. Below is our point-by-point response:

Comment 1 – Introduction:

Reduce overlap with the abstract and sharpen the research gap by contrasting your phase-adaptive attention model with key existing approaches.

Response:
Thank you for pointing this out. We have revised the Introduction section to minimize overlap with the abstract and sharpen the articulation of the research gap. In particular, we now more clearly contrast our PAA-LSTM model with existing attention-based models that apply attention statically or without phase awareness.

 Comment 2 – Literature Review:

Reorganize thematically (e.g., traditional vs. deep-learning forecasting) to improve flow and highlight comparative strengths and weaknesses.

Response:
We appreciate this suggestion. The Related Work section has been reorganized into clearer thematic blocks: (i) traditional econometric models (e.g., ARIMA, VAR), (ii) machine learning and LSTM-based models, and (iii) attention-based and Transformer models.

 Comment 3 – Methodology:

Add a schematic or pseudocode for your phase-aware attention computation to enhance clarity and reproducibility.

Response:
We have added detailed pseudocode for the phase-adaptive attention mechanism in Section 3.2. This provides a step-by-step description of how attention weights are computed based on economic phase labels. We believe this significantly improves the model’s reproducibility and transparency for future researchers.

Comment 4 – Data and Scope:

Specify the exact time spans and frequency (annual vs. quarterly) for each country and detail how you handled missing values.

Response:
We have clarified the time span and frequency used for each country in Section 4.1. Specifically, data from 1980 to 2019 (or from 1991 for Russia) is used, and the frequency is annual for all countries. Additionally, we have included more details on data preprocessing and handling of missing or inconsistent entries.

Comment 5 – Phase-Wise Training:

Present the phase-weighted loss formula and indicate dataset balances or hyperparameter ranges used.

Response:
Thank you for this suggestion. We have taken it into consideration and made the corresponding corrections in the revised manuscript.
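For concreteness, a phase-weighted loss of the kind the reviewer requests could take the following illustrative form. The phase labels and weight values here are assumptions for the sketch, not the values used in the manuscript.

```python
import numpy as np

def phase_weighted_mse(y_true, y_pred, phase_ids, phase_weights):
    """Illustrative phase-weighted loss: the squared error at each time step
    is scaled by a weight attached to that step's economic phase, then the
    weighted errors are averaged."""
    w = np.asarray([phase_weights[p] for p in phase_ids], dtype=float)
    sq = (np.asarray(y_true, float) - np.asarray(y_pred, float)) ** 2
    return float(np.sum(w * sq) / np.sum(w))

loss = phase_weighted_mse(
    y_true=[1.0, 2.0, 3.0],
    y_pred=[1.0, 2.5, 2.0],
    phase_ids=["expansion", "recession", "recession"],
    phase_weights={"expansion": 1.0, "recession": 2.0},  # hypothetical weights
)
```

Up-weighting errors in volatile phases such as recessions forces the model to fit exactly the periods where standard MSE, dominated by the more numerous stable years, would otherwise underemphasize them.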

Comment 6 – Experimental Setup:

Report hardware/software details (GPU specs, libraries) and clarify your cross-validation or train/test split strategy.

Response:
These details have been added in Section 4.2. We specify that experiments were conducted on a workstation with an Intel i9 CPU, 32 GB RAM, and an Nvidia Quadro T2000 GPU, using Python 3.10, TensorFlow 2.13, and Scikit-learn 1.3. We also clarify the train/test split strategy (details in the manuscript).

Comment 7 – Results and Discussion:

Include statistical significance tests for performance gains and discuss why the model underperforms on certain countries.

Response:
We appreciate this insightful comment. We have revised the manuscript according to your suggestions.

Comment 8 – Conclusion & Future Work:

Focus on key takeaways and elaborate on one or two specific avenues (e.g., real-time phase detection).

Response:
Thank you for this helpful advice. We have revised the Conclusion section to emphasize the main findings of the study and have outlined two specific directions for future research: (i) incorporating real-time phase detection mechanisms and (ii) integrating additional data sources to improve model adaptability.

Once again, we greatly appreciate your comments, which have substantially strengthened the manuscript. We hope the revised version addresses your concerns satisfactorily.

Best regards,

Lan Dong Thi Ngoc (on behalf of all authors)

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

Abstract (Line 14). Please clarify the meaning of the acronym PAA in the abstract. While the meaning of LSTM is provided, PAA is introduced without explanation.

Introduction (Line 34). You reference OECD (2018) and IMF (2021). Please verify whether more recent versions of these reports are available, as up-to-date references enhance the relevance and credibility of your background information.

Introduction (Lines 40–42). The citation provided does not directly support the claim that “models such as ARIMA and VAR have revealed several limitations in capturing the nonlinearity and structural variation inherent in economic time series.” It is recommended that you cite literature that explicitly addresses the limitations of ARIMA and VAR in this context.

The acronyms “CNN” and “Bi” are used without definition (Line 85). Please define these terms the first time they appear in the manuscript.

The claim that “indicators like GDP, unemployment, and inflation often follow cyclical patterns with typical phases: recession, recovery, expansion, and slowdown” (line 106-108) is currently supported by a source that is over 79 years old. Please provide a more recent scholarly reference or textbook that reflects contemporary macroeconomic theory.

Section 3.3 (Proposed Method). This section lacks citations for key components of the proposed methodology. For example, in line 185, where you reference “economic theory” as a foundation, you should include citations of authors who previously developed or applied these concepts. Please ensure that methodological elements derived from prior work are properly cited.

The link provided for reference [18] (“World Bank Open Data”) is not functional (line 236). Please verify and update the URL or provide an alternative access method to this data source.

Sections 4.1 and 4.2 are methodological in nature and should be repositioned under the methodology section, not the results section.

To improve clarity and support your findings, include a line graph that visually compares the predictive performance of the different models across countries. This will help readers better interpret your results.

 

Substantial revisions required:

1.- You state that the dataset includes quarterly and cyclical components. In this case, a Seasonal ARIMA (SARIMA) model would be more appropriate than a standard ARIMA model, as SARIMA explicitly accounts for seasonality. The manuscript does not discuss SARIMA, nor does it justify its exclusion. Please either incorporate SARIMA into your analysis or provide a robust justification for its omission in both the methodology and results sections.

2.- The time period covered in your time series analysis is not clearly stated. Please specify the range of dates for the data used in your study.

3.- The manuscript lacks a dedicated discussion section. You are advised to include one, or to clearly integrate the discussion within the results section. This section should compare your findings to those of prior studies (either similar methodologies applied in different contexts or the same contexts examined with other techniques) and assess the degree to which your results confirm, contradict, or extend previous research.

Comments on the Quality of English Language

Paragraph Structure. Please avoid constructing paragraphs with only two sentences. In academic English writing, a well-structured paragraph typically includes at least three sentences: the first introduces the main idea, the second (and possibly third) supports it with evidence or elaboration, and the final sentence concludes or transitions to the next point. This structural issue should be addressed in the following lines: 58–62, 85–87, 291–293, 343–346, and 353–358.

Sentence Length and Clarity. Overly long sentences can hinder clarity and readability. Consider breaking lengthy sentences into shorter, more digestible segments. This revision is particularly necessary in lines 89–93.

The sentence describing the architecture of the PAA-LSTM model, “The overall architecture of the PAA-LSTM model is composed of three main components: a multi-layer Long Short-Term Memory (LSTM) network, a Phase-Aware Adaptive Attention mechanism, and a fully connected output layer”, is repeated in both lines 133–135 and 172–174. Please eliminate this redundancy by retaining the more contextually appropriate instance.

The description of economic group classifications appears redundantly in lines 241–245 and again in 266–268. Please consolidate this information to avoid repetition.

The list of models used for comparison is repeated in lines 261–262 and again in 269–269. Consider presenting this information once in a concise and clear manner.

The data presented in Table 7 appears to replicate information already included in previous tables. Please review and remove redundant data to streamline the presentation of your results.

The content in lines 376–380 and 383–386 repeats previously stated results from individual country analyses. Please revise this section to avoid duplication and focus instead on synthesizing or comparing findings across cases.

Author Response

We sincerely thank you for the time and effort invested in evaluating our manuscript and for providing constructive and insightful comments. Below is our detailed response:

Comments on “General Comments”:

Abstract (Line 14). Please clarify the meaning of the acronym PAA in the abstract. While the meaning of LSTM is provided, PAA is introduced without explanation.

Introduction (Line 34). You reference OECD (2018) and IMF (2021). Please verify whether more recent versions of these reports are available, as up-to-date references enhance the relevance and credibility of your background information.

Introduction (Lines 40–42). The citation provided does not directly support the claim that “models such as ARIMA and VAR have revealed several limitations in capturing the nonlinearity and structural variation inherent in economic time series.” It is recommended that you cite literature that explicitly addresses the limitations of ARIMA and VAR in this context.

The acronyms “CNN” and “Bi” are used without definition (Line 85). Please define these terms the first time they appear in the manuscript.

The claim that “indicators like GDP, unemployment, and inflation often follow cyclical patterns with typical phases: recession, recovery, expansion, and slowdown” (line 106-108) is currently supported by a source that is over 79 years old. Please provide a more recent scholarly reference or textbook that reflects contemporary macroeconomic theory.

Section 3.3 (Proposed Method). This section lacks citations for key components of the proposed methodology. For example, in line 185, where you reference “economic theory” as a foundation, you should include citations of authors who previously developed or applied these concepts. Please ensure that methodological elements derived from prior work are properly cited.

The link provided for reference [18] (“World Bank Open Data”) is not functional (line 236). Please verify and update the URL or provide an alternative access method to this data source.

Sections 4.1 and 4.2 are methodological in nature and should be repositioned under the methodology section, not the results section.

To improve clarity and support your findings, include a line graph that visually compares the predictive performance of the different models across countries. This will help readers better interpret your results.

Response to comments on “General Comments:

  • Abstract (Line 14): We have added a clarification for the acronym “PAA” as “Phase-Aware Attention” in the abstract to ensure clarity.
  • Introduction (Line 34): We have updated the OECD and IMF references to their latest available versions to enhance the timeliness and credibility of the background information.
  • Introduction (Lines 40–42): We have replaced the previous citation with more recent literature that explicitly discusses the limitations of ARIMA and VAR models in capturing nonlinearity and structural shifts in economic time series.
  • Line 85: We have defined “CNN” (Convolutional Neural Network) and “Bi-LSTM” (Bidirectional Long Short-Term Memory) upon their first appearance in the manuscript.
  • Lines 106–108: The cited theory represents a foundational framework in classical macroeconomics, which continues to be referenced and built upon in contemporary economic literature. Nevertheless, we have supplemented the original reference with a more recent and authoritative source to better reflect modern macroeconomic perspectives.
  • Section 3.3: We have added appropriate citations to support the key methodological elements derived from economic theory and prior studies.
  • Reference [18]: The URL for the World Bank Open Data source has been corrected and verified for functionality.
  • Sections 4.1 and 4.2: Section 4.2 has been moved under the Methodology section for better structural alignment.
  • Line Graph Comparison: We have added a visual line graph to compare the predictive performance of the different models across countries, as suggested.

Response to comments on "Substantial revisions required"

Comment 1:

You state that the dataset includes quarterly and cyclical components. In this case, a Seasonal ARIMA (SARIMA) model would be more appropriate than a standard ARIMA model, as SARIMA explicitly accounts for seasonality. The manuscript does not discuss SARIMA, nor does it justify its exclusion. Please either incorporate SARIMA into your analysis or provide a robust justification for its omission in both the methodology and results sections.

Response:
Thank you very much for this insightful comment. We acknowledge that SARIMA is indeed a suitable model for time series data with seasonal components. However, in our case, although the macroeconomic data contains cyclical patterns, the dataset we use is of annual frequency, not quarterly. Therefore, the seasonality component typically addressed by SARIMA (e.g., quarterly or monthly seasonality) is not present in our setting.

To clarify this and prevent confusion, we have now updated both the Methodology and Results sections to explicitly mention the use of annual data, and to justify the exclusion of SARIMA on this basis. We also added a short note in the Related Work section acknowledging SARIMA as a valuable alternative for datasets with seasonal frequencies.

Comment 2:

The time period covered in your time series analysis is not clearly stated. Please specify the range of dates for the data used in your study.

Response:
Thank you for pointing this out. We have revised the Data and Scope section to clearly state the time range for each country. Specifically, for most countries (United States, Canada, China, India, and Vietnam), the dataset spans from 1980 to 2019, while for Russia, data is available from 1991 to 2019 due to limitations in historical records.

Comment 3:

The manuscript lacks a dedicated discussion section. You are advised to include one, or to clearly integrate the discussion within the results section. This section should compare your findings to those of prior studies (either similar methodologies applied in different contexts or the same contexts examined with other techniques) and assess the degree to which your results confirm, contradict, or extend previous research.

Response:
We appreciate this thoughtful recommendation. In the revised manuscript, we have added a dedicated Discussion section (Section 4.3) to better analyze and contextualize our findings. This section compares our results with those of previous studies applying LSTM, attention-based models, or traditional econometric techniques. We specifically highlight areas where our PAA-LSTM model outperforms or aligns with existing research, and we discuss why certain models perform differently across country groups. This addition strengthens the interpretive depth and scholarly positioning of our study.

Comments on English Language:

Paragraph Structure. Please avoid constructing paragraphs with only two sentences. In academic English writing, a well-structured paragraph typically includes at least three sentences: the first introduces the main idea, the second (and possibly third) supports it with evidence or elaboration, and the final sentence concludes or transitions to the next point. This structural issue should be addressed in the following lines: 58–62, 85–87, 291–293, 343–346, and 353–358.

Sentence Length and Clarity. Overly long sentences can hinder clarity and readability. Consider breaking lengthy sentences into shorter, more digestible segments. This revision is particularly necessary in lines 89–93.

The sentence describing the architecture of the PAA-LSTM model, “The overall architecture of the PAA-LSTM model is composed of three main components: a multi-layer Long Short-Term Memory (LSTM) network, a Phase-Aware Adaptive Attention mechanism, and a fully connected output layer”, is repeated in both lines 133–135 and 172–174. Please eliminate this redundancy by retaining the more contextually appropriate instance.

The description of economic group classifications appears redundantly in lines 241–245 and again in 266–268. Please consolidate this information to avoid repetition.

The list of models used for comparison is repeated in lines 261–262 and again in 269–269. Consider presenting this information once in a concise and clear manner.

The data presented in Table 7 appears to replicate information already included in previous tables. Please review and remove redundant data to streamline the presentation of your results.

The content in lines 376–380 and 383–386 repeats previously stated results from individual country analyses. Please revise this section to avoid duplication and focus instead on synthesizing or comparing findings across cases.

Response to comments on English Language:

We have carefully revised the manuscript to improve clarity and readability, specifically:

  • Paragraphs with fewer than three sentences have been consolidated into cohesive, well-structured paragraphs.
  • Overly long sentences have been shortened or restructured for better readability.
  • Redundant content (model names, country groupings, and repeated results) has been removed or consolidated.
  • The synthesis of results across countries has been revised to avoid repetition and instead highlight cross-national comparisons.

Once again, we sincerely thank the reviewer for their valuable feedback, which has significantly contributed to strengthening the quality and rigor of our manuscript.

Best regards,

Lan Dong Thi Ngoc (on behalf of all authors)

Author Response File: Author Response.pdf

Reviewer 5 Report

Comments and Suggestions for Authors

Referee Report on: "Gross Domestic Product Forecasting Using Deep Learning Models with Phase-Adaptive Attention Mechanism" (electronics-3619368)

 

Summary

The submitted paper proposes a novel forecasting model, termed PAA-LSTM (Phase-Aware Adaptive LSTM), for predicting GDP growth. The model integrates a standard LSTM network with a customized attention mechanism that is conditioned on business-cycle phases (recession, recovery, expansion, stagnation). In essence, the attention weights are allowed to vary depending on which phase the economy is in, under the premise that different phases emphasize different signals in macro data. The authors apply this model to real GDP data for Vietnam and India, comparing its one-step-ahead forecast accuracy (RMSE, MAE, R²) to several benchmarks, including classical time-series (ARIMA), machine learning (XGBoost), and deep learning models (LSTM, bi-LSTM, Transformer). They report that the PAA-LSTM model outperforms all baselines, achieving substantially lower error and higher R² in their case studies. The paper claims two main contributions: (1) a novel deep-learning architecture that incorporates cyclical regime information via phase-adaptive attention, and (2) empirical evidence that this approach yields improved GDP forecasts in volatile, emerging-market contexts.

 

Major Issues

  • Definition and Use of Economic Phases: A central premise is that the model has access to phase information (recession, expansion, etc.) and uses it to adapt attention parameters. However, it is not clear how these phases are defined or determined. Are they labeled manually (e.g. using historical NBER or local business-cycle dates) or inferred by the model? If phase labels are based on hindsight, the model may be using information not available in real-time forecasting. The paper should clarify how phases are identified in the data and, crucially, how phase information would be known or estimated at forecast time. This issue is fundamental: regime-switching models in macro (e.g. Hamilton 1989) emphasize that phases cannot be observed perfectly ex ante. The authors should explicitly discuss the feasibility of their approach in real forecasting, and possibly compare to alternative methods (e.g. Markov-switching models or time-varying parameter models).
  • Data and Overfitting Concerns: The analysis is limited to two small economies (Vietnam and India), both with relatively short GDP series. Given the complexity of PAA-LSTM (multi-layer LSTM plus separate attention parameters per phase), there is a substantial risk of overfitting, especially with limited training data (e.g. quarterly GDP back to 1990s yields only a few hundred points). The authors should provide details on data frequency, sample size, and how the model was trained (e.g. cross-validation or hold-out periods). Did they use walk-forward testing, or only a single train-test split? It would strengthen the paper to include measures of forecast stability or statistical significance. For example, Xie et al. (2024) find that deep learning can outperform simple methods for GDP growth only when rich predictors or novel data are used​. If the authors use only GDP (and perhaps a few macro indicators), it is possible that a simpler model would suffice or even outperform if tuned. At minimum, some sensitivity analysis (varying train/test splits or model complexity) is needed.
  • Benchmark Models and Comparisons: The PAA-LSTM is compared to ARIMA, XGBoost, LSTM, bi-LSTM, and Transformer. However, details on how these benchmarks were implemented are missing. For fairness, each model should be properly tuned (e.g. hyperparameters for XGBoost and Transformer architecture). The reported results are puzzling: for Vietnam, the standard LSTM has lower RMSE (0.78) than ARIMA (0.95), yet its R² is slightly lower (0.75 vs. 0.76). It is unclear how R² is computed here (it may not be meaningful for non-stationary series). The authors should verify metric definitions and consider robust measures (for example, mean absolute percentage error or Theil’s U). They should also test if the improvements are statistically significant. In related work, ensemble methods and high-frequency indicators have often dominated single-model forecasts (Chu & Qureshi 2023, via Jallow et al., 2025). It would be useful to compare against a simple ensemble of LSTM models or a richer factor model. As it stands, the claim of “state-of-the-art” accuracy is not fully supported without broader benchmarks.
  • Interpretability and Economic Plausibility: The novelty lies in making attention phase-adaptive, but the paper provides little insight into how the attention shifts across phases or which variables/time-steps become important. Simply reporting a better error metric is not enough for the Journal of Finance audience; readers will want to understand why the model works. For example, does the model pay more attention to investment or consumption variables in expansions vs. recessions? The authors should analyze the learned attention weights or do an ablation (e.g. compare with a non-phase-aware attention). They might also connect to economic intuition: prior research (e.g. Chen et al., 2024) has shown that including key macro indicators (CPI, unemployment) notably improves GDP forecasts with deep nets​. In a similar vein, Liu et al. (2024) demonstrate that augmenting data with generative models and then analyzing feature importance yields insights into which factors drive forecasts​. The current draft has little discussion of which features or periods drive the PAA-LSTM predictions.
  • Model Complexity vs. Parsimony: The PAA-LSTM has many parameters (LSTM layers plus separate attention matrices for each of four phases). The authors should justify this complexity. In macro forecasting, simpler models often suffice. For example, Xie et al. (2024) report that simple linear regression can outperform deep nets when only GDP is used​. This suggests the need for regularization or feature selection. The authors should discuss model parsimony. Could a simpler regime-switching LSTM (with a single changing weight matrix) achieve similar results? In its current form, the model risks overfitting idiosyncrasies in Vietnam/India data, which may not generalize. A suggestion is to test the model on a well-studied series (e.g. U.S. or OECD GDP) and see if it still gains over baselines.
  • Presentation of Results: Some reported statistics raise questions. For Vietnam, PAA-LSTM’s R² is given as 0.94 (very high), compared to 0.75 for LSTM; likewise, the Transformer baseline had a higher R² (0.82) but worse RMSE. These inconsistencies suggest R² might be computed relative to a non-trivial benchmark (perhaps mean GDP). The paper should clarify these metrics. The tables and figures should also include confidence intervals or p-values if possible. In addition, Figure 2 and 3 are mentioned but not fully described; ensure each figure is self-contained with clear labels and legends. The textual discussion of results could engage more with related literature – for instance, whether improving on Transformer results is surprising, given that other studies (Chen et al., 2024) often find Transformers outperform LSTMs.

Minor Issues

  • Organization: The flow of sections can be improved. The Literature Review lists many related works but should more directly contrast them with the present paper. For instance, mention explicitly that previous deep-GDP studies (e.g. Chen et al., 2024; Xie et al., 2024) have not used cyclical attention, highlighting the novelty here. In the Methods section, a figure illustrating the PAA-LSTM architecture would help readers follow the description.
  • Formatting and Presentation: Ensure consistency in table and figure formatting. For example, Table 1 lacks units for RMSE/MAE (are these percentage points of GDP growth?). All figures should be high enough resolution to be readable; captions should summarize the key message. The bibliography should follow the Journal of Finance style (though citations are shown in the text as required by the submission guidelines).
  • Reference to Related Work: The paper would benefit from citing a few recent surveys or papers on machine learning, to place the contribution in context. For example, recent work Lesmi et al., 2024 has shown that the number of years of schooling required to comprehend financial reporting increases by nearly one month each year, indicating a growing inaccessibility of financial reports for a substantial portion of the population. ​Citing such work (even if from fields like applied AI in finance) would strengthen the introduction.

 

References

  • Chen, X.-S., Kim, M. G., Lin, C., & Na, H. J. (2025). Development of Per Capita GDP Forecasting Model Using Deep Learning: Incorporating CPI and Unemployment. Sustainability 17(3), 843​. https://doi.org/10.3390/su17030843
  • Jallow, D. et al. (2025). Transfer Learning for Predicting GDP Growth Based on Remittance Inflows: A Case Study of The Gambia. Frontiers in Artificial Intelligence​. https://doi.org/10.3389/frai.2025.1510341
  • Liu, D., Chen, K., Cai, Y., & Tang, Z. (2024). Interpretable EU ETS Phase 4 Prices Forecasting Based on Deep Generative Data Augmentation. Finance Research Letters, 61. https://doi.org/10.1016/j.frl.2024.105038
  • Lesmy, D., Muchnik, L., & Mugerman, Y. (2024). Lost in the fog: Growing complexity in financial reporting–a comparative study. http://dx.doi.org/10.2139/ssrn.4542676
  • Xie, H., Xu, X., Yan, F., Qian, X., & Yang, Y. (2024). Deep Learning for Multi-Country GDP Prediction: A Study of Model Performance and Data Impact. arXiv preprint, 2024. https://doi.org/10.48550/arXiv.2409.02551

Author Response

Response to Reviewer 5

We sincerely thank you for the comprehensive and thoughtful feedback. Your comments raise several important concerns regarding the model’s practical feasibility, methodological rigor, and interpretability. We address each major point in detail below and have revised the manuscript accordingly.

Response to Reviewer – Major Issues

Comment 1. Definition and Use of Economic Phases

A central premise is that the model has access to phase information (recession, expansion, etc.) and uses it to adapt attention parameters. However, it is not clear how these phases are defined or determined. Are they labeled manually (e.g. using historical NBER or local business-cycle dates) or inferred by the model? If phase labels are based on hindsight, the model may be using information not available in real-time forecasting. The paper should clarify how phases are identified in the data and, crucially, how phase information would be known or estimated at forecast time. This issue is fundamental: regime-switching models in macro (e.g. Hamilton 1989) emphasize that phases cannot be observed perfectly ex ante. The authors should explicitly discuss the feasibility of their approach in real forecasting, and possibly compare to alternative methods (e.g. Markov-switching models or time-varying parameter models).

Response:

We thank the reviewer for this insightful and important comment. In response, we have made substantial clarifications in both the Methodology and Discussion sections of the revised manuscript to address the concern about how economic phases are defined and how phase information can be used in practice.

First, we clarify that economic phase labels in our study are not derived from ex-post official sources such as NBER cycle dates. Instead, we adopt a data-driven heuristic rule based solely on the sign and magnitude of the real GDP growth rate available at time t. Specifically:

  • Negative growth indicates recession;
  • Low but positive growth indicates stagnation;
  • A rebound from negative growth marks recovery;
  • Sustained high positive growth signals expansion.

This classification relies only on past and current observed GDP growth values and does not use any future or hindsight information. Furthermore, to mitigate country-specific differences in growth baselines, we adapt these thresholds relative to each country’s historical average growth and volatility.

Second, we explicitly recognize in the revised text (Discussion) that economic phases are inherently latent and can only be approximated, as noted in classic regime-switching literature (e.g., Hamilton, 1989). We also discuss that while our approach uses observed variables to proxy for latent regimes, it is consistent with simplified classification strategies used in applied macroeconomic forecasting.
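To illustrate, the classification rule described above can be sketched in code. This is our reading of the heuristic, not the authors' exact implementation; in particular, the 0.5-standard-deviation expansion threshold is an illustrative assumption standing in for the country-adaptive thresholds mentioned in the response:

```python
import statistics

def classify_phase(growth_history):
    """Label the current period from observed GDP growth only (no look-ahead).

    growth_history: real GDP growth rates up to and including time t.
    Thresholds are scaled by the country's own historical mean and
    volatility, mirroring the country-adaptive rule in the response.
    """
    current = growth_history[-1]
    mean = statistics.mean(growth_history)
    std = statistics.pstdev(growth_history) or 1.0  # guard: zero volatility

    if current < 0:
        return "recession"
    # a rebound from negative growth marks recovery
    if len(growth_history) >= 2 and growth_history[-2] < 0:
        return "recovery"
    # high growth relative to the country's own baseline signals expansion
    if current > mean + 0.5 * std:
        return "expansion"
    # low but positive growth
    return "stagnation"
```

Because the rule reads only past and current values, the same function can be applied at forecast time without hindsight information.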

Comment 2: Data and Overfitting Concerns

The analysis is limited to two small economies (Vietnam and India), both with relatively short GDP series. Given the complexity of PAA-LSTM (multi-layer LSTM plus separate attention parameters per phase), there is a substantial risk of overfitting, especially with limited training data (e.g. quarterly GDP back to 1990s yields only a few hundred points). The authors should provide details on data frequency, sample size, and how the model was trained (e.g. cross-validation or hold-out periods). Did they use walk-forward testing, or only a single train-test split? It would strengthen the paper to include measures of forecast stability or statistical significance. For example, Xie et al. (2024) find that deep learning can outperform simple methods for GDP growth only when rich predictors or novel data are used​. If the authors use only GDP (and perhaps a few macro indicators), it is possible that a simpler model would suffice or even outperform if tuned. At minimum, some sensitivity analysis (varying train/test splits or model complexity) is needed.

Response:

We sincerely thank the reviewer for the thoughtful and constructive comments regarding sample size and the risk of overfitting in model training. To mitigate overfitting and enhance the robustness of our results, we implemented the following strategies:

An expanding window cross-validation approach, as detailed in Section 4.2, was employed to ensure that the model is evaluated across multiple temporal folds while preserving the causal structure of the data. Additionally, as described in Section 3.3, we designed a phase-wise training strategy, which allows the model to effectively leverage each segment of the data while avoiding a substantial reduction in sample size for each economic phase.

Furthermore, we acknowledge that incorporating alternative data sources, such as Google Trends or micro-level indicators, is a promising direction for future research to increase the diversity and richness of the input feature set. This perspective is also consistent with the conclusions of Xie et al. (2024), who emphasize the importance of novel and high-dimensional data for improving the performance of deep learning models in GDP forecasting.
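The expanding window cross-validation scheme mentioned above can be sketched generically as follows. The fold sizes here are placeholders; the actual split configuration used in Section 4.2 is not restated in this response:

```python
def expanding_window_splits(n_obs, initial_train, test_size=1):
    """Yield (train_indices, test_indices) folds whose training window
    only ever grows forward in time, preserving the causal structure
    of the series (no future data leaks into training)."""
    start = initial_train
    while start + test_size <= n_obs:
        train = list(range(0, start))
        test = list(range(start, start + test_size))
        yield train, test
        start += test_size
```

For example, with 10 observations, an initial training window of 6, and a test size of 2, the generator yields two folds, the second training on all 8 points preceding its test window.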

Comment 3: Benchmark Models and Comparisons

The PAA-LSTM is compared to ARIMA, XGBoost, LSTM, bi-LSTM, and Transformer. However, details on how these benchmarks were implemented are missing. For fairness, each model should be properly tuned (e.g. hyperparameters for XGBoost and Transformer architecture). The reported results are puzzling: for Vietnam, the standard LSTM has lower RMSE (0.78) than ARIMA (0.95), yet its R² is slightly lower (0.75 vs. 0.76). It is unclear how R² is computed here (it may not be meaningful for non-stationary series). The authors should verify metric definitions and consider robust measures (for example, mean absolute percentage error or Theil’s U). They should also test if the improvements are statistically significant. In related work, ensemble methods and high-frequency indicators have often dominated single-model forecasts (Chu & Qureshi 2023, via Jallow et al., 2025). It would be useful to compare against a simple ensemble of LSTM models or a richer factor model. As it stands, the claim of “state-of-the-art” accuracy is not fully supported without broader benchmarks.

Response:

We thank the reviewer for raising several important points regarding the implementation of benchmark models, metric interpretation, and the validity of performance claims.

First, we have updated the manuscript (Section 3.4) to provide detailed descriptions of the benchmark model implementations, including:

  • ARIMA: Selected via AIC-based model order identification (auto-ARIMA), applied to each country’s GDP growth series individually.
  • XGBoost: Hyperparameters (e.g., max_depth, learning_rate, n_estimators) were optimized using grid search with cross-validation.
  • Transformer: The architecture includes 2 encoder layers, 4 attention heads, and a feedforward layer with dropout. Parameters were fine-tuned to avoid overfitting.
  • LSTM, Bi-LSTM, and LSTM + Attention: All were implemented with consistent input sequences, embedding sizes, and regularization techniques.
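As one illustration of the tuning procedure described for XGBoost, the cross-validated grid search follows the generic pattern below. The parameter grid and scoring function here are placeholders, not the values used in the paper:

```python
import itertools

def grid_search(param_grid, score_fn):
    """Exhaustively evaluate every hyperparameter combination and return
    the configuration with the lowest score (e.g. mean cross-validated
    RMSE), mirroring the grid search described for XGBoost."""
    keys = sorted(param_grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = score_fn(params)  # stand-in for a cross-validation run
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

In practice `score_fn` would train the model under `params` on each expanding-window fold and average the validation errors.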

Second, regarding the RMSE and R² inconsistency observed in Vietnam, we agree that R² may be of limited value for nonlinear models; we will investigate and add MAPE and Theil's U coefficient as complementary metrics.
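For reference, the additional metrics mentioned (MAPE and Theil's U) can be computed from actuals and forecasts as below. This is a minimal sketch using the common U1 normalization for Theil's U, which may differ from the exact variant adopted in the revision:

```python
import math

def rmse(actual, pred):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual))

def mape(actual, pred):
    """Mean absolute percentage error (in %); assumes no zero actuals."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, pred)) / len(actual)

def theils_u(actual, pred):
    """Theil's U statistic (U1 form): 0 = perfect forecast, 1 = worst case."""
    num = rmse(actual, pred)
    den = math.sqrt(sum(a ** 2 for a in actual) / len(actual)) + \
          math.sqrt(sum(p ** 2 for p in pred) / len(pred))
    return num / den
```

Unlike R², these measures remain interpretable for non-stationary growth series, which is why the reviewer suggested them.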

Third, we fully agree with the reviewer's suggestion to compare against ensemble or factor-based models. While such models are beyond the current scope, we identify ensemble methods (e.g., LSTM ensembles, dynamic factor models) as a valuable direction for future work. This aligns with recent findings such as Chu & Qureshi (2023) and Jallow et al. (2025), as cited.

Comment 4: Interpretability and Economic Plausibility 

The novelty lies in making attention phase-adaptive, but the paper provides little insight into how the attention shifts across phases or which variables/time-steps become important. Simply reporting a better error metric is not enough for the Journal of Finance audience; readers will want to understand why the model works. For example, does the model pay more attention to investment or consumption variables in expansions vs. recessions? The authors should analyze the learned attention weights or do an ablation (e.g. compare with a non-phase-aware attention). They might also connect to economic intuition: prior research (e.g. Chen et al., 2024) has shown that including key macro indicators (CPI, unemployment) notably improves GDP forecasts with deep nets​. In a similar vein, Liu et al. (2024) demonstrate that augmenting data with generative models and then analyzing feature importance yields insights into which factors drive forecasts​. The current draft has little discussion of which features or periods drive the PAA-LSTM predictions.

Response:
We would like to thank the reviewer for these thoughtful and important comments on the interpretability and economic plausibility of the model. We agree that the current version of the manuscript does not analyze the learned attention weights in detail, so it does not show which time steps or variables the model focuses on in each phase, nor which economic characteristics (such as consumption, investment, or unemployment) play an important role in each phase of the economic cycle. Due to space limitations and the priority given to presenting the new model and the empirical results on six countries, a more in-depth analysis of attention will be prioritized in a follow-up study that we are currently preparing.

Comment 5: Model Complexity vs. Parsimony

The PAA-LSTM has many parameters (LSTM layers plus separate attention matrices for each of four phases). The authors should justify this complexity. In macro forecasting, simpler models often suffice. For example, Xie et al. (2024) report that simple linear regression can outperform deep nets when only GDP is used​. This suggests the need for regularization or feature selection. The authors should discuss model parsimony. Could a simpler regime-switching LSTM (with a single changing weight matrix) achieve similar results? In its current form, the model risks overfitting idiosyncrasies in Vietnam/India data, which may not generalize. A suggestion is to test the model on a well-studied series (e.g. U.S. or OECD GDP) and see if it still gains over baselines.

Response:

We sincerely thank you for the thoughtful and highly relevant comment regarding the trade-off between model complexity and the principle of parsimony in the context of macroeconomic forecasting.

We fully agree that parsimony is an important criterion, especially in economic modeling where data may be limited and generalizability is crucial. However, we also believe that macroeconomic data is often nonlinear, subject to volatility and cyclical dynamics, and therefore using a more complex model - such as PAA-LSTM with phase-adaptive attention - is a reasonable trade-off to better capture the dynamic nature of economic time series.

To mitigate the risk of overfitting, we implemented a time-based validation strategy (expanding window cross-validation), as described in Section 4.2. This approach ensures that the model is evaluated in a temporally consistent manner, preserving causality and enhancing stability under real-world forecasting conditions.

We acknowledge that the PAA-LSTM architecture involves a higher number of parameters compared to traditional models, as it incorporates separate attention mechanisms for each economic phase. However, this design choice is motivated by the need to model the asymmetric behavior across different phases of the economic cycle - a feature often overlooked in linear models or fixed attention structures. For instance, during recessions or recoveries, the relevance and influence of macroeconomic indicators can shift substantially, and the model must be able to adapt accordingly.

Finally, to enhance the model's generalizability, we evaluated PAA-LSTM across six countries representing three different economic groups, including the United States and Canada - two stable, well-studied economies frequently used as benchmarks in macroeconomic forecasting research. The results demonstrate that the model maintains robust performance, thereby supporting the feasibility of phase-adaptive attention even in low-volatility environments.
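To make the phase-specific parameterization concrete, a minimal sketch of phase-conditioned attention is given below. This is a schematic reading of the design (one score vector per phase selecting over LSTM hidden states), not the authors' code; dimensions and initialization are placeholders:

```python
import math

PHASES = ("recession", "recovery", "expansion", "stagnation")

class PhaseAdaptiveAttention:
    """Illustrative phase-adaptive attention: the active economic phase
    selects which score parameters weight the LSTM hidden states, so the
    model can emphasize different time steps in different phases."""

    def __init__(self, hidden_dim, init=0.1):
        # one separate parameter vector per phase (trained in practice)
        self.weights = {p: [init] * hidden_dim for p in PHASES}

    def __call__(self, hidden_states, phase):
        w = self.weights[phase]
        # unnormalized scores, then a numerically stable softmax
        scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in hidden_states]
        m = max(scores)
        exp = [math.exp(s - m) for s in scores]
        total = sum(exp)
        alphas = [e / total for e in exp]
        # context vector: attention-weighted sum of hidden states
        dim = len(hidden_states[0])
        return [sum(a * h[i] for a, h in zip(alphas, hidden_states))
                for i in range(dim)]
```

The extra cost relative to a fixed attention layer is four parameter vectors instead of one, which is the complexity/parsimony trade-off the reviewer raises.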

Comment 6: Presentation of Results

Some reported statistics raise questions. For Vietnam, PAA-LSTM’s R² is given as 0.94 (very high), compared to 0.75 for LSTM; likewise, the Transformer baseline had a higher R² (0.82) but worse RMSE. These inconsistencies suggest R² might be computed relative to a non-trivial benchmark (perhaps mean GDP). The paper should clarify these metrics. The tables and figures should also include confidence intervals or p-values if possible. In addition, Figure 2 and 3 are mentioned but not fully described; ensure each figure is self-contained with clear labels and legends. The textual discussion of results could engage more with related literature – for instance, whether improving on Transformer results is surprising, given that other studies (Chen et al., 2024) often find Transformers outperform LSTMs.

Response:

We thank the reviewer for raising an important point regarding the interpretation of the RMSE and R² metrics. While both are commonly used performance indicators, they reflect different aspects of model evaluation. The seemingly inconsistent phenomenon - for example, the Transformer model having a higher R² than LSTM but also a larger RMSE - can occur when the model captures the overall trend of the data better (resulting in smaller residual variance), even if its pointwise errors are slightly larger. Conversely, a model with lower RMSE may generate predictions that are closer to the mean, thereby reducing absolute errors but performing less effectively in explaining variance.

Regarding Figures 2 and 3, in the revised manuscript, all figures have been updated to be fully self-contained, with clear titles, labeled axes, legends, and detailed descriptions.

Lastly, we have expanded the discussion of results by engaging more closely with related studies. Specifically, in Section 4.3, we reference recent research such as Chen et al. (2024), which indicates that Transformer models often outperform LSTM in economic forecasting tasks. The fact that our PAA-LSTM model outperforms the Transformer in certain countries - such as Vietnam -is attributed to the advantage of phase-aware attention in smaller, highly volatile economies. By contrast, Transformer models tend to perform better on large-scale, high-frequency, and structurally stable datasets. This perspective has been added to the manuscript to better contextualize our findings within the broader literature.
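The divergence between R² and RMSE described above is easiest to see when R² is computed as a squared correlation rather than against residual variance on the same test set. A small constructed example (illustrative numbers only, not from the paper):

```python
def rmse(actual, pred):
    """Root mean squared error: penalizes pointwise level errors."""
    return (sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual)) ** 0.5

def r2_corr(actual, pred):
    """R² as squared Pearson correlation: rewards tracking the trend,
    even when the forecast is biased in level."""
    mx, my = sum(actual) / len(actual), sum(pred) / len(pred)
    cov = sum((a - mx) * (p - my) for a, p in zip(actual, pred))
    vx = sum((a - mx) ** 2 for a in actual)
    vy = sum((p - my) ** 2 for p in pred)
    return (cov / (vx * vy) ** 0.5) ** 2

actual = [1.0, 2.0, 3.0, 4.0]
trend_follower = [2.0, 4.0, 6.0, 8.0]  # perfectly correlated, but biased in level
near_mean = [2.4, 2.4, 2.6, 2.6]       # small pointwise errors, weak trend tracking
```

Here `trend_follower` attains the higher correlation-based R² (1.0 vs. 0.8) despite the much larger RMSE (about 2.74 vs. about 1.03), reproducing the higher-R²-but-worse-RMSE pattern the reviewer flagged.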

Response to Minor Issues

Comment 7: Organization

The flow of sections can be improved. The Literature Review lists many related works but should more directly contrast them with the present paper. For instance, mention explicitly that previous deep-GDP studies (e.g. Chen et al., 2024; Xie et al., 2024) have not used cyclical attention, highlighting the novelty here. In the Methods section, a figure illustrating the PAA-LSTM architecture would help readers follow the description

Response to comment 7

In the revised manuscript, we have improved the flow of the Related Work section by directly contrasting our proposed approach with prior deep learning models for GDP forecasting.

We have added Figure 1 in Section 3.1, which illustrates the architecture of the PAA-LSTM model, including its LSTM layers and phase-adaptive attention modules.

Comment 8: Formatting and Presentation

Ensure consistency in table and figure formatting. For example, Table 1 lacks units for RMSE/MAE (are these percentage points of GDP growth?). All figures should be high enough resolution to be readable; captions should summarize the key message. The bibliography should follow the Journal of Finance style (though citations are shown in the text as required by the submission guidelines).

Response to comment 8

In response:

  • We have clarified in Table 1 and all other relevant tables that RMSE and MAE are expressed in percentage points of GDP growth.
  • All figures have been regenerated in higher resolution to ensure readability in both online and print formats.
  • We have revised all figure captions to clearly summarize the main insight or takeaway from each figure, making them self-contained and more informative to the reader.

We acknowledge this point and have reviewed and revised the entire bibliography, including correct formatting of author names, publication details, punctuation, and ordering.

Comment 9: Reference to Related Work

The paper would benefit from citing a few recent surveys or papers on machine learning, to place the contribution in context. For example, recent work Lesmi et al., 2024 has shown that the number of years of schooling required to comprehend financial reporting increases by nearly one month each year, indicating a growing inaccessibility of financial reports for a substantial portion of the population. ​Citing such work (even if from fields like applied AI in finance) would strengthen the introduction.

Response to comment 9

In the revised manuscript, we have cited several recent survey and application-oriented studies in the field of machine learning, particularly those relevant to time-series forecasting and economic applications. These citations help situate our contribution within the broader development of machine learning methods in macroeconomic modeling.

Once again, we sincerely appreciate the reviewer’s detailed and constructive critique. Your suggestions have significantly improved the quality, clarity, and robustness of our manuscript.

Best regards,

Lan Dong Thi Ngoc (on behalf of all authors)

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Suggestions were responded to by the authors.

Author Response

Response to Reviewer 1

We sincerely thank the reviewer for the positive evaluation and for acknowledging the improvements made in the revised manuscript. We are pleased to know that the research design, methodology, results, and conclusions were considered appropriate and clearly presented. In response to the suggestion that the introduction could be improved, we have carefully revised this section to enhance its clarity, coherence, and academic contribution.

Thank you again for your thoughtful feedback and support.

Best regards,

Lan Dong Thi Ngoc (on behalf of all authors)

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

The data presented in Table 7 appears to replicate information already included in previous tables. Please review and remove redundant data to streamline the presentation of your results. Specifically, the results obtained for the RMSE test are repeated data that had already been presented in the previous tables.

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Paragraph Structure. Please avoid constructing paragraphs with only two sentences. In academic English writing, a well-structured paragraph typically includes at least three sentences: the first introduces the main idea, the second (and possibly third) supports it with evidence or elaboration, and the final sentence concludes or transitions to the next point. This structural issue should be addressed specifically in the following line: 432-434 (green highlight).

Author Response

Response to Reviewer 4_Round 2

We sincerely thank you for your constructive comments. We have carefully reviewed the points raised and revised the manuscript accordingly. Below are our responses to each issue:

Comment 1:

“The data presented in Table 7 appears to replicate information already included in previous tables. Please review and remove redundant data to streamline the presentation of your results. Specifically, the results obtained for the RMSE test are repeated data that had already been presented in the previous tables.”

Response:
Thank you for pointing this out. We have carefully reviewed the contents of Table 7 and agree that the RMSE values duplicated results already shown in earlier tables. We have removed Table 7 from the revised manuscript to eliminate redundancy and streamline the presentation of results.

Comment 2:

“Paragraph Structure. Please avoid constructing paragraphs with only two sentences. In academic English writing, a well-structured paragraph typically includes at least three sentences: the first introduces the main idea, the second (and possibly third) supports it with evidence or elaboration, and the final sentence concludes or transitions to the next point. This structural issue should be addressed specifically in the following line: 432-434 (green highlight).”

Response:
We appreciate the reviewer’s suggestion on improving paragraph structure. In response, we have revised the paragraph at lines 432–434. We have also reviewed the entire manuscript to ensure that no other paragraphs consist of only two sentences, and revised where necessary to improve logical flow and coherence.

Once again, we thank you for your thoughtful feedback, which has contributed to improving the quality and clarity of our manuscript.

Best regards,

Lan Dong Thi Ngoc (on behalf of all authors)

Author Response File: Author Response.pdf

Reviewer 5 Report

Comments and Suggestions for Authors

Referee Report on: "Gross Domestic Product Forecasting Using Deep Learning Models with Phase-Adaptive Attention Mechanism" (electronics-3619368R1)

 

I appreciate the opportunity to review the revised manuscript. The paper addresses an important question in macroeconomic forecasting by introducing a phase-adaptive attention mechanism within a deep learning model (PAA-LSTM). The use of economic phase labels to guide attention is an interesting idea, and the empirical evaluation across six countries adds useful breadth.

The revised version incorporates many of the suggestions raised in the earlier round, including a clearer definition of economic phases, improved benchmark comparisons, and enhanced explanations of the modeling and validation strategies. I found the topic interesting, the analysis convincing, and I agree with the conclusions. The responses to prior concerns are clear, and the revisions have improved the manuscript.

That said, a few points still require minor clarifications or adjustments:

  • While I understand the space constraints, the manuscript would benefit from at least a brief illustrative example of how attention weights vary across phases. Even a demonstration of different attention patterns in, say, recession vs. expansion, would help support the claims.
  • The claim that the model generalizes well to stable economies is plausible but should be phrased with more care in the abstract and conclusion. As the margin over simpler models narrows in developed countries, the benefit of added complexity becomes less obvious and could be acknowledged as such.

Thank you again for the revision. I have enjoyed reading this paper, and I hope it will also influence practitioners. I especially appreciate your efforts in addressing my recommendations.

 

Author Response

Response to Reviewer 5

We sincerely thank the reviewer for the thoughtful and encouraging feedback. We greatly appreciate your recognition of the improvements made in the revised manuscript and your support for the proposed PAA-LSTM model and its use of phase-adaptive attention in macroeconomic forecasting.

We address the remaining points as follows:

Comment 1:

“While I understand the space constraints, the manuscript would benefit from at least a brief illustrative example of how attention weights vary across phases. Even a demonstration of different attention patterns in, say, recession vs. expansion, would help support the claims.”

Response:

We fully agree with the reviewer’s suggestion. To address this, we have added a concise illustrative example in Section 3.2 (blue highlight), which demonstrates how attention weights differ between the recession and expansion phases. This example is based on synthetic LSTM hidden state values and shows how phase-specific parameters yield distinct attention distributions and context vectors.
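For readers of this response, the mechanism described above can be sketched in a few lines. This is a minimal illustration only, assuming a simple dot-product scoring scheme with one hypothetical weight vector per economic phase (the names `w_recession` and `w_expansion` and the synthetic hidden states are illustrative, not the manuscript's actual parameters):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def phase_attention(hidden_states, w_phase):
    """Score each timestep with a phase-specific weight vector,
    then form the context vector as the attention-weighted sum."""
    scores = hidden_states @ w_phase       # one score per timestep, shape (T,)
    weights = softmax(scores)              # attention distribution over timesteps
    context = weights @ hidden_states      # context vector, shape (d,)
    return weights, context

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 4))                # synthetic LSTM hidden states: 5 timesteps, dim 4

# Hypothetical phase-specific parameters (illustrative values only)
w_recession = rng.normal(size=4)
w_expansion = rng.normal(size=4)

w_rec, c_rec = phase_attention(H, w_recession)
w_exp, c_exp = phase_attention(H, w_expansion)
print("recession weights:", np.round(w_rec, 3))
print("expansion weights:", np.round(w_exp, 3))
```

Because the two phases use different scoring parameters, the same hidden-state sequence produces two different attention distributions and hence two different context vectors, which is the behavior the added example in Section 3.2 illustrates.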

 

Comment 2:

“The claim that the model generalizes well to stable economies is plausible but should be phrased with more care in the abstract and conclusion. As the margin over simpler models narrows in developed countries, the benefit of added complexity becomes less obvious and could be acknowledged as such.”

Response:

We appreciate this important observation and have revised both the Abstract and Conclusion accordingly (blue highlight). We now explicitly acknowledge that although the proposed model remains competitive in developed economies such as the United States and Canada, its margin over simpler models is narrower there, so the benefit of the added complexity is less pronounced.

Once again, we thank the reviewer for their valuable comments, which have helped improve both the clarity and rigor of the manuscript. We are grateful for your positive feedback and continued support.

Best regards,

Lan Dong Thi Ngoc (on behalf of all authors)

Author Response File: Author Response.pdf
