Article
Peer-Review Record

Long-Term Traffic Flow Prediction for Highways Based on STLLformer Model

Sustainability 2025, 17(22), 10078; https://doi.org/10.3390/su172210078
by Yonggang Shen 1,2,3, Lu Wang 4,5, Yuting Zeng 4,5, Zhumei Gou 4,5, Chengquan Wang 6 and Zhenwei Yu 3,*
Reviewer 1:
Reviewer 3:
Submission received: 17 September 2025 / Revised: 10 November 2025 / Accepted: 10 November 2025 / Published: 11 November 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The authors introduce STLLformer, a Transformer-based spatiotemporal model that combines STL decomposition, autocorrelation with delayed aggregation, and a low-rank graph convolutional module for long-term traffic flow forecasting. The work is timely and relevant, as long-horizon traffic prediction remains a challenging problem in intelligent transportation systems. The novelty lies in explicitly decomposing traffic into trend/seasonal/residual components and fusing them with advanced spatial modeling. Results on both a public (PEMSD8) and a proprietary (HHY) dataset show meaningful improvements over baselines.

While the approach is interesting and technically sound, the paper would benefit from clearer differentiation from related spatiotemporal decomposition models, more robust validation, and expanded discussion of practical implications.

  • The authors should more explicitly compare STLLformer with recent decomposition-based spatiotemporal models (e.g., Autoformer, ACNet, decomposition-enhanced GCNs). How does STLLformer differ beyond combining STL with low-rank GCN?

  • The contribution relative to Informer/Autoformer (which also use autocorrelation mechanisms and decomposition) is not fully clarified. What technical gaps does STLLformer specifically fill?

  • Add standard deviations/confidence intervals across multiple runs (currently only averages are shown). This would strengthen claims of consistent superiority.

  • Discuss whether the ~5–6 unit MAE reduction (e.g., 36.8 → 30.3) has practical significance—does it correspond to meaningful improvements in traffic management decisions?

  • Clarify whether error reduction is uniform across all nodes/time periods, or concentrated in peaks. This ties to equity in performance across conditions.

  • Discuss why residuals appear less influential compared to seasonal components, and whether they still matter in short-term horizons.

  • Consider a sensitivity test of STL parameters (window length, Loess smoothing) since decomposition quality can drive outcomes.

  • Report statistical significance of performance gains (e.g., t-tests, confidence intervals) to support the claim of 10–20% error reduction.
  • In Figures 3–5, add confidence bands or multiple-day averages instead of single-node snapshots, which may exaggerate performance differences.

Author Response

We sincerely thank the reviewer for the positive assessment of our work's timeliness and novelty, and for the exceptionally insightful and constructive suggestions that have significantly helped us enhance the impact and clarity of our manuscript. We are truly grateful for the thorough and thoughtful review.

Comment 1.1: The authors should more explicitly compare STLLformer with recent decomposition-based spatiotemporal models (e.g., Autoformer, ACNet, decomposition-enhanced GCNs). How does STLLformer differ beyond combining STL with low-rank GCN? The contribution relative to Informer/Autoformer (which also use autocorrelation mechanisms and decomposition) is not fully clarified. What technical gaps does STLLformer specifically fill?

Response to Comment 1.1: We sincerely thank the reviewer for this critical and insightful suggestion. We agree that a clearer differentiation from existing decomposition-based models is essential to highlight our contributions. To address this point thoroughly, we have made the following major revisions to the manuscript:

  1. Added a Methodological Comparison Table: We have introduced a new Table 1 in the introduction (Section 1.1, please see Pages 2-3, Lines 94-99 of the revised manuscript). This table provides a systematic comparison of key spatiotemporal forecasting models, including STGCN, LSTGCN, Informer, Autoformer, and our proposed STLLformer. It explicitly contrasts their approaches to Temporal Modeling, Spatial Modeling, and Decomposition Strategy, and summarizes their Key Limitations.
  2. Explicitly Defined Technical Gaps (C1-C3): Based on the analysis in Table 1, we now explicitly frame our work around addressing three core challenges (C1, C2, C3) faced by existing models when applied to traffic forecasting. This structured framing directly pinpoints the technical gaps STLLformer is designed to fill. This discussion can be found in Section 1.1, Page 2, Lines 80-93 and is visually summarized in Table 1.
  • (C1) Monolithic Sequence Modeling: Most models (e.g., STGCN, LSTGCN) treat traffic flow holistically, failing to leverage the distinct characteristics of its internal components (trend, seasonality, residual).
  • (C2) Inadequate Seasonal Modeling: While Autoformer introduces decomposition, it processes the components through a unified series decomposition block. This "mixed" processing lacks a dedicated, dynamic mechanism to purely capture and model the seasonal (cyclical) patterns, which are paramount in traffic data.
  • (C3) Limited Long-range Spatial Modeling: Models like Informer and Autoformer primarily focus on temporal modeling and have underdeveloped spatial modeling capabilities (often basic or none). They struggle to capture the non-uniform, long-range spatial dependencies prevalent in road networks.
  3. Clarified Specific Differentiators: Our responses to these gaps are the core differentiators of STLLformer, which are now clearly stated in the new Section 1.2 "Our Contributions" on Page 3, Lines 100-116:
  • To address C1 and C2: Unlike Autoformer's unified block, STLLformer features a "seasonal-dominated, multi-component collaborative forecasting paradigm". This is implemented via a component-specific encoder-decoder architecture where the seasonal component is processed separately by the encoder, and all three components are modeled in parallel within the decoder. This enables targeted feature extraction. Furthermore, our seasonal-driven autocorrelation mechanism operates exclusively on the purified seasonal subsequence, effectively filtering out interference from trend and noise to provide a more dynamic and precise capture of cyclical traffic patterns.
  • To address C3: STLLformer integrates a dedicated Low-rank Graph Convolutional Network (LGCN) module, which is specifically designed to capture dynamic, long-range spatial dependencies through low-dimensional reconstruction and dynamic filtering of the graph structure. This fills a key technical gap left by models like Autoformer and Informer in the spatial dimension.

In summary, STLLformer is not merely a combination of STL and GCN. It is a holistic architecture designed from the ground up to address the specific multi-scale temporal and complex spatial characteristics of traffic flow, moving beyond the more generic sequence modeling approach of Autoformer/Informer. We believe these clarifications, now integrated into the manuscript, much more sharply define our novel contributions.
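To make the low-rank spatial idea concrete for readers, the following is a minimal sketch of a low-rank graph convolution. It is not the paper's actual LGCN implementation: the factor names E1/E2, the rank, and the row-wise softmax used as a "dynamic filter" are illustrative assumptions. The adjacency is reconstructed from two low-dimensional node-embedding factors and used for one propagation step.

```python
import numpy as np

rng = np.random.default_rng(0)

def low_rank_graph_conv(x, E1, E2, W):
    # x: (num_nodes, in_dim); E1, E2: (num_nodes, rank); W: (in_dim, out_dim)
    # Reconstruct a dense adjacency from the two low-rank factors.
    logits = E1 @ E2.T
    # Row-wise softmax acts as a dynamic filter, emphasising the strongest
    # (possibly long-range) links for each node.
    adj = np.exp(logits - logits.max(axis=-1, keepdims=True))
    adj /= adj.sum(axis=-1, keepdims=True)
    # Propagate node features over the reconstructed graph, then project + ReLU.
    return np.maximum(adj @ x @ W, 0.0)

N, D_in, D_out, r = 170, 16, 32, 8  # e.g., 170 sensors as in PEMSD8
out = low_rank_graph_conv(rng.normal(size=(N, D_in)),
                          rng.normal(size=(N, r)),
                          rng.normal(size=(N, r)),
                          rng.normal(size=(D_in, D_out)))
print(out.shape)  # (170, 32)
```

The low-rank factorisation keeps the learnable adjacency at O(N·r) parameters instead of O(N²), which is what makes dense, network-wide spatial interactions tractable.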

Comment 1.2: Add standard deviations/confidence intervals across multiple runs (currently only averages are shown). This would strengthen claims of consistent superiority. Report statistical significance of performance gains (e.g., t-tests, confidence intervals) to support the claim of 10–20% error reduction.

Response to Comment 1.2: We thank the reviewer for this crucial suggestion regarding the robustness of our experimental findings. We fully agree that demonstrating statistical significance is paramount. In response, we have significantly strengthened our experimental validation as follows:

  1. Added Results from Multiple Runs with Standard Deviations: We have retrained and evaluated all models for the 6-hour prediction task on the HHY dataset over 10 independent runs with different random seeds. The results are now presented in the updated Table 3 (Page 11). The performance metrics for the 6-hour prediction on HHY are reported as mean ± standard deviation. For instance, the MAE for LSTGCN is now 38.818 ± 0.98, compared to STLLformer's 30.665 ± 1.11. This addition is explicitly mentioned in the table note and in the main text in Section 3.1, Page 9, Lines 349-354: "To ensure robust and statistically significant comparisons, all models for the 6-hour prediction task on the HHY dataset were trained and evaluated over 10 independent runs with different random seeds. We report the mean ± standard deviation for these configurations in Table 3."
  2. Performed Statistical Significance Tests against the Strongest Baseline: To quantitatively validate that the improvements over the most competitive baseline are not due to random variation, we conducted paired t-tests between STLLformer and the strongest baseline, LSTGCN, based on the results from the 10 independent runs. We have now reported the outcome of these tests in Section 3.3, Page 11, Lines 406-407: "The improvements are statistically significant (p < 0.01 in paired t-tests against the strongest baseline, LSTGCN)." Focusing the statistical comparison on the strongest competitor provides the most rigorous test of our model's advantage. Furthermore, we elaborate on the practical meaning of this improvement on Page 12, Lines 407-411: "Furthermore, the MAE reduction from 38.82 (± 0.98) of LSTGCN to 30.67 (± 1.11) of STLLformer on HHY for 6-hour prediction translates to over 20% error reduction. In practical terms, this level of improvement in long-term forecasting accuracy can significantly enhance the reliability of traffic management systems for proactive congestion control and dynamic route guidance."

These revisions provide strong statistical evidence for the consistent superiority of STLLformer, moving beyond average values to demonstrate both the stability (low standard deviation) and the statistical significance of its performance gains over the most relevant benchmark.
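A minimal sketch of the evaluation protocol described above (mean ± standard deviation over seeded runs, then a paired t-test) is shown below. The per-run MAE values here are synthetic stand-ins generated to resemble the reported magnitudes, not the paper's actual per-run results.

```python
import numpy as np
from scipy import stats

# Hypothetical per-run MAE values for 10 random seeds, shaped to resemble
# the reported means/stds (LSTGCN ~38.8, STLLformer ~30.7); NOT real data.
rng = np.random.default_rng(42)
mae_baseline = rng.normal(loc=38.8, scale=0.98, size=10)
mae_proposed = rng.normal(loc=30.7, scale=1.11, size=10)

print(f"baseline: {mae_baseline.mean():.2f} ± {mae_baseline.std(ddof=1):.2f}")
print(f"proposed: {mae_proposed.mean():.2f} ± {mae_proposed.std(ddof=1):.2f}")

# Paired t-test: runs are matched per seed, so a paired (related-samples)
# test is the appropriate choice rather than an independent-samples test.
t_stat, p_value = stats.ttest_rel(mae_baseline, mae_proposed)
print(f"t = {t_stat:.2f}, p = {p_value:.4g}")
```

With a gap of roughly eight MAE units against run-to-run noise of about one unit, the paired test yields p far below 0.01, which is the kind of evidence the revised Table 3 now provides.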

Comment 1.3: Discuss whether the ~5–6 unit MAE reduction (e.g., 36.8 → 30.3) has practical significance—does it correspond to meaningful improvements in traffic management decisions? Clarify whether error reduction is uniform across all nodes/time periods, or concentrated in peaks. This ties to equity in performance across conditions. Discuss why residuals appear less influential compared to seasonal components, and whether they still matter in short-term horizons.

Response to Comment 1.3: We thank the reviewer for these profound questions that touch upon the practical utility and intrinsic mechanics of our model. We have addressed each of these points in depth within the revised manuscript.

  1. Practical Significance of Error Reduction: We have expanded our discussion on the practical implications of the performance improvement. In Section 3.3, Page 12, Lines 409-411, we state: "In practical terms, this level of improvement in long-term forecasting accuracy can significantly enhance the reliability of traffic management systems for proactive congestion control and dynamic route guidance." Furthermore, in the Discussion section (Section 3.5, Page 15, Lines 518-524), we elaborate: "From a practical standpoint, the observed performance profile—characterized by robust overall accuracy and reliable peak-period forecasting—holds significant value for traffic management systems. The model’s ability to provide stable, hours-ahead predictions enables a shift from reactive to proactive traffic control. For instance, accurate forecasts of impending peak-hour volumes can inform dynamic speed limit adjustments and pre-emptive congestion mitigation strategies, while reliable long-term trend predictions can optimize infrastructure maintenance scheduling." This clarifies that the absolute error reduction translates into tangible decision-support capabilities.
  2. Non-Uniform Error Reduction and Peak Performance (Equity): This is a critical insight. We have added a dedicated new analysis in Section 3.3.1, titled "Temporal Error Distribution and Model Stability" (Page 13, Lines 445-471), complete with a new Figure 6.
  • We explicitly confirm that the error reduction is not uniform. We analyze a high-traffic node and report: "A critical observation is the non-uniform distribution of error: the MAE during peak hours (102.52 vehicles) is 77.8% higher than during off-peak periods (57.66 vehicles)."
  • However, we provide crucial context: "This disparity, however, is contextualized by the fact that peak traffic volume at this node is approximately 2.5 times greater than off-peak volume. Consequently, the model's relative accuracy (MAPE) remains robust across both conditions."
  • We conclude that the elevated absolute error during peaks is a reflection of the inherent forecasting difficulty in high-density traffic, not a model failure, and underscore the model's utility precisely when it is needed most: "This performance profile underscores the model's practical utility, as it delivers reliable and interpretable forecasts under the most demanding and consequential traffic states, where accurate forecasting is paramount for effective management."
  3. Role of the Residual Component: We have clarified the distinct roles of all three components. In Section 2.1 (Page 4, Lines 161-164), we now state: "Notably, while the residual component may be less influential in long-term forecasting where periodic patterns dominate, it plays a crucial role in capturing short-term, anomalous fluctuations, thereby enhancing the model's robustness." This point is reinforced in the Discussion section (Section 3.5, Page 15, Lines 512-517), where we explain: "Conversely, while the residual component’s influence is less pronounced in long-horizon predictions, its role is crucial for capturing transient, non-periodic fluctuations. This delineation suggests that the residual component would become increasingly critical in short-term forecasting scenarios or when modeling the impact of unexpected events, offering a clear pathway for model adaptation."

In summary, these additions demonstrate that STLLformer's performance gains are not only statistically significant but also practically meaningful, providing reliable forecasts during critical peak periods and a clear architectural rationale for the role of each decomposed component.
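The peak vs. off-peak argument above (higher absolute MAE but stable relative MAPE when volumes scale up) can be illustrated with a small synthetic example. All numbers below are toy values, not the node data from the paper; they assume errors that scale roughly with traffic volume.

```python
import numpy as np

# Synthetic illustration: peak volumes ~2.5x off-peak (as at the analysed
# node), with prediction errors proportional to volume.
rng = np.random.default_rng(7)
off_peak_true = rng.uniform(150, 250, size=100)
peak_true = off_peak_true * 2.5
off_peak_pred = off_peak_true * (1 + rng.normal(0, 0.1, size=100))
peak_pred = peak_true * (1 + rng.normal(0, 0.1, size=100))

def mae(y, yhat):
    return np.mean(np.abs(y - yhat))

def mape(y, yhat):
    return np.mean(np.abs((y - yhat) / y)) * 100

print(f"off-peak MAE={mae(off_peak_true, off_peak_pred):.1f}, "
      f"MAPE={mape(off_peak_true, off_peak_pred):.1f}%")
print(f"peak     MAE={mae(peak_true, peak_pred):.1f}, "
      f"MAPE={mape(peak_true, peak_pred):.1f}%")
# Absolute MAE grows with volume, while relative MAPE stays comparable.
```

This is exactly the pattern reported in the new Section 3.3.1: elevated peak-hour MAE is an artefact of higher volumes, not degraded relative accuracy.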

Comment 1.4: Consider a sensitivity test of STL parameters (window length, Loess smoothing) since decomposition quality can drive outcomes. In Figures 3-5, add confidence bands or multiple-day averages instead of single-node snapshots, which may exaggerate performance differences.

Response to Comment 1.4: We appreciate the reviewer's suggestions regarding the robustness of our methodology and visualizations. We have addressed these points as follows:

  1. Sensitivity of STL Parameters: We thank the reviewer for raising this important methodological point. The STL decomposition is a core component of our framework, and its parameters can indeed influence the results. In our implementation, we leverage domain knowledge of traffic flow to set the key parameter—the period length—in a principled way, thereby avoiding the need for extensive and potentially overfitting hyperparameter tuning. We have added the following clarification in Section 2.1, Page 5, Lines 178-182: "The seasonality window size in STL is set based on the fundamental periodicity of traffic flow (e.g., 288 time steps for a daily cycle with 5-minute intervals), leveraging domain knowledge to ensure meaningful decomposition without introducing additional hyperparameters that require extensive tuning." This approach ensures that the decomposition is both physically meaningful and computationally efficient for our task.
  2. Enhancement of Figures with Confidence Intervals: We agree with the reviewer that demonstrating model stability in the figures is crucial. In direct response to this comment, we have enhanced our visual analysis by creating a new Figure 6 (Page 14). This figure is specifically dedicated to "Traffic prediction with error magnitude visualization" for a representative high-flow node over a 48-hour period. As described in Section 3.3.1, Page 13, Lines 449-453, this new figure "integrates the ground truth, STLLformer predictions with ±1σ confidence intervals, and the instantaneous prediction errors." The inclusion of these confidence intervals visually demonstrates the stability of our predictions across different training runs, addressing the reviewer's concern directly. Regarding Figures 3-5, we have retained them as illustrative single-node snapshots to clearly showcase the model's ability to capture specific traffic patterns (like bimodal peaks in Fig. 3) and comparative performance at specific locations (Figs. 4 & 5). We believe that the combination of the new, statistically robust Figure 6 and the clear illustrative examples in Figures 3-5 provides a comprehensive visual validation of our model's performance.
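To illustrate the period-driven decomposition described in point 1, the following is a simplified sketch standing in for full STL (moving-average trend and period-averaged seasonal term rather than Loess smoothing). The period of 288 steps encodes the daily cycle at 5-minute intervals, mirroring how the window is fixed from domain knowledge.

```python
import numpy as np

def decompose(series: np.ndarray, period: int = 288):
    """Simplified trend/seasonal/residual split (stand-in for STL)."""
    # Trend: centred moving average over one full period.
    kernel = np.ones(period) / period
    trend = np.convolve(series, kernel, mode="same")
    detrended = series - trend
    # Seasonal: mean of each within-period phase, tiled over the series.
    phases = np.arange(series.size) % period
    seasonal = np.array([detrended[phases == p].mean()
                         for p in range(period)])[phases]
    # Residual: whatever neither trend nor seasonal explains.
    residual = series - trend - seasonal
    return trend, seasonal, residual

t = np.arange(288 * 7)                                    # one week of 5-min steps
flow = 200 + 80 * np.sin(2 * np.pi * t / 288) + 0.01 * t  # daily cycle + drift
trend, seasonal, residual = decompose(flow)
# The three components reconstruct the original series exactly.
print(np.allclose(trend + seasonal + residual, flow))  # True
```

Because the period is dictated by the physical daily cycle, the decomposition introduces no extra hyperparameter to tune, which is the point made in Section 2.1 of the revision.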

In summary, we are deeply grateful for the reviewer's thorough and constructive feedback. The comments have been invaluable in guiding us to substantially improve the manuscript, particularly in sharpening the definition of our contributions, strengthening the statistical evidence for our claims, and enriching the discussion on the practical implications and intrinsic behavior of our model. We believe the revised manuscript now more effectively communicates the novelty, robustness, and utility of the STLLformer framework.

Reviewer 2 Report

Comments and Suggestions for Authors

This paper is entitled “Long-term traffic flow prediction for highways based on STLLformer model”. The idea and results of the paper are interesting, but the following comments can be utilized to improve this paper in future revisions.

 

Abstract

  • The datasets used for validation are not named, and the 10% improvement is mentioned only once without confidence context. Indicate the type or source of datasets (e.g., “two large-scale highway datasets”).
  • If space permits, clarify whether the 10% reduction is an average or peak improvement.
  • The final sentence on “other spatiotemporal prediction domains” is useful but a bit generic. End with a stronger, application-oriented statement, such as implications for traffic management, congestion mitigation, or intelligent transport planning.

 

General Comment:

  • Since the literature review and introduction are combined, more related research must be provided and described.
  • Provide a Table at the end of section 1 and make a comparison for their results, findings and methodology.

 

Results:

  • Use clear subheadings and short summary paragraphs at the end of each subsection (e.g., key findings of sensitivity analysis, ablation study). Remove redundant methodological details and keep the focus on interpretation of findings.
  • While numerical improvements (e.g., 20% lower MAE than LSTGCN) are stated, statistical significance (e.g., p-values, confidence intervals) is missing. Little is said about typical error patterns (e.g., during peak periods, adverse weather, or unexpected events). Include simple significance tests or confidence ranges to show that improvements are not due to random variation.
  • Add brief commentary on specific time periods or traffic situations where STLLformer excels or struggles.
  • The “discussions” subsection (3.5) repeats parts of the Introduction and Methodology.
  • The narrative describes “why” the model is better (e.g., seasonal component importance) but could relate more directly to real-world implications (e.g., highway congestion management, scheduling maintenance). Streamline 3.5 to emphasize practical meaning: how the 10% error reduction can help traffic agencies in long-term planning.
  • Briefly mention scalability or deployment considerations (e.g., computational requirements, data availability).

 

Final decision: This manuscript has interesting objectives; however, it needs Major correction.

Author Response

We thank the reviewer for their positive assessment of our work's interesting objectives and results, and for the constructive comments provided to enhance the paper's clarity and impact. We are truly grateful for your detailed and practical suggestions, which have helped us present a more compelling and well-rounded study.

Comment 2.1: The datasets used for validation are not named, and the 10% improvement is mentioned only once without confidence context. Indicate the type or source of datasets (e.g., "two large-scale highway datasets"). If space permits, clarify whether the 10% reduction is an average or peak improvement. The final sentence on "other spatiotemporal prediction domains" is useful but a bit generic. End with a stronger, application-oriented statement, such as implications for traffic management, congestion mitigation, or intelligent transport planning.

Response to Comment 2.1: We sincerely thank the reviewer for these precise suggestions to improve the abstract's specificity and impact. We have thoroughly revised the abstract to address all these points, as reflected in the new abstract on Page 1, Lines 13-32:

  1. Dataset Naming: We now explicitly name the datasets used for validation. The revised abstract states (Page 1, Lines 26-27): "Experiments on two real-world traffic datasets (PEMSD8 and HHY) demonstrate that STLLformer outperforms strong baseline methods..." This provides clear context for our validation setup.
  2. Context for Performance Improvement: We have replaced the generic mention of "over 10% improvement" with a more concrete and informative statement. The abstract now reads (Page 1, Lines 28-30): "...achieving an average improvement of over 10% in MAE and RMSE (e.g., on PEMSD8 for 6-hour prediction, MAE drops from 36.87 to 30.34), with statistical significance (p < 0.01)." This clarifies that the improvement is an average across metrics and datasets, provides a specific example for clarity, and adds crucial statistical significance context.
  3. Stronger, Application-Oriented Conclusion: Following the reviewer's excellent suggestion, we have replaced the generic final sentence with a statement that underscores the practical utility of our work. The abstract now concludes (Page 1, Lines 30-32): "This work provides a more refined and effective decomposition-fusion solution for traffic forecasting, which holds practical promise for enhancing urban traffic management and alleviating congestion." This directly links our methodological contribution to tangible benefits in the application domain.

These changes have made the abstract more specific, credible, and compelling, better reflecting the substance and significance of our work.

Comment 2.2: Since the literature review and introduction are combined, more related research must be provided and described about them. Provide a Table at the end of section 1 and make a comparison for their results, findings and methodology.

Response to Comment 2.2: We thank the reviewer for this suggestion to strengthen the literature review and provide a clearer comparative framework. We have significantly expanded the related work section and, in direct response to this comment and a similar one from Reviewer 1, we have added a comprehensive comparison table.

  1. Expanded Discussion of Related Research: We have substantially expanded Section 1.1 (Page 2, Lines 49-100), "Related Work and Challenges". We now provide a more detailed discussion of recent decomposition-based spatiotemporal models, including Autoformer [21] and ACNet [22], and explicitly discuss their limitations when directly applied to traffic forecasting. For instance, we added the text (Page 2, Lines 69-80): "Recently, several studies have integrated series decomposition into deep spatiotemporal forecasting models. For instance, Autoformer [21] introduced a decomposition architecture with auto-correlation mechanism... However, applying them directly to traffic flow forecasting poses limitations. Firstly, they typically process the decomposed subsequences in a mixed manner, lacking dedicated modeling strategies for the trend, seasonal, and residual components which possess distinct physical meanings [23]. Secondly, their primary focus lies on temporal modeling, and their approach to spatial dependency capture often relies on standard graph convolutional networks, which struggle to effectively model the prevalent long-range spatial interactions in road networks [24]."
  2. Added a Methodological Comparison Table: As a central improvement, we have included a new Table 1 at the end of Section 1.1 (Page 2-3). The caption of the table reads: "Methodological comparison of spatiotemporal forecasting models." This table systematically compares key models (STGCN, LSTGCN, Informer, Autoformer, and our STLLformer) across four dimensions: Temporal Modeling, Spatial Modeling, Decomposition Strategy, and Key Limitation. This table provides a clear, at-a-glance overview of the methodological landscape and precisely pinpoints the gaps (C1-C3) that our work aims to fill, effectively setting the stage for our contributions. The introduction and analysis of this table can be found in the text leading up to it on Page 2, Lines 81-99.

These revisions provide a more thorough literature review and a powerful visual aid that clearly articulates the positioning and novelty of our STLLformer model within the existing research field.

Comment 2.3: Use clear subheadings and short summary paragraphs at the end of each subsection (e.g., key findings of sensitivity analysis, ablation study). Remove redundant methodological details and keep the focus on interpretation of findings. While numerical improvements (e.g., 20% lower MAE than LSTGCN) are stated, statistical significance (e.g., p-values, confidence intervals) is missing. Little is said about typical error patterns (e.g., during peak periods, adverse weather, or unexpected events). Include simple significance tests or confidence ranges to show that improvements are not due to random variation. Add brief commentary on specific time periods or traffic situations where STLLformer excels or struggles.

Response to Comment 2.3: We are grateful to the reviewer for these comprehensive suggestions to enhance the rigor, clarity, and depth of our Results section. We have implemented extensive changes to address these points.

  1. Clearer Subheadings and Summary Paragraphs: We have reorganized Section 3 (Results) with clearer subheadings to improve the flow and readability:
  • 3.1 Benchmark methods and experimental setup
  • 3.2 Sensitivity analysis
  • 3.3 Comparative experiments (this section now includes the new "Temporal Error Distribution and Model Stability" analysis)
  • 3.4 Ablation study
  • 3.5 Discussions

Furthermore, we have added concise summary sentences at the end of key subsections. For example, at the end of Section 3.2 (Page 11, Lines 392-394), we conclude: "In summary, the sensitivity analysis confirms that a multi-variable input configuration and a 4-head attention mechanism are optimal choices for STLLformer, contributing to its stable and accurate performance." A similar summary is provided for the ablation study in Section 3.4 (Page 15, Lines 497-500).

  2. Removal of Redundant Methodological Details: We have streamlined the text in the Results section by removing several methodological explanations that were already covered in Section 2, shifting the focus squarely onto the interpretation of the experimental outcomes.
  3. Addition of Statistical Significance and Confidence Intervals: As also requested by Reviewer 1, we have added robust statistical validation. This includes:
  • Reporting results as mean ± standard deviation from 10 independent runs for the 6-hour prediction task on the HHY dataset in Table 3 (Page 11).
  • Explicitly stating the performance of statistical tests in Section 3.3 (Page 11, Lines 406-409): "The improvements are statistically significant (p < 0.01 in paired t-tests against the strongest baseline, LSTGCN)."
  • Adding a dedicated new analysis in Section 3.3, titled "Temporal Error Distribution and Model Stability" (Page 11-13, Lines 395-444), which includes a new Figure 6. This analysis directly addresses the comment about error patterns by examining performance during peak vs. off-peak periods. We report that "the MAE during peak hours (102.52 vehicles) is 77.8% higher than during off-peak periods (57.66 vehicles)" but crucially contextualize this by noting the much higher traffic volume, leading to a stable relative accuracy (MAPE). This provides a nuanced commentary on when and why the model's absolute error varies, highlighting its reliability under critical high-demand conditions.

These comprehensive revisions ensure that our results are not only presented with clear structure but are also backed by statistical evidence and deeper insights into the model's operational characteristics.

Comment 2.4: The "discussions" subsection (3.5) repeats parts of the Introduction and Methodology. The narrative describes "why" the model is better (e.g., seasonal component importance) but could relate more directly to real-world implications (e.g., highway congestion management, scheduling maintenance). Streamline 3.5 to emphasize practical meaning: how the 10% error reduction can help traffic agencies in long-term planning.

Response to Comment 2.4: We thank the reviewer for this critical observation. We have completely restructured and rewritten Section 3.5 "discussions" (Page 15, Lines 501-527) to eliminate repetition and strongly emphasize the practical implications of our work.

  1. Streamlined and Refocused Content: We removed the paragraphs that merely restated the methodology or introduction. The section now opens by directly synthesizing the experimental findings and then pivots to their interpretation and significance.
  2. Enhanced Practical Implications: As suggested, we now explicitly link the model's performance to real-world traffic management applications. We have added text that states (Page 15, Lines 518-524): "From a practical standpoint, the observed performance profile—characterized by robust overall accuracy and reliable peak-period forecasting—holds significant value for traffic management systems. The model’s ability to provide stable, hours-ahead predictions enables a shift from reactive to proactive traffic control. For instance, accurate forecasts of impending peak-hour volumes can inform dynamic speed limit adjustments and pre-emptive congestion mitigation strategies, while reliable long-term trend predictions can optimize infrastructure maintenance scheduling." This directly addresses the reviewer's point by explaining how the improved forecasting capability translates into actionable insights for traffic agencies.

Comment 2.5: Briefly mention scalability or deployment considerations (e.g., computational requirements, data availability).

Response to Comment 2.5: We appreciate the reviewer's suggestion to touch upon these practical aspects.

  1. Data Availability: In accordance with the journal's guidelines, we have included a "Data Availability Statement" in the manuscript (Page 16). It clearly specifies the public access to the PEMSD8 dataset and the restricted, authorization-required nature of the HHY dataset, ensuring full transparency.

  2. Operational Reliability: Regarding deployment considerations, we have added a discussion on the model's stability in the revised Section 3.5 (Page 15, Lines 524-527): "The fact that the model maintains reasonable relative error (MAPE) even during high-volume, high-variability peak hours underscores its operational reliability when it is needed most." This characteristic of maintaining robust performance under the most demanding conditions is a key factor for its potential deployment in real-world traffic management systems.

Reviewer 3 Report

Comments and Suggestions for Authors

The paper deals with important topics in traffic flow prediction. The authors have reported evaluating the self-attention mechanism for adaptive feature correlation selection across subsequences.

However, I have a number of suggestions:     

  1. Please, reinforce the introduction section with clearer contributions. I would suggest preparing in a point-based structure. (They are partially mentioned in conclusion, but still I suggest reinforcing the introduction section)
  2. I would suggest making the abstract more specific, mentioning outcomes, limitations, and key benefits of the provided approach. And please, highlight metrics and obtained results.
  3. Please double-check the references; some of them are outdated and unlinked. Please fix it by using papers 3-5 years old in high-impact journals.
  4. I would suggest explaining why you select MAPE and RMSE as a loss function.

Author Response

We thank the reviewer for their valuable suggestions aimed at strengthening the presentation of our work, particularly in clarifying contributions and updating references. Your guidance has been instrumental in improving the overall quality and scholarly rigor of our manuscript, for which we are sincerely thankful.

Comment 3.1: Please, reinforce the introduction section with clearer contributions. I would suggest preparing in a point-based structure. (They are partially mentioned in conclusion, but still I suggest reinforcing the introduction section)

Response to Comment 3.1: We thank the reviewer for this excellent suggestion. We agree that a point-by-point structure in the introduction is the clearest way to present our contributions. We have now restructured the end of the introduction into a dedicated subsection, "1.2 Our Contributions" (Page 3, Lines 100-119). In this subsection, we clearly list our main contributions as follows:

  1. A Component-specific Encoder-Decoder Architecture. We design a novel architecture where the seasonal component is processed separately by the encoder, and all three components (trend, seasonal, residual) are modeled in parallel within the decoder [29,30]. This enables targeted feature extraction for each component's unique characteristics, directly addressing C1.
  2. A Seasonal-driven Autocorrelation Mechanism. This mechanism operates purely on the seasonal subsequence, filtering out interference from trend and noise to provide dynamic and precise capture of cyclical traffic patterns [31], thereby effectively addressing C2.
  3. A Low-rank Graph Convolutional Network (LGCN). This module performs low-dimensional reconstruction and dynamic filtering of the graph structure, explicitly enhancing multi-scale spatial node interaction and the modeling of long-range dependencies [32,33], filling a key technical gap (C3).

This structured presentation provides a clear roadmap for the reader and immediately highlights the novelty and structure of our work.

Comment 3.2: I would suggest making the abstract more specific, mentioning outcomes, limitations, and key benefits of the provided approach. And please, highlight metrics and obtained results.

Response to Comment 3.2: We thank the reviewer for pushing us to make the abstract more concrete and impactful. We have thoroughly revised the abstract to incorporate these suggestions, as seen in the new version on Page 1, Lines 13-32.

  1. Specific Outcomes and Metrics: We now explicitly state the key evaluation metrics and provide a concrete example of the results achieved. The abstract includes the sentence (Page 1, Lines 26-30): "Experiments on two real-world traffic datasets (PEMSD8 and HHY) demonstrate that STLLformer outperforms strong baseline methods (including LSTGCN, LSTM, and ARIMA), achieving an average improvement of over 10% in MAE and RMSE (e.g., on PEMSD8 for 6-hour prediction, MAE drops from 36.87 to 30.34), with statistical significance (p < 0.01)." This replaces a vaguer statement about performance improvement.
  2. Key Benefits Highlighted: The abstract now more clearly articulates the core methodological benefit of our approach. It introduces STLLformer as a model that "establishes a seasonal-dominated, multi-component collaborative forecasting paradigm" (Page 1, Lines 19-20) and later emphasizes that this provides a "more refined and effective decomposition-fusion solution" (Page 1, Line 30-31).
  3. Regarding Limitations: While a detailed discussion of limitations is typically reserved for the conclusion in a paper of this format, we have strengthened the conclusion section to outline future work that addresses current boundaries, which functionally acknowledges limitations. For instance, in the Conclusion (Section 4, Page 16, Lines 552-554), we state: "Future work will focus on... integrating real-time external events to improve robustness during anomalies...", which implies a current limitation in handling unexpected external events.

These changes ensure the abstract now clearly states what was achieved, how it was measured, and why the approach is beneficial.

Comment 3.3: Please double-check the references; some of them are outdated and unlinked. Please fix it by using papers 3-5 years old in high-impact journals.

Response to Comment 3.3: We sincerely thank the reviewer for this critical reminder regarding the currency and quality of our references. We have conducted a thorough review and update of the entire reference list to address this concern.

  1. Comprehensive Update: We have meticulously gone through the reference list and replaced outdated or less relevant sources with more recent publications (primarily from 2022-2025) from high-impact journals and top conferences. Examples of newly added and updated high-quality references include:
  • [4] Liu, Y.; et al. Multi-scale feature enhanced spatio-temporal learning for traffic flow forecasting. Knowl.-Based Syst. 2024. (A 2024 study on multi-scale spatio-temporal feature learning, directly relevant to our multi-component modeling approach.)
  • [8] Wang, Z.; et al. Spatio-Temporal Meta-Graph Learning for Traffic Forecasting. arXiv preprint 2025. (A very recent 2025 preprint exploring advanced graph structures for traffic forecasting.)
  • [11] Wang, Y.; et al. Spatio-temporal fusion network with multi-component decomposition for traffic flow forecasting. Appl. Soft Comput. 2025. (A 2025 journal paper that explicitly employs multi-component decomposition, strongly validating our core methodology.)
  • [18] Wang, S.; et al. Dynamic graph convolutional network with multi-head attention for traffic speed prediction. Inf. Fusion 2023. (A paper from the high-impact journal Information Fusion on dynamic graphs and attention mechanisms.)
  • [22] Liu, Y.; et al. ACENet: Adaptive correlation-enhanced network for multivariate time series forecasting. Digit. Signal Process. 2025. (A 2025 journal publication on adaptive correlation enhancement, supporting our temporal modeling strategy.)
  • [24] Liu, Y.; et al. ST-MTM: Masked Time Series Modeling with Seasonal-Trend Decomposition for Time Series Forecasting. Chaos Solitons Fract. 2025. (A concurrent 2025 study that also leverages Seasonal-Trend decomposition, highlighting the timeliness of our research direction.)
  • [26] Xu, M.; et al. Spatial-Temporal Graph Neural Networks with Frequency Adaptation for Traffic Forecasting. Eng. Appl. Artif. Intell. 2025. (Another 2025 journal article incorporating frequency-domain analysis for spatio-temporal forecasting.)

These additions ensure our reference list is current, authoritative, and highly relevant to the specific technical contributions of our work.

  2. Proper Linking and Formatting: We have ensured that all references are now correctly cited in the text and that the corresponding entries in the reference list are complete, properly formatted, and include stable Digital Object Identifiers (DOIs) or URLs where available, as per the journal's guidelines. The entire updated reference list can be found in the References section on Pages 17-18.

Comment 3.4: I would suggest explaining why you select MAPE and RMSE as a loss function.

Response to Comment 3.4: We sincerely thank the reviewer for this insightful question, which allowed us to clarify a crucial point in our methodology. There was a potential for misunderstanding in our original description, which we have now rectified in the manuscript. It is important to distinguish between the loss function used for model training and the evaluation metrics used for performance assessment.

  1. Loss Function (For Training): The Mean Squared Error (MSE) was employed as the loss function during the model's training phase. The primary reason for this choice is that MSE penalizes larger prediction errors more heavily than smaller ones due to the squaring operation. This property provides a strong and smooth gradient signal, which facilitates more stable and efficient convergence during the gradient-based optimization process (e.g., with the Adam optimizer).
  2. Evaluation Metrics (For Performance Reporting): The Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE) were used as complementary metrics to comprehensively evaluate the model's performance on the test set after training. Each metric offers a unique perspective:
  • MAE provides a robust and intuitive measure of the average error magnitude, as it is less sensitive to outliers compared to RMSE.
  • RMSE, also derived from squared errors, is useful for emphasizing the impact of larger errors, which is critical in traffic forecasting where avoiding large misses is often important for management decisions.
  • MAPE offers a scale-independent, relative error measure, which is particularly valuable for comparing model performance across different datasets or road sections with inherently different traffic flow volumes.

We have clarified this distinction and the rationale behind the choice of metrics in Section 3.1, Page 9, Lines 340-348: "The model was trained using the Mean Squared Error (MSE) loss function... For comprehensive evaluation, we employed three complementary metrics: Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE). MAE provides a robust measure of average error magnitude. RMSE emphasizes the impact of larger errors due to its squaring operation, and MAPE offers a scale-independent relative error perspective, which is particularly useful for comparing performance across different datasets or traffic conditions with varying flow volumes [37]."
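The distinction between the MSE training loss and the three reported evaluation metrics can be made concrete with a short sketch. The `evaluate` helper and the prediction arrays below are hypothetical, chosen so that one large miss illustrates why RMSE reacts more strongly than MAE.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Compute the complementary metrics used for test-set reporting."""
    err = y_pred - y_true
    mse = np.mean(err ** 2)                     # used as the training loss: smooth gradients
    mae = np.mean(np.abs(err))                  # average error magnitude, robust to outliers
    rmse = np.sqrt(mse)                         # squaring emphasizes large misses
    mape = np.mean(np.abs(err / y_true)) * 100  # scale-independent relative error, in percent
    return mae, rmse, mape

# Hypothetical traffic volumes: the single 40-unit miss inflates RMSE far more than MAE.
y_true = np.array([100.0, 120.0, 140.0, 160.0])
y_pred = np.array([102.0, 118.0, 141.0, 120.0])

mae, rmse, mape = evaluate(y_true, y_pred)
assert rmse >= mae  # RMSE always dominates MAE; the gap grows with error variance
```

Because MAPE divides by the true values, it stays comparable across road sections with very different flow volumes, which is why it complements the absolute-scale MAE and RMSE.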

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have addressed the comments.

Author Response

Once again, thank you for your suggestion.
