Hybrid Rule-Based and Reinforcement Learning for Urban Signal Control in Developing Cities: A Systematic Literature Review and Practice Recommendations for Indonesia
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This is a well-structured systematic literature review that addresses an important gap in traffic signal control research for developing countries, particularly Indonesia. The paper demonstrates methodological rigor following PRISMA 2020 guidelines and provides practical recommendations for implementation. However, several areas require improvement to strengthen the contribution.
- Expand Literature Base: Consider a broader geographic scope or longer time window to capture more relevant studies.
- Strengthen Evidence-Recommendation Link: Create explicit mapping between systematic review findings and implementation recommendations.
- Add Economic Analysis: Include discussion of cost-effectiveness and financing mechanisms for developing country deployments.
- Validation Framework: Propose metrics and methodologies for evaluating the proposed hybrid approaches in real-world settings.
- Risk Assessment: Provide more comprehensive analysis of implementation risks and mitigation strategies.
Author Response
Comments 1: Expand Literature Base: Consider a broader geographic scope or longer time window to capture more relevant studies.
Response 1: Thank you for pointing this out. In the revised manuscript, we clarified that the review applied a broad time window (2000–2025) and covered multiple regions (Asia, Africa, Europe, and North America). This is now explicitly stated in the Search Strategy section (lines 114–116): “The search window spanned 2000–2025 and covered studies across multiple regions, including Asia, Africa, Europe, and North America, to ensure a broad geographic scope.”
Comments 2: Strengthen Evidence-Recommendation Link: Create explicit mapping between systematic review findings and implementation recommendations.
Response 2: Thank you for this valuable suggestion. We revised Section 6 (Roadmap) to explicitly map systematic review findings to recommendations. The text now states that the staged roadmap directly reflects the synthesis tables and grouped evidence. This appears around lines 815–817: “The staged roadmap presented in this section directly reflects the systematic review findings, ensuring that each recommendation is explicitly grounded in the synthesized evidence.”
Comments 3: Add Economic Analysis: Include discussion of cost-effectiveness and financing mechanisms for developing country deployments.
Response 3: We appreciate this important point. The revised version now includes a discussion of cost-effectiveness and financing strategies for developing-country deployments. This is integrated in Section 6 (Discussion/Policy), lines 877–883. The text highlights that hybrid retrofits are less capital-intensive, can leverage existing infrastructure, and may be financed via phased budgets or Public–Private Partnership (PPP) schemes. “Hybrid retrofits that reuse existing cabinet controllers and adopt camera-first sensing are substantially less capital-intensive than full adaptive replacements. Cost-effectiveness arises from leveraging existing infrastructure, deploying low-cost edge devices, and restricting online adaptation to offset tuning rather than full phase reoptimization. Financing mechanisms may include incremental upgrades through municipal budget cycles, integration with broader ITS modernization programs, and selective public–private partnerships.”
Comments 4: Validation Framework: Propose metrics and methodologies for evaluating the proposed hybrid approaches in real-world settings.
Response 4: Thank you for pointing this out. We added a clear validation framework in Future Work (lines 945–948). It proposes measurable indicators such as AoG, PCD before–after comparisons, delay distributions, and travel-time reliability, with methodology based on high-resolution logging and standardized data formats. “We propose metrics that can be logged from existing ATSPM systems, including arrivals-on-green percentages, Purdue Coordination Diagram before–after comparisons, delay distributions, and travel-time reliability.”
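To make the proposed validation metrics concrete, the following is a minimal sketch of how arrivals-on-green (AoG) could be computed from high-resolution signal logs. The `GreenInterval` class, the `arrivals_on_green` helper, and the sample timestamps are hypothetical illustrations, not code or data from the reviewed studies or any specific ATSPM system.

```python
from dataclasses import dataclass

@dataclass
class GreenInterval:
    start: float  # seconds since start of logging
    end: float

def arrivals_on_green(arrival_times: list[float],
                      greens: list[GreenInterval]) -> float:
    """Percentage of vehicle arrivals that fall inside any green interval."""
    if not arrival_times:
        return 0.0
    on_green = sum(
        1 for t in arrival_times
        if any(g.start <= t < g.end for g in greens)
    )
    return 100.0 * on_green / len(arrival_times)

# Hypothetical high-resolution log excerpt for one approach
greens = [GreenInterval(10.0, 40.0), GreenInterval(70.0, 100.0)]
arrivals = [5.0, 12.0, 35.0, 55.0, 80.0]
print(f"AoG: {arrivals_on_green(arrivals, greens):.1f}%")  # 3 of 5 -> 60.0%
```

Logging the same quantity before and after a change yields the before–after comparison the response describes; the PCD is essentially the underlying scatter of arrival time within the cycle against time of day from which this percentage is derived.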
Comments 5: Risk Assessment: Provide more comprehensive analysis of implementation risks and mitigation strategies.
Response 5: Thank you for highlighting this gap. The revised manuscript now contains a dedicated Implementation Risks paragraph (around lines 934–940). It discusses technical risks (sensor failures, communication outages), institutional risks (staffing, mandates), and operational risks (misapplied safeguards), along with mitigation strategies such as redundant communications, audit-first acceptance, and staff training. “Technical risks include unreliable sensors, backhaul outages, or synchronization errors that could disrupt corridor operations. Institutional risks arise from staffing shortages, fragmented mandates, and resistance to procedural change within traffic agencies. Operational risks exist if safeguards are misapplied, potentially creating unsafe phase transitions. Mitigation strategies include redundant communications, audit-first acceptance policies, and staff training.”
Author Response File: Author Response.docx
Reviewer 2 Report
Comments and Suggestions for Authors
Dear Authors,
Below you will find a review of this work, structured by sections:
Abstract
Please add a couple of sentences that clearly indicate the effect and direction of the summary (for example, “A, B, C, ... studies reported improved AoG or reduced delay vs. baseline”) to make the abstract’s key outcome more explicit.
Introduction
Please improve the novelty paragraph by introducing a comparison with prior RL-only surveys and stating exactly what this SLR adds (for example, operational safeguards, auditability, standards, and Indonesian implementation constraints).
Methods
Risk-of-bias or quality appraisal: ROBIS was replaced with an “operational relevance” checklist. Provide justification and a mapping to standard SLR quality domains; otherwise, the authors should add a supplemental risk-of-bias instrument.
PRISMA counts indicate Databases (n=1) vs. arXiv Registers (n=34) before de-duplication; reconcile this with the stated multi-database strategy and discuss potential bias from heavy reliance on preprints.
SWiM is appropriate in your paper, but please add a compact effect-direction table (improve/mixed/no-change) for the primary metrics with the sensitivity subsets you already defined.
Results/Evidence Synthesis
The authors should quantify the direction and prevalence of improvements (for example, count studies reporting AoG and analyze whether it increased or decreased) using the SWiM effect-direction approach and the predefined sensitivity subsets.
Create a small comparative table summarizing the hybrid patterns (rule shields, action masking, bounded variables, prerequisites) across included studies.
Conclusions
Explicitly list limitations (sensor reliability, backhaul dependence, institutional capacity).
Add a paragraph on future work (related to pilot evaluation designs, data-sharing standards).
Author Response
Comments 1: Abstract. Please add a couple of sentences that clearly indicate the effect and direction of the summary (for example, “A, B, C, ... studies reported improved AoG or reduced delay vs. baseline”) to make the abstract’s key outcome more explicit.
Response 1: Thank you for pointing this out. We revised the abstract to make the direction and effect of outcomes explicit. “Across the 18 studies, the majority reported improvements in arrivals on green, delay, travel time, or related coordination metrics compared to fixed-time or actuated baselines, while only a few showed neutral or mixed effects and very few indicated deterioration.”
Comments 2: Introduction. Please improve the novelty paragraph by introducing a comparison with prior RL-only surveys and stating exactly what this SLR adds.
Response 2: We appreciate this helpful suggestion. The novelty paragraph has been expanded by contrasting this review with RL-only surveys and clarifying the unique contributions. “To our knowledge, comprehensive reviews focusing specifically on such hybrid rule-based and MARL signal control, especially on their fit for Indonesia’s heterogeneous traffic, remain limited.”
Comments 3: Methods. Risk-of-bias or quality appraisal: ROBIS was replaced with an “operational relevance” checklist. Provide justification and a mapping to standard SLR quality domains; otherwise, the authors should add a supplemental risk-of-bias instrument.
Response 3: Thank you for raising this important point. We justified replacing ROBIS with an operational relevance checklist and mapped items to standard quality domains. “Formal tools such as ROBIS or AMSTAR-2 were considered but deemed less appropriate for engineering and transportation systems research, since these tools were developed with clinical, interventional studies in mind [18][19]. Instead, we applied an operational relevance checklist tailored to traffic-signal control studies, focusing on sensing environment, safeguard presence, simulation versus field context, and reporting of ATSPM-related metrics. Each checklist item can be mapped to standard systematic review quality domains: applicability/external validity (sparse vs dense sensing), study design adequacy (presence of safeguards), indirectness (simulation vs pilot), and reporting transparency (AoG, PCD, delay, or travel time metrics).”
Comments 4: PRISMA counts indicate Databases (n=1) vs. arXiv Registers (n=34) before de-duplication; reconcile this with the stated multi-database strategy and discuss potential bias from heavy reliance on preprints.
Response 4: We acknowledge this observation. The discrepancy in PRISMA counts has been clarified by explaining that the PRISMA template groups multiple bibliographic databases under “Databases” (n=1) and lists arXiv separately under “Registers” (n=34). We also acknowledged the potential bias from including preprints. This is now explicitly stated in the Methods – Search Strategy, lines 107–111: “To align with the PRISMA 2020 template, records were grouped under “Databases” and “Registers.” Although multiple databases were searched, only one category is shown in the PRISMA diagram under Databases (n=1), while arXiv preprints were classified under Registers (n=34). This reflects PRISMA’s reporting structure rather than a true reliance on a single database.”
“Fourth, search strategy limitations may introduce bias: despite broad coverage of six major databases, non-English studies may have been missed, and the inclusion of preprints (arXiv) may introduce uncertainty.”
Comments 5: SWiM is appropriate in your paper, but please add a compact effect-direction table (improve/mixed/no-change) for the primary metrics with the sensitivity subsets you already defined.
Response 5: We appreciate the reviewer’s suggestion. A compact effect-direction summary table has been added using the SWiM framework. “To enhance transparency, we also quantified the direction of reported effects using the SWiM effect-direction approach. Across the 18 studies, 13 reported improvements, 3 were neutral or mixed, and 2 showed deterioration. Improvements were most consistent in offset-only or bounded-variable designs, while deterioration was observed mainly when action spaces were unconstrained and safeguards were absent. Effect-direction summary. Following SWiM, we summarized the direction of effects for the primary metrics and split results by sensitivity subsets (simulation-only vs pilot/field). Table 7 provides a compact view. Table 7. Effect-direction summary by metric and sensitivity subset…”
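As a small illustration of the SWiM effect-direction bookkeeping described in this response, the sketch below tallies directions overall and by sensitivity subset. The study records shown are placeholders for format only, not the actual 18 included studies.

```python
from collections import Counter

# Placeholder per-study records: (study_id, sensitivity_subset, direction)
# direction is "improve", "mixed", or "worsen"; subset is "simulation" or "pilot"
records = [
    ("S01", "simulation", "improve"),
    ("S02", "pilot", "improve"),
    ("S03", "simulation", "mixed"),
    ("S04", "simulation", "worsen"),
]

def effect_direction_summary(records):
    """Tally effect directions overall and within each sensitivity subset."""
    overall = Counter(direction for _, _, direction in records)
    by_subset = {}
    for _, subset, direction in records:
        by_subset.setdefault(subset, Counter())[direction] += 1
    return overall, by_subset

overall, by_subset = effect_direction_summary(records)
print(overall)    # Counter({'improve': 2, 'mixed': 1, 'worsen': 1})
print(by_subset)  # {'simulation': Counter({...}), 'pilot': Counter({'improve': 1})}
```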
Comments 6: Results/Evidence Synthesis. The authors should quantify the direction and prevalence of improvements (for example, count studies reporting AoG and analyze whether it increased or decreased) using the SWiM effect-direction approach and the predefined sensitivity subsets, and create a small comparative table summarizing the hybrid patterns (rule shields, action masking, bounded variables, prerequisites) across included studies.
Response 6: Thank you for highlighting this. The revised Results section now quantifies the prevalence of improvements and includes a comparative table of hybrid safeguard patterns. “Across the 18 studies, 13 reported improvements, 3 were neutral or mixed, and 2 showed deterioration … Table 7 summarizes how plan authority, action masking, bounded variables, and prerequisites were applied.”
Author Response File: Author Response.docx
Reviewer 3 Report
Comments and Suggestions for Authors
The paper is a systematic review of hybrid traffic signal control that combines rule-based logic with reinforcement learning, with practice recommendations tailored to Indonesia’s context of mixed traffic, sensing limits, and governance needs; its contribution is mapping architectures and safeguards across 18 studies and translating them into a practical deployment checklist and roadmap for corridors in developing cities.
The authors should clearly state the paper’s exact scope and keep it consistent, because parts read like both a systematic review and a design proposal; please say plainly if this is “SLR only” or “SLR plus practice framework,” and signpost where evidence ends and where recommendations begin so readers do not get confused.
The authors should strengthen PRISMA transparency by putting the full flow diagram and checklist in the main text, giving the exact database search strings and dates, and using one consistent search window across the whole paper, so others can repeat the review without guesswork.
The authors should make inclusion and exclusion rules more concrete with simple thresholds, for example define what “sparse/camera‑first sensing” means, what “explicit safeguards” include, and when dense-detector studies are still in scope, so screening is clear and repeatable.
The authors should add a short “evidence map” table in the main paper that lists the 18 studies with setting, control lever (offset/split/cycle), safeguard type (mask/shield/bounds), metrics (AoG, PCD, delay, travel time), and whether results beat baselines, so readers see the big picture at a glance.
I suggest to the authors to separate literature findings from the proposed architecture more gently by labeling the “reference–follower, offset‑only MARL” as the authors’ recommendation, and only linking claimed benefits to specific included studies or stating clearly when it reflects expert opinion.
I suggest to the authors to be careful with real‑world claims about RL by stating which of the 18 studies are simulation‑only and which are pilots, and avoid broad statements about field operations unless each point is tied to a cited study in the evidence map.
The authors should add a simple, honest limitations paragraph in the Conclusion noting that the synthesis is narrative (no meta‑analysis), Indonesia‑specific evidence is still thin, metrics vary across studies, and search choices may introduce bias, and briefly say how this affects confidence in the checklist.
I suggest to the authors to streamline writing and reduce repetition: say the “audit‑first” and AoG/PCD points once, fix minor grammar and placeholders, and shorten long sentences in the Introduction and Methods so the guidance feels clear and friendly to city engineers.
I suggest to the authors to include one small corridor example (2–3 signals) that walks through before/after AoG/PCD and shows the exact bounded offset tweaks over a few cycles, turning the checklist into something practical people can picture using on the ground next week.
The authors should move a few key supplemental items (core parts of Tables S1–S6, screening logs, and synthesis grouping rules) into brief main text tables or figures, because readers need these to understand how the review was done and how the conclusions were formed.
Author Response
Comments 1: The authors should clearly state the paper’s exact scope and keep it consistent, because parts read like both a systematic review and a design proposal; please say plainly if this is “SLR only” or “SLR plus practice framework,” and signpost where evidence ends and where recommendations begin so readers do not get confused.
Response 1: We thank the reviewer for this suggestion. We have explicitly clarified that our paper is both a systematic literature review and a practice framework. For example, the Abstract now states (lines 34–36): “This paper is not only a systematic review but also develops a practice-oriented framework tailored to Indonesian corridors, ensuring that evidence synthesis and practical recommendations are clearly distinguished.” This statement (also reflected at the end of the Introduction) makes the dual scope clear to readers.
Comments 2: The authors should strengthen PRISMA transparency by putting the full flow diagram and checklist in the main text, giving the exact database search strings and dates, and using one consistent search window across the whole paper, so others can repeat the review without guesswork.
Response 2: We appreciate the suggestion to improve reproducibility. The revised Methods (Section 2) now explicitly note in lines 146–151: “Screening followed the PRISMA 2020 flow. After removing duplicates, titles and abstracts were screened against the eligibility rules, followed by full-text review. Records of inclusion and exclusion at each stage are documented in Supplementary Tables S2–S3, with reasons for exclusion logged. The overall process is summarized in the PRISMA 2020 flow diagram, as shown in Figure 1. In total, 18 studies met the criteria and were retained for synthesis.” We also now explicitly state in Section 2.1 (Search Strategy, lines 128–130) the exact Scopus query executed on 16 August 2025. This addition provides an exact search string with date, strengthening PRISMA transparency and showing that a consistent 2000–2025 window was applied. The exact database search strings and run dates are fully documented in Supplementary Table S1, ensuring the review process can be replicated without ambiguity.
Comments 3: The authors should make inclusion and exclusion rules more concrete with simple thresholds, for example define what “sparse/camera‑first sensing” means, what “explicit safeguards” include, and when dense-detector studies are still in scope, so screening is clear and repeatable.
Response 3: Thank you for raising this important point. We rewrote Section 2.2 Eligibility Criteria to define thresholds and added a concise inclusion/exclusion table.
Comments 4: The authors should add a short “evidence map” table in the main paper that lists the 18 studies with setting, control lever (offset/split/cycle), safeguard type (mask/shield/bounds), metrics (AoG, PCD, delay, travel time), and whether results beat baselines, so readers see the big picture at a glance.
Response 4: Thank you for this helpful suggestion. We created a concise evidence map summarizing all 18 studies in the main text.
Comments 5: I suggest to the authors to separate literature findings from the proposed architecture more gently by labeling the “reference–follower, offset‑only MARL” as the authors’ recommendation, and only linking claimed benefits to specific included studies or stating clearly when it reflects expert opinion.
Response 5: Thank you for this observation. We revised Section 4.1.2 and Section 4.2 to insert explicit transition sentences marking where literature summaries end and recommendations begin.
Comments 6: I suggest to the authors to be careful with real‑world claims about RL by stating which of the 18 studies are simulation‑only and which are pilots, and avoid broad statements about field operations unless each point is tied to a cited study in the evidence map.
Response 6: Thank you for pointing this out. To avoid overstating field evidence, we classified all studies explicitly as simulation-only or pilot studies.
Comments 7: The authors should add a simple, honest limitations paragraph in the Conclusion noting that the synthesis is narrative (no meta‑analysis), Indonesia‑specific evidence is still thin, metrics vary across studies, and search choices may introduce bias, and briefly say how this affects confidence in the checklist.
Response 7: Thank you for this important comment. We added a clear Limitations paragraph at the end of the Conclusions, now explicitly stated in Section 7 (Limitations), lines 915–917. The revised wording confirms the narrative nature of the synthesis, notes the scarcity of Indonesia-specific evidence, highlights heterogeneity of metrics, and explains potential biases, thereby addressing the reviewer’s concerns.
Comments 8: I suggest to the authors to streamline writing and reduce repetition: say the “audit‑first” and AoG/PCD points once, fix minor grammar and placeholders, and shorten long sentences in the Introduction and Methods so the guidance feels clear and friendly to city engineers.
Response 8: The revised manuscript now presents the audit-first and AoG/PCD definitions once and references them later. For example, Section 4 (Hybrid Architecture) introduces these concepts at the outset (lines 1–5), and later sections use only the shorthand (AoG/PCD) without repeating full definitions. Long sentences in the Introduction and Methods have been broken up, and placeholders were corrected. These changes confirm the requested streamlining.
Comments 9: I suggest to the authors to include one small corridor example (2–3 signals) that walks through before/after AoG/PCD and shows the exact bounded offset tweaks over a few cycles, turning the checklist into something practical people can picture using on the ground next week.
Response 9: Thank you for the constructive suggestion. We now explicitly state in the Results (lines 375–392):
“As illustrated in Figure 5, a simplified Purdue Coordination Diagram (PCD) is used to show how bounded Δoffset adjustments affect corridor progression. In the before case… AoG ≈ 48%. In the after case… raising AoG to ≈ 62%.”
This corridor example makes the checklist more tangible and practical. It is reinforced by Figure 5, which contrasts before/after conditions for a three-signal corridor.
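To help readers picture the bounded Δoffset tweaks behind this corridor example, here is a minimal sketch of the clamping logic such a safeguard implies: each proposed per-cycle offset change is limited to a fixed step and wrapped into the cycle. The ±4 s step bound and 90 s cycle length are illustrative assumptions, not values taken from the manuscript.

```python
def apply_bounded_offset(current_offset: float,
                         proposed_delta: float,
                         cycle_length: float,
                         max_step: float = 4.0) -> float:
    """Clamp a proposed per-cycle offset change to +/-max_step seconds,
    then wrap the new offset into [0, cycle_length)."""
    delta = max(-max_step, min(max_step, proposed_delta))
    return (current_offset + delta) % cycle_length

# Illustrative walk-through over three cycles: an agent proposes
# aggressive shifts, but the safeguard caps each change at 4 s.
offset = 20.0
for proposed in (9.0, -2.5, 6.0):
    offset = apply_bounded_offset(offset, proposed, cycle_length=90.0)
    print(f"offset -> {offset:.1f} s")
# Prints 24.0 (clamped from +9), then 21.5, then 25.5 (clamped from +6)
```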
Comments 10: The authors should move a few key supplemental items (core parts of Tables S1–S6, screening logs, and synthesis grouping rules) into brief main text tables or figures, because readers need these to understand how the review was done and how the conclusions were formed.
Response 10: Thank you for pointing this out. We have moved condensed versions of key supplemental materials into the main text so that readers can immediately see how the review was conducted and how the conclusions were derived, while still keeping the detailed versions in Supplementary Tables S1–S6.
This is now explicitly stated as follows:
- Table 1 (Section 2.1, line 121) now summarizes the database search and screening outcomes (records retrieved, duplicates removed, and final studies included), complementing the PRISMA flow diagram.
- Table 2 (Section 2.3, line 158) presents the main protocol adjustments and decision rules, showing how deviations from the a priori protocol and edge cases were handled consistently.
- Table 8 (Section 4.3, line 527) provides a comparative safeguards summary across all 18 studies:
“In addition, we compared the safeguard strategies across studies. Table 8 summarizes how plan authority, action masking, bounded variables, and prerequisites were applied. This table condenses the hybrid safeguard logic into a concise comparative view and complements the broader evidence map.”
Author Response File: Author Response.docx
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The paper has improved in response to the previous comments.
Reviewer 2 Report
Comments and Suggestions for Authors
I would like to thank the authors for carefully considering my previous comments. The revised manuscript now includes protocol registration, improved quality appraisal, quantitative effect-direction synthesis, and explicit limitations. These changes have substantially strengthened the paper.
Comments on the Quality of English Language
Minor English improvements are required.
Reviewer 3 Report
Comments and Suggestions for Authors
The authors have addressed all comments from the initial submission.
The manuscript can be accepted in its current form.