Open Access Review

Peer-Review Record

Connect-4 AI: A Comprehensive Taxonomy and Critical Review of Methods and Metrics

Symmetry 2026, 18(2), 293; https://doi.org/10.3390/sym18020293

by Mohammed Alaa Ala’anzy¹,*, Akerke Madiyarova¹, Aidos Aigeldiyev¹, Raiymbek Zhanuzak¹,² and Omar Alnaseri³

Reviewer 1: Anonymous

Reviewer 2: Anurag Dutta

Reviewer 3: Anonymous

Submission received: 28 December 2025 / Revised: 26 January 2026 / Accepted: 2 February 2026 / Published: 5 February 2026

(This article belongs to the Special Issue Scheduling, Planning, Decision and Games in Intelligent Dynamic Robotics)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Summary of the Manuscript

This review article provides a comprehensive survey of AI techniques applied to the game of Connect-4. While Connect-4 is a solved game, the authors argue that it remains an important platform for AI research, particularly in the emerging fields of Explainable AI, Reinforcement Learning, and Formal Verification. The article proposes a new taxonomy categorizing the literature into five dimensions: Game Theoretical Foundations, Algorithmic Approaches, Strategic/Tactical Reasoning, Explainability, and Computational Enhancements.

General Evaluation

The manuscript is well-structured and addresses a gap in the literature by focusing specifically on Connect-4 rather than general board games. The proposed taxonomy (Figure 2) is logical and helps organize the diverse research areas. The inclusion of modern topics, such as Large Language Models and Brain-Computer Interfaces in the context of this classic game is a strong point.

However, some parts of this article require improvement before publication. Specifically, the connection between game theory and search algorithms in Connect-4 AI is lacking. Moreover, the overlap between the "Discussion" and "Critical Analysis" sections should be addressed.

I recommend Minor Revisions based on the following comments.

Major Comments

The current manuscript lacks a specific discussion of the role of game theory and search algorithms in Connect-4 AI. For example, the authors should give more examples of applying combinatorial game theory and search algorithms (like Minimax or Alpha-Beta pruning) to Connect-4 AI, and explain how these fundamental techniques can be used to reduce the state space size and improve computational efficiency.

Recommendation: The authors should add a subsection or paragraph (maybe in Section 3.5 "Computational Enhancements" or 3.1 "Game Theoretical Foundations") explicitly discussing how the reduction is utilized in the algorithms reviewed. Without this connection, the paper feels more suited to a general Game AI journal.

Redundancy between Sections 4 and 5

Section 4 ("Discussion") and Section 5 ("Critical Analysis") cover very similar content. Both sections discuss the trade-offs between RL and search methods, the challenges of XAI, and computational limitations.

Recommendation: These sections should be merged or significantly differentiated. For example, Section 4 could focus on synthesizing the results (what works best and when), while Section 5 (or a "Future Challenges" section) focuses purely on open problems and limitations. Currently, when I read Section 5, it feels like I am re-reading Section 4.

Justification of Non-Connect-4 References

The review includes references to papers focused on Othello (Ref [10], [14]) and 2048 (Ref [50]). While the text acknowledges this (e.g., line 340 mentions Othello), the authors need to be more explicit about why these are included in a Connect-4 review.

Recommendation: The authors should clarify in the text if these methods are directly transferable to Connect-4 or if there is a lack of specific Connect-4 literature in those sub-areas (e.g., n-tuple networks) that necessitates looking at similar grid-based games.

Search Methodology (Section 2)

The search strategy uses the single keyword "Connect-4" (Line 214). This is very restrictive, since most general game-playing papers use Connect-4 as one of several benchmarks but do not list it in the title or keywords.

Recommendation: The authors should acknowledge this as a potential limitation or expand their search to include terms like "four-in-a-row" or "gravity-based games" to ensure they haven't missed significant contributions where Connect-4 was a secondary platform.

Minor Comments

Figure 1 Quality: Figure 1 is a somewhat generic, low-resolution clip-art style image.

Recommendation: Replace this with a cleaner, scientific diagram of a board state. It would be valuable to add notations (e.g., numbering columns 1-7) or illustrate a "winning line" vs. a "trap" to add instructional value.

Figure 2 (Taxonomy): The visual taxonomy is helpful, but the text in the bubbles is quite small. Please ensure high-resolution export for the final PDF to maintain readability.

Table 2 and 3 Formatting: The columns in these tables are abbreviated (SOA, TA, Var, CA). While there is a footnote explaining them, the tables are somewhat hard to parse at a glance.

Recommendation: If space permits, consider slightly more descriptive headers or ensure the caption explicitly details the coding system used.

Reference [4]: In the Reference list (Line 1796), Reference 4 appears to be a general review of CNNs in computer vision. Its connection to Connect-4 is not immediately obvious in the text citation context (Line 80). Please verify if this is the intended citation for AlphaZero/Deep Learning in games, or if a more game-specific CNN paper is available.

Comments for author File: Comments.pdf

Author Response

Reviewer#1

General comment: The manuscript is well-structured and addresses a gap in the literature by focusing specifically on Connect-4 rather than general board games. The proposed taxonomy (Figure 2) is logical and helps organize the diverse research areas. The inclusion of modern topics, such as Large Language Models and Brain-Computer Interfaces, in the context of this classic game is a strong point. However, some parts of this article require improvement before publication. Specifically, the connection between game theory and search algorithms in Connect-4 AI is lacking. Moreover, the overlap between the "Discussion" and "Critical Analysis" sections should be addressed.

We would like to thank the reviewer for their time and effort to improve our manuscript. We are confident that this version of the manuscript has addressed all of their comments comprehensively.

Reviewer#1, Concern # 1: The current manuscript lacks a specific discussion of the role of game theory and search algorithms in Connect-4 AI. For example, the authors should give more examples of applying combinatorial game theory and search algorithms (like Minimax or Alpha-Beta pruning) to Connect-4 AI, and explain how these fundamental techniques can be used to reduce the state space size and improve computational efficiency.
Recommendation: The authors should add a subsection or paragraph (maybe in Section 3.5 "Computational Enhancements" or 3.1 "Game Theoretical Foundations") explicitly discussing how the reduction is utilized in the algorithms reviewed. Without this connection, the paper feels more suited to a general Game AI journal.

Author response: Thank you for this valuable suggestion.

Author action: A paragraph was added at the end of Section 3.1 to explicitly explain how combinatorial game theory and classical search algorithms exploit Connect-4’s structural properties to reduce the effective state space and improve computational efficiency. (Kindly see Lines 212-222.)
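
As a rough illustration of how a classical search algorithm such as Alpha-Beta pruning reduces the number of positions examined, the following minimal sketch runs the algorithm on a small hand-made game tree. The tree, its leaf values, and the names `alphabeta` and `TOY_TREE` are illustrative assumptions; they are not Connect-4 positions and do not reproduce the paragraph added to the manuscript.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, stats=None):
    """Minimax value of a nested-list tree with Alpha-Beta pruning.

    A leaf is a number; an internal node is a list of children.
    Returns (value, stats), where stats["leaves"] counts evaluated leaves.
    """
    if stats is None:
        stats = {"leaves": 0}
    if not isinstance(node, list):
        stats["leaves"] += 1
        return node, stats
    best = -math.inf if maximizing else math.inf
    for child in node:
        value, _ = alphabeta(child, not maximizing, alpha, beta, stats)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if beta <= alpha:
            # The opponent already has a better option elsewhere,
            # so the remaining children cannot change the result.
            break
    return best, stats

# Hypothetical toy tree: a maximizing root with three minimizing children.
TOY_TREE = [[3, 5], [6, 9], [1, 2]]
value, stats = alphabeta(TOY_TREE, maximizing=True)
```

On this toy tree the search evaluates five of the six leaves: once the third subtree yields a 1, it cannot beat the 6 already secured, so its remaining leaf is skipped. The saving grows with tree depth and with good move ordering.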

Reviewer#1, Concern # 2: Section 4 ("Discussion") and Section 5 ("Critical Analysis") cover very similar content. Both sections discuss the trade-offs between RL and search methods, the challenges of XAI, and computational limitations.
Recommendation: These sections should be merged or significantly differentiated. For example, Section 4 could focus on synthesizing the results (what works best and when), while Section 5 (or a "Future Challenges" section) focuses purely on open problems and limitations. Currently, when I read Section 5, it feels like I am re-reading Section 4.

Author response: Thank you for this comment. We have addressed this comment as requested.

Author action: Sections 4 and 5 were merged by removing the former Critical Analysis section and integrating its revised content into an expanded Discussion section that now focuses on synthesising findings and interpreting when and why different approaches are effective. The subsequent section was revised and repositioned as Open Challenges and Future Research Directions, focusing exclusively on unresolved issues and limitations. This restructuring eliminates redundancy and clarifies the conceptual roles of the sections. (Kindly see 4. Discussion and 5. Open Challenges and Future Research Directions).

Reviewer#1, Concern # 3: The review includes references to papers focused on Othello (Ref [10], [14]) and 2048 (Ref [50]). While the text acknowledges this (e.g., line 340 mentions Othello), the authors need to be more explicit about why these are included in a Connect-4 review.
Recommendation: The authors should clarify in the text if these methods are directly transferable to Connect-4 or if there is a lack of specific Connect-4 literature in those sub-areas (e.g., n-tuple networks) that necessitates looking at similar grid-based games.

Author response: Thank you for the comment. We revised the manuscript to make the inclusion of non–Connect-4 studies more explicit and better justified. In particular, we clarify in the text that the methods discussed in Refs. [10] and [14] are included because they are based on game-agnostic principles that are directly transferable to Connect-4 due to shared characteristics such as grid-based board structures, deterministic dynamics, and perfect-information gameplay. To avoid ambiguity and maintain a tighter focus on Connect-4-relevant literature, Ref. [50], which addresses the stochastic game 2048, has been removed.

Author action: The manuscript was revised to explicitly justify the inclusion of Othello-based studies as transferable to Connect-4, and Ref. [50] was removed to reduce potential confusion.

Reviewer#1, Concern # 4: The search strategy uses the single keyword "Connect-4" (Line 214). This is very restrictive, since most general game-playing papers use Connect-4 as one of several benchmarks but do not list it in the title or keywords.
Recommendation: The authors should acknowledge this as a potential limitation or expand their search to include terms like "four-in-a-row" or "gravity-based games" to ensure they haven't missed significant contributions where Connect-4 was a secondary platform.

Author response: We thank the reviewer for this valuable observation. In the initial phase of the study, the literature search was conducted using the single keyword “Connect-4”, which yielded 92 records, of which 44 studies were selected after title, abstract, and full-text screening.

To ensure broader coverage and to avoid missing relevant studies where Connect-4 is used as a secondary benchmark, we expanded the search strategy to include alternative terminology and related game descriptors. The revised query included terms such as “Connect Four”, “four-in-a-row”, “four in a row”, as well as broader descriptors including “gravity-based game”, “connection game”, and “alignment game” in combination with “general game playing”, “board game”, and “perfect-information game”.

Using this expanded search strategy, a total of 293 records were retrieved. After applying language, publication type, and temporal filters, followed by title, abstract, and full-text screening, five additional relevant studies were identified and incorporated into the review. The final corpus now consists of 49 selected publications.

The review protocol section has been updated accordingly to clearly document the expanded search strategy and selection process. This revision strengthens the completeness of the survey and reduces the risk of omitting relevant contributions where Connect-4 is not explicitly highlighted in titles or keywords.

Author action: The new search query is ("Connect-4" OR "Connect Four" OR "four-in-a-row" OR "four in a row") OR (("gravity-based game" OR "connection game" OR "alignment game") AND ("general game playing" OR "board game" OR "perfect-information game")). Please see Section 2 (Review Protocol).
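
To make the boolean structure of the revised query concrete, the sketch below expresses it as a predicate over a record's title or abstract text. The term lists are copied from the query; the function names and the case-insensitive substring matching are assumptions for illustration only, and actual bibliographic databases apply their own matching rules.

```python
# Term groups copied from the revised query.
GAME_TERMS = ["connect-4", "connect four", "four-in-a-row", "four in a row"]
DESCRIPTOR_TERMS = ["gravity-based game", "connection game", "alignment game"]
CONTEXT_TERMS = ["general game playing", "board game", "perfect-information game"]

def contains_any(text: str, terms: list[str]) -> bool:
    """True if any term occurs in text (case-insensitive substring match)."""
    lowered = text.lower()
    return any(term in lowered for term in terms)

def matches_query(text: str) -> bool:
    """Hypothetical helper mirroring: GAME OR (DESCRIPTOR AND CONTEXT)."""
    return contains_any(text, GAME_TERMS) or (
        contains_any(text, DESCRIPTOR_TERMS) and contains_any(text, CONTEXT_TERMS)
    )
```

Under this reading, a record that names the game directly always matches, while a record that uses only a broader descriptor such as "alignment game" matches only if it also mentions one of the context terms.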

Reviewer#1, Concern # 5:

Figure 1 is a somewhat generic, low-resolution clip-art style image.
Recommendation: Replace this with a cleaner, scientific diagram of a board state. It would be valuable to add notations (e.g., numbering columns 1-7) or illustrate a "winning line" vs. a "trap" to add instructional value.

Figure 2 (Taxonomy): The visual taxonomy is helpful, but the text in the bubbles is quite small. Please ensure high-resolution export for the final PDF to maintain readability.

Table 2 and 3 Formatting: The columns in these tables are abbreviated (SOA, TA, Var, CA). While there is a footnote explaining them, the tables are somewhat hard to parse at a glance.
Recommendation: If space permits, consider slightly more descriptive headers or ensure the caption explicitly details the coding system used.

Author response: Thank you for these helpful suggestions regarding the clarity and presentation of figures and tables. We agree that improving visual quality and readability strengthens the instructional value of the manuscript. Therefore, we have updated all figures accordingly.

For Tables 2, 3, and 4, we are unable to add more descriptive headers because of space constraints, but we can ask the journal production team for suggestions on addressing this issue once the paper is accepted.

Author action: Figure 1 was replaced with a schematic, high-resolution diagram of a Connect-4 board state, including column indexing and annotated configurations to illustrate key concepts such as a forced win and a trap. Figure 2 was re-exported at higher resolution to ensure that all taxonomy labels remain legible in the final PDF.

Reviewer#1, Concern # 6: Reference [4]: In the Reference list (Line 1796), Reference 4 appears to be a general review of CNNs in computer vision. Its connection to Connect-4 is not immediately obvious in the text citation context (Line 80). Please verify if this is the intended citation for AlphaZero/Deep Learning in games, or if a more game-specific CNN paper is available.

Author response: Thank you for the comment. We agree that the original reference was a general review of CNNs and its link to AlphaZero and Connect-4 was not clear enough in the text. To address this, we updated the citation with a game-specific reference that is more suitable for this context. Although the new reference is not focused directly on CNN architectures or AlphaZero, it discusses the achievements of deep-learning-based game systems, which matches the intention of the cited statement.

Author action: The citation at Line 60 has been replaced with a game-specific reference, and the reference list has been updated accordingly. P.S. (The citation was at Line 60, not Line 80.)

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The review is promising and timely, but in its current form I have a few issues, as follows:

The authors mention conducting the review using only the "Connect-4" keyword. I believe this misses works that use alternative terminology ("Four-in-a-Row", "Captain's Mistress", etc.). The authors must revisit them properly.
I can understand that the authors selected the last decade as the timeframe (2015-2025). However, a proper justification and an acknowledgement of the pre-2015 works (i.e., their principal contributions, results, etc.) must be added for completeness.
In Line 166, the authors mention “four complementary dimensions”, but Fig. 2 shows five.
Ref. [1] Xu, F.; Li, B. Variable Action Set-based Monte Carlo Tree Search Algorithm for UAV Autonomous Collision Avoidance. IEEE Access 2025. How is it related to the work? Also, the authors have focused more on Othello than on Connect-4 in the reference section. Why is this so?
The title itself mentions “Methods and Metrics”. I expected at least some systematic discussion of evaluation metrics, but could not find it in the survey.
MCTS appears in Sections 3.1.1, 3.2.2, 3.2.4, 3.3.2, and 3.3.3. Why is this so? It creates confusion.
Again, Section 3.4 is about human-robot interaction studies using Connect-4 as a task environment, but it is not about advancing Connect-4 AI. Please explain its relevance; otherwise, remove it.
In Section 5, I was expecting a deep “critical analysis”. The authors have mostly restated the limitations here. Please update.
The authors are required to add comparative performance tables with metrics. Additionally, a timeline analysis is much needed.
Also, a consistent problem throughout the review is the non-uniform level of detail. Some papers are discussed in great detail, while others are merely touched upon.
The authors are required to perform a systematic extraction of experimental setups, datasets used, compute, etc. Otherwise, uniformity will not be maintained.
Please also add some case studies to augment the survey.
Are the discussed methods reproducible? Readers cannot tell after reading the review. The authors must disclose this properly, with proper accreditation to the sources and the original inventors.
Besides, there are certain minor issues related to formatting, grammar, typos, etc.

(a) It is not Fig x, but Fig. x, See line 73, 157, 365, and 538.

(b) “The introduction of XAI has provided efforts to make these models more transparent” is awkward. Restructure the line.

(c) “To the best of the authors knowledge” is wrong. It should be “authors’”

(d) I checked for AI text content of the work; it shows 57%. Please try to reduce it (very imp.)

(e) “Another authors in” is wrong. It should be “The authors in”

(f) Throughout the review, “Alpha-Beta Pruning” is sometimes capitalized, sometimes not.

Author Response

Reviewer#2

General Comment: The review is promising and timely, but in its current form I have a few issues, as follows.

We appreciate the reviewer for dedicating their time and effort to enhance our manuscript. We believe that this revised version has thoroughly addressed all of their feedback.

Reviewer#2, Concern # 1: The authors mentioned conducting the review relying only on "Connect-4" keyword. I believe it misses out the works using alternative terminology ("Four-in-a-Row", "Captain's Mistress", etc.) The authors must revisit them properly.

Reviewer#2, Concern # 2: I can understand that the authors selected the last decade as the timeframe (2015-2025). However, a proper justification and an acknowledgement of the pre-2015 works (i.e., their principal contributions, results, etc.) must be added for completeness.

Author response: We thank the reviewer for this comment. Although the primary scope of this survey focuses on publications from 2015 to 2025 in order to capture recent advances in learning-based, hybrid, and explainable AI methods, foundational pre-2015 works are explicitly acknowledged in the Introduction to provide historical context and to highlight their principal contributions.

Author action: The Introduction references the knowledge-based approach to Connect-4 proposed by (Allis, 1988), which formally established the theoretical solvability of the game and demonstrated that the first player can always force a win under perfect play [2]. In addition, early work employing Connect-4 as a platform for illustrating fundamental artificial intelligence concepts and interactive decision-making is acknowledged, highlighting its role in AI education and applied systems (Stuart, 1994) [3]. Furthermore, the broader contribution of Allis’ subsequent work on game-solving methods is recognised, particularly the introduction of proof-number search and systematic search techniques for deterministic, perfect-information games, which have had lasting influence beyond Connect-4 itself (Allis, 1994) [4].

These pre-2015 studies are cited to establish the theoretical and methodological foundations upon which later research builds, while the review itself deliberately concentrates on post-2015 literature to maintain a focused analysis of contemporary developments in Connect-4 artificial intelligence. (Kindly see 1.1. Connect-4 as a Benchmark Game, Line 22-56)

Reviewer#2, Concern # 3: Line No. 166 the authors mentioned “four complementary dimensions”. But Fig. 2 shows five!?

Author response: Thank you for identifying this inconsistency. We agree that the text did not correctly match Fig. 2. The description has been corrected to reflect five complementary dimensions.

Author action: We updated the manuscript by changing “four” to “five” to ensure consistency with Fig. 2. (Kindly see the last paragraph of Section 3, Lines 198-199.)

Reviewer#2, Concern # 4: Ref. [1] Xu, F.; Li, B. Variable Action Set-based Monte Carlo Tree Search Algorithm for UAV Autonomous Collision Avoidance. IEEE Access 2025. How is it related to the work? Also, the authors have focussed more on Othello over Connect-4 in the reference section. Why is it so?

Author response: Thank you for pointing this out. We agree that the originally cited reference was not directly related to the topic of this work. To avoid confusion and improve relevance, we removed this reference and replaced it with a more suitable game-related study that directly supports the discussion in the manuscript.

Author action: Ref. [1] was removed and replaced with a more relevant game-specific reference. (Please check Ref. [1], 1.1. Connect-4 as a Benchmark Game).

Reviewer#2, Concern # 5: The title itself mentions: “Methods and Metrics”. I expected at least some systematic discussion of evaluation metrics, but couldn’t find it in the survey.

Author response: We thank the reviewer for highlighting this important omission.

Author action: We have revised the manuscript and added a dedicated and systematic discussion of evaluation metrics to explicitly align the content with our title.

Specifically, we introduced content in Section 4 (Discussion) that consolidates evaluation practices across the surveyed literature. Table 6 now presents a structured taxonomy of evaluation dimensions and metrics used in Connect-4 research, including game outcome performance, computational cost, learning efficiency and stability, generalisation, strategic quality, explainability, human–AI interaction, and formal correctness. Each metric category is explicitly linked to the corresponding references from Section 3, ensuring complete coverage of the surveyed studies.

Reviewer#2, Concern # 6: MCTS appears in Sections 3.1.1, 3.2.2, 3.2.4, 3.3.2, and 3.3.3. Why is this so? It creates confusion.

Author response: Thank you for the comment. MCTS appears in multiple sections. This repetition is intentional and reflects the multidimensional role the method plays in Connect-4 research. Specifically, MCTS is discussed in Section 3.1.1 as a theoretically motivated strategy for optimal play, in Section 3.2.2 as a standalone algorithmic approach, in Section 3.2.4 as a component of hybrid learning–search models, and in Sections 3.3.2 and 3.3.3 as a tactical tool for endgame optimisation and trap detection. Rather than indicating redundancy, these multiple appearances highlight how MCTS contributes differently depending on the analytical perspective. Table 5 already provides this justification.

Reviewer#2, Concern # 7: Again, Section 3.4 is about human-robot interaction studies using Connect-4 as a task environment, but it is not about advancing Connect-4 AI. Please explain its relevance; otherwise, remove it.

Author response: We thank the reviewer for raising this point. While Section 3.4 does not primarily introduce new algorithms for solving Connect-4, it examines studies that use Connect-4 as a controlled task environment to investigate human–AI and human–robot interaction. We have clarified in the revised manuscript that the relevance of this section lies in its contribution to understanding how Connect-4 AI systems are used, interpreted, and trusted by human users, rather than in advancing game-solving performance alone.

Specifically, this section informs the design of explainable, interactive, and human-centred Connect-4 agents by highlighting how factors such as transparency, feedback, and user trust influence the effectiveness of AI decision-making. We have revised the section introduction to explicitly position it within the broader taxonomy as part of the explainability and human-in-the-loop dimension, ensuring its relevance to the overall goals of the survey.

Reviewer#2, Concern # 8: In Section 5, I was expecting a deep “critical analysis”. The authors have mostly restated the limitations here. Please update.

Author response: Thank you for the comment.

Author action: We have updated the manuscript by removing Section 5 (Critical Analysis); its content was substantially revised and integrated into an expanded Discussion section. The revised Discussion now provides a deeper critical interpretation of the literature by analysing underlying design trade-offs, structural constraints, and methodological implications across approaches, rather than restating individual limitations. This revision strengthens the analytical depth and addresses the concern regarding insufficient critical insight.

Reviewer#2, Concern # 9: The authors are required to add comparative performance tables with metrics. Additionally, a timeline analysis is much needed.

Author response: We thank the reviewer for this constructive suggestion. In response, we have substantially revised the manuscript to include both comparative performance analysis and a timeline-based overview of research trends.

First, we added a comparative performance table grounded in evaluation metrics used across the surveyed studies. Table 6 systematically organises evaluation dimensions and metrics (e.g., game outcome performance, computational cost, learning efficiency and stability, generalisation, explainability, and formal correctness) and explicitly maps each metric category to the corresponding references discussed in Section 3. This table enables cross-method comparison at the metric level while respecting the heterogeneous experimental setups reported in the literature.

Second, we introduced a timeline analysis of Connect-4 research spanning 2015–2025 (Figure 8). Using the taxonomy defined in Section 3, each surveyed paper was assigned to a dominant methodological category, and annual publication counts were aggregated. The resulting figure highlights the evolution from classical search-based methods toward reinforcement learning, hybrid approaches, explainability, human–AI interaction, and formal verification, thereby contextualising performance evaluation practices over time.
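
The aggregation step described above (assign each paper one dominant category, then count papers per year) can be sketched as follows. The `records` list contains placeholder entries invented for illustration; it does not reflect the surveyed corpus or the data behind Figure 8, and the function names are hypothetical.

```python
from collections import Counter

# Placeholder (year, dominant category) pairs, NOT taken from the survey.
records = [
    (2015, "search"),
    (2019, "reinforcement learning"),
    (2019, "search"),
    (2023, "explainability"),
    (2023, "explainability"),
]

def annual_counts(records):
    """Count papers per (year, dominant category) pair."""
    return Counter(records)

def timeline_rows(counts):
    """Return sorted (year, category, count) rows, ready for plotting."""
    return sorted((year, category, n) for (year, category), n in counts.items())

counts = annual_counts(records)
rows = timeline_rows(counts)
```

Because each paper contributes exactly one (year, category) pair, the counts sum to the corpus size, which gives a simple consistency check against the number of selected publications.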

Because the surveyed works differ substantially in opponents, hardware, time constraints, and evaluation protocols, direct numerical aggregation into a single benchmark-style table would be misleading. We therefore adopt a survey-appropriate comparative approach that contrasts methods based on shared evaluation dimensions and reported metrics, while directing readers to the original studies for experimental detail.

These additions directly address the reviewer’s concern and strengthen the comparative and temporal analysis of methods and metrics in the survey.

Reviewer#2, Concern # 10: Also, a consistent problem throughout the review is the non-uniform level of detail. Some papers are discussed in great detail, while others are merely touched upon.

Author response: We appreciate your valuable feedback and would like to clarify our approach concerning the level of detail presented in our manuscript. Our objective is to ensure that the paper remains accessible and comprehensible to a diverse audience while effectively conveying key information. Therefore, we have concentrated on incorporating essential details in the most relevant contexts, rather than providing uniform depth across all studies discussed.

Please be assured that each paper has been carefully integrated into the manuscript, with their contributions also emphasized in the summary tables. We hold in high regard all insights and research included in our work, and we strive to present them in a manner that enhances the overall narrative and findings of our study. Comprehensive details are provided within the body of the manuscript and are also reflected in the summary tables (Tables 2, 3, and 4).

Reviewer#2, Concern # 11: The authors are required to perform a systematic extraction of experimental setups, datasets used, compute, etc. Otherwise, uniformity will not be maintained.

Author response: We thank the reviewer for raising this point. The objective of this work is not to conduct a meta-analysis of experimental setups or benchmark results, but to provide a structured, taxonomy-driven synthesis of Connect-4 AI research. Many of the reviewed studies employ heterogeneous experimental conditions, including differing board configurations, opponent models, evaluation metrics, computational budgets, and training protocols. As a result, enforcing a uniform experimental comparison would be neither feasible nor methodologically sound.

To maintain consistency across the reviewed literature, we adopt a conceptual and methodological normalisation rather than an experimental one. Specifically, uniformity is achieved by systematically categorising each study according to the type of methodology employed (e.g., search-based, learning-based, hybrid), its primary research objective (e.g., optimal play, adaptability, explainability), and its role within the proposed taxonomy dimensions.

As clarified in Section 2.4 (Data Extraction and Synthesis), experimental settings, datasets, and computational resources are discussed where relevant to contextualise individual contributions, but are not used as a basis for cross-study comparison. This taxonomy-driven synthesis ensures consistency across heterogeneous studies while avoiding misleading conclusions that could arise from incomparable experimental designs.

Reviewer#2, Concern # 12: Please also add some case studies to augment the survey.

Author response: We thank the reviewer for this suggestion. As this work is a survey rather than an experimental study, the paper does not introduce new empirical case studies. Instead, it incorporates analytical case illustrations by discussing representative Connect-4 studies within each category of the proposed taxonomy. These examples serve to contextualise methodological choices, highlight typical evaluation practices, and illustrate how different approaches address specific research objectives.

The inclusion of these representative studies allows the survey to augment the taxonomy with concrete, literature-based examples while remaining within the intended scope of a systematic review. Introducing new experimental case studies would require a fundamentally different research design and is therefore beyond the scope of the present work.

Reviewer#2, Concern # 13: Are the discussed methods reproducible? The readers don’t get to know after reading the review! The authors must disclose these properly with proper accreditation to the source and the original inventors.

Author response: We thank the reviewer for this comment. As this work is a survey, it does not evaluate or validate the reproducibility of the reviewed methods. Reproducibility is therefore dependent on the experimental details, data availability, and implementation descriptions provided in the original studies.

The aim of the present review is to organise and synthesise existing Connect-4 research through a taxonomy-driven analysis, rather than to replicate or assess individual methods. While all discussed approaches are referenced to their original publications, the review does not claim or infer reproducibility beyond what is reported by the original authors.

Reviewer#2, Concern # 14: Besides there are certain minor issues related to the formatting, grammar, typo, etc.

(a) It is not “Fig x” but “Fig. x”; see lines 73, 157, 365, and 538.

(b) “The introduction of XAI has provided efforts to make these models more transparent” is awkward. Restructure the line.

(c) “To the best of the authors knowledge” is wrong. It should be “authors’”

(d) I checked for AI text content of the work; it shows 57%. Please try to reduce it (very imp.)

(e) “Another authors in” is wrong. It should be “The authors in”

(f) Throughout the review, “Alpha-Beta Pruning” is sometimes capitalized, sometimes not.

Author response: Thank you for carefully pointing out these formatting and language issues. We appreciate the detailed feedback.

Author action: All items listed in this comment were corrected throughout the manuscript. Figure references were standardised to the “Fig. x” format, grammatical and stylistic issues were revised, terminology was made consistent, and phrasing was refined.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This is a strong draft, but it shows common weaknesses of review papers: heavy overlap between sections and a structure that feels more like a catalog than a narrative. A major revision should focus on consolidation and clearer synthesis.

The paper surveys the literature thoroughly, but it does not sufficiently integrate or interpret it. The sections labeled “Discussion,” “Critical Analysis,” and “Open Challenges” largely repeat the same ideas, which dilutes the impact of the work.

The multidimensional taxonomy is the main original contribution and a real strength. However, it is introduced and then underused; the later analysis does not consistently build on it as an organizing framework.
Much of the writing is descriptive rather than critical. Instead of mainly reporting what prior studies did, the paper should more clearly articulate trade-offs, limitations, and lessons across approaches.
Sections 4, 5, and 6 cover similar themes—computational cost, explainability, adaptability—from different angles. Merging them into a single synthesis-focused section would greatly improve clarity and reduce redundancy.
The analysis would be stronger if organized directly around the taxonomy dimensions, with each theme combining evidence from the literature and linking it to open problems and future directions.
The introduction should more clearly justify why Connect-4 is a valuable benchmark and emphasize what this focused review reveals about broader AI research trends.
The abstract should be tightened to highlight synthesis and critical insight rather than coverage alone.
The tables are informative but in places difficult to read; abbreviations and columns should be simplified or more clearly explained. Table 4 is particularly effective and should be treated as central rather than auxiliary.
New topics such as KANs and related models would fit better if integrated into the main analytical discussion, rather than appearing as add-ons.
With these changes, the paper can move from a comprehensive but repetitive survey to a focused, critical review that clearly communicates its main contribution and value.

A promising future direction lies in exploring novel neural network architectures for the value and policy networks. Emerging models such as those described in “KAN: Kolmogorov-Arnold Networks” and “A Practitioner's Guide to Kolmogorov-Arnold Networks” offer a compelling alternative to traditional MLPs. Their potential for more efficient function approximation could lead to agents that learn optimal strategies with less training. Furthermore, the inherent interpretability of KANs aligns directly with the growing need for Explainable AI (XAI), potentially allowing researchers to visualize and understand the strategic concepts learned by the agent during self-play.

Author Response

Reviewer#3

General Comment: This is a strong draft, but it shows common weaknesses of review papers: heavy overlap between sections and a structure that feels more like a catalog than a narrative. A major revision should focus on consolidation and clearer synthesis.

We sincerely thank the reviewer for the time and effort devoted to improving our manuscript. We believe that this revised version addresses all of the comments in detail.

Reviewer#3, Concern # 1: The paper surveys the literature thoroughly, but it does not sufficiently integrate or interpret it. The sections labeled “Discussion,” “Critical Analysis,” and “Open Challenges” largely repeat the same ideas, which dilutes the impact of the work.

Author response: Thank you for this comment. We agree that the earlier structure did not sufficiently emphasise synthesis and interpretation, which reduced the overall impact of the review. In response, we restructured the manuscript to clearly separate integrative analysis from forward-looking discussion and to eliminate repetitive treatment of similar ideas across sections.

Author action: The former Critical Analysis section was removed, and its content was fully integrated into a substantially revised Discussion section that now provides a unified, cross-cutting synthesis of the literature, focusing on methodological trade-offs rather than repeating individual results. The Open Challenges and Future Research Directions section was also revised to focus exclusively on unresolved issues and prospective research needs. Together, these changes reduce redundancy and strengthen the interpretive and integrative contribution of the paper.

Reviewer#3, Concern # 2: The multidimensional taxonomy is the main original contribution and a real strength. However, it is introduced and then underused; the later analysis does not consistently build on it as an organizing framework.

Author response: Thank you for highlighting this point.

Author action: The manuscript was revised to strengthen the role of the taxonomy as an organising framework beyond its initial introduction. The Discussion section is now explicitly structured around the core taxonomy dimensions, and evidence from different methodological strands is synthesised through these dimensions. In addition, the taxonomy-driven trade-offs are directly linked to the open challenges and future research directions, ensuring conceptual continuity across the paper. (Kindly see Figure 2, page 6.)

Reviewer#3, Concern # 3: Much of the writing is descriptive rather than critical. Instead of mainly reporting what prior studies did, the paper should more clearly articulate trade-offs, limitations, and lessons across approaches.

Author response: We thank the reviewer for this important observation. In response, we have revised the manuscript to strengthen its critical perspective by explicitly discussing trade-offs, limitations, and comparative insights across different classes of approaches. Rather than only describing individual methods, the revised text highlights key tensions observed in the literature, such as optimality versus scalability, learning flexibility versus interpretability, and computational efficiency versus strategic robustness.

In particular, the taxonomy and discussion sections now emphasise cross-cutting lessons that emerge across studies, clarifying under which conditions specific approaches are most effective and where their limitations become apparent. These revisions aim to shift the narrative from descriptive reporting toward a more analytical synthesis of insights across the reviewed works.

Reviewer#3, Concern # 4: Sections 4, 5, and 6 cover similar themes—computational cost, explainability, adaptability—from different angles. Merging them into a single synthesis-focused section would greatly improve clarity and reduce redundancy.

Author response: Thank you for this observation. We agree that presenting closely related concepts in separate sections introduced redundancy and reduced overall clarity. To address this, we revised the structure to provide a more integrated and logically organised discussion.

Author action: The former Critical Analysis section was removed, and its content was fully integrated into a revised Discussion section, which now provides a unified synthesis of methodological trade-offs across approaches. The Open Challenges and Future Research Directions section was correspondingly revised to focus exclusively on unresolved issues and prospective research directions, thereby reducing redundancy and improving overall clarity.

Reviewer#3, Concern # 5: The analysis would be stronger if organized directly around the taxonomy dimensions, with each theme combining evidence from the literature and linking it to open problems and future directions.

Author response: Thank you for this suggestion.

Author action: The Discussion section was revised to explicitly frame the analysis around the core taxonomy dimensions identified in the review, synthesising evidence across methodological approaches. In addition, these dimensions are now directly linked to the open challenges and future research directions outlined in the subsequent section, improving conceptual continuity without introducing redundancy.

Reviewer#3, Concern # 6: The introduction should more clearly justify why Connect-4 is a valuable benchmark and emphasize what this focused review reveals about broader AI research trends.

Author response: Thank you for this comment.

Author action: We revised the Introduction by rewriting and better organizing it. It now has four subsections as follows:

1.1 Connect-4 as a Benchmark Game
1.2 From Classical Search to Learning-Based Approaches
1.3 Explainability and Human–AI Interaction
1.4 Emerging Trends and Motivation for This Review

(Kindly see Section 1, pages 1-4).

Reviewer#3, Concern # 7: The abstract should be tightened to highlight synthesis and critical insight rather than coverage alone.

Author response & action: We thank the reviewer for the suggestion. The abstract has been revised to emphasize synthesis and critical insight rather than mere coverage. Specifically, we highlight a taxonomy-driven analysis of Connect-4 AI research, identify three dominant patterns across search-based, learning-based, and hybrid approaches, and provide a critical perspective on the integration of efficiency, adaptability, interpretability, and robustness. Emerging trends, such as hybrid search–learning systems and explainable AI, are now explicitly mentioned to demonstrate insight into underexplored research directions. These changes tighten the abstract while clearly reflecting our analytical and integrative approach.

Reviewer#3, Concern # 8: The tables are informative but in places difficult to read; abbreviations and columns should be simplified or more clearly explained. Table 4 is particularly effective and should be treated as central rather than auxiliary.

Author response: We thank the reviewer for this helpful comment regarding table clarity and emphasis. In the revised manuscript, we have reviewed the presentation of all tables and improved their readability by clarifying column meanings and ensuring that abbreviations are either standard within the field or explicitly explained in the accompanying text.

In addition, the table previously referred to as Table 4 has been renumbered as Table 5 in the revised version and is explicitly referenced in the taxonomy discussion to reflect its integrative role. Table 5 (Cross-taxonomic roles of AI techniques in Connect-4 research, highlighting the evaluation metrics used across different methodological categories) now serves as a central synthesis that illustrates how key AI techniques span multiple dimensions of the proposed taxonomy, rather than functioning as an auxiliary summary.

Reviewer#3, Concern # 9: New topics such as KANs and related models would fit better if integrated into the main analytical discussion, rather than appearing as add-ons.

Author response: We thank the reviewer for this valuable suggestion. We agree that emerging models such as Kolmogorov–Arnold Networks (KANs) should be integrated into the main analytical discussion rather than presented as isolated additions. Accordingly, we have revised the manuscript to incorporate KANs within the broader discussion of learning-based and hybrid AI approaches in the Open Challenges and Future Research Directions section. The revised text explicitly positions KANs as an emerging neural architecture that aligns with reinforcement-learning and search–learning paradigms, particularly in addressing challenges related to interpretability and real-time decision-making. This integration clarifies their role as a natural extension of existing research trends rather than an add-on topic.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

Thank you for submitting the revised version of the ms. All my concerns have been addressed.

Author Response

Thank you for your time and effort.

Reviewer 3 Report

Comments and Suggestions for Authors

Survey reads like a catalogue rather than a critical review.
Large portions of Sections 3–5 summarize individual papers one-by-one (often with similar phrasing) without synthesizing why results differ, what assumptions matter, and what lessons generalize beyond Connect-4. The taxonomy is presented, but the paper does not consistently use it to drive an integrative narrative (e.g., explicit cross-comparisons, failure modes, trade-off explanations, and “what to use when” guidance). The manuscript needs a more analytical structure that reduces enumerative descriptions and increases comparative synthesis.
Insufficient referencing within the text at key technical claims.
Many statements are written as general truths (e.g., “MCTS is powerful,” “RL improves adaptability,” “hybrid methods redistribute complexity,” “XAI increases trust,” etc.) but are not consistently supported with precise, local citations. Tables list references, but the narrative often lacks citations at the point where claims are made. A review article should be citation-dense in the discussion of claims, not only in summary tables.
Limited mathematical/technical depth for a methods-and-metrics review.
The paper surveys algorithms, but the treatment is mostly verbal. For a “methods and metrics” review, it should include minimal but concrete formalism:
canonical formulations (e.g., minimax recursion, alpha-beta bounds, standard MCTS/UCT selection rule, RL objective/value update forms),
complexity/scaling discussions tied to explicit quantities (branching factor, depth, state space, rollout budget),
clear definitions of evaluation metrics used in the literature (win-rate protocols, Elo/TrueSkill if used, nodes expanded, time per move, sample efficiency, convergence criteria).
Without compact mathematics and definitions, the manuscript remains descriptive and risks being too shallow for specialists.
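
For reference, two of the canonical forms named in the list above can be sketched compactly. This is a minimal illustration and is not taken from the manuscript under review: a depth-first minimax with alpha-beta cut-offs over a toy game tree, and the usual UCB1-style score used for UCT selection in MCTS. The function names (`alphabeta`, `uct_score`), the nested-list tree encoding, and the default exploration constant are assumptions made for this sketch.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Minimax value of a toy tree with alpha-beta pruning.

    Assumed encoding: a leaf is a number (its value for the maximizing
    player); an internal node is a list of child nodes.
    """
    if not isinstance(node, list):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:  # the minimizer above will never allow this line
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:  # the maximizer above already has a better option
            break
    return value


def uct_score(total_value, visits, parent_visits, c=math.sqrt(2)):
    """UCT selection score: mean value plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are tried first
    exploit = total_value / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


if __name__ == "__main__":
    toy_tree = [[3, 5], [2, 9], [0, 1]]  # depth-2 tree, maximizer to move
    print(alphabeta(toy_tree))
    print(uct_score(total_value=5.0, visits=10, parent_visits=100))
```

In a Connect-4 setting the leaves would come from a terminal test or a heuristic evaluation at a depth limit, and the children from the (at most seven) legal column moves; neither is modelled here.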
Taxonomy is not operationalized into a coherent evaluation framework.
The taxonomy figures and tables are helpful, but the review does not translate them into actionable evaluation methodology. For example, what constitutes a fair comparison across Minimax/MCTS/RL/hybrids given different compute budgets, hardware, and opponent pools? The paper acknowledges heterogeneity, but then stops short of proposing standardized reporting (e.g., fixed time-per-move budgets, identical opponent sets, ablation reporting, reproducibility checklist, or minimal benchmark suite). This is a missed opportunity and would significantly strengthen the paper.
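
To make the standardized-reporting idea above more concrete, one possible shape for a per-experiment record is sketched below. It is purely illustrative and comes from neither the manuscript nor the reviewer: the `MatchReport` class, its field names, and the example values are hypothetical. The only established formula used is the standard logistic Elo expected score.

```python
from dataclasses import dataclass, field


@dataclass
class MatchReport:
    """Hypothetical minimal record for one agent-versus-pool evaluation."""
    agent: str
    opponents: list[str]          # identical opponent set across compared agents
    time_per_move_s: float        # fixed per-move compute budget
    games: int
    wins: int
    draws: int
    seeds: list[int] = field(default_factory=list)  # for reproducibility

    @property
    def losses(self) -> int:
        return self.games - self.wins - self.draws

    @property
    def score_rate(self) -> float:
        """Win rate with draws counted as half a win."""
        return (self.wins + 0.5 * self.draws) / self.games


def elo_expected(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Example values are invented for illustration only.
report = MatchReport(
    agent="mcts-demo",
    opponents=["random", "minimax-depth-4"],
    time_per_move_s=1.0,
    games=100,
    wins=60,
    draws=10,
    seeds=[0, 1, 2],
)

if __name__ == "__main__":
    print(report.losses, report.score_rate)
    print(elo_expected(1600, 1500))
```

Recording the budget, the opponent pool, and the seeds alongside the outcome is what would let results from differently resourced studies be compared; which fields a community standard should actually require is left open here.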
Coverage appears uneven and sometimes weakly justified (scope drift).
Several entries are included that are not clearly Connect-4-centric (e.g., Othello or general board-game papers) and the justification is sometimes brief and repetitive (“game-agnostic, transferable”). A stronger review either (i) limits scope tightly to Connect-4, or (ii) formalizes inclusion rules showing how cross-game papers contribute methodological insight specifically to Connect-4 (with explicit mapping: what transfers, what does not, and why). Right now, it feels like breadth is achieved at the cost of depth and coherence.
KAN discussion is too brief and not integrated; needs a stronger, evidence-based treatment.
The mention of Kolmogorov–Arnold Networks (KANs) in the “Open Challenges” section reads like a late addition and is not connected to earlier taxonomy dimensions (efficiency/adaptability/interpretability/reliability). If KANs are proposed as an emerging direction, the paper should:
clearly position KANs relative to existing neural function approximators in RL (e.g., value/policy networks),
explain what KAN changes in the Connect-4 context (representation, interpretability, sample efficiency, training stability),
cite concrete KAN literature and provide a minimal conceptual/mathematical description rather than a high-level claim.
As an example of how to do a rigorous and practitioner-oriented review, the authors should consult and emulate A Practitioner’s Guide to Kolmogorov–Arnold Networks (structure, density of citations within the narrative, use of definitions/equations, and synthesis-driven exposition). Currently, the KAN paragraph is speculative and underdeveloped for a review article.
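
As a pointer to the kind of minimal mathematical description requested above, the standard statements from the KAN literature can be written as follows. This is a sketch for orientation only; the notation is assumed here and is not taken from the manuscript under review.

```latex
% Kolmogorov-Arnold representation: a continuous f on [0,1]^n can be written
% using only univariate continuous functions and addition.
f(x_1,\dots,x_n) = \sum_{q=1}^{2n+1} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)

% One KAN layer: a learnable univariate function on every edge, summed at each node.
x_{l+1,j} = \sum_{i=1}^{n_l} \phi_{l,j,i}\left(x_{l,i}\right), \qquad j = 1,\dots,n_{l+1}

% One MLP layer, for comparison: learnable linear map, fixed nonlinearity \sigma.
x_{l+1} = \sigma\left(W_l\, x_l + b_l\right)
```

In a Connect-4 value or policy network the input would be some encoding of the board. Whether the learned univariate functions correspond to recognisable strategic features there is an open question, not an established result.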

Author Response

Reviewer #3 (Second Round)

We thank the reviewer for providing a second-round report. However, we note with concern that the majority of the comments raised in this round either (i) repeat issues that were explicitly addressed and documented in our first-round revision, or (ii) assert the absence of material that is already present in the revised manuscript. Below, we respond to each point by explicitly referencing the relevant sections, pages, and tables in the current version of the paper.

Comment 1: “Survey reads like a catalogue rather than a critical review… taxonomy not used to drive an integrative narrative.”

Response:
This concern substantially overlaps with the reviewer’s first-round Concerns #1–#5, all of which were acknowledged and explicitly addressed in the previous revision.

In particular:

The former Critical Analysis section was removed entirely.
Its content was integrated into a revised Discussion section, which now synthesizes results across approaches rather than summarizing studies individually.
The taxonomy is explicitly used as an organizing framework for the analysis.

These changes are clearly reflected in:

Section 4 (Discussion), which is structured around the core taxonomy dimensions (efficiency, adaptability, interpretability, robustness).
Figure 2, which anchors the taxonomy and is explicitly referenced in the discussion.
Table 5 (Cross-taxonomic roles of AI techniques in Connect-4 research), which provides a synthesis across methods rather than an enumerative summary.

Notably, in the first round, the reviewer identified the taxonomy as “the main original contribution and a real strength.” The present comment does not acknowledge the structural changes made in direct response to that feedback, nor does it engage with the revised discussion structure.

Comment 2: “Insufficient referencing within the text at key technical claims.”

Response:
We respectfully disagree with this assessment.

The revised manuscript includes dense, local citations throughout the technical discussion, particularly in:

Section 3, where each methodological category (search-based, learning-based, hybrid) is discussed with in-text citations.
Section 4 (Discussion), where comparative claims are directly supported by cited studies.
Tables 2–6, which consolidate references while being explicitly discussed in the surrounding narrative.

References are therefore not confined to tables but are integrated at the point where claims are made. The reviewer’s statement appears inconsistent with the actual citation density of the revised manuscript.

Comment 3: “Limited mathematical/technical depth for a methods-and-metrics review.”

Response:
This comment reflects a new expectation that was not raised in the first round.

The manuscript is explicitly positioned as a survey and synthesis of methods and evaluation practices, not as a formal derivation or tutorial paper. As stated in Section 1 (Introduction, pp. 1–4), the paper focuses on methodological roles, evaluation practices, and trade-offs, rather than reproducing canonical derivations (e.g., minimax recursion, UCT equations), which are well-established in the literature.

Nevertheless:

Core algorithmic paradigms (minimax-based search, MCTS, RL, and hybrid systems) are discussed conceptually and comparatively in Section 3, with attention to assumptions, scalability, and evaluation contexts.
Evaluation metrics (win rate, computational cost, real-time feasibility, robustness) are discussed qualitatively and comparatively across Sections 3 and 4, and summarized systematically in Table 5.

This level of abstraction is consistent with survey standards in AI methodology reviews and was not raised as a deficiency in the first round. The reviewer’s first-round feedback focused on structure, synthesis, and redundancy, not on mathematical formalism. Introducing this criterion in the second round does not assess whether prior concerns were addressed.

Comment 4: “Taxonomy is not operationalized into a coherent evaluation framework.”

Response:
This point is directly addressed in the revised manuscript.

Table 6 explicitly maps evaluation dimensions and metrics across methodological categories.
Section 4 (Discussion) uses the taxonomy to explain trade-offs in performance, efficiency, interpretability, and robustness.
Section 5 (Open Challenges and Future Research Directions) explicitly discusses the lack of standardized evaluation protocols as a limitation of the field itself.
The manuscript intentionally avoids prescribing a single benchmark protocol, instead critically identifying the lack of standardization as a research gap, which is articulated in Section 5 (Open Challenges, pp. 23–25).

The paper does not “stop short” of this issue; rather, it identifies heterogeneity in evaluation practices as a structural research gap, which is precisely the role of a critical review.

Comment 5: “Coverage appears uneven and sometimes weakly justified (scope drift).”

Response:
The scope and inclusion criteria are explicitly defined in Section 2 (Review Protocol).

Cross-game studies are included only when they contribute transferable methodological insights relevant to Connect-4 (e.g., search strategies, learning paradigms, explainability techniques). This rationale is stated in the methodology and reiterated where such studies are discussed.

Importantly, the scope of included studies is unchanged from the first-round submission, yet this concern was not raised previously.

Comment 6: “KAN discussion is too brief, speculative, and not integrated.”

Response:
Kolmogorov–Arnold Networks (KANs) are intentionally presented as an emerging and exploratory direction, not as an established solution within Connect-4 AI.

Accordingly:

KANs are discussed in Section 5 (Open Challenges and Future Research Directions).
They are explicitly framed as future-facing, with clear limitations and open questions.
Their relevance is connected to interpretability and function approximation challenges discussed earlier in the paper.

Expanding speculative material beyond this would risk overstating the maturity of the approach. The current treatment is deliberately cautious and consistent with best practices for review articles.

Concluding Remark

We respectfully note that the second-round comments:

Largely reiterate issues already addressed and documented in the first revision,
Introduce new evaluation criteria not previously requested,
Or assert omissions that are demonstrably contradicted by the revised manuscript.

We therefore believe that the manuscript already satisfies the reviewer’s original concerns and meets the standards for a critical, taxonomy-driven review. We submit this rebuttal for the Editorial Board’s consideration and defer to the Academic Editor’s judgment regarding the resolution of this review.
