Review Reports - Designing CAPTCHA Systems with Reinforcement Learning for Adaptive Defense

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The article proposes an adaptive CAPTCHA defense system based on reinforcement learning for high-security web applications. Specifically, it models bot detection as a POMDP, uses a PPO+LSTM agent to analyze sequential behavioral telemetry, and complements it with an XGBoost classifier at the session level. The validation is carried out on a simulated ticket-purchasing web application, with human data and different types of bots, including scripted bots, replay bots, and LLM-based bots. The work also collects mouse movements, clicks, scrolls, and keystrokes, including mouse sampling every 15 ms.

Among the strengths of the manuscript, first, the originality of the approach stands out, especially in framing CAPTCHA defense as a sequential decision-making problem rather than a simple static classification task. Second, the methodology is described in considerable detail and makes it possible to understand the web application, the data collection, the RL architecture, the classifier, and the experimental protocol. Third, the results are presented clearly, through tables, comparisons between agent variants, analysis by bot tiers, and complementary evaluation with XGBoost. Finally, the article does acknowledge several relevant limitations, such as the use of a single application, a small set of human sessions, and the dependence on client-side DOM telemetry.

However, the article must improve the following aspects in order to be considered suitable for acceptance:

The research questions and, where appropriate, verifiable hypotheses should be stated explicitly, because the manuscript currently describes the methods very well, but does not make the research design sufficiently explicit.
The text can be condensed, since several sections are lengthy and too theoretical for an applied paper.
The practical feasibility of collecting so much information to detect bots should be critically discussed. Although the article claims that the approach is “low-cost” and widely applicable, in practice it collects a considerable amount of telemetry, including mouse movements at 15 ms intervals, in addition to clicks, scrolls, and keyboard activity, which may be costly or not very scalable outside very specific contexts.
It should be explicitly discussed that detection seems to occur late in the interaction. The manuscript itself indicates that actions such as allow or block can only be carried out at the terminal stage and that the agents observe the full sequence before making the final decision, which reduces the practical value of the system for intervening early before critical actions such as reservation or checkout. For me, this is the main weakness that needs to be addressed.
A formal conclusions section should be incorporated, because the article currently moves from Discussion and Limitations to Future Work and does not offer an explicit closing section that synthesizes contributions, scope, and implications.
Several claims about real-world applicability should be moderated, since the evidence comes from a single simulated application, with a limited human sample, desktop sessions, and synthetic bots.
The discussion on privacy, consent, and data management should be strengthened, because the proposal is based on silent collection of behavioral telemetry and this aspect is barely problematized.
The bibliography must reduce the use of technical documentation, web pages, and preprints.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper addresses an important and timely problem and contains several promising ideas, especially its attempt to make CAPTCHA orchestration open, auditable, and adaptively learned rather than purely proprietary. However, the current manuscript still has major weaknesses in empirical framing, comparison fairness, generalization evidence, and claim calibration. I therefore offer a few suggestions for improvement before the manuscript is accepted.

Reframe the contribution to avoid overstating deployment readiness. Clarify the exact advantage of RL over the stronger XGBoost benchmark on this dataset.
Explain clearly that terminal action is taken at session end, or add true early-intervention experiments.
Standardize RL and classifier evaluation protocols for fair comparison. Add stronger generalization tests such as user-disjoint or bot-family-disjoint evaluation.
Add more baselines and ablation studies. Add sensitivity analysis for reward and challenge-outcome assumptions.
Report multiple training seeds for RL. Expand the treatment of accessibility, fairness, and privacy.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The paper is rather long and provides very detailed information to the reader; however, this could prove a disadvantage, as the information could certainly be presented in a more summarized and focused manner.

The paper also requires a careful review for editing errors and typos. Examples include:

Lines 33-34: “…reflecting the scale at which of this issue.”

Line 157: “…supplement other other approaches”

Line 220: “This distinction is especially important when comparing prior systems with the framework proposed in this paper.”: Which distinction?

On the concept proposed by the paper:

The authors focus on the development of a CAPTCHA solution intended mainly as a replacement for existing solutions like Google reCAPTCHA. As the authors recognize, given the rapid development of Machine Learning and Generative AI solutions, reCAPTCHA v3 is already considered vulnerable to the latest bot attacks, which can mimic human behavior in mouse movements and keystrokes.

The proposed solution relies on similar attributes (like mouse movement, click events, keystrokes, and scrolling patterns) to discriminate between a human user and a machine/bot. The proposed detection methods, and especially the XGBoost model, provide excellent performance in recognizing a human user based on the dataset considered in the paper. However, it is evident that the proposed concept exploits features similar to the ones used by, e.g., reCAPTCHA. In this context, the proposed solution is as vulnerable to machine learning / AI bot-based attacks as reCAPTCHA. Even if it is a new method unknown to potential attackers, it may be compromised within weeks or months of becoming commercially available, since mimicking human patterns for the input features is already a feasible task for generative AI.

Since any proposed security method should provide a clear advantage over state-of-the-art methods, one that allows for immunity to attacks for a substantial time period, it appears that the proposed solution does not qualify for wide adoption in the market. To provide an example from another cybersecurity field: post-quantum cryptography provides solutions that can survive future quantum-computing-based attacks, thus creating new cryptographic solutions that can last for many years.

The authors provide a good level of discussion regarding the potential drawbacks of the proposed concept. However, they do not address the core issue: is the proposed solution capable of handling real state-of-the-art AI-based attacks, and not only the simulated AI bot patterns of the study?

Based on the above comments, in order for the paper to become publishable, the authors should provide a clear rationale for why the proposed solution qualifies as a future-proof security measure that justifies commercial exploitation.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

I am satisfied with the revisions made by the authors, as they have adequately addressed my main comments and substantially improved the manuscript. Therefore, I recommend that the paper be accepted for publication in its current form.

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have addressed the issues raised in the review process, and the changes have been incorporated. I do not have any further comments. The manuscript can therefore be accepted after formal processing.

Reviewer 3 Report

Comments and Suggestions for Authors

The authors have revised and improved the content of the paper. As feedback to the authors: increasing the length of the paper is not good practice.