Review Reports - The Geometry of Privacy: A Two-Stage Analysis of Generative Membership Inference in Federated Learning

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript “The Geometry of Privacy: A Two-Stage Analysis of Generative Membership Inference in Federated Learning” develops a new approach to analysing the risk of membership inference attacks in federated learning systems. The authors investigate privacy in Federated Learning (FL), focusing on Membership Inference Attacks (MIA). The core of the proposal is a two-stage methodology that assesses privacy through a geometric analysis of signal survival and signal attribution. The study is important given the growing use of federated learning and the need to protect confidential data, and it addresses the need for a formalised approach to assessing the risk of information leakage in FL systems.

I would like to ask the authors a few questions:

In Section 6, the validation is based solely on the use of PathMNIST. Could such empirical verification fail to reflect the system’s behaviour on other types of data?
What confidentiality factors does the proposed SUR fail to take into account?
To which aggregation schemes other than FedAvg does your solution apply?
How does the size of the updates affect the results?

The study proposes a new method for assessing privacy risks in federated learning. The authors have developed the SUR metric and a two-stage analysis approach that provides a better understanding of how data protection works in FL systems. The manuscript is recommended for publication following minor revisions. The study is of interest to specialists in ML and data protection, offering a fresh perspective on the problem of privacy in FL.

Author Response

Please see the attached document.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The paper proposes a clear conceptual split of federated MIA risk. This decomposition is the main strength of the work: it turns an end-to-end privacy question into two interpretable statistical questions, and the paper does a good job motivating why this separation matters in FL. The formulation around alignment, cancellation, and score geometry is interesting and potentially useful for privacy auditing.

The claims are often stronger than what the experiments appear to support, especially where the paper uses deterministic language such as “absolute certainty,” “decisively confirms,” or “strictly holds at 1.0.”

The distinction between client-level and sample-level privacy is a good idea, but the manuscript would benefit from a clearer statement of what exactly TS-MIA quantifies: client contribution detectability, sample membership, or a hybrid notion. Right now the interpretation moves between these levels, which may confuse readers.

The paper should better connect its framework to existing attack literature. It cites recent FL MIA works, but the comparison is mostly conceptual. A reviewer would expect at least one table clarifying: what existing methods measure, what TS-MIA adds, and whether TS-MIA correlates with actual attack outcomes from those baselines.

Some prose is overly dense and rhetorical, especially in the results and conclusion sections. Toning this down would improve scientific style. Phrases like “decisively confirms,” “structurally insulated,” and “strictly locking the attributable score trace” reduce clarity more than they help.

There is also a small presentation issue in the funding/authorship block: “funding, F.A.G.” appears inconsistent with the author list, which names Federico Álvarez but not “F.A.G.” This should be checked.

The paper would improve substantially if the authors added one real attack validation layer: run one or more actual MIA baselines on the same federated trajectories and show whether TS-MIA predicts attack success.

The paper also needs a stronger threat model section that explains when access to U and B is reasonable and when the analysis should instead be interpreted as an upper bound or auditing diagnostic rather than an executable attack model.

The authors should also revise the wording of the conclusions to avoid overclaiming and clearly separate what is proved, what is observed in simulation, and what is hypothesized for broader FL settings.

The paper has a promising core idea and a good theoretical angle, but it needs stronger experimental grounding, a more realistic threat-model discussion, and more careful claim calibration before it is ready for acceptance.

Author Response

Please see the attached document.

Author Response File: Author Response.pdf