Interpretable Deep Prototype-Based Neural Networks: Can a 1 Look like a 0?
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The paper identifies, demonstrates, and discusses a potential risk of
a prototype-based explainable AI method associated with using the
interpretations of the prototypes during training by mapping them
back into the human-readable input space.
The paper is very well and clearly written, has a convincing structure
and a detailed mathematical analysis of the method leading to the
demonstrated paradigm.
The reviewer believes that the paper can be published as is.
Comments on content-related issues:
Apart from the well-explained feature of the method that the
prototypes' values don't provide semantic information, it might also
be noted as a beneficial characteristic of the method that "the
dynamics of the autoencoder and the spatial configuration of the
prototypes within the latent space" is the major cause of the
interpretability. Namely, with a given ground-truth set, the
discrepancies between the predictions and the found prototype class may
point to the specifics of 'how the predictor errs', i.e., which type of
input may be misinterpreted. Of course, this would be a post-hoc
method, analysing the predictor, instead of a prototype-based
assistance of the training.
However, the reviewer finds it interesting that the third-to-last row in
the table of Fig. 4 puts the highest weight on class '0', while
visually it may well be attributed to '8' or '9'. In a post-hoc way,
this may be considered a bonus of the method, identifying the
features on which the predictor decides (see the discussion under
Related Work in Reference 22).
The reviewer thinks this might be worth noting, as is done in the
last sentence of the conclusion ("post-hoc auditing tools capable of
identifying contradictory associations"). The reviewer understands,
however, if the authors consider this outside the scope of the
paper.
Comments on Style and Form:
Fig. 4: It is not clear which of the 15 digits shown in the 3-by-5
raster corresponds to each of the 15 rows in the table below (although
the mapping becomes plausible after inspection of the values).
The references sometimes have the year of publication in bold face,
sometimes not.
Typos:
No typos found.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
The state of the art needs to be expanded with more recent results, including a comparative analysis of some evaluations against alternative methods. Structurally, it is important to standardize the citation style throughout the text, as references currently alternate between numerical and author-date formats (e.g., Author et al.). The formatting of multiple citations should also be fixed, as they are sometimes separated by commas and sometimes by dashes. The description of the architectural figure should be improved by avoiding vague references such as "superior" in the text (it can be divided into clearly defined sections for clarity). Acronyms should be defined only once, at their first occurrence, to avoid redundancy and improve reader understanding.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
This paper proposes a Deep Prototype-Based Network (DPBN) that stacks multiple prototype layers for "this-looks-like-that" reasoning, alongside a hypersphere-intersection (HS-Int) decoding scheme to reconstruct deep prototypes back into input space. It further introduces a Normalized Negative Entropy (NNE) metric to assess how class-specific the prototype-to-class weight matrix is.
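For context on the kind of score the NNE metric describes, the following is a minimal sketch of a normalized-negative-entropy measure over a prototype-to-class weight matrix. The function name and the exact normalization (row-wise entropy scaled by log of the class count) are the reviewer's illustrative assumptions, not the paper's definition, which may differ.

```python
import numpy as np

def normalized_negative_entropy(W):
    """Hypothetical NNE-style score for a prototype-to-class weight
    matrix W of shape (num_prototypes, num_classes).

    Each row is normalized to a distribution p; its entropy H(p) is
    scaled by log(C), so a one-hot (fully class-specific) row scores 1
    and a uniform (class-agnostic) row scores 0.
    """
    W = np.abs(np.asarray(W, dtype=float))
    p = W / W.sum(axis=1, keepdims=True)        # per-prototype class distribution
    eps = 1e-12
    H = -(p * np.log(p + eps)).sum(axis=1)      # row-wise entropy
    C = W.shape[1]
    return float(np.mean(1.0 - H / np.log(C)))  # 1 = sharp, 0 = uniform
```

Under this convention, an identity weight matrix (each prototype supporting exactly one class) scores near 1, while a uniform matrix scores near 0.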
Some things should be addressed before publishing the paper.
- The reproducibility of the results is impossible as the code is not available.
- MNIST is saturated and visually simple; conclusions about interpretability/robustness may not transfer. Include at least one modern, fine-grained or texture-biased benchmark where prototype methods are commonly tested (e.g., CUB-200-2011, Stanford Cars, CIFAR-10/100, Fashion-MNIST). Current claims about a "structural paradox" are under-supported without broader evidence.
- Provide complexity analysis (per-sample, per-layer), convergence behavior (rate, failure modes), and ablations versus simpler decoders (e.g., MLP regressors from distance vectors to latent codes, or linear least-squares only). Discuss differentiability when intersections are non-unique and how sub-gradients influence training. Currently, the method sounds elegant, but its computational footprint and robustness are unclear.
- NNE measures the global sharpness of the classifier head but not the faithfulness of each prototype to a class. Add per-prototype diagnostics: top-1 class mass, prototype class-purity over nearest neighbors, and per-prototype confusion (precision/recall for the class it most supports). Consider training-time constraints (e.g., class-specific prototype assignment or regularizers tying prototypes to class logits) and test whether they reduce paradoxes without hurting accuracy.
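The per-prototype diagnostics suggested in the last point could be sketched as follows. All names, the function signature, and the use of Euclidean distance in latent space are illustrative assumptions by the reviewer, not the paper's implementation.

```python
import numpy as np

def prototype_diagnostics(W, proto_latents, data_latents, labels, k=10):
    """Hypothetical per-prototype diagnostics.

    - top-1 class mass: fraction of a prototype's (absolute) classifier
      weight concentrated on its single most-supported class.
    - k-NN class purity: fraction of the prototype's k nearest training
      latents whose label matches that most-supported class.
    """
    W = np.abs(np.asarray(W, dtype=float))
    top_class = W.argmax(axis=1)                 # most-supported class per prototype
    top_mass = W.max(axis=1) / W.sum(axis=1)     # weight concentration on that class
    purity = []
    for j, c in enumerate(top_class):
        d = np.linalg.norm(data_latents - proto_latents[j], axis=1)
        nn = np.argsort(d)[:k]                   # k nearest training latents
        purity.append(np.mean(labels[nn] == c))
    return top_class, top_mass, np.array(purity)
```

A prototype with high top-1 class mass but low neighbor purity would be exactly the kind of "contradictory association" (a '1' that looks like a '0') this review asks the authors to flag.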
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
The revisions satisfactorily address my comments, and I believe the manuscript is now ready for publication.