Dual-Contrastive Attribute Embedding for Generalized Zero-Shot Learning
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The paper introduces a novel method, Dual-Contrastive Attribute Embedding (DCAE), which explicitly addresses the domain shift problem by learning highly discriminative attribute-level features and attribute prototypes. The use of dual-contrastive learning (both attribute-level and class-level) for optimization is a notable innovation. The reported superior performance on three benchmark datasets (CUB, SUN, AWA2) compared to existing state-of-the-art techniques further supports its significance.
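For context on the objective family the reviewer summarizes, a dual-contrastive scheme applies an InfoNCE-style term at two granularities: once against attribute-level prototypes and once against class-level ones. The sketch below is a generic illustration under that assumption; the function and variable names are hypothetical and this is not DCAE's exact formulation.

```python
import torch.nn.functional as F

def info_nce(embeddings, prototypes, labels, tau=0.1):
    # Cosine-similarity logits between each embedding and every prototype,
    # sharpened by a temperature tau.
    logits = F.normalize(embeddings, dim=1) @ F.normalize(prototypes, dim=1).T / tau
    # Each embedding's positive is the prototype indexed by its label;
    # all other prototypes act as negatives.
    return F.cross_entropy(logits, labels)

# A dual-contrastive objective would combine two such terms
# (names hypothetical):
# loss = info_nce(feats, attribute_prototypes, attribute_ids) \
#      + info_nce(feats, class_prototypes, class_ids)
```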
1) While Figure 2 provides a good overview, a more detailed architectural diagram for the internal workings of each module (e.g., the exact layers and connections within the image encoder network, attribute filtering network, and attribute semantics encoder) would be beneficial for full understanding and reproducibility.
2) The paper successfully demonstrates the model's strengths with t-SNE visualizations. However, discussing specific cases where the model fails or underperforms, and analyzing the reasons behind these errors, could provide deeper insights into its limitations and guide future research.
3) While performance is excellent, there's no explicit discussion of the computational complexity (training and inference time, memory usage) of DCAE compared to other methods. This is an important practical consideration for novel models.
4) While the hyperparameter analysis (Figure 4, Figure 5) shows how performance changes with certain parameters, a more in-depth discussion on the robustness of the model to variations in hyperparameters (especially how sensitive it is outside the reported optimal range) would be valuable.
5) While the concept is explained (lines 205-213), the precise mechanism for adaptively selecting "hard" positive and negative samples, especially the sorting and exclusion criteria, would benefit from a more explicit algorithmic description or pseudocode, beyond simply stating that the top μ or ε fractions are excluded (one possible reading is sketched after this list).
6) The paper relies on human-defined attributes. A brief discussion on how the quality or granularity of these attributes might impact DCAE's performance, or if the model offers any insights into more effective attribute design, could be interesting.
7) While strong on the three benchmark datasets, a discussion on potential challenges or adaptations needed for entirely different types of data or attributes (e.g., highly abstract attributes, attributes requiring very complex reasoning) would strengthen the generalizability claims.
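One plausible reading of the percentile-based exclusion raised in comment 5 is sketched below. The similarity measure, sorting direction, and which end of the ranking is excluded are assumptions for illustration, not the authors' stated rule.

```python
import torch
import torch.nn.functional as F

def select_hard_samples(anchor, positives, negatives, mu=0.1, eps=0.1):
    # Similarity of the anchor to each candidate embedding.
    pos_sim = F.cosine_similarity(anchor.unsqueeze(0), positives)
    neg_sim = F.cosine_similarity(anchor.unsqueeze(0), negatives)

    # Hard positives are the LEAST similar same-attribute samples; sort
    # hardest-first, then drop the top mu fraction as likely outliers.
    pos_idx = torch.argsort(pos_sim)[int(mu * len(pos_sim)):]

    # Hard negatives are the MOST similar different-attribute samples;
    # sort hardest-first, then drop the top eps fraction as likely
    # false negatives (near-duplicates of the anchor's attribute).
    neg_idx = torch.argsort(neg_sim, descending=True)[int(eps * len(neg_sim)):]

    return positives[pos_idx], negatives[neg_idx]
```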
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
The paper relies heavily on empirical validation but provides little theoretical analysis of why dual-contrastive optimization should yield better generalization; for example, there is no formal discussion of sample complexity, embedding geometry, or generalization bounds.
Calibrated stacking for GZSL is borrowed from prior work; its interaction with DCAE is not analyzed in depth.
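For reference, calibrated stacking (Chao et al., 2016) subtracts a validation-tuned constant from seen-class scores before the argmax, so unseen classes are not systematically dominated at test time. The sketch below shows the standard technique in generic form, not DCAE's specific integration of it.

```python
import numpy as np

def calibrated_stacking(scores, seen_mask, gamma):
    # scores: (n_samples, n_classes) compatibility scores.
    # seen_mask: boolean vector marking seen classes; gamma: calibration
    # constant tuned on validation data.
    adjusted = scores - gamma * seen_mask.astype(scores.dtype)
    # Predict the class with the highest calibrated score.
    return adjusted.argmax(axis=1)
```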
Only three benchmark datasets are used; these are standard but limited. The absence of evaluation on large-scale or domain-shift-heavy datasets weakens the claims of broad applicability.
Results are presented as single numbers without standard deviations or significance testing; it is unclear whether the improvements are statistically significant or due to variance across training runs.
While individual components are ablated, the effect of alternative sampling strategies beyond hard sample selection, or of backbone architectures other than ResNet101, is not studied.
Although prototypes are mentioned, their dimensionality, initialization, and interpretability are not discussed in depth.
The selection mechanism is heuristic (percentile-based μ and ε). A more formal justification, or adaptive strategies, could strengthen the technical novelty.
Several relevant cross-disciplinary works are missing, such as "Multi-objective optimisation of micromixer design using genetic algorithms and multi-criteria decision-making algorithms".
Computational complexity and training cost are not reported. Dual contrastive learning with large attribute sets may be expensive.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
Please add an abbreviation table at the end of the paper.
Author Response
Comment 1: Please add an abbreviation table at the end of the paper.
Response: Thank you for your comment. We have added an abbreviation table at the end of the paper.