RipenessGAN: Growth Day Embedding-Enhanced GAN for Stage-Wise Jujube Ripeness Data Generation
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Reviewer Comments
- Key Contribution and Implementation
- The authors state that temporal progression data generation is the key contribution of RipenessGAN. However, in Figure 1 this aspect is not clearly illustrated. I recommend adding intermediate synthetic data examples between the four discrete stages shown in the synthetic image row. This would better demonstrate the claimed capacity for continuous stage-wise transformation.
- Conceptual Framing of RipenessGAN
- CycleGAN serves as the baseline, performing domain translation between maturity classes (e.g., green ↔ red fruit) using categorical labels. While effective for discrete mappings, it does not capture continuous growth progression.
- RipenessGAN introduces Growth Day Embeddings (GDE) to expand the space from four discrete classes (0–3) to 56 daily steps, thereby enabling more granular control of temporal transitions. The generator conditions on both class label and growth day, while the discriminator (PatchGAN) incorporates these embeddings for conditional evaluation. Linear interpolation between embeddings further supports continuous progression.
- Conceptually, this is a meaningful extension of CycleGAN. It is similar to conditional GANs with embeddings and is technically valid. However, clarifications are needed on:
- Whether cycle-consistency is preserved or replaced, since stability in CycleGAN depends on this term.
- How temporal smoothness is enforced beyond PatchGAN, and whether a dedicated temporal consistency loss is included.
- The claim of superiority over diffusion models should be justified: diffusion’s noise-based conditioning is not directly comparable to temporal embeddings.
- The rounding of growth days to integers may introduce discontinuities; interpolation mitigates this but should be more formally described, e.g., with an equation of the kind sketched below.
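One way such an equation could be written (a sketch inferred from the mechanism described above, not the authors' actual formulation): with a learned embedding vector e_k for each integer growth day k, a fractional day d maps to

```latex
e(d) = (1 - \alpha)\, e_{\lfloor d \rfloor} + \alpha\, e_{\lceil d \rceil},
\qquad \alpha = d - \lfloor d \rfloor .
```

This reduces to the stored embedding when d is an integer and varies linearly in between, which would remove the rounding discontinuity raised above.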
- Strengths
- Qualitative results are included (Figure 7), which appropriately demonstrate visual comparisons across CycleGAN, DDPM, and RipenessGAN.
- Quantitative metrics (FID, LPIPS, PSNR, SSIM) are aligned with qualitative evidence, showing perceptual differences between models.
- The discussion and conclusion sections provide a clear narrative of strengths and weaknesses of each paradigm, linking results to augmentation and classification.
- Remaining Concerns
- Impact remains modest
- Reported improvements in classification accuracy (98.67% → 98.75%) are marginal. The practical agricultural value of such small gains is unclear. The authors should clarify whether qualitative improvements (e.g., smoother temporal transitions) provide benefits beyond classifier accuracy.
- Comparisons are limited
- The study compares CycleGAN, DDPM, and RipenessGAN, but omits simpler augmentation baselines (e.g., traditional augmentation techniques, temporal CNN/RNN models). Without these, it is difficult for readers in AI + agriculture to judge whether RipenessGAN is a necessary advance.
- Discussion style
- The Discussion reads more like a GAN benchmarking paper than an agricultural phenotyping study. It would be strengthened by tying the visual results directly to agricultural traits (e.g., texture, gloss, and color progression in jujube images, and how these align with real-world ripening physiology).
- Suggested Revision
The inclusion of Figure 7 with comparative images across CycleGAN, DDPM, and RipenessGAN is valuable and substantiates the numerical results. However, the gains in classification accuracy are modest, raising questions about practical significance for agricultural applications. The discussion would benefit from stronger connections between visual improvements (e.g., smoother temporal transitions) and agricultural utility, and from additional benchmarks against simpler augmentation or temporal modeling baselines. These revisions would position RipenessGAN more convincingly as not only a novel GAN variant, but as a meaningful contribution to agricultural AI.
- Crop Choice and Generalizability
- The manuscript focuses on jujube as the case study crop, but the rationale for this choice is not explained. Was this due to dataset availability, agricultural importance, or specific visual characteristics of jujube ripening?
- I recommend that the authors clarify the reason for selecting jujube.
- More importantly, the authors should discuss whether RipenessGAN is expandable to other crops, such as tomato, grape, or apple, which also exhibit distinct ripening trajectories. If the Growth Day Embedding approach is general, then its scalability should be emphasized. If there are limitations (e.g., reliance on crop-specific cues or dataset constraints), these should be acknowledged.
- This clarification would help establish whether RipenessGAN is intended as a general-purpose agricultural generative framework or primarily as a jujube-specific solution.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
- The literature review is not sufficiently comprehensive or in-depth. Recent advancements in agricultural vision, such as Transformers, Vision-Language Models, and multimodal generative models, are scarcely mentioned. Traditional approaches to addressing data imbalance (e.g., oversampling/undersampling, class-weighted loss functions, ensemble learning) are not systematically discussed, which makes the research perspective appear narrow. While the manuscript notes the limitations of CycleGAN and diffusion models, it does not provide a systematic comparison with existing agricultural studies, instead relying on scattered references. More thorough coverage of this literature is recommended.
- The statements in the research purpose and background section are overly categorical. For example, the manuscript claims that CycleGAN “lacks temporal progression modeling capability” and that diffusion models are “too slow for deployment.” Such expressions are too absolute, as improved versions of CycleGAN and lightweight diffusion models have partially alleviated these issues.
- The extension to practical applications is insufficient. Although harvest timing is mentioned, the manuscript does not elaborate on how the results could be translated into tangible agricultural benefits, such as labor savings or improved marketability. As a result, the justification of application value remains underdeveloped.
- It is suggested to include a transitional paragraph that briefly summarizes the specific requirements of agricultural vision tasks for generative models (e.g., temporal consistency, efficiency under resource constraints). This would provide a more natural motivation for introducing RipenessGAN.
- Although the mathematical definition of Growth Day Embedding is complete, it lacks intuitive illustrations or pseudocode, making it difficult for general readers to follow; a sketch of the kind of pseudocode that would help appears after this list. The explanation of the temporal consistency loss function is also superficial and does not clarify its role in improving training stability or preventing mode collapse. Parameter settings and implementation details (e.g., learning rate schedules, loss function weights) are not adequately justified.
- While the overall methodology is clearly described, some parameter choices (e.g., the weight of the temporal consistency loss, the selection of 16 embedding dimensions) lack sufficient justification, which may affect the generalizability of the approach. Classifier validation relies solely on ResNet-18, with no evaluation across other architectures, potentially limiting the applicability of the findings. The data augmentation strategies employed are relatively basic and do not account for common agricultural challenges (e.g., lighting variations, occlusion, complex backgrounds), raising concerns about robustness in real-world scenarios.
- The analysis of results remains largely at the numerical level. The manuscript does not provide deeper mechanistic explanations of why RipenessGAN outperforms CycleGAN and diffusion models. Failure cases or artifacts in generated images are not examined. Moreover, the lack of statistical validation (e.g., significance testing, variance analysis) undermines the robustness of the claims, making them appear intuitive rather than evidence-based.
- All experiments were conducted on high-performance GPUs (4 × RTX 3090). There is no verification in practical agricultural scenarios, nor evaluation of inference speed, power consumption, or memory usage on low-resource devices. While RipenessGAN shows greater efficiency than CycleGAN and diffusion models in relative terms, this does not automatically imply deployability in real-world agricultural production. The claim that RipenessGAN is “highly suitable for resource-constrained agricultural scenarios” is therefore insufficiently rigorous and risks overgeneralization.
- The discussion does not explore the underlying reasons for CycleGAN artifacts or DDPM’s blurred boundaries. Nor does it adequately analyze limitations such as the reliance on a single dataset, adaptability to real-world conditions, or deployment feasibility. These omissions reduce the depth of the discussion.
- The conclusions section largely repeats experimental results (including detailed numerical values), resembling a restatement of results rather than a higher-level synthesis of research significance and scholarly contribution. The absence of acknowledgment of limitations or shortcomings makes the conclusions appear overly optimistic.
- The manuscript directly emphasizes that the method “has practical value in agricultural vision applications,” and even implies broad applicability. However, the experiments lack real-world deployment or validation under resource-constrained conditions, making this claim overgeneralized. Similarly, adaptability to complex agricultural environments (e.g., varying illumination, occlusion, noise) has not been tested, yet the conclusions emphasize “robustness and scalability,” which is not adequately supported by evidence.
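As an illustration of the kind of pseudocode requested above, the following is a minimal sketch, assuming a learned per-day embedding table with the 56 daily steps and 16 embedding dimensions stated in the manuscript; the module and variable names are hypothetical and not taken from the authors' implementation.

```python
import torch
import torch.nn as nn

class GrowthDayEmbedding(nn.Module):
    """Sketch: map a (possibly fractional) growth day in [0, 55] to a
    16-dimensional vector by linearly interpolating between the two
    nearest integer-day embeddings."""
    def __init__(self, num_days: int = 56, dim: int = 16):
        super().__init__()
        self.table = nn.Embedding(num_days, dim)

    def forward(self, day: torch.Tensor) -> torch.Tensor:
        lo = day.floor().long().clamp(0, self.table.num_embeddings - 1)
        hi = (lo + 1).clamp(max=self.table.num_embeddings - 1)
        alpha = (day - lo.float()).unsqueeze(-1)  # fractional part in [0, 1)
        return (1 - alpha) * self.table(lo) + alpha * self.table(hi)

# Usage: the resulting vector would be combined with the ripeness-class
# condition before being fed to the generator and the PatchGAN discriminator.
gde = GrowthDayEmbedding()
cond = gde(torch.tensor([3.0, 17.5, 41.25]))      # shape: (3, 16)
```

A figure or snippet of this kind in the Methods section would make the embedding and its interpolation immediately clear to general readers.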
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
Most of my comments are reflected in the revised version.
Reviewer 2 Report
Comments and Suggestions for Authors
There are no other issues; the manuscript can be accepted.

