Ptycho-LDM: A Hybrid Framework for Efficient Phase Retrieval of EUV Photomasks Using Conditional Latent Diffusion Models
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The manuscript presents a compelling advancement in ptychographic phase retrieval, with strong empirical results and a valuable dataset. Addressing the major concerns, particularly real-data validation and defect detection metrics, would significantly strengthen its impact. Recommend acceptance with minor revisions.
Sec. 4.2.4: Clarify how peripheral misdetections (beyond probe support) impact full-chip inspection.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Comments are attached as a PDF.
Comments for author File:
Comments.pdf
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The submitted manuscript proposes a hybrid framework, Ptycho-LDM, that combines the Difference Map ptychographic phase retrieval algorithm with a conditional latent diffusion model to improve EUV photomask phase retrieval. The authors demonstrate that their method outperforms conventional approaches and highlight improvements in reconstruction speed and quality. The proposed framework is interesting and has potential for photomask inspection, but several important concerns need to be addressed before further consideration. My comments follow:
- The study is based entirely on the in-house synthetic dataset LAMP and does not include any evaluation using real-world experimental measurements. Since actual EUV mask inspection deals with noise, alignment errors, and other imperfections, it is hard to tell how well the proposed method would work in practice. Adding even a small test on real-world data, or discussing how the model might be adapted, would help support the practical value of the method.
- The presented results show that the LDM adds substantial value to the proposed framework, but the work would be stronger with more analysis of how much each part of the hybrid method contributes. How well would the LDM perform if initialized differently, or without DM at all?
- The method assumes an ideal probe shape, perfect scanning, and no noise. Real imaging setups often deviate from these conditions. The manuscript does not evaluate how sensitive the method is to such practical issues.
- MSE and PSNR are computed within a mask defined by a probe intensity threshold τ = 0.015, but the manuscript does not explain why this threshold was chosen or whether the evaluation is sensitive to it. Since it affects the reported metrics, this should be clearly addressed.
- The model was trained using high-performance GH200 GPUs on the CSCS Alps cluster, but the authors do not report details on training cost, training time, inference time, memory usage, etc. This information is important for understanding whether the proposed method is feasible for real-time inspection scenarios.
- Some terms, such as ‘LDM’, are defined more than once (lines 92, 97, 99, 133, 230, 347...). After abbreviations like LDM or DM are introduced, the full names are still used later in the manuscript. For clarity and professionalism, the authors should remove the repeated definitions and use each abbreviation consistently after its initial definition.
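To make the comment on the threshold τ concrete, the following is a minimal sketch of a masked MSE/PSNR evaluation and a threshold sweep. It is not the authors' implementation: the function names, the normalization of the probe intensity to its peak, and the choice of `data_range = 1.0` are all assumptions made for illustration.

```python
import numpy as np


def masked_mse_psnr(recon, target, probe_intensity, tau=0.015, data_range=1.0):
    """MSE and PSNR over pixels where the peak-normalized probe intensity >= tau.

    Assumptions (not taken from the manuscript): the probe intensity is
    normalized by its maximum, and PSNR uses a fixed data_range.
    """
    norm = probe_intensity / probe_intensity.max()
    mask = norm >= tau
    if not mask.any():
        raise ValueError("threshold excludes every pixel")
    err = recon[mask] - target[mask]
    mse = float(np.mean(np.abs(err) ** 2))  # np.abs also handles complex fields
    psnr = float("inf") if mse == 0 else float(10.0 * np.log10(data_range**2 / mse))
    return mse, psnr, int(mask.sum())


def threshold_sweep(recon, target, probe_intensity, taus):
    """Evaluate the masked metrics for several thresholds (hypothetical helper)."""
    return {tau: masked_mse_psnr(recon, target, probe_intensity, tau) for tau in taus}
```

Reporting the metrics for a few values around τ = 0.015, together with the number of pixels inside each mask, would show directly whether the conclusions depend on the chosen threshold.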
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Round 2
Reviewer 3 Report
Comments and Suggestions for Authors
The authors have done a good job addressing the comments.
