Learning to See Around Corners: A Deep Unfolding Framework for Terahertz Radar Non-Line-of-Sight 3D Imaging
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The manuscript presents algorithms for seeing around corners, that is, for 3D imaging of hidden targets using a non-line-of-sight (NLOS) THz radar.
In my opinion, the idea of this work is highly interesting and the methodology addresses the stated research problem. The adopted model has a strong theoretical foundation, and the techniques employed for both the presentation and the solution of the problem are scientifically robust and, in several aspects, innovative. The analytical framework is well constructed, and the authors clearly possess a deep understanding of the domain, which is reflected in the mathematical formulation and the overall coherence of the proposed methodology. In particular, the strength of the work lies in its ability to integrate advanced modelling techniques with a structured problem-solving strategy, including the introduction of several algorithms. The results appear consistent with the assumptions made, and the internal logic of the approach is sound. From a purely scientific and technical standpoint, the manuscript makes a noteworthy contribution and has the potential to advance the field.
However, there are significant concerns regarding the applicability and generalizability of the proposed design. The experimental or conceptual setup, as currently presented, does not adequately reflect real-world conditions. Key variables and constraints that would typically influence implementation in practical scenarios seem to be either simplified or omitted. As a result, while the model performs well under the defined conditions, its relevance to real-world applications remains uncertain. Addressing this gap would substantially strengthen the impact of the work.
Additionally, the manuscript suffers from issues related to clarity and readability. The text contains an excessive number of acronyms, many of which are inconsistently used throughout the document. This creates confusion and makes it difficult for the reader to follow the argument, particularly for those who may not be deeply familiar with the specific subdomain. A more careful introduction of terminology, along with consistent usage and possibly a glossary of acronyms, would greatly improve the accessibility of the manuscript.
In summary, while the scientific quality of the model and the associated techniques is commendable, revisions are necessary to (I) ensure that the design better represents real-world conditions and (II) improve the clarity of the text by reducing and standardizing the use of acronyms. Addressing these issues would significantly enhance both the practical relevance and readability of the manuscript.
Note: I have included several comments and suggestions throughout the manuscript that require the authors’ attention.
Comments for author File:
Comments.pdf
Author Response
Please see the attachment
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
NLOS THz radar 3D imaging is an emerging area with significant practical potential. The paper addresses this relevant and timely topic. The construction of a THz radar experimental platform and the collection of real measured data represent a meaningful effort.
However, the novelty of this paper is limited. The method is essentially a combination of two existing LOS components, a holographic imaging operator and FISTA-based deep unfolding, applied to a highly simplified NLOS scenario. The authors themselves acknowledge in the conclusion: "Under this simplified model, the proposed NLOS learning imaging algorithm can essentially be regarded as an extension and expansion of existing LOS imaging networks." The FISTA-based deep unfolding framework closely mirrors the architecture of FISTA-Net by Xiang et al. (IEEE Trans. Medical Imaging, vol. 40, no. 5, pp. 1329–1339, 2021). This seminal work is not cited in the manuscript, despite the proposed network sharing the same name and a nearly identical structure. This is a significant citation omission. The authors need to argue convincingly what NLOS-specific innovations are present beyond simply applying existing LOS deep unfolding methods in a mirrored geometry.
The NLOS model is overly simplified. It is built on assumptions that significantly reduce its practical applicability. For example, only specular reflection is considered, while diffuse scattering and diffraction are ignored, and a single flat metal plate serves as the reflective surface rather than realistic building materials. Under these constraints, the proposed method essentially solves a standard LOS sparse imaging problem on a mirror-projected coordinate system, rather than addressing the genuine challenges of NLOS radar imaging such as multipath, phase errors from rough surfaces, and clutter separation.
A major revision addressing the above concerns, particularly strengthening the novelty justification, broadening the experimental comparisons, and being more forthright about the method's current limitations, is required before the paper can be reconsidered for publication.
Comments on the Quality of English Language
The quality of English in this manuscript must be improved before publication.
While the technical content is generally understandable, the paper contains numerous grammatical errors, typographical mistakes, and spelling issues (e.g., “Teraherz,” “Fronbenius,” “propriate,” “initialiaztion,” “operatoring,” “In additional”), as well as misspelled author names in the references (e.g., “Rasker” for “Raskar,” “Wwi J.S.” for “Wei J.S.”). There are also instances of awkward phrasing, inconsistent terminology, and mixed Chinese/English labels in figures.
I recommend a thorough language edit by a proficient English speaker or professional editing service.
Author Response
Please see the attachment
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
This manuscript presents a 3D non-line-of-sight (NLOS) imaging method for hidden metallic targets using a 121 GHz system together with a deep-unfolding reconstruction network (FISTA-Net). The work combines a model-based holographic imaging framework with a learnable iterative sparse-recovery scheme, and the experiments show improved reconstruction quality over conventional RMA and classical FISTA, especially under reduced sampling. The topic is timely, and the combination of computational imaging and learning-based reconstruction is potentially valuable for NLOS sensing at sub-THz frequencies. However, in its current form, the paper still has several important issues that should be addressed before publication. My main concern is that the manuscript currently reads as a proof-of-concept demonstrated under a highly favourable and idealised relay-wall scenario, while some of the practical limitations and the broader positioning of the work are not yet sufficiently discussed. I recommend a major revision before publication in Photonics.
- The terminology should be made more precise. The system operates at 121 GHz, which is close to 0.1 THz. While this is near the THz boundary, it would be clearer and more physically accurate to describe the system explicitly as a 0.1 THz or sub-THz imaging system, rather than implying a more general THz imaging regime. This distinction matters because scattering behaviour, wall interaction, and practical deployment conditions at around 0.1 THz can differ substantially from what many readers may associate with higher-frequency THz imaging.
- The experimental scenario appears too idealised because the relay wall is sheet metal. This provides a highly reflective and favourable condition for the proposed method, but it is far from many realistic NLOS situations. The authors should discuss this limitation more explicitly. More importantly, it would significantly strengthen the paper if they could test, or at least simulate, relay walls with less ideal reflective properties, such as rougher metallic surfaces, ceramic-coated surfaces, polymer-coated surfaces, or more diffuse walls. It would be very useful to know how the proposed 0.1 THz NLOS 3D imaging performs for different wall materials and different reflection conditions, since this is critical for practical applicability.
- It would be worthwhile to report and discuss the distance dependence of the imaging performance. The current experiments seem to focus on one favourable hidden-target distance. For a practical NLOS imaging system, it is important to understand how the reconstruction quality changes as the hidden target moves farther away, or as the relay path becomes less favourable. I suggest the authors include a simple distance-dependent study, for example showing performance metrics or representative reconstructions for several target distances. This would make the practical limits of the method much clearer.
- The manuscript would benefit from a stronger discussion of related NLOS imaging approaches that address non-ideal illumination or detection conditions.
In particular, the authors should better position their contribution relative to prior work such as “Liu, Xintong, Jianyu Wang, Leping Xiao, Zuoqiang Shi, Xing Fu, and Lingyun Qiu. "Non-line-of-sight imaging with arbitrary illumination and detection pattern." Nature Communications 14, no. 1 (2023): 3230.”, which directly considers NLOS imaging beyond idealised regular acquisition patterns.
The present paper focuses on sparse sampling and deep unfolding, which is interesting, but the authors should explain more clearly what is new relative to that line of work and how their method would behave when the illumination or detection pattern becomes less regular or less ideal.
- Figure 2: At present, the figure is not self-contained or self-explanatory enough. It should include a simple summary of the operations performed at each layer so that the reader can understand the data flow directly from the diagram. In particular, there are two arrows involving X^{t} and Z^{(t+1)}, but the transition between these variables is not sufficiently clear. The authors should explicitly explain what happens between these two nodes and how the output of layer t becomes the input to layer t+1. A clearer caption and more explicit labelling would greatly improve readability.
- The realism and generalisability of the training strategy should be discussed more carefully. If the network is mainly trained on simplified sparse synthetic targets, then the authors should comment on how well this training distribution represents realistic hidden scenes. The successful experiments on metallic objects are encouraging, but the manuscript should better explain whether the network is learning a broadly applicable reconstruction prior or whether it is more specialised to sparse and favourable target classes.
- I suggest the authors briefly discuss this work in the broader context of terahertz hardware development. Although the present demonstration is based on a 121 GHz radar-type system, it would be useful to discuss whether similar NLOS concepts might also be implemented using other coherent terahertz platforms, such as laser feedback interferometry or quantum cascade laser-based systems, for example:
- Silvestri, Carlo, Aleksandar D. Rakić, Dragan Indjin, Ali Khalatpour, Christian Jirauschek, Aleksandar Demic, Zoran Ikonic et al. "Quantum cascade laser roadmap." arXiv preprint arXiv:2602.17042 (2026).
- Han, She, et al. "Laser feedback interferometry as a tool for analysis of granular materials at terahertz frequencies: Towards imaging and identification of plastic explosives." Sensors 16.3 (2016): 352.
Author Response
Please see the attachment
Author Response File:
Author Response.pdf
Reviewer 4 Report
Comments and Suggestions for Authors
The manuscript studies Terahertz (THz) radar 3D imaging for detecting, localizing, and imaging hidden targets in occluded environments. This method has potential applications in areas such as autonomous driving and disaster rescue. To address its challenges, the authors study a 3D learning imaging method for NLOS THz radar based on a holographic imaging operator, leveraging the adaptive optimization properties of deep unfolding networks and prior environmental perception. The authors also performed experimental validations in the LAC scenario on several selected targets. Their results demonstrate that the proposed method significantly improves 3D imaging precision and the computational speed of the imaging algorithms.
Opportunities for improvement include minor corrections of English and some points that could be expressed more clearly:
1) It would help readers if Section 2.4.1 explained the padded echo after downsampling more clearly.
2) The claim that their algorithm increases computational speed over traditional sparse imaging algorithms by two orders of magnitude could also be explained more clearly and in more detail.
Comments on the Quality of English Language
The language is generally good, but there are opportunities to improve, e.g.:
First, in order to pursuit high data fidelity, --> First, in order to pursue high data fidelity
using a moving platform to synthesis a 2D virtual antenna array --> using a moving platform to synthesize a 2D virtual antenna array
Also, there are some missing articles and some phrasing could be more elegant.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
The revised manuscript has satisfactorily addressed the most critical concerns from the first round, particularly the citation of the original FISTA-Net work, correction of reference errors, expanded limitation discussion, and improved figure quality. The remaining issues are minor typographical and formatting matters that need to be resolved in a final round of proofreading. The paper presents a valid feasibility study of deep unfolding networks for NLOS THz radar 3D imaging with real experimental validation, and makes a useful contribution to the field.

