Article
Peer-Review Record

Truncation Artifact Reduction in Stationary Inverse-Geometry Digital Tomosynthesis Using Deep Convolutional Generative Adversarial Network

Appl. Sci. 2025, 15(14), 7699; https://doi.org/10.3390/app15147699
by Burnyoung Kim 1 and Seungwan Lee 2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4:
Submission received: 10 June 2025 / Revised: 7 July 2025 / Accepted: 8 July 2025 / Published: 9 July 2025
(This article belongs to the Section Biomedical Engineering)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This manuscript titled "Truncation artifact reduction in stationary inverse-geometry digital tomosynthesis using deep convolutional generative adversarial network" addresses an important problem in tomosynthesis imaging, specifically the challenge of truncation artifacts in stationary inverse-geometry configurations.

  • The review of related deep learning methods for artifact reduction in tomosynthesis (and specifically GAN applications) is somewhat limited.
  • The architecture of the GAN is not fully detailed. The number of layers, the kernel sizes, and the activation functions used are unclear.

  • No ablation study is presented to justify design choices (e.g., number of residual blocks, choice of loss functions).

  • The data augmentation techniques, if any, are not described.

  • The choice of metrics for performance evaluation should be justified, especially if only a limited set is used.

  • The sample size for real clinical images is not reported.

  • The train/test split methodology lacks detail. Was a k-fold cross-validation performed?

  • There’s no explanation of how ground truth images for truncated regions in clinical data were obtained.

  • Limitations are underexplored. The potential for GAN hallucination artifacts, generalization to different imaging geometries, and inference time issues are not mentioned.

Author Response

Comments 1: The review of related deep learning methods for artifact reduction in tomosynthesis (and specifically GAN applications) is somewhat limited.

Response 1: Thank you for your comment. As suggested, we reviewed additional previously published papers that reported GAN-based networks for reducing metal artifacts and improving image quality in digital tomosynthesis. We have added a sentence in the third paragraph of Section 1, along with the references [12] and [13] in the reference section.

Comments 2: The architecture of the GAN is not fully detailed. It is unclear how many layers, the kernel size, and activation functions were used.

Response 2: The detailed architecture of the proposed network is already described in the second and third paragraphs of Sub-section 2.3, Figures 3 and 4, and Tables 2 and 3.

Comments 3: No ablation study is presented to justify design choices (e.g., number of residual blocks, choice of loss functions).

Response 3: Thank you for your comment. As described in the first paragraph of Sub-section 2.3, the architecture of the proposed network was based on that of the DCGAN reported in reference [18]. Also, as described in the first paragraph of Sub-section 2.4, the loss functions were chosen with reference to [23]. Residual blocks were not used in the proposed network.

Comments 4: The data augmentation techniques, if any, are not described.

Response 4: Thank you for your comment. As described in the second paragraph of Section 1 and as can be observed in Figures 6 and 10, the truncation effects in the s-IGDT images depend on the geometric characteristics and scan conditions of the imaging system. The network was trained with these characteristics and conditions taken into account. Thus, the training and test projections were prepared without data augmentation, because the geometric characteristics and scan conditions could be distorted in augmented images. We have added sentences in the second paragraph of Section 4.

Comments 5: The choice of metrics for performance evaluation should be justified, especially if only a limited set is used.

Response 5: Thank you for your comment. The SNR is a well-known metric for evaluating the noise properties of an image. The PSNR and SSIM are generally used to measure the accuracy and similarity of a given image with respect to a ground-truth image. Moreover, many studies developing deep learning-based techniques have used the PSNR and SSIM to demonstrate performance. For these reasons, we used the SNR, PSNR and SSIM to evaluate the performance of the proposed method. We have added the references [25]-[27] at the end of the first paragraph of Sub-section 2.5 and in the reference section to justify the choice of metrics.
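For orientation, the SNR and PSNR named in this response can be sketched in a few lines. The definitions below are common conventions (SNR as mean over standard deviation in dB; PSNR against a ground-truth image with a given dynamic range) and are illustrative only, not the authors' exact implementation, which may use an ROI-based variant:

```python
import numpy as np

def snr_db(img):
    """Signal-to-noise ratio in dB: mean signal over its standard deviation.
    (One common definition; the paper may measure it over a specific ROI.)"""
    return 20 * np.log10(img.mean() / img.std())

def psnr_db(reference, test, data_range=255.0):
    """Peak signal-to-noise ratio in dB against a ground-truth image."""
    mse = np.mean((reference.astype(float) - test.astype(float)) ** 2)
    return 20 * np.log10(data_range / np.sqrt(mse))

# Toy example: a uniform offset of 10 gray levels on an 8-bit image
ref = np.zeros((64, 64))
out = ref + 10.0
print(round(psnr_db(ref, out), 2))  # 28.13
```

SSIM involves local luminance, contrast and structure comparisons and is usually taken from an image-processing library rather than hand-rolled.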

Comments 6: The sample size for real clinical images is not reported.

Response 6: Thank you for your comment. Several s-IGDT systems have been pre-clinically tested for applications such as breast and chest imaging. Clinical s-IGDT systems have not yet been reported, so no clinical s-IGDT sample size can be stated. The purpose of this study was to preliminarily demonstrate the feasibility of the proposed method for reducing truncation artifacts in s-IGDT, which is under development for clinical applications. We have added a sentence in the final paragraph of Section 4, with the references [35] and [36], to describe the current status of s-IGDT system development.

Comments 7: The train/test split methodology lacks detail. Was a k-fold cross-validation performed?

Response 7: Thank you for your comment. Cross-validation can mitigate issues caused by a lack of training/test data. However, as described above, the purpose of this study was to preliminarily demonstrate the feasibility of the proposed method for reducing truncation artifacts in s-IGDT images, so cross-validation was not applied. Nevertheless, cross-validation would improve the reliability of the results. We have added a sentence in the final paragraph of Section 4 describing the necessity of cross-validation and acknowledging this limitation of the study.

Comments 8: There’s no explanation of how ground truth images for truncated regions in clinical data were obtained.

Response 8: Thank you for your comment. As described above, clinical s-IGDT systems have not yet been reported, and the s-IGDT system and digital phantoms used in this study were simulated. The simulation methodology is explained in detail in Sub-sections 2.1 and 2.2. To obtain ground-truth projections without truncation, we designed an additional detector larger than the detector of the simulated s-IGDT system. The methodology for obtaining ground-truth projections is described in Sub-section 2.1.

Comments 9: Limitations are underexplored. The potential for GAN hallucination artifacts, generalization to different imaging geometries, and inference time issues are not mentioned.

Response 9: Thank you for your comment. The network training was implemented on two 3.0 GHz Intel Xeon E5-2687W v4 CPUs, 128 GB of RAM and a 12 GB Nvidia TITAN Xp GPU, and approximately 43 hours were required for training. We have added a sentence at the end of Sub-section 2.4 describing the training conditions and training time.

In this study, we simulated different scan geometries with various source-array lengths and various numbers of focal spots to account for the various distributions of truncation artifacts, as described in Sub-section 2.1. The results demonstrated that truncation artifact-free s-IGDT images can be acquired using the proposed method regardless of system geometry. This point is discussed in the third paragraph of Section 4.

As you commented, this study has several issues to be resolved in future work. A large portion of the obtained projections was assigned to the training set in order to improve the performance of the trained model on the given dataset. This strategy led to a shortage of test images. Thus, the number of test images should be increased to generalize the proposed method. Also, the proposed method needs to be implemented in accordance with practical applications, and its performance should be experimentally investigated to enhance its applicability. We have added a paragraph with the references [35] and [36] at the end of Section 4 exploring the limitations of this study.

Reviewer 2 Report

Comments and Suggestions for Authors

Dear Authors,

Thank you for providing me with the opportunity to read this interesting paper. Below, I have listed my comments:

1) Some technical concepts are described vaguely or too densely. Why dilated convolutions help with out-painting specifically could be better motivated. The dual-discriminator architecture is interesting, but its motivation could be clearer. Are the sub-discriminators optimizing against different targets?

2) The manuscript needs some grammatical and stylistic polishing. For instance, 'DT can reduce radiation dose' needs to be 'DT can reduce the radiation dose'. As another instance, 'due to the scan strategy with limited angles' needs to be 'due to its limited-angle scanning strategy'.

3) I have one hesitancy about the results and perhaps the authors could provide some explanations. Numerical improvements (e.g., 4–5% higher PSNR or SSIM) are claimed as meaningful but may not be perceptually or statistically significant. Couldn't statistical significance tests (e.g., paired t-tests or Wilcoxon signed-rank) support claims of superiority, especially when differences are small? Hence, I think the claim in the conclusion section that the method is "superior to conventional methods" may need to be softened unless more baselines are added or statistically significant improvements are shown.

4) In the discussion, insightful reflections on architecture limitations (e.g., detail loss from downsampling) are appreciated. However, the recommendation to use perceptual loss and skip connections seems speculative without experimental support.

I hope this feedback is helpful.

Comments on the Quality of English Language

I have pointed out above two instances where language could be revised.

Author Response

Comments 1: Some technical concepts are described vaguely or too densely. Why dilated convolutions help with out-painting specifically could be better motivated. The dual-discriminator architecture is interesting, but its motivation could be clearer. Are the sub-discriminators optimizing against different targets?

Response 1: Thank you for your comment. The out-painting task suffers from a lack of learnable data because the image pixels adjacent to missing image regions are fewer than those available in the in-painting task, as described in the third paragraph of Section 1. This fundamental issue can be addressed by providing learnable data from dilated receptive fields. Compared to a standard convolution layer, a dilated convolution layer can extract features from a larger image area because dilation expands the receptive field. This characteristic allows the dilated convolution to provide sufficient information for the out-painting task. We have added sentences in the second paragraph of Sub-section 2.3 describing the benefits of dilated convolution in the out-painting task.
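The receptive-field growth described here is easy to quantify: for stride-1 convolutions with kernel size k, each layer with dilation rate d adds (k − 1)·d pixels to the receptive field. The sketch below illustrates this general rule only; the layer counts and dilation rates are hypothetical, not those of the paper's network:

```python
def receptive_field(kernel_size, dilations):
    """1-D receptive field of stacked stride-1 dilated convolutions:
    starts at 1 pixel and grows by (k - 1) * d per layer."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

# Three 3x3 layers: standard stack vs. dilated stack (rates 1, 2, 4)
print(receptive_field(3, [1, 1, 1]))  # 7  pixels seen by the standard stack
print(receptive_field(3, [1, 2, 4]))  # 15 pixels with the same layer count
```

The same layer budget thus covers roughly twice the context, which is why dilation helps when the missing region lies at the image border and few adjacent pixels are available.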

In the s-IGDT image, non-truncated regions should maintain the features of the input images, and the generated content for truncated regions should harmonize with the surrounding regions through the discrimination procedure. However, the single discriminator used in conventional GAN-based networks struggles to distinguish the artificiality of each region separately, because it is trained on all areas of the generated image. We have added sentences in the third paragraph of Sub-section 2.3 clarifying the motivation for the dual-discriminator.

As described in the third paragraph of Sub-section 2.3, the sub-discriminators can be optimized independently to discriminate each region, because the truncated and non-truncated regions are input separately into the sub-discriminators during network training.
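The region routing described above can be sketched with a binary mask that splits the generated projection into its out-painted (truncated) and original (non-truncated) parts, each of which feeds its own sub-discriminator. The array shapes and the mask layout below are hypothetical and only illustrate the data routing, not the paper's actual network:

```python
import numpy as np

def split_regions(generated, mask):
    """Route a generated projection to two sub-discriminators via a binary
    mask (1 = truncated/out-painted region, 0 = non-truncated region)."""
    truncated_part = generated * mask            # input to sub-discriminator A
    non_truncated_part = generated * (1 - mask)  # input to sub-discriminator B
    return truncated_part, non_truncated_part

# Toy projection: 4 columns, with the outer column marked as truncated
proj = np.arange(16, dtype=float).reshape(4, 4)
mask = np.zeros((4, 4))
mask[:, 3] = 1.0
trunc, non_trunc = split_regions(proj, mask)
print(trunc[:, 3].tolist())   # outer column survives only in the truncated branch
print(non_trunc[:, 3].max())  # 0.0: masked out of the non-truncated branch
```

Because the two branches never see each other's pixels, each sub-discriminator's gradients reflect only the realism of its own region.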

Comments 2: The manuscript needs some grammatical and stylistic polishing. For instance, 'DT can reduce radiation dose' needs to be 'DT can reduce the radiation dose'. As another instance, 'due to the scan strategy with limited angles' needs to be 'due to its limited-angle scanning strategy'.

Response 2: Thank you for your comment. We have revised the phrases as you commented.

Comments 3: I have one hesitancy about the results and perhaps the authors could provide some explanations. Numerical improvements (e.g., 4–5% higher PSNR or SSIM) are claimed as meaningful but may not be perceptually or statistically significant. Couldn't statistical significance tests (e.g., paired t-tests or Wilcoxon signed-rank) support claims of superiority, especially when differences are small? Hence, I think the claim in the conclusion section that the method is "superior to conventional methods" may need to be softened unless more baselines are added or statistically significant improvements are shown.

Response 3: Thank you for your comment. As you commented, we have revised Section 5 to soften our conclusion, and a sentence has been added in Section 5 describing the limitation of this study.

Also, the normalized root-mean-square error (NRMSE) of the output image was evaluated to provide an additional quantitative result. Compared with the conventional s-IGDT imaging and PDC methods, the proposed method reduced the NRMSEs of the s-IGDT images by 24.11–58.60% and 5.34–30.23%, respectively. Also, the NRMSEs were minimized by the proposed out-painting method for all source-array lengths and all numbers of focal spots. We have added equation (3) and a sentence in the first paragraph of Sub-section 2.5 describing the NRMSE calculation. Sentences have been added in the second paragraph of Sub-section 3.1, the second paragraph of Sub-section 3.2 and the third paragraph of Section 4 presenting the NRMSE results. Also, we have revised Figures 9 and 13 to add the NRMSE graphs.
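For reference, one common form of the NRMSE (RMSE normalized by the dynamic range of the ground-truth image) can be sketched as follows. Whether the paper's equation (3) normalizes by range, mean or maximum is not stated here, so the normalization below is an assumption:

```python
import numpy as np

def nrmse(reference, output):
    """Root-mean-square error normalized by the ground-truth dynamic range.
    (One common convention; equation (3) in the paper may differ.)"""
    rmse = np.sqrt(np.mean((reference - output) ** 2))
    return rmse / (reference.max() - reference.min())

ref = np.array([0.0, 255.0])  # toy ground truth spanning the 8-bit range
out = ref + 10.0              # uniform 10-gray-level error
print(round(nrmse(ref, out), 4))  # 0.0392
```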

Comments 4: In the discussion, insightful reflections on architecture limitations (e.g., detail loss from downsampling) are appreciated. However, the recommendation to use perceptual loss and skip connections seems speculative without experimental support.

Response 4: Thank you for your comment. Several studies have reported that GAN-based networks including residual blocks and skip connections are able to improve the spatial resolution of output images (Zhang X., et al., SIGNAL IMAGE VIDEO P. 2021, 15, 725-733; Bulat A., et al., In Proc. of European Conference on Computer Vision, 2018, pp. 185-200). Other studies showed that the perceptual loss is useful for restoring structural details in image transformation and enhancement tasks (Johnson J., et al., In Proc. of Computer Vision-ECCV 2016 Part II, 2016, Volume 14, pp. 694-711; Gholizadeh-Ansari M., et al., J. Digit. Imaging 2020, 33, 504-515). These studies demonstrated that residual blocks, skip connections and perceptual loss have the potential to prevent a loss of high-resolution information in the output image. The effects of residual blocks, skip connections and the perceptual loss function on the out-painting task will be evaluated in the near future. We have added sentences with the references [32] and [34] in the fourth paragraph of Section 4 to justify our assumption, and a future plan is described at the end of Section 4.

Response to Comments on the Quality of English Language
Point 1: I have pointed out above two instances where language could be revised.
Response 1: Thank you for your comment. We have revised the phrases as you commented.

Reviewer 3 Report

Comments and Suggestions for Authors

This manuscript presents a novel deep learning approach for reducing truncation artifacts in stationary inverse-geometry digital tomosynthesis (s-IGDT). Specifically, the authors design a dual-discriminator, encoder-decoder-based deep convolutional GAN (DCGAN) to perform an out-painting task for restoring truncated projection data. The method is evaluated using simulated CT-based digital phantoms and compared against a previously published projection data correction (PDC) approach. The work is timely and addresses a relevant challenge in medical imaging, particularly in optimizing the clinical utility of low-dose tomographic techniques. The integration of out-painting using GANs in s-IGDT is original and technically well-justified. The paper is generally well-structured, but some important clarifications and improvements are needed, particularly in method transparency, result reproducibility, and linguistic polish.

  • The manuscript introduces a novel GAN-based out-painting strategy to reduce s-IGDT truncation artifacts. The dual-discriminator architecture tailored to differentiate truncated and non-truncated regions is innovative. The work clearly addresses a practical limitation in s-IGDT systems. Could you emphasize more explicitly how this approach improves upon previous GAN-based or CNN-based artifact correction methods (including the authors’ own 2020 SPIE paper)? It would be beneficial to consider briefly comparing with inpainting literature in CT and tomosynthesis reconstruction.
  • The dataset is based on a single test phantom. While the training dataset is large, generalization may be limited. This should be clearly stated in the limitations. There is no mention of whether multiple trials or random seeds were tested. Performance may vary based on initialization. Could you please clarify whether training/testing was repeated for different phantoms or network initializations? Consider adding results from an additional phantom to demonstrate robustness. I would be grateful if you could provide standard deviation/error bars for quantitative metrics in Figures 8 and 11.

  • SSIM results slightly favor the PDC method in some configurations. You have provided a strong explanation, but it’s important to note this limitation in the conclusions. Clinical significance of SNR/SSIM/PSNR improvements (how much of an increase is perceptually meaningful or diagnostically relevant) should be discussed. Clearly state in the conclusions that the current results are based on simulated phantoms, not patient data.
  • There are several awkward or grammatically incorrect sentences, likely due to translation or automatic grammar tools. Repetitions in phrasing (“truncation artifacts in s-IGDT were suppressed…”) reduce readability. A full language and grammar check is needed by a native English speaker or professional editor. Use clearer subheadings and transitions between subsections.
  • Inconsistent citation formatting (some entries show arXiv preprint codes in inconsistent places). Check typographic consistency (spacing, periods) in references.

Comments on the Quality of English Language

There are several awkward or grammatically incorrect sentences, likely due to translation or automatic grammar tools. Repetitions in phrasing (“truncation artifacts in s-IGDT were suppressed…”) reduce readability. A full language and grammar check is needed by a native English speaker or professional editor. Use clearer subheadings and transitions between subsections.

Author Response

Comments 1: The manuscript introduces a novel GAN-based out-painting strategy to reduce s-IGDT truncation artifacts. The dual-discriminator architecture tailored to differentiate truncated and non-truncated regions is innovative. The work clearly addresses a practical limitation in s-IGDT systems. Could you emphasize more explicitly how this approach improves upon previous GAN-based or CNN-based artifact correction methods (including the authors’ own 2020 SPIE paper)? It would be beneficial to consider briefly comparing with inpainting literature in CT and tomosynthesis reconstruction.

Response 1: Thank you for your comment. The benefits of the encoder-decoder-based generator compared with a conventional generator are already described at the beginning of the second paragraph of Sub-section 2.3.

The out-painting task suffers from a lack of learnable data because the image pixels adjacent to missing image regions are fewer than those available in the in-painting task, as described in the third paragraph of Section 1. This fundamental issue can be addressed by providing learnable data from dilated receptive fields. Compared to a standard convolution layer, a dilated convolution layer can extract features from a larger image area because dilation expands the receptive field. This characteristic allows the dilated convolution to provide sufficient information for the out-painting task. We have added sentences in the second paragraph of Sub-section 2.3 emphasizing the benefits of dilated convolution in the out-painting task.

In the s-IGDT image, non-truncated regions should maintain the features of the input images, and the generated content for truncated regions should harmonize with the surrounding regions through the discrimination procedure. However, the single discriminator used in conventional GAN-based networks struggles to distinguish the artificiality of each region separately, because it is trained on all areas of the generated image. We have added sentences in the third paragraph of Sub-section 2.3 clarifying the motivation for the dual-discriminator.

Comments 2: The dataset is based on a single test phantom. While the training dataset is large, generalization may be limited. This should be clearly stated in the limitations. There is no mention of whether multiple trials or random seeds were tested. Performance may vary based on initialization. Could you please clarify whether training/testing was repeated for different phantoms or network initializations? Consider adding results from an additional phantom to demonstrate robustness. I would be grateful if you could provide standard deviation/error bars for quantitative metrics in Figures 8 and 11.

Response 2: Thank you for your comment. As described in the first paragraph of Sub-section 2.2, the digital phantoms were constructed using clinical CT images scanned from different patients. Those CT images were obtained with different tube voltages, tube currents, slice thicknesses and scanners. Thus, each phantom had different structures and image properties, and the diversity of the phantoms was reasonably ensured.

A large portion of the obtained projections was assigned to the training set in order to improve the performance of the trained model on the given dataset. This strategy led to a shortage of test images, and consequently the test images were prepared using a single phantom. Thus, the number of test images should be increased to generalize the proposed method. We have added sentences in the fifth paragraph of Section 4 clarifying the methodology of preparing the training/test data and its limitation.

As you commented, we have provided the error bars for the quantitative measurements in Figures 9 and 13.

Also, the normalized root-mean-square error (NRMSE) of the output image was additionally evaluated to improve the reliability of the results. Compared with the conventional s-IGDT imaging and PDC methods, the proposed method reduced the NRMSEs of the s-IGDT images by 24.11–58.60% and 5.34–30.23%, respectively. Also, the NRMSEs were minimized by the proposed out-painting method for all source-array lengths and all numbers of focal spots. We have added equation (3) and a sentence in the first paragraph of Sub-section 2.5 describing the NRMSE calculation. Sentences have been added in the second paragraph of Sub-section 3.1, the second paragraph of Sub-section 3.2 and the third paragraph of Section 4 presenting the NRMSE results. Also, we have revised Figures 9 and 13 to add the NRMSE graphs.

Comments 3: SSIM results slightly favor the PDC method in some configurations. You have provided a strong explanation, but it’s important to note this limitation in the conclusions. Clinical significance of SNR/SSIM/PSNR improvements (how much of an increase is perceptually meaningful or diagnostically relevant) should be discussed. Clearly state in the conclusions that the current results are based on simulated phantoms, not patient data.

Response 3: Thank you for your comment. As you commented, we have revised Section 5 to describe the limitation of this study in terms of the SSIM results. Also, we have stated at the beginning of Section 5 that the performance of the proposed method was quantitatively evaluated by simulations.

Several s-IGDT systems have been pre-clinically tested for practical applications (Qian X., et al., Med. Phys. 2009, 36, 4389-4399; Yang G., et al., In Proc. of the SPIE Medical Imaging, 2008, Volume 6913, pp. 441-450; Primidis T.G., et al., Biomed. Phys. Eng. Express 2022, 8, 015006; Billingsley A., et al., Med. Phys. 2024, 52, 542-552), but clinical s-IGDT systems have not yet been reported. The purpose of this study was to preliminarily demonstrate the feasibility of the proposed method for reducing truncation artifacts in s-IGDT images. Thus, the clinical significance of the proposed method will be experimentally investigated once clinical systems are developed. We have added sentences in the fifth paragraph of Section 4, with the references [35] and [36], describing the current status of s-IGDT system development and the necessity of experimental evaluation.

We expect that the improvements in noise properties and quantitative accuracy achieved by the proposed method will lead to more precise diagnosis and image guidance using s-IGDT images. We have added a sentence in the third paragraph of Section 4.

Comments 4: There are several awkward or grammatically incorrect sentences, likely due to translation or automatic grammar tools. Repetitions in phrasing (“truncation artifacts in s-IGDT were suppressed…”) reduce readability. A full language and grammar check is needed by a native English speaker or professional editor. Use clearer subheadings and transitions between subsections.

Response 4: Thank you for your comment. We have proofread the entire manuscript to correct grammatical and typographical errors, and the manuscript has been revised to avoid unnecessary repetition.

Sub-section 2.3 has been divided into Sub-sections 2.3 (Network architecture) and 2.4 (Network training), and Section 3 has also been divided into Sub-sections 3.1 (Lengths of the source array) and 3.2 (Numbers of the focal spots).

Comments 5: Inconsistent citation formatting (some entries show arXiv preprint codes in inconsistent places). Check typographic consistency (spacing, periods) in references.

Response 5: Thank you for your comment. As you commented, we have rechecked the arXiv preprint codes and typographic consistency in the reference section.

Response to Comments on the Quality of English Language
Point 1: There are several awkward or grammatically incorrect sentences, likely due to translation or automatic grammar tools. Repetitions in phrasing (“truncation artifacts in s-IGDT were suppressed…”) reduce readability. A full language and grammar check is needed by a native English speaker or professional editor. Use clearer subheadings and transitions between subsections.
Response 1: Thank you for your comment. We have proofread the entire manuscript to correct grammatical and typographical errors, and the manuscript has been revised to avoid unnecessary repetition.
Sub-section 2.3 has been divided into Sub-sections 2.3 (Network architecture) and 2.4 (Network training), and Section 3 has also been divided into Sub-sections 3.1 (Lengths of the source array) and 3.2 (Numbers of the focal spots).

Reviewer 4 Report

Comments and Suggestions for Authors

The manuscript titled “Truncation Artifact Reduction in Stationary Inverse-Geometry Digital Tomosynthesis Using Deep Convolutional Generative Adversarial Network” presents scientifically relevant content by proposing a GAN-based approach to mitigate truncation artifacts in images reconstructed from s-IGDT systems. While the topic is of interest and potential impact, the study requires further refinement, as several methodological and presentation aspects need clarification and improvement.

1) Abstract: Clarify the novelty more explicitly. Additionally, this section could be enhanced by incorporating key quantitative results and relevant statistical metrics obtained from the study.

2) In the Introduction section, the authors should summarize at least five key findings from their study, establishing the significance and novelty of the proposed approach.

3) Materials and methods: Figures 1 and 2 require higher resolution (300 dpi), clearer labeling, and a schematic workflow of the process.

4) Authors should justify this limitation of the study. Clarify if hyperparameters were tuned systematically and whether data augmentation or image normalization was applied. 

5) Additionally, the manuscript should specify the data split or cross-validation methods to assess the model and address any relevant ethical or data access considerations.

6) Results: Can the authors provide a more detailed quantitative comparison between the proposed method and baseline approaches, including statistical metrics (e.g., PSNR, SSIM) and significance testing, to better validate the performance improvements claimed? Figures must be described in the text before being shown.

7) It is recommended that the authors expand the discussion section, which highlights improvements over previous PDC methods but lacks comparison with other GAN-based approaches, consideration of limitations such as generalizability and dataset size, and discussion of clinical applicability. Additionally, this section should include relevant comparisons, acknowledge methodological constraints, and suggest future research directions.

8) The conclusions section should indicate the need for further validation on real-world or clinical datasets to ensure the model's generalizability and robustness.

Author Response

Comments 1: Abstract: Clarify the novelty more explicitly. Additionally, this section could be enhanced by incorporating key quantitative results and relevant statistical metrics obtained from the study.

Response 1: Thank you for your comment. The editorial board of Applied Sciences stipulates that the abstract should be a maximum of about 200 words, which limits how much content can be included. Nevertheless, we have added a sentence at the beginning of the abstract clarifying the motivation of this study. Also, the evaluation method and the metrics used are now described in the middle of the abstract, and the key quantitative results are presented in the abstract.

Comments 2: In the Introduction section, the authors should summarize at least five key findings from their study, establishing the significance and novelty of the proposed approach.

Response 2: Thank you for your comment. The significance and novelty of the proposed method are as follows. First, in the proposed out-painting method, a dilated convolutional block was added to the DCGAN to provide contextual information from diverse receptive fields, which is used for restoring the truncated regions of s-IGDT projections during network training. Second, the generator was designed with an encoder-decoder architecture to generate plausible output images and to stabilize network training. Third, a dual-discriminator architecture was applied to the DCGAN to drive precise training of the generator and to reject artificiality in the restored s-IGDT projections. Fourth, binary mask images were used to differentiate truncated regions from non-truncated regions in an input projection. Fifth, the sub-discriminators were separately optimized using the out-painted and non-truncated regions of the generated images. Finally, the performance of the proposed out-painting method was compared with a conventional out-painting method by evaluating the noise property and quantitative accuracy of s-IGDT images. We have added a paragraph at the end of Section 1 summarizing the methodological significance and novelty of the proposed method.
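For illustration only, the dual-discriminator training objective described above can be written schematically as a generic masked adversarial loss; the notation below is ours and is not the exact formulation given in the manuscript.

```latex
% Schematic masked adversarial objective with two sub-discriminators.
% G: generator; D_t, D_n: sub-discriminators for the truncated (out-painted)
% and non-truncated regions; M: binary mask selecting the truncated region;
% x: truncated input projection; y: ground-truth projection; \odot: element-wise product.
\begin{aligned}
\mathcal{L}_{D_t} &= \mathbb{E}\big[\log D_t(M \odot y)\big]
                   + \mathbb{E}\big[\log\big(1 - D_t(M \odot G(x))\big)\big],\\
\mathcal{L}_{D_n} &= \mathbb{E}\big[\log D_n\big((1-M) \odot y\big)\big]
                   + \mathbb{E}\big[\log\big(1 - D_n\big((1-M) \odot G(x)\big)\big)\big],\\
\mathcal{L}_{G}   &= -\,\mathbb{E}\big[\log D_t(M \odot G(x))\big]
                   - \mathbb{E}\big[\log D_n\big((1-M) \odot G(x)\big)\big].
\end{aligned}
```

Each sub-discriminator sees only its masked region, so the generator receives separate adversarial feedback for the out-painted and non-truncated portions of the projection.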

Comments 3: Materials and methods: Figures 1 and 2 require higher resolution (300 dpi), clearer labeling, and a schematic workflow of the process.

Response 3: Thank you for your comment. We have replaced Figures 1 and 2 with higher-resolution versions (330 DPI). The training workflow of the proposed method is already illustrated at high resolution in Figure 5. The resolution of the figures may appear low depending on monitor resolution or software version; we have submitted the high-resolution source figure files.

Comments 4: Authors should justify this limitation of the study. Clarify if hyperparameters were tuned systematically and whether data augmentation or image normalization was applied.

Response 4: Thank you for your comment. The truncation effects depend on the geometric characteristics and scan conditions of the s-IGDT system, and the network training in this study was performed considering these characteristics and conditions. Accordingly, the training and test projections were prepared without data augmentation, because the geometric characteristics and scan conditions could be distorted in augmented images. The purpose of this study was to preliminarily demonstrate the feasibility of the proposed method for reducing truncation artifacts in s-IGDT images; thus, the hyperparameters for network training were determined manually without systematic tuning. We have added sentences in the second paragraph of Section 4.

Comments 5: Additionally, the manuscript should specify the data split or cross-validation methods to assess the model and address any relevant ethical or data access considerations.

Response 5: Thank you for your comment. Cross-validation can mitigate issues caused by limited training/test data. However, as described above, the purpose of this study was to preliminarily demonstrate the feasibility of the proposed method for reducing truncation artifacts in s-IGDT images, so cross-validation was not applied. Nevertheless, cross-validation would improve the reliability of the results. We have added a sentence in the final paragraph of Section 4 describing the necessity of cross-validation and acknowledging this limitation of the study.

Comments 6: Results: Can the authors provide a more detailed quantitative comparison between the proposed method and baseline approaches, including statistical metrics (e.g., PSNR, SSIM) and significance testing, to better validate the performance improvements claimed? Figures must be described in the text before being shown.

Response 6: Thank you for your comment. The SNR is a well-known metric for evaluating the noise property of an image, while the PSNR and SSIM are generally used to measure the accuracy and similarity of a given image with respect to a ground-truth image. Moreover, many studies developing deep learning-based techniques have used the PSNR and SSIM to demonstrate performance. For these reasons, we initially used the SNR, PSNR and SSIM to evaluate the performance of the proposed method. We have added references [25]-[27] at the end of the first paragraph of Sub-section 2.5 and in the reference section to justify the choice of metrics.

The normalized root-mean-square error (NRMSE) of the output image was additionally evaluated to provide more detailed quantitative results and improve the reliability of the results. Compared with the conventional s-IGDT imaging and PDC methods, the proposed method reduced the NRMSEs of the s-IGDT images by 24.11-58.60% and 5.34-30.23%, respectively. Also, the NRMSEs were minimized by the proposed out-painting method for all lengths of the source array and all numbers of focal spots. We have added equation (3) and a sentence in the first paragraph of Sub-section 2.5 describing the NRMSE calculation. Sentences have been added in the second paragraph of Sub-section 3.1, the second paragraph of Sub-section 3.2 and the third paragraph of Section 4 presenting the results of the NRMSE measurements. Also, we have revised Figures 9 and 13 to add the NRMSE graphs.

All figures have been described in the manuscript before being shown.
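As a minimal illustration of the metrics discussed above, the sketch below implements the standard definitions of NRMSE (normalized by the reference dynamic range) and PSNR; the function and variable names are ours, and the manuscript's equation (3) may use a different normalization convention.

```python
import math

def nrmse(output, reference):
    """NRMSE between two equal-length images (flattened to 1-D sequences),
    normalized by the dynamic range of the reference image. One common
    convention; the manuscript's equation (3) may normalize differently."""
    mse = sum((o - r) ** 2 for o, r in zip(output, reference)) / len(reference)
    dynamic_range = max(reference) - min(reference)
    return math.sqrt(mse) / dynamic_range

def psnr(output, reference, peak=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = sum((o - r) ** 2 for o, r in zip(output, reference)) / len(reference)
    return 10.0 * math.log10(peak ** 2 / mse)

# Illustrative values (not data from the study):
ref = [0.0, 0.5, 1.0, 0.25]
out = [0.1, 0.5, 0.9, 0.25]
print(round(nrmse(out, ref), 4))  # lower is better
print(round(psnr(out, ref), 2))   # higher is better, in dB
```

SSIM is omitted here because it requires windowed local statistics; in practice all three metrics are available in standard image-processing libraries.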

Comments 7: It is recommended that the authors expand the discussion section, which highlights improvements over previous PDC methods but lacks comparison with other GAN-based approaches, consideration of limitations such as generalizability and dataset size, and discussion of clinical applicability. Additionally, this section should include relevant comparisons, acknowledge methodological constraints, and suggest future research directions.

Response 7: Thank you for your comment. We have revised and expanded Section 4 in accordance with your suggestions.

We have added sentences in the third paragraph of Section 4 highlighting the performance of the proposed method with relevant comparisons. (The SNRs of the s-IGDT images obtained using the proposed out-painting method were 75.42-84.91% and 38.87-52.02% higher than those obtained with the truncated projections and the PDC method, respectively. The proposed out-painting method reduced the NRMSEs of the s-IGDT images by 24.11-58.60% and 5.34-30.23% compared to the conventional s-IGDT imaging and PDC methods, respectively. The s-IGDT images obtained using the proposed out-painting method had PSNRs 10.19-37.87% and 1.24-5.10% higher than those obtained with the truncated projections and the PDC method, respectively. Also, the SSIMs for the proposed out-painting method were 16.50-32.77% higher than those for the conventional s-IGDT imaging method.)

We have already discussed the limitations of this study in terms of the SSIM results and the methodological constraints in the fourth paragraph of Section 4. Several studies have reported that GAN-based networks including residual blocks and skip connections can improve the spatial resolution of output images (Zhang X., et al., SIGNAL IMAGE VIDEO P. 2021, 15, 725-733; Bulat A., et al., In Proc. of European Conference on Computer Vision, 2018, pp. 185-200). Other studies showed that perceptual loss is useful for restoring structural details in image transformation and enhancement tasks (Johnson J., et al., In Proc. of Computer Vision-ECCV 2016 Part II, 2016, Volume 14, pp. 694-711; Gholizadeh-Ansari M., et al., J. Digit. Imaging 2020, 33, 504-515). These studies demonstrated that residual blocks, skip connections and perceptual loss have the potential to prevent a loss of high-resolution information in GAN- and deep learning-based approaches. The effects of residual blocks, skip connections and the perceptual loss function in the out-painting task will be evaluated in the near future. We have added sentences with references [32] and [34] in the fourth paragraph of Section 4 presenting these strategies, which may overcome the issue of spatial resolution degradation, with reference to other GAN- and deep learning-based approaches. Also, a sentence has been added at the end of Section 4 suggesting future research directions.

We simulated the s-IGDT projections using digital phantoms constructed from clinical CT images obtained with various tube voltages, tube currents and slice thicknesses for different patients. A large proportion of the obtained projections was assigned to the training set in order to improve the performance of the trained model under the given dataset; this strategy led to a shortage of test images. Thus, the number of test images should be increased to generalize the proposed method. We have added sentences in the fifth paragraph of Section 4 addressing the limitation caused by the dataset size.

We expect that the improvements in noise property and quantitative accuracy achieved by the proposed method will lead to more precise diagnosis and image guidance using s-IGDT images. Several s-IGDT systems have been pre-clinically tested for practical applications, and various scan strategies have been reported for optimizing their applications (Qian X., et al., Med. Phys. 2009, 36, 4389-4399; Yang G., et al., In Proc. of the SPIE Medical Imaging, 2008, Volume 6913, pp. 441-450; Primidis T.G., et al., Biomed. Phys. Eng. Express 2022, 8, 015006; Billingsley A., et al., Med. Phys. 2024, 52, 542-552). Thus, the proposed out-painting method needs to be implemented in accordance with these applications, and its performance should be experimentally investigated to enhance clinical applicability. We have added sentences in the third and fifth paragraphs of Section 4 discussing the clinical applicability of the proposed method and suggesting future research directions.

Comments 8: The conclusions section should indicate the need for further validation on real-world or clinical datasets to ensure the model's generalizability and robustness.

Response 8: Thank you for your comment. As you commented, we have described the necessity of experimental validations for the generalization of the proposed method at the end of Section 5.


Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The revised version is good and recommended for publication.

Author Response

Comment 1: The revised version is good and recommended for publication.

Response 1: Thank you for your review.

Reviewer 2 Report

Comments and Suggestions for Authors

Thank you for revising the manuscript.

Author Response

Comment 1: Thank you for revising the manuscript.

Response 1: Thank you for your review.

Reviewer 3 Report

Comments and Suggestions for Authors

Following careful consideration of the reviewer’s feedback, the authors have made the necessary revisions and improvements to the manuscript. I believe that the article now meets the required standards and can move forward in the publication procedure.

Author Response

Comment 1: Following careful consideration of the reviewer’s feedback, the authors have made the necessary revisions and improvements to the manuscript. I believe that the article now meets the required standards and can move forward in the publication procedure.

Response 1: Thank you for your review.

Reviewer 4 Report

Comments and Suggestions for Authors

The manuscript entitled: Truncation Artifact Reduction in Stationary Inverse-Geometry Digital Tomosynthesis Using Deep Convolutional Generative Adversarial Network, has effectively addressed all previously raised comments and suggestions. The authors have made the necessary revisions. Therefore, I recommend that the manuscript be accepted for publication in its current form.

Author Response

Comment 1: The manuscript entitled: Truncation Artifact Reduction in Stationary Inverse-Geometry Digital Tomosynthesis Using Deep Convolutional Generative Adversarial Network, has effectively addressed all previously raised comments and suggestions. The authors have made the necessary revisions. Therefore, I recommend that the manuscript be accepted for publication in its current form.

Response 1: Thank you for your review.
