A Study on a Directional Gradient-Based Defect Detection Method for Plate Heat Exchanger Sheets
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
1. GB/T 232-1988 is outdated, as newer versions like GB/T 232-2010 exist. Why is the newer standard not being referenced?
2. The text annotation in Figure 1 is incorrect, as the term "crack length" is inconsistent with the main text.
3. The comparative experiments for object detection algorithms based on neural networks are not comprehensive, such as the deep learning models like YOLO and Transformer mentioned in the introduction.
4. The author emphasizes that the gradient-based method for detecting PHE focuses on FDR and LDR. In fact, the model does effectively reduce missed detections, but there is no obvious advantage in false positives, and the false detection rates, missed detection rates, and testing times under conditions of limited computational resources and scarce data, etc., are not supported by evidence.
Comments on the Quality of English Language
1. Several sentences need grammatical refinement and contain unnecessarily complex phrasing.
2. Inconsistency in terminology within the context.
Author Response
comments 1: GB/T 232-1988 is outdated, as newer versions like GB/T 232-2010 exist. Why is the newer standard not being referenced?
Response 1:
Thank you for pointing this out. I agree with this comment.
Therefore I have:
【Revised in Introduction Section】
In the Introduction (Paragraph 2), added the following clarification:
("It should be noted that although GB/T 232-2024 supersedes the 1988 version, the quantitative definition of micro-cracks is explicitly documented only in GB/T 232-1988. Accordingly, this study adopts the 1988 edition as the normative reference.")
Although the latest standard, GB/T 232-2024, serves as the general specification for bend testing of metallic materials, the quantitative dimensional definition of micro-cracks is no longer included in its text or appendices. Upon verification, Appendix A of GB/T 232-1988 explicitly stipulates the quantitative dimensional criteria for micro-cracks.
comments 2: The text annotation in Figure 1 is incorrect, as the term "crack length" is inconsistent with the main text.
Response 2:
We sincerely appreciate the reviewer's meticulous observation.
The original Figure 1 inadvertently included the term "crack length", which does not align with the core focus of this study. As emphasized throughout the manuscript, defect width is the critical dimensional parameter for micro-crack characterization. To address this inconsistency, Figure 1 has been redesigned to emphasize the defect width in the conceptual diagram.
Therefore I have:
【Revised in Figure 1】 and 【Revised in 2. Micro-crack features (Paragraph 1)】
In 2. Micro-crack features (Paragraph 1), references to "length" have been removed:
("representing both the width and height of these defects." changed to "the red double arrow line represents the defect width.")
comments 3: The comparative experiments for object detection algorithms based on neural networks are not comprehensive, such as the deep learning models like YOLO and Transformer mentioned in the introduction.
Response 3:
Thank you for pointing this out. I agree with this comment.
I have added a comparative experiment and analysis with EfficientNet-B0 [30] and ShuffleNetV2 [31] in Section 4.4.1, and streamlined this section.
Therefore I have:
【Revised in 4.4.1 Section】
From the second paragraph to the second-to-last paragraph:
(In Section 2, this study introduced improvements to the traditional algorithm. To evaluate performance under data scarcity and CPU-only factory conditions, we conducted comparative experiments with deep learning baseline models, focusing on MDR, FDR, and per-image test time for 600×600 pixel images. Five methods were benchmarked: ResNet-50, DenseNet-121, EfficientNet-B0, ShuffleNetV2, and our proposed method (Table 2).
ResNet-50 exhibited a high MDR (46.64%) with a low FDR (12.42%), attributed to its large receptive field introducing excessive background noise that compromises fine-grained feature learning. DenseNet-121 showed a moderate MDR (20.52%) but a higher FDR (24.72%) due to feature redundancy from its deep architecture and pooling operations. EfficientNet-B0 achieved a reduced MDR (18.28%) but suffered from incomplete shallow feature retention from early-layer depth, while ShuffleNetV2's aggressive downsampling for speed optimization resulted in an MDR of 22.10% and an FDR of 26.17% by sacrificing the resolution critical for micro-crack defect detection.
Table 2. Detection results of different algorithms
Method | MDR | FDR | Time/s
Ours | 14.55% | 21.85% | 0.1402
ResNet-50 | 46.64% | 12.42% | 0.9562
DenseNet-121 | 20.52% | 24.72% | 1.6875
EfficientNet-B0 | 18.28% | 34.31% | 0.8221
ShuffleNetV2 | 22.01% | 26.17% | 0.1984)
comments 4: The author emphasizes that the gradient-based method for detecting PHE focuses on FDR and LDR. In fact, the model does effectively reduce missed detections, but there is no obvious advantage in false positives, and the false detection rates, missed detection rates, and testing times under conditions of limited computational resources and scarce data, etc., are not supported by evidence.
Response 4:
Thank you for pointing this out. I agree with this comment.
I have added an explanation of why the FDR of the proposed method is higher than that of ResNet-50.
【Revised in 4.4.1 Section】
In the final paragraph:
(The higher FDR of our method (21.85%) compared to ResNet-50 (12.42%) stems from a fundamental trade-off in feature utilization. ResNet-50's expansive receptive field discards critical shallow features, entirely losing the capacity to recognize subtle defect signatures, whereas our approach actively retains these features, some of which inherently resemble background noise. This deliberate retention enables detection of the most challenging micro-crack defects but inevitably introduces noise. Crucially, industrial safety standards prioritize minimizing MDR over FDR, as undetected defects directly threaten production safety, while false alarms merely require secondary verification.)
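The trade-off in the passage above can be illustrated with a short sketch. It assumes the common definitions MDR = FN / (TP + FN) and FDR = FP / (TP + FP); the responses do not state the manuscript's exact formulas, and the counts below are hypothetical values chosen only for illustration.

```python
def missed_detection_rate(tp, fn):
    """Assumed definition: share of real defects that were not detected."""
    return fn / (tp + fn)


def false_detection_rate(tp, fp):
    """Assumed definition: share of reported detections that are not real defects."""
    return fp / (tp + fp)


# Hypothetical counts, not taken from the manuscript.
detectors = {
    "keeps weak responses": {"tp": 90, "fn": 10, "fp": 30},
    "discards weak responses": {"tp": 60, "fn": 40, "fp": 10},
}

for name, c in detectors.items():
    mdr = missed_detection_rate(c["tp"], c["fn"])
    fdr = false_detection_rate(c["tp"], c["fp"])
    print(f"{name}: MDR={mdr:.2%}, FDR={fdr:.2%}")
```

Under these assumed definitions, a detector that keeps weak responses misses fewer defects at the cost of more false alarms, which is the pattern the response describes.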
The insufficient description of the resource-constrained and data-scarce experimental environment was an oversight. I have rectified this by explicitly restating these critical conditions at the outset of Section 4.4.1.
Therefore I have:
【Revised in 4.4.1 Section】
At the beginning of the paragraph:
(To substantiate claims regarding performance under constrained industrial conditions, all comparative experiments were conducted on an Intel i7-8570 CPU without GPU acceleration, explicitly simulating factory computational limitations. Furthermore, the SUT-B1 dataset—containing only 4,428 sample patches, each 24 × 24 pixels (Section 4.1)—represents a data-scarce scenario relative to deep learning standards. This experimental design directly validates our method’s operational efficacy in resource-limited, low-data environments.)
comments :
1. Several sentences need grammatical refinement and contain unnecessarily complex phrasing.
2. Inconsistency in terminology within the context.
Response:
Thank you for pointing this out. I agree with this comment.
Therefore I have:
- In academic writing, "micro-crack defects" (hyphenated) is the more standard and recommended form, so it is used throughout the text.
- The term "miss detection rate (MDR)" is now used consistently.
- The term "false detection rate (FDR)" is now used consistently.
- Replaced "target" with "defect" to avoid ambiguous references.
- Revised some subheadings and legends of the paper.
- Fixed grammar errors and modified some sentence structures.
Finally, thank you for your guidance.
Reviewer 2 Report
Comments and Suggestions for Authors
This manuscript presents a directional gradient-based micro-crack detection algorithm for PHE sheets, aiming to address the limitations of traditional methods and deep learning models in industrial scenarios with limited computing resources and scarce data. The topic is highly relevant to industrial quality inspection, and the proposed method demonstrates potential advantages in detecting micro-cracks with variable widths and asymmetric grayscale profiles. However, several issues need to be addressed to enhance the depth and clarity of the work.
1-The study effectively identifies key challenges in PHE sheet micro-crack detection, such as the inadequacy of traditional grayscale-based methods in handling asymmetric defects and the high miss rates of deep learning models for small-scale targets under resource constraints. Please state directly, in the last paragraph of the Introduction, how the proposed algorithm, which reframes detection from grayscale ridge edges to gradient-based double-ridge edges, addresses these issues.
2-Clear Technical Innovation: The integration of directional Sobel operators for gradient transformation and Gaussian line detection for extracting positive/negative ridge edges is a well-justified technical approach. Please clarify the innovation.
3-The goal of this paper is to find the defects in plate heat exchanger sheets, but there is still a gap between the specific indicators it presents and the current ones, especially regarding the indicator of FDR. Please explain the reasons or analyze the sources of errors. What are the product indicators like in actual factories? Please refer ABSTRACT: "Experimental results show that tested in the defective boards library and under simulated factory CPU conditions, this algorithm achieves a miss rate of 14.55%, a false detection rate of 21.85%, and an 600*600 pixel image detection time of 0.1402 seconds. " and Table 2.
4-There remain some areas requiring improvement, e.g. incorrect symbol (such as asterisk in "600*600 pixel"), Reference to the equations (Equation (7)), INTRODUCTION should be condensed.
Author Response
comments 1: The study effectively identifies key challenges in PHE sheet micro-crack detection, such as the inadequacy of traditional grayscale-based methods in handling asymmetric defects and the high miss rates of deep learning models for small-scale targets under resource constraints. Please state directly, in the last paragraph of the Introduction, how the proposed algorithm, which reframes detection from grayscale ridge edges to gradient-based double-ridge edges, addresses these issues.
Response 1:
Thank you for pointing this out. I agree with this comment.
I have highlighted the core solution proposed in this article for existing challenges such as grayscale asymmetry and high missed detection rates for small defects: reframing the defect in gradient space as a pair of ridge edges.
Therefore I have:
【Revised in Introduction Section】
In the Introduction (second-to-last paragraph), changed the text to the following:
(However, significant challenges remain in detecting micro-crack defects on PHE sheet surfaces with existing methods: grayscale-based approaches struggle with asymmetric grayscale profiles, while deep learning models suffer feature degradation for small-scale defects, yielding high miss rates under industrial computational constraints. To address these issues, this study establishes a theoretical foundation for parameter selection in variable-width defect detection. We propose a directional gradient-based algorithm that mathematically constrains the Gaussian template width to cover variable-width defects with a fixed σ, explicitly reframing the detection defect from ridge edges in grayscale images to centrally symmetric double-ridge edges in gradient images. By employing a directional Sobel operator to generate gradient images and utilizing Gaussian line detection to extract dual ridge edges, the method mitigates asymmetry impacts while enhancing accuracy for small-scale micro-crack defects under data and computational constraints.)
comments 2: Clear Technical Innovation: The integration of directional Sobel operators for gradient transformation and Gaussian line detection for extracting positive/negative ridge edges is a well-justified technical approach. Please clarify the innovation.
Response 2:
We sincerely appreciate the reviewer's meticulous observation.
The core innovation of this study is the theoretical basis for parameter selection in a known detection model, presented in Section 3.3.2. For the first time, the relationship between the Gaussian template width (σ) and variable-width linear defects has been proven theoretically, which supports covering variable-width defects with a fixed parameter in industrial inspection.
First, the convolution similarity principle shows that the similarity between the Gaussian template and the defect shape determines the strength of the convolution response. When the template width is equal to or greater than the defect width, the convolution preserves the central peak (effective detection); when the template width is less than the defect width, the response splits into two peaks (missed detection). Second, a single fixed value of σ can therefore cover all defects up to the maximum width, which removes the need in traditional methods to adjust σ repeatedly and improves practicality in industrial scenarios. Finally, this theory turns empirical parameter selection into a mathematical constraint, providing a new approach to detection.
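The single-peak versus split-peak behaviour described in this response can be reproduced numerically. The sketch below is an illustration under stated assumptions, not the manuscript's Eq. (11): it uses an idealized bar-shaped 1-D profile, a sampled second-derivative-of-Gaussian template, and hypothetical helper names. For such a bar of half-width w, the switch between the two regimes occurs near σ = w/√3.

```python
import numpy as np


def gaussian_second_derivative(sigma):
    """Sampled second derivative of a 1-D Gaussian, truncated at 4 sigma."""
    radius = int(np.ceil(4 * sigma))
    u = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-u**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return (u**2 / sigma**4 - 1 / sigma**2) * g


def ridge_response(half_width, sigma, length=201):
    """Negated second-derivative response of an ideal bright bar profile."""
    profile = np.zeros(length)
    center = length // 2
    profile[center - half_width:center + half_width + 1] = 1.0
    return -np.convolve(profile, gaussian_second_derivative(sigma), mode="same")


def count_strong_peaks(response, rel_threshold=0.5):
    """Count local maxima that exceed a fraction of the global maximum."""
    floor = rel_threshold * response.max()
    inner = response[1:-1]
    is_peak = (inner > response[:-2]) & (inner >= response[2:]) & (inner > floor)
    return int(is_peak.sum())


# A 21-pixel-wide bar (half-width 10), probed with a wide and a narrow template.
wide = ridge_response(half_width=10, sigma=8.0)
narrow = ridge_response(half_width=10, sigma=2.5)
print(count_strong_peaks(wide), count_strong_peaks(narrow))
```

With the wide template the response has one maximum at the bar centre; with the narrow template the maximum moves to the two bar edges and the centre is no longer detected.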
Therefore I have:
【Revised in Abstract】
(To address these issues, this study establishes a theoretical foundation for parameter selection in variable-width defect detection. We propose a directional gradient-based algorithm that mathematically constrains the Gaussian template width to cover variable-width defects with a fixed σ, reframing the detection defect from ridge edges to centrally symmetric double-ridge edges in gradient images.)
【Revised in 3.3.2 Section】
In Paragraph 5, added the following clarification:
This work bridges a critical gap in industrial defect detection by providing the first theoretical constraint for Gaussian parameter selection in variable-width linear defects. Based on the convolution similarity principle, the Gaussian template-to-defect similarity governs response intensity:
Template width ≥ defect width: preserved central peak, effective detection.
Template width < defect width: split dual peaks, missed detection.
To enable single-σ coverage of variable-width defects,
【Revised in 5.Conclusion】
In Paragraph 4:
(Finally, grayscale-based methods often experience accuracy degradation with variable, asymmetric defect widths. The core theoretical contribution is a mathematically constrained Gaussian width selection for variable-width defects, which enables single-parameter coverage previously unattainable in industrial settings. We replace empirical tuning with deterministic design.)
comments 3: The goal of this paper is to find the defects in plate heat exchanger sheets, but there is still a gap between the specific indicators it presents and the current ones, especially regarding the indicator of FDR. Please explain the reasons or analyze the sources of errors. What are the product indicators like in actual factories? Please refer ABSTRACT: "Experimental results show that tested in the defective boards library and under simulated factory CPU conditions, this algorithm achieves a miss rate of 14.55%, a false detection rate of 21.85%, and an 600*600 pixel image detection time of 0.1402 seconds. " and Table 2.
Response 3:
Thank you for pointing this out. I agree with this comment.
I have added the reasons why the FDR of the method in this article is greater than that of ResNet, and provided the testing focus of the factory.
Therefore I have:
【Revised in 4.4.1 Section】
In the final paragraph:
(The higher FDR of our method (21.85%) compared to ResNet-50 (12.42%) stems from a fundamental trade-off in feature utilization. ResNet-50's expansive receptive field discards critical shallow features, entirely losing the capacity to recognize subtle defect signatures, whereas our approach actively retains these features, some of which inherently resemble background noise. This deliberate retention enables detection of the most challenging micro-crack defects but inevitably introduces noise. Crucially, industrial safety standards prioritize minimizing MDR over FDR, as undetected defects directly threaten production safety, while false alarms merely require secondary verification.)
comments 4: There remain some areas requiring improvement, e.g. incorrect symbol (such as asterisk in "600*600 pixel"), Reference to the equations (Equation (7)), INTRODUCTION should be condensed.
Response 4:
Thank you for pointing this out. I agree with this comment.
Therefore I have corrected the incorrect symbol and condensed the Introduction.
【Revised in Abstract】
(600*600) changed to (600×600).
【Revised in Introduction】
Condensed the fourth paragraph through the third-to-last paragraph, from 977 words to 484 words:
(Effective micro-crack defect detection is critical for quality inspection per industrial and national standards. Current methods comprise non-machine vision and machine vision approaches, the latter subdivided into traditional image processing and neural network-based object detection.
Non-machine vision methods like ultrasound [5], eddy current [6], and magnetic memory detection [7] struggle with height variations on non-flat PHE surfaces due to working distance sensitivity. Machine vision overcomes this limitation.
Traditional machine vision methods include Gaussian line detection for optical fibers [8], OTSU-Canny lane detection [9], Gaussian-Hough track detection [10], and gray centroid-Gaussian convolution for tire bubbles [11]. However, these approaches assume fixed directionality or symmetric grayscale profiles, limiting their effectiveness for variable-width, asymmetric micro-crack defects. Image smoothing also risks filling crack valleys, hindering accurate edge extraction.
Defect detection based on local binary patterns (LBP) [12-15] is limited for micro-crack defects because their low grayscale values hinder local feature extraction. Directional, Fourier, Haar wavelet, and Gabor filters [16-19] also struggle: symmetric templates (except Haar) poorly handle asymmetric profiles as convolution strength diminishes with defect width, while Haar's multi-scale decomposition increases latency. Gaussian mixture and Markov field models [20-21] fail when defect-background contrast is lower than internal background variations. Zhang et al. [22] applied the non-directional Canny operator but reported false positives from background variations and provided no width-adaptive parameter selection.
Deep learning plays a significant role in industrial inspection. Neural network-based object detectors fall into three categories:
1) One-stage algorithms: Liu et al. combined Single Shot MultiBox Detector (SSD) with Residual Network-50 (ResNet-50) for steel defect detection [23], leveraging residual learning but facing complex parameter optimization. Some researchers adopt variants of the You Only Look Once (YOLO) series for object detection [24-28], despite the loss of shallow features caused by downsampling operations.
2) Two-stage algorithms: Hao et al. used Faster Region-Based Convolutional Neural Network (Faster R-CNN) with ResNet-50 [29], enabling multi-scale fusion but risking overfitting with limited data. Zhang et al. employed Mask R-CNN with EfficientNet [30], effective for large objects but missing small defects due to low-resolution features.
3) End-to-end Transformers: Xing et al. adapted Vision Transformer (ViT) with ShuffleNet-V2 [31], compensating for missing convolutional priors but losing fine details via self-attention. Zhu et al. combined deformable convolution with Detection Transformer (DETR) [32], improving key region focus but requiring large datasets unsuitable for industrial constraints.
Traditional defect detection methods maintain advantages over deep learning for PHE sheet inspection under industrial constraints. First, national standards prioritize minimal missed detections (allowing some false positives), whereas deep learning typically balances miss/false detection rates, potentially increasing misses. Second, traditional algorithms are interpretable, require no training data, and rely on physical models, ensuring reliability in data-scarce scenarios. Finally, they demand fewer computational resources, enabling efficient deployment in embedded systems.
Traditional grayscale-based algorithms offer advantages for PHE sheet defect detection but assume symmetrical defect profiles. Accuracy decreases with variable widths and asymmetry. Non-directional operators also increase false detections for directional defects.)
Finally, thank you for your guidance.
Reviewer 3 Report
Comments and Suggestions for Authors
Dear Authors,
Thank you for this interesting contribution.
A directional gradient and the Gaussian line detection-based algorithm proposed in the paper is associated with defects (micro-crack) detected on the plate-heat-exchanger (PHE) sheet that is shaped like a linear defect, possesses an asymmetric grayscale response, and has a width of between 7 to 21 pixels.
The procedures entail candidate region extraction at boundaries of plate protrusion, edge enhancement based on directional Sobel gradients, and sub-pixel accuracy Gaussian line detection achieved using well-chosen widths of templates and standard deviation with identification of defect features.
Real production image testing shows a high detection rate (~85.45%), a low miss rate (~14.55%), and a short processing time (0.14 seconds per 600×600 image) compared to the traditional grayscale and Canny edge methods, with fewer processing resources required than deep learning networks; the experimental validation was carried out in a resource-constrained industrial environment.
The method is a tradeoff between interpretability, robustness and efficiency, but the detection may be influenced by defect contrast, size, and intricate surface textures.
Some suggestions are:
- Become more descriptive and clear in the methods section especially in description of the calculation of the gradient image and detection of the Gaussian line steps to increase the reproducibility.
- Clarify the table of results by adding more comparative figures or quantitative results to easily prove the effectiveness of the offered algorithm.
- It would be good to do some work on the usage of English language to make reading and description more precise.
- Include a further discussion of limitation of such a proposed method, like difficulty in dealing with defects of different orientation or sensitivity to noise.
- Add more references to other related studies or alternative solutions to place the contribution in the context of the existing research more definitely.
- Enhance the quality and clarity of figures and tables by having consistent labeling, higher resolutions of the images and the clarity of their legends to facilitate understanding of readers.
- Comment on possible implementations or use in real-time systems, or embedded systems, and note the benefits in terms of computational efficiency.
- Explain the selection of directional gradient templates and their effects on accuracy of detection with further supporting experimental data.
I hope these comments may help improve the quality of the paper.
Kindest regards
Comments on the Quality of English Language
- Use better sentence construction to make it easier to read in complicated technical details to better understand discussion.
- Limit very long sentences which may be ambiguous as you divide the sentences into shorter and more concise statements.
- Apply the same terminology within the manuscript to avoid confusion especially when referring to some important concepts such as gradient image and defect edges.
- Increase language accuracy in terms of explaining steps of the algorithm so that every process is described unequivocally and clearly.
- Fix the little mistakes in grammar and poor sentence construction to enhance the flow and professionalism of the text on the whole.
- Simplify highly technical jargon where possible, or add short explanations to make the content understandable to a wider audience.
- Correct and edit figure captions and labels so that they are grammatically accurate and clear, helping readers understand the visual content.
Author Response
comments 1: Become more descriptive and clear in the methods section especially in description of the calculation of the gradient image and detection of the Gaussian line steps to increase the reproducibility.
Response 1:
Thank you for pointing this out. I agree with this comment.
To enhance methodological clarity, I will incorporate dedicated summary paragraphs at the end of both Section 3.2 and Section 3.3.
Therefore I have:
【Revised in 3.2 Section】
In the last paragraph of section 3.2 add the following clarification:
(The gradient conversion methodology starts from the observation that micro-crack defects exhibit asymmetric ridge-edge profiles in grayscale space. This enables defect detection through identification of centrally symmetric double-ridge structures in gradient space. To optimize this transformation, we systematically selected a vertical 5×5 Sobel operator. First, the orientation was set perpendicular to the predominant defect direction, yielding higher gradient contrast. Second, the kernel size was set to 5×5 to maximize feature preservation. Finally, Pascal-derived weights prioritize central pixels to enhance sensitivity to thin linear features while suppressing edge noise. This directional gradient approach thus converts grayscale images into enhanced representations where asymmetric defects manifest as detectable symmetric dual ridges.)
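The gradient conversion described above can be sketched as follows. The response does not list the kernel coefficients, so this example assumes the widely used 5×5 Sobel construction: the Pascal's-triangle row [1, 4, 6, 4, 1] for smoothing and [-1, -2, 0, 2, 1] for differentiation. The function names and the synthetic stripe image are hypothetical.

```python
import numpy as np

# Assumed coefficients (standard 5x5 Sobel), not quoted from the manuscript.
SMOOTH = np.array([1, 4, 6, 4, 1], dtype=float)   # Pascal's-triangle row
DERIV = np.array([-1, -2, 0, 2, 1], dtype=float)  # [1, 2, 1] convolved with [-1, 0, 1]


def vertical_sobel_5x5():
    """Kernel that differentiates along rows and smooths along columns."""
    return np.outer(DERIV, SMOOTH)


def directional_gradient(image, kernel):
    """Valid-mode cross-correlation of a 2-D image with the kernel."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


# Synthetic bright horizontal stripe standing in for a horizontal defect.
image = np.zeros((21, 21))
image[9:12, :] = 1.0
gradient = directional_gradient(image, vertical_sobel_5x5())

# One positive ridge and one negative ridge of equal magnitude, with a zero
# crossing on the stripe centre row (output row 8 is centred on input row 10).
print(gradient.max(), gradient.min(), gradient[8, 0])
```

A stripe running along the columns instead produces a zero response everywhere, which illustrates why the operator's orientation has to be chosen perpendicular to the expected defect direction.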
【Revised in 3.3 Section】
In the last paragraph of section 3.3 add the following clarification:
(Our directional gradient approach transforms defect detection from grayscale space to gradient space, where micro-crack defects manifest as centrally symmetric double-ridge edges (Figure 9(b)). To precisely localize these features, we implement a Gaussian line detection operator comprising three core computational phases. First, Hessian matrix construction via second-order Gaussian derivatives (Eqs. (1)–(6)) identifies ridge normal vectors through eigenvalue analysis. Second, sub-pixel localization is achieved through Taylor series expansion along the normal direction (Eqs. (7)–(9)), solving for extrema coordinates. Third, hysteresis thresholding and eight-neighborhood connectivity (Eq. (10)) generate continuous centerlines. Crucially, Gaussian width selection follows a convolution similarity principle: the template width must exceed the defect width to preserve central peaks (Figure 14). We set σ via Eq. (11) to cover the maximum half-width, ensuring single-parameter efficacy across variable defect dimensions.)
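The first two phases above (Hessian eigen-analysis and sub-pixel localization by Taylor expansion) follow the general pattern of Steger-style line detection and can be sketched as below. This is a minimal illustration, not the manuscript's Eqs. (1)–(9): the function names, the 4σ kernel truncation, and the synthetic test image are assumptions, and the hysteresis thresholding and linking phase is omitted.

```python
import numpy as np


def gaussian_kernels(sigma):
    """Sampled 1-D Gaussian and its first and second derivatives."""
    radius = int(np.ceil(4 * sigma))
    u = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-u**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return g, -u / sigma**2 * g, (u**2 / sigma**4 - 1 / sigma**2) * g


def separable(image, k_rows, k_cols):
    """Convolve every column with k_rows, then every row with k_cols."""
    tmp = np.apply_along_axis(np.convolve, 0, image, k_rows, mode="same")
    return np.apply_along_axis(np.convolve, 1, tmp, k_cols, mode="same")


def ridge_points(image, sigma):
    """Return sub-pixel offsets (rows, columns) and a mask of line pixels."""
    g, g1, g2 = gaussian_kernels(sigma)
    r_y, r_x = separable(image, g1, g), separable(image, g, g1)
    r_yy, r_xx = separable(image, g2, g), separable(image, g, g2)
    r_xy = separable(image, g1, g1)
    hessian = np.stack([np.stack([r_yy, r_xy], axis=-1),
                        np.stack([r_xy, r_xx], axis=-1)], axis=-2)
    eigvals, eigvecs = np.linalg.eigh(hessian)
    # The line normal is the eigenvector of the largest-magnitude eigenvalue.
    pick = np.abs(eigvals[..., 0]) >= np.abs(eigvals[..., 1])
    curvature = np.where(pick, eigvals[..., 0], eigvals[..., 1])
    normal = np.where(pick[..., None], eigvecs[..., :, 0], eigvecs[..., :, 1])
    n_y, n_x = normal[..., 0], normal[..., 1]
    # Second-order Taylor expansion along the normal: t = -(grad . n) / (n^T H n).
    t = np.zeros_like(curvature)
    valid = np.abs(curvature) > 1e-12
    np.divide(-(r_y * n_y + r_x * n_x), curvature, out=t, where=valid)
    off_y, off_x = t * n_y, t * n_x
    on_line = valid & (np.abs(off_y) <= 0.5) & (np.abs(off_x) <= 0.5)
    return off_y, off_x, on_line


# Synthetic horizontal ridge whose true centre lies at row 20.3.
rows = np.arange(41, dtype=float)[:, None]
image = np.exp(-(rows - 20.3) ** 2 / (2 * 1.5**2)) * np.ones((1, 41))
off_y, off_x, on_line = ridge_points(image, sigma=2.0)
print(20 + off_y[20, 20], on_line[20, 20])
```

The offset test alone also fires in nearly flat regions where the curvature is tiny, which is why a response threshold such as the hysteresis step in the third phase is still needed in practice.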
comments 2: Clarify the table of results by adding more comparative figures or quantitative results to easily prove the effectiveness of the offered algorithm.
Response 2:
Thank you for pointing this out. I agree with this comment.
I have added a comparative experiment and analysis with EfficientNet-B0 [30] and ShuffleNetV2 [31] in Section 4.4.1, and streamlined this section.
Therefore I have:
【Revised in 4.4.1 Section】
From the second paragraph to the second-to-last paragraph:
(In Section 2, this study introduced improvements to the traditional algorithm. To evaluate performance under data scarcity and CPU-only factory conditions, we conducted comparative experiments with deep learning baseline models, focusing on MDR, FDR, and per-image test time for 600×600 pixel images. Five methods were benchmarked: ResNet-50, DenseNet-121, EfficientNet-B0, ShuffleNetV2, and our proposed method (Table 2).
ResNet-50 exhibited a high MDR (46.64%) with a low FDR (12.42%), attributed to its large receptive field introducing excessive background noise that compromises fine-grained feature learning. DenseNet-121 showed a moderate MDR (20.52%) but a higher FDR (24.72%) due to feature redundancy from its deep architecture and pooling operations. EfficientNet-B0 achieved a reduced MDR (18.28%) but suffered from incomplete shallow feature retention from early-layer depth, while ShuffleNetV2's aggressive downsampling for speed optimization resulted in an MDR of 22.10% and an FDR of 26.17% by sacrificing the resolution critical for micro-crack defect detection.
Table 2. Detection results of different algorithms
Method | MDR | FDR | Time/s
Ours | 14.55% | 21.85% | 0.1402
ResNet-50 | 46.64% | 12.42% | 0.9562
DenseNet-121 | 20.52% | 24.72% | 1.6875
EfficientNet-B0 | 18.28% | 34.31% | 0.8221
ShuffleNetV2 | 22.01% | 26.17% | 0.1984)
comments 3: It would be good to do some work on the usage of English language to make reading and description more precise.
Response 3:
Thank you for pointing this out. I agree with this comment.
Therefore I have:
- In academic writing, "micro-crack defects" (hyphenated) is the more standard and recommended form, so it is used throughout the text.
- The term "miss detection rate (MDR)" is now used consistently.
- The term "false detection rate (FDR)" is now used consistently.
- Replaced "target" with "defect" to avoid ambiguous references.
- Revised some subheadings and legends of the paper.
- Fixed grammar errors and modified some sentence structures.
comments 4: Include a further discussion of limitation of such a proposed method, like difficulty in dealing with defects of different orientation or sensitivity to noise.
Response 4:
Thank you for pointing this out. I agree with this comment.
I have added a new Section 4.4.5 describing the limitations of this method.
Therefore I have:
【Revised in 4.4.5 section】
(4.4.5 Limitations of the Proposed Method
While the proposed method demonstrates strong performance for horizontal micro-crack defects under industrial constraints, two key limitations warrant discussion. First, the algorithm's dependence on directional gradient operators inherently prioritizes defects aligned with the Sobel template's orientation. Vertical gradients maximize sensitivity for horizontal defects, but cracks deviating significantly from this orientation may exhibit reduced contrast in gradient space, increasing miss rates. Second, the Gaussian line detection stage remains sensitive to high-frequency noise in gradient images. Textured background regions with grayscale transitions resembling asymmetric ridges can trigger false positives, contributing to the observed 21.85% FDR. Future work will address these limitations through multi-orientation templates and noise-robust gradient computation techniques.)
comments 5: Add more references to other related studies or alternative solutions to place the contribution in the context of the existing research more definitely.
Response 5:
Thank you for pointing this out. I agree with this comment.
I have added four YOLO-series object detection methods to the Introduction and the reference list.
Therefore I have:
【Revised in Introduction Section】
(One-stage algorithms: Liu et al. combined Single Shot MultiBox Detector (SSD) with Residual Network-50 (ResNet-50) for steel defect detection [23], leveraging residual learning but facing complex parameter optimization. Some researchers adopt variants of the You Only Look Once (YOLO) series for object detection [24-28], despite the loss of shallow features caused by downsampling operations.)
【Revised in reference Section】
([25] Liu B, Jiang W. LA-YOLO: Bidirectional Adaptive Feature Fusion Approach for Small Object Detection of Insulator Self-explosion Defects[J]. IEEE Transactions on Power Delivery, 2024.
[26] Zhang Y, Zhang H, Huang Q, et al. DsP-YOLO: An anchor-free network with DsPAN for small object detection of multiscale defects[J]. Expert Systems with Applications, 2024, 241: 122669.
[27] Wang W, Meng Y, Li S, et al. HV-YOLOv8 by HDPconv: Better lightweight detectors for small object detection[J]. Image and Vision Computing, 2024, 147: 105052.
[28] Liu Q, Lv J, Zhang C. MAE-YOLOv8-based small object detection of green crisp plum in real complex orchard environments[J]. Computers and Electronics in Agriculture, 2024, 226: 109458.)
comments 6: Enhance the quality and clarity of figures and tables by having consistent labeling, higher resolutions of the images and the clarity of their legends to facilitate understanding of readers.
Response 6:
Thank you for pointing this out. I agree with this comment.
Therefore I have:
【Revised in Figure】
I have enhanced the resolution of the grayscale profile and gradient profile figures, namely Figure 1, Figure 4, Figure 9(a), Figure 9(b), Figure 11(a), Figure 11(b), Figure 12, Figure 15, Figure 21(b), Figure 22(b), and Figure 24.
comments 7: Comment on possible implementations or use in real-time systems, or embedded systems, and note the benefits in terms of computational efficiency.
Response 7:
Thank you for pointing this out. I agree with this comment.
I have added a new Section 4.4.6 describing the advantages in terms of computational requirements, processing speed, and memory usage.
Therefore I have:
【Revised in 4.4.6 Section】
(4.4.6 Real-Time Deployment Potential
The proposed algorithm demonstrates significant potential for real-time implementation in industrial production lines and for deployment on resource-constrained embedded systems. Its CPU-only operation eliminates the need for expensive, power-intensive GPUs, which is crucial in cost-sensitive environments. As shown in Table 2, processing is fast: 0.1402 seconds per 600×600 image, or ~7.1 FPS on a factory-grade CPU. Furthermore, because the algorithm relies on deterministic image processing steps with fixed kernels and avoids large model parameters, its memory footprint is very low, which suits embedded systems with limited RAM. Consequently, this combination of CPU-only execution, sub-second processing time, and minimal memory requirements makes the directional gradient-based approach a computationally efficient and practical solution for integrating micro-crack defect detection directly into production equipment or embedded vision systems.)
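As a quick check of the throughput figure quoted in Section 4.4.6, the following sketch converts the reported per-image time into frames per second and into a per-minute image budget. Only the 0.1402 s value comes from the manuscript; the function names and the per-minute budget are illustrative additions.

```python
SECONDS_PER_IMAGE = 0.1402  # reported time for one 600x600 image


def frames_per_second(seconds_per_image):
    """Throughput implied by a fixed per-image processing time."""
    return 1.0 / seconds_per_image


def max_images_per_minute(seconds_per_image):
    """Whole images that can be fully processed within 60 seconds."""
    return int(60.0 // seconds_per_image)


fps = frames_per_second(SECONDS_PER_IMAGE)
per_minute = max_images_per_minute(SECONDS_PER_IMAGE)
print(f"{fps:.2f} FPS, {per_minute} images per minute")
```

This gives about 7.13 FPS, consistent with the ~7.1 FPS stated above, and 427 fully processed images per minute on a single CPU thread of comparable speed.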
comments 8: Explain the selection of directional gradient templates and their effects on accuracy of detection with further supporting experimental data.
Response 8:
Thank you for pointing this out. I agree with this comment.
I have added and analyzed ablation experiments in Section 4.4.4 that use the vertical Sobel, horizontal Sobel, vertical Prewitt, and non-directional Laplacian operators as gradient templates.
【Revised in 4.4.4 Section】
(4.4.4 Ablation Study on Directional Gradient Templates
An ablation study comparing four 5×5 gradient operators (vertical Sobel, horizontal Sobel, vertical Prewitt, non-directional Laplacian) on the SUT-B1 dataset revealed significant performance differences in micro-crack defect detection. The results are shown in Table 3. The proposed vertical Sobel operator achieved the best accuracy (14.55% MDR, 21.85% FDR), while the horizontal Sobel variant exhibited a substantially higher miss rate (34.70% MDR) due to orthogonal misalignment with transverse crack features. Although it shares the vertical orientation, the Prewitt operator demonstrated reduced sensitivity (22.01% MDR, 23.86% FDR), suggesting that its uniform kernel provides inferior noise suppression in textured regions compared to the Sobel's center-weighted coefficients. The isotropic Laplacian yielded the poorest performance (47.76% MDR, 33.82% FDR), generating spurious responses in gradual transition zones that fragmented defect signatures. These results confirm that effective micro-crack defect detection requires both directional alignment with defect geometry and optimized weighting characteristics, conditions that, among the operators tested, only the vertical Sobel operator satisfies.)
Table 3. Gradient operator performance comparison
Operator | MDR | FDR | Time/s
Vertical Sobel | 14.55% | 21.85% | 0.1402
Horizontal Sobel | 34.70% | 26.58% | 0.1411
Vertical Prewitt | 22.01% | 23.86% | 0.1723
Laplacian | 47.76% | 33.82% | 0.1685
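To make the gaps in Table 3 easier to read, the short sketch below expresses each alternative operator relative to the vertical Sobel baseline. The numbers are copied from Table 3. The helper name and the choice of reporting percentage-point differences and a time ratio are illustrative assumptions, not part of the manuscript.

```python
# (MDR %, FDR %, seconds per image) copied from Table 3.
TABLE_3 = {
    "Vertical Sobel": (14.55, 21.85, 0.1402),
    "Horizontal Sobel": (34.70, 26.58, 0.1411),
    "Vertical Prewitt": (22.01, 23.86, 0.1723),
    "Laplacian": (47.76, 33.82, 0.1685),
}
BASELINE = "Vertical Sobel"


def deltas_vs_baseline(table, baseline):
    """Return (MDR gap, FDR gap, time ratio) for every non-baseline operator.

    Gaps are in percentage points; the time ratio is operator / baseline.
    """
    base_mdr, base_fdr, base_time = table[baseline]
    result = {}
    for name, (mdr, fdr, seconds) in table.items():
        if name == baseline:
            continue
        result[name] = (
            round(mdr - base_mdr, 2),
            round(fdr - base_fdr, 2),
            round(seconds / base_time, 2),
        )
    return result


deltas = deltas_vs_baseline(TABLE_3, BASELINE)
for name, (mdr_gap, fdr_gap, time_ratio) in deltas.items():
    print(f"{name}: MDR +{mdr_gap} pp, FDR +{fdr_gap} pp, time x{time_ratio}")
```

Read this way, switching to the horizontal Sobel template costs 20.15 percentage points of MDR at essentially the same runtime, whereas the Prewitt and Laplacian templates are both less accurate and roughly 1.2 times slower.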
Response 3 describes modifications related to the English language.
Finally, thank you for your guidance.