Confidence-Aware Ship Classification Using Contour Features in SAR Images
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The manuscript should be much improved.
Instead of describing the method in words, you must explain it with equations, figures, or tables, as expected of a technical paper. In particular, the important new features and the procedures should be clearly described. The current version is not ready for technical evaluation.
I would like to review the manuscript again after you modified it.
Comments on the Quality of English Language
No comment for English.
Author Response
Comment 1:
The manuscript should be much improved.
Instead of describing the method in words, you must explain it with equations, figures, or tables, as expected of a technical paper. In particular, the important new features and the procedures should be clearly described. The current version is not ready for technical evaluation.
I would like to review the manuscript again after you modified it.
Response 1:
Thank you for your comment. As recommended, equations have been added to the following:
- 3.2 Proposed Features (equations 1-19)
- 3.3 Entropy-based Ensembling (equations 21-30)
- 3.4 Entropy-based Confidence Levels (equations 31-37)
While we agree that the added equations have enhanced the quality of the paper, the technical contributions of the sections in question were already evident in the previous version. Moreover, the remainder of the manuscript is clearly structured, so this should not have hindered a technical review.
Reviewer 2 Report
Comments and Suggestions for Authors
The manuscript focuses on the classification of ships in SAR images. I have the following suggestions for improvement:
- The innovation of this manuscript is relatively weak, as the techniques involved appear to be a combination of existing methods rather than a novel contribution. The authors should clarify the specific aspects of their approach that distinguish it from prior work and demonstrate how these components contribute to improving the classification performance in SAR ship detection.
- The review of the research background includes references and methods that are outdated. It is important to cite more recent literature and incorporate state-of-the-art techniques in SAR image classification to provide a stronger foundation for the study. This will help contextualize the current research in relation to more advanced methods and justify the need for the proposed approach.
- The "Related Works" section should not be embedded within the introduction. It is more appropriate to have it as a separate section following the introduction. This structure will improve the clarity and organization of the manuscript, allowing the introduction to focus solely on presenting the problem and motivating the study, while the related works can provide a dedicated discussion of existing methods.
Author Response
Comment 1:
The innovation of this manuscript is relatively weak, as the techniques involved appear to be a combination of existing methods rather than a novel contribution. The authors should clarify the specific aspects of their approach that distinguish it from prior work and demonstrate how these components contribute to improving the classification performance in SAR ship detection.
Response 1:
Thank you for your comments. The manuscript contains several novel contributions, centred around two main concepts:
- Contour features
The classification of ships in SAR images using contours was presented only once in the literature [25]. However, the approach in that study differs significantly from ours (lines 96-107):
- A 3D model had to be constructed and projected for each ship type to match the SAR image. This process requires substantial effort for the inclusion of any new ship class.
- The study was performed on only four test samples and employed images with a very-high spatial resolution of 0.6 m.
- The contours were solely based on geometric aspects and did not include any radiometric information.
In contrast, the features proposed in our study generalise to additional ship classes without requiring any extra work to accommodate them. Moreover, the proposed features are rigorously evaluated on four variations of established datasets (OpenSARShip GRD, OpenSARShip SLC, FUSAR-Ship 3 categories, and FUSAR-Ship 5 categories), where the average number of test samples per class reaches the hundreds, and they are shown to be effective across a wide range of spatial resolutions (1 m to 23 m). Tables 4 and 5 show that the features improve the classification performance compared to other handcrafted features, as well as a VGG19 convolutional neural network.
- Entropy
Although the concept of information entropy is well-established in the literature, our study represents, to the best of our knowledge, the first instance where it is applied to combine an ensemble of models. The resultant ensemble model is shown to improve the classification performance when compared to the individual models in Table 6.
Furthermore, the manuscript introduces the novel concept of categorising classification predictions into three distinct confidence levels using entropy, achieved without the need for additional data. This approach is shown to improve the classification performance in Table 7.
To clearly highlight the novelty of our methods, section 1.1 Contributions has been rewritten to emphasise the information mentioned above.
[25] Zhu, J.; Qiu, X.; Pan, Z.; Zhang, Y.; Lei, B. Projection Shape Template-Based Ship Target Recognition in TerraSAR-X Images. IEEE Geoscience and Remote Sensing Letters 2017, 14, 222–226. https://doi.org/10.1109/LGRS.2016.2635699.
Comment 2:
The review of the research background includes references and methods that are outdated. It is important to cite more recent literature and incorporate state-of-the-art techniques in SAR image classification to provide a stronger foundation for the study. This will help contextualize the current research in relation to more advanced methods and justify the need for the proposed approach.
Response 2:
The majority of studies discussed in Section 2.1 (Handcrafted Features) are relatively dated (2013–2018). However, this is not indicative of an outdated literature review process. Rather, it reflects the shift in SAR image classification research towards deep learning methodologies in recent years. Consequently, there has been a noticeable decline in recent publications focusing on handcrafted features, a problem that the paper highlights and addresses (mentioned in lines 34-41 and 73-76).
Recent studies on SAR ship classification are predominantly focused on deep learning, making an extensive review of such works irrelevant to the scope of this paper. There are, however, two notable exceptions [8, 11], which integrate handcrafted features with deep learning and are discussed in our paper (lines 48-55).
[8] Zhang, T.; Zhang, X.; Ke, X.; Liu, C.; Xu, X.; Zhan, X.; Wang, C.; Ahmad, I.; Zhou, Y.; Pan, D.; et al. HOG-ShipCLSNet: A Novel Deep Learning Network With HOG Feature Fusion for SAR Ship Classification. IEEE Transactions on Geoscience and Remote Sensing 2022, 60, 1–22. https://doi.org/10.1109/TGRS.2021.3082759.
[11] Zhang, T.; Zhang, X. Injection of Traditional Hand-Crafted Features into Modern CNN-Based Models for SAR Ship Classification: What, Why, Where, and How. Remote Sensing 2021, 13. https://doi.org/10.3390/rs13112091.
Comment 3:
The "Related Works" section should not be embedded within the introduction. It is more appropriate to have it as a separate section following the introduction. This structure will improve the clarity and organization of the manuscript, allowing the introduction to focus solely on presenting the problem and motivating the study, while the related works can provide a dedicated discussion of existing methods.
Response 3:
Thanks for this comment. As suggested, 2. Related Works is now set as a separate section.
Reviewer 3 Report
Comments and Suggestions for Authors
To address the issues of ship classification in SAR images, the manuscript proposed 13 handcrafted features derived from the contours of ships as representations to classify the ship types. To perform the classification process, a novel confidence-aware scheme based on information entropy was proposed. Extensive experiments demonstrated the method's effectiveness. However, the presentation and organization of this manuscript are somewhat weak, so readers may not easily catch the main ideas. Therefore, this manuscript should undergo a major revision. My concerns and suggestions are given as follows.
1. The Introduction Section should be extended. The Related Works might be presented in an independent Section instead of being a subsection of Introduction. Moreover, some segmentation methods for the contour extraction of SAR ship targets should be added.
2. For the Methodology section, an overall framework of the proposed method should be firstly provided for easy understanding.
3. Some formulas and diagrams should be added to clarify the extraction process of the 13 contour features. The description of the datasets used should be moved to the experiments section.
4. Some formulas and pseudocode for the watershed segmentation algorithm should be added, because readers may not be able to figure out the algorithm principle from the text alone.
5. The description of the U-Net should be reorganized. The train-test split methods for the datasets and the experimental environment should be moved to the experiments section.
6. The Classification Process and Entropy Analysis should be reorganized. More attention should be paid to the Entropy Analysis method. It would be clearer to introduce the process with a schematic diagram.
7. The experiments section should be reorganized, for example by splitting it into more subsections. Some more segmentation methods should be added for comparison.
8. For the experimental analysis, the quantitative metrics should be as more as exact.
Author Response
Comment 1:
The Introduction Section should be extended. The Related Works might be presented in an independent Section instead of being a subsection of Introduction. Moreover, some segmentation methods for the contour extraction of SAR ship targets should be added.
Response 1:
Thank you for your comments. As suggested, 2. Related Works is now set as a separate section.
A discussion on segmentation methods, however, has not been added to the revised version of the paper. It is acknowledged that segmentation is an important preprocessing step in the contour extraction process. However, it should be emphasised that the focus of the manuscript is on classification, not segmentation. Incorporating a review of such methods would extend the scope of the paper unnecessarily and divert attention from the primary focus.
Comment 2:
For the Methodology section, an overall framework of the proposed method should be firstly provided for easy understanding.
Response 2:
Thanks very much for this suggestion. A flowchart of the overall proposed methodology has been added (Figure 1).
Comment 3:
Some formulas and diagrams should be added to clarify the extraction process of the 13 contour features. The description of the datasets used should be moved to the experiments section.
Response 3:
As recommended, equations have been added to the following:
- 3.2 Proposed Features (equations 1-19)
Also, a new section 4. Experimental Setup has now been added, with the dataset description moved there (subsection 4.1).
Comment 4:
Some formulas and pseudocode for the watershed segmentation algorithm should be added, because readers may not be able to figure out the algorithm principle from the text alone.
Response 4:
Pseudocode for the watershed algorithm has been added on page 5. However, the authors believe that this addition could be redundant given the text in Section 3.1.1 and the included references. If the reviewer finds the pseudocode valuable in enhancing the paper, it will be retained; otherwise, the authors suggest its removal based on the provided rationale.
Comment 5:
The description of the U-Net should be reorganized. The train-test split methods for the datasets and the experimental environment should be moved to the experiments section.
Response 5:
The train-test split of U-Net and the experimental environment have been moved to 4. Experimental Setup.
Comment 6:
The Classification Process and Entropy Analysis should be reorganized. More attention should be paid to the Entropy Analysis method. It would be clearer to introduce the process with a schematic diagram.
Response 6:
4.4 Classification Process is now placed in 4. Experimental Setup for added clarity.
The previous Entropy Analysis section has been split into two: 3.3 Entropy-based Ensembling and 3.4 Entropy-based Confidence Levels, with equations added in each to supplement the description (equations 21-37).
The flowchart added per the suggestion in Comment 2 (Figure 1) effectively summarises the process.
Comment 7:
The experiments section should be reorganized, for example by splitting it into more subsections. Some more segmentation methods should be added for comparison.
Response 7:
As suggested in Comment 3, a new section called 4. Experimental Setup has been added with five subsections: 4.1 Datasets, 4.2 Implementation Details, 4.3 Contour Extraction Process, 4.4 Classification Process, and 4.5 Entropy Analysis.
Additionally, the entropy results in section 5. Results and Discussion have been split into two subsections: 5.3 Entropy-based Ensembling Results and 5.4 Confidence-Aware Classification Results.
Regarding the suggestion to include additional segmentation methods, it is important to reiterate that the focus of this paper is on classification rather than segmentation. The employed segmentation methods are well-established and represent two distinct domains (statistical-based and deep learning-based). Including more segmentation methods would expand the paper unnecessarily without presenting any new insights, which is why the authors preferred to avoid additional segmentation methods in the original paper.
Comment 8:
For the experimental analysis, the quantitative metrics should be as more as exact.
Response 8:
We are not entirely sure of the meaning of this comment. Does it refer to the rounding of the quantitative metrics?
If so, the current rounding sufficiently highlights any differences between models/methods. Adding an extra decimal place, especially in the presence of large tables of numerical results, would likely lead to a more cluttered view without providing any added benefit.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The manuscript has been much improved. I have no further questions.
Author Response
Comment 1:
The manuscript has been much improved. I have no further questions.
Response 1:
Thank you for your comment.
Reviewer 2 Report
Comments and Suggestions for Authors
This revised version demonstrates significant improvement compared to the initial submission, and I appreciate the effort the authors have made. However, I still have the following suggestions for further refinement:
The paper mentions that contour features are extracted separately using the watershed algorithm and U-Net. However, it is not clear how these two types of features are combined and utilized to form the final contour features. A detailed explanation of the integration process would enhance the clarity of this approach.
From Table 3, it is evident that the contour extraction performance of the watershed algorithm is suboptimal in most cases. Did the authors account for this limitation when fusing the contour features? For instance, was any weighting mechanism introduced to reduce the influence of less accurate contour predictions? Addressing this point would strengthen the robustness of the proposed method.
The formation of 13 types of contour features is an interesting aspect of this study. I strongly recommend that the authors include visualizations of these features to provide readers with an intuitive understanding of their nature and significance.
After integration, three different confidence levels are obtained. However, it is not explicitly stated in the paper how these confidence levels are subsequently applied to improve classification accuracy. A more detailed discussion or diagram explaining their utilization would greatly enhance the comprehensiveness of the methodology section.
Overall, while the paper has made substantial progress, addressing these points would further solidify its contribution and improve its overall quality.
Author Response
Comment 1:
The paper mentions that contour features are extracted separately using the watershed algorithm and U-Net. However, it is not clear how these two types of features are combined and utilized to form the final contour features. A detailed explanation of the integration process would enhance the clarity of this approach.
Response 1:
Thank you for your comment.
The contour features are indeed extracted separately using the watershed algorithm and U-Net. However, the features derived from the contours of each method are not combined. As indicated in lines 4–5 and lines 182–186, as well as the results presented in Tables 4 and 5, the purpose of employing two distinct segmentation algorithms is to evaluate the robustness of the proposed 13 features with respect to variations in the contour extraction process. By evaluating features derived from each method independently, the study ensures that the proposed features are not reliant on the characteristics of a single extraction method.
Comment 2:
From Table 3, it is evident that the contour extraction performance of the watershed algorithm is suboptimal in most cases. Did the authors account for this limitation when fusing the contour features? For instance, was any weighting mechanism introduced to reduce the influence of less accurate contour predictions? Addressing this point would strengthen the robustness of the proposed method.
Response 2:
While watershed segmentation demonstrates suboptimal performance on the HRSID dataset, its performance is comparable to U-Net on the FUSAR-Ship dataset and exceeds it on the OpenSARShip GRD dataset. This is evident from the results in Tables 4 and 5, where watershed-derived contours outperform U-Net for the OpenSARShip GRD dataset and achieve accuracy within a 5% margin in the remaining datasets.
As discussed in lines 558–559, the superior performance of the watershed algorithm on the OpenSARShip GRD dataset can be attributed to the difference in spatial resolution relative to the U-Net training samples. Furthermore, as stated in lines 577–580, U-Net’s performance is likely to surpass watershed segmentation with sufficient training data. However, in scenarios with limited data, such as this study, watershed segmentation offers a viable and competitive alternative without the need for extensive training.
The purpose of entropy-weighted averaging is precisely to address such variability in the accuracy of contour predictions. By weighting the contributions of classifiers based on their certainty, the method reduces the influence of less accurate predictions, ensuring that the final classification is driven by the most reliable contours. It is important to note that the weighting is applied to classifiers trained on contours derived from the same segmentation method (whether watershed or U-Net) but extracted using different intensity capping percentiles (detailed in line 531). This ensures that the comparison remains consistent within each segmentation method, without mixing results from watershed and U-Net.
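For illustration within this response, a minimal sketch of such an entropy-weighted combination is given below; the function name, the inverse-entropy weighting rule, and the array shapes are assumptions made for this example, while the paper's exact formulation is given by its Equations 21-30.

```python
import numpy as np

def entropy_weighted_average(prob_matrix, eps=1e-12):
    """Combine class-probability vectors from K classifiers for one sample.

    prob_matrix has shape (K, C): one row per classifier (i.e. per contour
    variant), one column per ship class. Classifiers whose predictions have
    lower entropy (higher certainty) receive larger weights.
    """
    entropy = -np.sum(prob_matrix * np.log(prob_matrix + eps), axis=1)  # shape (K,)
    certainty = np.log(prob_matrix.shape[1]) - entropy  # distance from maximum entropy
    if certainty.sum() < eps:                            # all classifiers maximally uncertain
        weights = np.full(certainty.shape, 1.0 / len(certainty))
    else:
        weights = certainty / certainty.sum()            # normalise weights to sum to 1
    return weights @ prob_matrix                         # fused probabilities, shape (C,)
```

In such a scheme, a classifier whose contour is poorly extracted typically produces a flatter probability vector (higher entropy) and therefore contributes less to the fused prediction.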
Comment 3:
The formation of 13 types of contour features is an interesting aspect of this study. I strongly recommend that the authors include visualizations of these features to provide readers with an intuitive understanding of their nature and significance.
Response 3:
Thank you for your comment.
A new figure (Figure 4) has been added that showcases the bending energy (f3), concave points and depths (f4-f7), and perpendicular distances to the PCA (f8-f10). The remaining features were not included in the visualisation due to their simplicity and to keep the paper length manageable.
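To complement Figure 4 for readers of this response, a brief illustrative sketch of two of the visualised quantities is given below. The finite-difference curvature estimate and the SVD-based principal axis are generic assumptions for this example rather than the paper's exact definitions; summary statistics of the perpendicular distances would correspond to features such as f8-f10.

```python
import numpy as np

def bending_energy(contour):
    """Approximate bending energy (f3) as the mean squared curvature of the contour.

    contour is an (N, 2) array of (x, y) boundary points; curvature is
    estimated with finite differences, an assumption for illustration only.
    """
    x = contour[:, 0].astype(float)
    y = contour[:, 1].astype(float)
    dx, dy = np.gradient(x), np.gradient(y)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    curvature = (dx * ddy - dy * ddx) / np.maximum((dx**2 + dy**2) ** 1.5, 1e-12)
    return float(np.mean(curvature**2))

def pca_perpendicular_distances(contour):
    """Distances from each contour point to the contour's principal (major) axis."""
    centred = contour - contour.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)  # vt[0] = major-axis direction
    normal = np.array([-vt[0, 1], vt[0, 0]])                # unit vector orthogonal to the axis
    return np.abs(centred @ normal)
```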
Comment 4:
After integration, three different confidence levels are obtained. However, it is not explicitly stated in the paper how these confidence levels are subsequently applied to improve classification accuracy. A more detailed discussion or diagram explaining their utilization would greatly enhance the comprehensiveness of the methodology section.
Response 4:
Thank you for your comment.
The primary purpose of introducing confidence levels is not to directly improve overall classification accuracy but to quantify the uncertainty associated with predictions. Without these confidence levels, all 725 test samples are treated equally. By introducing confidence bands, the model provides an additional layer of interpretability, allowing predictions to be assessed based on their reliability.
This has significant practical value, particularly in real-world applications. High confidence predictions can be trusted and acted upon immediately, while low confidence results can prompt further investigation or manual review. This approach ensures a more informed decision-making process, especially in critical scenarios where classification errors may have serious implications.
Additionally, as presented in Figure 12 and stated in lines 777–781, analysing confidence levels at the class level provides insights that can guide targeted improvements in classification. For instance, the bulk carrier class predominantly has low confidence predictions, which suggests that the current training samples may not sufficiently represent the variability within this class or that these samples are noisier (discussed in lines 771-775). Hence, using the confidence levels, the classes that require additional representative samples or noise reduction can be identified and prioritised for improvement.
Reviewer 3 Report
Comments and Suggestions for Authors
The revised manuscript has improved to some extent compared with the initial version. However, there are still some issues that need to be answered and carefully addressed. Additionally, the newly added text and modifications should be highlighted or shown in color for convenient review. The remaining problems are given as follows.
1. For the pseudocode of the watershed algorithm, if the authors believe that this addition could be redundant given the text, they might make an effort to simplify the text rather than presenting it directly. Additionally, in the next revision, the authors can replace the pseudocode with a flowchart of the watershed algorithm, where the intermediate results on SAR ship images for each step can be presented.
2. In the Entropy-based Ensembling Section, more attention should be paid to the formula of the entropy. It would be better to give the entropy computation in terms of the multiple contours under the scope of the manuscript.
3. The formula for the entropy-weighted averaging only integrates the predictions of different classifiers. Can the authors give more details and explanations of the role of the different segmentation contours in the ensemble classification?
4. Does the hypothesis of the central limit theorem for the mean entropy hold exactly? As shown in Figure 8, the normal distribution for the mean entropy is not so apparent. The authors should give more explanations.
5. There is some abuse of symbols in the revised manuscript, for example the K in Equations (21) and (23).
6. The authors should carefully consider the necessity of the Entropy-based Confidence Levels being presented as a main part of the overall algorithm. As the experiments show, this method is only used to facilitate the analysis of the classification results, so it may be better to move it to the experimental setup section. Moreover, can the authors give more discussion of the separation of confidence levels, for example using two standard deviations? This may make a difference to the experimental result for the SVM classifier, where the Middle confidence level's F1 score exceeds that of the High confidence level.
7. The subsections of 4.3, 4.4 and 4.5 should better be rearranged as 4.2.1, 4.2.2 and 4.2.3, respectively.
8. In subsection 5.4, what do the "five models" refer to? Please give more explanation.
9. For the confidence levels’ analysis, can the authors give some correct and incorrect classified ship images corresponding to different confidence levels for intuitive comprehension?
10. If possible, the authors should add some comparison experiments of related SAR ship classification methods.
11. The Conclusion Section may be a little long. Some text can be moved to the Discussion Section.
12. In Line 171, the reference for the Complement Naïve Bayes model should be added. More details for Reference [61] should be provided.
Author Response
Additionally, the newly added text and modifications should be highlighted or shown in color for convenient review.
A second PDF, which highlights the changes, will be included in the submission.
Comment 1:
For the pseudocode of the watershed algorithm, if the authors believe that this addition could be redundant given the text, they might make an effort to simplify the text rather than presenting it directly. Additionally, in the next revision, the authors can replace the pseudocode with a flowchart of the watershed algorithm, where the intermediate results on SAR ship images for each step can be presented.
Response 1:
Thank you for your suggestion.
The pseudocode of the watershed algorithm has been replaced with a flowchart (Figure 2) that highlights the main steps of the algorithm.
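To complement the flowchart for readers of this response, a minimal marker-based watershed sketch is given below; the Otsu thresholding, the marker rule, and the capping percentile are generic assumptions for this example, not necessarily the exact settings described in Section 3.1.1 and Figure 2.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import threshold_otsu
from skimage.segmentation import watershed

def watershed_ship_mask(sar_chip, cap_percentile=99):
    """Generic marker-based watershed segmentation of a SAR ship chip."""
    # Intensity capping: clip extreme backscatter values at a chosen percentile.
    capped = np.clip(sar_chip, None, np.percentile(sar_chip, cap_percentile))
    # Separate bright ship pixels from sea clutter (Otsu threshold assumed here).
    binary = capped > threshold_otsu(capped)
    # Distance transform of the foreground acts as the topographic surface.
    distance = ndi.distance_transform_edt(binary)
    # Markers: confident foreground peaks, labelled as individual seeds.
    markers, _ = ndi.label(distance > 0.5 * distance.max())
    # Flood the inverted distance map from the markers, restricted to the mask.
    return watershed(-distance, markers, mask=binary)
```

The ship contour would then be taken as the boundary of the resulting labelled region.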
Comment 2:
In the Entropy-based Ensembling Section, more attention should be paid to the formula of the entropy. It would be better to give the entropy computation in terms of the multiple contours under the scope of the manuscript.
Response 2:
Thank you for your comment.
The formulas in Section 3.3 Entropy-based Ensembling already account for the multiple contours in the computations. Specifically, in Equations 21–30, the symbols K and k represent the total number of classifiers and a specific classifier, respectively, where each classifier is trained on a distinct contour.
For instance, in Equations 21 and 26, the summation is performed across the contours, while in Equation 25, the samples are expanded by treating each contour of the same sample as a distinct entity.
Comment 3:
The formula for the entropy-weighted averaging only integrates the predictions of different classifiers. Can the authors give more details and explanations of the role of the different segmentation contours in the ensemble classification?
Response 3:
In this context, the different classifiers correspond directly to the multiple contours. For instance, the entropy-weighted probabilities ensemble method combines predictions from five classifiers, each trained on a specific contour of the same sample. These contours are generated using the five intensity capping percentiles detailed in line 531.
The purpose of using multiple segmentation contours is to ensure that the ensemble classification is robust to variations in the contour extraction process. By integrating predictions from classifiers trained on different contours, the entropy-weighted averaging accounts for the uncertainty across these variations, thereby enhancing the reliability of the final classification outcome.
To elaborate this point in the paper, the following sentence has been added in section 3.3, lines 309-311:
‘Here, multiple classifiers refer to the same base classifier applied to identical input samples but trained on distinct contours derived from different threshold levels.’
Comment 4:
Does the hypothesis of the central limit theorem for the mean entropy hold exactly? As shown in Figure 8, the normal distribution for the mean entropy is not so apparent. The authors should give more explanations.
Response 4:
Thank you for your comment.
As stated in lines 358-364, this is an assumption that is based on the premise that the entropies, derived from models trained on slightly different data representations, are sufficiently independent and identically distributed. While Figure 10 (previously Figure 8) shows some deviations from a perfect Gaussian distribution, this can be attributed to factors such as the finite sample size and minor dependencies between the entropy values. Nonetheless, the mean entropy histograms in Figure 10 are closer to a normal distribution than the histograms from the single model. Additionally, the central tendency observed in the mean entropy distributions aligns with the expectation that extreme entropies require uniformity across classifiers, while mixed entropy values tend to average towards moderate uncertainty.
The Gaussian modelling approach, in which the mean and standard deviation are measured directly, can also be seen to provide a more calibrated estimation of confidence levels than single-model methods (presented in Table 7). Hence, while the central limit theorem might not hold exactly, it can be considered a valid assumption, as evidenced by the results.
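For illustration only, a sketch of how Gaussian statistics of the mean entropy could be turned into confidence bands is shown below; the one-standard-deviation cut-offs and the function name are assumptions made for this response, and the paper's actual thresholds are those defined in Section 3.4 (Equations 31-37).

```python
import numpy as np

def assign_confidence_levels(mean_entropies, n_std=1.0):
    """Stratify samples into High / Moderate / Low confidence bands.

    mean_entropies holds, for each test sample, its prediction entropy
    averaged over the K per-contour classifiers. Bands are formed from the
    mean and standard deviation of this (approximately Gaussian) distribution.
    """
    mu, sigma = mean_entropies.mean(), mean_entropies.std()
    levels = np.full(mean_entropies.shape, "Moderate", dtype=object)
    levels[mean_entropies <= mu - n_std * sigma] = "High"  # low entropy -> confident
    levels[mean_entropies >= mu + n_std * sigma] = "Low"   # high entropy -> uncertain
    return levels
```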
Comment 5:
There is some abuse of symbols in the revised manuscript, for example the K in Equations (21) and (23).
Response 5:
In the manuscript, K consistently refers to the total number of classifiers, which corresponds directly to the number of contours extracted using different threshold settings. Since the classifiers, thresholds, and feature sets are equivalent concepts in this study, the use of K in Equations (21) and (23) is consistent and appropriate within this context.
Comment 6:
The authors should carefully consider the necessity of the Entropy-based Confidence Levels being presented as a main part of the overall algorithm. As the experiments show, this method is only used to facilitate the analysis of the classification results, so it may be better to move it to the experimental setup section. Moreover, can the authors give more discussion of the separation of confidence levels, for example using two standard deviations? This may make a difference to the experimental result for the SVM classifier, where the Middle confidence level's F1 score exceeds that of the High confidence level.
Response 6:
The entropy-based confidence levels play a central role in quantifying prediction uncertainty, which is a critical component of the proposed methodology. By stratifying predictions into High, Moderate, and Low confidence levels, the method enhances the interpretability and reliability of the classification process, particularly for real-world applications where understanding the certainty of predictions is as important as the predictions themselves.
Regarding the observed discrepancy in the SVM ensemble's F1 scores for the OpenSARShip SLC dataset, as discussed in lines 759-770, this anomaly arises due to the interaction between the F1 score calculation and class imbalance in the test set. Specifically, the F1 score gives equal importance to all classes, and in the Bulk carrier category, only a single High confidence sample was present, which was misclassified. This misclassification, though statistically rare, disproportionately impacted the F1 score for the high confidence level.
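A small worked example (with purely hypothetical labels, not taken from the paper's test set) illustrates how a single misclassified sample in a one-sample class can halve the macro-averaged F1 score:

```python
from sklearn.metrics import f1_score

# Hypothetical High-confidence subset: 20 cargo ships (all correct) and one
# bulk carrier that is misclassified as cargo.
y_true = ["cargo"] * 20 + ["bulk"]
y_pred = ["cargo"] * 21

# Cargo F1 ~= 0.98, bulk F1 = 0, so the macro average drops to ~0.49.
print(f1_score(y_true, y_pred, average="macro", zero_division=0))
```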
The suggestion to use two standard deviations to separate confidence levels is valid. However, the current entropy-based thresholds were carefully chosen based on the statistical principles stated in lines 368-389 to ensure a meaningful stratification of predictions. While alternative thresholds might shift specific metrics such as the F1 score, the overall trends and conclusions drawn from the confidence levels remain consistent. Hence, the observed anomaly does not detract from the value of entropy-based confidence levels as a main part of the methodology.
Comment 7:
The subsections of 4.3, 4.4 and 4.5 should better be rearranged as 4.2.1, 4.2.2 and 4.2.3, respectively.
Response 7:
Thank you for your comment.
The subsections have been rearranged as suggested.
Comment 8:
In subsection 5.4, what do the "five models" refer to? Please give more explanation.
Response 8:
Thank you for your comment.
As mentioned in the response to Comment 3, the five models refer to the classifiers trained on each of the contours generated using the five intensity capping percentiles detailed in line 531.
To further clarify this in the paper, the following sentence has been added in Section 5.4, lines 701-703:
‘Each of these five models represents the same base classifier, trained on contours extracted using one of the five percentiles detailed in Section 4.2.3.’
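For clarity within this response, a minimal sketch of generating the per-percentile inputs is shown below; the five percentile values used here are placeholders, and the actual values are those listed in Section 4.2.3 of the paper.

```python
import numpy as np

def capped_variants(sar_chip, percentiles=(95, 96, 97, 98, 99)):
    """Return one intensity-capped copy of the chip per capping percentile.

    Each variant is segmented into its own contour and classified by its own
    model, giving the five models referred to in Section 5.4.
    """
    return [np.clip(sar_chip, None, np.percentile(sar_chip, p))
            for p in percentiles]
```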
Comment 9:
For the confidence level analysis, can the authors give some correctly and incorrectly classified ship images corresponding to the different confidence levels for intuitive comprehension?
Response 9:
Thank you for your comment.
A new figure (Figure 13) has been added which showcases examples of samples classified in each of the three confidence levels. A brief paragraph discussing this figure has also been added (lines 782-797).
Comment 10:
If possible, the authors should add some comparison experiments of related SAR ship classification methods.
Response 10:
The classification results in Tables 4 and 5 include four of the most widely used handcrafted feature sets for SAR ship classification, alongside a comparison with the VGG-19 model to provide a benchmark against a deep learning approach. This selection highlights the focus of the paper on handcrafted features while ensuring a meaningful contrast with an established CNN architecture.
Although related studies also experiment with OpenSARShip and FUSAR-Ship, the differences in training samples, validation splits, and preprocessing pipelines make direct comparisons with results from other papers imprecise and potentially misleading. Re-implementing these methods for this specific experimental setup could introduce further variability, given differences in hyperparameters, augmentations, or network optimisations that might not be explicitly detailed in the original papers.
Comment 11:
The Conclusion Section may be a little long. Some text can be moved to the Discussion Section.
Response 11:
Thank you for your comment.
It is acknowledged that the conclusion section is detailed to ensure a comprehensive summary of the study's findings and implications. We attempted to reorganise the text as suggested; however, doing so disrupted the flow of the manuscript, as the discussion primarily focuses on interpreting results rather than summarising the study or proposing future directions. We believe the current structure effectively ties the key results to the objectives outlined in the introduction, and we have therefore decided to keep it as it is.
Comment 12:
In Line 171, the reference for the Complement Naïve Bayes model should be added. More details for Reference [61] should be provided.
Response 12:
Thank you for your comment.
A reference has been added for the Complement Naïve Bayes model in line 171 (reference 38). Moreover, details for reference 62 (previously 61) have been added.