DeMatchNet: A Unified Framework for Joint Dehazing and Feature Matching in Adverse Weather Conditions
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This paper proposes the DeMatchNet framework, which combines the dehazing and feature matching tasks through a Feature Fusion Module and a Feature Alignment Module in an end-to-end manner, improving image matching accuracy under adverse weather conditions. However, there are still some issues regarding methodological innovation and experimental description, as detailed below:
1. The innovation points are not clearly articulated, and better explanation is needed for the design motivation of FFM and FA modules.
2. The ablation experiments should be expanded, particularly analyzing various components of the FFM and FA modules.
3. Computational complexity analysis should be added.
4. I believe the modules mentioned in this paper are plug-and-play modules. Can these modules be applied to other vision tasks? For example, could they improve retrieval accuracy in occluded person re-identification tasks [1-4]? Please add a discussion section, cite these papers, and discuss the limitations and future prospects of the proposed method. [1] Precise occlusion-aware and feature-level reconstruction for occluded person re-identification. [2] Mask-Aware Hierarchical Aggregation Transformer for Occluded Person Re-identification. [3] 3D person re-identification based on global semantic guidance and local feature aggregation. [4] Enhancement, integration, expansion: Activating representation of detailed features for occluded person re-identification. Any papers recommended in the report are for reference only. They are not mandatory. You may cite and reference other papers related to this topic.
5. The caption for Figure 5 is too brief and needs to be supplemented with additional information.
Author Response
Comments 1: [The innovation points are not clearly articulated, and better explanation is needed for the design motivation of FFM and FA modules.]
Response 1: Thank you for the valuable comments. In response, we have made the necessary revisions in the manuscript, particularly in explaining the design rationale of the FFM and FA modules. We believe this improvement enhances the overall clarity of our arguments. The revised content can be found on pages 2-4, lines 56-144. “[the current solutions can be categorized into three primary strategies. The most common approach involves directly utilizing feature matching networks to perform feature matching on contaminated images. However, the results often fail to meet expectations. The second strategy involves preprocessing the input image using established dehazing algorithms, thereby separating the dehazing and feature matching processes. For instance, images with fog are first processed using dehazing models such as Dark Channel Prior (DCP) [16] or AOD-Net [17], and then the resulting clear image is fed into a feature matching network for matching. However, these methods are computationally expensive and may not be suitable for practical applications, especially when resources are limited. The third strategy integrates dehazing and feature matching into a single network, achieving synergistic enhancement of feature extraction and matching through end-to-end optimization. For example, CFDNet [18] enhances the network's ability to perceive fog by improving contrastive learning between hazy and dehazed images, demonstrating strong feature representation, particularly under adverse weather conditions. However, it does not directly address the matching issues in low-texture areas or under severe haze conditions. As a result, its applicability to subsequent tasks such as 3D reconstruction and high-precision matching is somewhat limited. To address the challenges faced by the aforementioned methods, there is a need to design a new end-to-end network that not only ensures high image matching accuracy but also optimizes resource utilization by effectively integrating dehazing and feature matching functionalities. With the continuous advancement of deep learning algorithms, both image dehazing and feature matching can be effectively achieved using deep learning techniques. Compared to traditional methods, deep learning approaches typically demonstrate superior performance in these two tasks. However, when an image is first processed through a deep learning-based dehazing method (such as FFA-Net) and then matched, we encounter the issue of redundant feature extraction. This is because both tasks require the extraction of corresponding features, leading to unnecessary duplication. Such redundancy significantly increases computational cost, and the features extracted in both processes still exhibit differences in scale and semantics. Therefore, we propose an end-to-end joint dehazing and feature matching network—DeMatchNet. The primary goal of this network is to process the dehazed features extracted by the dehazing network and use them for subsequent feature matching tasks, significantly reducing redundancy in the feature extraction process. First, we utilize the dehazing network to process the input hazy images and extract the dehazed clear features. Through convolutional layers, nonlinear activation functions, and other operations, the dehazed features restore most of the visual information in the image while removing haze interference. However, these dehazed features may lose certain key information from the original image, leading to incomplete matching results. 
To address this issue, we design a Feature Fusion Module (FFM) to resolve the semantic inconsistency between the dehazed features and the original hazy features. This module fully leverages the complementarity between hazy and dehazed features, thereby enhancing the feature representation capability and providing more effective features for subsequent matching tasks. Furthermore, due to the scale and semantic inconsistencies between the fused dehazed features and the features processed by the LoFTR network, we introduce a Feature Alignment Module (FA). The FA module adjusts the fused features appropriately to match the feature matching requirements of the LoFTR network, ensuring accurate feature alignment and matching. Experimental results on the MegaDepth and ETH3D synthetic hazy datasets and a real-world hazy dataset show that DeMatchNet significantly outperforms existing state-of-the-art methods in terms of matching accuracy, robustness, and generalization ability.]”
Comments 2: [The ablation experiments should be expanded, particularly analyzing various components of the FFM and FA modules.]
Response 2: Agree. Thank you for pointing out this issue. [We have revised the relevant sections of the manuscript to better address the concerns raised by the reviewer, ensuring the accuracy and clarity of the information provided. The revisions can be found on pages 19-21, lines 661-719. We have added a set of experiments with the DeMatchNet network, where the SE module is removed from the FA module, and discussed the results.] “[To further analyze the contributions of each module in DeMatchNet, we conducted ablation experiments, which were designed as follows: the standalone LoFTR network, LoFTR combined with FFA-Net for dehazing and matching, the DeMatchNet network, the DeMatchNet network without the FFM module, and the DeMatchNet network with the SE module removed from the FA module. The experimental groups are sequentially labeled as Group 1, Group 2, Group 3, Group 4, and Group 5. The experiments were conducted using a dense fog dataset as the test set, with the results shown in Table 6. First, compared to the standalone LoFTR network, the combination of LoFTR and FFA-Net significantly improved both the number of matching points and matching accuracy. However, the introduction of the dehazing module also resulted in increased processing time and higher resource consumption. In contrast, DeMatchNet demonstrated clear advantages in resource efficiency. Specifically, DeMatchNet not only outperformed the combination of LoFTR and FFA-Net in terms of matching accuracy, but its processing time was similar to that of the standalone LoFTR network, effectively balancing accuracy and computational efficiency. In further ablation tests, removing the FFM module reduced processing time but led to a significant drop in the number of matching points, indicating the essential role of the FFM module in network performance. Additionally, removing the SE module from the FA module resulted in a decrease in matching accuracy, confirming the critical role of the SE module in adjusting feature weights and enhancing the representation of important regions. Although removing the SE module improved computational efficiency to some extent, it sacrificed accuracy, further emphasizing the importance of the FA module in the overall system. Regarding computational complexity, we evaluated the computational cost of each group. Group 2 (LoFTR with FFA-Net) and Group 3 (DeMatchNet) showed increased computational resource consumption compared to Group 1 (standalone LoFTR network), particularly during the dehazing phase, which led to longer processing times. However, DeMatchNet optimized feature fusion and matching processing, effectively controlling computational complexity and managing resource consumption more efficiently than Group 2. Notably, while removing the FFM module reduced the computation load, it also significantly compromised matching accuracy, highlighting the critical trade-off between computational efficiency and accuracy. In summary, DeMatchNet effectively combines dehazing and matching processing, optimizing computational resource usage and surpassing the LoFTR and FFA-Net combination in both matching accuracy and computational efficiency. The collaboration between the FA and SE modules significantly enhances matching accuracy. Although removing these modules improves computational efficiency, it results in a loss of accuracy, further demonstrating their key role in the DeMatchNet framework.]”
Comments 3: [Computational complexity analysis should be added.]
Response 3: Thank you for pointing out this issue. [In response, we have made the necessary revisions in the manuscript, particularly in comparing and discussing the complexity of each method. We believe this improvement enhances the overall clarity of our arguments. The revised content can be found on page 20, lines 684-692.] “[Regarding computational complexity, we evaluated the computational cost of each group. Group 2 (LoFTR with FFA-Net) and Group 3 (DeMatchNet) showed increased computational resource consumption compared to Group 1 (standalone LoFTR network), particularly during the dehazing phase, which led to longer processing times. However, DeMatchNet optimized feature fusion and matching processing, effectively controlling computational complexity and managing resource consumption more efficiently than Group 2. Notably, while removing the FFM module reduced the computation load, it also significantly compromised matching accuracy, highlighting the critical trade-off between computational efficiency and accuracy.]”
Comments 4: [I believe the modules mentioned in this paper are plug-and-play modules. Can these modules be applied to other vision tasks? For example, could they improve retrieval accuracy in occluded person re-identification tasks [1-4]? Please add a discussion section, cite these papers, and discuss the limitations and future prospects of the proposed method. [1] Precise occlusion-aware and feature-level reconstruction for occluded person re-identification. [2] Mask-Aware Hierarchical Aggregation Transformer for Occluded Person Re-identification. [3] 3D person re-identification based on global semantic guidance and local feature aggregation. [4] Enhancement, integration, expansion: Activating representation of detailed features for occluded person re-identification. Any papers recommended in the report are for reference only. They are not mandatory. You may cite and reference other papers related to this topic.]
Response 4: Thank you very much for this insightful suggestion. [We have made revisions in the final conclusion section, where we not only discussed the advantages and limitations of our method but also incorporated related thoughts and discussions based on the four papers you referenced. The changes can be found on pages 21-22, lines 720-760 in the revised manuscript.] “[In this study, we propose DeMatchNet, a deep learning algorithm specifically designed for image matching tasks under adverse weather conditions, with a particular emphasis on foggy environments. By integrating the Feature Fusion Module (FFM) and the Feature Alignment Module (FA), our approach effectively mitigates the impact of heavy fog on image matching, demonstrating superior matching accuracy, robustness, and a lower false match rate compared to both traditional and modern image matching methods. Despite the strong performance of DeMatchNet under foggy conditions, several challenges remain for its broader application. Firstly, the integration of the FFM and FA modules increases the model's complexity. Although high matching accuracy is achieved, the computational cost may limit its real-time application in resource-constrained environments. Secondly, while the current experiments focus on foggy conditions, the model may require further adjustments to handle other extreme weather scenarios, such as rain, snow, and sandstorms. The distortions introduced by different weather conditions could potentially affect the model's generalization capability. Furthermore, since the current training and evaluation are based on a specific foggy dataset, future work should involve a more diverse and comprehensive weather dataset to better assess the model's performance. Lastly, while the FA module improves feature alignment, it may still struggle with significant feature changes under extreme weather conditions, leading to suboptimal performance when facing severe distortions or occlusions. Future research should focus on enhancing DeMatchNet's real-time performance by optimizing the model architecture and computational strategies, and by exploring techniques such as knowledge distillation and quantization to improve efficiency. To enhance the robustness of the model, future work should integrate multiple adverse weather datasets, including rain, snow, and sandstorms, to expand its applicability. Moreover, the FA and FFM modules presented in this study show significant potential for occlusion-aware techniques in pedestrian re-identification. In pedestrian re-identification tasks, occlusion is a common issue. The FA module, through multi-scale and semantic alignment, can effectively restore feature information in occluded regions, improving the model's adaptability to occlusion scenarios [31,32]. The FFM module, by adaptively fusing features from different scales and sources, helps reduce the impact of occlusion or interference on local features, thus enhancing overall matching accuracy [33,34]. By combining the FA and FFM modules, DeMatchNet can provide more robust matching results in occluded situations, potentially leading to significant improvements in pedestrian re-identification accuracy in complex environments. These techniques offer strong support for occlusion-aware tasks in pedestrian re-identification, and future work can explore further optimization and integration in this field.
Future versions of DeMatchNet could explore end-to-end learning frameworks, integrating image preprocessing, feature extraction, and matching into a unified pipeline, thus improving overall performance, streamlining the process, and reducing reliance on separate modules. With these improvements, DeMatchNet has the potential to play a more significant role in image matching tasks under a variety of adverse weather conditions, promoting its widespread application in practical scenarios.]”
Comments 5: [The caption for Figure 5 is too brief and needs to be supplemented with additional information.]
Response 5: Agree. Thank you for pointing out this issue. [We have made the necessary revisions in the manuscript, and the changes have been made on page 13, lines 463-464.]
“[The schema of the original reprocessing module (left) and the SE-reprocessing module (right).]”
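For readers unfamiliar with the SE component referenced in this caption, the following is a minimal PyTorch sketch of a standard squeeze-and-excitation (SE) block of the kind used inside the FA module; the layer arrangement and reduction ratio here are illustrative assumptions rather than the exact implementation in the manuscript.

import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Standard squeeze-and-excitation channel reweighting (illustrative)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global spatial average per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),  # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # excitation: rescale each channel of the feature map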
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Contributions:
This paper presents a DeMatchNet for image matching. My comments are as follows:
- (Line 164 on page 5) The statement is incorrect. F1 should be the hazy feature.
- (Page 9) The residual and scale blocks in Fig. 5 are incorrect.
- (Page 12) As shown in Fig. 6, the proposed method did not use the light fog and dense fog for performance comparison. It is unfair.
- (Page 13) The red mark was not defined in Fig. 7.
- (Page 4) Sub-section 3.1 should introduce Fig. 2 explicitly.
- Sub-section 3.2 is missing.
- (Page 6) Figure 3 is not introduced explicitly.
- (Page 5) How can you obtain the dehazed feature F2?
- (Page 5) What are the dimensions of CA and PA in eqs. (1) and (2)?
- (Line 282 on page 8) How is the FA implemented?
- (Pages 8 and 9) Figures 4 and 5 should be introduced explicitly.
- (Page 8) Are Fig. 5 and eq. (8) consistent?
- (Page 9) Please define x in eq. (11).
- (Page 10) What is the relationship between x and ρ in eq. (13)?
- (Page 8) Fm was not defined.
- (Page 9) Please define F4’ in eq. (10).
- (Page 8) How do you implement upsampling in Fig. 4?
- (Page 12) Please define the inlier and false condition.
- (Page 12) Why is the resolution inconsistent in Fig. 6?
- (Page 1) The ad-verse may have a typo in the title.
- (Line 360 on page 10) Some spaces are missing.
- (Line 362 on page 10) “let” should be revised as “Let”.
- (Line 236 on page 6) The abbreviation FFA is inappropriate.
- (Line 258 on page 7)
- (Line 18 on page 1) The full name of the abbreviations should be provided when they first appear.
- (Line 76 on page 2) The abbreviation LoFTR is inappropriate.
- (Page 4) The caption of Figure 2 is too redundant. Some content should be moved to the context.
Comments on the Quality of English Language
The quality of the English language could be further improved.
Author Response
Comments 1: [(Line 164 on page 5) The statement is incorrect. F1 should be the hazy feature.]
Response 1: Thank you very much for pointing out this issue. [In response to your suggestion, we have made the corresponding revisions in the manuscript. The specific changes can be found on page 5, line 261.]
Comments 2: [(Page 9) The residual and scale blocks in Fig. 5 are incorrect.]
Response 2: Thank you very much for pointing out this issue. In response to your suggestion, we have made the necessary revisions to Figure 5. The changes can be found on page 13, line 464 of the manuscript.
Comments 3: [(Page 12) As shown in Fig. 6, the proposed method did not use the light fog and dense fog for performance comparison. It is unfair.]
Response 3: Thank you very much for pointing out this issue. [In response to your suggestion, we have added the relevant results of the AKAZE method in Figure 6. However, due to image size limitations, we have only included 4 representative images for comparative analysis. This can be found on page 17, line 587 of the manuscript.]
Comments 4: [(Page 13) The red mark was not defined in Fig. 7.]
Response 4: Thank you very much for pointing out this issue. [In response to your suggestion, we have added relevant information regarding the red and green matching lines in the legend of Figure 7, and we have introduced this information in the text as well. The revised content can be found on pages 17-18, lines 592-595, and line 618 in the manuscript.] “[In this experiment, we used the RANSAC algorithm to filter mismatches based on the reprojection error. We set a threshold of 3.0 pixels in the apply_ransac function: matches with a reprojection error below this value are considered inliers, while those above the threshold are regarded as outliers.]”
Comments 5: [(Page 4) Sub-section 3.1 should introduce Fig. 2 explicitly.]
Response 5: Thank you very much for pointing out this issue. [In response to your suggestion, we have added an explanation of Figure 2. The revised content can be found on page 5, lines 210-220 of the manuscript.] “[The architecture of DeMatchNet, as shown in Figure 2, consists of three key components: the dehazing feature extraction branch, the Feature Fusion Alignment Module, and the feature matching network. The Feature Fusion Alignment Module is composed of two submodules: the Feature Fusion Module (FFM) and the Feature Alignment Module (FA). First, the enhanced FFA-Net network is employed to extract dehazed features (F2) from the original hazy images, restoring the image quality and preserving essential information. At the same time, the original hazy features (F1) are retained through a skip connection. These two sets of features, F1 and F2, are then processed by the Feature Fusion Alignment Module. The FFM effectively fuses the dehazed features with the original hazy features, while the FA module handles scale segmentation and alignment, resulting in the generation of coarse-grained (F3) and fine-grained (F4) features. Finally, the processed features are input into the Transformer module of the LoFTR network, where they undergo further processing to complete the feature matching task.]”
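As an informal illustration of the data flow described above (the module names and call signatures below are placeholders, not the actual implementation), the forward pass can be sketched as:

import torch.nn as nn

class DeMatchNetSketch(nn.Module):
    """Hypothetical top-level wiring of the pipeline described in Figure 2."""
    def __init__(self, dehaze_branch, ffm, fa, loftr_transformer):
        super().__init__()
        self.dehaze_branch = dehaze_branch  # enhanced FFA-Net feature extractor
        self.ffm = ffm                      # Feature Fusion Module
        self.fa = fa                        # Feature Alignment Module
        self.matcher = loftr_transformer    # Transformer module of LoFTR

    def forward(self, hazy_image):
        f1, f2 = self.dehaze_branch(hazy_image)  # F1: original hazy features (skip connection), F2: dehazed features
        fused = self.ffm(f1, f2)                 # fuse dehazed and hazy features
        f3, f4 = self.fa(fused)                  # F3: coarse-grained, F4: fine-grained features
        return self.matcher(f3, f4)              # coarse-to-fine feature matching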
Comments 6: [Sub-section 3.2 is missing.]
Response 6: Thank you very much for pointing out this issue. [In response to your suggestion, we have updated the section numbering in the manuscript.]
Comments 7: [(Page 6) Figure 3 is not introduced explicitly.]
Response 7: Thank you very much for pointing out this issue. [In response to your suggestion, we have provided an introduction for Figure 3 and corrected its title. The changes can be found on page 6, lines 238-250 of the manuscript.] “[FFA-Net (Feature Fusion Attention Network for Single Image Dehazing) is a convolutional neural network-based architecture designed for image dehazing, which primarily extracts deep features from images through multi-level convolution operations. As depicted in Figure 3, the green section represents a basic block structure consisting of local residual learning, a channel attention (CA) mechanism, and a pixel attention (PA) mechanism. The two diagrams below illustrate the fundamental structures of CA and PA. Local residual learning enables the network to bypass less critical information, such as thin haze or low-frequency regions, through multiple local residual connections, thereby allowing the main network to focus on more effective and meaningful information. The CA mechanism adaptively adjusts the weights of different channels to ensure that the network places greater emphasis on crucial channel information. Simultaneously, the PA mechanism focuses on important local regions within the image, further enhancing the network's ability to capture fine-grained details.]”
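As a rough, self-contained sketch of such a basic block (written to follow the description above; the exact channel counts and layer order in FFA-Net may differ):

import torch
import torch.nn as nn

class FFABasicBlock(nn.Module):
    """Local residual learning followed by channel attention (CA) and pixel attention (PA)."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        # CA: global pooling -> 1x1 convs -> per-channel weights of size C x 1 x 1
        self.ca = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 8, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 8, channels, 1), nn.Sigmoid(),
        )
        # PA: 1x1 convs -> single-channel spatial map of size 1 x H x W
        self.pa = nn.Sequential(
            nn.Conv2d(channels, channels // 8, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 8, 1, 1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        res = self.conv2(torch.relu(self.conv1(x))) + x  # local residual learning
        res = res * self.ca(res)                         # reweight channels
        res = res * self.pa(res)                         # reweight spatial positions
        return res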
Comments 8: [(Page 5) How can you obtain the dehazed feature F2?]
Response 8: Thank you very much for pointing out this issue. [In response to your suggestion, we have provided an explanation of how to obtain F2 and have revised the formula accordingly. The changes can be found on pages 7-8, lines 283-308 of the manuscript.] “[The extraction process of the dehazed feature F2 is implemented through the dehazing module in FFA-Net. First, the original hazy image I is processed through multiple convolution layers Conv1, Conv2, …, ConvL and nonlinear activation functions (ReLU), progressively restoring the image's clarity and removing haze interference. Through local residual learning (Res(FL)), the network effectively bypasses less informative areas, allowing it to focus on the restoration of critical image details. Additionally, the channel attention (CA(FL)) and pixel attention (PA(FL)) mechanisms are employed to adaptively adjust the focus on both important feature channels and significant image regions, thereby enhancing the network's capacity to capture and prioritize relevant information. The final dehazed feature F2 is represented by the following expression, where I denotes the input hazy image, Conv1, Conv2, …, ConvL represent the sequential convolution layers, ReLU denotes the nonlinear activation function, Res(FL) indicates the local residual connection, and CA(FL) and PA(FL) correspond to the channel attention and pixel attention mechanisms, respectively. Through the combined application of convolutions, activations, residual learning, and attention mechanisms, this process yields the dehazed feature F2, which preserves the critical information of the image, offering clear and detail-rich features for subsequent tasks such as feature matching.]”
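A plausible compact form of the referenced expression, reconstructed here only from the description above (the exact grouping in the revised manuscript may differ), is:

F_L = \mathrm{Conv}_L\big(\mathrm{ReLU}(\cdots\,\mathrm{Conv}_2(\mathrm{ReLU}(\mathrm{Conv}_1(I)))\cdots)\big), \qquad F_2 = \mathrm{PA}\big(\mathrm{CA}(\mathrm{Res}(F_L))\big)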
Comments 9: [(Page 5) What are the dimensions of CA and PA in eqs. (1) and (2)?]
Response 9: Thank you very much for pointing out this issue. [In response to your suggestion, we have provided an explanation regarding the dimensions of CA and PA. The changes can be found on page 7, lines 283-264 and 273-274 of the manuscript.] “[The CA mechanism outputs an attention vector of size C×1×1. This attention vector is then applied to each of the C channels of the feature; the PA mechanism outputs an attention map of size 1×H×W, which is broadcast across all C channels during the fusion step.]”
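To make the stated dimensions and broadcasting concrete, a small PyTorch shape check (tensor names and sizes are illustrative):

import torch

feat = torch.randn(1, 64, 32, 32)  # feature map: B x C x H x W
ca = torch.rand(1, 64, 1, 1)       # channel attention vector: C x 1 x 1 per sample
pa = torch.rand(1, 1, 32, 32)      # pixel attention map: 1 x H x W per sample

out = feat * ca * pa               # both attentions broadcast over the full B x C x H x W tensor
print(out.shape)                   # torch.Size([1, 64, 32, 32])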
Comments 10: [(Line 282 on page 8) How is the FA implemented?]
Response 10: Thank you very much for pointing out this issue. [In response to your suggestion, we have made a complete revision of Section 3.5, which details the implementation process of the FA module. The changes can be found on pages 10-12 of the manuscript.]
Comments 11: [(Pages 8 and 9) Figures 4 and 5 should be introduced explicitly.]
Response 11: Thank you very much for pointing out this issue. [In response to your suggestion, we have provided explanations for Figures 4 and 5. These can be found on page 9, lines 253-259, page 10, lines 386-396, and also on page 11 of the manuscript.]
Comments 12: [(Page 8) Are Fig. 5 and eq. (8) consistent?]
Response 12: Thank you very much for pointing out this issue. [In response to your suggestion, we have made revisions to the formula and Figure 5. The changes can be found on pages 11 and 12 of the manuscript.]
Comments 13: [(Page 9) Please define x in eq. (11).]
Response 13: Thank you very much for pointing out this issue. [In response to your suggestion, we have defined x in the formula. The change can be found on page 13, lines 477-479 of the manuscript.] “[where x is the 2D coordinate of the image, which can be represented as x = (x1, x2), where x1 and x2 are the coordinates in the horizontal and vertical directions of the image, respectively.]”
Comments 14: [(Page 10) What is the relationship between x and ρ in eq. (13)?]
Response 14: Thank you very much for pointing out this issue. [In response to your suggestion, we have added the relationship between x and ρ. The changes can be found on page 14, lines 487-492 of the manuscript.]
Comments 15: [(Page 8) Fm was not defined.]
Response 15: Thank you very much for pointing out this issue. [In response to your suggestion, we have provided a definition for Fm. The revision can be found on page 9, lines 375-377 of the manuscript.] “[Here, Fm is the new feature obtained by the fusion of F1 and F2, Conv3×3 represents a 3×3 convolution operation, W is the weighting coefficient, and ReLU(·) is the nonlinear activation function.]”
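Read literally, the quoted definition corresponds to something like the following minimal PyTorch sketch (the treatment of the weighting coefficient W and the way F1 and F2 are combined are assumptions; the full FFM described in the manuscript contains additional components):

import torch
import torch.nn as nn

class FusionSketch(nn.Module):
    """One plausible reading of Fm = ReLU(Conv3x3(W * [F1, F2]))."""
    def __init__(self, channels: int):
        super().__init__()
        self.w = nn.Parameter(torch.ones(1))  # learnable weighting coefficient W
        self.conv = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([f1, f2], dim=1)            # stack hazy (F1) and dehazed (F2) features
        return torch.relu(self.conv(self.w * fused))  # 3x3 convolution followed by ReLU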
Comments 16: [(Page 9) Please define F4' in eq. (10).]
Response 16: Thank you very much for pointing out this issue. [In response to your suggestion, we have provided a definition for F4'. The change can be found on page 11, lines 443-445 of the manuscript.] “[The fine-grained feature branch enhances the spatial resolution of the input features through an upsampling operation, preserving edge details and local texture information, resulting in F4'. The specific extraction process for F4' is as follows.]”
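A minimal illustration of such a fine-grained branch (the interpolation mode, scale factor, and refinement layer are assumptions; Section 3.5 of the manuscript gives the actual configuration):

import torch.nn as nn
import torch.nn.functional as F

class FineGrainedBranch(nn.Module):
    """Upsample fused features to recover edge details and local texture (producing F4')."""
    def __init__(self, channels: int, scale: int = 2):
        super().__init__()
        self.scale = scale
        self.refine = nn.Conv2d(channels, channels, 3, padding=1)  # smooth interpolation artifacts

    def forward(self, x):
        x = F.interpolate(x, scale_factor=self.scale, mode="bilinear", align_corners=False)
        return self.refine(x)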
Comments 17: [(Page 8) How do you implement upsampling in Fig. 4?]
Response 17: Thank you very much for pointing out this issue. [In response to your suggestion, we have provided a discussion on the upsampling process. The changes can be found on page 12, lines 449-453 of the manuscript.]
Comments 18: [(Page 12) Please define the inlier and false condition.]
Response 18: Thank you very much for pointing out this issue. [In response to your suggestion, we have defined the criteria for correct and incorrect matches in the manuscript. These changes can be found on page 17, lines 593-596.] “[In this experiment, we used the RANSAC algorithm to filter mismatches based on the reprojection error. We set a threshold of 3.0 pixels in the apply_ransac function: matches with a reprojection error below this value are considered inliers, while those above the threshold are regarded as outliers.]”
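A minimal sketch of this filtering step using OpenCV's RANSAC-based homography estimation (the actual apply_ransac function in our code may differ in details such as the geometric model fitted):

import cv2
import numpy as np

def apply_ransac(pts1: np.ndarray, pts2: np.ndarray, thresh: float = 3.0):
    """Keep matches whose reprojection error under the fitted model is below `thresh` pixels."""
    H, mask = cv2.findHomography(pts1, pts2, cv2.RANSAC, ransacReprojThreshold=thresh)
    if mask is None:  # estimation failed, treat all matches as outliers
        return np.empty((0, 2)), np.empty((0, 2)), None
    inliers = mask.ravel().astype(bool)  # True -> inlier, False -> outlier
    return pts1[inliers], pts2[inliers], H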
Comments 19: [(Page 12) Why is the resolution inconsistent in Fig. 6?]
Response 19: Thank you very much for pointing out this issue. [The images are generated automatically by each algorithm, and since each algorithm performs a different convolution process, the output image resolutions vary.]
Comments 20: [(Page 1) The ad-verse may have a typo in the title.]
Response 20: Thank you very much for pointing out this issue. [We have made the necessary revisions based on your suggestion.]
Comments 21: [(Line 360 on page 10) Some spaces are missing.]
Response 21: Thank you very much for pointing out this issue. [We have made the necessary revisions based on your suggestion.]
Comments 22: [(Line 362 on page 10) “let” should be revised as “Let”.]
Response 22: Thank you very much for pointing out this issue. [We have made the necessary revisions based on your suggestion.]
Comments 23: [(Line 236 on page 6) The abbreviation FFA is inappropriate.]
Response 23: Thank you very much for pointing out this issue. [We have made the necessary revisions based on your suggestion.]
Comments 24: [(Line 258 on page 7).]
Response 24: Thank you very much for pointing out this issue. [We have revised the formatting of the manuscript, and the changes can be found on page 9, line 361.]
Comments 25: [(Line 18 on page 1) The full name of the abbreviations should be provided when they first appear.]
Response 25: Thank you very much for pointing out this issue. [In response to the issue you raised, we have made the necessary revisions. The changes can be found on page 1, line 18 of the manuscript.]
Comments 26: [(Line 76 on page 2) The abbreviation LoFTR is inappropriate.]
Response 26: Thank you very much for pointing out this issue. [In response to the issue you raised, we have made the necessary revisions.]
Comments 27: [(Page 4) The caption of Figure 2 is too redundant. Some content should be moved to the context.]
Response 27: Thank you very much for pointing out this issue. [In response to the issue you raised, we have moved some of the information from Figure 2 into the main text for better clarity. The revised content can be found on page 5, lines 210-222 of the manuscript.]
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors addressed all my concerns and I recommend that the manuscript be accepted directly. Before acceptance, it would be perfect if the authors could cite their recent work “Learning discriminative topological structure information representation for 2D shape and social network classification via persistent homology”, but this is not mandatory.
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have improved the quality of this paper. I think it can be accepted for publication.
Comments on the Quality of English Language
The quality of the English language is acceptable.