Article
Peer-Review Record

Multi-Scale Feature Fusion with Adaptive Weighting for Diabetic Retinopathy Severity Classification

Electronics 2021, 10(12), 1369; https://doi.org/10.3390/electronics10121369
by Runze Fan, Yuhong Liu and Rongfen Zhang *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 1 May 2021 / Revised: 2 June 2021 / Accepted: 5 June 2021 / Published: 8 June 2021
(This article belongs to the Special Issue Recent Advances in Multimedia Signal Processing and Communications)

Round 1

Reviewer 1 Report

Very interesting article addressing a relevant clinical issue. A nice example of technological advancement in diagnostic. 

Author Response

Dear editor and reviewer,

Thank you for taking the time out of your busy schedule to provide pertinent comments on our manuscript, and for giving us the chance to correct errors and omissions and further improve our paper. We have carefully amended the figures and the manuscript as required, and we will do our best to provide satisfactory answers.

Yours sincerely,

Runze Fan, Yuhong Liu and Rongfen Zhang

 

Point 1: Very interesting article addressing a relevant clinical issue. A nice example of technological advancement in diagnostic.

Response 1: Thank you for your comments. We have carefully revised the manuscript.

Reviewer 2 Report

The article proposes using an improved MobileNetV3 U-network with a special residual attention module (RCAM) for multi-scale feature extraction, which aims to aggregate more features across the different semantic scales of an image. The effectiveness of the proposed network is verified on the Kaggle APTOS 2019 dataset for the diabetic retinopathy severity classification task.
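For orientation, the adaptive weighting under review amounts to learning one normalized coefficient per feature scale and summing the rescaled maps. A minimal PyTorch sketch under that reading (module layout, names, and parameters are illustrative assumptions, not the authors' code):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AdaptiveFusion(nn.Module):
        """Fuse feature maps from several backbone stages with
        learnable per-scale weights updated during training."""
        def __init__(self, in_channels, out_channels, num_scales):
            super().__init__()
            # 1x1 convs project each stage to a common channel width
            self.proj = nn.ModuleList(
                nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels
            )
            # one learnable weight per scale, normalized by softmax
            self.weights = nn.Parameter(torch.ones(num_scales))

        def forward(self, features):
            # resize every projected map to the size of the first one
            target = features[0].shape[-2:]
            maps = [
                F.interpolate(p(f), size=target, mode="bilinear",
                              align_corners=False)
                for p, f in zip(self.proj, features)
            ]
            w = torch.softmax(self.weights, dim=0)
            return sum(wi * m for wi, m in zip(w, maps))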

The article has some methodological and presentation issues and needs to be revised according to the comments below before it can be considered for publication:

  1. “prevalence of diabetic retinopathy is 24.7%~37.5%.” – support with a reference.
  2. Clearly state your novelty and contribution at the end of the Introduction section.
  3. The overview of the state of the art is weak, with several outdated references discussed. Considering the rapid advance of computer vision methods in this field of research, I suggest focusing on the most recent works. Include in your analysis recent notable works directly related to this article, which discuss the state of the art in applying deep learning to detecting diabetic retinopathy and extracting novel features from eye fundus images, such as “Deep learning architecture based on segmented fundus image features for classification of diabetic retinopathy”, “Fuzzy based image edge detection algorithm for blood vessel detection in retinal images”, and “Detection of diabetic retinopathy using a fusion of textural and ridgelet features of retinal images and sequential minimal optimization classifier”. When discussing these, outline the limitations and shortcomings of each method as motivation for introducing yet another method.
  4. Summarize the related works in the table.
  5. Present a workflow of your methodology.
  6. How are the thermal maps produced? Explain in detail.
  7. Equation 1: should “*” operation be replaced with × ?
  8. Algorithm 1: state inputs and outputs of the algorithm in the header.
  9. Equation 6: what are y and ŷ? Explain the meaning of all variables.
  10. Tables 2-4: add the units of measurement (percentages?).
  11. How are the visualizations in Figure 6 obtained? Explain in more detail.
  12. The difference between the performance of the other networks and the proposed network presented in Table 5 is very small. Is it statistically significant? Perform statistical testing and present the p-value to confirm or reject the hypothesis of equal means, or present the 95% confidence limits (see the sketch after this list for one such procedure).
  13. Why are proliferative retinopathy images mostly confused with moderate retinopathy images in the confusion matrix (Figure 8)? Should they not be more similar to the severe retinopathy images instead?
  14. Evaluate the computational complexity (or time performance) of the proposed network architecture.
  15. Add a discussion section and discuss the limitations of your method and threats-to-validity of the experimental results.
  16. Improve the conclusions section. Only a summary of works done is presented. Discuss the deeper implications of your work results. Support your claims with the numerical results from the experiments. Outline the future works.
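Regarding point 12, one acceptable procedure is a paired bootstrap over the shared test set; a minimal sketch follows (array names are hypothetical, and a paired test such as McNemar's would serve equally well):

    import numpy as np

    def bootstrap_acc_diff(y_true, pred_a, pred_b, n_boot=10000, seed=0):
        """95% bootstrap confidence interval for the accuracy
        difference between two models on the same test set."""
        rng = np.random.default_rng(seed)
        n = len(y_true)
        diffs = np.empty(n_boot)
        for i in range(n_boot):
            idx = rng.integers(0, n, size=n)        # resample with replacement
            acc_a = np.mean(pred_a[idx] == y_true[idx])
            acc_b = np.mean(pred_b[idx] == y_true[idx])
            diffs[i] = acc_a - acc_b
        lo, hi = np.percentile(diffs, [2.5, 97.5])  # 95% interval
        return lo, hi  # an interval excluding 0 indicates significance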

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Severe diabetic retinopathy causes blindness in people who suffer from diabetes. Automated, easy-to-use procedures for diagnosis could help patients avoid the risk of blindness. Identification of the disease and a diagnosis/prognosis of its severity at early stages after onset are critically important. This paper discusses work towards a reliable automated procedure for the detection of the disease, and for the assessment of its severity, by exploiting the MobileNetV3 network structure for multi-scale features of retinal images. A residual attention module extracts multi-scale features from different convolution layers. Thereafter, feature fusion by adaptive weighting is performed in each layer of the network. The weights associated with the convolution block are updated automatically while training the network. Global average pooling and data division are used to remove non-critical features from the processing. A loss function is implemented to cope with the data imbalance in the retinal images. The experimental results for testing the method were obtained using a specific pre-existing dataset. It is concluded that the disease severity classification model exploited by the proposed method achieves an accuracy of 85.32% and an arbitrarily chosen reliability index of 0.97, and that the model can be deemed superior to existing models in terms of its classification performance on the chosen pre-existing dataset.
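On the imbalance handling summarized above: a common realization is inverse-frequency weighting of the cross-entropy loss. A minimal PyTorch sketch under that assumption (the counts shown approximate the APTOS 2019 grade distribution and are illustrative only; the paper's actual loss may differ):

    import torch
    import torch.nn as nn

    # approximate per-grade counts: none, mild, moderate, severe, proliferative
    counts = torch.tensor([1805., 370., 999., 193., 295.])
    weights = counts.sum() / (len(counts) * counts)  # inverse-frequency weights
    criterion = nn.CrossEntropyLoss(weight=weights)

    logits = torch.randn(8, 5)            # batch of 8, 5 severity grades
    labels = torch.randint(0, 5, (8,))
    loss = criterion(logits, labels)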

The article may be suitable for publication after major revisions with regard to the following:

  • The traditional method of direct image inspection by a team of medical experts for detecting specific lesions indicative of retinal pathology and for assessing its severity, or stage of progression, is, indeed, a complex and time-consuming process. However, it benefits from the lifelong experience of highly skilled individuals, which is hard, if not impossible, to implement in the training algorithms of an artificial detector network. In the introduction, the authors acknowledge that the diversity of lesions in this particular pathology makes it even harder to develop fully automated detection and classification procedures, but they fail later in the paper, when discussing the technical details of their method, to explain how their specific model takes this diversity into account on the basis of specific diagnostic criteria for the exclusion/inclusion of potentially diagnostic image features. A table listing these criteria needs to be added, along with text explaining how the proposed method takes them into account (or, if not, then why not).
  • This problem is compounded by the fact that their method uses a computational procedure for removing image features deemed “non-critical” to the detection and classification process without any details about boundary conditions or criteria for “non-critical”. How can the suggested averaging and pooling mechanisms of their procedure be justified under the light of the diversity of image contents potentially indicating meaningful pathological lesions? This needs to be better explained.
  • The text of the paper does not justify why so many convolution layers are needed to train the network. The proposed method appears costly in terms of computational resources, and an explanation of why is needed here. A computationally more parsimonious detection and classification network with no more than one hidden layer would probably suffice to deal with the relevant input dimensions. In this regard, Figures 1, 3, and 4 do not help to give a clear idea of the input dimensionality we are dealing with here. Please provide a clear figure.
  • The test dataset here was arbitrarily chosen from a pre-existing contest dataset. The extent to which the images from this dataset reflect the whole range of clinical possibilities, from the mildest to the most severe symptoms reflected by the image contents, is not stated.
  • The conclusion that the model can be deemed superior to existing automated models in terms of classification performance on the chosen pre-existing dataset needs to be discussed further. How would the model performance compare, for example, with the classification accuracy of a skilled team of human experts using exactly the same dataset? If this is not known, then this needs to be stated explicitly and discussed.
  • The “conclusions” section of the paper merely re-iterates what the proposed method does, but it fails to deliver any cogent message regarding its particular potential and/or limitations. Please revise accordingly.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

The authors have revised the paper well while addressing all my suggestions and concerns accordingly.

The quality of the paper has improved.

I have only a few minor comments:

  • The authors should carefully check the text and remove all occurrences of "Error! Refer-ence source not found." (multiple cases).
  • Algorithm 1: I cannot find the derivation of the output (yi).
  • Figure 4: explain all abbreviations used in the caption of the figure.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

The authors have revised their paper minimally, and some of the clarifications in the new manuscript text (highlighted in yellow) have helped improve the paper. However, other important limitations of this kind of AI, while clearly pointed out in their response letter to my previous comments, are not clearly stated in the revised manuscript text. In particular, I refer to the following important sections of the authors' reply letter:

"The results provided by artificial intelligence methods are not explainable. As a result, any black box diagnostics systems are not accepted by a professional ophthalmologist in the real world, regardless of their fine results. Therefore, the current research in the medical field through deep learning and other methods is more about assisting doctors to make diagnosis. In view of the diversity of lesions, it is difficult for the neural network to correspond to a specific lesion or specific lesion type. More neural network models extract features from the global image and conduct training classification according to the features."

and a few lines later, in their reply:

"....high-level convolution to train fully connected layers to classify diseased images. The neural network models have different receptive fields in different training stages. The low-level network feature map has high resolution and small local receptive field, which may note the small features of the lesion image. The high-level network feature map has a low resolution but a larger receptive field, and has a more in-depth representation of the semantic information of the lesion image. In fact, due to the particularity of the retina lesion image, the distribution of the lesion location is different."

These limitations are real, the authors know it, and they need to be spelled out in the manuscript text as clearly as in their response letter. The introduction and the conclusions need to provide the same clear message as the response letter in this respect.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
