Article
Peer-Review Record

Deep Learning in Left and Right Footprint Image Detection Based on Plantar Pressure

Appl. Sci. 2022, 12(17), 8885; https://doi.org/10.3390/app12178885
by Peter Ardhianto 1, Ben-Yi Liau 2, Yih-Kuen Jan 3,4,5, Jen-Yung Tsai 6, Fityanul Akhyar 7, Chih-Yang Lin 8, Raden Bagus Reinaldy Subiakto 9 and Chi-Wen Lung 3,10,*
Reviewer 3: Anonymous
Submission received: 28 July 2022 / Revised: 22 August 2022 / Accepted: 1 September 2022 / Published: 5 September 2022

Round 1

Reviewer 1 Report

MINORS


In ABSTRACT:
"matric" -> "metric"


In Figure 1:
Explain better or move the acronyms to the main text.
"Note: DenseNet, Dense Convolutional Network; ResNet50, Residual Neural Network."

MAJOR
- Improve quality of fig 4 and fig 7

- Results in section 3.1 are interesting but do not give much information, as they were obtained on the training set (maybe there is overfitting?). Section 3.2 should be improved and provide results like those in 3.1.

Author Response

Response to Reviewer 1

Thank you very much for your thoughtful review of our manuscript. We appreciate the time the reviewer has dedicated to providing valuable suggestions and insightful comments on this paper. We have been able to incorporate changes to reflect most of the suggestions the reviewer provided. We have highlighted the changes within the manuscript.

 

## MINORS

1. In ABSTRACT: "matric" -> "metric"

 

>> Response:

We thank the reviewer for the helpful comments. We have changed “matric” to “metric” in the abstract section.

 

**page 1, line 39.

“YOLOv4 reached over 99.00% in various metric performances.”

 

 

2. Explain better

In Figure 1: Explain better or move the acronyms to the main text.

"Note: DenseNet, Dense Convolutional Network; ResNet50, Residual Neural Network."

 

>> Response:

Thank you very much for the reviewer's suggestion. We have added the acronym to the main text in the method section.

 

**page 4, line 157-159.

Figure 1. Illustration of object detection Network Architecture; (A) Residual Neural Network 50 (ResNet50) network architecture, (B) Dense Convolutional Network (DenseNet) network architecture.

 

 

 

## MAJOR

1. Improve quality of fig 4 and fig 7

 

>> Response:

Thank you for the reviewer's comments. We have improved the quality of Figure 4 (page 6) and Figure 7 (page 9).

 

2. Overfitting

Results in section 3.1 are interesting but do not give much information, as they were obtained on the training set (maybe there is overfitting?). Section 3.2 should be improved and provide results like those in 3.1.

 

>> Response:

Thank you for your feedback on overfitting. We believe it is vital to discuss overfitting findings. Therefore, we have updated the information regarding overfitting and renamed "Training Results" to "Experimental Results", since the results in section 3.1 are training and validation results. In addition, in section 3.2 we tested the network performance using only a few images; therefore, we have changed the subtitle of section 3.2 from "Prediction Results" to "Testing Samples" to make our study clearer.

 

**Page 6, line 228-231

3.1. Experimental results

In this study, we randomly split 974 plantar pressure images into 70% for the training set and 30% for the validation set. According to Table 1 and Figures 5 and 6, our proposed method can detect the foot profiles and classify them with over 60% accuracy without overfitting.
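The 70%/30% random split described above can be illustrated with a minimal sketch. The function name `split_indices`, the fixed seed, and the rounding of the training count are assumptions made for illustration; the manuscript does not state how the split was implemented.

```python
import random


def split_indices(n_images, train_fraction=0.7, seed=0):
    """Shuffle image indices and split them into training and validation lists.

    Hypothetical helper: the seed and the rounding rule are assumptions,
    not details reported by the authors.
    """
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_train = round(n_images * train_fraction)
    return indices[:n_train], indices[n_train:]


# 974 images, as reported in the manuscript excerpt
train_idx, val_idx = split_indices(974)
print(len(train_idx), len(val_idx))  # 682 292
```

With this rounding rule, 974 images give 682 training and 292 validation images; a different rounding choice would shift one image between the two sets.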

 

**Page 8, line 258-260

3.2. Testing samples

We tested the images from the validation set as a prediction sample to evaluate the performance of the YOLOv4 network on several images.

Author Response File: Author Response.docx

Reviewer 2 Report

Based on models of deep learning, particularly the one named YOLO, the authors want to predict the footprints from patients with cerebral palsy. The work is of importance because of the need to produce gait helpers for patients with cerebral palsy.

However, I cannot find much originality; it seems to me that the authors applied a technique to learn and predict some two-dimensional objects coming from particular data.

On the other hand, the paper is well presented, it is well organized, and can be easily followed. The results show that the technique worked properly.

Author Response

Response to Reviewer 2

Thank you very much for your thoughtful review of our manuscript. We are grateful to the reviewer for the insightful comments on our paper. We have highlighted the manuscript changes, especially detailing our originality in this study.

 

1. Originality

Based on models of deep learning, particularly the one named YOLO, the authors want to predict the footprints from patients with cerebral palsy. The work is of importance because of the need to produce gait helpers for patients with cerebral palsy.

However, I cannot find much originality; it seems to me that the authors applied a technique to learn and predict some two-dimensional objects coming from particular data.

On the other hand, the paper is well presented, it is well organized, and can be easily followed. The results show that the technique worked properly.

 

>> Response:

Thank you for raising the question of originality. We agree with this suggestion and have highlighted two points (discovering new information and providing a new solution to problems) to strengthen our originality in the introduction and conclusion sections.

 

**page 3, line 113-117

“This study is the primary investigation to discover new information on deep learning performance defect footprint in CP for left and right detection. In addition, this study may have implications for providing new solutions for accessing the efficacy of ankle foot orthosis in people with spastic CP.”

 

**page 12, line 388-391

“Therefore, the auto-detection of the left and right foot may have implications in discovering new information of footprint images under the defection feature in people with CP and provide new solutions through beneficial information to manage the treatment strategy of the orthosis in people with spastic CP.”

Author Response File: Author Response.docx

Reviewer 3 Report

1. Line 124. ResNet50 and DenseNet are not ‘object detection’ models; they are DNN backbones that extract good representations. Localisation and classification are implemented by different branches that share the same representations. It is not clear whether ResNet50 and DenseNet are used here only as object detection backbones or for the binary classification.

 

2. Why were object detection models chosen? Have you considered using a neural network for classification only, instead of localization and classification, for your data? Table 2 only compares classification-based ANN methods with the proposed object-detection-based methods; however, the data modalities are very different. It would be more convincing to run classification using ResNet50 or something similar on the pressure data and use it as a baseline to compare with the proposed method. The current experimental results do not convincingly justify using the OD approach.

 

3. Line 342 says there is a 14-17% performance increase using the proposed method. I do not see where this is coming from. In fact, the ANN method seems to work very well too.

 

4. Are the collected data and annotations publicly available? It would be important to state this in the manuscript.

 

5. It is not clear how the models in the proposed method are pretrained. If they are pretrained on an image dataset, which is highly likely, what about the performance when training from scratch with random initialization? This would help the community understand the problem.

 

6. The authors focused only on supervised learning for the proposed method. However, given the small number of training samples, it is worth adding some discussion about the potential of using meta-learning or few-shot learning in this scenario.

 

7. The authors state that the difference in performance between the left and right foot is due to left-right leg dominance; why not include this as an annotation that the experts need to decide? This problem could then be studied more thoroughly, i.e., you could train two models, one for left dominant and one for right dominant. Comparing these conditions could give us a clearer picture of current challenges.

Author Response

Response to Reviewer 3

Thank you very much for your thoughtful review of our manuscript. We are grateful to the reviewer for their insightful comments on our paper. We have highlighted the manuscript changes, especially detailing our originality in this study.

 

1. object detection backbones or for the binary classification

Line 124. ResNet50 and DenseNet are not ‘object detection’ models; they are DNN backbones that extract good representations. Localisation and classification are implemented by different branches that share the same representations. It is not clear whether ResNet50 and DenseNet are used here only as object detection backbones or for the binary classification.

 

>> Response:

Thank you for mentioning the object detection backbones of DenseNet and ResNet50. We added more explanation of using Darknet-19 as the backbone, DenseNet & Resnet50 as the classifier, and YOLOv2 detector. We revised Figure 1A and Figure 1B.

 

**Page 3, line 135-139

“To classify images, Dewi et al. [26] introduced that multiscale feature maps can be effectively combined for YOLOv2 detection and classification of DenseNet and ResNet-50 to prevent performance loss. In this study, we used Darknet-19 as the backbone to support DenseNet and ResNet-50 as a classifier with a YOLOv2 detector.”

Figure 1. Illustration of object detection Network Architecture. (A) Residual Neural Network 50 (ResNet-50) network architecture; (B) Dense Convolutional Network (DenseNet) network architecture.

 

2. object detection models chosen

Why were object detection models chosen? Have you considered using a neural network for classification only, instead of localization and classification, for your data? Table 2 only compares classification-based ANN methods with the proposed object-detection-based methods; however, the data modalities are very different. It would be more convincing to run classification using ResNet50 or something similar on the pressure data and use it as a baseline to compare with the proposed method. The current experimental results do not convincingly justify using the OD approach.

 

>> Response:

Thank you for your feedback on the choice of object detection models. Our main reason for using object detection was that our dataset contains defect plantar pressure images and abnormal foot progression angles, which make it difficult to differentiate the left and right foot. Therefore, we have added more elaboration on our choice of localization and classification for our data.

 

**Page 10, line 311-314

The abnormal foot progression angle and complex footprint images challenged identifying the left and right foot via plantar pressure images [18]. Furthermore, plantar pressure images are needed for bounding box annotation to specify the location of the object that can help recognize the foot features [40].

3. ANN method seems to work

Line 342 says there is a 14-17% performance increase using the proposed method. I do not see where this is coming from. In fact, the ANN method seems to work very well too.

>> Response:

Thank you for your feedback on the network performance comparison. We have revised the range of differentiation between the proposed method and the current study. We also added the sentence that the proposed method can achieve good performance with the small dataset to strengthen our statement.

 

**Page 11, line 352-358

“However, the above three studies did not use the object detection method and showed around a 0%-16% range from object detection. Therefore, the main reason for our research is to select the object detection method that can localize and classify the image features of plantar pressure images. As a result, it can be achieved with higher accuracy (over 93%) with small dataset. Furthermore, object detection may be suitable for determining where objects are located in a given image and which category each object belongs to [47].”
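As an illustration of how the localization part of object detection is commonly scored, the sketch below computes the intersection over union (IoU) between a predicted box and a ground-truth box. The function name `iou` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions; the excerpt does not state which overlap measure or threshold the authors used.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes.

    Illustrative helper only; the box layout is an assumption.
    """
    # Corners of the overlapping rectangle (if any)
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1)
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

A detection is typically counted as correct only when its class matches and its IoU with the ground-truth box exceeds a chosen threshold, which is how a detector is assessed on both location and category.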

 

4. available for public

Are the collected data and annotations publicly available? It would be important to state this in the manuscript.

 

>> Response:

Thank you for pointing out the data availability. The data are publicly available, subject to membership registration and email validation, at: https://aidea-web.tw/topic/e3ab9046-2d56-48d0-b339-c80d9ab0001d?focus=intro.

 

5. Proposed Method Pretrained

It is not clear how the models in the proposed method are pretrained. If they are pretrained on an image dataset, which is highly likely, what about the performance when training from scratch with random initialization? This would help the community understand the problem.

 

>> Response:

Thank you for the comment about pretrained models and training from scratch. We did not train a model from scratch to predict the left and right plantar pressure images because this study was intended as initial research using object detection.

 

**Page 3, line 111-117

“Thus, initial research is essential to analyze healthy people's foot profiles in left and right detection, providing a basis for understanding acquiring incomplete foot profiles in a specific case in CP. This study is the primary investigation to discover new information on deep learning performance defect footprint in CP for left and right detection. In addition, this study may have implications for providing new solutions for accessing the efficacy of ankle foot orthosis in people with spastic CP.”

 

6. Meta-learning or Few-shot learning

The authors focused only on supervised learning for the proposed method. However, given the small number of training samples, it is worth adding some discussion about the potential of using meta-learning or few-shot learning in this scenario.

 

>> Response:

Thank you for mentioning meta-learning or few-shot learning. Since each patient is different when using plantar pressure to present the footprint, there is potential to use meta-learning or few-shot learning, but it now needs further research. As a result, it requires a lot of data to produce higher accuracy. We have added a discussion about meta-learning or few-shot learning potential on Page 12, line 370-376.

 

**Page 11, line 373-379

“However, there are several limitations to the current work, which could be used to develop future improvement directions. The first limitation is the different sizes of footprint patterns in plantar pressure images [48]. Second, the diversity of data acquisition conditions, especially in complex target age testing in spastic CP, still needs to be studied [49]. Finally, using meta-learning or few-shot classification [50] and applying fusion multi-modal physiological data in prediction may solve the different footprint sizes and complex target age limitations in future works [51].”

 

7. left dominant and right dominant

The authors state that the difference in performance between the left and right foot is due to left-right leg dominance; why not include this as an annotation that the experts need to decide? This problem could then be studied more thoroughly, i.e., you could train two models, one for left dominant and one for right dominant. Comparing these conditions could give us a clearer picture of current challenges.

 

>> Response:

Thank you for your feedback on left-right leg dominance. The experts only assisted in determining the ground truth of the foot plantar pressure images, that is, whether each image shows the left or the right foot. We have added the difficulties in identifying the dominant foot on Page 2, lines 59–84, and the description of the expert assistance on Page 5, line 178-186.

 

**Page 2, line 60-85

“Identifying the dominant leg from the footprint can help evaluate orthosis treatment in improved postural control in spastic CP patients [8]. Foot supination occurs in the dynamic evaluation when the dominant leg transitions from heel contact to middle stance throughout the gait cycle, which is related to the foot balance index in orthosis treatment [9]. In people with spastic CP, the dominant leg was assumed to be ipsilateral [10]. Hence, proprioception error was related to the non-dominant leg, which may induce the complex footprint [11]. Therefore, an assessed CP complex footprint would be beneficial to managing the orthosis treatment to provide the essential information in the dominant or non-dominant leg [12].

 

Precision detection of the left and right foot from complex footprints can positively decrease energy expenditure in the dominant or non-dominant leg and monitor the association between the maximum step length test and the walking efficiency in CP patients [13]. However, CP's footprint has limitations from complex footprint features, which would be incomprehensible to determining the left and right foot in clinics [14]. For example, scissor gait and toe walking are prominent among CP [15]. The scissor gait, usually walking with crossing legs due to spastic paraplegia or excessive contraction of hip adductors muscle [16], may lead to the abnormal foot progression angle in footprint images [17]. The abnormal foot progression angle would affect the footprint's recognition of left and right [18]. The other example is toe walking, a bilateral gait abnormality in which a normal heel strike is absent, and weight-bearing occurs through the forefoot caused by limb length discrepancy, spastic equinus, and Achilles tendon contracture [19]. Determining the left and right based on footprint images in toe walking conditions may have difficulties without the full foot pressure recorded, especially in the absence of the heel region on the footprint distribution. According to scissor gait and toe walking foot problems, it would be incomprehensible to determine the left and right foot under abnormal foot progression angle and complex footprint in people with spastic CP.”

 

**Page 5, line 180-189

“For determining the left and right foot, a senior expert in plantar pressure images with experience of over 15 years in footprint imaging was supervised (the left foot 487 images and right foot 487 images). The abnormal foot progression angle in the plantar pressure image may challenge the recognition of the left and right foot. Furthermore, the defect footprint feature may have limitations due to an unfull footprint pattern, particularly in the plantar region, that challenges the prediction of the left and right foot [30]. Nevertheless, object detection achieved good results in medical images [22]. For the plantar pressure, this study used bounding box annotation to determine the left and right foot through the different object detection models to achieve better accuracy [18,24]. The bounding box is used to localize and classify the left and right foot based on manual labeling to get a prediction.”
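To make the bounding box annotation concrete, the sketch below converts a pixel-coordinate box into a Darknet-style YOLO label line (`class x_center y_center width height`, all normalized by the image size). The function name `to_yolo_label`, the class mapping (0 = left foot, 1 = right foot), and the example coordinates are hypothetical; the excerpt does not state which annotation tool or label format the authors used.

```python
def to_yolo_label(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-coordinate box to a Darknet-style YOLO label line.

    Illustrative helper: the label format is assumed, not confirmed
    by the manuscript excerpt.
    """
    x_center = (x_min + x_max) / 2 / img_w  # box centre, normalized to [0, 1]
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w         # box size, normalized to [0, 1]
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: class 0 = left foot, in a 400 x 500 pixel image
label = to_yolo_label(0, 100, 50, 300, 450, 400, 500)
print(label)  # 0 0.500000 0.500000 0.500000 0.800000
```

Each manually labeled image would then carry one such line per foot, which gives the detector both the location (the box) and the category (left or right) to learn from.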
