Article
Peer-Review Record

Hybrid Approach for Facial Expression Recognition Using Convolutional Neural Networks and SVM

Appl. Sci. 2022, 12(11), 5493; https://doi.org/10.3390/app12115493
by Jin-Chul Kim 1, Min-Hyun Kim 1, Han-Enul Suh 1, Muhammad Tahir Naseem 2 and Chan-Su Lee 1,3,*
Submission received: 15 April 2022 / Revised: 21 May 2022 / Accepted: 22 May 2022 / Published: 28 May 2022
(This article belongs to the Special Issue Research on Facial Expression Recognition)

Round 1

Reviewer 1 Report

Decision: Major Revision

  1. As you mentioned you have used a hybrid approach but I didn’t find any novelty in the corresponding section so it's an existing technique and many researchers used a hybrid approach. What makes the proposed method New and suitable for this unique task? What new development to the proposed method have the authors added (compared to the existing approaches)? These points should be clarified.
  2. The complexity of the proposed model and the model parameter uncertainty is not mentioned in the methodology section. It should be mentioned and experimentally proven to easily show the model's significance.
  3. The current challenges are not crystal clearly mentioned in the introduction section of this paper. I suggest adding a dedicated paragraph about the current challenges in this area followed by the authors’ contribution to overcoming those challenges.
  4. The contributions in the current version are not crystal clear. I strongly recommend adding bullet-wise contributions in the revised version. The authors can follow the paper “Att-Net: Enhanced emotion recognition system using lightweight self-attention module”.
  5. In addition, the authors must provide a sufficient critical review of the literature to indicate the drawbacks of existing approaches and then define the main focus of the research direction. How did those previous studies perform? Readers need more positive reviews of the literature to indicate the state-of-the-art development.
  6. Utilized technologies and methods between different systems - details can be provided, and how fair they are working should be explained.
  7. Text inside tables is not consistent it should be consistent with the manuscript text and check the overall manuscript to ensure consistency.
  8. Some mathematical notations and presentations are not rigorous/clear enough to correctly understand the contents of the paper. It is suggested to check all the definitions of variables and redefine the missing information when preparing the re-submission of the paper.
  9. Section Conclusion - Authors are suggested to include in the conclusion section the real actual results for the best performance of their proposed methods in comparison towards other methods to highlight and justify the advantages of their proposed methods with possible future direction.
  10. More recently published papers in the field of deep learning should be discussed in the Introduction/literature. The authors may benefit from reviewing more papers, such as DOI: 10.1109/ACCESS.2021.3093053 and 10.3390/math8122133.

Author Response

Response to Reviewer 1 Comments

We thank the reviewer for considering our article and for the helpful feedback on its writing, diagrams, references, and methodology.

We have revised the article according to the reviewer’s comments and have tried our best to meet all of the requirements. Our detailed responses to the reviewer’s comments are as follows:

 

Point 1: As you mentioned you have used a hybrid approach but I didn’t find any novelty in the corresponding section so it's an existing technique and many researchers used a hybrid approach. What makes the proposed method New and suitable for this unique task? What new development to the proposed method have the authors added (compared to the existing approaches)? These points should be clarified.

 

Response 1: Following your valuable suggestion, we have added Subsection 1.1, “Limitations of related work and our contributions”, to the Introduction section of the manuscript; it addresses these points for the proposed model.

 

Point 2: The complexity of the proposed model and the model parameter uncertainty is not mentioned in the methodology section. It should be mentioned and experimentally proven to easily show the model's significance.

 

Response 2: The prediction complexity of the kernel SVM is O(kd), where k is the number of support vectors and d is the number of input dimensions. If we take the complexity of matrix computation to be O(d^3) and that of vector multiplication to be O(d^2), then in the CNN model the forward convolution layers are O(d^4) and the fully connected layers are O(d^2). The computational complexity of the CNN model is therefore much higher than that of the SVM model, and the computational complexity of the proposed model is dominated by the CNN model.
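
The O(kd) prediction cost quoted in Response 2 can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a binary classifier with an RBF kernel, and the function names and the gamma parameter are hypothetical.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel between two d-dimensional vectors: O(d) work."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)


def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """Decision value of a binary kernel SVM for one sample.

    One kernel evaluation (O(d)) is made per support vector,
    so a single prediction costs O(k * d) for k support vectors.
    """
    total = bias
    for sv, coef in zip(support_vectors, dual_coefs):
        total += coef * rbf_kernel(x, sv, gamma)
    return total
```

The loop runs once per support vector and each kernel call touches every input dimension once, which is where the k and d factors come from.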

 

Point 3: The current challenges are not crystal clearly mentioned in the introduction section of this paper. I suggest adding a dedicated paragraph about the current challenges in this area followed by the authors’ contribution to overcoming those challenges.

 

Response 3: According to your valuable suggestion, we have added a paragraph in the Introduction section of the manuscript which clearly highlights the limitations of previous work and our contributions.

 

Point 4: The contributions in the current version are not crystal clear. I strongly recommend adding bullet-wise contributions in the revised version. The authors can follow the paper “Att-Net: Enhanced emotion recognition system using lightweight self-attention module”.

 

Response 4: Following your valuable suggestion, we have added Subsection 1.1, “Limitations of related work and our contributions”, to the Introduction section of the manuscript and have clearly stated the contributions of the proposed work there.

 

Point 5: In addition, the authors must provide a sufficient critical review of the literature to indicate the drawbacks of existing approaches and then define the main focus of the research direction. How did those previous studies perform? Readers need more positive reviews of the literature to indicate the state-of-the-art development.

 

Response 5: Following your valuable suggestion, we have added Table 1 to the manuscript, immediately before Subsection 1.1, “Limitations of related work and our contributions”. The table provides a critical review of the literature.

 

Point 6: Utilized technologies and methods between different systems - details can be provided, and how fair they are working should be explained.

 

Response 6: Following your valuable suggestion, we have added Table 1, immediately before Subsection 1.1, “Limitations of related work and our contributions”. The table describes the method and database used by each system.

 

Point 7: Text inside tables is not consistent it should be consistent with the manuscript text and check the overall manuscript to ensure consistency.

 

Response 7: According to your valuable suggestion, text inside tables is made consistent in the manuscript. 

 

Point 8: Some mathematical notations and presentations are not rigorous/clear enough to correctly understand the contents of the paper. It is suggested to check all the definitions of variables and redefine the missing information when preparing the re-submission of the paper.

 

Response 8: According to your valuable suggestion, all the notations and presentations are rectified in the manuscript.

 

Point 9: Section Conclusion - Authors are suggested to include in the conclusion section the real actual results for the best performance of their proposed methods in comparison towards other methods to highlight and justify the advantages of their proposed methods with possible future direction.

 

Response 9: Following your valuable suggestion, the Conclusions and Future Works section of the manuscript has been modified.

 

Point 10: More recently published papers in the field of deep learning should be discussed in the Introduction/literature. The authors may benefit from reviewing more papers, such as DOI: 10.1109/ACCESS.2021.3093053 and 10.3390/math8122133.

 

Response 10: Following your valuable suggestion, the suggested articles are now cited at the end of the Introduction section of the manuscript.

 

 

Author Response File: Author Response.docx

Reviewer 2 Report

The presented topic is interesting and the manuscript is well organized; however, I have some minor concerns:

  1. Quality of fig 1 must be improved.
  2. List your contributions in introduction section.
  3. Explain why your method has low results as compared to [61] method. 

 

Author Response

We thank the reviewer for considering our article and for the helpful feedback on its writing, diagrams, references, and methodology.

We have revised the article according to the suggestions and have tried our best to meet all of the requirements. Our detailed responses to the reviewer’s comments are as follows:

 

Point 1: Quality of fig 1 must be improved.

 

Response 1: Following your valuable suggestion, the quality of Figure 1 has been improved in the manuscript.

 

Point 2: List your contributions in introduction section.

 

Response 2: According to your valuable suggestion, contributions are listed in subsection 1.1 Limitations of related work and our contributions of the manuscript.

 

Point 3: Explain why your method has low results as compared to [61] method.

 

Response 3: In this article, we propose a fused ML approach, also known as decision-level fusion, that combines local features and global features. In [61], real 4D features are extracted using HOG3D features on the local depth patch sequence taken from the depth sequence, which may represent characteristics of the depth sequence around the onset frame. For facial expression recognition, they employed hierarchical classification to separate easily confused expressions, with additional feature selection. Their method was optimized for a specific dataset such as BU4D and further optimized by repeated feature selection in each hierarchical layer of the classifier. It therefore performs better than the proposed approach at the onset frames with sliding windows. However, their approach shows much worse recognition performance when all frames are used, as in our experiments. Their approach may therefore be of limited use for general facial expression recognition on a different database. We have added the all-frames performance to the paper, together with summarized comments in the manuscript.

Author Response File: Author Response.docx

Reviewer 3 Report

This manuscript proposes an interesting approach to the problem of facial expression recognition. Some minor changes should be made by the authors:

  1. Ln. 38 - AdaBoot should state AdaBoost?
  2. Fig. 1 - is a bit blurry, try to recreate the image and save it in higher resolution
  3. Fig. 4 - the labels of the subimages should be under the images
  4. Eqs. 5 and 6 - are de facto the same equation and therefore should not be split up
  5. Figs. 5 and 6 - the subimages should be aligned with each other
  6. Ln. 271 - a reference to the figure is missing
  7. Tabs 4 and 6 - have the same labelling; the authors should change the labelling to better show the data in the tables
  8. Tab. 10 - authors should bold the best results as they did in Tab. 9

There are some important points regarding the research:

  1. What is the split of train/validation/test data for each of the data sets? This needs to be stated in the paper
  2. Why do the authors only use the late fusion? Have they tried early fusion (i.e., adding the Lint. and Lang. values to the fully connected layer)?
  3. Fusion of information with the sum rule often results in a suboptimal solution. Authors should use more competitive fusion methods, e.g., alpha integration (Safont G., Salazar A., Vergara L. Multiclass Alpha Integration of Scores from Multiple Classifiers)

Author Response

We thank the reviewer for considering our article and for the helpful feedback on its writing, diagrams, references, and methodology.

Point 1: Ln. 38 - AdaBoot should state AdaBoost?

Response 1: Following your valuable suggestion, “AdaBoot” has been corrected to “AdaBoost” at Ln. 38 in the manuscript.

 

Point 2: Fig. 1 - is a bit blurry, try to recreate the image and save it in higher resolution

Response 2: Following your valuable suggestion, Figure 1 has been saved in higher resolution in the manuscript.

 

Point 3: Fig. 4 - the labels of the subimages should be under the images

Response 3: Following your valuable suggestion, the labels in Figure 4 are now placed under the images in the manuscript.

 

Point 4: Eqs. 5 and 6 - are de facto the same equation and therefore should not be split up

 

Response 4: Following your valuable suggestion, because Eqs. 5 and 6 are the same, they have been given the same number in the manuscript.

 

Point 5: Figs. 5 and 6 - the subimages should be aligned with each other

 

Response 5: Following your valuable suggestion, the subimages in Figs. 5 and 6 have been aligned with each other in the manuscript.

 

Point 6: Ln. 271 - a reference to the figure is missing

 

Response 6: According to your valuable suggestion, a reference to the figure is added to the manuscript.

 

Point 7: Tabs 5 and 7 - have the same labelling; the authors should change the labelling to better show the data in the tables

 

Response 7: Following your valuable suggestion, the labeling of Tabs. 5 and 7 has been rectified in the manuscript.

 

Point 8: Tab. 12 - authors should bold the best results as they did in Tab. 9

 

Response 8: According to your valuable suggestion, results are made bold in Tab. 12 in the manuscript.

 

There are some important points regarding the research:

 

Point 9: What is the split of train/validation/test data for each of the data sets? This needs to be stated in the paper

 

Response 9: We used 80% of the data for training and 20% for validation tests. Following your valuable suggestion, this has been added to the Experiment section of the manuscript.
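
An 80/20 split of this kind can be sketched as follows. The response does not say whether the split was random, stratified by expression, or subject-independent, so the random sample-level shuffle, the fixed seed, and the function name here are all assumptions for illustration.

```python
import random


def split_80_20(samples, seed=0):
    """Shuffle sample indices and split them 80% / 20%.

    Assumption: a plain random split at the sample level; the
    record does not describe how the authors partitioned the data.
    """
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    cut = int(0.8 * len(indices))
    train = [samples[i] for i in indices[:cut]]
    held_out = [samples[i] for i in indices[cut:]]
    return train, held_out
```

For video-based expression data, a split by subject rather than by frame is a common alternative, because frames from one subject in both partitions can inflate the measured accuracy.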

 

Point 10: Why do the authors only use the late fusion? Have they tried early fusion (i.e., adding the Lint. and Lang. values to the fully connected layer)?

 

Response 10: Early fusion uses a single model to make predictions, while late fusion allows different models to be used on different modalities. Because the proposed model uses different modalities through a hybrid approach, we used late fusion. Following your valuable suggestion, we will investigate early fusion in our future work.

 

Point 11: Fusion of information with the sum rule often results in a suboptimal solution. Authors should use more competitive fusion methods, e.g., alpha integration (Safont G., Salazar A., Vergara L. Multiclass Alpha Integration of Scores from Multiple Classifiers).

 

Response 11: As you mentioned, there are various methods for late fusion, e.g., the sum rule, alpha integration, maximum likelihood, and fuzzy-based fusion. We have used one of them, the sum rule. The method is simple; however, it works well in our model. We will apply other fusion methods, such as alpha integration, maximum likelihood, and fuzzy-based fusion, in future work.
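
The sum rule discussed in Points 10 and 11 can be sketched in a few lines. This is a hypothetical illustration rather than the authors' code: it assumes that the CNN and the SVM each output one score per expression class on a comparable scale (for example, probabilities), and the function and argument names are invented for the example.

```python
def sum_rule_fusion(cnn_scores, svm_scores):
    """Late (decision-level) fusion by the sum rule.

    Adds the per-class scores of the two classifiers and returns
    the index of the class with the largest fused score, together
    with the fused scores themselves.
    """
    if len(cnn_scores) != len(svm_scores):
        raise ValueError("both classifiers must score the same classes")
    fused = [c + s for c, s in zip(cnn_scores, svm_scores)]
    best_class = max(range(len(fused)), key=fused.__getitem__)
    return best_class, fused
```

Each classifier is trained and run separately and only the final scores are combined, which is what distinguishes this from early fusion, where the features would be merged before a single classifier.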

 

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

The authors successfully addressed my comments and suggestions. Good Luck!

 

 

Author Response

  
