Article
Peer-Review Record

Emotion Recognition Using a Siamese Model and a Late Fusion-Based Multimodal Method in the WESAD Dataset with Hardware Accelerators

Electronics 2025, 14(4), 723; https://doi.org/10.3390/electronics14040723
by Hyun-Sik Choi
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 4: Anonymous
Submission received: 13 January 2025 / Revised: 6 February 2025 / Accepted: 11 February 2025 / Published: 13 February 2025
(This article belongs to the Section Artificial Intelligence)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This manuscript presents a novel approach to emotion recognition using the WESAD dataset by integrating a Siamese network and a late fusion-based multimodal method optimized for hardware accelerators. The study achieves impressive performance, with a classification accuracy of 99.8%, and demonstrates real-time emotion recognition capabilities suitable for wearable devices. Integrating physiological signals (BVP and EDA) through advanced preprocessing methods and the efficient use of hardware accelerators represent notable innovations. However, while the technical aspects are robust, the manuscript would benefit from improved clarity and a more critical discussion of its broader applicability and limitations. Here are some suggestions for the authors’ information:
1. The introduction highlights the importance of emotion recognition and wearable technologies but should be more concise, focusing on the research gap and emphasizing the study's novelty, such as integrating the Siamese network with late fusion and optimizing it for hardware accelerators.
2. Justify the exclusive use of the WESAD dataset by addressing its limitations, such as small sample size and single-day data collection, and suggest validating the methodology on additional datasets or real-world scenarios to enhance its robustness and relevance.
3. It is recommended to explain why specific preprocessing techniques, such as EMD over DWT, were chosen and how they impacted model performance. Supporting these choices with comparative experiments or references would strengthen the methodology. Additionally, discuss potential biases introduced by the preprocessing steps or data selection and outline strategies to mitigate them. Lastly, include more details about hyperparameter tuning, specifying the parameters optimized and explaining their role in improving the model’s performance.
4. This study explores emotion recognition in PD patients, focusing on physiological signals such as blood volume pulse and electrodermal activity to overcome the limitations of outward emotional expressions, such as facial cues, often impaired in PD patients. However, the motor and non-motor symptoms of PD may influence emotion recognition in these individuals. It is recommended that the authors include relevant discussions and explanations in the Discussion section. Such additions would not only strengthen the discussion but also enhance the overall contribution of the manuscript, positioning it as a more impactful and broadly applicable work for both research and practical applications across diverse populations. References for the authors’ information: Argaud, Soizic, et al. "Facial emotion recognition in Parkinson's disease: a review and new hypotheses." Movement Disorders 33.4 (2018): 554-567; Chiang et al. "Disgust-specific impairment of facial emotion recognition in Parkinson’s disease patients with mild cognitive impairment." Social Cognitive and Affective Neuroscience 19.1 (2024): nsae073.
5. The manuscript is well-written overall, but reducing verbosity, eliminating redundancy, and ensuring consistent use of technical terms would improve clarity and reader engagement.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

1. Please add relevant ablation experiments to illustrate the role of Siamese models and multi-task learning.

2. The comparison in Table 1 lacks fairness and can easily mislead readers. It is recommended to delete the comparison in Table 1.

3. Please add information about the software and hardware platform used for deep learning experiments.

4. Please provide a more detailed description of how the loss function is handled when training the Siamese model and multi-task learning simultaneously.

Comments on the Quality of English Language

The English could be improved to more clearly express the research.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The paper addresses an important and timely topic in the field of emotion recognition using wearable devices and presents a novel combination of a Siamese network with a late fusion-based multimodal approach. Below are my comments and suggestions to help improve the manuscript.

Strengths of the Manuscript:

Relevance and Impact:

   - The study tackles an essential issue in emotion recognition and wearable technology, with significant practical applications in health monitoring and stress management. 

   - The proposed approach demonstrates a commendable effort to optimize resource efficiency for real-time applications using hardware accelerators.

 Methodological Rigor:

   - The authors provide a detailed explanation of the preprocessing steps, including the application of Empirical Mode Decomposition (EMD) for BVP data. 

   - The Siamese network and the late fusion method are well-integrated and justified based on the characteristics of the WESAD dataset.

Performance Evaluation:

   - The reported accuracy of 99.8% suggests a highly effective model. 

   - The inclusion of confusion matrices and metric evaluations (precision, recall, F1-score) supports the robustness of the findings.

Major Points for Improvement:

Clarification of Novelty:

   - While the combination of Siamese networks and late fusion is interesting, the manuscript should more clearly highlight how this approach differs from existing methods and why it presents an advancement over traditional multimodal techniques. 

   - A comparative discussion with state-of-the-art approaches should be expanded in the introduction or discussion section.

Details on Hardware Implementation:

   - The manuscript should provide more details on the resource constraints encountered during the hardware implementation, such as power consumption, memory usage, and scalability to other platforms. 

   - A discussion of the trade-offs between accuracy and hardware complexity would enhance the practical applicability of the study.

Generalization and Limitations:

   - Given the small number of participants in the WESAD dataset, generalization to larger and more diverse populations remains a concern. 

   - The authors should discuss potential limitations of the proposed method, such as its ability to handle noisy data in real-world conditions.

Minor Points for Improvement:

Introduction Section:

   - The introduction provides a good overview, but it could benefit from additional references to recent studies in deep learning for emotion recognition, particularly those utilizing wearable devices.

Conclusion:

Overall, the manuscript presents a valuable contribution to the field of wearable emotion recognition. With the suggested revisions, the paper can achieve a higher impact and readability. I recommend acceptance with minor revisions to address the points mentioned above.


Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

In the article, the author proposes an efficient neural network-based classification model for stress analysis and emotion recognition in wearable environments using the WESAD dataset. Two measured signals (BVP and EDA) were combined using a multimodal late fusion method. Moreover, a multiclass classification technique was employed using a Siamese network and a parallel learning structure. The author showed that the late fusion approach, which combines physiological signals, effectively addresses the incompleteness of the data. Furthermore, learning based on similarity scores through a Siamese network allowed efficient classification even with limited data. The applied model uses minimal hardware resources and is capable of real-time emotion recognition.

The topics discussed in the article are interesting and useful as research objects. The background of research, proposed solutions, obtained results, discussion, conclusions, and limitations were adequately and accurately described. Summing up, the paper is professionally written and therefore I recommend the submitted manuscript for publication after making a few minor corrections following the comments below.

- It would be worthwhile to check if the use of the two terms "acceleration" and "three-channel accelerometer" in L139 is not redundant.
- It is advisable to add a short description of the four affective states, especially the fourth state, i.e., meditation.
- L377-378: It seems advisable to explain what is meant by "reasonable level of accuracy achieved by Siamese networks" and provide a reference to a publication that confirms this.
- Adding the URL of the WESAD dataset used seems important.
- In the limitations, it would be advisable to add the weaknesses of the WESAD dataset and their possible impact on the conducted research (i.e., a small number of participants, a focus on only four emotional reactions, and data collected on a single day).

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The author responded to my question and the revised article is more scientific and readable.
