Communication
Peer-Review Record

DevEmo—Software Developers’ Facial Expression Dataset

Appl. Sci. 2023, 13(6), 3839; https://doi.org/10.3390/app13063839
by Michalina Manikowska, Damian Sadowski, Adam Sowinski and Michal R. Wrobel *
Reviewer 1: Anonymous
Submission received: 7 February 2023 / Revised: 5 March 2023 / Accepted: 13 March 2023 / Published: 17 March 2023
(This article belongs to the Special Issue Advances in Emotion Recognition and Affective Computing)

Round 1

Reviewer 1 Report

The authors present a visual emotion dataset that they collected. It is a useful contribution for use in related research. The dataset information and the data collection process are, on the whole, described clearly. Yet there are several issues that need better exploration and explanation:

- Several similar datasets have been published in the literature. The proposed dataset comprises a set of video clips taken during a programming exam. On the other hand, not all emotion labels are available in the dataset. Overall, the advantage of the proposed dataset for emotion detection studies is not very clear.

- In Table 1, a list of similar datasets is given by the authors. Among them, for Aff-Wild [18], the condition is given as "spontaneous in controlled environment". But in the text, on page 3, there is the explanation "the only other dataset recorded in an uncontrolled environment was the Aff-Wild dataset [18]", which conflicts with the entry in the table. If the Aff-Wild dataset was also recorded "spontaneous in uncontrolled environment", a more detailed comparison between the proposed dataset and Aff-Wild is needed.

- It is stated that each video is annotated by 3 annotators. However, the experience, expertise, and demographic information of the annotators are not given, and any bias that may arise from the selection of annotators is not discussed.

- Several important statistics about the dataset are presented, but the following statistics are missing: the number of emotions/distinct emotions detected per user, the minimum and maximum clip duration for each emotion, and any mismatch in the video-clip beginning/end markings made by different annotators.

- The authors argue that the proposed dataset was recorded in an uncontrolled environment, so it is free from the Hawthorne effect. The authors only state that the participants' consent was obtained (which is positive), but it is not clearly described whether the participants knew that the purpose of the study involved emotion detection. If this was stated, it may have introduced a bias into the data collection.

Author Response

Dear Reviewer,

We would like to thank you for your insightful comments and suggestions. We feel that by following your comments, we have significantly improved the paper.

The authors present a visual emotion dataset that they collected. It is a useful contribution for use in related research. The dataset information and the data collection process are, on the whole, described clearly.

Yet there are several issues that need better exploration and explanation:

- Several similar datasets have been published in the literature. The proposed dataset comprises a set of video clips taken during a programming exam. On the other hand, not all emotion labels are available in the dataset. Overall, the advantage of the proposed dataset for emotion detection studies is not very clear.

We have modified the description of potential uses of the DevEmo dataset, highlighting the possible use of the dataset, at its current stage of development, with transfer learning techniques.

 

- In Table 1, a list of similar datasets is given by the authors. Among them, for Aff-Wild [18], the condition is given as "spontaneous in controlled environment". But in the text, on page 3, there is the explanation "the only other dataset recorded in an uncontrolled environment was the Aff-Wild dataset [18]", which conflicts with the entry in the table. If the Aff-Wild dataset was also recorded "spontaneous in uncontrolled environment", a more detailed comparison between the proposed dataset and Aff-Wild is needed.

Thank you for catching this mistake, which arose during the creation of the table. The recordings from the Aff-Wild dataset were captured in an uncontrolled environment. The main difference between Aff-Wild and DevEmo is that our dataset contains only recordings of computer users, while the former contains data of all kinds, such as from stage performances, interviews, and lectures. For this reason, DevEmo is more suitable for developing models that focus on recognizing computer users' emotions while working and learning. Both the table and its summary have been modified according to the comment.

 

- It is stated that each video is annotated by 3 annotators. However, the experience, expertise, and demographic information of the annotators are not given, and any bias that may arise from the selection of annotators is not discussed.

Information about annotators has been added in Section 4.

 

- Several important statistics about the dataset are presented, but the following statistics are missing: the number of emotions/distinct emotions detected per user, the minimum and maximum clip duration for each emotion, and any mismatch in the video-clip beginning/end markings made by different annotators.

Statistics have been added as suggested.

 

- The authors argue that the proposed dataset was recorded in an uncontrolled environment, so it is free from the Hawthorne effect. The authors only state that the participants' consent was obtained (which is positive), but it is not clearly described whether the participants knew that the purpose of the study involved emotion detection. If this was stated, it may have introduced a bias into the data collection.

Participants were aware that the purpose of the study was to identify emotions while solving programming tasks. An appropriate note has been added to the "Threats to Validity" section.

Reviewer 2 Report

The paper introduces a new dataset called DevEmo, which includes 217 video clips of 33 students solving programming tasks, recorded in their actual work environment to capture spontaneous facial expressions while they were engaged in programming tasks. The dataset is labelled with five categories, including four emotions (anger, confusion, happiness, and surprise) and a neutral state. The dataset aims to explore the relationship between emotions and computer-related activities and could support the development of more personalized and effective tools for computer-based learning environments. The process of creating the dataset involved a thorough and systematic labelling process by at least three annotators, ensuring a high degree of accuracy in the labelling process. The paper is presented very well, and all the important steps are described clearly. There are minor things the authors need to address before I can recommend this paper for publication.

I. Could you describe the expertise level of the annotators? How many years of experience do they have in research activities related to human emotion recognition?

II. Was the same combination of three annotators retained for the annotation of all 219 videos, or were there multiple combinations of different annotators?

III. Could you also highlight some limitations of the DevEmo dataset?

IV. Misspelling detected in line 175.

 

V. Line 232: the word "embarrassment" may not be a suitable word to use.

Author Response

Dear Reviewer,

We would like to thank you for your insightful comments and suggestions. We feel that by following your comments, we have significantly improved the paper.

I. Could you describe the expertise level of the annotators? How many years of experience do they have in research activities related to human emotion recognition?

Information about annotators has been added in Section 4.

 

II. Was the same combination of three annotators retained for the annotation of all 219 videos, or were there multiple combinations of different annotators?

All annotation was carried out by the same three annotators. The relevant information is presented more clearly in the "Annotation procedure" section.

 

III. Could you also highlight some limitations of the DevEmo dataset?

We have added a paragraph to the Discussion section dedicated to limitations of the DevEmo dataset.

 

IV. Misspelling detected in line 175.

Thank you for your remark; it has been corrected.

 

V. Line 232: the word "embarrassment" may not be a suitable word to use.

Thank you for pointing out this mistake, it was corrected.

Reviewer 3 Report

Overall, the manuscript provides a comprehensive and well-researched analysis of a facial expression dataset. The authors have presented their findings in a clear and concise manner, making it easy to follow their argument and conclusions.

Comments for author File: Comments.pdf

Author Response

Dear Reviewer,

We would like to thank you for your insightful comments and suggestions. We feel that by following your comments, we have significantly improved the paper.

 

1. The presentation of the proposal must be clear and precise.

The introduction has been modified to more clearly formulate the purpose of the study.

 

2. All the references used in the manuscript should be in order.

3. In this manuscript, the references mostly appear to be used randomly.

The list of references is ordered by occurrence in the text. The confusion may arise from the journal's citation method: at the beginning of Section 2, there is a reference to several papers, which is denoted as [3-11].

 

4. In addition, the conclusions and abstract must be self-contained.

Thank you for this comment, the abstract and conclusions have been revised.

 

5. A further suggestion for citation: cite the following ready references in this manuscript.

Thank you for bringing valuable papers to our attention.

Reviewer 4 Report

The manuscript presents a dataset for affective computing for computer programmers. The authors adopt the categorical approach to emotion classification. I have some major comments to improve the quality of the manuscript and bring it to the publication standard of this journal.

1. I think the study is at a very early stage of research. The authors need to perform several experiments (proposing some baseline results) on the proposed dataset that can support conclusions about its applications.

2. The authors claim that the proposed dataset is useful for data-driven research approaches such as deep learning. However, the number of samples is very limited (217 video clips) for four-class emotion classification, and highly imbalanced, with some classes having fewer than 25 samples. In addition, the number of participants is only 33 students. Summing these clues, I think the data could be useful for some conventional algorithms and machine learning approaches, but not suitable for deep neural network training.

3. The authors need to extend their dataset and balance it with respect to the number of samples in each class. Baseline methods and results need to be explored using this dataset.

Author Response

Dear Reviewer,

We would like to thank you for your insightful comments and suggestions. We feel that by following your comments, we have significantly improved the paper.

 

1. I think the study is at a very early stage of research. The authors need to perform several experiments (proposing some baseline results) on the proposed dataset that can support conclusions about its applications.

We agree with this comment in its entirety. Therefore, the paper was not submitted as an "Article", but as a "Communication". According to the MDPI guidelines, a "Communication" is a short article that presents groundbreaking preliminary results or significant findings that are part of a larger study spanning multiple years.

 

2. The authors claim that the proposed dataset is useful for data-driven research approaches such as deep learning. However, the number of samples is very limited (217 video clips) for four-class emotion classification, and highly imbalanced, with some classes having fewer than 25 samples. In addition, the number of participants is only 33 students. Summing these clues, I think the data could be useful for some conventional algorithms and machine learning approaches, but not suitable for deep neural network training.

Thank you for this comment. We have modified the description of potential uses of the DevEmo dataset, highlighting the possible use of the dataset, at its current stage of development, with transfer learning techniques. In addition, we have added a paragraph to the Discussion section dedicated to the limitations of the DevEmo dataset.
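One common way to make a small, imbalanced dataset usable for training is to pair a pretrained backbone (the transfer learning mentioned above) with a class-weighted loss, so that rare classes are not drowned out. As a minimal sketch of the weighting step only, assuming hypothetical per-class clip counts (placeholders, not the actual DevEmo statistics), inverse-frequency weights can be computed as:

```python
from collections import Counter

# Hypothetical per-class clip counts for an imbalanced five-category
# dataset; the real DevEmo counts are reported in the paper.
clip_labels = (
    ["neutral"] * 100 + ["anger"] * 40 + ["confusion"] * 40
    + ["happiness"] * 22 + ["surprise"] * 15
)

def inverse_frequency_weights(labels):
    """Weight each class by total / (n_classes * count), so rarer
    classes contribute more to a weighted training loss."""
    counts = Counter(labels)
    total, n_classes = len(labels), len(counts)
    return {cls: total / (n_classes * n) for cls, n in counts.items()}

weights = inverse_frequency_weights(clip_labels)
print(max(weights, key=weights.get))  # the rarest class gets the largest weight
```

These weights would then be passed to the loss function while fine-tuning only the classification head of a pretrained model, which is the usual transfer learning recipe when the target dataset is this small.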

 

3. The authors need to extend their dataset and balance it with respect to the number of samples in each class. Baseline methods and results need to be explored using this dataset.

In further work, we plan to expand the dataset in accordance with the proposed comments. We have added a paragraph to the Section Conclusions dedicated to future work.

Round 2

Reviewer 1 Report

I find the revisions and the authors' responses satisfactory. Since there is already another dataset collected in an uncontrolled setting, the novelty of the proposed one is somewhat reduced. However, having one more dataset focused on a specific task will be beneficial for the research community, so overall I find the study worth publishing.

Author Response

Dear Reviewer,

Thank you very much for your review and comments, which undoubtedly helped to improve the quality of the paper. 

Authors

Reviewer 4 Report

Thank you for the revised manuscript. On an overall analysis of the revised version, there are no significant changes with respect to my earlier comments. The main limitations of the paper are the limited dataset and the limited number of categories, which the authors also acknowledge in the new limitations section (5.2). Also, as I mentioned before, the paper needs to provide some baseline methods or results, which has not yet been addressed. Given these major limitations, I cannot recommend the manuscript for acceptance in its current status.

Author Response

Dear Reviewer,

We apologize that the previous revision of our manuscript did not address your concerns. In this revision, we have added a Validation section, which presents an analysis of clips from the DevEmo dataset using Noldus FaceReader.

Although we do not provide a baseline method or results, we hope that given the type of article ("Communication") and the short preparation time for the revision, the new version will at least partially meet your expectations. 

Best regards
Authors
