Article

Emotion Recognition of Down Syndrome People Based on the Evaluation of Artificial Intelligence and Statistical Analysis Methods

by Nancy Paredes 1,2,*, Eduardo F. Caicedo-Bravo 1,*, Bladimir Bacca 1,* and Gonzalo Olmedo 2,*
1 School of Electrical and Electronics Engineering, Faculty of Engineering, Universidad del Valle, Cali 760032, Colombia
2 Department of Electrical, Electronics and Telecommunications Engineering, Universidad de las Fuerzas Armadas ESPE, Sangolquí 171103, Ecuador
* Authors to whom correspondence should be addressed.
Symmetry 2022, 14(12), 2492; https://doi.org/10.3390/sym14122492
Submission received: 7 August 2022 / Revised: 3 September 2022 / Accepted: 9 September 2022 / Published: 24 November 2022
(This article belongs to the Section Computer)

Abstract

This article presents a study evaluating different techniques for automatically recognizing the basic emotions of people with Down syndrome (anger, happiness, sadness, surprise, and neutrality), together with a statistical analysis of the Facial Action Coding System to determine the symmetry of the Action Units present in each emotion and to identify the facial features that characterize this group of people. First, a dataset of images of faces of people with Down syndrome, classified according to their emotions, is built. Then, the characteristics of the facial micro-expressions (Action Units) present in the emotions of the target group are evaluated through statistical analysis. This analysis uses the intensity values of the most representative exclusive Action Units to classify people’s emotions. Subsequently, the collected dataset was evaluated using machine learning and deep learning techniques to recognize emotions. Among the supervised learning techniques tested first, the Support Vector Machine obtained the best precision, with a value of 66.20%. In the case of deep learning methods, the mini-Xception convolutional neural network, originally used to recognize the emotions of people with typical development, was applied and obtained an accuracy of 74.8%.

1. Introduction

Down syndrome (DS) is currently the most frequent genetic cause of intellectual disability and congenital malformations [1,2]. In addition, these individuals present heart disease, cognitive impairment, characteristic physical features, and a flattened facial appearance due to an extra copy of chromosome 21 [1,2,3]. Among other physical characteristics, certain distinctive features of the human face are typically associated with DS, such as slanting eyes, Brushfield spots in the iris, round faces, abnormal outer ears, and flat nasal bridges [4].
People’s emotions are best communicated through their non-verbal behavior (gestures, body posture, tone of voice) [5]. Emotions support cognitive processes and improve social interactions. Communication is a more complex process in people with DS, given that speech results from cognitive, affective, and social elements. The linguistic activity of these people follows a pattern of execution similar to that of people with typical development (TD), although the delay progressively increases as the intellectual functions involved become more complex [6]. In people with disabilities, the control, regulation, knowledge, and expression of emotions can improve their quality of life [7]. Therapies are among the daily activities that people with DS carry out. Thus, within clinical psychology, emotional activation, its intensity, and the processing of emotions constitute decisive elements in the success of a therapy [8].
Existing studies relating artificial intelligence techniques to DS focus mainly on detecting the syndrome. One detection method is invasive and is performed through a puncture in the mother’s abdomen between weeks 15 and 20 of gestation; however, being invasive, it carries a risk of losing the fetus. Other methods work non-invasively, for example by analyzing images of the fetus [9].
Computer-aided tools based on machine learning can currently recognize facial features of people with genetic syndromes by considering the shape and size of the face [3,10]. In addition, these tools detect and extract relevant facial points to compute measurements from images [9,10,11]. Moreover, image processing and machine learning techniques facilitate the recognition of dysmorphic facial features associated with genetic factors [10,12]. For instance, a method to detect DS in photographs of children uses a geometric descriptor built from facial points and a neural network for classification [10,13]. Zhao [14] proposed a hierarchical constrained local model based on independent component analysis to detect DS in facial photographs by combining texture and geometric information; an SVM classifier was used to distinguish between normal and abnormal cases on a sample of 48 images and achieved an accuracy of 97.90%.
Burçin and Vasif used the LBP descriptor to identify DS in face images, using 107 samples, and achieved an accuracy of 95.30% [15]. Saraydemir used the Gabor wavelet transform as a feature to recognize dysmorphic faces, with K-nearest neighbor (K-NN) and SVM classifiers trained on a set of 30 samples, achieving an accuracy of 97.34% [10,14,15,16,17]. Due to the privacy and sensitivity of the subjects, only a small number of studies used a DS dataset. These studies detect the syndrome from characteristics of the skull, the face, or clinical factors typically associated with DS [3]. In addition, processing images of the faces of people with DS with computational tools opens new lines of study, such as the recognition of their facial expressions using artificial intelligence, which has been studied very little to date.
There are two approaches to measuring facial expressions: one based on messages and another based on signs [11,18,19,20]. The first involves labeling the expression with basic emotional categories, such as “happiness”, “disgust”, or “sadness”. However, expressions do not frequently occur in their prototypical forms, and the underlying messages are often masked by social conventions. Therefore, the sign-based approach is analyzed as the better alternative, given that it objectively describes changes in the configuration of the face during an expression rather than interpreting its meaning. Within this approach, the most widely used signs [18,20,21] are those of the FACS, which breaks facial expressions down into small, anatomically based components called Action Units (AUs) [10,12,13,14,15,16,17,18,22]. It should be noted that investigations of people with DS focus on detecting the syndrome, many of them based on facial characteristics or some other part of the body [3,9,10,11,12,13].
Advances in artificial intelligence for facial expression recognition (FER) to identify emotions have not been applied to people with DS, especially in uncontrolled settings. This study is part of a research effort involving people with DS who attend education institutions; it aims to support the daily activities within the therapies they perform and in which they interact with others, taking into account that emotions constitute a basis for defining the achievements of people with DS within treatment.
In the phase presented in this article, different artificial intelligence techniques and statistical analyses are evaluated to define the facial characteristics of this group of people [4] when they express their basic emotions. Since no studies have been carried out in this field, the objective is to characterize the faces of this group of people. The analysis is based on facial measurements such as the FACS and on the evaluation of machine learning and deep learning techniques. In this context, the work is expected to contribute, from the point of view of applied research, to supporting the activities that this group of people carries out in their day-to-day life, emotions being a fundamental pillar in the performance of the activities of all human beings.
During the pandemic, people with DS carried out many of their activities through videoconferences. With the progressive return to face-to-face activities, it is even more important to analyze how they feel emotionally as additional support for stimulation sessions. For this reason, in this work, different machine learning and deep learning techniques are evaluated and compared for the automatic recognition of emotions (happiness, anger, surprise, sadness, and neutrality) in people with DS, using the intensity values of their micro-expressions obtained from a dataset of this group of people.
In this work, Section 1 compiles research on people with DS that uses machine learning and deep learning techniques. Section 2 reviews relevant studies in the field of FER and related work on emotion recognition based on these artificial intelligence techniques, as well as the literature on the Facial Action Coding System and the main characteristics of the Action Units in people with TD. Section 3 describes the methodology applied in this research. Section 4 presents the results obtained. Finally, Section 5 presents the conclusions and discussion.

2. Literature Review

This section presents relevant works based on facial expression recognition to identify emotions. The topics addressed in this section are (a) analysis of FER, (b) analysis of the main artificial intelligence techniques used in recent years for FER, (c) analysis of facial action coding units, and (d) analysis of basic emotions based on micro-expressions.

2.1. Facial Expression Recognition

Facial expressions are an essential non-verbal means of human communication and are vital for transmitting information from one individual to another [23]. Therefore, the automatic recognition of facial expressions through artificial intelligence is widely studied in fields such as psychology, artificial vision, and pattern recognition.
The recognition of emotions is closely related to this topic, with the human facial image as the main input. The input data for FER can be divided into two groups, depending on whether the features are extracted manually or learned through deep learning.
The first case consists of three main steps: image preprocessing, feature extraction, and expression classification. FER approaches based on deep learning significantly reduce the dependency on feature extraction. Most existing studies are based on laboratory datasets, but the current trend is the study of FER focused on spontaneous expressions.
The characteristics of each emotion are also determined by evaluating the micro-expressions of the face, whose theory we detail below.

2.2. Analysis of the Main Artificial Intelligence Techniques Used in Recent Years for FER

The advancement of facial emotion recognition at the computational level has been a challenge. It involves identifying the emotion shown by an individual through computational algorithms capable of interpreting the emotions of the human face, and it constitutes a significant advance in person–computer interaction.
Velluva [24] mentions that recognizing emotions constitutes a significant area of research because facial expressions convey information about human mental states, behavior, and intentions. These are pillars of social communication and interactions between individuals, supported in different areas such as psychology, neuroscience, human cognition, and learning.
Zago [25] reviews several works focused on solving the problem of emotion recognition from facial images using classical methods or a neural-network-based (NNB) approach. Rajan [26] mentions that several facial expression recognition systems focus on analyzing facial muscles, movement-based algorithms, and skin deformation, bearing in mind that in an uncontrolled environment, the effectiveness of existing algorithms may be limited by problems specific to the acquired data, whether image or video, due to the very nature of data capture. Among the most used methods for face detection are the Haar classifier and AdaBoost, applied to detect faces in an image or video sequence due to their high precision and low computational cost. For feature extraction, we can mention algorithms such as the Active Shape Model (ASM), Local Binary Patterns (LBP), and the Histogram of Oriented Gradients (HOG). In recent years, convolutional neural networks have also been used to extract features. Feature extraction based on the FACS, grounded in the movement of facial muscles, can also be mentioned [25].
Among the supervised learning techniques used to classify emotions from facial expressions, SVM is the most used, according to Zago [25]. Rajan [26] mentions that for the classification of facial expressions, SVM uses a linear algorithm based on a one-versus-one approach; a radial basis function (RBF) kernel based on a one-versus-rest scheme can also be used. This technique is chosen for its simplicity in terms of the processing power required. Depending on the facial expression, SVM classifiers provide better classification performance; they are excellent machine learning tools that compete with newer and more complex methods. In addition to this supervised technique, other methods are used for FER, albeit less frequently, such as KNN, Dynamic Bayesian Networks, Fuzzy Logic, Decision Trees, and Hidden Markov Models, among others [25,26,27].
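As a rough illustration of the two multi-class SVM set-ups just mentioned (a linear kernel with one-versus-one voting and an RBF kernel with a one-versus-rest scheme), the following scikit-learn sketch uses synthetic stand-in features; it is not taken from any of the cited works.

```python
# Minimal sketch (not the authors' code): the two multi-class SVM set-ups discussed
# above, evaluated on a synthetic stand-in for AU-based features (18 features,
# 5 emotion classes). Shapes and values are illustrative assumptions.
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 18))        # stand-in for AU intensity features
y = rng.integers(0, 5, size=200)      # stand-in for 5 emotion labels

# Linear SVM: scikit-learn's SVC trains one-vs-one binary classifiers internally.
linear_ovo = SVC(kernel="linear")
# RBF-kernel SVM wrapped explicitly as one-vs-rest.
rbf_ovr = OneVsRestClassifier(SVC(kernel="rbf", gamma="scale"))

for name, clf in [("linear, one-vs-one", linear_ovo), ("RBF, one-vs-rest", rbf_ovr)]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {acc:.2f}")
```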
Convolutional neural networks represent the most widely used approach to solving computer vision problems, including real-time emotion processing [25]. In addition, convolutional neural networks are also used in feature extraction, providing greater accuracy using backpropagation and activation function algorithms [25,26].

2.3. Facial Action Coding System (FACS)

The Facial Action Coding System (FACS) was primarily developed as a comprehensive system to distinguish all possible visible facial movements. Initially proposed by Paul Ekman, it is based on an anatomically identified and published system of Action Units (AUs), which are movements of individual muscles or muscle groups, especially of the face, that represent micro-expressions and specific characteristics of different groups of people [22,28,29]. These are the image characteristics analyzed in the present investigation.
FACS provides a standard nomenclature for facial movement research. The AUs, identified by a number and shorthand name, include the anatomical basis for each action presented in Table 1 (see also Table S1 in Supplementary Materials) and are rated on a five-point intensity scale [20,22,28,29,30,31]: A = trace, B = slight, C = marked or pronounced, D = severe or extreme, and E = maximum [22]. For example, AU12 encodes contractions of the zygomaticus major muscle (occurring, e.g., during smiles) [18,32]. This tool objectively describes facial expressions such as those shown in basic emotions. Each has a specific physiological response pattern in all cultures and people [22]. Thus, FACS provides an objective and comprehensive language for describing facial expressions and relating them based on the behavioral science literature [19].
Most of the existing methods for automatic AU detection address the problem using one-versus-all classifiers and do not exploit the dependencies between AU and facial features [20]. Nevertheless, it is possible to analyze whether there is a positive correlation between the use of AU classifiers and finding the relationship between AUs to improve their prediction [28,29]. Furthermore, the adoption of probabilities in the classifier can support a more precise method to identify which AU is activated, and it does so without the experience of the anatomy researcher, considering that the probabilities can be classified as dependent on facial features and independent of the expression itself [29].

2.4. Evaluation of Emotions Based on Micro-Expressions

In [33], Zhang mentions that some AUs coexist in the facial anatomy without being related to any emotion. There is a positive correlation when some AUs appear simultaneously because the same muscle group controls them; this is based on the analysis by Ekman in [28], where, for example, AU1 (inner brow raiser) and AU2 (outer brow raiser) are positively correlated since they belong to the same muscle group. In contrast, a negative correlation exists when specific muscles cannot be activated simultaneously [33], for example, AU12 (lip corner puller) and AU15 (lip corner depressor). Table 2, taken from [20,33], shows the dependence and independence relations among the AUs based on the facial anatomy of people with TD, without taking any emotion as a reference.
This analysis is essential in this work because it allows us to understand that the activation of some AUs depends on their correlation with others, which is a critical point in the analysis of this proposal [33].
Within the FACS expressions, AUs can be classified as primary, secondary, or other, depending on their function in the expression. The primary AUs are the most important, as they are always activated in basic expressions such as happiness, sadness, anger, and surprise. Secondary AUs can coexist with primary AUs and provide supplementary support for expressions [33]. Table 3 shows the activation of primary and secondary AUs in four basic emotions of people with TD.
Table 4 presents the possible combinations of AU that can activate simultaneously to form an emotion [29] in people with typical development.

3. Materials and Methods

The structure of the face of people with DS has unique characteristics, as presented in the literature [1,2,3,4], and their attitude toward different situations generates emotions that, in terms of facial expression, could differ from those of people with TD. For this reason, this article first investigates the structure of the expressions of people with DS, studied through the typical expressions of anger, happiness, sadness, surprise, and neutrality, and then evaluates the performance of artificial intelligence techniques in classifying them automatically.
A quantitative experimental methodology will be used based on the literature review described in Section 2. In addition, the data will be obtained from developing a dataset of images of faces of people with DS classified as anger, happiness, sadness, surprise, and neutral.
Section 2.4 mentioned that the emotions identified based on FER could be evaluated through the behavior of specific micro-expressions, defined as Action Units, so this technique will be used to obtain the most relevant AUs for each of the emotions of people with DS based on the samples in the created dataset. The Action Units are obtained through the OpenFace software [36], which allows for the discrimination of the micro-expressions of photographs and videos through each Action Unit’s activation and intensity levels. With the results obtained, through a statistical evaluation, based on the activation values and the intensity averages of each Action Unit, the most relevant micro-expressions for each emotion of people with DS are obtained.
Subsequently, the performance of artificial intelligence methods in classifying emotions using supervised machine learning and deep learning techniques is compared. In the case of machine learning techniques, it is first necessary to define the relevant characteristics of the image dataset, which is done based on the activation intensity levels of the Action Units. In the case of deep learning, the classified images of the dataset are used directly.
Several machine learning techniques were tested, and the ones that obtained the best results were SVM, KNN, and an ensemble method, taking into account Zago's review [25] of the techniques used on people with TD; likewise, for people with DS, SVM presented the best result for FER.
For deep learning, convolutional neural network techniques based on the Xception architecture are used; they were selected for their efficiency, precision, and low complexity, as explained in Section 3.4.
The results will be presented based on statistical graphs showing the intensity levels of each Action Unit for each emotion of people with DS. In addition, confusion matrices are given for recognizing emotions through artificial intelligence techniques, where the percentage of success in the recognition of each emotion can be observed.

3.1. Dataset

A dataset of images of faces of people with DS was created and classified according to anger, happiness, sadness, surprise, and neutrality. The images were freely obtained from the Internet and saved in JPG format with a matrix size of 227 × 227 × 3. Each sample in the created dataset (Dataset-IMG) corresponds to a different individual; in total, it contains 555 images of different people with DS, distributed as follows: 100 images of anger, 120 of happiness, 100 of sadness, 115 of surprise, and 120 of neutrality. An example of four individuals per emotion is shown in Table 5.
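A minimal loading sketch for such an image dataset is given below (not the authors' code); the folder names, root path, and the use of OpenCV are assumptions based only on the description above.

```python
# Minimal loading sketch (folder names, root path, and OpenCV usage are assumptions
# based only on the description above): read the collected JPG images and
# standardize them to 227 x 227 x 3.
from pathlib import Path
import cv2
import numpy as np

EMOTIONS = ["anger", "happiness", "sadness", "surprise", "neutral"]

def load_dataset_img(root="dataset_img", size=(227, 227)):
    images, labels = [], []
    for label, emotion in enumerate(EMOTIONS):
        for img_path in Path(root, emotion).glob("*.jpg"):
            img = cv2.imread(str(img_path))        # BGR array of shape (H, W, 3)
            if img is None:                        # skip unreadable files
                continue
            images.append(cv2.resize(img, size))
            labels.append(label)
    return np.stack(images), np.array(labels)

X_img, y_img = load_dataset_img()
print(X_img.shape)   # expected (555, 227, 227, 3) for the dataset described above
```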
These images were preprocessed, as detailed in Section 3.2, to obtain a dataset based on the intensity of each Action Unit studied for each emotion, called the Dataset-AU. Table 6 presents a random sample of the AU intensity values of five people with DS; that is, for each individual (sample), a set of intensity values for the 18 AUs studied was obtained, and these values represent the preprocessed dataset generated for each emotion.

3.2. Evaluation of Action Units for People with Down Syndrome

This section presents the analysis carried out to recognize emotions from the facial features of people with DS. The analysis of facial expressions was based on the intensity levels of Action Units 1, 2, 4, 5, 6, 7, 9, 10, 12, 14, 15, 17, 20, 23, 25, 26, 28, and 45, taking into account that these represent the upper part (eyes and eyebrows) and the lower part (mouth) of the face, both fundamental to defining emotions [37], according to the classification shown in Table 1. This work used OpenFace to detect Action Units, an open-source toolkit that applies deep neural networks to facial behavior analysis [36]. OpenFace detects the Action Units present in emotions, as well as their intensity levels. The emotions analyzed in this research were happiness, anger, surprise, sadness, and neutral expressions.
The images of the dataset mentioned in Section 3.1 were processed using OpenFace to generate the structured data presented in Table 6. This table shows the emotions studied in the first column; then, in the following columns, the values of the intensities are given for each sample of the AUs analyzed in each emotion.
The data in Table 6 form the basis of the Dataset-AU, which represents the characteristics of the labeled face images and is the input that the machine learning system will evaluate.
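As a rough sketch of how such a Dataset-AU table can be assembled (not the authors' pipeline), the snippet below assumes OpenFace has already been run on the classified images and has written one CSV per image with AU intensity columns named AU01_r through AU45_r; the folder layout and paths are illustrative assumptions.

```python
# Minimal sketch (not the authors' pipeline): build a Dataset-AU-style table from
# OpenFace outputs, assuming one CSV per image with AU intensity columns named
# "AU01_r" ... "AU45_r", organized in one folder per emotion. Paths, folder names,
# and the single-face-per-image assumption are illustrative.
from pathlib import Path
import pandas as pd

EMOTIONS = ["anger", "happiness", "sadness", "surprise", "neutral"]
AU_COLS = [f"AU{n:02d}_r" for n in
           (1, 2, 4, 5, 6, 7, 9, 10, 12, 14, 15, 17, 20, 23, 25, 26, 28, 45)]

def load_dataset_au(openface_out="openface_output"):
    """One row per image: the 18 AU intensities plus the emotion label."""
    rows = []
    for emotion in EMOTIONS:
        for csv_path in Path(openface_out, emotion).glob("*.csv"):
            df = pd.read_csv(csv_path)
            df.columns = df.columns.str.strip()   # OpenFace pads column names
            rows.append(df[AU_COLS].iloc[0].to_dict() | {"emotion": emotion})
    return pd.DataFrame(rows)

dataset_au = load_dataset_au()
# Mean intensity of each AU per emotion: the starting point of the statistical analysis.
print(dataset_au.groupby("emotion")[AU_COLS].mean().round(2))
```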
To analyze the behavior of the Action Units within each emotion, Figure 1 shows the normalized histograms for each emotion, normalized according to the number of samples and the bin size. This process derives from the analysis of the AU intensities obtained in the information acquisition phase through OpenFace. Figure 1 shows the degree of contribution of each AU within the emotion studied. This analysis allowed us to obtain Figure 2, which builds on the distributions presented in Figure 1 and reports the mean values of the intensity levels of the AUs for each emotion. For example, the emotion of happiness showed an approximately normal distribution for AUs 6 and 12, while AUs 1 and 2 showed an exponential distribution.
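A minimal sketch of how such normalized histograms can be produced follows, reusing the dataset_au table and AU_COLS list from the previous sketch; the panel layout and bin count are illustrative assumptions.

```python
# Minimal sketch, reusing dataset_au and AU_COLS from the previous sketch:
# per-emotion histograms of AU intensities, normalized by sample count and bin
# width (density=True). Panel layout and bin count are illustrative.
import matplotlib.pyplot as plt

def plot_au_histograms(dataset_au, emotion, au_cols, bins=10):
    data = dataset_au[dataset_au["emotion"] == emotion]
    fig, axes = plt.subplots(3, 6, figsize=(15, 7))
    for ax, au in zip(axes.ravel(), au_cols):
        ax.hist(data[au], bins=bins, density=True)  # normalized histogram
        ax.set_title(au)
    fig.suptitle(f"Normalized AU intensity histograms: {emotion}")
    fig.tight_layout()
    plt.show()

plot_au_histograms(dataset_au, "happiness", AU_COLS)
```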
Figure 2 presents the characteristic emotion for each Action Unit. It is based on the histograms shown in Figure 1, from which the mean values of the AU intensity levels were obtained for each emotion.
The statistical analysis used the Action Units of the 555 images (Dataset-IMG) of different people with DS described in Section 3.1, and for each AU in Figure 2 the highest mean value was identified.
The highest mean value of an AU in Figure 2 is taken to represent a particular emotion; for example, the mean intensities of AUs 1, 2, 5, 25, and 26 in the surprise emotion were higher than in the other emotions. In the case of sadness, only AU45 stood out as a representative value, and even then its intensity was low, approximately 80% lower than AU5 for surprise (the AU with the highest intensity value in the whole analysis). The neutral expression did not present the greatest intensity for any analyzed AU, so, as an analysis criterion, the second-highest intensity shown in each Action Unit was considered. In the neutral case, the AUs should not be activated, as no facial expression is being made; however, these Action Units appear active due to the anatomical features of the eyes of people with DS. Table 7 was obtained from the analysis of Figure 2 by taking, for each analyzed AU, the emotions with the highest intensities, which are therefore considered the most representative for people with DS. Figure 2 also shows that the mean values of AU23 were less than 10% of the highest mean values, so it did not present representative intensities compared with the other AUs.
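As an illustrative reading of this selection rule (not the authors' code), the sketch below assigns each AU to the emotion with the highest mean intensity and applies the second-highest criterion for the neutral expression, again assuming the dataset_au table and AU_COLS list from the earlier sketches.

```python
# Illustrative reading of the selection rule (not the authors' code), reusing
# dataset_au and AU_COLS from the earlier sketches: each AU is assigned to the
# emotion with the highest mean intensity; neutral claims an AU when it holds the
# second-highest mean, following the criterion described above.
mean_table = dataset_au.groupby("emotion")[AU_COLS].mean()

representative = {emotion: [] for emotion in mean_table.index}
for au in AU_COLS:
    ranked = mean_table[au].sort_values(ascending=False)
    representative[ranked.index[0]].append(au)
    if ranked.index[1] == "neutral":
        representative["neutral"].append(au)

for emotion, aus in representative.items():
    print(emotion, aus)   # compare with the per-emotion AU lists in Table 7
```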

3.3. Machine Learning Analysis in People with Down Syndrome

Various supervised machine learning techniques are used to classify the emotions of the Dataset-AU (mentioned in Section 3.2) that correspond to the intensity levels of the AUs of people with DS. In [27,38], machine learning techniques are presented and developed for emotion recognition in people with TD. In this research, the performance of the techniques presented in [27,38] in people with DS was analyzed, where the best results found are shown in Figure 3, Figure 4 and Figure 5 with the techniques described below:
  • Support Vector Machines (SVM) [27,38] based on a linear kernel.
  • K-Nearest Neighbors (KNN) [27,39].
  • An ensemble of subspace discriminant classifiers based on the AdaBoost method [39] with the Decision Tree learning technique.
The results on the Dataset-AU obtained with these learning techniques are presented in Figure 3, Figure 4 and Figure 5 through confusion matrices. These matrices report the True Positive Rates (TPR) on the diagonal and the False Negative Rates (FNR) off the diagonal, and the overall accuracy values of the three applied techniques were 66.2% for SVM, 64.9% for KNN, and 64.7% for the ensemble technique. Accuracy is a metric that evaluates classification models as the number of true positives over the total number of samples.
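A minimal sketch of this supervised evaluation is shown below, assuming the Dataset-AU table from the earlier sketches; the hyperparameters are illustrative, and the random-subspace bagging ensemble is only a rough stand-in for the subspace discriminant ensemble used in the paper.

```python
# Minimal sketch of the supervised evaluation, reusing dataset_au and AU_COLS from
# the earlier sketches. Hyperparameters are illustrative; the bagging ensemble with
# random feature subspaces is only a rough stand-in for the subspace discriminant
# ensemble reported in the paper.
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X = dataset_au[AU_COLS].values
y = dataset_au["emotion"].values

classifiers = {
    "SVM (linear)": SVC(kernel="linear"),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Subspace ensemble": BaggingClassifier(DecisionTreeClassifier(),
                                           n_estimators=30, max_features=0.5),
}

for name, clf in classifiers.items():
    y_pred = cross_val_predict(clf, X, y, cv=5)
    cm = confusion_matrix(y, y_pred, normalize="true")  # diagonal gives per-class TPR
    print(name, "accuracy:", round(accuracy_score(y, y_pred), 3))
    print(cm.round(2))
```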
In Figure 3, Figure 4 and Figure 5, the emotions of anger and sadness show low true positive rates: for sadness, 30.8% (SVM), 13.8% (KNN), and 26.20% (Ensemble), while anger reached 42.3% (SVM), 53.8% (KNN), and 46.20% (Ensemble). In contrast, emotions such as happiness reach rates greater than 82%.
Because emotions such as sadness and anger have low predictive values, it is necessary to analyze these cases. Furthermore, it must be considered that studies carried out on people with DS have shown their difficulty in expressing negative emotions such as fear, sadness, and anger [35]. For this reason, the next step was to evaluate deep learning techniques to analyze whether the prediction of these emotions improves.

3.4. Deep Learning Analysis in People with Down Syndrome

The machine learning techniques used in Section 3.3 achieved accuracies no higher than 66.2%, and the prediction of sadness in particular remained below 31%. This section discusses the use of deep neural networks (DNNs) to predict the emotions of people with DS in order to improve on these previous results.
To choose a technique, we investigated how some DNN architectures have been used to recognize the facial expressions of people with TD, since there are still no studies focused on people with DS; examples in [40,41,42,43] include AlexNet, VGG, ResNet, Xception, and SqueezeNet. Bianco [44] evaluated several DNN architectures in terms of performance indices, accuracy, computational complexity, and memory usage; the experiment was based on a comparison of two machines with different computational capacities. Table 8, based on the analysis carried out by Bianco [44], allows the techniques used for facial expression recognition in people with TD to be compared. The first column lists the architectures analyzed. The second column shows the accuracy; for example, the architecture with the lowest accuracy, 57%, was AlexNet, and the one with the best accuracy was Xception, with 79%. The third column gives the computational complexity based on the number of operations: SqueezeNet and Xception were the least complex systems, with 5 M and 10 M (M stands for millions), respectively, while the most complex was VGG with 150 M. The fourth column shows the relationship between accuracy and the number of parameters, that is, the efficiency, with Xception performing best at 47%. Finally, the fifth column reports the number of frames processed per second (FPS); it is essential to analyze this together with the accuracy, that is:
- ResNet renders between 70 and 300 FPS with a maximum accuracy of 73%.
- Xception processes 160 FPS, with an accuracy of 79%.
- AlexNet renders around 800 FPS with a 57% accuracy.
It is worth pointing out that after the analysis was carried out, the Xception had a higher efficiency than the other systems; its model is not complex, and it works with a rate of 160 FPS, which is why this architecture was chosen to work within this investigation [44].
Xception’s CNN architecture is slightly different from the typical CNN model in the way the final fully connected layers are handled. In addition, residual modules modify the expected mapping of subsequent layers, so the learned functions become the difference between the desired feature map and the original one.
Within the Xception architecture family, we have the mini-Xception architecture; as Arriaga [43] mentioned, mini-Xception reduces the number of parameters compared to Xception. Moreover, the architecture combines the removal of the fully connected layers with the inclusion of depthwise separable convolutions and residual modules. Finally, the architectures are trained with the ADAM optimizer.
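A minimal Keras sketch of the kind of building block just described is shown below, combining depthwise separable convolutions with a residual shortcut; layer sizes are illustrative and this is only an approximation of the mini-Xception design, not the authors' exact model.

```python
# Minimal Keras sketch of a residual depthwise-separable block in the spirit of the
# description above; filter counts and layer order are illustrative, not the exact
# mini-Xception configuration.
from tensorflow.keras import layers

def residual_separable_block(x, filters):
    # Shortcut branch: 1x1 strided convolution so shapes match for the addition.
    shortcut = layers.Conv2D(filters, 1, strides=2, padding="same")(x)
    shortcut = layers.BatchNormalization()(shortcut)

    # Main branch: two depthwise separable convolutions followed by downsampling.
    x = layers.SeparableConv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.BatchNormalization()(x)
    x = layers.SeparableConv2D(filters, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling2D(3, strides=2, padding="same")(x)

    # Residual module: the block learns the difference with respect to the shortcut.
    return layers.add([x, shortcut])

def classification_head(x, num_classes):
    # No fully connected layers: a convolution, global average pooling, and softmax.
    x = layers.Conv2D(num_classes, 3, padding="same")(x)
    x = layers.GlobalAveragePooling2D()(x)
    return layers.Activation("softmax")(x)
```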
In [45,46,47], the mini-Xception architecture for people with TD was evaluated using the FER2013 dataset containing 35,887 images of emotions such as happiness, anger, sadness, surprise, neutral, disgust, and fear. This dataset was used to recognize emotions, achieving an accuracy of up to 95%.
Mini-Xception was used with transfer learning in this work, considering that it had already been trained with another dataset (FER2013) for emotion recognition. In our case, we took the dataset of people with DS, Dataset-IMG (images), described in Section 3.1. These data were fed to the mini-Xception architecture, already trained with FER2013, as a block of test data. The system then returned the recognized emotion as the class with the highest prediction value.
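A minimal sketch of this inference step follows, assuming a Keras file with mini-Xception weights pretrained on FER2013 is available locally; the file name, the 64 × 64 grayscale input size, the scaling, and the FER2013 label order are assumptions based on publicly released models, not details confirmed by the paper.

```python
# Minimal sketch of the transfer-learning inference step. The model file name, the
# 64 x 64 grayscale input, the scaling, and the FER2013 label order are assumptions
# based on publicly released mini-Xception models, not values confirmed by the paper.
import cv2
import numpy as np
from tensorflow.keras.models import load_model

FER2013_LABELS = ["anger", "disgust", "fear", "happiness",
                  "sadness", "surprise", "neutral"]        # assumed label order
model = load_model("fer2013_mini_XCEPTION.hdf5", compile=False)

def predict_emotion(face_bgr):
    gray = cv2.cvtColor(face_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (64, 64)).astype("float32") / 255.0
    probs = model.predict(gray[np.newaxis, ..., np.newaxis], verbose=0)[0]
    # The recognized emotion is the class with the highest prediction value.
    return FER2013_LABELS[int(np.argmax(probs))], float(probs.max())

# Example: classify one face from the Dataset-IMG array loaded in the earlier sketch.
label, score = predict_emotion(X_img[0])
print(label, round(score, 2))
```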
Figure 6 shows the confusion matrix obtained with this architecture, with an improvement in the average accuracy over all emotions to 74.8%. In the case of sadness, an accuracy of 41% was achieved.

4. Results Analysis

Based on the dataset created with faces of people with DS for the five facial expressions (happiness, sadness, surprise, anger, and neutral), the intensities of the Action Units of the micro-expressions that define each emotion were obtained. The behavior of these intensities was evaluated through relative frequency histograms, and their activation levels were summarized through the mean value of each Action Unit per emotion, which defined the most relevant Action Units for each emotion of people with DS; the results are summarized in Table 7. In Table 7, a high correlation can also be seen between the Action Units that represent sadness and anger, especially AUs 4, 9, 15, and 17, which indicated from the beginning a high probability that these emotions would be confused. On the contrary, emotions such as happiness and surprise presented a greater decorrelation from the other emotions, indicating a high probability of differentiating them from the rest.
The intensities of the Action Units obtained for all the facial expressions became the main input features for the machine learning techniques.
From the analysis of Table 7 in Section 3.2, and in order to identify the technique that gives the best accuracy when identifying the emotions of people with DS, the AUs for each emotion were selected based on the most relevant mean intensity values obtained from the Dataset-IMG. The results of the machine learning and deep learning techniques described in Section 3.3 and Section 3.4 were then extracted from the confusion matrices and consolidated in Table 9. Based on the TPR, the emotion with the highest index was happiness, with values between 82.7% and 99%, the latter obtained using all the samples of a single class with the deep learning technique. On the other hand, the surprise emotion obtained a better TPR with the machine learning techniques, reaching 64.1%, an analysis that depends exclusively on the intensity levels of the AUs.
The sadness emotion achieved the lowest TPR with the machine learning techniques, 13.8%, since its mean intensity values were not salient; it reached a maximum of 41% with deep learning. In Table 7, the active AUs had the greatest relevance for the happiness emotion. The anger emotion presented its best TPR of 53.8% with the machine learning techniques and 66.7% with the deep learning technique.
The TPR of the neutral expression was close to 78.9% using machine learning and reached 88.4% with the deep learning approach.
The use of transfer learning allowed the reuse of the mini-Xception architecture previously trained with the FER2013 dataset. The dataset described in Section 3.1, made up of images of faces of people with DS (Dataset-IMG), was fed into the already-trained mini-Xception as a test data block. This architecture had already been tested in previous works to recognize emotions.

5. Conclusions and Discussion

This paper analyzed the typical facial features of people with DS, at the moment they express their emotions, through the Action Units identified with the statistical analysis. The facial characteristics of the emotions of people with DS were then evaluated using various artificial intelligence techniques, chosen and analyzed based on results reported in previous FER works. To meet these objectives, a dataset of faces of this group of people was first created and classified according to the emotions of anger, happiness, sadness, surprise, and neutrality.
To obtain the characteristics of the micro-expressions that occur in the emotions of this target group, the intensity values of the Action Units defined in the FACS were obtained using the OpenFace software.
The intensity of all the Action Units described in Section 3.2 was obtained for each emotion. Additionally, the Action Units that presented the highest average intensity levels for each emotion were selected as the most representative, and their behavior was characterized through relative frequency histograms. From these results, it can be deduced that emotions such as happiness and surprise are less correlated with the other emotions, which increases the probability of differentiating and classifying them. On the other hand, anger and sadness are highly correlated, which leads to a higher level of confusion between these emotions.
The Action Unit intensities were used as features for machine learning training, where SVM presented the best performance, with an accuracy of 66.2%, taking into account that the literature reports works on people with TD with accuracies greater than 80% [27,38]. On the other hand, KNN managed to classify the emotions of happiness and anger with a higher TPR.
The images of the collected dataset were also evaluated through transfer learning based on a convolutional neural network (mini-Xception) pretrained on FER2013 and tested on people with DS, obtaining an accuracy of 74.8%, taking into account that works carried out with this architecture on databases of people with TD obtained an accuracy of 66% [45]. In [47], the authors analyzed emotion recognition in real time with an accuracy of around 80.77%, and Behera [46] used the mini-Xception architecture with time series analysis, obtaining a maximum recognition percentage of 40%. The emotion recognized with more than 90% accuracy was happiness, both for people with DS and for people with TD, and the results showed similar performance in accuracy for both groups.
Based on the obtained results, this proposal has opened a research field that considers this group of people, who require support tools in their daily activities due to their physical and cognitive characteristics. This work constitutes a starting point for proposing algorithms based on the characteristics of this group of people to assess their emotions in real time within their daily activities, such as stimulation sessions, taking into account that the developed algorithms will support the automatic recognition of emotions both in person and in activities carried out through videoconferences; that is, this study can be applied to virtual and face-to-face activities.
As the objective of future research is also to work with people with DS who attend institutions, it will be essential to include the images obtained from stimulation sessions in the image dataset and to retrain the techniques evaluated in this article to improve the accuracy of emotion classification for people with DS, even for particular cases. Finally, the analysis of micro-expression behavior also leaves open the option of generating synthetic samples based on the relative frequency histograms, structured as probability density functions, which would allow the recreation of different types of gestures that reflect the emotions typical of people with DS.
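As a loose sketch of this synthetic-sample idea (an assumption-laden illustration, not part of this study), one could treat each AU's relative frequency histogram as an empirical probability density and draw new intensity values from it:

```python
# Loose sketch of the synthetic-sample idea (an illustration, not part of this
# study): treat the relative frequency histogram of an AU's intensity values as an
# empirical probability density and draw new values from it. Bin settings and the
# dataset_au / AU column names are assumptions carried over from the earlier sketches.
import numpy as np

def sample_from_histogram(intensities, n_samples=100, bins=10, seed=0):
    rng = np.random.default_rng(seed)
    counts, edges = np.histogram(intensities, bins=bins)
    probs = counts / counts.sum()                  # relative frequencies as a PDF
    chosen = rng.choice(len(counts), size=n_samples, p=probs)
    # Draw uniformly within each chosen bin to obtain continuous intensity values.
    return rng.uniform(edges[chosen], edges[chosen + 1])

# e.g., synthetic AU12 intensities for the happiness class of the Dataset-AU table:
# synthetic_au12 = sample_from_histogram(
#     dataset_au.loc[dataset_au["emotion"] == "happiness", "AU12_r"].values)
```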

Supplementary Materials

The following supporting information can be downloaded at: https://link.springer.com/chapter/10.1007/978-3-030-72208-1_19 (accessed on 6 August 2022); Table S1. The most important Action Units.

Author Contributions

Conceptualization, N.P., E.F.C.-B. and B.B.; methodology, N.P., B.B. and G.O.; formal analysis, N.P. and G.O.; writing—original draft preparation, N.P.; writing—review and editing, N.P., E.F.C.-B. and B.B. All authors have read and agreed to the published version of the manuscript.

Funding

This article had the support of the Universidad del Valle through the internal call for project research with the project “Recognition of emotions of people with Down syndrome based on facial expressions to support a therapeutic process”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We thank Diego Gustavo Arcos Aviles for his support in revising the grammar.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Pérez Chávez, D.A. Síndrome de down. Rev. Actual. Clínica Investig. 2014, 45, 2357–2361. [Google Scholar] [CrossRef]
  2. Rodríguez, E. Programa de educación emocional para niños y jóvenes con síndrome de Down. Rev. Síndrome Down Rev. Española Investig. Inf. Sobre Síndrome Down 2004, 82, 84–93. [Google Scholar]
  3. Agbolade, O.; Nazri, A.; Yaakob, R.; Abdul Ghani, A.A.; Cheah, Y.K. Down Syndrome Face Recognition: A Review. Symmetry 2020, 12, 1182. [Google Scholar] [CrossRef]
  4. Reardon, W.; Donnai, D. Dysmorphology demystified. Arch. Dis. Child.-Fetal Neonatal Ed. 2007, 92, F225–F229. [Google Scholar] [CrossRef] [Green Version]
  5. Shan, C.; Gong, S.; McOwan, P.W. Beyond facial expressions: Learning human emotion from body gestures. In Proceedings of the British Machine Vision Conference, Warwick, UK, 10–13 September 2007. [Google Scholar] [CrossRef] [Green Version]
  6. Matsumoto, D.; Hwang, H.; López, R.; Pérez Nieto, M. Lectura de la Expresión Facial de las Emociones: Investigación básica en la mejora del reconocimiento de emociones. Ansiedad Estres. 2013, 19, 121–129. [Google Scholar]
  7. Ruiz, E.; Alvarez, R.; Arce, A.; Palazuelos, I.; Schelstraete, G. Programa de educación emocional. Aplicación práctica en niños con síndrome de Down. Rev. Síndrome Down Rev. Española Investig. Inf. Sobre Síndrome Down 2022, 103, 126–139. [Google Scholar]
  8. Hauke, G.; Dall Occhio, M. Emotional Activation Therapy (EAT): Intense work with different emotions in a cognitive behavioral setting. Eur. Psychother. 2013, 11, 5–29. [Google Scholar]
  9. Soler Ruiz Tesis, V.; Prim Sabrià Jordi Roig de Zárate, M. Lógica Difusa Aplicada a Conjuntos Imbalanceados: Aplicación a la Detección del Síndrome de Down; Universitat Autònoma de Barcelona: Bellaterra, Spain, 2007. [Google Scholar]
  10. Cornejo, J.Y.R.; Pedrini, H.; Lima, A.M.; Nunes, F.D.L.D.S. Down syndrome detection based on facial features using a geometric descriptor. J. Med. Imaging 2017, 4, 044008. [Google Scholar] [CrossRef]
  11. Lucey, P.; Cohn, J.F.; Kanade, T.; Saragih, J.; Ambadar, Z.; Matthews, I. The extended cohn-kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition-Workshops, CVPRW, San Francisco, CA, USA, 13–18 June 2010; pp. 94–101. [Google Scholar]
  12. Chen, C.-M.; Chou, Y.-H.; Tagawa, N.; Do, Y. Computer-Aided Detection and Diagnosis in Medical Imaging. Comput. Math. Methods Med. 2013, 2013, 790608. [Google Scholar] [CrossRef] [Green Version]
  13. Eroğul, O.; Sipahi, M.E.; Tunca, Y.; Vurucu, S. Recognition of Down syndromes using image analysis. In Proceedings of the 2009 14th National Biomedical Engineering Meeting, Izmir, Turkey, 20–22 May 2009. [Google Scholar] [CrossRef]
  14. Zhao, Q.; Okada, K.; Rosenbaum, K.; Zand, D.J.; Sze, R.; Summar, M.; Linguraru, M.G. Hierarchical constrained local model using ICA and its application to down syndrome detection. In Medical Image Computing and Computer-Assisted Intervention: MICCAI International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Berlin/Heidelberg, Germany, 2013; Volume 16, pp. 222–229. [Google Scholar] [CrossRef]
  15. Burçin, K.; Vasif, N.V. Down syndrome recognition using local binary patterns and statistical evaluation of the system. Expert Syst. Appl. 2011, 38, 8690–8695. [Google Scholar] [CrossRef]
  16. Saraydemir, S.; Taşpınar, N.; Eroğul, O.; Kayserili, H.; Dinçkan, N. Down Syndrome Diagnosis Based on Gabor Wavelet Transform. J. Med. Syst. 2011, 36, 3205–3213. [Google Scholar] [CrossRef] [PubMed]
  17. Zhao, Q.; Rosenbaum, K.; Okada, K.; Zand, D.J.; Sze, R.; Summar, M.; Linguraru, M.G. Automated down syndrome detection using facial photographs. In Proceedings of the 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Osaka, Japan, 3–7 July 2013; pp. 3670–3673. [Google Scholar] [CrossRef]
  18. Hupont, I.T.; Chetouani, M. Region-based facial representation for real-time Action Units intensity detection across datasets. Pattern Anal. Appl. 2017, 22, 477–489. [Google Scholar] [CrossRef]
  19. Bartlett, M.S.; Littlewort, G.C.; Frank, M.G.; Lainscsek, C.; Fasel, I.R.; Movellan, J.R. Automatic Recognition of Facial Actions in Spontaneous Expressions. J. Multimed. 2006, 1, 22–35. [Google Scholar] [CrossRef]
  20. Zhao, K.; Chu, W.-S.; de la Torre, F.; Cohn, J.F.; Zhang, H. Joint patch and multi-label learning for facial action unit detection. In Proceedings of the CVPR, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 2207–2216. [Google Scholar] [CrossRef] [Green Version]
  21. Zhang, L.; Verma, B.; Tjondronegoro, D.; Chandran, V. Facial Expression Analysis under Partial Occlusion: A Survey. ACM Comput. Surv. 2018, 51, 1–49. [Google Scholar] [CrossRef] [Green Version]
  22. Paul, E.; Wallace, V.F.; Joseph, C.H. Facial Action Coding System; APA PsycNet: Salt Lake City, UT, USA, 2002. [Google Scholar]
  23. Mehrabian, A.; Russell, J.A. An Approach to Environmental Psychology; The MIT Press: Cambridge, UK, 1974. [Google Scholar]
  24. Puthanidam, R.V.; Moh, T.-S. A hybrid approach for facial expression recognition. In Proceedings of the 12th International Conference on Ubiquitous Information Management and Communication, Langkawi, Malaysia, 5–7 January 2018; pp. 1–8. [Google Scholar] [CrossRef]
  25. Canal, F.Z.; Müller, T.R.; Matias, J.C.; Scotton, G.G.; Junior, A.R.D.S.; Pozzebon, E.; Sobieranski, A.C. A survey on facial emotion recognition techniques: A state-of-the-art literature review. Inf. Sci. 2021, 582, 593–617. [Google Scholar] [CrossRef]
  26. Rajan, S.; Chenniappan, P.; Devaraj, S.; Madian, N. Facial expression recognition techniques: A comprehensive survey. IET Image Process. 2019, 13, 1031–1040. [Google Scholar] [CrossRef]
  27. Revina, I.M.; Emmanuel, W.R.S. A Survey on Human Face Expression Recognition Techniques. J. King Saud Univ.-Comput. Inf. Sci. 2018, 33, 619–628. [Google Scholar] [CrossRef]
  28. Ekman, P.; Friesen, W.V. Facial Action Coding System: A Technique for the Measurement of Facial Movement; Consulting Psychologists Press: Palo Alto, CA, USA, 1978. [Google Scholar] [CrossRef]
  29. Ekman, P.; Friesen, W.V. Measuring facial movement. Environ. Psychol. Nonverbal Behavior. 1976, 1, 56–75. [Google Scholar] [CrossRef]
  30. Clark, E.A.; Kessinger, J.; Duncan, S.E.; Bell, M.A.; Lahne, J.; Gallagher, D.L.; O’Keefe, S.F. The Facial Action Coding System for Characterization of Human Affective Response to Consumer Product-Based Stimuli: A Systematic Review. Front. Psychol. 2020, 11, 920. [Google Scholar] [CrossRef]
  31. Li, Y.; Mavadati, S.M.; Mahoor, M.H.; Ji, Q. A unified probabilistic framework for measuring the intensity of spontaneous facial action units. In Proceedings of the 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), Shanghai, China, 22–26 April 2013; pp. 1–7. [Google Scholar] [CrossRef]
  32. Wang, Z.; Li, Y.; Wang, S.; Ji, Q. Capturing global semantic relationships for facial action unit recognition. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 3304–3311. [Google Scholar] [CrossRef]
  33. Zhang, Y.; Dong, W.; Hu, B.-G.; Ji, Q. Classifier learning with prior probabilities for facial action unit recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 5108–5116. [Google Scholar] [CrossRef]
  34. Du, S.; Tao, Y.; Martinez, A.M. Compound facial expressions of emotion. Proc. Natl. Acad. Sci. USA 2014, 111, E1454–E1462. [Google Scholar] [CrossRef] [Green Version]
  35. Siam, A.I.; Soliman, N.F.; Algarni, A.D.; El-Samie, F.E.A.; Sedik, A. Deploying Machine Learning Techniques for Human Emotion Detection. Comput. Intell. Neurosci. 2022, 2022, 1–16. [Google Scholar] [CrossRef] [PubMed]
  36. Baltrusaitis, T.; Zadeh, A.; Lim, Y.C.; Morency, L.P. Openface 2.0: Facial behavior analysis toolkit. In Proceedings of the 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi’an, China, 15–19 May 2018; pp. 59–66. [Google Scholar]
  37. Baltrusaitis, T.; Mahmoud, M.; Robinson, P. Cross-dataset learning and person-specific normalisation for automatic Action Unit detection. In Proceedings of the 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), Ljubljana, Slovenia, 4–8 May 2015; pp. 1–6. [Google Scholar] [CrossRef]
  38. Yao, L.; Wan, Y.; Ni, H.; Xu, B. Action unit classification for facial expression recognition using active learning and SVM. Multimed. Tools Appl. 2021, 80, 24287–24301. [Google Scholar] [CrossRef]
  39. Thannoon, H.H.; Ali, W.H.; Hashim, I.A. Detection of deception using facial expressions based on different classification algorithms. In Proceedings of the 2018 Third Scientific Conference of Electrical Engineering (SCEE), Baghdad, Iraq, 19–20 December 2018; pp. 51–56. [Google Scholar] [CrossRef]
  40. Baffour, P.A.; Nunoo-Mensah, H.; Keelson, E.; Kommey, B. A Survey on Deep Learning Algorithms in Facial Emotion Detection and Recognition. Inf. J. Ilm. Bid. Teknol. Inf. Dan Komun. 2022, 7, 24–32. [Google Scholar] [CrossRef]
  41. Li, S.; Deng, W. Deep Facial Expression Recognition: A Survey. J. Image Graph. 2020, 25, 2306–2320. [Google Scholar] [CrossRef] [Green Version]
  42. Aiswarya, P.; Manish; Mangalraj, P. Emotion recognition by inclusion of age and gender parameters with a novel hierarchical approach using deep learning. In Proceedings of the 2020 Advanced Communication Technologies and Signal Processing (ACTS), Silchar, India, 4–6 December 2020; pp. 1–6. [Google Scholar] [CrossRef]
  43. Sun, L.; Ge, C.; Zhong, Y. Design and Implementation of Face Emotion Recognition System Based on CNN Mini_Xception Frameworks. J. Physics Conf. Ser. 2021, 2010, 012123. [Google Scholar] [CrossRef]
  44. Bianco, S.; Cadene, R.; Celona, L.; Napoletano, P. Benchmark Analysis of Representative Deep Neural Network Architectures. IEEE Access 2018, 6, 64270–64277. [Google Scholar] [CrossRef]
  45. Arriaga, O.; Plöger, P.G.; Valdenegro, M. Real-Time Convolutional Neural Networks for Emotion and Gender Classification. In Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Bruges, Belgium, 24–26 April 2019; pp. 221–226. [Google Scholar]
  46. Behera, B.; Prakash, A.; Gupta, U.; Semwal, V.B.; Chauhan, A. Statistical prediction of facial emotions using Mini Xception CNN and time series analysis. In Data Science; Springer: Singapore, 2021; pp. 397–410. [Google Scholar] [CrossRef]
  47. Fatima, S.A.; Kumar, A.; Raoof, S.S. Real Time Emotion Detection of Humans Using Mini-Xception Algorithm. IOP Conf. Series Mater. Sci. Eng. 2021, 1042, 012027. [Google Scholar] [CrossRef]
  48. Batta, M. Machine Learning Algorithms—A Review. Int. J. Sci. Res. IJSR 2019, 9, 381–386. [Google Scholar] [CrossRef]
  49. Sarker, I.H. Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Comput. Sci. 2021, 2, 160. [Google Scholar] [CrossRef]
Figure 1. Normalized histograms—intensity levels of the Action Units. (a) Anger. (b) Happiness. (c) Sadness. (d) Surprise. (e) Neutral.
Figure 2. Activation of Action Units in each emotion of people with Down syndrome.
Figure 3. Classification confusion matrix of SVM.
Figure 4. Classification confusion matrix of KNN.
Figure 5. Classification confusion matrix of Ensemble.
Figure 6. Classification confusion matrix using transfer learning.
Table 1. The main Action Units (see also Table S1 in Supplementary Materials).
AU1: Inner brow raiser        AU15: Lip corner depressor
AU2: Outer brow raiser        AU16: Lower lip depressor
AU4: Brow lowerer             AU17: Chin raiser
AU5: Upper lid raiser         AU18: Lip puckerer
AU6: Cheek raiser             AU20: Lip stretcher
AU7: Lid tightener            AU22: Lip funneler
AU9: Nose wrinkler            AU23: Lip tightener
AU10: Upper lip raiser        AU24: Lip pressor
AU11: Nasolabial deepener     AU25: Lips part
AU12: Lip corner puller       AU26: Jaw drop
AU13: Cheek puffer            AU28: Lip suck
AU14: Dimpler                 AU45: Blink
Table 2. The dependencies and independences of the AUs according to the facial anatomy of people with typical development [20,33].
AU relation: AUs
Positive correlation: (1, 2), (4, 7), (4, 9), (7, 9), (6, 12), (9, 17), (15, 17), (15, 24), (17, 24), (23, 24)
Negative correlation: (2, 6), (2, 7), (12, 15), (12, 17)
Table 3. The relationship of primary and secondary AUs of four basic expressions based on FACS (primary “P” and secondary “S”) [33].
AU: 1, 2, 4, 5, 6, 7, 9, 10, 12, 15, 16, 17, 20, 23, 24, 25, 26
Anger PP S S PP
Happiness PS P S
SadnessP S SP P S
SurprisePP P S PP
Disgust PP S S
FearPPPP P PS S
Table 4. AU dependence based on the Emotional Facial Action Coding System [28,29,30,33,34,35].
Expression: AUs
Anger: (4 and 5) or (4 and 7) or (4 and 5 and 7) or (17 and 23 and 24)
Happiness: (12) or (6 and 12) or (7 and 12)
Sadness: (1) or (1 and 4) or (15) or (6 and 15) or (11 and 15) or (11 and 17)
Surprise: (1 and 2 and 5) or (1 and 2 and 26) or (1 and 2 and 5 and 26)
Disgust: 9 or 10
Fear: (1 and 2 and 4) or 20
Table 5. Samples of the images of Dataset-IMG (four example face images are shown for each emotion: anger, happiness, sadness, surprise, and neutral).
Table 6. A random sample of the preprocessed dataset generated through OpenFace for each emotion of people with Down syndrome (AU intensity values).
Emotion      AU1   AU2   AU4   AU5   AU6   AU7   AU9   AU10  AU12  AU14  AU15  AU17  AU20  AU23  AU25  AU26  AU28  AU45
Anger        0.33  0     2.54  0     0     0.68  0.53  0.88  0     0     1.27  2.20  1.09  0     0     0.61  0     0
Happiness    0     0     0     0     2.20  2.79  0.24  0.67  2.28  0.89  0.37  0     1.24  0.64  0     0     0     0.15
Sadness      1.62  0.84  0.44  0.55  0     0.21  0     0     0     0.44  1.16  0.73  0.21  0     0     0     0     0
Surprise     1.40  0.26  0.03  1.96  0     0     0     0     0     0     0.56  0     0.18  0     1.28  0.28  0     0
Neutral      0.53  0     0     0.36  0     0     0     0     0     0     0     0     0.56  0     0.20  0.51  0     0
Table 7. Relevant Action Units that are present in emotions of people with Down syndrome.
Expression: AUs
Anger: 4, 9, 15, 17, 23
Happiness: 6, 7, 10, 12, 14, 20
Sadness: 1, 4, 6, 7, 9, 12, 15, 17, 20, 45
Surprise: 1, 2, 5, 25, 26
Neutral: 2, 5
Table 8. Comparison of DNN architectures [44].
Architecture   Accuracy (%)   Computational Complexity (Floating-Point Operations)   Accuracy Density (%)   Frames Processed per Second (FPS)
AlexNet        57             75 M                                                   1                      800
ResNet         70–73          5–75 M                                                 1.6–5                  80–300
VGG            68–75          150 M                                                  0.52                   190–290
SqueezeNet     58             5 M                                                    3.9                    600–650
Xception       79             10 M                                                   47                     160
Table 9. Summary of results for emotion classification techniques of people with DS.
True Positive Rates (%)
Approach                   Technique                         Anger  Happiness  Neutral  Sadness  Surprise  Accuracy
Machine Learning [48,49]   KNN                               53.8   91.3       76.1     13.8     60.3      64.9
Machine Learning [48,49]   Ensemble Subspace Discriminant    46.2   89.4       71.6     26.2     60.3      64.7
Machine Learning [48,49]   SVM                               42.3   82.7       78.9     30.8     64.1      66.2
Deep Learning              Mini-Xception                     66.7   99         88.4     41       62.2      74.8
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
