Necessary Morphological Patches Extraction for Automatic Micro-Expression Recognition

Micro-expressions are subtle, brief facial expressions that appear when humans try to hide their true emotional states. In recent years, micro-expression recognition has attracted wide attention in psychology, mass media, and computer vision. The shortest micro-expression lasts only 1/25 s. Furthermore, unlike macro-expressions, micro-expressions have considerably low intensity and involve only slight contraction of the facial muscles. Because of these characteristics, automatic micro-expression detection and recognition are great challenges in computer vision. In this paper, we propose a novel automatic facial expression recognition framework based on necessary morphological patches (NMPs) to better detect and identify micro-expressions. A micro-expression is a subconscious facial muscle response that is not controlled by rational thought. It therefore calls on only a few facial muscles and has local properties. NMPs are the facial regions that must be involved when a micro-expression occurs. NMPs are screened by weighting the facial active patches rather than using the entire facial area holistically. First, we manually define the active facial patches according to the facial landmark coordinates and the Facial Action Coding System (FACS). Second, we use an LBP-TOP descriptor to extract features from these patches and the Entropy-Weight method to select the NMPs. Finally, we obtain the weighted LBP-TOP features of these NMPs. We test on two recent publicly available datasets, CASME II and SMIC, which provide sufficient samples. Compared with many recent state-of-the-art approaches, our method achieves more promising recognition results.


Introduction
Micro-expression is a brief, involuntary, external representation of a real emotion that can be exploited to determine the "real" behaviors and feelings of an individual [1]. Micro-expressions were first discovered by the psychologists Ekman and Friesen in 1969 [2]. Compared with ordinary facial expressions, micro-expressions have three significant characteristics: short duration (generally 1/25 to 1/3 s), low intensity, and (usually) local movement [3]. Because of these characteristics, micro-expressions are very difficult for human beings to detect [4]. Only highly trained individuals may be able to distinguish them, and even with proper training, recognition accuracy generally falls below 50% [5]. However, the capability to detect and recognize micro-expressions is crucial in many areas, such as psychological and clinical diagnosis, police interrogation, and national security [6]. To date, many studies in the literature [7][8][9][10][11][12][13] have established computer-aided techniques to recognize micro-expressions automatically.
According to known research results [8][9][10], most experiments extract features from the whole facial region, which generates many redundant features and greatly reduces recognition efficiency. Psychologists have also found that micro-expressions tend to be partial movements that do not appear in the upper and lower portions of the face simultaneously [5] and are generally concentrated near the eyes, nose, and mouth. Ekman proposed that a facial expression represents slight changes in several discrete facial motion units [5]. He also established the Facial Action Coding System (FACS) [14] to describe the relationship between facial muscle changes and emotional states. This system illustrates that facial expressions are associated with subtle changes in several action units (AUs). Compared with ordinary expressions, micro-expressions are extremely brief and invoke fewer muscle action units. For example, even without the involvement of the eyebrows and mouth, one can express an inner state of surprise only by raising the upper eyelids. Later, the theory of necessary morphological patches (NMPs) for micro-expressions was established [15]. NMPs are the regions that are indispensable to a micro-expression. For example, in all facial representations of disgust, the eyebrows need not be raised and the mouth need not be open, but the upper lip must be raised and the nasolabial sulcus must appear on both sides of the nose. Figure 1 shows the NMPs of disgust, which indicate whether a person is in a disgusted emotional state. Almost all emotional information of micro-expressions is concentrated in these patches; therefore, we need to separate these patches from the whole facial area.
In this paper, we aim to find the NMPs that play a key role in micro-expression recognition, and use these patches to train and learn how to identify micro-expression sequences. Automatic facial landmark detection is the first step in this work. This technique detects the face in the video sequences and adjusts its position so that the facial region can be cut out. Face alignment is then applied to extract facial active patches. In this study, to locate the facial active patches more accurately, we chose the 68-landmark algorithm [10]. After finding these landmarks on the face, we calibrated the active patches of the eyebrows, eyes, nose, and mouth based on the FACS criterion and the landmarks. We then manually cut out 18 active patches, known as regions of interest (ROIs) [13], from the whole face area.
Moreover, we need to extract more effective features from these active patches [16][17][18][19]. Many existing works apply optical flow [12] and Local Binary Patterns from Three Orthogonal Planes (LBP-TOP) [8] to extract features from dynamic micro-expression video sequences. The optical flow method captures the correlation between frames by comparing two adjacent frames. However, the changes between adjacent frames in micro-expressions are very weak, and hence the algorithm does not reflect the changes of the facial active patches. The LBP-TOP algorithm analyzes image texture in both the temporal and spatial domains. Texture is a feature that shows the spatial distribution of pixels and can convey the necessary information of a micro-expression. In addition, it can display the local structural information of facial images. Compared with ordinary expressions, micro-expressions call for less muscle motion to convey emotions, so we need to identify only some NMPs to evaluate the current emotional state.
To reduce the dimensionality and improve recognition efficiency, we use the Entropy-Weight method to screen, from the 18 active patches, the NMPs that are essential for micro-expression recognition. Shannon first introduced the concept of entropy into information theory, where it describes the average amount of information of events. The entropy weight has been widely used in engineering, social, and economic fields; its basic idea is to determine objective weights according to the variability of the indexes. In general, the smaller the information entropy of an index, the greater the variability of the index value. Such an index therefore has more influence on the comprehensive evaluation, and its weight is greater. To assess the contribution of the 18 active patches to micro-expression recognition, we use the Entropy-Weight method to evaluate their weights. The Entropy-Weight method not only selects the NMPs from the active patches but also weights them, thus increasing the discriminative ability of our algorithm. Finally, a multi-class SVM classifier is used to classify these NMPs, and the recognition rate is obtained on the CASME II and SMIC databases.
The rest of this paper is organized as follows. The next section describes the related work on the facial landmark detector, NMP selection, feature extraction, and the weighting of these necessary regions. Section 3 gives the particulars of the databases and discusses the experimental results in detail. Finally, Section 4 concludes the paper.

Methods
The subtle local movement of the facial muscles changes the facial expression and the relative positions of the facial landmarks. The texture information of these regions also changes as the expression changes. In this paper, we explore how different facial areas contribute to recognition accuracy, based on the subtle, local qualities of micro-expressions. In other words, our goal is to identify the crucial facial areas corresponding to the different emotions of micro-expressions. The framework of the proposed algorithm is shown in Figure 2.
Appl. Sci. 2018, 8, x

Facial Landmark Location
The goal of facial landmark detection is to accurately locate the key points of the face. The landmarks generally refer to the points around the eyes, eyebrows, nose, mouth, and face contour. Studies have shown that the active facial areas are mainly concentrated in the area where the eyebrows and nasal bridge meet, as well as at the corners of the eyes and mouth. In this paper, we first use face detection and landmark detection to accurately locate the active facial patches. After that, we cut out the regions and extract the necessary features. To locate the active facial patches more accurately, a 68-landmark algorithm is used to calibrate each micro-expression sequence.
To the best of our knowledge, there are many machine learning methods for locating the 68 landmarks, such as the Active Appearance Model (AAM), the Active Shape Model (ASM), and deep neural networks. Taking into account both real-time performance and localization accuracy, we used ASM to localize the landmarks of the micro-expression images [20]. The algorithm learns from facial images annotated in a training set; the best matching points are then searched on the test set and the landmarks of the face are located accordingly. We located facial landmarks in micro-expression images based on a previously published algorithm [21]. The 68 landmarks drawn on a facial image are shown in Figure 3. These 68 landmarks are employed in our algorithm to align the active areas. They indicate the shape of the eyebrows, eyes, nose, mouth, and the whole face, which helps in cutting out the active patches.

Extraction of Facial Active Patches
There are two main drawbacks to training the classifier directly on the whole face: (1) the feature dimension is too large and the training time is relatively long; (2) some regions of the face do not express emotion and contribute little to the representation of facial expressions. Hence, the features obtained from these regions are likely to introduce noise.
The face must be partitioned appropriately for micro-expression recognition to be feasible [22]. The FACS criterion quantifies several muscle movements of the face and reveals 57 elementary components of expression. These elementary components are known as action units (AUs) and action descriptors (ADs). Like other facial expressions, a micro-expression is a spatial combination of AUs. Each AU describes a local movement of the micro-expression. Table 1 defines several relationships between AUs and facial movements [14].
Considering that micro-expressions involve only certain local muscle movements and AUs, extracting only a few active facial patches instead of the entire face image is an effective approach to recognition. The eyebrows and eyes, for example, are involved in nearly all basic emotions [23]. The morphological characteristics of the eyes and brows are important cues to different micro-expressions. The mouth is another key discriminant area for expression recognition. Here we manually choose a frontal neutral face image as a template and divide the image into 18 ROIs, as shown in Figure 4. The patches are separated according to the movements of micro-expressions. Each patch represents an active facial area of the micro-expression. We keep all patches the same size and extract the sequences of active patches for subsequent research.
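The cropping of equally sized active patches can be sketched as follows. This is only a minimal illustration, not the implementation used in this work: the function names `crop_patch` and `extract_patch_sequences`, the list-of-rows frame representation, and the patch centres are assumptions made for the example. The patch side follows the one-eighth-of-face-width size reported in the experiment settings.

```python
def crop_patch(frame, center, side):
    """Return a side x side patch of `frame` centred on `center` = (x, y).

    `frame` is a grayscale image stored as a list of rows. The window is
    shifted where necessary so that it stays inside the image.
    """
    height, width = len(frame), len(frame[0])
    half = side // 2
    left = min(max(int(round(center[0])) - half, 0), width - side)
    top = min(max(int(round(center[1])) - half, 0), height - side)
    return [row[left:left + side] for row in frame[top:top + side]]


def extract_patch_sequences(frames, centers, face_width):
    """Cut one patch sequence per centre out of an aligned face sequence.

    `centers` are hypothetical (x, y) patch centres taken from the template
    face; every patch has the same side length, face_width // 8.
    """
    side = max(1, face_width // 8)
    return [[crop_patch(frame, c, side) for frame in frames] for c in centers]
```

Because the same centres are reused for every frame, each patch sequence has the same length as the input sequence and a fixed spatial size.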

Extraction Features
Micro-expressions differ from ordinary ("macro") expressions in their low intensity, short duration, and local movements. It is therefore unreasonable to apply ordinary expression recognition methods directly to micro-expression sequences. Here, we extend the classic LBP descriptor to LBP-TOP to handle dynamic textures and events across the spatial-temporal dimensions [24,25].
The LBP-TOP operator extends LBP, which was first proposed by Ojala et al., to three orthogonal planes. This operator reveals the local binary pattern of each image as well as the motion features of the whole sequence in the spatial-temporal domain. The LBP-TOP operator first divides the temporal and spatial domain into three orthogonal planes (XY, XT, and YT), then calculates the LBP values of the center pixels in each plane, and finally yields statistics of the expression information in the three directions.
In practical applications, the spatial and temporal feature scales differ because of the unpredictable texture orientation and the differences in image resolution and frame rate. Here, we use an elliptical structure to define the neighboring points on the three orthogonal planes with respect to the center point between frames, as shown in Figure 5. The LBP codes extracted from the XY, XT, and YT planes are denoted XY-LBP, XT-LBP, and YT-LBP. The statistics of the three planes are obtained for all pixels and then concatenated into a single histogram [24].
In this paper, we extract LBP-TOP features and generate feature histograms for the 18 active patch sequences. A micro-expression calls on only a few facial muscles. If all 18 active patches are used to represent a micro-expression sequence, the feature dimension is too large, which makes feature matching extremely complex and consumes too many system resources. Moreover, the movement range of a micro-expression is much smaller than that of an ordinary expression, so micro-expressions can be represented by some NMPs. In the next stage, we estimate the NMPs that are significant for the micro-expression from these 18 active patches.
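To make the three-plane construction concrete, the following sketch computes a simplified LBP-TOP histogram for one patch sequence. It is an illustration under stated assumptions rather than the descriptor configuration used in the experiments: it fixes the radius to 1 and the number of neighbors to 8 on every plane, samples neighbors on a square ring without interpolation (instead of the elliptical structure), and skips border voxels. The names `lbp_code` and `lbp_top_histogram` are hypothetical.

```python
# Offsets (du, dv) of the 8 neighbours on a radius-1 square ring. This is a
# simplification of the circular or elliptical sampling used by LBP-TOP.
RING = [(-1, -1), (0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0)]


def lbp_code(volume, t, y, x, plane):
    """8-bit LBP code of voxel (t, y, x) within one plane: "XY", "XT" or "YT".

    `volume[t][y][x]` is the grey value at frame t, row y, column x.
    """
    center = volume[t][y][x]
    code = 0
    for bit, (du, dv) in enumerate(RING):
        if plane == "XY":
            neighbour = volume[t][y + dv][x + du]
        elif plane == "XT":
            neighbour = volume[t + dv][y][x + du]
        else:  # "YT"
            neighbour = volume[t + dv][y + du][x]
        if neighbour >= center:
            code |= 1 << bit
    return code


def lbp_top_histogram(volume):
    """Concatenate the normalised 256-bin histograms of the XY, XT, YT codes."""
    frames, height, width = len(volume), len(volume[0]), len(volume[0][0])
    feature = []
    for plane in ("XY", "XT", "YT"):
        hist = [0] * 256
        # Only interior voxels have all neighbours on every plane.
        for t in range(1, frames - 1):
            for y in range(1, height - 1):
                for x in range(1, width - 1):
                    hist[lbp_code(volume, t, y, x, plane)] += 1
        total = sum(hist) or 1
        feature.extend(h / total for h in hist)
    return feature
```

With these settings each patch sequence yields a 3 × 256 = 768-dimensional vector, which illustrates why concatenating the features of all 18 patches quickly becomes expensive.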

Learning Crucial Facial Patches
Only a few NMPs play key roles in micro-expression recognition [5], because each active patch has a different importance for micro-expression recognition. For example, the eye area and the mouth area are highly discriminative in expressing emotion. Therefore, we set a different weight for each active patch so as to find the NMPs and improve the subsequent recognition accuracy.
In this paper, the Entropy-Weight method is used to calculate the weight of each active patch and to select the NMPs essential for micro-expression recognition. Information entropy can represent the information content of an image and express the richness of the image texture. When an image is divided into many sub-patches, the local information entropy partly reflects the quantity of information in each patch. We can therefore calculate the contribution of the texture feature of each patch based on local information entropy. The weight of each patch histogram is given by the information entropy, which reflects the importance of each patch.
The information entropy of a local patch indicates the information contained in its pixels. The greater the amount of information, the richer the texture information of the patch. Considering the strong ability of texture features to discriminate expression details, our paper introduces the concept of entropy and uses the entropy weight to express the weight of the NMPs.
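As a small illustration of local information entropy, the sketch below computes the Shannon entropy of the grey-level distribution of a single patch. Taking the distribution over raw grey levels, measuring in bits, and the name `patch_entropy` are assumptions made for this example; the text does not specify these details.

```python
import math
from collections import Counter


def patch_entropy(patch):
    """Shannon entropy (in bits) of the grey-level distribution of a 2-D patch."""
    pixels = [value for row in patch for value in row]
    counts = Counter(pixels)
    total = len(pixels)
    # Sum over the grey levels that actually occur in the patch.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A perfectly uniform patch has zero entropy, while a patch whose pixels are spread evenly over many grey levels has high entropy, matching the intuition that richer texture carries more information.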
The steps of determining the weights by the Entropy-Weight method are as follows. Suppose that there are m objects to be evaluated and n evaluation indexes. The original data matrix of the image is

$$R = (r_{ij})_{m \times n}, \tag{1}$$

where R is the evaluation matrix.

Step 1: Standardize the original matrix to obtain the normalized matrix

$$R' = (r'_{ij})_{m \times n}, \tag{2}$$

where $r'_{ij}$ is the standard value of the jth evaluation index on the ith evaluation object.

Step 2: Calculate the proportion $f_{ij}$ of the index value of the ith object on the jth index:

$$f_{ij} = \frac{r'_{ij}}{\sum_{i=1}^{m} r'_{ij}}. \tag{3}$$

Step 3: Determine the information entropy. In the case of m objects and n indexes, the information entropy of the jth index is defined as

$$H_j = -k \sum_{i=1}^{m} f_{ij} \ln f_{ij}, \tag{4}$$

where $k = 1/\ln m$.

Step 4: Define the entropy weight. The entropy weight of the jth index is defined as

$$w_j = \frac{1 - H_j}{n - \sum_{j=1}^{n} H_j}, \tag{5}$$

where $0 \le w_j \le 1$ and $\sum_{j=1}^{n} w_j = 1$.
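The four steps can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not a definitive implementation: min-max scaling is assumed for the standardization step, the convention 0 · ln 0 = 0 is used, a constant index is treated as carrying no information, and the function name `entropy_weights` is hypothetical. Rows of the input are the m objects and columns are the n indexes.

```python
import math


def entropy_weights(matrix):
    """Entropy weights of the n indexes (columns) of an m x n data matrix."""
    m, n = len(matrix), len(matrix[0])
    if m < 2:
        raise ValueError("at least two objects are needed (k = 1 / ln m)")
    k = 1.0 / math.log(m)
    entropies = []
    for column in zip(*matrix):
        low, high = min(column), max(column)
        # Step 1 (assumed min-max scaling); a constant index maps to all ones.
        if high > low:
            norm = [(v - low) / (high - low) for v in column]
        else:
            norm = [1.0] * m
        # Step 2: proportion of each object on this index.
        total = sum(norm)
        f = [v / total for v in norm]
        # Step 3: information entropy, with 0 * ln 0 taken as 0.
        entropies.append(-k * sum(p * math.log(p) for p in f if p > 0))
    # Step 4: entropy weights; fall back to uniform weights if every index
    # has maximal entropy and the denominator vanishes.
    denominator = n - sum(entropies)
    if denominator <= 1e-12:
        return [1.0 / n] * n
    return [(1.0 - h) / denominator for h in entropies]
```

An index whose values barely vary across objects has entropy close to 1 and receives a weight close to 0, so ranking the indexes by weight gives a simple way to keep only the most informative ones.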

Multi-Class Classification
In this study, we used a Support Vector Machine (SVM) [25] as the classifier for micro-expression recognition. An SVM projects feature vectors into a higher-dimensional space by nonlinear mapping and finds a linear hyperplane for classification. The basic SVM is a linear two-class model that maximizes the margin in the feature space. In this paper, we used Leave One Sample Out Cross Validation (LOSOCV) and 10-fold cross validation to evaluate it. However, micro-expression recognition is a multi-class problem, and there are two common ways to solve it with SVMs: one-versus-rest (OVR) and one-versus-one (OVO). In this paper, we use OVO SVMs. The approach is to design an SVM between every two classes of samples, so k(k − 1)/2 SVMs are needed for k classes. When classifying an unknown sample, the class with the largest number of votes is selected. The advantage of this method is that all SVMs need not be retrained when samples are added; only the classifiers related to the added samples need to be trained. In addition, we use a kernel function to map the samples from the original space to a higher-dimensional feature space, so that the samples are linearly separable in this feature space. Common kernel functions include the linear kernel, the polynomial kernel, and the Radial Basis Function (RBF). In this paper, the RBF kernel $k(x_i, x_j) = \exp(-\|x_i - x_j\|^2 / (2\sigma^2))$, where $\sigma$ is the kernel width, is used in our classifier.
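The one-versus-one voting scheme and the RBF kernel can be illustrated with the short sketch below. It is not the classifier used in the experiments: the σ-based parametrization of the kernel is an assumed (standard) form, `pairwise_decide` stands in for a trained binary SVM and is hypothetical, and ties between classes are broken by whichever class received its first vote earliest.

```python
import math
from collections import Counter
from itertools import combinations


def rbf_kernel(x, y, sigma=1.0):
    """RBF kernel exp(-||x - y||^2 / (2 * sigma^2)); sigma is an assumed parameter."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def num_pairwise_classifiers(k):
    """Number of binary classifiers needed by one-versus-one for k classes."""
    return k * (k - 1) // 2


def ovo_predict(sample, classes, pairwise_decide):
    """Return the class that wins the most pairwise votes.

    `pairwise_decide(a, b, sample)` is a hypothetical stand-in for the binary
    SVM trained on classes a and b; it must return either a or b.
    """
    votes = Counter(pairwise_decide(a, b, sample)
                    for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]
```

For the five emotion classes of CASME II, for instance, `num_pairwise_classifiers(5)` gives 10 binary classifiers, each of which casts one vote per test sample.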

Datasets
Compared with macro-expressions, only a few micro-expression databases are available for investigation. In this section, we evaluate the proposed algorithm on the CASME II [26] and SMIC [27] databases.

CASME II
The CASME II database was established in 2014 as an upgraded version of the CASME database [26]. The temporal resolution of the new database increased from 60 fps to 200 fps, and the spatial resolution increased to 280 × 340. The database was recorded in a strictly controlled laboratory environment with appropriate lighting, and contains a total of 247 micro-expression videos. Each clip has either a total duration of less than 0.5 s or an onset duration (i.e., time from onset to apex) of less than 0.25 s. The ground truth information includes the onset and offset frames, the represented emotions, and the AUs. The database consists of five classes of emotions, namely, happiness (32 samples), disgust (60 samples), surprise (25 samples), repression (27 samples), and tense (102 samples).

SMIC
The Spontaneous Micro-Expression Database (SMIC) was designed by Zhao's team at the machine vision research center of the University of Oulu, Finland [27]. The researchers asked subjects to watch movie clips that induced disgust, fear, sadness, and surprise while attempting to suppress their facial expressions. The experiment used a 100 fps camera. The resulting database includes 164 videos of 16 subjects. The micro-expressions have a maximum total duration of 0.5 s, and the longest video sequence contains 50 frames. There are three main emotion categories: positive (happiness; 51 samples), negative (sadness, fear, disgust; 70 samples), and surprise (43 samples).

Experiment Settings
First, we use ASM to locate 68 landmark points in every image of each micro-expression sequence, then crop the face area and normalize each frame to 164 × 196.
These databases consist of micro-expression sequences of several individuals captured by high-speed cameras. The number of frames differs between sequences, which degrades the recognition rate if samples of different lengths are used to extract and classify the micro-expressions. Following [20], we use the temporal interpolation model (TIM) to normalize the number of frames of all micro-expression sequences. Table 2 shows how the number of frames relates to the experimental time and accuracy. Based on Table 2, all samples were normalized to 10 frames. We use a 68-point ASM to locate the facial key points and establish the landmarks [21]. The whole face can then be divided into active patches based on these landmarks. Eighteen facial active patches are generated based on the FACS rules and AUs. Accordingly, each active patch sequence also has 10 frames. All facial active patches have the same size, approximately one-eighth of the width of the face, as shown in Figure 6.
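The effect of normalizing every sequence to a fixed number of frames can be illustrated with plain linear interpolation along the time axis. Note that this is a deliberately simpler stand-in and not the temporal interpolation model (TIM) itself; the function name `resample_frames` and the list-of-rows frame representation are assumptions made for the example.

```python
def resample_frames(frames, target_length=10):
    """Resample a frame sequence to `target_length` frames.

    Uses linear interpolation between the two nearest original frames
    (a simplified substitute for TIM). Each frame is a list of rows.
    """
    n = len(frames)
    resampled = []
    for i in range(target_length):
        # Position of output frame i on the original time axis [0, n - 1].
        position = i * (n - 1) / (target_length - 1) if target_length > 1 else 0.0
        lower = int(position)
        upper = min(lower + 1, n - 1)
        weight = position - lower
        resampled.append([
            [(1.0 - weight) * a + weight * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frames[lower], frames[upper])
        ])
    return resampled
```

After this step every sample has the same temporal length, so the features extracted from different sequences have the same dimension and can be compared directly.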
In this paper, we use the LBP-TOP operator to extract the feature vector for micro-expression recognition. The descriptor has two important parameters: the radius and the number of neighboring points. For convenience, we write $\mathrm{LBP\text{-}TOP}_{R_X, R_Y, R_T, P_{XY}, P_{XT}, P_{YT}}$ in terms of $R_X$, $R_Y$, $R_T$, with $P_{XY} = P_{XT} = P_{YT} = P$. The parameter comparison of the LBP-TOP algorithm is shown in Table 3.
In this paper, SVM is selected as the experimental classifier, and the choice of kernel function is very important for its performance. The experiments are conducted on the CASME II and SMIC databases using the weighted LBP-TOP features, and an SVM classifier with a kernel function is used to evaluate the proposed method. In this multi-subject analysis, both LOSOCV and 10-fold cross validation are used to validate the effectiveness of the proposed method in all experiments. In LOSOCV, the video sequences of one subject are treated as the testing data and the remaining sequences as the training data. This process is repeated k times, where k denotes the number of subjects in the database, and the recognition results for all subjects are averaged to form the final recognition accuracy. In 10-fold cross validation, the data set is divided into ten parts; in turn, nine parts are used as training data and the remaining one as test data. The correct rate is obtained from each test, and the average of these rates is taken as the accuracy of the algorithm. Finally, we repeat the 10-fold cross validation ten times and take the mean as the final accuracy. The results of evaluating the SVM with these two common cross validation methods are shown in Table 4.
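The two validation protocols described above can be sketched as index splitters. This is a minimal illustration, not the authors' code: the helper names (`loso_splits`, `kfold_splits`, `mean_accuracy`) and the `fit_predict` callback, which stands in for training and applying the SVM, are hypothetical.

```python
import random

def loso_splits(subject_ids):
    """Yield (train_idx, test_idx) pairs, holding out one subject per split."""
    for s in sorted(set(subject_ids)):
        test = [i for i, sid in enumerate(subject_ids) if sid == s]
        train = [i for i, sid in enumerate(subject_ids) if sid != s]
        yield train, test

def kfold_splits(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for one shuffled k-fold partition.

    Assumes n_samples >= k so that no fold is empty.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def mean_accuracy(splits, fit_predict, labels):
    """Average the per-split accuracy.

    fit_predict(train_idx, test_idx) is a hypothetical callback that trains a
    classifier on train_idx and returns predicted labels for test_idx.
    """
    accs = []
    for train, test in splits:
        pred = fit_predict(train, test)
        correct = sum(p == labels[i] for p, i in zip(pred, test))
        accs.append(correct / len(test))
    return sum(accs) / len(accs)
```

Repeating the 10-fold protocol ten times would then amount to calling `kfold_splits` with ten different `seed` values and averaging the ten results.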

Results and Discussion
In this section, we conduct extensive experiments to evaluate the performance of the proposed micro-expression recognition method on two widely used micro-expression databases.

NMPs
The number of facial active patches also affects the recognition rate. Figure 7 shows the relationship between the number of patches and the recognition rate. Even a single crucial patch yields a recognition rate of 31.62%, while using all 18 active patches produces a recognition rate of 56.93%. We found that the features of some unimportant patches do not play a significant role in identifying micro-expressions. Instead of applying all 18 facial active patches, we can extract a few crucial patches (NMPs) with discriminative ability for micro-expression recognition.
In the figure, N represents the number and location of NMPs. A micro-expression involves only a few facial active patches. The recognition rate is highest when the number of patches is 10. Higher values of N include extra patches that contribute little to capturing subtle movement and recognizing micro-expressions, whereas lower values of N lose some important information.
The movements of micro-expressions are very weak, and most of them are concentrated around the corners of the eyes, the eyebrows, the nose, and the mouth. As a feature selection algorithm, the Entropy-Weight method can evaluate the importance of each feature to the classification problem. This paper compares the Entropy-Weight method with other feature selection algorithms in terms of the number of NMPs and the recognition accuracy. The experimental results are shown in Table 5. According to these data, the algorithms select basically the same number of NMPs in the regions of the eyes, eyebrows, and mouth, whereas the numbers for the cheek and nose areas differ. This is because the muscle movement of micro-expressions is mainly concentrated in the eye, eyebrow, and mouth regions, and the cheek and nose regions have very few action units. Micro-expressions are usually restrained facial movements, which are very subtle and easily overlooked. The Pearson coefficient is insensitive to these micro-expression areas and can be misleading because of the small correlation between the motions. The lasso model is very unstable: when the data change slightly, the model may change greatly. The Entropy-Weight method has the advantages of high robustness and ease of use, and the experimental results show that the NMPs selected by this method are basically in line with the most representative facial muscle motion patches that psychologists have proposed for micro-expressions. In addition, the Entropy-Weight method can assign weights to the feature vectors, which better represents the motion characteristics of micro-expressions in the classification process.
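To make the weighting idea concrete, the sketch below computes entropy weights in the standard textbook form: normalize each column to proportions, compute its normalized entropy, and weight each column by one minus that entropy. It is assumed, not confirmed by this section, that the paper's Entropy-Weight method follows this formulation; the score matrix layout (one row per sample, one column per patch) and the helper names `entropy_weights` and `top_patches` are hypothetical.

```python
import math

def entropy_weights(matrix):
    """Return one weight per column (patch); the weights sum to 1.

    matrix[i][j] >= 0 is an assumed non-negative score of patch j on sample i.
    Columns whose scores vary more across samples receive larger weights.
    """
    n = len(matrix)
    m = len(matrix[0])
    if n < 2:
        raise ValueError("need at least two samples")
    k = 1.0 / math.log(n)  # scales the entropy into [0, 1]
    diversities = []
    for j in range(m):
        col = [row[j] for row in matrix]
        total = sum(col)
        # Proportions p_ij; an all-zero column is treated as uniform.
        props = [x / total for x in col] if total > 0 else [1.0 / n] * n
        entropy = -k * sum(p * math.log(p) for p in props if p > 0)
        # Clamp tiny negative values caused by floating-point rounding.
        diversities.append(max(0.0, 1.0 - entropy))
    s = sum(diversities)
    if s == 0:
        return [1.0 / m] * m  # every column is uniform: fall back to equal weights
    return [d / s for d in diversities]

def top_patches(weights, n_select):
    """Return the indices of the n_select highest-weighted patches, sorted."""
    order = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    return sorted(order[:n_select])
```

With 18 patch columns, `top_patches(weights, 10)` would pick ten candidate NMPs, and the same weights could scale each patch's feature vector before classification.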

Each micro-expression affects a few specific facial muscles. In other words, only some of the AUs are crucial for a micro-expression. In this paper, we use the Entropy-Weight method to determine the location of NMPs. The optimum number and locations of NMPs corresponding to different micro-expressions are shown in Table 6. According to the derived weight values, the subtle muscle movements of micro-expressions are mainly concentrated in the patches of the eyes, the eyebrows, the alar sides, and the mouth. The proposed method chooses as NMPs the 10 patches (R1, R2, R5, R6, R9, R10, R13, R14, R15, R17) that obtain the highest weights in these regions.

Figure 3. Landmarks on the face (68 total) and the cropped face image.
The eyebrows and eyes, for example, are involved in nearly all basic emotions [23]. The morphological characteristics of the eyes and brows are important cues for different micro-expressions. The mouth is another key discriminant area for expression recognition. Here we manually choose a frontal neutral face image as a template and divide the image into 18 ROIs, as shown in Figure 4.

Figure 4. Position of facial active patches.

Figure 5. Different radii and numbers of neighboring points on three planes. (a) XY orthogonal plane; (b) XT orthogonal plane; (c) YT orthogonal plane.

Figure 6. Active facial patches of a micro-expression sequence.

Table 1. Relationships between AUs and facial movements.
Considering that micro-expressions only involve certain local muscle movements and AUs, extracting only a few active facial patches instead of the entire face image is an effective approach to recognition.

Table 2. Relationship between time interpolation model (TIM) length and time and accuracy.

Table 4. Recognition rate of different kernel functions on the CASME II and SMIC databases (%).

Table 5. Accuracy rate and number of NMPs for different feature selection algorithms.

Table 6. NMPs of micro-expressions in the CASME II database.