Human faces are central to our identity, and they are important in expressing emotion. The act of smiling is important in this context, and the exploration of facial changes and dynamics during the act of smiling [1] is an ongoing topic of investigation in orthodontics and prosthodontics, both of which aim to improve the function and appearance of the dentition. Aesthetics (e.g., of smiles [3]) are therefore an important aspect of these fields. Much research into the “science of a smile” also focuses on the effects of aging and biological sex on the shape and appearance [4] and also the dynamics [6] of smiling. Recent investigations have been greatly enhanced by the use of three-dimensional (3D) imaging techniques [9] that allow both static and dynamic imaging of the face. Clinically, this research has led to an improved understanding of orthognathic surgery [9], malocclusion [10], associations between facial morphology and cardiometabolic risk [11], lip shape during speech [12], facial asymmetry [13], and sleep apnea [14] (to name but a few examples). Clearly also, facial simulation is of much interest for human–computer interfaces (see, e.g., [15]). The role of genetic factors in facial shape has also been the subject of much recent attention [18], and many factors (sex, age, and genetics) can affect the shape and dynamics of the face and/or mouth across a set of subjects.
In this article, we wish to explore the question “what’s in a smile?” by using multilevel principal components analysis (mPCA) to adjust for covariates such as natural mouth shape and/or sex. Indeed, the mPCA approach has previously been shown [22] to provide a simple and straightforward method of modeling shape. The mPCA approach is potentially also of much use in active shape models (ASMs) [27] and active appearance models (AAMs) [32] (see also [38]). We remark that one such previous application of mPCA to ASMs related to the segmentation of the human spine [22]. The authors stated that their results showed that “such a modelization offers more flexibility and allows deformations that classical statistical models can simply not generate”. They also noted that “the idea is to decompose the data into a within-individual and a between-individual component”. Hence, we remark again that mPCA provides one method of adjusting for external and/or confounding factors or covariates that can strongly affect shapes (or images). Other recent applications of mPCA [23] allowed us to determine the relative importance of biological sex and ethnicity on facial shape by examination of eigenvalues. Modes of variation made sense because changes in shape were seen to correspond to biological sex and ethnicity at the correct levels of the model, and no ‘mixing’ of these effects was observed. Finally, principal component ‘scores’ also showed strong clustering, again at the correct levels of the mPCA model.
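The decomposition quoted above, into a within-individual and a between-individual component, can be sketched for a simple two-level case as follows. This is a minimal NumPy illustration under our own naming, not the exact formulation of [22]: level covariance matrices are formed from subject means and from deviations about each subject mean, and PCA is then carried out at each level separately.

```python
import numpy as np

def mpca_two_level(X, subject_ids):
    """Two-level PCA sketch: split variation into a between-subject and a
    within-subject component, then eigendecompose the covariance matrix
    at each level.
    X: (n_observations, n_features); subject_ids: length n_observations."""
    subjects = np.unique(subject_ids)
    grand_mean = X.mean(axis=0)

    # Between-subject level: covariance of the subject mean vectors
    subj_means = np.array([X[subject_ids == s].mean(axis=0) for s in subjects])
    cov_between = np.cov(subj_means.T)

    # Within-subject level: deviations of observations about their subject mean
    within = np.vstack([X[subject_ids == s] - X[subject_ids == s].mean(axis=0)
                        for s in subjects])
    cov_within = np.cov(within.T)

    # PCA at each level; eigh returns ascending order, so reverse to descending
    evals_b, evecs_b = np.linalg.eigh(cov_between)
    evals_w, evecs_w = np.linalg.eigh(cov_within)
    return (grand_mean,
            evals_b[::-1], evecs_b[:, ::-1],
            evals_w[::-1], evecs_w[:, ::-1])
```

A three-level model (e.g., sex, subject, expression) extends this pattern by forming one covariance matrix per level from the appropriate group means or residuals.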
Another method that allows the effects of covariates on facial shape to be modeled is bootstrapped response-based imputation modeling (BRIM) [20]. The effects of covariates such as sex and genomic ancestry on facial shape were summarized in [20] as response-based imputed predictor (RIP) variables, and the independent effects of particular alleles on facial features were uncovered. Indeed, the importance of modeling the effects of covariates in images is also becoming increasingly recognized, e.g., in variational auto-encoders (see, e.g., [39]), in which the effects of covariates are modeled as latent variables sandwiched between encoding (convolution) and decoding (deconvolution) layers. However, the topic of variational auto-encoders lies beyond the scope of this article. Linear discriminant functions have also been used previously (see, e.g., [42]) to explore groupings in the subject population for image data.
The work presented here is also an expansion of [26], which extended the mPCA approach from shape data to image data, where a set of (frontal) facial images from a group of 80 Finnish subjects (34 male; 46 female), each with two different facial expressions (smiling and neutral), was considered. A three-level model, illustrated by Figure 1, was constructed that contains biological sex, facial expression, and ‘between-subject variation’ at different levels of the model, and we use this model again here for this dataset. However, we also compare the results of mPCA to those of single-level PCA, which was not carried out in [26]. Furthermore, the dynamic aspects of a smile are considered here in a new (time-series) dataset of 3D mouth shapes captured during all phases of a smile, which was also not considered in [26]. We present firstly the subject characteristics and the details of image capture and preprocessing. We then consider the mathematical detail of the mPCA method. We then present our results for both datasets. Finally, we offer a discussion of these results in the concluding section.
Eigenvalues for both shape and image texture via mPCA are shown in Figure 4 for dataset 1. The mPCA results demonstrated a single non-zero eigenvalue at level 1 (biological sex). A single large eigenvalue at level 3 (facial expression) for mPCA occurs for both shape and image texture, although many non-zero (albeit much smaller) eigenvalues occur for image texture. Level 2 (between-subject variation) via mPCA tends to have the largest number of non-zero eigenvalues for both shape and image texture. The first two eigenvalues are (relatively) large at level 2 of mPCA for image texture. The mPCA results suggest that biological sex seems to be the least important factor for this group of subjects for both shape and texture, although caution needs to be exercised because the rank of the covariance matrix is 1 at this level for both shape and texture. The eigenvalues from single-level PCA are of comparable magnitude to those of mPCA, as one would expect, and they follow a very similar pattern. Inspection of these eigenvalues tells us broadly that facial expression and natural facial shape (not dependent on sex or expression) are strong influences on the facial shapes in this dataset. Biological sex was found to be a comparatively weaker effect, especially for shape, although again caution needs to be exercised in interpreting the eigenvalues at this level for mPCA.
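The rank caution above follows directly from the construction: a level covariance matrix built from only two group means (male and female) has rank 1 whatever the feature dimension, so only one non-zero eigenvalue can occur at that level. A small NumPy demonstration with hypothetical group-mean vectors:

```python
import numpy as np

# With two groups, the deviations of the group means about their overall mean
# sum to zero, so the resulting covariance matrix is a rank-1 outer product.
rng = np.random.default_rng(1)
mean_m = rng.normal(size=8)                 # hypothetical group mean, males
mean_f = rng.normal(size=8)                 # hypothetical group mean, females
overall = (mean_m + mean_f) / 2.0
devs = np.stack([mean_m - overall, mean_f - overall])   # second row = -first row
cov_sex = devs.T @ devs / devs.shape[0]
rank = np.linalg.matrix_rank(cov_sex)
print(rank)  # 1: a single non-zero eigenvalue at this level
```

This is why the magnitude of the level-1 eigenvalue alone should be interpreted with caution: the number of non-zero eigenvalues is capped by the number of groups minus one.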
Modes of variation of shape for dataset 1 are presented in Figure 5. The first mode at level 3 (facial expression) for mPCA and mode 1 in single-level PCA both capture changes in facial expression (i.e., neutral to smiling and vice versa). Changes in mouth shape that relate clearly to the act of smiling can be seen in both graphs of Figure 5. For example, obvious effects such as widening of the mouth, slight raising of the corners of the mouth, and exposure of the teeth can be seen clearly. However, subtle effects such as narrowing of the eyes [44] and a slight widening at the bottom of the nose during a smile are also seen clearly, especially for mPCA. The eyes become (relatively) further apart and the mouth becomes wider for the first mode at level 1 (biological sex) for mPCA in Figure 5. All shapes have been scaled so that the average point-to-centroid distance is equal to 1, and so this result makes sense because men generally have thinner faces than women on average [45]. This first mode via mPCA at level 1 probably corresponds to mode 3 or mode 2 (or a combination of both) in single-level PCA. However, modes 2 and 3 from single-level PCA are harder to interpret, and one can never preclude the possibility of mixing of different influences (e.g., sex, expression, etc.) in the modes of single-level PCA. By contrast, mPCA should focus more clearly on individual influences because they are modeled at different levels of the mPCA model. The first mode at level 2 (between-subject variation) for mPCA in Figure 5 (middle row) corresponds (presumably) to changes in the relative thinness/width of the face that can occur irrespective of sex.
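The scaling and mode-visualization steps described above can be sketched as follows. These are our own minimal helper functions, assuming landmarks are stored as an `(n_points, dims)` array and that modes are flattened coordinate vectors with column-wise eigenvectors:

```python
import numpy as np

def scale_to_unit_centroid_distance(points):
    """Center a landmark configuration (n_points, dims) and scale it so the
    average point-to-centroid distance equals 1, as for the shapes here."""
    centered = points - points.mean(axis=0)
    mean_dist = np.linalg.norm(centered, axis=1).mean()
    return centered / mean_dist

def mode_shapes(mean_shape, eigvec, eigval, k=1.0):
    """Render a mode of variation as mean -/+ k standard deviations (sqrt of
    the eigenvalue) along the eigenvector, for plotting as in Figure 5."""
    sd = np.sqrt(eigval)
    return mean_shape - k * sd * eigvec, mean_shape + k * sd * eigvec
```

Plotting the two returned shapes against the mean shows the deformation that each mode encodes (e.g., neutral-to-smiling at the expression level).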
Modes of variation of image texture for dataset 1 are presented in Figure 6. The first modes at each level are relatively straightforward to understand for mPCA. We see that mode 1 at level 1 (biological sex) for mPCA does indeed correspond to changes in appearance due to biological sex (e.g., females tend to have more prominent eyes and cheeks [45]), as required. Mode 1 at level 2 (between-subject) for mPCA corresponds (possibly) to residual changes due to left/right position and also to illumination, although this mode is slightly harder to interpret. Mode 1 at level 3 (facial expression) for mPCA corresponds to changes due to the act of smiling (i.e., mean − SD = not smiling, mean = half smile, and mean + SD = full smile). We see clear evidence of a smile that exposes the teeth in this mode. Furthermore, subtle effects are seen for mPCA at this level, such as increased prominence of the cheeks, increased nose width, and narrowing of the eyes [44]. The first three modes for single-level PCA are also relatively straightforward to interpret, although arguably less so than the first mode at each level of mPCA. For example, mode 1 possibly corresponds to residual changes in illumination and/or slight changes to the nose and the prominence of the cheeks, which might be associated with biological sex [45]. Modes 2 and 3 correspond clearly to changes due to the act of smiling.
Results for the standardized component ‘scores’ for mPCA for shape are shown in Figure 7. Results for component 1 for single-level PCA clearly demonstrate differences due to facial expression because the centroids are strongly separated between smiling and neutral expressions. By contrast, component 2 for single-level PCA does not seem to reflect changes due to either facial expression or biological sex very strongly. Finally, component 3 for single-level PCA reflects differences due to biological sex (albeit mildly), as there is some separation of the centroids between males and females. The centroids in Figure 7 at level 1 (biological sex) for mPCA are strongly separated by biological sex, although not by facial expression. Hence, strong clustering by biological sex (alone) is observed for shape at level 1 (biological sex) for mPCA. The centroids in Figure 7 at level 3 (facial expression) for mPCA are strongly separated by facial expression (neutral, smiling), although not by biological sex. Strong clustering by facial expression (alone) is therefore observed at level 3 (facial expression) for mPCA, also as required. Strong clustering by facial expression or biological sex is not seen at level 2 (between-subject variation) for mPCA (not shown here), i.e., all centroids by biological sex and facial expression coincide at the origin.
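The standardized scores and group centroids discussed here can be computed as below. This is a sketch under our own naming: observations are centered, projected onto the (column-wise) eigenvectors of a given level, and divided by the standard deviation along each mode; centroids are then the per-group means of these scores.

```python
import numpy as np

def standardized_scores(X, mean_vec, eigvecs, eigvals, n_components=3):
    """Project centered observations onto the leading eigenvectors of one
    level and standardize by sqrt(eigenvalue), giving unit-variance scores."""
    scores = (X - mean_vec) @ eigvecs[:, :n_components]
    return scores / np.sqrt(eigvals[:n_components])

def group_centroids(scores, labels):
    """Centroid (mean score vector) per group, e.g., by sex or expression."""
    labels = np.asarray(labels)
    return {g: scores[labels == g].mean(axis=0) for g in np.unique(labels)}
```

Separation of the centroids returned by `group_centroids` (by sex at level 1, by expression at level 3) is what the clustering statements above refer to.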
Results for the standardized component ‘scores’ for mPCA for image texture are shown in Figure 8. Component 1 for single-level PCA reflects differences due to biological sex, and components 2 and 3 reflect changes due to facial expression. Again, level 1 for mPCA clearly reflects differences by biological sex and level 3 reflects differences by expression (neutral or smiling). Again, strong clustering by facial expression or biological sex is not seen at level 2 (between-subject variation) for mPCA for image texture (not shown here), i.e., all centroids by biological sex and facial expression coincide at the origin.
Eigenvalues via single-level PCA and mPCA for dataset 2 (‘dynamic’ 3D shape data) are shown in Figure 9. A single large eigenvalue at level 2 (facial expression/smiling) occurs for mPCA, although the subsequent eigenvalues are relatively larger than those in the analysis of shapes for dataset 1. This result is to be expected because all phases of a smile are captured in the 3D video shape data of dataset 2. Variation at level 1 (between-subject variation) for mPCA tends to have larger eigenvalues than levels 2 and 3. Level 3 eigenvalues via mPCA are found to be minimal, thus indicating that residual variations within each smile phase are small. Finally, the results of single-level PCA are of similar magnitude and follow a similar pattern to the eigenvalues from mPCA.
Modes of variation of shape for dataset 2 are presented in Figure 10. Results for the first mode at level 1 (between-subject variation) via mPCA in the coronal plane correspond to changes between upturned and downturned lip shapes, which is consistent with changes due to subjects’ natural lip shapes. Results for the first mode at level 1 via mPCA in the transverse plane appear to indicate changes in the prominence of the upper and lower lips. By contrast, results for the first mode of variation at level 2 (i.e., the between-smile-phases level) for mPCA correspond to increased mouth size and a strong drawing back (and slight upturn) of the corners of the mouth, which is consistent with the act of smiling. Mode 1 from single-level PCA is broadly similar to mode 1 at level 1 (between-subject variation) via mPCA, whereas mode 2 from single-level PCA is broadly similar to mode 1 at level 2 (between smile phases) via mPCA. Results for the modes of variation via single-level PCA for dataset 2 are therefore not presented here.
Standardized component scores from both single-level PCA and mPCA at level 2 (variation between smile phases) with respect to shape for dataset 2 are shown in Figure 11. Very little difference between centroids divided by smile phase is seen at levels 1 or 3 for mPCA (not shown here). The centroids of the component scores at level 2 via mPCA are clearly separated in Figure 11 with respect to the seven phases of a smile (i.e., rest pre-smile, onset 1 (acceleration), onset 2 (deceleration), apex, offset 1 (acceleration), offset 2 (deceleration), and rest post-smile). Indeed, we see clear evidence in Figure 11 of a cycle in these centroids over all of these smile phases for both single-level PCA and level 2 of mPCA. These results are strong evidence that seven phases of a smile do indeed exist.
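The cycle of centroids over the seven phases can be traced with a simple helper. This is a hypothetical sketch: `scores` would be the level-2 standardized scores and `phase_labels` the per-frame phase annotation; returning the centroids in temporal order lets the cycle be plotted directly.

```python
import numpy as np

# The seven smile phases, in temporal order
PHASES = ["rest pre-smile", "onset 1", "onset 2", "apex",
          "offset 1", "offset 2", "rest post-smile"]

def phase_centroids(scores, phase_labels):
    """Centroid of component scores for each smile phase, returned in
    temporal order so the cycle over the phases can be traced."""
    phase_labels = np.asarray(phase_labels)
    return np.array([scores[phase_labels == p].mean(axis=0) for p in PHASES])
```

Joining consecutive centroids (and closing rest post-smile back to rest pre-smile) reveals the cyclic structure seen in Figure 11.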
We have shown in this article that mPCA provides a viable method of accounting for groupings in our subject population and/or of adjusting for potential confounding covariates. For example, natural face or lip shape was represented at one level of the mPCA model and shape changes due to the act of smiling at another level (or levels) of the model. By representing these different sources of variation at different levels of the model, we were able to isolate those changes in expression due to smiling that are consistent over the entire population much more effectively than with single-level PCA. All results were found to agree with those of single-level PCA, although the mPCA results were (arguably) easier to interpret.
For the first dataset considered here, which contained two ‘expressions’ per subject (neutral or smiling), both obvious effects (widening of the mouth, slight raising of the corners of the mouth, exposure of the teeth, and increased prominence of the cheeks) and subtle effects (narrowing of the eyes and a slight widening at the bottom of the nose during a smile) were detected in the major modes of variation at the facial-expression level of the mPCA model. Inspection of the eigenvalues suggested that facial expression and ‘between-subject’ effects were strong influences on shape and image texture, whereas biological sex was a weaker effect, especially for shape. Indeed, another study [24] noted that sexual dimorphism was weakest for a Finnish population in comparison to other ethnicities (i.e., English, Welsh, and Croatian). Furthermore, the first major mode for shape showed clearly that males have longer/thinner faces on average than females [45], at an appropriate level of the mPCA model. Changes in image texture also clearly corresponded to biological sex, again at an appropriate level of the mPCA model. Model fits gave standardized scores for each principal component/mode of variation that showed strong clustering for both shape and texture by biological sex and facial expression, also at appropriate levels of the model. mPCA therefore correctly decomposes sources of variation due to biological sex and facial expression (etc.). These results are an excellent initial test of the usefulness of mPCA in terms of modeling either shape or image texture.
For the second dataset, which contained 3D time-series shape data, the major modes of variation via mPCA were seen to correspond to the act of smiling. Inspection of the eigenvalues again showed that both ‘natural lip shape’ and facial expression are strong sources of shape variation, as one would expect. Previous studies of 3D facial shape changes have posited that there are three phases to a smile [8], namely, onset, apex, and offset. However, if one includes the rests pre- and post-smile, standardized component scores from both mPCA (at the appropriate level of the model) and single-level PCA demonstrated clear evidence of a cycle containing seven phases of a smile: rest pre-smile, onset 1 (acceleration), onset 2 (deceleration), apex, offset 1 (acceleration), offset 2 (deceleration), and rest post-smile. This is strong evidence that seven phases of a smile do indeed exist, and it is another excellent test of the mPCA method, now also for dynamic 3D shape data.
Future research will focus on modeling the effects of ethnicity, gender, age, genetic information, or disease (e.g., effects perhaps previously hidden in the ‘final 5% of variation’) on facial shape or appearance. The present study has not considered the effects of “outliers” in the shape or image data. Clearly, outliers (either isolated points, subjects, or even entire groups of subjects) might strongly affect the mean averages used to estimate the centroids of groups and also the covariance matrices at the various levels of the model. The simplest method of addressing this problem is to use robust centroid and covariance matrix estimation [47] and then to carry out PCA as normal at each level. Note that robust covariance matrix estimation is included in MATLAB (2017a) and so this may be implemented easily, although a sample size of at least twice the length of the feature vector z is required. Furthermore, the mPCA method uses averages of covariance matrices (e.g., over all subjects in the population or over specific subgroups), and robust averaging of these matrices might also be beneficial. Clearly also, other forms of robust PCA [50] and M[ ] might be used to deal with the problem of outliers. Finally, future research will attempt to extend existing single-level probabilistic methods of modeling shape and/or appearance (e.g., mixture models [30] and extensions of Bayesian methods used in ASMs or AAMs [56]) to multilevel formulations and to active learning [58]. The use of schematics such as Figure 1 will hopefully prove just as useful in visualizing these models as it has for mPCA.
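As a concrete (if simplistic) example of the robust estimation suggested above, one could trim the observations farthest from the coordinate-wise median before computing the centroid and covariance at a given level. This NumPy sketch is only an illustrative stand-in for MATLAB's robust covariance estimation, not the method of [47], and the synthetic data are our own:

```python
import numpy as np

def trimmed_mean_cov(X, trim_frac=0.1):
    """Crude robust estimate: drop the trim_frac of observations farthest
    from the coordinate-wise median, then use the ordinary mean/covariance
    of the remainder."""
    med = np.median(X, axis=0)
    dist = np.linalg.norm(X - med, axis=1)
    keep = dist <= np.quantile(dist, 1.0 - trim_frac)
    return X[keep].mean(axis=0), np.cov(X[keep].T)

# Synthetic demonstration: a few gross outliers pull the plain mean away
# from the true centroid (the origin), while the trimmed estimate stays close.
rng = np.random.default_rng(5)
X = rng.normal(size=(200, 5))
X[:5] += 25.0                          # inject gross outliers
robust_mean, robust_cov = trimmed_mean_cov(X)
# PCA "as normal" at this level would then use robust_cov
level_eigvals = np.linalg.eigvalsh(robust_cov)[::-1]
```

More principled alternatives (e.g., minimum-covariance-determinant estimators) follow the same pattern: estimate a robust centroid and covariance per level, then eigendecompose as usual.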