Facial Expressions and Self-Reported Emotions When Viewing Nature Images

Many studies have demonstrated that exposure to simulated natural scenes has positive effects on emotions and reduces stress. In the present study, we investigated emotional facial expressions while participants viewed images of various types of natural environments. Both automated facial expression analysis performed with iMotions' AFFDEX 8.1 software (iMotions, Copenhagen, Denmark) and self-reported emotions were analyzed. Attractive and unattractive natural images were used, representing either open or closed natural environments. The goal was to better understand which features and characteristics of natural scenes positively affect emotional states and to evaluate face-reading technology as a means of measuring such effects. It was predicted that attractive natural scenes would evoke significantly higher levels of positive emotions than unattractive scenes. The results showed generally small magnitudes of emotional facial expressions while participants observed the images. The facial expression of joy was significantly higher than the other registered emotions. Contrary to predictions, there was no difference in facial emotions between viewing attractive and unattractive scenes. However, the self-reported emotions evoked by the images showed significantly larger differences between specific categories of images, in accordance with the predictions. The discrepancy between the registered emotional facial expressions and the self-reported emotions suggests that the participants more likely described the images in terms of common stereotypes linked with the beauty of natural environments. This result may be an important finding for further methodological considerations.


Introduction
A large body of research has consistently documented that people react to viewing natural environments with positive emotions. Moreover, research has also shown that exposure to simulated, surrogate nature has similar positive effects [1]. For various reasons, many people do not often have the opportunity to spend free time in natural environments. Thus, they may wish to relax for a short time during work, where some form of nature, even in a virtual medium, may help substitute for the experience of an actual natural environment. It is therefore useful to expand our knowledge of emotional reactions to simulated nature and its restorative potential. Research in this area has employed various methods, from analyses of self-reported assessments of perceived emotions to measurements of various physiological reactions. The present study analyzed changes in emotional facial expression, using machine vision software for automated facial expression analysis while participants viewed images representing various types of natural environments, and compared the results with self-reported emotions.

Positive Effects of Viewing Surrogate Nature
Over several decades, research has documented the health-improving and restorative effects of direct contact with the natural environment (for a review, see [2][3][4]). Research has also addressed the question of whether exposure to simulated natural scenes may have similar positive effects. It has been determined that viewing natural images can improve mood and the level of perceived restoration [5][6][7][8][9]. Exposure to videos of natural environments can also improve mood and the level of perceived restoration while also reducing stress (e.g., [10][11][12][13][14][15]). Recent research reported similar effects from viewing nature scenes in virtual reality (e.g., [16,17]).

Emotional Facial Expressions
The roots of facial emotion research were established in the 19th century by Darwin [27], who argued that all humans show emotion in the face and body through remarkably similar behaviors. Darwin also conducted the first detailed study of the muscle actions involved in emotions. Within the evolutionist framework, the next step in the research of facial emotional expressions was carried out by Ekman [28]. Through a series of investigations, he found high agreement across members of diverse Western and Eastern cultures on selecting emotional labels that fit facial expressions. Ekman [28] defined six basic emotions that should be common to all cultures: anger, disgust, fear, happiness, sadness, and surprise. He believed that they could be easily recognized in facial expressions. This concept of six basic emotions has preferentially been used in analyses of facial expressions of emotions. However, more recent research has questioned the cultural universality hypothesis by showing that Easterners, in contrast to Westerners, represent emotional intensity with distinctive dynamic eye activity (e.g., [29]).
EEG and brain imaging studies demonstrate brain specificity in the judgment of discrete emotions. The perception of fearful faces activates the region of the left amygdala (e.g., [30]), and the perception of sad faces activates the left amygdala and right temporal lobe (e.g., [31]). The perception of angry faces results in activation of the right orbito-frontal cortex and cingulate cortex (e.g., [32]). The perception of disgusted faces activates the basal ganglia, anterior insula, and frontal lobes (e.g., [31]). Duchenne smiles activate the left lateral frontal, midfrontal, anterior temporal, and central anterior scalp regions (e.g., [33]).
A further important point is the coherence between emotion and facial expression. Research on facial emotional expressions implicitly assumes that facial expressions co-occur with emotion. Recent research evidence (for a summary, see [34]) suggests high coherence between amusement and smiling and low to moderate coherence between other positive emotions and smiling. Surprise and disgust are accompanied by their "traditional" facial expressions. However, evidence concerning sadness, anger, and fear is still limited. For sadness, high emotion-expression coherence may exist in specific situations, whereas for anger and fear, the evidence points to low coherence. Thus, further research in this field is still needed. Some authors [35] suggest an urgent need for research that examines how people move their faces to express emotions in the variety of contexts that make up everyday life. Facial emotional expressions in reaction to natural scenes constitute one such context, and it has been investigated only rarely.

Measurement of Emotional Facial Expressions
Apart from the facial action coding system, which is based on the subjective identification of six basic emotions in video-recorded faces [28], two other methods are used: facial electromyography and automatic computer facial expression analysis. Facial electromyography is based on monitoring the activation of facial muscles during changes in the emotional response. It requires the application of electrodes to the skin surface. It enables the identification of the specific facial muscle patterns used to display emotional experiences, such as joy, appetite, and disgust (e.g., [36]). This technique allows for the detection of subtle facial muscle activity, but its disadvantage is its technical complexity. Moreover, having electrodes attached to the face is far from naturalistic conditions.
In contrast, automated facial expression analysis by machine vision software eliminates these disadvantages [37]. These techniques have improved considerably, and their results are comparable to those of the technically demanding facial electromyography (e.g., [38,39]). To date, automated facial expression recognition has rarely been used in environmental psychology research. In a series of experiments conducted by Wei and his colleagues [25,26,40,41], participants were asked to take selfies or were photographed while walking on urban streets or in a forest park. The photographs were analyzed using FireFACE software. In general, it was shown that, compared to the urban environment, the forest experience evoked higher happiness scores but lower neutrality scores.

AFFDEX Software for Automatic Computer Facial Expression Analysis
The current study employed iMotions' AFFDEX, commercial software designed for the recognition of facial emotions [42]. The software is based on frame-by-frame analysis of static images or videos. Typically, it is possible to achieve frame rates of 30 frames per second on laptop/desktop devices. It works in three steps: face detection, facial landmark detection and registration, and facial expression and emotion classification. In the first step, the position of a face is found in an image (using the same kind of technology as, for instance, iPhone or Android smartphones). In the next step, facial landmarks such as the eyes and eye corners, brows, mouth corners, and nose tip are detected. After this, an internal face model is created. The face model is a simplified version of the respondent's actual face; however, it contains exactly the facial features needed for the task. Examples of such features are single landmark points (eyebrow corners, mouth corners, nose tip) and feature groups (the entire mouth, the entire arch of the eyebrows, etc.). Finally, once the simplified face model is available, the position and orientation information of all the key features is fed as input into classification algorithms, which translate the features into specific action units (specific movements of facial muscles). The recognition of emotion expressions (anger, disgust, fear, joy, sadness, surprise, and contempt) is based on combinations of these facial actions. This coding was built on Ekman's emotional facial action coding system [28]. Automated facial expression analysis generates numeric scores for facial expressions, action units, and emotions along with a degree of confidence. As the facial expression or emotion occurs and/or intensifies, the confidence score rises from 0 (no expression) to 100 (expression fully present). The software was tested in various preliminary explorations conducted by the Affectiva company [43] and in several independent research studies [44][45][46][47]. These studies seem to confirm that the software reliably recognizes basic, subtle emotional facial expressions for standardized images when participants do not intend to conceal their facial reactions; notably, the software demonstrates similar precision to the facial action coding system and facial electromyography.
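To make the three-step pipeline concrete, the skeleton below illustrates its structure in Python. Only the face-detection step uses a real library call (a standard OpenCV Haar cascade as a stand-in for AFFDEX's proprietary detector); the landmark and classification steps are deliberately left as hypothetical placeholder functions returning dummy values, since AFFDEX's internal models are not public.

```python
import cv2  # OpenCV, used here only as a stand-in face detector

# Step 1: face detection. AFFDEX uses its own proprietary detector; a
# standard OpenCV Haar cascade serves as an illustrative substitute.
face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_landmarks(face_roi):
    """Hypothetical step 2: locate eyes, eye corners, brows, mouth
    corners, nose tip, etc. (AFFDEX's landmark model is proprietary)."""
    return {"nose_tip": (0, 0)}  # dummy placeholder output

def classify_action_units(landmarks):
    """Hypothetical step 3a: translate landmark geometry into facial
    action units (specific facial-muscle movements)."""
    return {"lip_corner_puller": 0.0}  # dummy placeholder output

def score_emotions(action_units):
    """Hypothetical step 3b: combine action units into 0-100 confidence
    scores for the seven emotions AFFDEX reports."""
    return {e: 0.0 for e in ("joy", "anger", "surprise", "fear",
                             "contempt", "sadness", "disgust")}

def analyze_frame(frame):
    """Run the three-step pipeline on one video frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_detector.detectMultiScale(gray, scaleFactor=1.1,
                                           minNeighbors=5)
    return [score_emotions(classify_action_units(
                detect_landmarks(gray[y:y + h, x:x + w])))
            for (x, y, w, h) in faces]
```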

The Current Study
In our previous study [48], facial expressions were analyzed with iMotions' AFFDEX software under laboratory conditions while participants viewed images of forest trees with foliage, forest trees without foliage, and urban scenes. Although it was assumed that the natural images would evoke a higher magnitude of positive emotions and a lower magnitude of negative emotions in facial expressions than the urban images, the results showed very low magnitudes of emotional facial responses; moreover, the differences within both types of natural images were not significant, nor were the differences between natural and urban images. This was explained by the fact that the images represented an ordinary deciduous forest and urban streets, while in the main body of research, mostly attractive natural environments have been employed. This is consistent with Joye and Bolderdijk's findings [49]: they observed that watching awesome natural scenes had pronounced emotional effects, whereas viewing mundane natural scenes generated a low emotional effect.
To further explore the possibilities of automated emotional facial expression recognition in environmental psychology research, the present study employed both attractive and unattractive "mundane" natural scenes and explored their effect on facial expressions. The measurement of facial expressions by iMotions' AFFDEX software, designed for the recognition of facial emotions, was combined with self-reported descriptions of the emotions evoked by the images. Moreover, the effects of viewing open and closed scenes were explored. It has been found that closed natural scenes may evoke the perception of danger and fear (e.g., [50][51][52]), in contrast to openness, which promotes visibility, a predictor of security in prospect-refuge theory [53]. This effect is more pronounced in females (e.g., [50]) because they report more fear of potential crime than males. On the other hand, some studies have shown that females may display more positive emotional expressions in natural environments than males (e.g., [51,54]).
To summarize, the goals of the present study were to further understand the actual features and characteristics of natural scenes that could positively affect emotional states and to evaluate face reading technology to measure such effects. We predicted that attractive natural scenes may evoke significantly higher levels of positive emotions than unattractive scenes, and open scenes may evoke significantly higher levels of positive emotions than closed scenes. Moreover, we predicted consistency between emotional facial expressions and self-reported emotions.

Participants
Fifty-one undergraduates participated in the experiment. The sample comprised young adults between the ages of 19 and 25 (mean = 20.9, SD = 1.28; 29 males, 22 females). The participants were first-, second-, or third-year students of informatics, financial management, or tourism at the University of Hradec Králové, enrolled in various psychology courses. A power analysis was performed using G*Power software Version 3.1.9.7 (University of Kiel, Kiel, Germany) [55]. The study was designed to be sensitive to medium-sized effects, in accordance with prior research examining the effects of attractive and unattractive nature [49]. The analysis revealed that a sample size of 34 participants would be sufficient to detect significant differences (effect size f = 0.25, α = 0.05, statistical power = 0.80).
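As a rough cross-check of this figure without G*Power, the Monte Carlo sketch below reduces the design to its simplest within-subject contrast: two conditions and a standardized paired effect of dz = 0.5, which corresponds to f = 0.25 under G*Power's default correlation of 0.5 between repeated measures. The simulation and all names in it are our illustration, not part of the original analysis.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)

def simulated_power(n, dz=0.5, alpha=0.05, n_sims=10_000):
    """Estimate paired t-test power for n participants and a
    standardized within-subject effect size dz."""
    hits = 0
    for _ in range(n_sims):
        # Difference scores with mean dz and unit standard deviation.
        diffs = rng.normal(loc=dz, scale=1.0, size=n)
        result = stats.ttest_1samp(diffs, popmean=0.0)
        hits += result.pvalue < alpha
    return hits / n_sims

print(simulated_power(34))  # ~0.80, consistent with the reported sample size
```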

Stimulus Material
Twenty images were presented in one experimental session: five images of attractive open natural environments, five of attractive closed natural environments, five of unattractive open natural environments, and five of unattractive closed natural environments. In determining the differences between attractive and unattractive images, we followed the procedure used in Joye and Bolderdijk's study [49]. They collected pictures of attractive, awesome nature from the internet, consisting of grand and dramatic mountain scenes, while their images of mundane nature, taken by the author, consisted of photographs of everyday natural elements chosen to eliminate any "powerful" natural elements that might trigger awe. Similarly, in our study, the attractive images were found on the Pixabay internet server, which shares copyright-free images (https://pixabay.com/ (accessed on 1 November 2020)), and the unattractive images were taken by one of the authors (Figure 1). The attractive open images were scenes of high mountains, lakes, or a coast, which were fundamentally different from the common landscape of the Czech Republic, where the participants were living. The unattractive open scenes depicted the common Czech landscape. The attractive closed scenes were also taken from outside the Czech Republic and downloaded from the Pixabay internet server, while the unattractive closed scenes were obtained from the common environment of Czech forests. The attractive scenes were professional photographs taken with the aim of achieving the greatest possible visual effect, given the choice of lighting, specific atmospheric conditions, and further digital adjustments, while the photos of unattractive scenes were taken in summer on mildly sunny days under normal lighting conditions, without any further adjustment. The intention was to capture how the scenes appear under common everyday conditions.
Although the perception of the attractiveness of natural scenes may be to some extent culturally dependent, our sample was culturally homogeneous (all participants were born and living in northeastern Czechia); thus, one may suppose that the selected images carried a similar meaning for all participants in terms of attractiveness/unattractiveness and openness/closedness.

Apparatus
The experiment was controlled by a PC with a 61 cm (diagonal) screen at a resolution of 1920 × 1200 pixels and a Logitech Webcam C920 camera situated on top of the screen. Stimulus presentation, camera recording, and data processing were controlled by the iMotions 8.1 software (iMotions, Copenhagen, Denmark). The facial expression analysis was performed by the iMotions Facial Expression Analysis Module AFFDEX. The web camera recorded facial videos while the participants viewed the stimuli, and the videos were then imported into the iMotions software for facial expression analysis in postprocessing. AFFDEX enables the measurement of seven emotional categories: joy, anger, surprise, fear, contempt, sadness, and disgust. Moreover, AFFDEX also provides measurements of two involvement indicators: engagement and valence. All emotional indicators were scored by the software on a scale from 0 to 100, indicating the probability of having detected the emotion. A magnitude of 0 indicated that the emotion was absent; in turn, a magnitude of 100 indicated a 100% probability of having detected the emotion.

Procedure
The experimental session consisted of two parts. In the first part, the participants viewed the images while the facial emotional movements were recorded. In the second part of the experiment, the participants viewed the same images and were asked to report emotions evoked by the particular images.
After arrival, the participants signed the informed consent form. Next, they underwent the first part of the experiment. They began by reading the instructions for the first part, which were as follows: "You will see 20 images during this experiment. The images show certain natural environments. Try to imagine that you are in this natural environment right now and look closely at the picture. Each image will be displayed for 15 s. During the time you are viewing the image, the camera monitors your face." The images were presented in random order. Every trial started with a fixation cross situated in the center of the screen on a gray background. The participants had to fixate on the cross for 2 s before the image appeared. Each image was displayed for 15 s.
After the participants completed the first part of the experiment, they began the second part. The following instruction appeared on the computer screen: "You will see the pictures you saw during the previous research session again. Take a look at them again, try to imagine that you are in this natural environment right now, and describe your perceptions and feelings by the statements, which will be given under each picture".

Measures
AFFDEX registered the probability of having detected facial emotions related to joy, anger, surprise, fear, contempt, sadness, and disgust and evaluated the involvement indicators engagement (the emotional responsiveness that the stimuli trigger) and valence (the positive or negative nature of the experience). A questionnaire used in the second part of the experiment registered four basic emotions, specifically joy, surprise, fear, and sadness. We did not ask about anger and contempt because we assumed that natural environments would not provoke these negative emotions. The item "I feel happy here" was related to the emotion joy, the item "I feel amazed here" was related to the emotion surprise, the item "This place scares me a little" was related to the emotion fear, and finally, the item "This is a pretty sad place" was related to sadness. The item "I like this place" was related to liking the environment. Each item was assessed on a five-point scale (1 = not at all, 5 = completely). The questionnaire was presented to the participants on a computer screen.
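For illustration, the self-report responses can be organized for analysis as in the minimal sketch below; the item wording is taken verbatim from the questionnaire above, while the variable and function names are our own.

```python
# Mapping of questionnaire items (verbatim from the study) to the
# emotion or evaluation they index; responses are on a 1-5 scale.
ITEM_TO_SCALE = {
    "I feel happy here": "joy",
    "I feel amazed here": "surprise",
    "This place scares me a little": "fear",
    "This is a pretty sad place": "sadness",
    "I like this place": "liking",
}

def score_response(item: str, rating: int) -> tuple[str, int]:
    """Validate one questionnaire response and label it with its scale."""
    if not 1 <= rating <= 5:
        raise ValueError("ratings range from 1 (not at all) to 5 (completely)")
    return ITEM_TO_SCALE[item], rating
```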

Results
AFFDEX offers various data outputs. One possibility, suitable for the purposes of the current research, is to work with the raw data, which express the intensity of the facial emotions or the involvement indicators on a scale from 0 to 100 in time windows of approximately 33 ms (one frame at 30 frames per second). Thus, the raw data were exported first. With a measurement frequency of 30 frames per second, approximately 450 measurements were obtained per image presented for 15 s, and approximately 2250 measurements were obtained per participant within one image category (attractive open images, attractive closed images, unattractive open images, unattractive closed images). Next, the mean intensities of the specific emotions and of the involvement indicators were calculated for each participant and each image category; these results were then averaged across the image categories (Table 1 and Figure 2).
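A minimal sketch of this aggregation step is shown below, assuming the raw export forms a tidy table with one row per frame; the file and column names are our own, and the actual AFFDEX export schema may differ.

```python
import pandas as pd

# Assumed tidy export: one row per ~33 ms frame, with columns
# participant, category (one of the four image categories), image,
# and one intensity column (0-100) per emotion or indicator.
raw = pd.read_csv("affdex_raw_export.csv")  # hypothetical file name

emotions = ["joy", "anger", "surprise", "fear", "contempt",
            "sadness", "disgust", "engagement", "valence"]

# Mean intensity per participant within each image category
# (~2250 frames per participant and category collapse to one row).
per_participant = raw.groupby(["participant", "category"])[emotions].mean()

# Grand means per category, averaged across participants (cf. Table 1).
print(per_participant.groupby(level="category").mean().round(2))
```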

Analysis of Emotional Facial Expressions
Three-way mixed ANOVAs were conducted to assess the effects of attractiveness, openness, and gender on emotional facial expressions. Attractiveness (attractive vs. unattractive), openness (open vs. closed), and gender (male vs. female) served as predictors, and each emotional facial expression or involvement indicator served as the dependent variable.
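The paper does not state which statistical software was used; as an illustration, a mixed design of this kind can be approximated in Python with a linear mixed model, as in the sketch below. The synthetic data, the data-frame layout (one row per participant and image category, matching the aggregation sketch above), and all names are our assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data: one row per participant x image category,
# mirroring the aggregated AFFDEX means (a real analysis would use the
# per-participant table from the aggregation sketch plus gender info).
rng = np.random.default_rng(0)
rows = [
    dict(participant=p, gender=g, attractiveness=a, openness=o,
         joy=rng.gamma(shape=2.0, scale=0.7))
    for p, g in zip(range(51), ["m", "f"] * 26)
    for a in ("attractive", "unattractive")
    for o in ("open", "closed")
]
df = pd.DataFrame(rows)

# A random intercept per participant captures the repeated-measures
# structure of the mixed design.
model = smf.mixedlm("joy ~ attractiveness * openness * gender", data=df,
                    groups="participant")
print(model.fit().summary())  # fixed effects approximate the ANOVA terms
```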

Discussion
The present study explored whether attractive natural scenes evoke more positive facial emotional expressions than unattractive scenes and whether self-reported emotions are linked with objective measures of emotional facial expressions. AFFDEX software was used for the automatic registration of emotional facial expressions. Our results showed generally small magnitudes of emotional facial expressions while participants observed the images. The facial expression of joy was higher than the other registered emotions. Clearly, it is not surprising that natural environments evoke positive rather than negative emotions. However, contrary to our assumptions, there was no difference in facial emotions between viewing attractive and unattractive scenes or between open and closed scenes. Thus, even attractive natural images did not generally evoke strong immediate emotional reactions.
To date, there is a lack of data from studies within the field of environmental psychology using a similar research methodology. Thus, we compared our results with data from recent studies conducted in the field of advertisement assessment research. In the study by Otamendi and Sutil [56], the participants viewed an advertisement lasting 91 s that consisted of 31 scenes. The advertisement showed the accompanying role that a mother plays throughout the life of a child, from birth to adulthood. The data were processed by the same AFFDEX software that was used in our study. The authors also reported small values for specific emotions, the highest being for joy (mean = 4.82), with smaller values for the other emotions (means between 0.42 and 1.12). Only in the target group (mature-aged women) did higher emotional reactions occur (mean for joy = 14.17). A further study [57] explored the emotional reactions evoked by two types of drinks; the mean magnitudes of the expressions of joy were 0.56 and 0.97, respectively. Even at such low magnitudes, the differences were statistically significant. Furthermore, in our previous study [48], where ordinary natural scenes were used, very small mean magnitudes of the emotional expression of joy, ranging from 0.10 to 0.14, were registered while participants viewed natural images, whereas the mean values of joy ranged from approximately 1.20 to 1.80 across individual participants in the present study. However, this increase in the appearance of joy in facial expressions was probably not caused by the attractiveness of the images, because the mean values of joy for attractive and unattractive images were almost equal. We suppose that the increase in positive facial emotions was due to the instructions given to the participants: we not only asked them to view the images, as in the previous study [48], but also encouraged them to imagine being immersed in the specific environment.
One explanation for the very low magnitude of facial expressions produced during the observation of natural stimuli could be that software for automated facial expression analysis cannot integrate contextual information into emotion recognition. Given that, from an evolutionary perspective, facial expressions have an intrinsic communicative function [27,28], there might be a difference between the emotional facial expressions of individuals sitting alone in a laboratory and those of individuals who want to share their emotional experience with someone else. It seems that the studies by Wei and his colleagues [25,26,40,41] identified significant differences between emotional facial expressions in natural and urban environments because they analyzed facial expressions in selfies, which are usually taken to share one's emotional experience of a given place and situation with friends.
In contrast to registered emotional facial expressions, the self-reported emotions evoked by the images showed significantly larger differences between specific categories of images. As predicted, the attractive images evoked a higher level of positive emotions than unattractive images. Conversely, unattractive images evoked higher levels of negative emotions than attractive images. Importantly, we found a combined effect between the attractiveness of images and their openness; specifically, attractive and open environments evoked higher self-reported joy and surprise than attractive closed environments. Conversely, attractive and closed environments evoked higher self-reported fear and sadness than attractive open environments. Moreover, in closed attractive environments, females reported higher fear than in open attractive environments, which is in accordance with a large body of environmental psychology research showing that females may fear being attacked in closed natural environments (e.g., [51,54]).
Thus, the results showed that there was a difference between the emotional facial expressions directly evoked by the images and the subjective statements about the emotions evoked by the same images. Strong self-reported emotions might simply reflect the fact that people are accustomed to describing natural environments in a positive way. Thus, it might be that the participants did not describe the full intensity of their actual emotions; rather, they more likely described the images in terms of common stereotypes linked with the beauty of natural environments. This might be an important finding for further methodological considerations.
Furthermore, this finding should be discussed in terms of the differences between macro- and micro-expressions of emotion [58]. People express their emotions consciously through macro-expressions, which last from 0.5 to 4 s [58] and are easily recognized. Micro-expressions are mostly produced unconsciously and reflect a person's true emotions. According to Ekman [58], micro-expressions last only a fraction of a second; more recent research proposed an upper bound of 500 ms [59]. Our data represent means of the emotional facial expressions that appeared across the 15-s intervals during which the images were presented. Because the eyes move across an image, short-lived emotional expressions may come and go within this window. Thus, a next step in this research would be to analyze whether there is a link between particular environmental features and the immediately corresponding emotional expressions, as sketched below.
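As a concrete illustration of what such an analysis could look like, the sketch below scans a 30 fps intensity trace for brief above-threshold episodes. The threshold, the 500 ms bound, and all names are illustrative choices of ours, not part of the study.

```python
import numpy as np

FPS = 30  # AFFDEX frame rate used in the study

def brief_episodes(trace, threshold=5.0, max_ms=500):
    """Find short above-threshold runs (candidate micro-expressions) in a
    per-frame emotion-intensity trace (values 0-100). Returns
    (start_frame, end_frame_exclusive) pairs lasting at most max_ms."""
    above = np.asarray(trace) > threshold
    # Pad with False so every run has both a rising and a falling edge.
    padded = np.concatenate(([False], above, [False])).astype(int)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    max_frames = max_ms / 1000 * FPS
    return [(int(s), int(e))
            for s, e in zip(starts, ends) if e - s <= max_frames]

# Example: a 15 s joy trace with one brief flash of expression.
joy = np.zeros(15 * FPS)
joy[100:108] = 20.0         # 8 frames, i.e., roughly 270 ms above threshold
print(brief_episodes(joy))  # [(100, 108)]
```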
It is also worth commenting on the appearance of the facial expressions of contempt and disgust while viewing the natural images. We do not assume that the natural images used in the experiment actually evoked these emotions; rather, the corresponding facial actions were likely linked with concentration on the task. It has been shown that nose wrinkles, which are associated with disgust, are also present in situations where participants are asked to learn information presented on a computer screen [60,61] and are not always accompanied by self-reports of disgust [62].
A limitation of the present findings relates to the laboratory setting. Although many studies have shown positive effects of real outdoor environments and of their simulation in a laboratory [1], a stronger effect might be observed with self-selected natural images or in situations where people decide by themselves to view natural images in order to relax or to recall pleasant moments spent in nature. Another limitation is the small number of photographs used in this study and their specific selection. Obviously, natural scenes take many diverse forms all over the world, and a single investigation cannot include them all. Thus, the present study does not claim generalizability, and further replication studies are needed.

Conclusions
This study analyzed changes in emotional facial expression using machine vision software while participants viewed images representing various types of natural environments and compared the results with self-reported emotions. While the self-reported emotions reflected differences between the types of natural environments, there was no difference in facial emotions between viewing attractive and unattractive scenes. The results suggest that the participants may not have described the intensity of their actual emotions; rather, they more likely described the images in terms of common stereotypes linked with the beauty of natural environments. The results also contribute to considerations of the use of automated facial expression analysis in environmental psychology research and to the broader effort to understand facial emotional expressions in specific situations.