An Immersive Serious Game for the Behavioral Assessment of Psychological Needs

game and machine learning techniques in the assessment of psychological needs can enhance the traditional assessment adding behavioral-based information.

Abstract: Motivation is an essential component in mental health and well-being. In this area, researchers have identified four psychological needs that drive human behavior: attachment, self-esteem, orientation and control, and maximization of pleasure and minimization of distress. Various self-reported scales and interview tools have been developed to assess these dimensions. Despite their validity, these tools show limitations in terms of abstraction and decontextualization, as well as biases, such as social desirability bias, that can affect the veracity of responses. Conversely, virtual serious games (VSGs), which are games with specific purposes, can potentially provide more ecologically valid and objective assessments than traditional approaches. Starting from these premises, the aim of this study was to investigate the feasibility of a VSG to assess the four personality needs. Sixty subjects participated in five VSG sessions. Results showed that the VSG was able to recognize attachment, self-esteem, and orientation and control needs with high accuracy, and, to a lesser extent, the need for maximization of pleasure and minimization of distress. In conclusion, this study showed the feasibility of using a VSG to enhance the assessment of psychological needs with behavior-based information, overcoming biases presented by traditional assessment.

recognition achieved 85% accuracy (kappa: 0.63) with 88% of true positives. The fourth subscale, related to the avoidant-insecure attachment style (4: emotional self-sufficiency and discomfort with intimacy) achieved 88% accuracy (kappa: 0.67) including 93% of true positives. These results seem to suggest, on one hand, that the VSG situations are able to activate attachment phenomena, and on the other, the ability to recognize attachment style using VSG.


Introduction
Motivational psychology is a broad branch of psychology that studies human needs and their fulfillment for mental health and well-being [1][2][3]. Motivation can be defined as an unobservable, internal psychological process that drives observable human behaviors in order to satisfy needs [1,2]. The study of motivation involves many theoretical approaches, including dispositional, learning, humanistic, biological, cultural, and psychodynamic perspectives [1][2][3][4]. Building on these, 20 years ago the psychotherapist Klaus Grawe proposed a new theoretical model, named the consistency model, which conceives mental functioning as the interaction between motivational schemas and the satisfaction of four psychological needs: attachment, self-esteem, orientation and control, and maximization of pleasure and minimization of stress (Figure 1) [3]. Environmental and interpersonal factors perceived to satisfy these needs moderate the maintenance and enhancement of the self, whereas those that impede need satisfaction foster distress and mental illness [3].

The Four Basic Psychological Needs
Attachment represents the reflection in adulthood of the relational bonding developed within the child-caregiver dyad over a child's first years of life [5][6][7][8]. In accordance with attachment theory [7,8], the quality of the relationship between a child and the main caregiver shapes the development of internal working representations of the self and of others, which can be recognized in three interrelated dimensions of attachment styles [9][10][11]: beliefs and attitudes, goals and needs, and plans and strategies [12]. Three attachment styles have been conceptualized [5]: secure, ambivalent-insecure, and avoidant-insecure. Secure attachment is the most common (60% of the worldwide population) [11] and is defined as a strong relationship between child and caregiver, who is perceived by the infant as a secure base that allows the exploration of real environments and gives comfort in stressful situations. Children with ambivalent-insecure attachment, representing 15% of the worldwide population [10], tend to keep a distance from caregivers and struggle to explore the environment. Lastly, avoidant-insecure attachment leads children to independently explore the environment and to avoid closeness with the caregiver; 25% of the worldwide population is characterized by this attachment style [11]. Regarding the three dimensions, the beliefs and attitudes of secure people include high self-esteem, a general liking of others, and the view that others are trustworthy and have good intentions. With respect to goals and needs, they desire intimate relationships and search for balance between closeness and autonomy in relationships; regarding plans and strategies, they recognize stress and constructively modulate its negative effects [12][13][14]. In contrast, ambivalent-insecure individuals believe that people are complicated to understand and have little control over their lives, and they need extreme intimacy, showing lower levels of autonomy and fear of rejection. Regarding plans and strategies, they display intensified stress and anger to elicit responses from others, and at the same time they collaborate with others to obtain acceptance. Similarly, avoidant-insecure individuals show beliefs and attitudes of doubt regarding others' motives, believing that others are not trustworthy and doubting their honesty and integrity, and show a lack of confidence in social situations. Regarding goals and needs, they need to keep their distance, limit intimate relationships, and maintain autonomy and independence, and they control stress without showing anger or emotions in general.
Self-esteem is defined as the perception of oneself and the personal evaluation one makes about oneself [15,16]. It is characterized by two dimensions, competence and worth; the former concerns how capable and efficacious people perceive themselves to be, whereas the latter relates to the value that people feel they have and reflect. Social interactions can affect the self-worth dimension of the construct, and individuals constantly try to increase their self-esteem [3].
Orientation and control are psychological concepts related to self-efficacy theory [17] and refer to the self-belief of having an influence on the environment, successfully accomplishing specific goals, and producing favorable outcomes [3]. Self-efficacy is a personal judgment about the subjective perception of being able to execute a task or cope with a situation positively. According to Grawe [3] and Bandura [17], goal-setting is relevant to ensure that an individual's efficacy beliefs are in line with aims and competencies and not working against them. A positive interaction between efficacy beliefs, aims, and competencies results in the successful management of many situations and tasks. On the contrary, people with low self-efficacy believe they are not able to cope with different situations and tend to fail. Furthermore, people with high self-efficacy perceive themselves as being in control of their own lives and decisions, whereas people with low self-efficacy tend to think that they cannot manage situations and tasks by themselves. Finally, efficacy beliefs depend on four sources of information and processes: (a) mastery experiences: the direct experience of mastery increases self-efficacy, and failures can undermine self-efficacy beliefs; (b) vicarious experiences: self-efficacy comes from the observation of other people; (c) verbal persuasion: positive or negative verbal responses can strengthen or weaken subjective self-efficacy beliefs about success or failure; and (d) emotional and physiological states: positive or negative emotional and physiological states can influence the self-belief of efficacy to succeed or fail.
Finally, the need to maximize pleasure and minimize stress/pain is a basic pleasure principle that has been theorized and studied by many authors, first and foremost by Freud [18]. The idea is that individuals seek pleasant experiences and states rather than unpleasant or painful circumstances [19][20][21]. Sought-after pleasure states can be physical, psychological, emotional, or social [22]. Avoiding or approaching pleasant or unpleasant situations relies on subjective motivation and various psychological processes, such as decision making, learning, and self-regulation [23,24]. In accordance with this assumption, two subjective motivational systems emerge: the behavioral activation system (BAS), which addresses self-enhancement, desire for pleasure, and rewards, and the behavioral inhibition system (BIS), which is oriented toward self-protection, caution against pain, and avoidance of threats [24][25][26][27][28].

Traditional Assessment and Serious Games Approaches
Traditional psychological assessment approaches rely on qualitative measures, such as the Adult Attachment Interview, and/or quantitative measures, such as the Adult Attachment Questionnaire or the Rosenberg Self-Esteem Scale, in which respondents provide explicit and conscious information about themselves [29][30][31]. Although these measures are well validated and present high internal consistency, they show some limitations: they are too abstract and decontextualized from real situations, and they cannot activate specific phenomena that need to be activated in order to be manifested, such as attachment situations [32]. Furthermore, methodologically, the assessment is conducted in neutral laboratory or clinical settings to maintain control over confounding variables that can influence the veracity of the final outcomes. Finally, they are also subject to biases, such as social desirability bias, which is the tendency of respondents to answer self-report questions in accordance with socially favorable norms in order to maintain a socially favorable self-presentation, affecting the veracity of responses [33,34].
Recently, technological systems such as videogames and computer games have advanced, and the amount of time children, adolescents, and adults spend playing computer games has increased significantly, providing the opportunity to use them to enhance traditional psychological assessment. Indeed, their playability and appeal have led researchers to develop applications with "serious" and specific goals for the evaluation and training of abilities, going beyond common entertainment [35,36]. Serious games (SGs) have proven able to generate motivational engagement through core game features such as challenges, rewards, narrative, virtual agents, and fantasy worlds, as well as immersion and interaction, which enhance user experience. The first computer games were supported by non-immersive (2D) systems, such as desktop computers, and interaction was afforded by keyboard and mouse; more recently, SGs can also be experienced through immersive systems, such as head-mounted displays (HMDs), which can completely isolate users from reality and allow them to explore, navigate, and interact with objects (through control sticks and gloves) in real time as if they were in physical reality. Furthermore, SGs have proven effective for assessing and training abilities, adding behavior-based information to traditional methods [37][38][39].
For creating valid assessments in games, "stealth" game design has provided significant evidence, mainly related to learning [40,41]. The stealth assessment (SA) game design approach aims to implicitly measure user performance during the game without asking users to self-report information about their personality, behavior, or state. SA refers to evidence-centered design (ECD), a conceptual framework based on three models for developing valid assessments: competency, structural, and task models [42,43]. The competency model refers to the abilities, constructs, and traits, also called unobservable indicators, that researchers want to measure using the game; the structural model identifies the multifaceted behaviors, also called observable indicators, that can reveal those abilities, constructs, and traits; and the task model designs tasks or situations that can produce the behaviors related to the abilities, constructs, or traits that researchers want to measure. The SA approach has been shown to have greater predictive validity and less bias than traditional assessment [37,43]. Indeed, SA allows the creation of several SG contexts in which participants perform a sequence of actions, reducing test anxiety and bias with respect to traditional assessment and invisibly providing a large amount of data on a wide range of individual attributes and skills [37,43,44]. Until now, SGs supported by the SA approach have mainly been tested in the educational field in order to enhance learning abilities [39,40]. SGs have also been investigated for the assessment of abilities in the educational and learning [45], health [44], and military fields [46]. In motivational psychology, SGs have received less attention, although they have been proposed as valid assessment methods that can also measure the behavior-based dimensions of psychological needs alongside traditional self-reported questionnaires [47].
Based on these premises, the aim of this study was to investigate the content feasibility of a virtual serious game (VSG) to assess the four basic psychological needs of attachment, self-esteem, self-efficacy, and stress avoidance according to the consistency model. The main hypothesis was that the VSG contents would be able to elicit the phenomena related to the psychological needs and to assess them using user performance.

Participants
A total of 61 subjects (30 women and 31 men; mean age 35.95 years; SD = 11.17) participated in this study. Table 1 shows the distribution of the sample according to demographics. The inclusion criteria were age between 18 and 55 years and a score of 24 or higher on the Mini-Mental State Examination (MMSE) [48]. Before participating, each subject received written information about the study and was required to give written informed consent for inclusion. The study obtained ethical approval from the Ethical Committee of the Polytechnic University of Valencia (Protocol code: P17_23_01_2019).

Psychological Assessment
The following questionnaires were administered to participants before the virtual sessions:
Demographics: Participants completed a general sociodemographic questionnaire including age, gender, education, family and marital status, employment status, and income.
The main questionnaires were related to Grawe's consistency model:
Attachment: The Adult Attachment Questionnaire [30] is a 40-item self-reported scale that assesses four factors: (1) low self-esteem, need for approval, fear of rejection (13 items); (2) hostile conflict resolution, rancor, and possessiveness (11 items); (3) secure affect or expression of feelings and comfort with relationship (9 items); and (4) emotional self-sufficiency and discomfort with intimacy (7 items). Participants responded to the statements on a 6-point Likert scale from 1 (strongly disagree) to 6 (strongly agree). Internal consistency of the four factors was α = 0.86, 0.80, 0.77, and 0.68.
The Relationship Questionnaire (RQ) [11] consists of four short paragraphs designed to measure the four attachment styles (secure, insecure-preoccupied, insecure-fearful, and insecure-dismissing). Participants are asked to indicate which paragraph describes themselves and others.
Self-esteem: The Rosenberg Self-Esteem Scale (RSE) [31] is a 10-item measure that assesses general self-esteem. Participants rate the extent to which each item applies to their self-evaluation on a 4-point Likert scale ranging from 1 (strongly disagree) to 4 (strongly agree). Five negatively worded items are reverse scored. Total scores are summed, and higher scores indicate elevated levels of self-esteem. Internal consistency is high (α = 0.93).
Self-efficacy: The General Self-Efficacy (GSE) scale [49] is a 10-item questionnaire that assesses positive self-perception in coping with various difficult situations. Participants are asked to rate each item on a 10-point Likert scale. Total score is summed, and higher scores indicate more self-efficacy. Internal reliability for GSE is between 0.76 and 0.90.
Behavioral approach and behavioral inhibition system: The Behavioral Approach/Inhibition Scale (BIS/BAS) [25] is a 24-item questionnaire rated on a 4-point Likert scale ranging from 1 (very true for me) to 4 (very false for me). The scale has four subscales, one for BIS and three for BAS (drive, reward, and fun seeking). Cronbach's α values for the BIS, BAS drive, BAS reward, and BAS fun seeking scales are 0.74, 0.76, 0.73, and 0.66, respectively.
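The reverse scoring and summation described for the RSE can be illustrated with a minimal sketch. The function name `score_rse` is hypothetical, and the default positions of the negatively worded items are an assumption that should be checked against the questionnaire version actually administered; the text only states that five items are reverse scored on a 1 to 4 scale.

```python
def score_rse(responses, reverse_items=(2, 5, 6, 8, 9), scale_min=1, scale_max=4):
    """Sum 10 item responses after reverse-scoring the negatively worded items.

    reverse_items holds 1-based item positions. The default is an ASSUMPTION,
    not taken from the study; adjust it to the questionnaire version in use.
    """
    if len(responses) != 10:
        raise ValueError("expected 10 item responses")
    total = 0
    for position, value in enumerate(responses, start=1):
        if not scale_min <= value <= scale_max:
            raise ValueError(f"item {position} is outside the response scale")
        if position in reverse_items:
            # Mirror the response around the scale midpoint: 1<->4, 2<->3.
            value = scale_min + scale_max - value
        total += value
    return total
```

With a 4-point scale the total ranges from 10 to 40, and a higher total indicates higher self-esteem, as stated above.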

Task Modeling of Serious Game: ATHENEA
The VSG system was developed using Unity 5.5.1f1 software and the C# programming language in Visual Studio. Participants performed the VSG wearing an HMD device (HTC VIVE), and interaction was provided through two hand controllers.
The game narrative took place in a spaceship (Athenea) whose aim was to discover and settle a new land because Earth was no longer habitable. The VSG included seven virtual agents: six of them were adults, designed with specific personality traits and personal competencies according to the attachment working models [53]; more specifically, pairs of virtual agents (one man and one woman) reflected the secure, ambivalent-insecure, and avoidant-insecure attachment styles (Figure 2). The last virtual agent was a child without a specific attachment style. Each virtual agent and the study participant had an explicit role inside the spaceship according to their personality traits and personal competencies. The specific role of the study participant was to maintain technical control of the Athenea and ensure the correct direction to possible destinations. In addition, the participant was responsible for caring for the child. Furthermore, a spaceship artificial intelligence controlled and supervised the correct functioning of the spaceship and, if something was wrong, alerted the crew to solve the problem. The spaceship was composed of four areas: a red area formed by the engine room and the hangar; a green area, which was the zone for life support, orchard, and water and air purification; the control area, formed by the weapons and system control rooms; and a blue area, including relaxation and well-being rooms, such as bedrooms, cafeteria, infirmary, and EMO room. The EMO room was designed for crew well-being: there, participants self-assessed their own mood states and performance, rated the mood states of the rest of the crew, and, according to the crew's mood states, could decide to improve and enhance them (Figure 3).
The VSG consisted of 10 situations, and in each situation, several episodes and tasks were designed according to the theoretical psychological framework and the SA method. The following sections introduce the four competency models, one for each basic psychological need, and their relative indicators, including the specific tasks for each psychological need. For each of the four competency models, we present a graphic model of the indicators (unobservables, i.e., theoretical psychological constructs, and observables, i.e., tasks and data gathered from user performance).
Attachment assessment: This included three episodes of loss, four of loneliness, four of threat, and two of suspicion. Two of these episodes (one of threat and one of loss) were related to childcare. The attachment episodes were distributed throughout the SG, and in order to proceed, participants had to choose among (a) solving the problem alone; (b) seeking the help of other crew members (in this case, deciding on a crew member); (c) letting another crew member solve the problem; or (d) doing nothing. Each decision-making response was designed according to the referenced attachment theoretical framework and the main traditional questionnaires used to assess attachment style (Table 2).
Self-esteem assessment: At the end of each situation, participants entered the EMO room to self-evaluate how they felt and to see how the rest of the crew felt according to the two dimensions of valence (positive/negative/neutral) and arousal (low activation/high activation); according to the feelings of the rest of the crew, participants could improve their well-being (Table 3).

Table 3. Competency model of self-esteem with indicators.
Grawe Dimension | Unobservable Construct | Observable Indicators
Self-esteem | Worth: feelings about self | Self-evaluation
Self-esteem | Social-worth: feelings about and from others | Others' evaluations; others' support

Self-efficacy assessment: This assessment involved three situations, each comprising 12 games. Games were developed and modulated according to the four sources of information proposed by Bandura (1997) [17]: mastery experience, vicarious experiences, verbal persuasion, and emotional and physiological states. Each game lasted three minutes and thirty seconds, and one of the four sources of information appeared at the 1.5-minute mark. At the beginning of each game, the virtual system presented the participant with a specific explanation of the game. At the end of each self-efficacy situation, participants were asked to self-evaluate their perception of efficacy during the game performance. The main performance data gathered during gameplay were reaction times: the time from the end of the game explanation until the participant began, and, for the entire duration of the game, the time of every single interaction with the virtual stimuli. Furthermore, all correct and incorrect performance responses, as well as the number of completed trials, were gathered (Table 4).

Table 4. Competency model of self-efficacy with indicators.

Grawe Dimension | Unobservable Construct | Observable Indicators
Self-efficacy | Competence | Reaction times (ms); correct/incorrect game performance; number of completed trials; self-evaluation

Stress avoidance: For this assessment, 12 episodes on avoiding or approaching pleasant or unpleasant situations were designed. The episodes were distributed throughout the course of the SG, and in order to proceed, participants had to decide based on cognitive, emotional, or behavioral approach or avoidance responses. Each decision-making response was designed according to the main theories and questionnaires related to the BIS/BAS (Table 5).
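The competency models in Tables 3 and 4 map each unobservable construct to the observable indicators logged during play. As a purely illustrative sketch, that mapping could be held in a small data structure; the class name `Competency`, the constant `COMPETENCY_MODELS`, and the helper `indicators_for` are hypothetical and not part of the ATHENEA implementation, and only the self-esteem and self-efficacy entries given in the tables are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Competency:
    dimension: str     # Grawe dimension (e.g., "Self-esteem")
    construct: str     # unobservable theoretical construct
    indicators: tuple  # observable indicators gathered from user performance


# Entries transcribed from Tables 3 and 4; other needs are omitted here.
COMPETENCY_MODELS = (
    Competency("Self-esteem", "Worth: feelings about self",
               ("Self-evaluation",)),
    Competency("Self-esteem", "Social-worth: feelings about and from others",
               ("Others' evaluations", "Others' support")),
    Competency("Self-efficacy", "Competence",
               ("Reaction times (ms)", "Correct/incorrect game performance",
                "Number of completed trials", "Self-evaluation")),
)


def indicators_for(dimension):
    """Return every observable indicator listed for one Grawe dimension."""
    return [indicator
            for competency in COMPETENCY_MODELS
            if competency.dimension == dimension
            for indicator in competency.indicators]
```

Keeping the construct-to-indicator mapping explicit in one place makes it easy to check that every logged variable traces back to a construct in the competency model.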

Experimental Procedure
The study consisted of five sessions of almost 1 h each, presented to participants every other day. Before the VSG experience, participants completed the above-mentioned questionnaires and an introductory tutorial to familiarize themselves with the virtual reality system. The tutorial took place in a neutral room and included a video with the narrative introduction to the story; participants were then asked to move around the room using the two controllers and to return to a table where an activity was proposed. The activity consisted of picking up three geometric figures, rotating them, and inserting them in the right location. To verify that the activity had been executed correctly, participants had to press a button; if the activity was correct, the button changed color from red to green and the VSG started. Subsequently, participants began the first VSG session, which included the first three situations, mainly related to attachment, self-esteem, and stress avoidance phenomena. In the second, third, and fourth sessions, participants experienced the fourth, fifth, and sixth situations, respectively, which were mainly related to the self-efficacy need. In the last session, participants performed the seventh, eighth, and ninth situations, mainly related to attachment, self-esteem, and stress avoidance phenomena. Finally, also during the last session, participants were presented with the tenth situation, a concluding episode of the narrative not related to any need. The situations followed the narrative order, from 1 to 10 (concluding episode), and all participants were presented with all situations and episodes in that order.

Data Analysis
Multivariate outlier detection was performed to find the participants whose questionnaire scores were extreme. Outliers within each of the four groups of variables (attachment, self-esteem, self-efficacy, stress avoidance) were identified independently of the variables in the other groups. For groups with more than one score (e.g., attachment, with four subscales), the Mahalanobis distance between subjects was calculated from the variables in the group, along with the probability that it belonged to a chi-square distribution. For groups with only one variable (e.g., self-esteem), the z-score was calculated. In both cases, subjects falling into the most extreme 1% of the data distribution were defined as outliers. We found one outlier subject for the attachment variables and two for stress avoidance. The level of statistical significance was set at p < 0.05.
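The screening described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: distances are taken from each group's centroid (the text does not specify the reference point), the "most extreme 1%" is read as a chi-square tail probability below 0.01 or a two-sided z cutoff at the same level, and all function names are hypothetical. To stay within the standard library, the chi-square tail uses a closed form that holds only for an even number of variables (which covers the four attachment subscales); a statistics package would be needed for the general case.

```python
import math
from statistics import NormalDist, mean, stdev


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mahalanobis_sq(rows):
    """Squared Mahalanobis distance of each subject from the sample centroid."""
    n, k = len(rows), len(rows[0])
    mu = [mean(r[j] for r in rows) for j in range(k)]
    centered = [[r[j] - mu[j] for j in range(k)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(k)]
           for i in range(k)]
    # d^2 = c^T * cov^-1 * c, computed without forming the inverse explicitly.
    return [sum(d * s for d, s in zip(c, solve(cov, c))) for c in centered]


def chi2_sf_even(x, k):
    """Upper-tail chi-square probability; closed form valid for even k only."""
    if k % 2:
        raise ValueError("closed form requires an even number of variables")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(k // 2))


def multivariate_outliers(rows, alpha=0.01):
    """Indices of subjects whose distance falls in the extreme alpha tail."""
    k = len(rows[0])
    return [i for i, d2 in enumerate(mahalanobis_sq(rows))
            if chi2_sf_even(d2, k) < alpha]


def univariate_outliers(values, alpha=0.01):
    """Indices of subjects whose |z| exceeds the two-sided alpha cutoff."""
    cutoff = NormalDist().inv_cdf(1 - alpha / 2)
    mu, sd = mean(values), stdev(values)
    return [i for i, v in enumerate(values) if abs(v - mu) / sd > cutoff]
```

Note that with small samples the largest attainable distance is bounded by the sample size, so very small groups may never exceed the cutoff regardless of how extreme a subject is.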
We had five datasets, one for each main variable and one with demographic data from the subjects and their respective outputs in all the scales. These demographic data (sex, age, education, marital status, employment status, and income) were added to the four main datasets. These datasets consisted of the following:
Attachment: 41 variables describing the decisions taken by the subject (which actions they took and with which virtual agent) and nine variables relating to how many times each virtual agent was chosen and which one was chosen the most. Data from 59 subjects were available after outlier removal.
Self-esteem: 42 variables relating to the decisions taken by the subject (which actions they took and with which virtual agent), 26 relating to how the subjects described themselves during the game (12 were removed as their value was constant), and 224 relating to how the subjects perceived the emotional status of the virtual agents. Data from 60 subjects were available after outlier removal.
Stress avoidance: 12 variables relating to the decisions taken by the subject (which actions they took and with which virtual agent). Data from 58 subjects were available after outlier removal.
Due to the high dimensionality of both the self-esteem and self-efficacy datasets, we worked separately with each of the groups of variables described.
Outputs: There are five main scales, some with subscales, as shown in Figure 4 (attachment has four subscales, together with RQ; stress avoidance has six subscales; self-esteem and self-efficacy have only one dimension). The subject's score on each of these scales, drawn from the self-assessment questionnaires, was simplified as high or low, and this category was the value to be predicted. The categorization was done using population means as a threshold, in order to have a similar number of subjects with low and high values and to improve discrimination between categories. The exception was the variable RQ, which was originally qualitative and whose distribution remained unchanged.
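As a minimal illustration of this high/low categorization, the following Python sketch splits a list of scores at their mean and reports the share of "high" cases, which corresponds to the High/Total balance reported for each subscale. It assumes that the sample mean stands in for the reference mean and that scores exactly equal to the threshold are labeled "low"; neither detail is specified in the text, and the function names are hypothetical.

```python
from statistics import mean


def categorize_scores(scores):
    """Label each score 'high' if above the mean threshold, else 'low'."""
    threshold = mean(scores)  # assumption: sample mean used as the threshold
    labels = ["high" if s > threshold else "low" for s in scores]
    return threshold, labels


def high_ratio(labels):
    """Share of 'high' labels; 0.5 would mean perfectly balanced classes."""
    return labels.count("high") / len(labels)
```

For example, the hypothetical scores 1, 2, 3, and 10 have a mean of 4, so only the last subject is labeled "high" and the High/Total ratio is 0.25, which shows how a skewed score distribution produces unbalanced classes under a mean split.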

Machine Learning Analysis
Machine learning was applied to find the best selection of variables from the VSG that predict whether the subjects would have high or low scores in the study variables. The best predictive model was obtained for each scale. If the scale had more than one set of variables (self-esteem and self-efficacy), we also investigated which set achieved better results. In addition, whether or not to include demographic variables to predict each scale was also studied.
The process of feature selection and modelling was common to all scales and followed the scheme described in Figure 5. Different machine learning algorithms (Table 6) were tested to predict the study variable subscales. The following steps were validated using 15-fold cross-validation (CV):
Figure 5. Machine learning strategy followed for each algorithm in Table 6 and each subscale to predict. Feature selection was made and validated by cross-validation (CV). When the best features were obtained, hyperparameter tuning was carried out, also validated with CV. With the best features and hyperparameters, modelling was done, and metrics were obtained by CV.

Table 6. Machine learning algorithms used, hyperparameters tuned, and values tested.

Model Parameter Values
Conditional inference trees Maximum depth (0, 100) Minimum sum of weights in a node to be considered for splitting
Feature selection: The wrapper method of backward sequential feature selection was chosen. Starting from a model with all the features, in each step the feature whose presence decreased the performance measure the most was removed, that is, the feature whose removal yielded the best-performing reduced model. The number of selected features was capped at 15 to avoid overfitting.
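The backward sequential selection step can be sketched as a greedy loop. The Python example below is an illustration under stated assumptions rather than the procedure used in the study: `score_fn` is a hypothetical callable that returns a performance measure for a feature subset (in practice this would be a cross-validated metric), and the stopping rule (keep removing while the subset exceeds the cap, then continue only while the score improves) is an assumption, since the text specifies only the cap of 15 features.

```python
def backward_selection(features, score_fn, max_features=15):
    """Greedy backward elimination driven by a user-supplied score function.

    features     : list of feature names
    score_fn     : hypothetical callable, score_fn(subset) -> number (higher is better)
    max_features : cap on the size of the returned subset
    """
    selected = list(features)
    best_score = score_fn(selected)
    while len(selected) > 1:
        # Build every subset obtained by dropping exactly one feature.
        candidates = [[f for f in selected if f != drop] for drop in selected]
        scores = [score_fn(c) for c in candidates]
        best_idx = max(range(len(scores)), key=scores.__getitem__)
        # Assumed stopping rule: once under the cap, stop when no removal helps.
        if len(selected) <= max_features and scores[best_idx] <= best_score:
            break
        selected, best_score = candidates[best_idx], scores[best_idx]
    return selected, best_score
```

A toy score function that rewards two informative features and penalizes every other feature makes the behavior easy to follow: the loop discards the uninformative features one at a time and stops when removing anything further would lower the score.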
Hyperparameter tuning: Once the set of best features was obtained, hyperparameter tuning was performed. Different hyperparameters, as shown in Table 6, were optimized for each machine learning algorithm. Ten equally spaced values in the range defined for each hyperparameter were evaluated. In the case of the support vector machine (SVM) hyperparameters, an exponential transformation (2^x) was applied to the values. The search for the best hyperparameters was therefore restricted to a discrete grid rather than a continuous range, so the chosen hyperparameters might not be optimal. This restriction might also help to avoid overfitting.
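The grid construction described above can be sketched in a few lines of Python. The range (0, 100) is the one listed for the maximum depth of conditional inference trees; the SVM exponent range used in the usage note is purely hypothetical, since the tested ranges for the SVM are not given in this section. Function names are also hypothetical.

```python
def equally_spaced(low, high, n=10):
    """Return n equally spaced values from low to high, both included."""
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]


def svm_grid(low_exp, high_exp, n=10):
    """Apply the 2^x transformation to n equally spaced exponents."""
    return [2.0 ** x for x in equally_spaced(low_exp, high_exp, n)]
```

For instance, `equally_spaced(0, 100)` gives ten candidate values between 0 and 100, and `svm_grid(-3, 6)` gives the powers of two from 0.125 to 64. Integer-valued hyperparameters such as a tree depth would presumably need rounding, which is not covered here.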
Modeling: Once the best features and hyperparameters for each machine learning algorithm were obtained, the model was trained and validated with 15-fold CV. The reported metrics (accuracy, Cohen's kappa, sensitivity (true positive rate; TPR), and specificity (true negative rate; TNR)) are the averages of the values achieved in each fold.
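To make the reported metrics concrete, the following Python sketch shows one way to split subjects into 15 folds and to compute accuracy, Cohen's kappa, TPR, and TNR for binary high/low predictions. It is an illustration with hypothetical function names, not the R code used in the study; the shuffling seed, the choice of "high" as the positive class, and returning `None` for a rate that is undefined in a fold are all assumptions.

```python
import random


def kfold_indices(n_samples, k=15, seed=0):
    """Yield (train, test) index lists for k roughly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def binary_metrics(y_true, y_pred, positive="high"):
    """Accuracy, Cohen's kappa, TPR, and TNR from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the marginal class frequencies.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "tpr": tp / (tp + fn) if tp + fn else None,  # undefined without positives
        "tnr": tn / (tn + fp) if tn + fp else None,  # undefined without negatives
    }


def average_metrics(fold_metrics):
    """Average each metric over folds, skipping folds where it is undefined."""
    averaged = {}
    for key in fold_metrics[0]:
        values = [m[key] for m in fold_metrics if m[key] is not None]
        averaged[key] = sum(values) / len(values) if values else None
    return averaged
```

With 60 subjects and 15 folds, each test fold in this sketch holds four subjects. The kappa formula also shows why accuracy alone can mislead with unbalanced classes: a classifier that always predicts the majority class reaches an accuracy equal to the majority share but a kappa of 0.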
All analyses (outlier detection, Chi-square test, and machine learning) were performed in R (version 3.6.1).
Table 7 shows the descriptive analysis of the quantitative variables and the related high/low categorization using the mean of each subscale, except for RQ, which is a qualitative variable.
Table 7. Descriptive analysis of variables of interest and results of categorization. The High/Total column shows the balance between the two categories (perfect balance would mean a High/Total of 0.5). * The relationship questionnaire (RQ) was not numeric, so the number of subjects in each category (low for insecure, high for secure) is the original one.

Machine Learning
Recognition of Attachment, Self-Esteem, Self-Efficacy, and Stress Avoidance Using the VSG
Table 8 shows the results of machine learning models for each self-assessed psychological construct. All four variables (attachment, self-esteem, self-efficacy, and stress avoidance) have at least one subscale that was modelled with accuracy ≥0.8 and kappa ≥0.5 using data from the VSG.

Discussion
Motivational psychology is a broad branch of psychology that studies human needs and their fulfillment for mental health and well-being [1][2][3][4]. Motivation can be defined as an unobservable, internal psychological process that drives observable human behaviors in order to satisfy needs. Traditional assessments of basic psychological needs involve qualitative and quantitative measures, such as interviews and self-reported scales, which present limitations and biases that affect the veracity of responses [32][33][34]. To overcome these limitations, VSGs are proving to be effective for assessment and training in different research fields alongside traditional approaches [53,54]. Accordingly, the purpose of the present study was to explore the feasibility of a VSG to assess the four basic psychological needs of attachment, self-esteem, self-efficacy, and stress avoidance according to the consistency model [3]. To support the validity of the VSG contents in the recognition of each basic psychological need, the data analysis included a broad set of supervised machine learning (SML) models that combined VSG performance for each variable with the scores on traditional questionnaires.
SML algorithms achieved high attachment recognition across the four attachment questionnaire subscales (accuracy ≥85%, using a maximum of 15 selected features). More specifically, the strongest recognition was between the individual's VSG performance in the specific episodes of loss, loneliness, threat, and suspicion and the first two subscales, related to the ambivalent-insecure attachment style (1: low self-esteem, need for approval, and fear of rejection; 2: hostile conflict resolution, rancor, and possessiveness), which achieved 90% accuracy (kappa: 0.76 and 0.74, respectively), with 88% of true positives for the first subscale and 68% for the second. Regarding the third subscale, related to the secure attachment style (3: secure affect or expression of feelings and comfort with relationship), the model achieved 85% accuracy (kappa: 0.63) with 88% of true positives. The fourth subscale, related to the avoidant-insecure attachment style (4: emotional self-sufficiency and discomfort with intimacy), achieved 88% accuracy (kappa: 0.67) with 93% of true positives. These results seem to suggest, on the one hand, that the VSG situations are able to activate attachment phenomena and, on the other, that attachment style can be recognized using the VSG.
Regarding self-esteem, a generalized linear model (GLMNet) using 12 features related to how the subject reported the status of virtual agents achieved 98% accuracy (kappa: 0.96), including 97% of true positives. The set of variables related to participants' decisions and the combination of all variables reached 93% accuracy (kappa: 0.68), including 81% of true positives. On the other hand, the subjects' self-reported status was clearly not useful in predicting self-esteem (kappa: 0). These results suggest, on the one hand, consistency with the sample frequency distribution, in which we found a larger imbalance between subjects with low (n = 37) and high (n = 23) self-esteem, and on the other hand, a positive model of others and a negative model of the self, typical of individuals with an ambivalent-insecure attachment style [12].
Self-efficacy beliefs depended mainly on attentional, planning, and overall ability to manage a task or situation according to the four sources of information, reaching accuracy of 95% and 92% (kappa = 0.89 and 0.88, respectively). This result, consistent with previous studies, suggests that if mastery and vicarious experiences, as well as verbal persuasion and emotions and physiology, are positive at the beginning and the feedback during the experience is positive, this will enhance attentional abilities and planning to achieve a goal, with the perception of high self-efficacy [17].
In contrast, the stress avoidance and coping subscales clearly had worse predictions than those of attachment, which could be due to the lack of variety in the features (only 12 available), in the SG situations related to the underlying psychological need, or in both.
Despite the promising results, some limitations of the present study should be considered. The main limitation is related to the limited number of subjects for model validation.
None of the models were tested with new or external data. Future studies should include a larger sample in order to test the models and improve the validity of the study. The SML models were validated using as ground truth the psychological questionnaires presented in the introduction. Even though these questionnaires are commonly used, an assessment performed by a psychologist would have been preferable. Another limitation of this study is that the psychological needs were predicted as classes (low and high) rather than as continuous scores. The categorization was made around a fixed central tendency estimate (the mean), and previous research on personality has shown that the distribution of trait scores tends to be Gaussian, so that most individuals fall close to the central tendency estimate of the scale. The binary classification therefore imposes a cut at the mean, where most scores cluster, distancing the predicted categories from the actual trait scores. Future studies should consider predicting the continuous scores of the psychological constructs.
Regarding the SG contents, our results support our initial hypothesis on attachment, self-esteem, and self-efficacy, but not on stress avoidance. Indeed, the virtual content defined for stress avoidance could not predict the related psychological construct, which highlights, on the one hand, the importance of validating the virtual construct-related content and, on the other hand, the value of SML as an effective technique to check consistency. In light of this result, future work should modify the current virtual situations related to this construct, test them, and validate them as a final step.
Despite these limitations, the results of this study show that by using SML techniques on psychological dimensions with a VSG, it may be possible, first, to enhance the assessment of behavioral-based psychological needs, overcoming biases presented by traditional assessment, and second, to validate the appropriateness of virtual contents based on psychological needs.

Conclusions
Motivation, as mentioned in the introduction, is an unobservable and internal psychological process that drives observable human behaviors in order to satisfy needs for maintaining and improving mental health and well-being [1,2]. Accordingly, virtual serious games and a stealth assessment method can be powerful tools for improving the assessment of psychological needs, overcoming limitations and biases presented by traditional measures, providing simulated situations similar to real ones, and gathering data from observable human behaviors during the gameplay.
Furthermore, SML techniques can handle vast datasets, with effectiveness in the recognition of psychological constructs by advanced technologies, allowing the control of individual variability dimensions and factors. Another advantage of applying SML techniques involves validation of the related virtual content, in this case specific personality constructs. According to these factors, two conclusions can be inferred: (1) the VSG content was able to recognize attachment styles, self-esteem, and self-efficacy beliefs and, to a lesser extent, the stress avoidance need; and (2) all of the results were consistent with studies on previous traditional measures, suggesting that VSG can also be an effective measurement tool alongside traditional ones to assess personality constructs.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to restrictions on privacy policy on sensitive data categories.