Computational Psychometrics Using Psychophysiological Measures for the Assessment of Acute Mental Stress

The goal of this study was to provide reliable quantitative analyses of psychophysiological measures during acute mental stress. Acute, time-limited stressors are used extensively as experimental stimuli in psychophysiological research. In particular, the Stroop Color Word Task and the Arithmetical Task have been widely used in several settings as effective mental stressors. We collected psychophysiological data on blood volume pulse, thoracic respiration, and skin conductance from 60 participants at rest and during stressful situations. Subsequently, we used univariate statistical tests and multivariate computational approaches to conduct comprehensive studies on the discriminative properties of each condition in relation to psychophysiological correlates. The results showed evidence of a greater discrimination capability of the Arithmetical Task compared to the Stroop test. The best predictors were the short-term Heart Rate Variability (HRV) indices, in particular, the Respiratory Sinus Arrhythmia index, which in turn could be predicted by other HRV and respiratory indices in a hierarchical, multi-level regression analysis. Thus, computational psychometrics analyses proved to be an effective tool for studying such complex variables. They could represent the first step in developing complex platforms for the automatic detection of mental stress, which could improve its treatment.


Introduction
Mental stress is an important factor potentially affecting mental and physical functions. According to classical theories, stressors can be defined as challenging events requiring physiological and behavioral responses that are aimed at reinstating homeostasis. Effective coping strategies involve a rapid response that also has to be efficiently terminated afterward. If the response to the stressor is inadequate, the biological costs may become too high [1].
In the last half century, several studies have demonstrated the role of mental stress in many disorders, such as cardiovascular diseases, abnormal cognitive functioning, or psychiatric disorders [2,3]. For instance, the inability to cope with stressors has been associated with the hypersecretion of corticosteroids and with an increased risk of depressive onset [1]. In addition, early and prolonged exposure to stressors has been shown to affect neurodevelopment, involving both neurobiological and [...] of the effectiveness of the two acute stressors (Stroop Task and Arithmetic Task) by means of computational analyses.
On the other hand, the price of CE-marked medical devices has dropped significantly, along with a wide inclusion of biosensors in wristbands or other wearable devices [33][34][35]. An example is wrist photoplethysmography, which has been used to record blood volume pulse (BVP, an analog of the electrocardiogram). In practice, the use of wearable sensors for both clinical and leisure purposes has been increasing. Such sensors are becoming more technologically advanced and provide excellent sampling rates, acceptable precision, and effective artifact removal by algorithms that are embedded in the firmware [33,36,37,38,39]. Nevertheless, some problems with recording HRV using commercial biosensors are related to the sampling rate, which, according to the standard HRV guidelines, needs to be at least 100 Hz [9]. The sampling rate of commercial HRV sensors, such as the Apple Watch, is about 100 Hz. Moreover, manufacturers have taken battery consumption into account by computing the indices in the biosensor firmware, so that only the indices, rather than the raw data, can be transferred to an application. The use of computational algorithms for acute, time-limited stressors that we assessed can provide a practical understanding and ready solution for the effective detection of these important variables in our daily lives.

Participants
The participants in the study were 60 healthy students (30 males and 30 females) with a mean age of 21.2 years (SD = 2.25), ranging from 19 to 25 years. They were requested not to drink caffeine or alcohol and not to smoke prior to the experimental test in order to avoid any effects of these substances on the central autonomic nervous system. All the experiments were conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of Istituto Auxologico Italiano. Written informed consent was obtained from each participant.

Procedures
The participants who met the experimental criteria were contacted via email and/or telephone to schedule a meeting. The researcher assisted them during the sessions and maintained a neutral tone of voice and neutral behavior while the participants were being exposed to the experimental stimuli. The participants were asked to sit in front of a computer, and they were told about the general goals of the research, the procedures to be used, and the concerns associated with study involvement. The broad functions of the electrodes were explained relative to their use in the collection of the psychophysiological indices. The researcher attached the sensor electrodes in the following order. The first sensor was the thoracic respiration belt. Two skin conductance adhesive patches were then applied to the left palm. Last, the blood volume pulse (BVP) sensor was placed on the top of the index finger of the left hand. All participants were right-handed (without a history of switching the dominant hand during their lifetimes). When the participant indicated that he or she was comfortable, the researcher asked him or her to remain still during the presentation of the stimuli in order to avoid movement artifacts in the signal acquisition. At the end of the experimental session, the experimenter helped the participants remove all of the electrodes and patches and explained the aims of the experiment and the scientific rationale for using the stimuli.

Experimental Stimuli
A four-minute baseline session was conducted with all participants to establish a stable reference. The order of the application of the stimuli was randomized. The sessions included: (1) a four-minute relaxation period, i.e., panoramic slide show, (2) an acute time-limited stressor, i.e., the four-minute Stroop Color Word Task (SCWT), simply indicated as "Stroop" in the analyses, and (3) a four-minute arithmetic task (AT), simply indicated as "Arithmetic" in the analyses. The relaxation session comprised a series of panoramic photographs that were validated by Mauri and colleagues [40]. In the Stroop Color Word Task, the participants were required to name the colors of the words that were congruent or incongruent with the words' meaning [20,40]. In the Arithmetic task, the participants were required to subtract 17 from 1000, subtract 17 from the result, and continue doing so until the end of the 4-minute session. They were asked to answer as accurately as possible [21].
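The randomized presentation order and the expected answers of the serial subtraction can be sketched as follows. This is a minimal illustration, not the software used in the study: the function names, the seed, and the number of answers generated are assumptions made for the example.

```python
import random

CONDITIONS = ["Relax", "Stroop", "Arithmetic"]


def randomized_order(conditions, seed=None):
    """Return the conditions in a random order (hypothetical helper)."""
    order = list(conditions)
    random.Random(seed).shuffle(order)
    return order


def serial_subtraction_key(start=1000, step=17, n_answers=10):
    """Expected answers when repeatedly subtracting `step` from `start`."""
    return [start - step * i for i in range(1, n_answers + 1)]


order = randomized_order(CONDITIONS, seed=7)  # the seed is arbitrary
key = serial_subtraction_key()
print(order)
print(key[:5])  # [983, 966, 949, 932, 915]
```

An answer key of this kind would only be needed if accuracy were scored; the text above states only that participants were asked to answer as accurately as possible.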

Recording of the Physiological Signals
Data on autonomic nervous system activity were collected by measuring three physiological responses, i.e., Blood Volume Pulse, Galvanic Skin Response, and Respiration. These responses were acquired by means of a ProComp Infiniti device from Thought Technology, and BioGraph Infiniti 5.0.2 software was used to record them. The responses were then processed with custom software developed using MATLAB 7.10.0 (R2010a) (The Mathworks, Inc.; Natick, MA, USA). Every channel was acquired synchronously at 2048 Hz and extracted at 256 Hz for computation of the indices.
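The text does not state how the 2048 Hz recordings were reduced to 256 Hz. The sketch below shows one simple possibility, under the assumption that non-overlapping blocks of eight samples are averaged; the function name and the averaging filter are illustrative choices, not the procedure of the recording software.

```python
def downsample_by_mean(samples, src_hz=2048, dst_hz=256):
    """Reduce the sampling rate by averaging non-overlapping blocks.

    Assumption: src_hz is an integer multiple of dst_hz (2048 / 256 = 8).
    Averaging acts as a crude low-pass filter before decimation.
    """
    if src_hz % dst_hz != 0:
        raise ValueError("src_hz must be an integer multiple of dst_hz")
    factor = src_hz // dst_hz
    n_blocks = len(samples) // factor  # an incomplete trailing block is dropped
    return [
        sum(samples[i * factor:(i + 1) * factor]) / factor
        for i in range(n_blocks)
    ]


# One second of a synthetic ramp sampled at 2048 Hz.
raw = [float(i) for i in range(2048)]
reduced = downsample_by_mean(raw)
print(len(reduced))  # 256
print(reduced[0])    # 3.5, the mean of samples 0..7
```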

Psychophysiological Signal Processing
Cardiovascular and respiratory activities were monitored to evaluate both the voluntary and autonomic effects of respiration on heart rate. We analyzed the Inter-Beat Interval (IBI) extracted from the Blood Volume Pulse sensor, a measure equivalent to the RR interval extracted from the electrocardiogram; respiration (from a chest strip sensor); and their interactions. The inter-beat interval (IBI, hereafter also RR) was transformed into an estimate of heart rate (HR), and the pulse amplitude (BVP Amplitude), which represents the relative increase in blood volume, was also extracted. The two indices used to represent the heart rate measurements from BVP were the mean HR (in beats per minute) and the RR mean (60,000/HR, in milliseconds). According to the guidelines of the Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology, typical temporal and spectral Heart Rate Variability (HRV) indices can be extracted to evaluate the response of the autonomic nervous system [9,40]. As time-domain measures of the variability of heart rate, we calculated the standard deviation of the BVP IBI (SDRR) and the standard deviation of the average beat-by-beat heart rate (SDHR). The third measure in the temporal domain was the BVP amplitude, which represents the relative increase in blood volume caused by the heart's contraction (vasoconstriction) and displays the moment-by-moment HRV, thereby providing significant insight into an individual's emotional responses. For the frequency domain, spectral analysis was performed using Fourier spectral methods. In particular, standard HRV spectral indices and similar indices were used to evaluate the response of the autonomic nervous system. We calculated the magnitude of the peak frequency (also indicated as RR peak frequency) in the power spectrum.
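The time-domain indices described above can be illustrated with a short sketch. It assumes that the inter-beat intervals are already available in milliseconds and that SDHR is the sample standard deviation of the beat-by-beat heart rate; the function name and the example values are hypothetical.

```python
from statistics import mean, stdev


def time_domain_indices(ibi_ms):
    """Compute simple time-domain indices from inter-beat intervals (ms)."""
    if len(ibi_ms) < 2:
        raise ValueError("need at least two inter-beat intervals")
    hr_bpm = [60000.0 / ibi for ibi in ibi_ms]  # beat-by-beat heart rate
    return {
        "RR mean (ms)": mean(ibi_ms),
        "HR mean (bpm)": mean(hr_bpm),
        "SDRR (ms)": stdev(ibi_ms),   # sample standard deviation of the IBI
        "SDHR (bpm)": stdev(hr_bpm),  # sample standard deviation of the HR
    }


# Made-up intervals for illustration only.
example_ibi = [800.0, 820.0, 780.0, 810.0, 790.0]
indices = time_domain_indices(example_ibi)
for name, value in indices.items():
    print(name, round(value, 3))
```

Note that the mean of the beat-by-beat heart rate is close to, but not identical with, 60,000 divided by the mean RR interval, because the conversion is nonlinear.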
The rhythms were classified as very low frequency (VLF < 0.04 Hz), low-frequency (LF, between 0.04 and 0.15 Hz), and high frequency (HF, from 0.15 to 0.5 Hz) oscillations. This procedure also allowed us to calculate the LF/HF ratio, a well-known sympathovagal balance index.
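The band definitions above can be turned into band powers, the LF/HF ratio, and a peak frequency as sketched below. This is an illustration under stated assumptions, not the study's MATLAB code: the IBI series is linearly interpolated onto a 4 Hz grid, a plain (unwindowed) periodogram is computed with a direct Fourier sum, and all function names are hypothetical.

```python
import math

BANDS = {"VLF": (0.0, 0.04), "LF": (0.04, 0.15), "HF": (0.15, 0.5)}


def resample_rr(ibi_ms, fs=4.0):
    """Linearly interpolate an IBI series (ms) onto an even time grid."""
    beat_times, t = [], 0.0
    for ibi in ibi_ms:
        t += ibi / 1000.0  # each beat ends one interval later
        beat_times.append(t)
    n = int((beat_times[-1] - beat_times[0]) * fs)
    grid, j = [], 0
    for k in range(n):
        tk = beat_times[0] + k / fs
        while beat_times[j + 1] < tk:
            j += 1
        t0, t1 = beat_times[j], beat_times[j + 1]
        w = (tk - t0) / (t1 - t0)
        grid.append(ibi_ms[j] + w * (ibi_ms[j + 1] - ibi_ms[j]))
    return grid


def spectral_indices(x, fs, bands=BANDS):
    """Band powers, LF/HF ratio and peak frequency from a periodogram."""
    n = len(x)
    mean_x = sum(x) / n
    x = [v - mean_x for v in x]  # remove the DC component
    out = {name: 0.0 for name in bands}
    peak_f, peak_p = 0.0, -1.0
    for k in range(1, n // 2 + 1):
        f = k * fs / n
        re = sum(v * math.cos(2 * math.pi * k * i / n) for i, v in enumerate(x))
        im = sum(v * math.sin(2 * math.pi * k * i / n) for i, v in enumerate(x))
        p = (re * re + im * im) / n  # unnormalised power, adequate for ratios
        if p > peak_p:
            peak_f, peak_p = f, p
        for name, (lo, hi) in bands.items():
            if lo <= f < hi:
                out[name] += p
    out["LF/HF"] = out["LF"] / out["HF"] if out["HF"] else float("inf")
    out["peak frequency (Hz)"] = peak_f
    return out


# Synthetic IBI series: strong 0.10 Hz and weak 0.25 Hz oscillations.
ibi, t = [], 0.0
while t < 120.0:
    value = (800 + 50 * math.sin(2 * math.pi * 0.10 * t)
             + 10 * math.sin(2 * math.pi * 0.25 * t))
    ibi.append(value)
    t += value / 1000.0

result = spectral_indices(resample_rr(ibi), fs=4.0)
print(result)
```

In this synthetic series the low-frequency component dominates by construction, so the ratio is well above one; real recordings would normally also be detrended and windowed before the spectrum is estimated.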
The respiration signal was filtered to produce a smooth sinusoidal signal [40,41]. The Respiration Period index represents the peak-to-peak time (the maximum-to-maximum distance of the sinusoid), which allowed us to compute the Respiration Rate, corresponding to the breaths per minute (BPM). Additionally, the tidal volume of the air moving in and out of the lungs during breathing was measured to obtain the respiratory amplitude. The Respiratory Amplitude index represents the peak-to-peak amplitude, i.e., the vertical distance between the peak (highest amplitude value) and the trough (lowest amplitude value).
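The respiration indices described above can be sketched on a synthetic, already smoothed signal. The simple local-extremum detector, the function names, and the 32 Hz test signal are assumptions made for illustration; a real belt signal would need the filtering step mentioned in the text first.

```python
import math


def local_extrema(x):
    """Return the indices of local maxima and minima of a smooth signal."""
    peaks, troughs = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            peaks.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            troughs.append(i)
    return peaks, troughs


def respiration_indices(x, fs):
    """Respiration period (s), rate (breaths/min) and amplitude."""
    peaks, troughs = local_extrema(x)
    if len(peaks) < 2:
        raise ValueError("need at least two respiratory peaks")
    periods = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    period_s = sum(periods) / len(periods)
    # Pair each peak with the first trough that follows it.
    amplitudes = []
    for p in peaks:
        later = [t for t in troughs if t > p]
        if later:
            amplitudes.append(x[p] - x[later[0]])
    if not amplitudes:
        raise ValueError("no peak is followed by a trough")
    return {
        "period_s": period_s,
        "rate_bpm": 60.0 / period_s,
        "amplitude": sum(amplitudes) / len(amplitudes),
    }


# Synthetic breathing at 0.25 Hz (15 breaths per minute), 60 s at 32 Hz.
fs = 32
signal = [2.0 * math.sin(2 * math.pi * 0.25 * i / fs) for i in range(60 * fs)]
resp = respiration_indices(signal, fs)
print(resp)
```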
The interaction between cardiovascular and respiratory activity can also be considered using the Respiratory Sinus Arrhythmia (RSA) index [9,40,41]. HR Max-HR Min is the peak-to-trough difference in heart rate that occurs during a full breath cycle. This metric is affected by RSA and is generally described as a measure of vagal tone (vagus nerve activity). High values of HR Max-HR Min represent high vagal tone.
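The HR Max-HR Min index can be sketched as the peak-to-trough heart rate difference within each breath cycle, averaged across cycles. The sketch assumes an evenly sampled heart rate series and breath-cycle boundaries given as sample indices (for example, consecutive respiratory peaks); the function name and the synthetic data are hypothetical.

```python
import math


def rsa_peak_to_trough(hr_bpm, cycle_starts):
    """Mean HR Max - HR Min across complete breath cycles.

    hr_bpm: heart rate samples on an even time grid.
    cycle_starts: sample indices marking the start of each breath cycle;
    consecutive entries delimit one full cycle.
    """
    differences = []
    for start, end in zip(cycle_starts, cycle_starts[1:]):
        segment = hr_bpm[start:end]
        if segment:
            differences.append(max(segment) - min(segment))
    if not differences:
        raise ValueError("need at least one complete breath cycle")
    return sum(differences) / len(differences)


# Synthetic heart rate at 4 Hz oscillating by +/- 5 bpm with each 4 s breath.
fs = 4
hr = [70 + 5 * math.sin(2 * math.pi * 0.25 * i / fs) for i in range(12 * fs)]
cycles = [0, 16, 32, 48]  # one breath cycle every 16 samples
rsa = rsa_peak_to_trough(hr, cycles)
print(round(rsa, 6))  # 10.0
```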
Skin Conductance (SC) and Skin Resistance (SR) are measures of electrodermal activity, expressed as either conductance (microsiemens) or its reciprocal, resistance (megaohms). SC reflects a fairly slow physiological process, and it can be sampled at 32 Hz without distortion. The signal that we considered was expressed in microsiemens. To calculate the mean index of SC, we considered the mean of the sampled signal after the removal of artifacts.

Statistical Analyses
First, we analyzed the data using STATA MP-Parallel Edition, Release 14.0 (StataCorp LP, College Station, TX, USA), SPSS, Release 21 (IBM Corporation, Armonk, NY, USA), and JASP, Release 0.7.1.4 [42]. Conditions were compared using repeated-measures analysis of variance (rmANOVA). Mauchly's test of sphericity was significant at the 0.05 level, indicating that the sphericity assumption was violated, so we adjusted the degrees of freedom of the F-test using the Greenhouse-Geisser correction [43,44]. The corrected p-values were reported accordingly. The paired conditions were compared using pairwise comparisons, with the alpha level adjusted by the Bonferroni correction to avoid an inflated type I error rate across multiple statistical comparisons.
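The Bonferroni adjustment for the three pairwise comparisons can be sketched as follows. The p-values in the example are made up for illustration and are not the results of the study; the function name is hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Apply the Bonferroni correction to a dict of raw p-values."""
    m = len(p_values)
    threshold = alpha / m  # per-comparison alpha level
    return {
        name: {
            "p_adjusted": min(1.0, p * m),  # equivalent adjusted p-value
            "significant": p < threshold,
        }
        for name, p in p_values.items()
    }


# Made-up p-values for illustration only; these are NOT the study's results.
raw_p = {
    "Relax vs Stroop": 0.004,
    "Relax vs Arithmetic": 0.0002,
    "Stroop vs Arithmetic": 0.03,
}
decisions = bonferroni(raw_p)
for name, outcome in decisions.items():
    print(name, outcome)
```

With three comparisons, the per-comparison threshold is 0.05/3, or about 0.0167, so a raw p-value of 0.03 would no longer count as significant after correction.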

Computational Analyses
Computational analyses were done using Python 3.4 with the Orange 3.3.5 data mining suite, which is freely available as open source (https://github.com/biolab/orange3), so that all of the algorithms used in the article can be inspected. In particular, a stratified 10-fold cross-validation was done using the following methods [45,46]: (1) a Logistic Regression classification algorithm with ridge regularization; (2) Random Forest classification using an ensemble of decision trees; (3) a Support Vector Machine (SVM), which maps inputs to higher-dimensional feature spaces that best separate the different classes; and (4) Naïve Bayes, a probabilistic classifier based on Bayes' theorem. Documentation for these algorithms, including rank calculation, classification trees, and learners, can be found in the scikit-learn user guide (http://scikit-learn.org/stable/user_guide.html).
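The idea behind stratified 10-fold cross-validation can be sketched in plain Python. This is not the Orange or scikit-learn implementation: the function name is hypothetical, and the example assumes one labelled epoch per participant and condition (60 x 3 = 180 epochs), which is an assumption about how the data were arranged.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs that preserve class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # deal each class round-robin
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for f, fold in enumerate(folds) if f != i for idx in fold
        )
        yield train_idx, test_idx


# Assumed layout: 60 participants, one epoch per condition.
labels = ["Relax", "Stroop", "Arithmetic"] * 60
splits = list(stratified_k_fold(labels, k=10, seed=1))
train_idx, test_idx = splits[0]
print(len(train_idx), len(test_idx))  # 162 18
```

Each test fold here contains six epochs of every condition, so each learner is evaluated on the same class balance as the full data set.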

Results
In the first analysis, we used classical null hypothesis significance testing (NHST) with a repeated-measures design using rmANOVA and pairwise comparisons. The idea was to determine whether our three conditions (Relax, Stroop, and Arithmetic) differ and subsequently compare the pairs (Relax vs. Stroop, Relax vs. Arithmetic, and Stroop vs. Arithmetic) to identify specific differences. By using the psychophysiological indexes computed for each epoch condition, we obtained the descriptive statistics reported in Table 1. The rmANOVA univariate tests showed statistical significance (Table 2), and pairwise comparisons (Table 3) confirmed the differences between Relax and the two stressors in all measures (except LF/HF), but there was no clear statistical evidence differentiating the two stressors from each other. The NHST revealed univariate differences in the psychophysiological correlates of the Relax condition with respect to the stressors. However, it was not possible to state which of the two acute time-limited stressors was more strongly discriminated from Relax.
To collect more information regarding this issue, we conducted computational analyses using a stratified 10-fold cross-validation with the indices ranked as shown in Figure 1 (for additional information about the rank scoring algorithms that were used for the Python computation, please see http://docs.orange.biolab.si/3/data-mining-library/reference/preprocess.html). The results showed a precision between 69.4% and 74.9% (Table 4), with most of the loss due to epochs predicted as Relax when the actual condition was the Stroop task, with an error ranging from 7% to 10%, as highlighted in the confusion matrices (Figure 2). An additional analysis of the classification is shown in Figure 3, where the true positive rate (sensitivity) is plotted against the false positive rate (1 - specificity), together with the hierarchical classification of measures. Figure 4 shows the classification tree (http://scikit-learn.org/stable/modules/tree.html) that was developed.
Table 4. Stratified 10-fold cross-validation. Four learning algorithms were compared: (1) logistic regression, (2) random forest, (3) support vector machine, and (4) Naïve Bayes. In the analysis, the classification learning algorithm was used for the classifications referring to the test used for ranking (Figure 1) [30,47,48,49].
A multi-level regression analysis was used to estimate the effect of HRV and respiratory indices on a possible global stress level measured through RSA. This statistical approach, also known as the Hierarchical Linear Model (HLM) [50], was chosen because the within-subject data were collected at three time points (the conditions), determining a nested data structure. The log-likelihood ratio test was then used to determine which model provided the best fit to the data. The results of the multi-level regression analysis for the RSA index (HR Max-HR Min) are shown in Table 5 with the Beta slopes of the regression and the statistical significance levels.
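How a confusion matrix and the per-class precision relate to each other can be sketched with a toy example. The labels and predictions below are invented for illustration and do not reproduce Table 4 or Figure 2; the function names are hypothetical.

```python
from collections import Counter

CLASSES = ["Relax", "Stroop", "Arithmetic"]


def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


def precision_per_class(matrix, classes):
    """Precision = correct predictions of a class / all predictions of it."""
    result = {}
    for j, name in enumerate(classes):
        predicted_total = sum(row[j] for row in matrix)
        result[name] = (
            matrix[j][j] / predicted_total if predicted_total else float("nan")
        )
    return result


# Toy data: some Stroop epochs are mistaken for Relax.
actual = ["Relax"] * 4 + ["Stroop"] * 4 + ["Arithmetic"] * 4
predicted = (
    ["Relax", "Relax", "Relax", "Stroop"]
    + ["Relax", "Relax", "Stroop", "Stroop"]
    + ["Arithmetic", "Arithmetic", "Arithmetic", "Stroop"]
)
matrix = confusion_matrix(actual, predicted, CLASSES)
precision = precision_per_class(matrix, CLASSES)
print(matrix)     # [[3, 1, 0], [2, 2, 0], [0, 1, 3]]
print(precision)  # {'Relax': 0.6, 'Stroop': 0.5, 'Arithmetic': 1.0}
```

In this toy matrix, the off-diagonal cell in the Stroop row and Relax column plays the role of the error type described above, namely epochs predicted as Relax when the actual condition was Stroop.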

Discussion
The goal of this study was to quantitatively analyze psychophysiological measures during mental stress. Consistent with a great number of studies [19], we adopted the Stroop Task and the Arithmetic Task to induce mental stress. We computed several indices using signal processing data analysis. Cardiovascular activity, respiratory activity, and their interactions were measured by heart rate variability indices in the time and spectral domains and respiratory sinus arrhythmia. In addition, we considered skin conductance as an index of sympathetic activity.
Castaldo and colleagues [19] reported the expected values for psychophysiological HRV indices during acute mental stress, considering each one separately with a univariate statistical approach. In this sense, our results, except for the HF index, confirmed the same trends highlighted by Castaldo and colleagues in the final meta-analysis that included all the studies in the calculation [19].
In particular, cardiovascular activity showed increased physiological activation, with an increased heart rate (HR) and a decrease in its inverse, expressed as the RR peak distance in milliseconds (RR mean). Temporal short-term heart rate variability decreased, with an increase in the standard deviation of the heart rate (SDHR), corresponding to a decrease in the SDNN, RMSSD, and pNN50 indices reported by Castaldo in the meta-analysis [19].
Regarding the frequency domain features, Castaldo reported that during acute mental stress, most studies showed increased sympathetic activation, as measured by the low-frequency index (LF), and decreased parasympathetic activation, as measured by the high-frequency index (HF). The consequent sympathovagal balance is simply the ratio between these two indexes (LF/HF) and reflects their trends accordingly. In our study, we also found an increase in sympathetic activity, with a significantly higher LF during acute mental stress compared to the Relax condition. On the other hand, we found a different trend in HF, which also indicated an increase in parasympathetic activity. This result is not actually surprising. In fact, Castaldo reported that Vuksanovic and colleagues noted the same trend in a 2007 study with an arithmetic task that was exactly the same as the one we used [19]. This effect could be due to an engagement that would produce a higher level of parasympathetic activity expressed with high frequencies (HF), as has also been reported in other studies [40,41]. In the frequency domain, we also calculated very low frequencies, which showed the same trend as high frequencies, as expected. We also calculated the peak frequency index, which we suggest integrating into further studies on acute mental stress.
Castaldo and colleagues showed the trends of each HRV index during acute mental stress tasks [19]. However, a substantial gap seems to exist in the literature, which has not considered the respiration indexes and the role of respiration in heart rate variability. Since respiration is a very easy signal to record, we included it in the analysis of our 60 participants and found that, because of its important properties, it can be a useful tool to explore in depth in the future. Our results showed a specific pattern of respiration during acute mental stress tasks. In particular, acute mental stress affects respiration by increasing respiration amplitude and respiration period, which means fewer breaths per minute but deeper respiration. Mauri and colleagues also showed an increased respiration period during acute mental stress [40].
By recording a respiration signal, we were also able to compute another important index, namely the respiratory sinus arrhythmia (RSA), which reflects heart rate variability in synchrony with respiration. RSA is an indirect measure of vagal tone, reflecting the way in which the vagus nerve regulates emotional functions. The RSA index that we used (HR Max-HR Min) is one possible measure of vagal tone [40,41]. Our results showed a decreased vagal tone during acute mental stress. Moreover, this index appears to effectively segregate the data into the three conditions. Indeed, HR Max-HR Min, RR peak frequencies, and Respiration Amplitude are able to predict Relax or the two stress tasks (Figure 4), but they are not widely reported in the literature. As such, they should be considered more consistently in future studies on acute mental stress.
Our results are consistent with the general values and directions of the current scientific literature on the effects of acute, time-limited stressors on HRV indices, as reported by Castaldo and colleagues [19]. However, our study went well beyond the current studies in the two aspects described below.
First, different stressors have never been compared before. In fact, although some studies evaluated different types of stressors, they never compared them [19]. In this respect, our study highlighted the substantial differences between the Stroop and Arithmetic tasks and clarified how different types of mental stress induce different psychophysiological reactions, even if in the same direction. In particular, the Stroop Task appeared to be more related to short attentional processes than to mental stress, probably because, as a cognitive task, it also induced engagement states that failed to differ from Relax in these respects [40]. However, even if less engaging, the Arithmetic task produced a physiological pattern indicating acute mental stress. This task can therefore be considered more appropriate for determining the associations between acute mental stress and short-term psychophysiological reactions, which is consistent with the literature overview [19,40,41]. The Stroop Task is thought to involve executive functions, as subjects are required to act differently from their usual tendencies (naming the color rather than reading the word). Neuroimaging studies have shown that this task activates the anterior cingulate gyrus, the dorsolateral PFC, and the parietal area [41,51,52,53].
Second, we included respiration signal recording in our study. Indeed, although this is a very simple signal to record, studies investigating respiration indices during acute mental stress are lacking, and our contribution represents a first step toward a deeper understanding of the influence of respiration on mental stress in general. Respiration has the advantage of being under the direct control of conscious states, which means that we are able to modulate respiration rate and amplitude at will. This study aims to promote the use of respiration techniques to treat mental stress. However, the extent to which we can invert the process needs to be investigated. In fact, from our study, it seems clear (although further investigation is needed) that higher acute mental stress is associated with higher respiration amplitude and period; nevertheless, the extent to which reducing respiration amplitude and period can reduce mental stress needs to be examined carefully, and this would be consistent with the widely used HRV biofeedback techniques. In this sense, further investigation of the RSA is an important future challenge.
In our study, we also used computational techniques based on open-source algorithms developed in Python that can classify mental states based only on psychophysiological measures. This aspect poses new challenges for the automatic recognition of stress using machine learning algorithms, which can be implemented in advanced platforms for the recognition of mental stress. In fact, these platforms would require a training dataset based on the subject's signals in order to work properly. Our study demonstrated that the Arithmetic task, rather than the Stroop, was a better first task for a machine learning training set. Interestingly, the Arithmetic task was also easier to administer in practice than the Stroop, which required a monitor showing colored words. Moreover, the Arithmetic task could be implemented easily in a mobile app, for example, as a text message or a recorded message with simple instructions.
The classification tree (Figure 4) and the hierarchical regression analysis (Table 5) highlighted the importance of combining cardiovascular and respiratory aspects by means of the RSA indexes and also through the Respiration Period (see Figure 6). This result underlines the importance of the respiratory signal, which is often neglected in mental stress research. This is especially relevant when the aim is to collect data during daily activities by means of wearable sensors, where the inclusion of a strip can be difficult. For unobtrusiveness, BVP and GSR can be detected by a wrist sensor, and a belt over the clothing can be used to detect the respiration signal. The results of our study suggest that a respiratory strip should be included in mental health studies, especially when it is important to detect short-duration events.
Our findings have implications for the computational psychometrics field [54,55]. The results of this study showed a significant potential of new computational techniques. In particular, the challenge of providing automatic feedback to users and patients can allow the development of new forms of treatment based on psychophysiological sensors. Indeed, biofeedback [56], a validated treatment for mental stress based on heart rate variability [57] or other physiological indexes [58][59][60], can be implemented in applications for the management of mental stress by using the same sensors that were used to assess stressful events. Currently, just one platform has been developed for this purpose by Gaggioli and colleagues in a block-randomized controlled trial [33], but the application of such sensors could be extended to other uses. Our findings could be extended to the implementation of an actual platform as an additional step forward in the treatment of mental stress.