A Computational Approach for the Assessment of Executive Functions in Patients with Obsessive–Compulsive Disorder

Previous studies on obsessive–compulsive disorder (OCD) showed impairments in executive domains, particularly in cognitive inhibition. Virtual reality has shown considerable potential for the assessment of executive functions; however, to date, no study on the assessment of these patients has taken advantage of virtual environments. One of the main problems faced within assessment protocols is the use of a limited number of variables and tools when tailoring a personalized program. The main aim of this study was to provide a heuristic decision tree for the future development of tailored assessment protocols. To this purpose, we conducted a study that involved 58 participants (29 OCD patients and 29 controls) to collect both classic neuropsychological data and precise data based on a validated virtual reality protocol for the assessment of executive functions, namely, the VMET (virtual multiple errands test). In order to provide clear indications for working on executive functions with these patients, we carried out a cross-validation based on three learning algorithms and computationally defined two decision trees. We found that, by using three neuropsychological tests and two VMET scores, it was possible to discriminate OCD patients from controls, opening a novel scenario for future assessment protocols based on virtual reality and computational techniques.


Introduction
According to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) [1], patients with obsessive-compulsive disorder (OCD) usually show obsessions and/or compulsions that reduce quality of life because they interfere with daily routines, as well as with work, social, or family life. This disorder affects about 2% of the population, and the World Health Organization highlighted that OCD is one of the top 20 causes of disability in subjects within the 15-44 age range [2]. Moreover, OCD patients show dysfunctions in executive domains, particularly in cognitive inhibition [3], probably caused by a serotoninergic and dopaminergic dysfunction [4]. A deficit in executive functions may cause problems when responding to both internal and external demands, because it impairs the ability to manage and orient the necessary cognitive resources. Specifically, the term "executive function" indicates a complex domain that includes a large number of cognitive processes and behavioral capabilities, such as problem-solving, planning, sequencing, sustaining attention, resisting interference, utilizing feedback, cognitive inhibition, multitasking, and cognitive flexibility [5][6][7]. Despite this evidence, research on neuropsychological impairments in OCD has produced unclear results [8]. Some symptoms of this syndrome, such as checking behaviors, are strongly related to dysexecutive deficits. It is important to understand which ones have a causal role in OCD and which ones are a consequence of the syndrome [9]. Some articles showed deficits in planning abilities and nonverbal memory, other studies reported deficits in cognitive flexibility and inhibition, and others found no neuropsychological deficits [10][11][12][13][14]. There are many possible explanations, e.g., differences in the methodology, in the instruments used, or in the characteristics of the samples.
Specifically, the diversity of tools used during the assessment phase is related to an unsolved debate: the difference between paper-and-pencil and ecological tests [15]. Classic tasks in a classic setting analyze single aspects of complex domains and require simple responses to single events. Conversely, tasks in naturalistic settings may analyze cognitive functions more completely, requiring complex answers and, sometimes, the inhibition of inappropriate or irrelevant actions within several subtasks [6]. Therefore, it is critical to increase the ecological validity of a neuropsychological battery, especially for a complex cognitive domain such as executive function. The assessment procedure has to become more sensitive to different aspects of patient behavior reflecting real-life situations [16]. However, it is difficult to create a feasible assessment of executive functions during real-life situations because of implementation problems and the difficulty in involving patients in the procedure [17].
Virtual reality (VR) represents a valid solution to address the problems of classical assessment protocols. VR is a technology that allows users to actively interact with a computer-generated three-dimensional environment that simulates the real world [18]. This technology allows subjects to explore and manage several situations inspired by daily experiences, using behaviors that correspond to real ones, in a more controlled, safe, and low-cost setting than real-life situations [19]. In the last few years, VR has been applied to the assessment and rehabilitation of several psychological disorders such as post-traumatic stress disorder [20][21][22], anxiety [23,24], and eating disorders [25], as well as to neuropsychological domains such as neglect [26,27], executive functions [28,29], decision-making [30], spatial memory and orientation [31][32][33], and cognitive rehabilitation of schizophrenia [34].
In the current study, we proposed a computational approach based on classification learning algorithms to discriminate OCD patients from a control group. Some studies highlighted the usefulness of a VR-based approach for the assessment of executive functions in OCD patients [35][36][37]. These patients showed a specific pattern of symptoms that could be detected with the use of VR or an integrated assessment. Our purpose was to provide a heuristic decision tree for a more precise but simple diagnosis, giving evidence-based indications for a possible rehabilitation protocol using virtual reality tailored to these patients. The assessment of OCD is usually based on a clinical interview, integrated with a questionnaire and, only in rare cases, with a neuropsychological assessment. This last aspect is important because it may highlight a particular pattern of functioning in OCD patients [38]. We sought to indicate the most useful assessment tools for the analysis of these specific aspects.

Participants
The 58 participants consisted of 29 OCD patients (mean age: 33.07; SD: 9.91), diagnosed by a clinical psychologist or psychiatrist as meeting the DSM-5 [1] criteria for OCD, and 29 healthy and partially matched controls (mean age: 40.48; SD: 15.59). The mean value for years of education (y.o.e.) of patients was 12.03 (SD: 3.2) and that for the healthy control group was 12.03 (SD: 3.2). In the OCD group, there were 14 females and 15 males, and, in the healthy control group, there were 14 females and 15 males. In Table 1, the descriptive statistics of both groups are presented. In Table 2, a comparison between age and y.o.e. and between the cognitive level of control and patient groups is presented. According to the results, the two groups were matched for both age and y.o.e. but not for cognitive level. The general exclusion criteria were as follows: (1) presence of sensory and/or motor limitation; (2) presence of deficit in general cognitive level (Mini Mental State Examination <19); (3) deficit in perception (Street Test <2.25); (4) deficit in language comprehension (Token Test <26.5); (5) anxiety (State Trait Anxiety Inventory, STAI >40); (6) depression (Beck Depression Inventory >16). We did not control the level of OCD symptoms with quantitative methods; however, all patients were currently undergoing treatment and, according to a clinician, in partial remission. Furthermore, OCD patients with comorbidities and healthy controls with any psychiatric diagnosis were excluded. The patients involved were treated with both drugs and psychotherapy according to the standard of care. All participants were experienced with the use of personal computers (PCs) and came from the hospital's area. Participants were asked not to drink caffeine or alcohol and not to smoke prior to the experimental test to avoid any effects of these substances on test execution and performance.
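The numbered exclusion criteria above amount to a simple screening rule. The following is a minimal sketch in Python, not the procedure actually used in the study: the `Candidate` record, its field names, and the example values are hypothetical and introduced only for illustration, while the cut-offs are those stated in the text.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    # Hypothetical record; field names are illustrative, not from the study.
    sensory_or_motor_limitation: bool
    mmse: float         # Mini Mental State Examination
    street_test: float  # Street Test
    token_test: float   # Token Test
    stai: float         # State Trait Anxiety Inventory
    bdi: float          # Beck Depression Inventory

def exclusion_reasons(c):
    """Return the exclusion criteria met by a candidate (empty list if eligible)."""
    reasons = []
    if c.sensory_or_motor_limitation:
        reasons.append("sensory/motor limitation")
    if c.mmse < 19:
        reasons.append("general cognitive level")
    if c.street_test < 2.25:
        reasons.append("perception")
    if c.token_test < 26.5:
        reasons.append("language comprehension")
    if c.stai > 40:
        reasons.append("anxiety")
    if c.bdi > 16:
        reasons.append("depression")
    return reasons

# Made-up values, for illustration only.
eligible = Candidate(False, 28, 6.0, 33.0, 35, 8)
excluded = Candidate(False, 28, 6.0, 33.0, 45, 20)
print(exclusion_reasons(eligible))
print(exclusion_reasons(excluded))
```

Returning the list of reasons, rather than a single yes/no flag, keeps a record of why a candidate was screened out.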

Ethics Statement
The study was approved by the scientific review board of the "U.O. di Psichiatria dell'Azienda Universitaria Ospedaliera Policlinico 'Paolo Giaccone' di Palermo", in accordance with the Declaration of Helsinki. All participants gave written informed consent to the experimental procedure according to the rules of the scientific review board. All participant data were stored in encrypted and password-protected files, following the criteria to protect personal health information [39].

Protocol
Participants were selected from the outpatient Unit of Psychiatry of Palermo University Hospital. The subjects who met the experimental criteria were contacted, and a meeting was scheduled at the University Hospital. The experimental session was held by a specialized psychiatrist of the University of Palermo. At the beginning of the session, the examiner explained the general goals of the clinical protocol and the procedures to be used, and discussed the patient's doubts and concerns. The experimental session comprised two parts: the classical neuropsychological assessment and the VR-based assessment. The presentation order was counterbalanced; half of the patients started the assessment with the VR test, and the other half started with the classic neuropsychological battery. Before the VR-based assessment, patients were trained in the use of a joypad within a virtual environment.

Neuropsychological Battery
To understand the cognitive profile of the participants, a complete neuropsychological battery was administered. The Mini Mental State Examination (MMSE) [40] was administered to assess the general cognitive level. Verbal memory was assessed with the Digit Span (Digit S) Test [41] for short-term memory and with the Short Story Recall Test and the Paired-Associate Learning Test (PALT) [42] for long-term memory; the Corsi Span (Corsi S) and the Corsi Block Task (Corsi BT) [41] were used for the assessment of short- and long-term spatial memory. For analysis of the executive domain, several tests were used: the Frontal Assessment Battery (FAB) [43], a general battery to assess frontal lobe functions; the Trail Making Test (TMT, forms A, B, and B-A) [44] for the assessment of selective attention; and the Tower of London (TOL) Test [45] for the assessment of planning abilities. A Phonemic Fluency (PF) Test and a Semantic Fluency (SF) Test [46] were also used. All test scores were corrected for age, education level, and gender where appropriate.

VMET
The assessment protocol was created with NeuroVR (Version 2.0, Istituto Auxologico Italiano, Milan, Italy), free software in which the user can modify a pre-existing virtual environment by selecting contents from a database of objects (both two- and three-dimensional (2D and 3D)) and videos [47], expanded with NeuroVirtual 3D [48]. The scene was visualized in the player using non-immersive displays. The task took place in a virtual supermarket shown on a laptop screen, and the patient had to use a joypad to move around the environment. All users were trained for virtual reality use in another smaller shop, specifically designed for training purposes. In the virtual supermarket, all products were organized in categories such as beverages, fruits and vegetables, breakfast foods, hygiene products, frozen foods, garden products, and animal products.

VMET Scoring
Before starting the task, the participants received a shopping list, a sheet with the rules, a map of the supermarket, information about the supermarket (opening and closing times, products on sale, etc.), a pen, and a wristwatch. The examiner read and explained all the information relative to the subject in order to guarantee complete understanding. The VMET test was composed of four main tasks. The first involved purchasing six items (e.g., one product on sale). The second involved asking the examiner information about one item to be purchased. The third involved writing the shopping list 5 min after beginning the test. The fourth involved responding to some questions at the end of the virtual session by using the given materials (e.g., the closing time of the virtual supermarket). The rules that the patients had to follow to complete the task were as follows: (1) they had to execute all the proposed tasks; (2) they could execute all tasks in any order; (3) they could not go to a place unless it was a part of a task; (4) they could not pass through the same passage more than once; (5) they could not buy more than two items per category (looking at the chart); (6) they had to take as little time as possible to complete the exercise; (7) they could not talk to the researcher unless this was a part of the task; (8) they had to go to their "shopping cart" 5 min after the beginning of the task and make a list of all their products. After the explanation of the material, the clinician measured the time, stopping it when the participant said they finished the task. During the assessment, the examiner recorded all the participant's behaviors in the virtual environment according to a predefined form. To better understand the patient's work, the following items were recorded [49]: task failures (total and partial), inefficiencies, strategies, rule breaks, and interpretation failures. 
When a subtask was not totally completed, a task failure occurred, and the scoring range for total errors was from 11 (all 11 subtasks were correctly done) to 33 (all 11 subtasks were incorrectly done). To calculate the scoring for each task, the scale ranged from 1-3 (1 = the task was performed correctly; 2 = the participant performed part of the task; 3 = the participant totally omitted the task). An inefficiency was deemed a behavior that could prevent the correct execution of the tasks, such as not grouping similar tasks when possible. The general scoring range was from eight (several inefficiencies) to 32 (no inefficiencies), and the scoring scale for each inefficiency was from 1-4 (1 = always; 2 = more than once; 3 = once; 4 = never). To analyze the strategies, 13 behaviors that facilitated carrying out the tasks were evaluated, such as accurate planning before starting a specific subtask. The scoring scale for each strategy was from 1-4 (1 = always; 2 = more than once; 3 = once; 4 = never), and the total score ranged from 13 (good strategies) to 52 (no strategies). A rule break occurred when patients violated one or more of the eight rules listed (e.g., talking with the examiner when not necessary). The scoring scale for each rule break was from 1-4 (1 = always; 2 = more than once; 3 = once; 4 = never) and the total score ranged from eight (a large number of rule breaks) to 32 (no rule breaks). Finally, an interpretation failure occurred when the requirements of particular tasks were misunderstood, for example, when a participant thought that the subtasks all had to be done in the order presented on the information sheet. The general score ranged from three (a large number of interpretation failures) to six (no interpretation failures), and the score for each interpretation failure ranged from 1-2 (1 = yes; 2 = no). 
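The scoring scheme described above can be summarized as a small lookup table from which each total range follows as (number of items) × (item scale). The sketch below is illustrative only: the dictionary layout and function names are hypothetical, and the item counts for inefficiencies (8) and interpretation failures (3) are inferred from the stated total ranges rather than given explicitly in the text.

```python
# Category -> (number of items, lowest item score, highest item score).
# Counts for "inefficiencies" and "interpretation_failures" are inferred
# from the total ranges reported in the text (8-32 and 3-6).
VMET_SCALES = {
    "task_failures": (11, 1, 3),
    "inefficiencies": (8, 1, 4),
    "strategies": (13, 1, 4),
    "rule_breaks": (8, 1, 4),
    "interpretation_failures": (3, 1, 2),
}

def total_range(category):
    """Minimum and maximum total score for a category."""
    n_items, low, high = VMET_SCALES[category]
    return n_items * low, n_items * high

def total_score(category, item_scores):
    """Sum the item scores after checking their count and their scale."""
    n_items, low, high = VMET_SCALES[category]
    if len(item_scores) != n_items:
        raise ValueError(f"{category}: expected {n_items} item scores")
    if any(not low <= s <= high for s in item_scores):
        raise ValueError(f"{category}: item scores must lie in {low}-{high}")
    return sum(item_scores)

for name in VMET_SCALES:
    print(name, total_range(name))

# Hypothetical participant: 9 subtasks correct, 1 partial, 1 omitted.
example_total = total_score("task_failures", [1] * 9 + [2, 3])
print(example_total)
```

Note that the direction of the scale differs by category: for task failures a higher total means worse performance, whereas for inefficiencies and rule breaks a higher total means fewer problems.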
Furthermore, for every subtask, we analyzed the following variables: (1) sustained attention; (2) maintaining the correct sequence of the task; (3) remembering the instructions; (4) divided attention; (5) correct organization of the materials; (6) self-corrections; (7) absence of perseverations. The general score ranged from seven (no errors) to 14 (a large number of errors), and the score for each variable ranged from 1-2 (1 = yes; 2 = no). According to the analysis proposed by Cipresso and colleagues [35], we analyzed three subtasks that they recognized as particularly crucial in the OCD patients' performance: (1) "going to the shopping cart after 5 min"; (2) "buying two products instead of just one"; (3) "going into a specific place and asking the examiner what to buy". These tasks represented a break during the normal task execution because they required a different, confusing, or stopping behavior, which demanded attention and the elaboration of different information at the same time. These tasks represented a "break in time", a "break in choice", and a "break in social rules", respectively [50]. Comparisons between patients and controls were done by using a series of independent sample t-tests.
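As a reminder of what an independent sample t-test computes, the following minimal sketch calculates Student's t statistic with pooled variance using only the Python standard library. The text does not state whether equal variances were assumed, so the pooled form is an assumption here, and the scores are made up for illustration; they are not data from the study.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(a, b):
    """Student's t statistic (pooled variance); returns (t, degrees of freedom)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled estimate of the common variance (assumes equal variances).
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df

# Made-up scores, for illustration only.
patients = [14, 16, 15, 17, 18]
controls = [12, 13, 11, 14, 12]
t, df = independent_t(patients, controls)
print(f"t({df}) = {t:.2f}")
```

The p-value would then be read from the t distribution with `df` degrees of freedom, which requires a statistics package beyond the standard library.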

Data Analysis
To classify data, we used the following approaches [51,52]:
• Logistic regression classification algorithm with ridge regularization;
• Random forest classification using an ensemble of decision trees;
• Support vector machine (SVM), to map inputs to higher-dimensional feature spaces that best separated different classes.
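To make the first of these approaches concrete, the sketch below implements ridge-regularized (L2) logistic regression by batch gradient descent and evaluates it with k-fold cross-validation, using only the Python standard library. It is an illustration under stated assumptions, not the software used in the study: the function names, the hyperparameters (`lam`, `lr`, `epochs`), the choice of five folds, and the synthetic data are all hypothetical. Random forest and SVM are not reproduced here.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_ridge_logistic(X, y, lam=0.1, lr=0.1, epochs=500):
    """Minimize mean log-loss + (lam / 2) * ||w||^2 by batch gradient descent.

    The bias term is not penalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gw / n + lam * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(model, x):
    w, b = model
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

def cross_val_accuracy(X, y, k=5, seed=0):
    """Shuffle once, split into k folds, and average the held-out accuracy."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_ridge_logistic([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in fold)
        scores.append(hits / len(fold))
    return sum(scores) / k

# Synthetic two-feature data for two groups of 29; not data from the study.
rng = random.Random(42)
X = [[rng.gauss(-1.5, 1), rng.gauss(-1.5, 1)] for _ in range(29)] + \
    [[rng.gauss(1.5, 1), rng.gauss(1.5, 1)] for _ in range(29)]
y = [0] * 29 + [1] * 29
accuracy = cross_val_accuracy(X, y)
print(f"mean held-out accuracy: {accuracy:.2f}")
```

Each fold is held out exactly once, so every participant contributes to the accuracy estimate without ever being scored by a model trained on their own data.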
Specifications about the algorithms used for computational data analyses can be found in the seminal article recently published by Zhou and colleagues (https://bmcmedinformdecismak.biomedcentral.com/track/pdf/10.1186/s12911-019-0890-0). Table 3 shows the results of OCD patients compared with normative data. The results showed intact cognitive levels in these patients.

Results
Tables 4 and 5 report the sample descriptive statistics for the neuropsychological battery and the VMET scores, respectively. Table 6 reports the independent sample t-tests comparing OCD patients with controls, for both the executive function domain and the other cognitive domains, and Table 7 reports the independent sample t-tests for the VMET scoring. Both the neuropsychological battery and the VMET scores were able to differentiate patients from healthy controls; however, the mean scores of the neuropsychological battery for both patients and healthy controls were situated in the normal range (Table 3). For this reason, it is important to define classification models able to identify mutual information among the variables, so that predictions can be made from a limited number of tests in a clinical setting. To pursue this aim, we ran a cross-validation with three different learning algorithms: logistic regression, random forest, and support vector machine. Two different models were built: one based on classical neuropsychological tests for executive functions (FAB, TMTA, TMTB, TMTBA, TOL, PF, and SF) and the other one by also adding the previously defined VMET scores. Results of the two cross-validations can be seen in Table 8. Finally, a classification tree for both models was built based on feature selection, choosing entropy as a measure of homogeneity [56][57][58] for split selection (Figures 1 and 2). In the figures, small circles indicate the ratio of classifications reported inside the rectangle, in terms of percentage of correctness in recognizing the specific characteristics. The colors indicate classification as one of the two groups: blue for OCD patients, red for control participants.
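Entropy-based split selection, as used for the classification trees, can be illustrated for a single variable: each candidate threshold is scored by the reduction in entropy (information gain) it produces, and the best one is kept. The sketch below is a minimal, hypothetical example of this idea for one node of a tree; the function names and the scores are made up for illustration and are not taken from the study.

```python
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    result = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        result -= p * math.log2(p)
    return result

def best_threshold(values, labels):
    """Return (threshold, information gain) of the best split `value <= threshold`."""
    parent = entropy(labels)
    best = (None, 0.0)
    cuts = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(cuts, cuts[1:]):
        thr = (low + high) / 2
        left = [l for v, l in zip(values, labels) if v <= thr]
        right = [l for v, l in zip(values, labels) if v > thr]
        children = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = parent - children
        if gain > best[1]:
            best = (thr, gain)
    return best

# Made-up scores on a single test, for illustration only.
scores = [10, 12, 13, 15, 18, 20, 22, 25]
groups = ["OCD", "OCD", "OCD", "OCD", "control", "OCD", "control", "control"]
threshold, gain = best_threshold(scores, groups)
print(threshold, round(gain, 3))
```

A full tree repeats this search over all variables at each node and then recurses on the two resulting subsets until a stopping rule is met.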

Discussion
The general aim of this study was twofold. Firstly, we investigated executive functions in OCD patients and controls. To this purpose, we used the virtual version of the multiple errands test and a classic neuropsychological battery. Secondly, our purpose was to find a better method to discriminate the two groups, using the minimal number of variables possible to pursue an ecological assessment of executive functions with these patients. VMET was demonstrated to be effective in the assessment of several patient populations, such as patients with OCD [35], Parkinson's disease [28], and stroke [49]; however, previous studies focused on single scores, such as time, total and partial errors, inefficiencies, and others. In this study, we exploited the multivariate nature of the VMET for the assessment of a particular patient sample (i.e., OCD), where dysfunction is slightly higher than or equal to that of a normative sample. Thus, this approach sheds new light on using multidimensional scaling for understanding the deficits of executive functions.
The results showed a clear difference between OCD patients and the control group, particularly in executive functions, as highlighted in Table 4 for the classic neuropsychological tests and Table 5 for most of the VMET scores. However, the complexity of administering a complete neuropsychological battery together with the complete VMET hinders the assessment of these patients. With this limitation in mind, we used computational techniques, which are also used in VR settings [59], to advance our knowledge based on consistent and relevant data from the sample presented. The first strategy was to understand the classification ability within the sample with a supervised machine learning approach. The results showed precision levels between 71.4% and 84.6%, making us confident of the goodness of fit of the model to the data. This result is important since, on the one hand, it allowed discriminating OCD patients from healthy subjects (see Table 6, where results are all related to such a domain). On the other hand, at the clinical level, we need to reduce complexity and, consequently, the number of tests used. To this purpose, we opted for a visual classification tree to provide clear, data-based indications for heuristic choice (Figures 1 and 2). The main and most encouraging result was that, by using both the classic neuropsychological battery and the VMET scores, we obtained a tree based on only five variables (Figure 2), two of which are VMET-based and, not surprisingly, particularly related to OCD patients (errors and divided attention).
Even if the tree based on the classic neuropsychological battery (Figure 1) could provide a useful tool for patient evaluation, it does not represent a simple tool to be used for the assessment of executive functions as a whole. On the other hand, the tree based on semantic and phonemic fluency, as well as the TOL and the two VMET scores (Figure 2), can be very useful as a shortened model for the assessment of executive functions in OCD patients.
During the assessment process of OCD, the clinician has many variables to take into account, and deficits of executive function could be one of them. A method which provides clinicians with a simple and complete indication of the best assessment process may have great impact on the clinical process and on rehabilitation.
In clinical neuropsychological assessment, there is always the necessity of integrating different types of information, such as psychometric and ecological data [60,61], in order to better understand the patient's cognitive functioning. VMET is able to play a crucial role in integrating classical neuropsychological tests with ecological settings, especially for executive function assessment in different types of patients.
Making a guided diagnosis for a specific cognitive domain in a specific target group of patients is an important goal for both the assessment and the rehabilitation process. On the one hand, it could reduce the time and effort expended by patients and clinicians. On the other hand, a virtual rehabilitation program developed from a targeted assessment would potentially be more personalized and efficient.
This study also had some limitations that could be overcome in future studies. Firstly, the sample size was limited; in a future study, an accurate sample size calculation could be done. Also, adding a clinical control group could be interesting in order to understand the potential of our algorithm for differential diagnosis.
According to our algorithm, the important tests for discrimination between OCD patients and controls are fluency, both semantic and phonemic, the Tower of London, and the VMET. This setting would require a minimum amount of time and reduced effort for the clinician during the assessment procedure.
Author Contributions: E.P. wrote the first version of the article. F.L.P. and C.L.C. recruited controls and patients and executed the clinical tests and the experimental protocol. P.C. and E.P. carried out the statistical and computational analyses and wrote the results. G.R. and D.L.B. supervised the scientific and clinical aspects of the study. All authors revised and approved the final version of the manuscript.
Funding: This research received no external funding.

Conflicts of Interest: The authors declare no conflicts of interest.