Assessment of Executive Function in Patients with Traumatic Brain Injury with the Wisconsin Card-Sorting Test

This review aimed to provide a brief yet comprehensive summary of recent research on the use of the Wisconsin Card-Sorting Test (WCST) to assess executive function in patients with traumatic brain injury (TBI). A bibliographical search, performed in PubMed, Web of Science, Scopus, Cochrane Library, and PsycInfo, targeted publications from 2010 to 2020, in English or Spanish. Information regarding the studies’ designs, sample features and use of the WCST scores was recorded. An initial search eliciting 387 citations was reduced to 47 relevant papers. The highest proportion of publications came from the United States of America (34.0%), and most studies included adult patients (95.7%). Observational designs were the most frequent (85.1%), mostly cross-sectional or case series studies. The average time after the occurrence of the TBI ranged from 4 to 62 years in single case studies, and from 6 weeks up to 23.5 years in the studies with more than one patient. Four studies compared groups of patients with TBI according to severity (mild, moderate and/or severe), two of which also compared TBI patients with healthy controls. There were seven randomized control trials. The noncomputerized WCST version including 128 cards was the most frequently used (78.7%). Characterization of the clinical profile of participants was the most frequent purpose (34.0%). The WCST is a common measure of executive function in patients with TBI. Although shorter and/or computerized versions are available, the original WCST with 128 cards is still used most often. The WCST is a useful tool for research and clinical purposes, yet a common practice is to report only one or a few of the possible scores, which prevents further valid comparisons across studies. The results might help clinical and research professionals to plan assessments and to interpret WCST scores properly.


Introduction
Traumatic brain injury (TBI) is an alteration in normal brain function or any other evidence of brain pathology caused by an impact from external mechanical forces, such as rapid acceleration or deceleration, a bump or jolt to the head or penetration by a projectile. About sixty-nine million people experience TBI from all causes each year, with the Southeast Asian and Western Pacific regions experiencing the greatest overall burden of disease [1]. TBI is classified as mild, moderate or severe, depending on the severity and duration of loss of consciousness, post-traumatic amnesia and neuro-radiological evidence of cerebral damage. This classification system is highly reliable for first diagnosis; however, its prognostic value for long-term neuropsychological outcome is still limited, as it rarely takes into account premorbid factors, underlying structural damage and the impact of non-neurological factors [2].

Methods
The author performed a bibliographical search in the PubMed, Web of Science, Scopus, Cochrane Library and PsycInfo databases. The terms "traumatic brain injury" and "TBI" were entered along with "Wisconsin card-sorting test" and "WCST". Inclusion criteria were: (1) research papers, (2) published in peer-reviewed journals, (3) published during the last decade (2010 to May 2020) and (4) available in English or Spanish. Exclusion criteria were: (1) not original research (e.g., letters, dissertations, reviews and/or meta-analyses) and (2) content not related to the objective of the study (i.e., study design not including patients with TBI and/or not using the WCST). The author accessed the online resources on 20 May 2020. After applying the inclusion and exclusion criteria, a final list of references was generated and the full content of the manuscripts was consulted to verify their relevance to the objectives of the review. Through complete and thorough readings of the manuscripts, the following data were collected: year of publication, research team setting, participant age stage (pediatric or adult) and study design. Regarding participants, the sample size, age, gender, severity of TBI and time since TBI were recorded. Concerning the WCST, the version used and reported scores were registered, along with its use in the study.

Results
An initial search produced 387 results from the five sources, which was reduced to 186 by filtering duplicated results. After applying the exclusion criteria and reviewing abstracts and/or manuscripts, the list was reduced to 47 relevant publications (Figure 1).
Table 1 summarizes the basic features of the publications. The highest proportion of publications came from the United States of America (34.0%), followed by Italy (14.9%) and other countries worldwide (51.1%). Regardless of the language spoken in the setting of research, all publications provided an abstract written in English, yet two manuscripts were in Spanish. The highest proportion of publications (n = 20) came from settings whose official language is English (United States of America, Canada and Australia), and the remainder (n = 27) came from countries with diverse languages, such as Italian, Spanish, Portuguese, French, Hebrew, Japanese, Korean, Malay, Serbian and Thai. Most studies included adult samples (95.7%), and only two (4.3%) had pediatric samples.
Table 2 summarizes information about the design of the studies with the corresponding references. Observational (descriptive) designs were the most frequent (85.1%), particularly cross-sectional and case series studies. There were seven randomized control trials.
The original WCST version, comprising 128 cards and applied by an interviewer, was the most frequently used version (78.7%). Only three studies (5.4%) made use of computerized versions. Regarding WCST scores, perseverative errors (46.8%) followed by categories completed (42.6%) and perseverative responses (31.9%) stood out as the most often reported.
In the selected studies, the use of the WCST was diverse. Characterization of the clinical profile of participants was the most frequent purpose (34.0%), in accordance with the high percentage of observational studies.
In some cases (38.3%), WCST scores were used to detect differences in the executive function performance of TBI patients in comparison to clinical groups or healthy controls, or between groups of TBI patients defined by a selected clinical criterion (e.g., self-awareness, history of suicide attempt, anosmia).
Regarding the characteristics of the TBI patients participating in the studies, samples included mostly (57.4%) or only (6.4%) men in some cases, and mostly (8.5%) or only (10.6%) women in others. Sex was not reported in eight (17.0%) manuscripts. Mean age ranged from 14 to 15 years in pediatric samples and from 20 to 77 years in adult samples. The average time after the occurrence of the TBI, as reported in 34 manuscripts, ranged from 4 to 62 years in single case studies, and from 6 weeks up to 23.5 years in the studies with more than one patient. Although not stated, the time intervals suggest that all patients had passed the acute phase. Severity of the TBI in participants was reported in 37 manuscripts. The largest share of studies (23.4%) included patients with mild to severe TBI, followed by samples with moderate to severe TBI (12.8%) and mild to moderate TBI (10.6%). Some studies included exclusively patients with mild (14.9%), moderate (2.1%) or severe (14.9%) TBI.
A total of 32 studies exclusively included patients with TBI: six single cases, 11 analyzing all participants as a single group, eight grouping participants according to some clinical feature (e.g., severity of TBI, disability, history of suicide attempt, anosmia) and seven randomized control trials comparing groups by treatment (e.g., cognitive rehabilitation, growth hormone replacement therapy, sertraline medication, neuro-feedback training). On the other hand, 15 studies included not only TBI patients but also healthy controls (n = 12), clinical controls (n = 2) or both a healthy control group and a clinical control group (n = 1). Only four studies compared the performance of patients according to severity (mild, moderate and/or severe); two of them also included a healthy control group.
Most studies (72.3%) used the WCST scores as a descriptive feature of participants, with or without comparisons between groups. Table 3 presents the results and references.
Additionally, in an effort to offer an overview of the performances across studies, the author selected the studies using the 128-card version, the most often used, and reporting perseverative errors and/or completed categories scores, which were the most often available. Table 4 summarizes the results. Compared to healthy controls, TBI patients produced more perseverative errors, as did those with a severe TBI in comparison to those with a moderate or mild TBI. Given the differences across samples and the type of reported scores, further analyses were not possible. Regarding the number of completed categories (possible range: 0 to 6), participants achieved the maximum score in a couple of studies, while most scores ranged from 4 to 5. In addition, some scores from patients with TBI were not far from those of healthy controls.
Only 12 publications [12,22,25,30,38,41-43,46,47,50,51] reported including effort measures to account for performance/symptom validity. The preferred measures were the Trail Making Test (e.g., total completion time in seconds), the Computerized Assessment Response Bias and the California Verbal Learning Test (e.g., trials 1-5 learning score, long-delay free-recall score).
The author and an invited rater independently assessed the quality of the seven randomized control trials with the corresponding Critical Appraisal Skills Programme (CASP) checklist [57] (Table A1 in Appendix A). Interrater reliability, estimated with the intraclass correlation coefficient (two-way mixed model, absolute agreement), was 0.70 (95% CI = 0.28-0.67).
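As an illustration of the agreement statistic named above, the following Python sketch computes a single-measure, absolute-agreement intraclass correlation coefficient from a two-way layout of ratings. It is a minimal sketch under stated assumptions: the text does not say whether single or average measures were used, the function name `icc_absolute_agreement` is hypothetical, and the example ratings are invented for demonstration and are not the review's data.

```python
def icc_absolute_agreement(ratings):
    """Single-measure, absolute-agreement ICC from a two-way layout.

    ratings: list of rows, one row per rated target (e.g., a checklist
    item for one trial), each row holding one numeric score per rater.
    Assumes at least two targets, two raters and some score variation.
    """
    n = len(ratings)       # number of rated targets
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for targets (rows), raters (columns) and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Absolute agreement: systematic rater differences lower the value.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + (k / n) * (ms_cols - ms_error)
    )


# Invented scores from two raters on five targets (illustration only).
example_ratings = [[1, 1], [0, 1], [1, 1], [0, 0], [1, 0]]
example_icc = icc_absolute_agreement(example_ratings)
```

Because the coefficient is of the absolute-agreement type, a rater who is consistently one point higher than the other lowers the value even when both rank the targets identically.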

Discussion
Here, the author presents a review providing a brief and comprehensive summary of recent research using the WCST to assess executive function in patients with TBI. The United States of America has produced most of the studies reviewed, yet they represent research from 17 different countries, from all continents. Worldwide, TBI is recognized as a health priority, given its related global burden (estimated in 2016 as 8.1 million years of life lived with disability; YLDs) and an expected increase in its incidence in view of the exponential population growth, population ageing and more frequent use of vehicles [58]. The WCST is not language-based; instructions are simple and oral interaction between the interviewer and the interviewee can be minimal. This is an important feature when applying the WCST across settings with different languages, while always keeping in mind the need for standardized scores.
Reports from pediatric samples are scarce. This might be due not only to the limited age range compared with adults, but also to the fact that adults are more likely to be exposed to environmental conditions (e.g., traffic collisions, combat injuries) that may lead to a TBI. Moreover, it must be kept in mind that younger children (i.e., infants and toddlers) are also at very high risk of TBI; however, the WCST is not meant for this population, and other available measures should be considered (e.g., the Behavior Rating Inventory of Executive Function-Preschool Version (BRIEF-P) and the Developmental NEuroPSYchological Assessment (NEPSY)). The WCST has been standardized for people aged from six and a half to 89 years. Demographically corrected normative data provide score profiles according to age: 13 standardized score tables for children/adolescents (6.5 to 20 years old) and 60 standardized score tables for adults (21 to 89 years old), which also consider the number of years of education. This does not mean that the WCST is rarely applied to the pediatric population, as it is often used for other clinical conditions that are more common than TBI in children, such as attention deficit hyperactivity disorder (ADHD) and autism [59].
The original and most widely used version of the WCST includes 128 cards to match. However, a shortened version presenting only the first deck of 64 cards is available, particularly for assessment situations with time restrictions or when the patient's attention span is compromised, as might be the case with elderly patients [60]. There is also the Modified WCST (M-WCST), a shorter, simpler and less ambiguous version that includes two sets of 24 cards. The M-WCST lacks response cards that share more than one feature with the stimulus cards, the interviewee's first response sets the category criteria, and it estimates perseverative errors differently [61]. More recently, computerized versions have become available, which have the advantages of a more efficient use of resources, improved reliability through uniform administration, and fewer errors in test presentation, response recording and scoring. This could be very useful for research purposes, particularly when recruiting large samples. Yet, scores from the computerized versions do not seem fully equivalent to those of manual versions, and new norms for computerized versions still need to be established [62].
As an instrument to assess executive function in TBI, the WCST scores have served different purposes according to the studies' designs. In most cases, scores were included as an index of the TBI participants' clinical status, in observational studies with a single case, case series or comparison between groups. Results from these studies helped to gather evidence of the sensitivity (potential to detect those with executive dysfunction) and specificity (potential to exclude those with no executive dysfunction) of the WCST, which are essential features of diagnostic instruments. Although using the WCST scores as an index of executive dysfunction might be practical for research purposes, relying on a single indicator when assessing an individual in clinical practice might lead to invalid conclusions. The WCST should be part of a larger and comprehensive neuropsychological battery, and its results should be interpreted along with information from different sources, such as interviews with the patient and relatives, observations and behavioral checklists.
Evidence has shown that the capacity of performance-based executive measures, such as the WCST, to predict a patient's ability to function adaptively in daily life after TBI is variable. Patients with compromised executive function can perform well on tests of executive functioning, but demonstrate real-world behavioral disturbances with a reduction of autonomy. Thus, clinicians and researchers must consider the ecological validity of the selected measures and complement them with behavioral data from other sources. Kibby and colleagues [63], in testing the ecological validity of the WCST, found that perseverative responses did not significantly correlate with level of job performance, yet they did predict occupational status (from manual labor to higher-level positions). For their part, Pezzuti and colleagues [64] constructed and validated an ecological version of the WCST aimed at the elderly, which was found to be more discriminating and to have more advantages than the traditional versions. The ecological value of the WCST for discriminating functional outcomes, rather than just evidencing the severity of the injury or the presence of a condition, comes forward as a topic worthy of further research.
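The sensitivity and specificity mentioned above can be made concrete with a short calculation. The following Python sketch is illustrative only: the function name `sensitivity_specificity` is hypothetical, and the counts in the example are invented and do not come from any of the reviewed studies. It assumes that each examinee has already been classified against an external reference for executive dysfunction.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from classification counts.

    tp: examinees with dysfunction whom the test flags as impaired
    fn: examinees with dysfunction whom the test misses
    tn: examinees without dysfunction whom the test clears
    fp: examinees without dysfunction whom the test flags as impaired
    """
    sensitivity = tp / (tp + fn)  # share of true cases detected
    specificity = tn / (tn + fp)  # share of non-cases correctly excluded
    return sensitivity, specificity


# Invented counts for illustration: 10 examinees with and 10 without
# executive dysfunction according to the reference classification.
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=9, fp=1)
```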
Randomized control trials used the WCST scores as an outcome variable when testing the effectiveness of a medical treatment (e.g., sertraline [54], growth hormone therapy [52]) or a rehabilitation program (e.g., CogSmart [50,51]; STEP [53]; neuro-feedback [55], vocational problem-solving [56]). The scores obtained before group allocation and treatment, as in the case of descriptive studies, might be useful to evidence the WCST's sensitivity and specificity.
Once the person completes the WCST, all scores can be calculated, yet the studies only reported one or a few of them. Perseverative errors were the scores most often reported. A perseverative response occurs when the interviewee matches a card using the same criterion (color, form or number) used in the immediately preceding match, regardless of whether the response is correct. Thus, perseverative errors refer to the number of perseverative responses that were not correct; i.e., the response did not match the criterion valid at that point. Perseverative errors index an incapacity to inhibit a learned response despite knowing from feedback that the response is incorrect. As a measure of executive function, one would expect that the greater the dysfunction, the higher the score in perseverative errors. The reported results seem to agree with this, yet the diversity in samples prevents valid comparisons to decide whether there is a significant association between TBI severity and executive dysfunction. Another often-reported score is the number of categories completed; that is, the number (0 to 6) of categories with a sequence of 10 consecutive correct matches to the criterion valid at that point. Once the subject completes a category, the sorting criterion changes. The test ends when the person completes all six categories or after all the stimulus cards have been sorted. The ability to complete categories represents the person's capacity to promptly identify the sorting criterion and persevere in providing the correct responses. In some cases, the studies reported that patients with TBI achieved the highest score, while in other cases their scores were not significantly low, and even not so different from those of the controls. This may suggest that patients with TBI have a good ability to figure out the sorting criterion, properly adapting their responses from feedback, despite having difficulties in promptly shifting the criterion, as higher scores in perseverative errors suggest.
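The two scores described above can be sketched in code. The following Python example is a minimal sketch that follows only the simplified definitions given in this review, not the full scoring rules of the test manual. The function name `score_wcst`, the input format (one unambiguous sorting dimension per card) and the default order of criteria are assumptions made for illustration; real responses can match a stimulus card on more than one dimension, which this sketch does not handle.

```python
def score_wcst(responses, criteria=("color", "form", "number") * 2,
               run_length=10, max_cards=128):
    """Score a list of responses under the review's simplified definitions.

    responses: the dimension used for each match, one string per card
               ("color", "form" or "number"); hypothetical input format.
    criteria:  sequence of sorting criteria, one per category (assumed order).
    """
    categories = 0                 # categories completed so far (0 to 6)
    streak = 0                     # consecutive correct matches
    perseverative_responses = 0
    perseverative_errors = 0
    previous = None                # dimension used on the preceding card
    cards_used = 0

    for matched in responses[:max_cards]:
        if categories == len(criteria):
            break                  # all categories completed: test ends
        correct = matched == criteria[categories]
        cards_used += 1

        # Same dimension as the immediately preceding match.
        if matched == previous:
            perseverative_responses += 1
            if not correct:
                perseverative_errors += 1

        streak = streak + 1 if correct else 0
        if streak == run_length:   # category completed, criterion changes
            categories += 1
            streak = 0
        previous = matched

    return {
        "categories_completed": categories,
        "perseverative_responses": perseverative_responses,
        "perseverative_errors": perseverative_errors,
        "cards_used": cards_used,
    }


# Invented response sequence: the examinee completes the first category,
# keeps sorting by color for three cards after the criterion changes,
# then shifts to form and completes a second category.
demo = ["color"] * 10 + ["color"] * 3 + ["form"] * 10
demo_scores = score_wcst(demo)
```

In this invented sequence, the three cards sorted by color after the criterion has changed to form are the ones counted as perseverative errors, which mirrors the difficulty in shifting the criterion discussed above.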
Raw scores from the interview can be transformed to T scores and percentile scores, which would be very useful for research and clinical purposes, respectively. It would be ideal if all WCST scores were available from the manuscripts, so that further analyses with data across studies could be made, providing a valid insight into the executive function profile of patients with TBI.
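To illustrate the kind of transformation mentioned above, the following Python sketch converts a raw score to a T score (mean 50, standard deviation 10) and to an approximate percentile. It is a generic sketch under stated assumptions, not the procedure of the test manual: the published norms are tabulated by age and education, so the normative mean and standard deviation used here are placeholders, the function names are hypothetical, and the percentile relies on a normal-curve approximation.

```python
from statistics import NormalDist


def to_t_score(raw, norm_mean, norm_sd):
    """Linear T score: 50 plus 10 times the z score of the raw value.

    norm_mean and norm_sd stand in for the values of the appropriate
    normative group (placeholders, not real WCST norms).
    """
    z = (raw - norm_mean) / norm_sd
    return 50 + 10 * z


def t_to_percentile(t_score):
    """Approximate percentile of a T score under a normal distribution."""
    return 100 * NormalDist(mu=50, sigma=10).cdf(t_score)


# Placeholder normative values, chosen only to show the arithmetic.
t_example = to_t_score(raw=30, norm_mean=20, norm_sd=10)
percentile_example = t_to_percentile(t_example)
```

The direction of interpretation depends on the score: for error counts a higher raw value indicates worse performance, so the sign convention of the normative tables in use should be checked before reading a T score as better or worse.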
A single author performed the bibliographical search, data collection and analyses, which may be a source of bias. In addition, due to language limitations, four manuscripts that might be relevant to the study's objective (two in Chinese, one in French and one in Turkish) could not be included.

Conclusions
This brief review of recent research on the use of the WCST to assess executive function in patients with TBI showed that interest in the topic is worldwide and has been mainly focused on adult populations. Although shorter and/or computerized versions are available, the original WCST with 128 cards is still the most often used. The WCST is a useful tool for research and clinical purposes, yet a common practice is to report only one or a few of the possible scores, which prevents further valid comparisons across studies.
Funding: The Hospital Regional de Alta Especialidad de la Península de Yucatán supported this work by covering the Article Processing Charges.

Acknowledgments:
The author thanks Azalia Avila-Nava for their collaboration as a rater in the assessment of the quality of randomized control trials.

Conflicts of Interest:
The author declares no conflict of interest.

Availability of Data and Material:
The dataset generated and/or analyzed during the current study is available from the corresponding author on reasonable request.

Note to Table A1: The original CASP checklist includes item 7 (How large was the treatment effect?) and item 8 (How precise was the estimate of the treatment effect?). These items were substituted by item * (Was the estimated treatment effect adequately reported?). ✓: Yes; x: No; ?: Can't tell.