1. Introduction
Progress in critical care has increased the number of patients who survive severe traumatic and non-traumatic brain injury (TBI and NTBI). This has in turn increased the number of patients with prolonged disorders of consciousness (DOC), lasting more than 4 weeks after the onset of the brain lesion, such as unresponsive wakefulness syndrome (UWS, also termed vegetative state, VS) and minimally conscious state (MCS) [
1,
2,
3]. Patients with UWS demonstrate sleep and wake cycles and may maintain cortical functions associated with awareness [
2], but are otherwise unresponsive to their external environment [
4]. MCS involves inconsistent but clear and reproducible signs of awareness [
3]. Patients in MCS show non-reflexive, purposeful behaviors, but are unable to communicate effectively [
3].
The care of patients with DOC requires the assessment of different aspects of the brain injury and of the clinical condition of these patients. Quantitative assessments can provide objective data that help caregivers understand patients’ abilities, track changes over time, and tailor interventions effectively. Any sign of communicative ability is highly significant, as it offers insights into the regaining of consciousness and the emergence of elementary cognitive and communicative potential, indicating the possibility of further rehabilitation [
5].
Various assessment tools have been suggested to evaluate different aspects of brain lesions. Some of them, like the Glasgow Coma Scale (GCS) and the Coma/Near-Coma (C/NC) scale, were designed for severe disabilities in acute care settings. The GCS examines patients in coma, focusing primarily on detecting conditions relevant to the acute phase, and has a prognostic value [
6,
7]. The C/NC was intended to evaluate patients in UWS or near-UWS, not emergence from DOC [
8,
9]. Other scales were intended for later stages following brain injury. These include the Wessex Head Injury Matrix (WHIM) and the Sensory Modality Assessment and Rehabilitation Technique (SMART). WHIM is a 62-item hierarchical scale for the assessment of basic behaviors, social interaction, communication, attention, and cognitive skills. It provides a total score but no guidelines for interpretation [
8,
10]. SMART provides guidelines to differentiate UWS from MCS and predicts recovery from UWS but requires about three weeks to complete [
8,
10,
11]. The Disability Rating Scale (DRS) is designed to evaluate a range of performances relevant to patients with brain injuries. It consists of eight items, assessing conditions ranging from arousal and awareness to functioning in society. Scores range from 0 (no disability) to 29 (severe vegetative state) but it is not used for differential diagnosis between disorders of consciousness [
12,
13]. Today, the most recommended and frequently used scale for the assessment of DOC behavioral characteristics, and for distinguishing MCS from UWS and from emergence from MCS, is the Coma Recovery Scale–Revised (CRS-R), with scores ranging between 0 and 23 [
1,
8,
14,
15].
Consciousness may also be detected indirectly through neuroimaging findings that have been shown to relate to it [
15,
16]. These include positron emission tomography (PET), electroencephalography (EEG), functional magnetic resonance imaging (fMRI), and functional near-infrared spectroscopy (fNIRS) [
14,
15,
16]. Neuroimaging, which may detect recovery of consciousness before it becomes clinically evident, is recommended when evidence of consciousness is uncertain, when factors interfering with neurobehavioral assessment are identified, or when signs of consciousness are absent at the bedside [
16,
17].
The care of patients who suffer from severe communicative disability and the monitoring of their progress require quantitative assessment of their communicative performance [
17]. The Lowenstein Communication Scale (LCS) was developed primarily for such an assessment [
18]. The LCS consists of 25 tasks (or items) equally divided between 5 subscales: mobility, respiration, visual responsiveness, auditory comprehension, and communication (verbal or alternative). Each task is scored 0–4 (0 = No response; 1 = One or very few signs of response; 2 = Inconsistent minimal responsiveness; 3 = Diminished response; 4 = Consistent response). The total score ranges between 0 and 100. The full LCS scoring sheet and instructions appear in Borer-Alafi et al. [
18]. The original LCS version is in Hebrew (
Supplementary Materials Table S1) and was translated into English using the translation and back-translation method.
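The scoring arithmetic described above can be illustrated with a minimal Python sketch. The function name `score_lcs`, the dictionary layout, and the example scores are hypothetical and are not part of the published scoring sheet; the sketch assumes only what is stated here, namely five subscales of five tasks each, with every task scored as an integer from 0 to 4.

```python
from typing import Dict, List

# Subscale names as listed in the text; the data layout is an assumption.
SUBSCALES = [
    "mobility",
    "respiration",
    "visual responsiveness",
    "auditory comprehension",
    "communication",
]
TASKS_PER_SUBSCALE = 5
MAX_TASK_SCORE = 4


def score_lcs(task_scores: Dict[str, List[int]]) -> Dict[str, int]:
    """Return each subscale total (0-20) and the overall total (0-100)."""
    result: Dict[str, int] = {}
    for name in SUBSCALES:
        scores = task_scores[name]
        if len(scores) != TASKS_PER_SUBSCALE:
            raise ValueError(f"{name}: expected {TASKS_PER_SUBSCALE} task scores")
        if not all(isinstance(s, int) and 0 <= s <= MAX_TASK_SCORE for s in scores):
            raise ValueError(f"{name}: each task score must be an integer 0-4")
        result[name] = sum(scores)
    result["total"] = sum(result[name] for name in SUBSCALES)
    return result


# Hypothetical patient, for illustration only.
example_scores = {
    "mobility": [2, 1, 0, 3, 2],
    "respiration": [4, 4, 3, 4, 4],
    "visual responsiveness": [1, 2, 0, 1, 0],
    "auditory comprehension": [1, 0, 0, 2, 0],
    "communication": [0, 0, 1, 0, 0],
}
example_result = score_lcs(example_scores)
```

With 5 tasks of at most 4 points in each of 5 subscales, the maximum is 20 points per subscale and 100 points in total, which matches the range given above.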
In a subgroup of 22 patients with UWS and MCS after TBI, Borer-Alafi et al. found that LCS has a good inter-rater agreement [
18]. They also showed that, among 42 such patients, the 27 who eventually showed sufficient cognitive and sub-language behaviors during their stay in intensive care for consciousness rehabilitation (ICCR) to be referred for further comprehensive rehabilitation had significantly higher LCS scores on the motor, visual, and auditory subscales, as well as higher total scores [
18]. Expert consensus indicated that the LCS has acceptable content validity [
8]. Members of the interdisciplinary team at the ICCR department reported that by evaluating oral and alternative communication using the scale, they were able to identify initial signs of recovery of communication and provide means for patients to improve communication at an early stage. Speech and language clinicians with no previous experience with LCS reported that the instructions, provided in a separate manual, were clear even for first-time users. LCS seems, therefore, to be a promising tool for the evaluation of communicative performance in ICCR patients, which may assist clinicians in formulating early, targeted treatment plans.
Some of the arguments in favor of LCS use were subjective, however, and analyses of the psychometric properties of LCS did not include NTBI patients or discrimination between patients with an initial diagnosis of MCS and UWS. In addition, the expert consensus that supported LCS content validity indicated that the evidence for LCS reliability and validity was insufficient [
8]. To provide the missing evidence, in this study, we further investigated LCS reliability and validity on a group of TBI and NTBI patients with DOC in ICCR, and separately on patients in MCS or UWS.
3. Results
3.1. Participants
Of the 174 patients who were admitted to the ICCR Department between 2019 and 2021, 56 were recruited for the study based on the inclusion and exclusion criteria. Of these, 15 were further excluded: one who regained consciousness before the first evaluation, five who were recruited by mistake (two who did not meet the age criterion and three who did not have MCS or UWS for at least 28 days), and nine whose LCS or corresponding CRS-R evaluations were missing. The patients who were excluded did not significantly differ from those who were included in age (median 56, range: 17–79 and median 52, range: 20–71, respectively,
p = 0.26), sex (80% and 68% men, respectively,
p = 0.31), and etiology (TBI for 60% of the excluded and 56% of the included,
p = 0.52).
Table 1 presents the demographic and clinical data of the 41 remaining patients who were included in the study. Of the included participants, 22 (54%) had only LCS evaluation 1, one (2%) had only LCS evaluation 2, and 18 (44%) had both LCS evaluations completed. Based on CRS-R scores, consciousness state was UWS in 23 (56%) of the participants and MCS in 18 (44%) at evaluation 1 (admission). At evaluation 2, it was UWS in 8 of the 19 assessed patients and MCS in 3, and 8 patients had regained full consciousness. Individual patient data are presented in
Supplementary Materials Table S2.
3.2. LCS Scores
At evaluation 1, the mean LCS score for the entire patient population was 23 (SD = 15, range 8–66.5). For patients with UWS, it was 14 (SD = 5, range 8–27), and for those with MCS, it was 35 (SD = 14, range 13–67). At evaluation 2, the mean score for the entire patient population was 38 (SD = 26, range 11–98). It was 16 (SD = 5, range 11–24) for patients with UWS, 29 (SD = 11, range 18–39) for patients with MCS, and 64 (SD = 18, range 43–98) for patients who regained consciousness. At both evaluations, significant differences in LCS scores were found between patients in different consciousness states (p < 0.001) but not between the two raters.
3.3. Inter-Rater Reliability
3.3.1. Agreement in Item Scores Between Raters
Table 2 and
Table 3 present the total agreement and kappa values for each LCS item, within each subscale, for all the patients and separately for the UWS and MCS patients. For the entire patient group, total agreement in LCS scores between raters A and B was found in 59–100% of the examined patients across tasks at evaluation 1, and in 72–100% at evaluation 2. Total agreement was found for 80% or more of the patients in 20 of the 25 tasks. The corresponding Cohen’s kappa values were significant for all tasks at both evaluations (
p < 0.001). Inter-rater agreement was good to excellent (kappa values > 0.6) for all tasks, except “responsiveness”, “response to noise”, “respiration for speech production”, and “blink reflex, rhythm/fluency or speed”, at evaluation 1, which showed fair-to-moderate agreement (kappa values = 0.4–0.6).
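For readers less familiar with the two agreement measures reported here, the following Python sketch shows how total (percent) agreement and Cohen's kappa can be computed for one task scored by two raters. It assumes unweighted kappa, which this section does not specify, and the rater scores are invented for illustration; they are not study data.

```python
import math
from collections import Counter
from typing import Sequence


def total_agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """Proportion of patients to whom both raters gave the same score."""
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = total_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Agreement expected by chance from each rater's marginal frequencies.
    categories = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if math.isclose(p_e, 1.0):
        return float("nan")  # undefined when both raters use a single category
    return (p_o - p_e) / (1 - p_e)


# Hypothetical 0-4 scores given by two raters to ten patients on one task.
rater_a = [0, 0, 1, 1, 2, 2, 3, 4, 4, 0]
rater_b = [0, 0, 1, 2, 2, 2, 3, 4, 3, 0]

agreement = total_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
```

Kappa corrects the observed agreement for the agreement expected by chance, so a task can show high total agreement and still have a modest kappa when most patients receive the same score from both raters.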
3.3.2. Correlation Between Raters in Subscale and Total Scores
The coefficients of the correlations between the scores of the paired raters for the five subscales and for the total LCS scores, at evaluations 1 and 2, were strong for the entire patient group, moderate to strong for the UWS group, and strong for most subscales of the MCS group (
p < 0.01,
Table 4 and
Table 5). The correlation was not significant only for the mobility subscale of the MCS group at evaluation 2.
3.3.3. Difference Between Raters in Subscale and Total Scores
In the subscales, we observed significant inter-rater differences for “auditory comprehension” at evaluation 1 in the entire patient group and the UWS group, and at evaluation 2 in the entire patient group and the MCS group (
p < 0.05). These differences resulted mainly from inter-rater differences in the “response to noise” task, which were small (0.42 points). There was also a significant difference between raters in the “mobility” subscale for patients with MCS (
p = 0.041), mainly because of differences in “responsiveness” (0.38 points) and “speech/swallow” tasks (0.31 points). There were no significant differences between raters in other subscales in the entire patient group or in the separate UWS and MCS groups (
Table 4 and
Table 5). In the total scores, at evaluation 1, we found a small but significant difference between raters A and B only in the entire patient group (1 point difference on a 100-point scale,
p = 0.042,
Table 4). At evaluation 2, the total scores did not differ significantly between raters (
Table 5).
3.3.4. Intra-Class Correlation in Subscale and Total Scores
ICC values were excellent for the total scores and the subscales, except that they were moderate for the “communication” subscale in patients with MCS and good for most of the subscales in patients with UWS (
Table 6).
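As an illustration of how an intra-class correlation of this kind is obtained, the sketch below computes ICC(2,1), the two-way random-effects, absolute-agreement, single-rater form, from a patients-by-raters table. The choice of this ICC form is an assumption, since the form used is not stated in this section, and the scores are hypothetical.

```python
from typing import List


def icc_2_1(ratings: List[List[float]]) -> float:
    """ICC(2,1) from a table with one row per patient and one column per rater."""
    n = len(ratings)      # number of patients
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical total scores from two raters for six patients.
hypothetical_totals = [
    [12.0, 13.0],
    [25.0, 24.0],
    [40.0, 42.0],
    [18.0, 18.0],
    [66.0, 64.0],
    [31.0, 33.0],
]
icc_example = icc_2_1(hypothetical_totals)
```

Because this form penalizes systematic offsets between raters as well as random disagreement, it is stricter than a simple correlation between the raters' scores.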
3.4. Internal Consistency
In the entire patient group, Cronbach’s alpha values of the total LCS scores were acceptable (
Table 7). Cronbach’s alpha decreased when the “mobility”, “visual responsiveness”, “auditory comprehension”, and “communication” subscales were eliminated. At the same time, the elimination of the “respiration” subscale increased Cronbach’s alpha values (
Table 7). In the subscales, Cronbach’s alpha values were below adequate only for “respiration.” Elimination of the majority of the tasks decreased the internal consistency of their corresponding subscales, but elimination of several tasks increased the internal consistency of their subscales (
Table 7). These tasks were “assisted respiration”, “spontaneous respiration”, “matching tasks”, and “comprehension of two-step commands” at evaluation 1, and “matching tasks” at evaluation 2.
Table 8 and
Table 9 display the internal consistency separately for the UWS and the MCS groups.
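The item-deletion analysis described above can be sketched as follows. The code computes Cronbach's alpha from the standard formula and then recomputes it with each item removed in turn. The function names and the data are hypothetical; the fourth item is constructed to vary little and to track the other items poorly, loosely mimicking a ceiling effect.

```python
from statistics import variance
from typing import List


def cronbach_alpha(items: List[List[float]]) -> float:
    """items[i][p] is the score of participant p on item i (at least 2 items)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def alpha_if_item_deleted(items: List[List[float]]) -> List[float]:
    """Alpha recomputed after removing each item in turn (needs 3+ items)."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]


# Hypothetical scores of five participants on four items.
hypothetical_items = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 3, 3, 5, 5],
    [4, 4, 3, 4, 4],  # near-constant item, weakly related to the others
]
alpha_all = cronbach_alpha(hypothetical_items)
alpha_deleted = alpha_if_item_deleted(hypothetical_items)
```

In this constructed example, removing the near-constant item raises alpha, whereas removing any of the first three items lowers it, which is the pattern used above to identify items that weaken internal consistency.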
3.5. Responsiveness
In 64 pairs of evaluations in which rater A assessed the 4 comparable subscales in 16 patients, LCS identified 45 score changes between evaluations 1 and 2, and CRS-R identified 38 changes. In 60 pairs of evaluations, in which rater B assessed the 4 subscales in 15 patients, LCS identified 46 score changes and CRS-R 35. The difference in responsiveness between the two scales did not reach statistical significance, however (
p > 0.1,
Table 10).
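The counts above can be converted into proportions of evaluation pairs in which each scale registered a change. The short Python sketch below does only this arithmetic, using the counts reported in this section; it does not reproduce the significance test, whose details are not given here, and the variable and function names are illustrative.

```python
from typing import Dict

# Counts reported in this section: evaluation pairs and detected score changes.
REPORTED_COUNTS = {
    "A": {"pairs": 64, "LCS": 45, "CRS-R": 38},
    "B": {"pairs": 60, "LCS": 46, "CRS-R": 35},
}


def change_rates(counts: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, float]]:
    """Proportion of evaluation pairs with a score change, per rater and scale."""
    return {
        rater: {scale: c[scale] / c["pairs"] for scale in ("LCS", "CRS-R")}
        for rater, c in counts.items()
    }


rates = change_rates(REPORTED_COUNTS)
for rater, r in rates.items():
    print(f"Rater {rater}: LCS {r['LCS']:.1%}, CRS-R {r['CRS-R']:.1%}")
```

For rater A, this gives changes in about 70% of pairs with the LCS versus about 59% with the CRS-R; for rater B, about 77% versus about 58%.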
3.6. LCS and CRS-R Relationship
The coefficients of the correlations found between LCS and CRS-R total scores were 0.842 or above (strong) for the entire patient group (p < 0.001). For the first and second raters, and for evaluations 1 and 2, respectively, they equaled 0.842 and 0.843, and 0.936 and 0.949 for the entire patient group (strong); 0.582 and 0.651, and 0.914 and 0.872 for the MCS group (moderate to strong); and 0.554 and 0.601, and 0.789 and 0.853, for the UWS group (moderate to strong) (p < 0.05). Total LCS scores at evaluation 1 were significantly related to a diagnosis of full consciousness based on CRS-R scores at evaluation 2 (Odds ratio, OR = 1.260; 95% confidence interval, CI = 1.002–1.583; p = 0.048).
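To make the reported odds ratio easier to interpret, the sketch below rescales it to a larger score difference and checks that the reported p value is consistent with the confidence interval. It assumes that the OR refers to a one-point increase in the total LCS score in a logistic regression and that the interval is a Wald interval; neither assumption is stated explicitly in this section, and the function names are illustrative.

```python
import math

# Values reported in this section.
OR_PER_POINT = 1.260
CI_LOW, CI_HIGH = 1.002, 1.583


def scaled_odds_ratio(or_per_unit: float, delta: float) -> float:
    """Odds ratio for a difference of `delta` units, assuming a log-linear model."""
    return or_per_unit ** delta


def wald_p_value(odds_ratio: float, ci_low: float, ci_high: float,
                 z_crit: float = 1.96) -> float:
    """Two-sided p value recovered from an OR and its 95% Wald interval."""
    beta = math.log(odds_ratio)
    standard_error = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / standard_error
    return math.erfc(abs(z) / math.sqrt(2))


or_10_points = scaled_odds_ratio(OR_PER_POINT, 10)
p_from_ci = wald_p_value(OR_PER_POINT, CI_LOW, CI_HIGH)
```

Under these assumptions, a 10-point higher total LCS score at evaluation 1 corresponds to roughly tenfold higher odds of a full-consciousness diagnosis at evaluation 2, and the p value recovered from the interval is close to the reported 0.048. The lower confidence bound is very close to 1, so the estimate is imprecise.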
4. Discussion
This study adds evidence required to establish the suitability of LCS for bedside assessment of communicative performance in patients with DOC. It presents the first new evidence about the LCS since the original presentation of the tool [
18]. It provides a more detailed demonstration of the LCS psychometric properties and allows their generalization for patients with traumatic or non-traumatic brain injuries in both UWS and MCS. The results support the use of LCS for the assessment of DOC patients in clinical or research settings, in accordance with the criteria of the American Congress of Rehabilitation Medicine [
8].
The findings show that the instrument is reliable and responsive, and support its validity for the entire DOC group and each of its UWS and MCS subgroups. For all the DOC groups, LCS reliability was manifest in the good or excellent inter-rater agreement for the majority of the tasks (
Table 2 and
Table 3); in the generally good correlations between scores of the two raters, with only small differences between them (
Table 4 and
Table 5); and in the high ICC values (
Table 6). It was also manifest in the good or high internal consistency values for the total LCS score and most LCS subscales (
Table 7,
Table 8 and
Table 9). LCS responsiveness was demonstrated by the comparison showing that it was at least as responsive as CRS-R (
Table 10). Together with the moderate-to-strong correlations found between the LCS and CRS-R scores, this supports LCS criterion validity.
Overall reliability was good despite the tendency of the “respiration” LCS subscale and certain tasks to compromise it. The “respiration” subscale reduced the internal consistency of the total LCS score (
Table 7), and the “response to noise”, “blink reflex”, “responsiveness”, “matching”, “mimicry”, “assisted respiration”, “spontaneous respiration”, and “comprehension of two-step commands” tasks contributed to the relatively low inter-rater agreement, a small significant difference between raters’ scores, and reduced internal consistency of the subscales (
Table 2,
Table 3,
Table 4,
Table 5,
Table 6,
Table 7,
Table 8 and
Table 9).
The “respiration” subscale reduced the internal consistency because of the ceiling effect reflected in the scores of the “assisted respiration” and “spontaneous respiration” tasks, which were high relative to the tasks in the other subscales, probably because patients were admitted to the ICCR only after being taken off the ventilator.
The decrease in internal consistency caused by the “response to noise” task may be explained by the fact that the response is at times subtle and may go unnoticed. The “matching” and “comprehension of two-step commands” tasks reduced the internal consistency of their subscales, possibly because they were difficult for patients with DOC; many of the participants therefore scored 0 on these tasks, so that their scores hardly correlated with those of the other tasks in their subscales.
Comparison with CRS-R supports LCS validity. Previous publications showed that the CRS-R, which assesses similar properties, is itself a proper criterion for LCS assessments [
1,
8,
15]. The Coma Recovery Scale (CRS) was developed in 1991 by Giacino et al. to evaluate consciousness; its revised version, CRS-R, introduced in 2004, is customarily used for differentiation between MCS and UWS with high reliability, validity, and sensitivity [
1,
24]. CRS-R is recognized as the most comprehensive and reliable behavioral assessment for DOC [
8,
15], and it has been translated into several languages [
25,
26,
27,
28,
29]. Several publications showed that CRS-R detected responses indicating improved consciousness in patients with DOC better than the Disability Rating Scale (DRS), the Wessex Head Injury Matrix (WHIM), the Glasgow Coma Scale (GCS), and the Full Outline of Unresponsiveness scale (FOUR). They also showed reasonable correlations between CRS-R scores and the scores of these scales [
1,
27,
28,
30]. Although the correlation between LCS and CRS-R scores was good, it was not very high. This and the differences found between them confirm that the scoring of the two scales is not identical and that they measure somewhat different things. Although both scales include the two components characterizing emergence from MCS, functional interactive communication and functional use of two different objects [
3], CRS-R scores consciousness, whereas the LCS focuses more on communication. Additional support for LCS validity is its predictive ability, demonstrated by the significant relationship between the first LCS assessment and full consciousness diagnosis at evaluation 2. Its good reliability, the supported validity, and ease of administration make the LCS suitable for assessment of communicative performance in patients with DOC. In our studies of patients with DOC, therefore, we used CRS-R for the assessment of consciousness state [
31,
32] and the LCS for the assessment of communication [
32]. A further consideration is that the LCS instructions, although clear, leave some details to the clinician’s professional judgment, which likely allows a more complete assessment of responses but may also reduce exact reproducibility.