EncephalApp Stroop Test as a Screening Tool for the Detection of Minimal Hepatic Encephalopathy in Patients with Cirrhosis—Single-Center Experience

Background: Minimal hepatic encephalopathy (MHE) is the mildest form of hepatic encephalopathy. One of the neuropsychological tests that detects MHE is the Stroop test (via EncephalApp). The aim was to evaluate the Stroop test for the screening and diagnosis of MHE. Methods: This prospective case-control study was performed at the Clinic for Gastroenterology and Hepatology, University Clinical Center of Serbia, and included patients with cirrhosis and MHE and healthy controls. In all patients, the presence of MHE was confirmed using the animal naming test. The Stroop test was performed on each participant, and the results were compared between the two groups. The test has two components, the "OFF" and "ON" states. Results: A total of 111 participants were included. The median OFF time did not differ significantly between patients and controls (106.3 vs. 91.4 s, p > 0.05). However, in patients with MHE, the median ON time and total time were significantly higher (122.3 vs. 105.3 s and 228.0 vs. 195.6 s, respectively; p < 0.05). Significant differences between patients and controls in the examined parameters were detected in younger participants and in those with higher educational levels. Conclusions: The Stroop test displayed limited sensitivity in Serbian patients. Age and education affect time measurements and test performance.


Introduction
Hepatic encephalopathy (HE) is a brain dysfunction that results from liver insufficiency and/or portosystemic shunting and occurs in end-stage liver disease [1]. Even though it is considered common, the actual prevalence of HE is difficult to establish, mainly because of its variable clinical presentation, which can range from mild neuropsychiatric abnormalities to coma. In recent years, several classifications of HE have been proposed, and HE nomenclature remains a matter of debate. According to the widely used West Haven criteria, HE should be classified as minimal, grade I, grade II, grade III, or grade IV, where minimal HE (MHE) and grade I HE are considered "covert" HE, while HE grades II, III, and IV are considered "overt" HE (OHE) [2]. Despite this debate, because of their convenience and clinical significance, it has been decided that the terms "overt" and "covert" HE should remain in use [3].
Minimal hepatic encephalopathy is the mildest form of HE and is defined as the presence of signs of brain dysfunction (only detectable by specific psychometric tests) in the absence of disorientation and asterixis [1], which makes it impossible to diagnose using routine physical examination. It begins with subtle cognitive impairments, which later commonly progress to the overt form of HE. The estimated prevalence of MHE on presentation is up to 80% [1,4]. The diagnosis of MHE is relevant because the condition is common, predicts OHE bouts, severely affects patients' quality of life, and reduces patients' socio-economic potential [5-7]. MHE is associated with an increased risk of falls, decreased ability to work, and impaired driving skills, the last of which is possibly associated with more traffic accidents and both mild and severe traffic violations [5,6,8]. However, MHE remains underdiagnosed. According to a survey conducted by the American Association for the Study of Liver Diseases (AASLD), only 50% of medical professionals evaluated their patients for the presence of MHE [9].
Three groups of tests are currently available for diagnosing MHE: neuropsychological (pen and paper, digital), neurophysiological, and psychophysical [5,10]. Two of the best-known neuropsychological pen and paper tests are the psychometric hepatic encephalopathy score (PHES) and the animal naming test (ANT). Digital neuropsychological tests include continuous reaction time, the inhibitory control test, and the SCAN test, while neurophysiological and psychophysical tests include electroencephalogram and critical flicker frequency. All of these require time, equipment, and trained personnel and are rarely used in regular clinical settings [5]. One frequently used digital neuropsychological test is the Stroop test (via EncephalApp), designed to evaluate patients' psychomotor speed and cognitive flexibility. Bajaj et al. developed EncephalApp in 2013 as a smartphone application for administering the Stroop test; it can rapidly identify patients at risk of HE, helping physicians with the early detection of MHE and the prevention of its complications and sequelae [11]. The EncephalApp Stroop test is widely used to detect MHE and predict the initial episode of OHE. Bajaj et al. showed that this app-based test is as accurate as other available tests but is much simpler to perform (it does not require trained personnel), significantly more comfortable for both the patient and the person performing the test (doctors, other medical staff), and can be completed in 5 min [11,12]. Moreover, to shorten the test even further, a newer version of the Stroop test (QuickStroop) was recently developed, which can be performed within one minute with similar sensitivity and specificity [13].
The studies performed by Bajaj et al., as well as studies from other countries, indicate that various socio-demographic factors, such as the level of education and experience with smartphone use, can significantly influence the test results [14]. Even though official recommendations cannot be made regarding the primary prophylaxis of OHE, given its importance and significance, the EncephalApp Stroop test has been increasingly (although externally) used not only for MHE identification but also for the prediction of the first OHE episode [15]. Therefore, there is a justified need to validate this test in different populations.
The Stroop test is convenient and easy to use and implement in everyday clinical practice. Therefore, this study aimed to evaluate the Stroop test (translated, Serbian version) for the screening of MHE in a tertiary care center in Belgrade, Serbia.

Study Design
This prospective case-control study was performed from February 2020 to August 2022 at the Clinic for Gastroenterology and Hepatology, University Clinical Center of Serbia. Both healthy controls and patients with cirrhosis and MHE were included in this study. Cirrhosis was diagnosed based on the following criteria: pathohistological verification and/or typical laboratory, endoscopic, and radiological findings in a setting of known chronic liver disease (liver nodularity on imaging, endoscopically or radiologically verified presence of portosystemic collaterals, laboratory signs of decreased synthetic and excretory liver function). All patients were screened for MHE using the ANT (MHE was defined as naming fewer than 15 animals in one minute) [16-18]. The animal naming test is a rapid test used to evaluate semantic fluency, in which patients are asked to name as many animals as possible in one minute. It has been increasingly used as a fast and reliable bedside test for MHE screening in patients with cirrhosis [16-19]. Exclusion criteria were as follows: age < 18 years, uncontrolled OHE (defined as a Mini-Mental State Examination score < 25), organic brain syndrome (including degenerative and ischemic brain disorders, neuroinfection, metabolic disorders), consumption of alcohol or psychoactive substances (including sedatives and antipsychotic medications) at least one month prior to testing, metabolic syndrome, and color blindness. Color blindness was excluded using the "pseudoisochromatic Ishihara test" application. Age-matched healthy controls consisted mostly of patients' friends and family members, as well as healthcare workers.

Gastroenterol. Insights 2024, 15

EncephalApp-Stroop Test
The application was downloaded from the Play Store (EncephalApp Stroop) and used on Samsung Galaxy E SM-T561 tablets. A translated Serbian version of EncephalApp was used. The Stroop test measures the presence of psychological impairment by asking patients to perform two simple tasks: first, to identify the color of pound signs (#) appearing on the screen, and then to identify the color of the text of words written on the screen. The test has two components, the "OFF" and "ON" states. In the "OFF" state, the patient sees a neutral stimulus (pound signs, ###) presented in red, green, or blue and is advised to respond in a timely manner by touching the matching color among the color options displayed at the bottom of the screen. In the "ON" state, the patient sees discordant stimuli (the name of a color printed in a different color) and has to touch the color in which the word is printed (Figure 1). The specific outcomes recorded after the Stroop test were (1) OFF time (total time for five correct runs in the "OFF" state), (2) ON time (total time for five correct runs in the "ON" state), (3) time difference (ON time minus OFF time), and (4) total time (ON time plus OFF time) [11,14,15].
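The two derived outcomes follow directly from the two measured times. The sketch below illustrates this arithmetic only; the `StroopResult` container and the example values are hypothetical and are not taken from EncephalApp or from the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StroopResult:
    """Hypothetical container for one participant's Stroop times, in seconds."""

    off_time: float  # total time for five correct runs in the "OFF" state
    on_time: float   # total time for five correct runs in the "ON" state

    @property
    def time_difference(self) -> float:
        # ON time minus OFF time
        return self.on_time - self.off_time

    @property
    def total_time(self) -> float:
        # ON time plus OFF time
        return self.on_time + self.off_time


# Made-up values, for illustration only
example = StroopResult(off_time=100.0, on_time=120.0)
print(example.time_difference, example.total_time)
```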
The study was approved by the Institutional Ethics Committee of the University Clinical Center of Serbia (protocol code: 341/19, date of approval: 14 September 2023).

Statistical Analyses
Results are presented as counts (absolute and relative), means ± standard deviation, or median (25th-75th percentile), depending on data type and distribution. The normality of the distribution was examined both graphically and numerically. Categorical variables were analyzed using the chi-square or Fisher's exact test, where appropriate. Continuous variables were compared between groups using parametric (t-test, ANOVA) or nonparametric (Mann-Whitney U test, Kruskal-Wallis test) tests. Linear regression was performed to evaluate the relationship between dependent and independent variables. To obtain a normal distribution of variables in the model, logarithmic transformation was applied. Group sample sizes of 59 and 52 achieve 57.52% power to reject the null hypothesis of equal means when the population mean difference is 33.0 with a standard deviation of 80.0 and a significance level (alpha) of 0.05. All p-values less than 0.05 were considered significant. All data were analyzed using SPSS 29.0 (IBM Corp. Released 2023. IBM SPSS Statistics for Windows, Version 29.0. Armonk, NY, USA: IBM Corp.).
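The power statement above can be checked approximately. The following is a minimal sketch that assumes a two-sided, two-sample comparison of means and uses a normal approximation; the function name `two_sample_power` is hypothetical, and the software and exact method behind the reported 57.52% are not stated in the text. Under this approximation the result is roughly 0.58, and a calculation based on the t distribution would be expected to give a slightly lower value.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(mean_diff, sd, n1, n2, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    Uses a normal approximation (an assumption of this sketch), with a
    common standard deviation `sd` in both groups.
    """
    std_normal = NormalDist()
    # Standard error of the difference between the two group means
    standard_error = sd * sqrt(1.0 / n1 + 1.0 / n2)
    # Standardized true difference
    delta = mean_diff / standard_error
    # Two-sided critical value
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Probability of rejecting in either tail
    return std_normal.cdf(delta - z_crit) + std_normal.cdf(-delta - z_crit)


power = two_sample_power(mean_diff=33.0, sd=80.0, n1=59, n2=52)
print(f"approximate power: {power:.3f}")
```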

Results
A total of 111 participants were included in the final analysis (59 with cirrhosis and 52 healthy volunteers). One patient with cirrhosis was excluded due to inadequate testing (lack of motivation to complete the testing). No difference in age was observed between patients and healthy controls. The majority of patients were classified as Child-Pugh class B (n = 37, 62.7%), and the most common etiology of liver disease was alcohol-related liver disease (n = 22, 37.3%). Clinical and socio-demographic characteristics of patients and healthy controls are presented in detail in Table 1.
The median number of unsuccessful attempts was identical in both groups. However, the mean value was significantly higher in the patient group. The percentage of participants with unsuccessful attempts was higher in the patient group; however, this result did not reach statistical significance. All median time measurements were higher in the patient group. Significant differences between groups were observed in time ON, total time, and time difference, while time OFF values were higher in the patient group but did not reach statistical significance (Table 2).
We also evaluated whether disease etiology affects time measurements. No statistically significant differences in time ON, time OFF, and total time were noted in patients with alcohol-related cirrhosis compared to patients with cirrhosis of other etiology (Table 3). The classification power of each time parameter was examined using the area under the curve. The highest area value was observed for time difference, followed by time ON, while the lowest area value was observed for time OFF (Figure 2). For presented parameters, cut-off values together with sensitivity (Sn) and specificity (Sp) were as follows: time difference 14.
Afterward, the patient group was stratified by Child-Pugh class and further compared. All time parameters had the highest medians in the class C group; however, these results did not reach statistical significance. The demographics and differences in time parameters by Child-Pugh class are presented in detail in Table 4.
Participants were further divided into subgroups defined by age (<45 years and ≥45 years) and education (≤12 years and >12 years of education). Given the known effect of the duration of formal education on cognitive functioning in older age [20], we dichotomized years of education at 12 to separate participants who had completed at least the secondary education curriculum from those who had not (according to the national educational policy). Sub-analyses of the examined parameters are presented in detail in Tables 5 and 6. Among younger participants (<45 years), all time measurements were significantly higher in patients than in healthy controls. In contrast, among older participants (≥45 years), no significant difference in time measurements was observed. In the >12 years education subgroup, the number of unsuccessful attempts was significantly higher in patients than in healthy controls. The same trend was not observed in the ≤12 years education subgroup. Interestingly, all time measurements were significantly higher in patients only in the >12 years education subgroup, while no statistically significant difference in time measurements was detected between patients and healthy controls in the ≤12 years education subgroup.
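The classification measures used in this section (area under the curve, and sensitivity and specificity at a cut-off) can be sketched as follows. This is an illustration under stated assumptions, not the analysis pipeline of the study: the function names are hypothetical, the values are made up, and a result above the cut-off is assumed to count as test-positive. The AUC is computed as the proportion of patient-control pairs in which the patient has the higher value, with ties counted as one half.

```python
def sensitivity_specificity(patient_values, control_values, cutoff):
    """Sensitivity and specificity when values above `cutoff` are test-positive."""
    true_positives = sum(value > cutoff for value in patient_values)
    true_negatives = sum(value <= cutoff for value in control_values)
    return (true_positives / len(patient_values),
            true_negatives / len(control_values))


def auc(patient_values, control_values):
    """AUC as the probability that a patient value exceeds a control value."""
    wins = 0.0
    for patient in patient_values:
        for control in control_values:
            if patient > control:
                wins += 1.0
            elif patient == control:
                wins += 0.5  # ties count as one half
    return wins / (len(patient_values) * len(control_values))


# Made-up time differences in seconds, for illustration only (not study data)
patients = [22.0, 15.5, 9.0, 30.1, 14.8]
controls = [8.0, 12.5, 16.0, 10.2, 13.9]

sn, sp = sensitivity_specificity(patients, controls, cutoff=14.2)
auc_value = auc(patients, controls)
print(sn, sp, auc_value)
```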
Finally, a multivariable model was used to assess the influence of cirrhosis on time measurements obtained with the EncephalApp Stroop test (Table 7). Due to high variability in several time measurements, logarithmic transformation was used for variance stabilization. Two models were fitted: one without adjustment and one adjusted for age and education. Both models reveal an increase in time measurements in the patient group compared to healthy controls. The log-transformed and non-transformed models show effects in the same direction, while the log-transformed models perform better than the non-transformed ones.
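As a sketch of the two models described above, the unadjusted and adjusted models for a time measurement T_i of participant i could be written as follows. The natural logarithm and the 0/1 coding of group (1 for patients, 0 for controls) are assumptions of this sketch, since the exact coding of the covariates is not stated.

```latex
\begin{align}
\log(T_i) &= \beta_0 + \beta_1\,\mathrm{group}_i + \varepsilon_i \\
\log(T_i) &= \beta_0 + \beta_1\,\mathrm{group}_i + \beta_2\,\mathrm{age}_i + \beta_3\,\mathrm{education}_i + \varepsilon_i
\end{align}
```

Under these assumptions, exp(β1) is the ratio of geometric mean times between patients and controls, so a positive β1 corresponds to longer times in the patient group.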

Discussion
Minimal hepatic encephalopathy is commonly encountered in patients with cirrhosis. Even though MHE is impossible to diagnose using standard physical examination, screening rates for MHE among clinicians have been reported to be low. Since MHE is associated with an increased risk of OHE (both first and repeated episodes), hospital admissions, worsened survival rates, and impaired quality of life, the adequate and timely diagnosis of MHE is essential in the treatment of patients with cirrhosis. Moreover, even though HE was previously considered "reversible", increasing evidence suggests that in some cases, permanent neurological sequelae resulting from neuronal and glial damage can remain [21]. It has been suggested that repeated bouts of OHE lead to permanent brain damage, clinically recognized as "persistent" HE. This has been reported in studies showing that in several cases, liver transplantation did not lead to the reversal of impairment [22]. These results suggest that there are several equally important pathophysiological components in HE development: one metabolic and reversible, the other neurodegenerative and irreversible. Moreover, Nardelli et al. have demonstrated that cirrhotic patients with prior HE showed significant learning impairment compared to those without prior bouts of HE, despite medical treatment [23]. Therefore, to prevent irreversible changes and provide patients with timely diagnosis and adequate treatment (prophylactic and therapeutic), the current guidelines of the European Association for the Study of the Liver strongly encourage routine screening for HE in all patients with cirrhosis [3].
In this study, we assessed the screening power of the EncephalApp Stroop test for discriminating MHE in a group of Serbian patients with cirrhosis and MHE that was previously diagnosed using the animal naming test. As expected, significant differences in the number of unsuccessful attempts and in time ON, total time, and time difference were observed in patients with MHE when compared to healthy controls, while time OFF was higher in patients without reaching statistical significance. According to our results, time difference had the highest predictive value for distinguishing patients with MHE from those with no neurocognitive alterations, with an AUC of 0.650 and a sensitivity and specificity of 61% and 63.5%, respectively, for a cut-off value of 14.2 s.
These results are in line with the study by Zeng et al., who reported a comparably modest AUC for time difference (0.53) [24]. For total time, various results have been reported in the literature, depending on the examined population. We observed that the total time cut-off for the detection of MHE was >211.7 s, which is considerably higher than the value reported by Zeng et al. [24]. However, since this cut-off value displayed limited sensitivity (64.4%), our results are only partially in concordance with that study, in which the authors suggested that a total time of >186.63 s was suggestive of MHE, with a sensitivity of 86% [24]. On the other hand, Kaps et al. reported that a total time cut-off value of >224.7 s had the best discriminatory ability for MHE diagnosis in the German population, which is similar to our proposed cut-off value [25]. However, in contrast to our study of Serbian patients, studies in the US population have shown rather good sensitivity and high specificity of the test, as well as effectiveness in assessing the risk of progression to overt hepatic encephalopathy [12]. Hanai et al. investigated QuickStroop performance and reported that a total time cut-off value of >218.3 s had the best discriminatory ability for MHE diagnosis in the Japanese population; they also extended their research to the usefulness of the QuickStroop test in predicting OHE bouts, which was not examined in our study [26].
We have also demonstrated that among younger participants (<45 years), all time measurements were significantly higher in patients, while no difference in time measurements between patients and controls was observed among older participants. The same pattern was observed in participants with higher levels of education (>12 years), while no difference was observed in participants with lower educational levels. Therefore, the differences between our results and previously published results could be explained by differences in the age and educational levels of the cohorts. In our cohort, the majority of participants (n = 75, 67.6%) had a lower level of education, which could be considered a surrogate for digital literacy and may also be the reason behind the low sensitivity and prolonged total time cut-off value of the EncephalApp Stroop test in this study group. Bajaj et al. have evaluated various factors that could influence time measurements, including electrolyte disturbances (mainly hyponatremia), performed therapeutic interventions, and the type of device used to complete the test (iPad vs. iPod). However, they did not report any differences in time measurements when participants were subdivided into groups according to age [14]. On the other hand, several other research groups did report an increase in time measurements with age, which led them to propose age-dependent cut-off values for time measurements [27,28]. This could be explained by the well-documented mild cognitive dysfunction that accompanies older age and results from various physiological and pathophysiological processes [29]. In addition, slight mental alterations can be caused by other co-morbidities, irrespective of liver disease or the presence of hepatic encephalopathy per se [30]. According to our results, higher time measurements were recorded in older participants (both patients and controls), which is in line with the results reported in the literature. In addition, Zeng et al. found that experience with electronic platforms and the duration of formal education were negatively correlated with the risk of an MHE diagnosis via the EncephalApp Stroop test [24], which is similar to our results. Interestingly, Cunha-Silva et al. reported that gender, age, education, and familiarity with smartphones had no influence on the test results [31]. These conflicting results could be partially explained by sample size; however, further, larger multi-center studies are required.

Limitations of the Study
We are aware of several limitations of our study. First, this is a single-center pilot study with a limited number of participants. Second, we identified patients with MHE only through the ANT and did not compare the EncephalApp Stroop test results and time parameters to other tools used in the screening and diagnosis of MHE. Third, no strict inclusion and exclusion criteria were applied to healthy controls. We did not perform a diagnostic work-up on healthy controls and included all volunteers without significant personal history who appeared healthy. However, this study could serve as a starting point and inspiration for future multi-center studies on this matter in Serbia, which could lead to official validation of the test in the Serbian language and, in turn, to timely MHE diagnosis. In addition, longitudinal studies evaluating not only the diagnostic but also the prognostic value of the Stroop test would be of great importance.

Conclusions
In summary, the EncephalApp Stroop test did not display good AUC values and sensitivity for the diagnosis of MHE in this cohort of Serbian patients. Additionally, we have demonstrated a significant effect of age and education on the time measurements and subsequent test performance. Even though we live in the digital era, a lack of digital literacy skills is common in our country, especially in the older population. This should be borne in mind when performing tests that require even basic digital knowledge. The EncephalApp Stroop test could be used as a rapid screening tool, but establishing the diagnosis of MHE likely requires a combination of diagnostic tests, depending on the patient.
Informed Consent Statement: Patient consent was waived by the Ethics Committee since test performance is a part of the local diagnostic algorithm. Healthy controls consented to be included in the study by agreeing to participate in testing.

Table 1. Socio-demographic and clinical characteristics of the examined population.
Results are presented as count (%), mean ± standard deviation, or median (range), where appropriate; a Independent samples t-test; b Pearson chi-square test; c Mann-Whitney U test.

Table 2. Group differences regarding the measurements.
Results are presented as mean ± standard deviation (with median where appropriate), count (%), or median (25th-75th percentile); a Mann-Whitney U test; b Pearson chi-square test.

Table 3. Time differences regarding disease etiology.
a Mann-Whitney U test.

Table 4. Demographics and time parameters concerning Child-Pugh class.
b Fisher's exact test; c Kruskal-Wallis test.

Table 5. Patients vs. controls in age subgroup analysis.

Table 6. Patients vs. controls in education subgroup analysis.
Data are presented as count (%) or mean ± standard deviation and median (25th-75th percentile); a Pearson chi-square test; b Fisher's exact test; c Mann-Whitney U test.

Table 7. Univariable and multivariable models with time measurements as dependent variables and group plus age and education as covariates.
Results are presented as beta (p-value); Eta: partial eta squared.