Psychometric Properties of the Polish Version of the 36-Item WHODAS 2.0 in Patients with Low Back Pain

The World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0) is considered by the World Health Organization (WHO) to be a useful tool for assessing the functioning and disability of the general population as well as the effectiveness of interventions. Until this study, the validity of the 36-item WHODAS 2.0 in chronic low back pain (LBP) had not been examined in Poland. This study was conducted on 92 patients suffering from chronic LBP admitted to the rehabilitation ward. The Polish version of the 36-item WHODAS 2.0, the SF-36 Health Survey (SF-36), the Oswestry Disability Index (ODI), the Hospital Anxiety and Depression Scale (HADS) and the Visual Analogue Scale (VAS) were used to assess patients. The scale score reliability of the entire tool for the study population was very high. The Cronbach's alpha for the entire scale was 0.92. For the overall result of the WHODAS 2.0, the Intraclass Correlation Coefficient (ICC2,1) was 0.928, which confirmed that the scale was consistent over time. The total result and the vast majority of domains of the 36-item WHODAS 2.0 correlated negatively with domains of the SF-36 questionnaire; thus, a higher WHODAS 2.0 score was associated with a lower score on the SF-36 questionnaire. We found that the minimal clinically important difference (MCID) for the total WHODAS 2.0 score in patients after rehabilitation for LBP was 4.87. Overall, the results indicated that the Polish version of the 36-item WHODAS 2.0 is suitable for assessing health and disability status in patients with LBP.


Introduction
Currently, low back pain (LBP) is the most common health problem [1], affecting an estimated 70% to 85% of the population [2,3]. The frequency of LBP is still increasing [4]. LBP places significant limitations on the functioning of individuals [5]. Chronic LBP is a serious burden on health, social, and work systems [6,7]. LBP is a major contributor to disability worldwide [8] and ranks sixth in terms of overall disease burden [9].
WHODAS 2.0, the SF-36 Health Survey (SF-36), ODI, the Hospital Anxiety and Depression Scale (HADS), the Visual Analogue Scale (VAS), and sociodemographic data were also collected; study II, mainly two days after study I (using the WHODAS 2.0); and study III, 1 month after the completion of rehabilitation in the hospital (to assess responsiveness, the WHODAS 2.0 and VAS were used) (Figure 1).

Ethics Approval
This study was approved by the Bioethics Committee of the University of Rzeszów (Resolution No. 33/05/2019). All participants were informed of the purpose and course of the study. They were also told that they could withdraw from participation at any time without any consequences. Moreover, they were asked to sign an informed consent form in order to take part in the research.

The 36-item WHODAS 2.0
In accordance with the WHO rules, the 36-item WHODAS 2.0 was translated and culturally adapted by the ICF Council at the Poland Health Protection IT Systems, led by Professor Anna Wilmowska-Pietruszyńska, based on the agreement with the WHO [32].
The 36-item WHODAS 2.0 is used to measure general disability and disability in six domains: Do1 Cognition (6 items), Do2 Mobility (5 items), Do3 Self-care (4 items), Do4 Getting along (5 items), Do5 Life activities (8 items), and Do6 Participation (8 items). During the interview, the responses refer to the last 30 days. Answers to the questions are rated on a 5-point scale identifying the level of difficulty or problem (1 = none; 2 = mild; 3 = moderate; 4 = severe; 5 = extreme or cannot do). The obtained results are converted to a scale from 0 to 100 [33]. The psychometric properties of the 36-item WHODAS 2.0 have been examined in a cross-sectional study of older people in Poland [34].

The SF-36 Health Survey (SF-36)
The SF-36 version 2.0 is a general tool for measuring health-related quality of life. The questionnaire contains 36 items, 35 of which measure eight domains: Physical functioning (10 items), Role limitations due to physical health (4 items), Bodily pain (2 items), General health perceptions (5 items), Vitality (4 items), Social functioning (2 items), Role limitations due to emotional problems (3 items), and Mental health (5 items).
The remaining item, Reported health transition (1 item), records changes in general health over the preceding year. The other items in the questionnaire concern experiences from the preceding month. In addition, the first four domains form the Physical component scale (PCS), whereas the next four form the Mental component scale (MCS).
The answers given by the participants are normalized so that the quality of life measures calculated on this basis are in the range of 0-100 points, where the value 0 is always the worst and the value 100 points relates to the best quality of life [35,36]. License agreement number QM030224 was obtained for using the SF-36 v. 2.0 questionnaire for the research.

The Oswestry Disability Index (ODI)
The modified ODI is a tool for assessing the functional disability of a patient with LBP. It includes 10 items referring to pain and activities of daily living, each scored from 0 to 5. The total score is calculated by summing the item scores, giving a range of 0 to 50 [37,38].

The Hospital Anxiety and Depression Scale (HADS)
The HADS is a commonly used self-assessment tool for detecting non-physical symptoms of anxiety and depression. It includes 14 items: seven items form the anxiety subscale (HADS anxiety) and the other seven form the depression subscale (HADS depression) [39]. Each item is rated on a 4-point scale ranging from 0 to 3. After adjusting for six items that are reverse-scored, the responses are summed to obtain the two subscale scores [40,41].

The Visual Analogue Scale (VAS)
The VAS is used to assess the intensity of pain using a visual scale, where 0 represents the total absence of pain, and 10 indicates unbearable pain.

Sociodemographic Data
Sociodemographic data were collected to provide basic information concerning sex, age, place of residence and education.

Statistical Analyses
The obtained data were analyzed using R software, version 3.6.1. Descriptive statistics were used for the initial data analysis.

Internal Consistency
Internal consistency was assessed using Cronbach's alpha coefficient. Cronbach's alpha values from 0.70 to below 0.95 were taken to indicate adequate internal consistency reliability of the scale [42,43].
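As an illustration of the coefficient described above, the following is a minimal Python sketch of Cronbach's alpha and of the 0.70 to <0.95 adequacy band. It is not the authors' code (the analyses were run in R); the function names and the sample responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def is_adequate(alpha):
    """Adequacy band used in the text: 0.70 <= alpha < 0.95."""
    return 0.70 <= alpha < 0.95


# Hypothetical responses of five people to four items rated 1-5.
sample = [
    [1, 2, 1, 2],
    [3, 3, 2, 3],
    [4, 4, 5, 4],
    [2, 2, 2, 1],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(sample)
print(round(alpha, 3), is_adequate(alpha))
```

The upper bound of the band matters because values of 0.95 or more are usually read as a sign of redundant items rather than of better reliability.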

Test-Retest Reliability and Measurement Error
The reliability of the 36-item WHODAS 2.0 was assessed using the test-retest method. The time between the two measurements, made by different interviewers, was 2 days on average. No significant changes in the patients' condition were expected during this period. The Intraclass Correlation Coefficient (ICC2,1), with a 95% confidence interval (CI), was used to measure relative reliability [42,44], i.e., the ability of a questionnaire to produce similar scores on 2 separate occasions of test administration when the patient's condition has not changed [45]. Relative reliability indicates the degree of consistency and agreement between two measures [46]. The standard error of measurement (SEM) was used to assess absolute reliability. Determining the absolute reliability of measures is critical to ensure that repeated measurements have satisfactory stability and sensitivity to real changes over time [45]. Absolute reliability indicates how much dispersion and error a measurement contains [46]. In this study, the SEM was calculated as follows: $\mathrm{SEM} = \mathrm{SD} \times \sqrt{1 - \mathrm{ICC}_{2,1}}$ [44,47-49]. In addition, the minimal detectable change at the 95% level (MDC95) was calculated. The MDC estimates the smallest change in the score that exceeds measurement error and can therefore be regarded as a true change. The MDC was obtained using the formula $\mathrm{MDC}_{95} = \mathrm{SEM} \times 1.96 \times \sqrt{2}$, where 1.96 is derived from the 95% CI of no change, and $\sqrt{2}$ accounts for the two measurements used to assess the change [44,47,50].
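The two formulas used here, SEM = SD × √(1 − ICC) and MDC95 = SEM × 1.96 × √2, can be checked with a short Python sketch. The function names are hypothetical and this is not the authors' R code; the only figure taken from the paper is the reported SEM of 3.77 for the total score.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change_95(sem):
    """MDC95 = SEM * 1.96 * sqrt(2); sqrt(2) reflects the two measurements."""
    return sem * 1.96 * math.sqrt(2.0)


# Reported SEM for the WHODAS 2.0 total score.
reported_sem = 3.77
print(round(minimal_detectable_change_95(reported_sem), 2))  # 10.45
```

With the reported SEM as input, the sketch reproduces the MDC95 of 10.45 given for the total score.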

Internal Structure
The internal structure of the 36-item WHODAS 2.0 was also assessed by analyzing the correlations between the items in a given domain and the domain itself, as well as correlations appearing between the items and the overall result. Pearson's correlation coefficient was used.

Floor and Ceiling Effects
Floor and ceiling effects were assessed by determining the percentage of subjects who obtained the lowest or highest possible score on the 36-item WHODAS 2.0. A floor or ceiling effect was considered present if more than 15% of participants obtained the lowest or highest possible score, respectively [51].
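The 15% rule can be expressed as a small Python sketch. The function name, the 0-100 score range used as defaults, and the sample scores are assumptions made for illustration only.

```python
def floor_ceiling_effects(scores, min_score=0.0, max_score=100.0, threshold_pct=15.0):
    """Percentage of respondents at the lowest/highest possible score.

    An effect is flagged when the percentage is strictly above threshold_pct.
    """
    n = len(scores)
    floor_pct = 100.0 * sum(1 for s in scores if s == min_score) / n
    ceiling_pct = 100.0 * sum(1 for s in scores if s == max_score) / n
    return {
        "floor_pct": floor_pct,
        "ceiling_pct": ceiling_pct,
        "floor_effect": floor_pct > threshold_pct,
        "ceiling_effect": ceiling_pct > threshold_pct,
    }


# Hypothetical domain scores for ten respondents on a 0-100 scale.
domain_scores = [0, 0, 10, 20, 30, 40, 50, 60, 70, 100]
print(floor_ceiling_effects(domain_scores))
```

In this hypothetical sample, 2 of 10 respondents sit at the minimum, so a floor effect would be flagged, while the single respondent at the maximum stays below the threshold.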

Convergent Validity
The convergent validity was assessed by correlating the results of the 36-item WHODAS 2.0 and the SF-36 questionnaire, the HADS and the ODI. The analysis was performed by examining Pearson's correlation coefficient. Adults with a lower quality of life and lower mood should have a higher level of disability [52]. We also assumed that the disability assessment using the ODI questionnaire would correlate with the assessment using the 36-item WHODAS 2.0. The greater the disability in the assessment of the ODI questionnaire, the higher the disability in the 36-item WHODAS 2.0.

Known Group Validity
Known group validity was assessed to test whether the 36-item WHODAS 2.0 distinguished between two groups that should differ in the level of the construct. We took into account the simplicity of assessing these problems and the possibility of assigning functional problems to the ICF framework. The occurrence of pain (ICF b280, sensation of pain) was considered a health problem affecting disability. The pain level was assessed using the VAS. For the purposes of the analysis, a dichotomous variable was created to divide the studied population into two groups based on the following cutoffs: VAS ≤ 5 and VAS ≥ 6. Adults with higher levels of pain should be characterized by a higher level of disability [53,54]. The 36-item WHODAS 2.0 results in the two groups were compared using Student's t-test.
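The grouping and comparison described above can be sketched in Python as follows. This is an illustrative sketch, not the authors' R code: the function names and data are hypothetical, VAS ratings are assumed to be whole numbers, and only the t statistic and degrees of freedom are computed (a p-value would additionally require the t distribution, for example from `scipy.stats.ttest_ind`).

```python
import math
from statistics import mean, variance


def split_by_vas(records):
    """Split (vas, whodas_score) pairs into VAS <= 5 and VAS >= 6 groups."""
    low_pain = [score for vas, score in records if vas <= 5]
    high_pain = [score for vas, score in records if vas >= 6]
    return low_pain, high_pain


def student_t(group_a, group_b):
    """Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical (VAS, WHODAS 2.0 total score) pairs.
patients = [(3, 30.0), (4, 35.0), (5, 33.0), (6, 45.0), (7, 50.0), (8, 48.0)]
low, high = split_by_vas(patients)
print(student_t(low, high))
```

A negative t statistic here means the lower-pain group has the lower mean disability score, which is the direction the hypothesis predicts.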

Responsiveness
Responsiveness refers to the ability of an instrument to detect clinically important changes resulting from an intervention. To assess responsiveness, the standardized effect size (ES) and the standardized response mean (SRM) were calculated. The ES is defined as the change in the mean score of the 36-item WHODAS 2.0 (between test 1 and test 3) divided by the SD of the baseline score. A paired-samples t-test was used to examine the mean change between test 1 and test 3. The SRM was calculated by dividing the mean score change by the SD of that score change. For both ES and SRM, absolute values of 0.20 or less, 0.21-0.79, and 0.80 or greater represent small, moderate, and large responsiveness, respectively [55].
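The two responsiveness indices and their interpretation bands can be written out as a short Python sketch. Function names and the sample scores are hypothetical; change is computed as follow-up minus baseline, so an improvement in disability gives negative values.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """ES = mean change / SD of the baseline scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """SRM = mean change / SD of the change scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def magnitude(value):
    """Bands from the text: <= 0.20 small, up to 0.79 moderate, >= 0.80 large."""
    v = abs(value)
    if v <= 0.20:
        return "small"
    if v < 0.80:
        return "moderate"
    return "large"


# Hypothetical total scores at test 1 and test 3 for three patients.
test1 = [40.0, 50.0, 60.0]
test3 = [30.0, 45.0, 50.0]
es = effect_size(test1, test3)
srm = standardized_response_mean(test1, test3)
print(round(es, 2), magnitude(es), round(srm, 2), magnitude(srm))
```

Because the two indices share a numerator but use different denominators, the SRM exceeds the ES in absolute value whenever patients change by similar amounts, as in this sample.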
To further assess responsiveness, the minimal clinically important difference (MCID) with its standard error (SE) was estimated [56]. The MCID was calculated on the basis of linear regression analysis, in which the dependent variable was the change in the 36-item WHODAS 2.0 between the 1st and the 3rd study (separately for each domain), and the independent variable was the change on the VAS, so that the MCID corresponds to the change in WHODAS 2.0 score associated with a 1-point change on the VAS.
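One way to read this regression is that the MCID is the slope of WHODAS 2.0 change on VAS change, reported with its standard error. The Python sketch below implements ordinary least squares with an intercept; whether the authors' model included an intercept or covariates is not stated, so this is an assumption, and the function name and data are hypothetical.

```python
import math
from statistics import mean


def mcid_from_regression(vas_change, whodas_change):
    """Slope (and its SE) of WHODAS change regressed on VAS change.

    The slope is the WHODAS change associated with a 1-point VAS change.
    """
    n = len(vas_change)
    mean_x, mean_y = mean(vas_change), mean(whodas_change)
    sxx = sum((x - mean_x) ** 2 for x in vas_change)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(vas_change, whodas_change))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(vas_change, whodas_change))
    slope_se = math.sqrt(sse / (n - 2) / sxx)
    return slope, slope_se


# Hypothetical changes (test 3 minus test 1) for six patients.
vas_delta = [-1, -2, -3, -4, -2, -3]
whodas_delta = [-4.0, -11.0, -14.0, -21.0, -9.0, -16.0]
slope, se = mcid_from_regression(vas_delta, whodas_delta)
print(round(slope, 2), round(se, 2))
```

Running the same function on each domain's change scores would give one slope and standard error per domain, matching the "value ± SE" form in which the MCIDs are reported.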

Socio-Demographic Characteristics
In the studied adult population, 61.96% were women. The average age was 66.0 (SD = 11.6) years. Slightly more of the respondents lived in the countryside (52.17%). Most of the respondents had secondary education (48.91%). The average level of pain on the VAS scale in the studied population was 5.77 points. According to the 36-item WHODAS 2.0, the average disability score for the study group was 41.53 ± 13.84. The highest average level of disability was observed in Do2 Mobility (

Internal Consistency
The scale score reliability of the 36-item WHODAS 2.0 for the study population was very high. The Cronbach's alpha test result for the entire scale was 0.921. As for the Cronbach's alpha for the individual domains, it ranged from 0.786 (Do4 Getting along) to 0.904 (Do5 Life activities) ( Table 2).

Test-Retest Reliability and Measurement Error
The ICC2,1 for the domains ranged from high (0.759 for Do5 Life activities) to very high (0.936 for Do4 Getting along). For the overall result of the WHODAS 2.0, the ICC2,1 was 0.928, which confirmed that the scale was consistent over a period of approximately 2 days (Table 2). The WHODAS 2.0 total score was characterized by a low measurement error (SEM = 3.77). Among the domains, the smallest SEM was found in Do4 Getting along (SEM = 4.44), and the largest in Do5 Life activities (SEM = 11.01). The SEM for the overall result and for all WHODAS 2.0 domains was less than 50% of the respective standard deviation, except for Do3 Self-care and Do5 Life activities, where the measurement error was at the limit of 50% of the standard deviation. The MDC was the smallest (best) for the total result (MDC95 = 10.45), indicating that a change of at least 10.45 points in an individual's score is needed to be confident that the change is not simply attributable to measurement error (Table 2).

Internal Structure
All WHODAS 2.0 subscales were moderately or strongly correlated with the total result (r ranged from 0.482 to 0.762) (Table 3).

Floor and Ceiling Effects
No floor or ceiling effects were found for the overall result of the 36-item WHODAS 2.0. However, over 15% of respondents reported the lowest possible score for Do1 Cognition and Do4 Getting along, indicating possible floor effects for these two domains (Table 2).

Convergent Validity
The convergent validity was tested by correlating the results obtained with the 36-item WHODAS 2.0, the results of the SF-36, the HADS and the ODI questionnaires.
The total result and the vast majority of domains of the 36-item WHODAS 2.0 were negatively correlated with domains of the SF-36 questionnaire; thus, a higher score on the WHODAS 2.0 (higher disability) was associated with a lower score on the SF-36 questionnaire (lower quality of life). The weakest correlations were found for Do3 Self-care and Do4 Getting along.
The total result and all the domains of the 36-item WHODAS 2.0 were positively correlated with each domain of the HADS questionnaire; thus, a higher score on the 36-item WHODAS 2.0 (higher disability) was associated with a higher score on the HADS questionnaire (anxiety and depression). These findings confirm that adults with higher anxiety and depression are characterized by a higher level of disability (Table 4). The total result of the 36-item WHODAS 2.0 and all the domains were correlated with the ODI questionnaire. These findings confirm that adults with a higher physical disability as measured by the ODI have a higher level of disability as measured by the 36-item WHODAS 2.0 (Table 4).

Known Group Validity
We found significant differences between the two pain subgroups. The subgroups differed in the total score of the 36-item WHODAS 2.0 and in all domains, with the possible exception of Do4 Getting along. These findings indicate that adults with higher levels of pain are likely characterized by a higher level of disability (Table 5).

Responsiveness
The scores in all WHODAS 2.0 domains decreased between the first and third study (i.e., from test 1 to test 3). Nearly all domains showed a moderate to large degree of responsiveness, as indicated by the ES and SRM values. The largest MCID was found for Do2 Mobility (7.93 ± 0.70), and the smallest for Do1 Cognition (1.71 ± 0.34) (Table 6).

Discussion
To the best of our knowledge, this is the first study to evaluate the psychometric properties and validity of the Polish version of the 36-item WHODAS 2.0 in patients with chronic LBP. This study is important due to the need to implement valuable and reliable clinical tools for assessing the functioning and disability of patients with musculoskeletal pathology and for assessing rehabilitation progress. The implementation of the 36-item WHODAS 2.0 in Poland is associated with the simultaneous implementation of the ICF. Indeed, the LBP Core Set Self-Report Checklist (LBP-CS-SRC) has recently been developed to help people self-rate activity limitations and participation restrictions [57]. Although the LBP-CS-SRC is useful for understanding patients' perspectives [58], the 36-item WHODAS 2.0 has been defined as an instrumental tool for the clinical assessment of disability and the ability to function in patients with LBP by the latest WHO resolution for the International Classification of Diseases 11th Revision (ICD-11) [59].
The results of our research have shown that the Polish version of the 36-item WHODAS 2.0 presents good psychometric properties and can be useful for the clinical examination of adult patients with LBP in Poland.
We found very good scale score reliability for the entire Polish version of the 36-item WHODAS 2.0. In our case, the Cronbach's alpha for the whole scale was 0.92. The Cronbach's alpha for individual domains ranged from 0.79 to 0.90. The tool is reliable and not "redundant", i.e., it does not contain too many questions exploring the same subject. The tool meets Nunnally's criterion, according to which the Cronbach's alpha of a good scale must be >0.70 [43]. Similar reliability of the WHODAS 2.0 was reported by Silva et al. while validating the Portuguese version of the 36-item WHODAS 2.0 among 204 patients with musculoskeletal pain [26]. The authors of this article have also confirmed the reliability of the WHODAS 2.0 in 60-70 year-olds living in Poland, with a Cronbach's alpha of 0.89 for the whole scale and of 0.85 to 0.86 for individual domains [34]. Moreover, comparable reliability of the WHODAS 2.0 was obtained by Moen et al., where the Cronbach's alpha was 0.93 for the whole score and ranged from 0.75 to 0.94 for individual domains [60]. It is worth mentioning that other authors also obtained high WHODAS 2.0 reliability scores [61-63].
We confirmed the good repeatability of the Polish 36-item WHODAS 2.0. For the overall result of the 36-item WHODAS 2.0, the ICC2,1 was 0.93, and for the domains it ranged from 0.76 to 0.94. Kutlay et al., examining patients with osteoarthritis, obtained a test-retest ICC of 0.97 for the overall score and of 0.87-0.97 for individual domains [64]. For the Chinese version of the WHODAS 2.0, the ICC for the total score was 0.80, and for the domains it ranged from 0.83 to 0.89 [65]. Moen et al., examining patients referred for somatic rehabilitation, found acceptable ICC reproducibility for the total score and for all domains except Self-care [60].
In our study, the SEM for the overall score and the WHODAS 2.0 domains was less than 50% of the respective standard deviation. The exceptions were Do3 Self-care and Do5 Life activities, for which the measurement error was at the level of 50% of the standard deviation. For the overall result of the 36-item WHODAS 2.0, the SEM was 3.77, while the MDC95 was 10.45. In the study by Serrano-Dueñas et al., the SEM for the WHODAS 2.0 overall result slightly exceeded the 50% level (SEM = 51.7%) [66]. Silva et al. reported relatively small values of the SEM (2.94) and the MDC (8.15), indicating good reliability of the 36-item WHODAS 2.0 summary score [26].
No floor or ceiling effect was found for the overall result of the 36-item WHODAS 2.0. However, the Do1 Cognition and Do4 Getting along domains showed floor effects over 15%. The result for Do4 was on the border of the adopted threshold. In the Cognition domain, by contrast, a high floor result was expected, given the proportion of patients with no cognitive problems. The low percentage of ceiling and floor scores obtained in the summary score and in the other domains could support the use of these scores in rehabilitation assessment in patients with LBP. Serrano-Dueñas et al., assessing patients with Parkinson's disease, found floor effects in the Do1 Cognition and Do5 Life activities domains of the WHODAS 2.0, of 17.2% and 22.9%, respectively [66].
We found a positive correlation between the six domains of the Polish version of the 36-item WHODAS 2.0 (p < 0.001). A good correlation coefficient was identified between the total score and each domain (r = 0.482-0.762), demonstrating a good internal structure. Similarly, the total score had a good correlation with the six domains of the 36-item WHODAS 2.0 in the case of the traditional Chinese version (p < 0.05; r = 0.7-0.76) [67].
We also tested the convergent validity of the 36-item WHODAS 2.0. The overall result and the vast majority of domains of the 36-item WHODAS 2.0 were negatively correlated with domains of the SF-36 questionnaire; hence, a higher score on the WHODAS 2.0 (higher disability) was associated with a lower score on the SF-36 questionnaire (i.e., lower quality of life). All WHODAS 2.0 domains were correlated with the SF-36 domains of physical functioning and bodily pain and with the PCS and MCS. The weakest correlations were found between Do3 Self-care and Do4 Getting along of the WHODAS 2.0 questionnaire and the other SF-36 domains. Baron et al. demonstrated a strong correlation of the WHODAS 2.0 total score with the SF-36 PCS (τ = −0.51, p < 0.001) and a moderate correlation with the SF-36 MCS (τ = −0.43, p < 0.001). The WHODAS 2.0 domains, likewise, were all moderately to strongly correlated with the subscale domains and total scores of the SF-36 [20]. Other authors also obtained correlations between the 36-item WHODAS 2.0 and the SF-36 ranging from weak to high [22,24,68]. These findings confirm that adults with a lower quality of life have a higher level of disability.
The total result and all domains of the 36-item WHODAS 2.0 were significantly and positively correlated with each domain of the HADS questionnaire; thus, a higher score on the 36-item WHODAS 2.0 (higher disability) was associated with a higher score on the HADS questionnaire (anxiety and depression). These findings confirm that adults with higher anxiety and depression are characterized by a higher level of disability. We did not find any other studies in which the HADS was used to test the convergent validity of the 36-item WHODAS 2.0. However, studies using other scales assessing depression confirm the significant relationship between a higher level of depression and a higher level of disability. For instance, Rotarou et al. noticed a strong association of functional disability with increased depression in patients [69], whereas Sjonnesen indicated that the WHODAS 2.0 was sensitive to the impact of depression [70].
The total result and all domains of the 36-item WHODAS 2.0 were significantly correlated with the ODI questionnaire. These findings confirm the hypothesis that adults with a higher physical disability as measured by the ODI have a higher level of disability as measured by the WHODAS 2.0. Saltychev et al. showed that the total scores of the WHODAS 2.0 and the ODI were strongly correlated. The authors also suggested that disability in the population with LBP might be better estimated by the WHODAS 2.0 than by the ODI [27]. According to Varjonen, the WHODAS 2.0 and the ODI assessed the level of functioning of people experiencing LBP equivalently [28].
We confirmed that the 36-item WHODAS 2.0 had satisfactory validity for people with different health status. In our study, the WHODAS 2.0 results differed between people experiencing less and more pain. The same held for all domains except Do4 Getting along. These findings confirm our hypothesis that adults with higher levels of pain are characterized by a higher level of disability. In the research by Baron et al., patients with early arthritis were divided into two subgroups according to their results on the Center for Epidemiological Studies Depression Scale. These researchers noticed that the 36-item WHODAS 2.0 was able to distinguish between patients with low and high levels of depressive symptoms [20]. Additionally, Garin et al. pointed out that for most of the WHODAS 2.0 domains there were statistically significant differences between groups with different clinical severity of their medical condition, and between those who were professionally active and those inactive due to poor health (p < 0.001) [30]. Serrano-Dueñas et al. revealed that the Getting along and Life activities domains of the WHODAS 2.0 did not differ significantly across stages of the Hoehn and Yahr scale, whereas the other domains and the total scale did [66].
The values of all WHODAS 2.0 domains changed between the first and third study. Results in all domains decreased significantly, i.e., the disability decreased. Almost all domains showed a moderate to large degree of responsiveness in terms of both ES and SRM. The WHODAS 2.0 responsiveness scores for the total result were −1.35 (SRM) and −0.86 (ES) at 4 weeks after discharge. Meesters et al. found WHODAS 2.0 responsiveness scores of −0.35 (SRM) and −0.34 (ES) at 6 weeks after discharge [71]. Moreover, Garin et al., analyzing a group of patients whose health condition had improved, reported small to moderate responsiveness coefficients (ES = 0.3-0.7), which were nevertheless higher than those of the SF-36 [30]. Additionally, Chwastiak et al. showed that the WHODAS 2.0 was responsive (ES = 0.65) in assessing treatment results [24].
Federici et al. emphasized that the 36-item WHODAS 2.0 is adequate for assessing disability and health status. They also noted that, as it is an important issue for rehabilitation, an MCID score for the WHODAS 2.0 should be established [19]. We found that the minimal clinically important difference for the total WHODAS 2.0 score in patients after rehabilitation for low back pain was 4.87. The largest MCID was found for Do2 Mobility (7.93 ± 0.70), and the smallest for Do1 Cognition (1.71 ± 0.347). The 36-item WHODAS 2.0 can accurately capture changes in disability after rehabilitation in patients with LBP and thus can be used as a valid primary endpoint for clinical trials [72].
A weakness of our analysis is that our sample size prevented the use of a more robust test of the internal factor structure of the WHODAS 2.0 such as confirmatory factor analysis (CFA) [73]. The study's strengths include the use of standardized methods for the assessment of psychometric properties. It is the first study in Poland and one of the few in the world to analyze the usefulness of the WHODAS 2.0 questionnaire in assessing the disability of patients with chronic back pain. The scientific foundation of this issue is particularly important in the context of the implementation of the ICF and the WHODAS 2.0 questionnaire in Poland for general use in rehabilitation departments and other physiotherapy units.

Conclusions
These findings show that the 36-item WHODAS 2.0 is suitable for evaluating health status and disability, and that, according to its psychometric properties, it is a reliable and valid tool for assessing patients with chronic LBP. Because it can capture changes in disability after rehabilitation in patients with LBP, the 36-item WHODAS 2.0 can be used as a valid primary endpoint for clinical trials. Given that the 36-item WHODAS 2.0 is an easy-to-use, generic instrument, based on the principles of the ICF, with high feasibility, it could be considered a first-choice tool in the rehabilitation field that might support the management of people with disability due to LBP.

Conflicts of Interest:
The authors declare no conflict of interest.