1. Introduction
Carpal tunnel syndrome (CTS) is the most common peripheral neuropathy of the upper limb. It is characterized by pain and paresthesia in the area supplied by the median nerve [1]. The American Academy of Orthopaedic Surgeons (AAOS) and the American Society for Surgery of the Hand (ASSH) provide guidelines for the diagnosis of CTS.
In addition to a clinical examination and specific imaging tests, these guidelines include the use of standardized Patient-Reported Outcome Measures (PROMs), which provide valuable information about the symptoms, their impact on daily functioning, and the effectiveness of selected therapeutic procedures from the patient’s perspective [2]. There are over a dozen English-language PROMs for the assessment of the upper limb. Some of them are used for the subjective assessment of patients with dysfunction of the entire upper limb, such as the Disabilities of the Arm, Shoulder, and Hand Questionnaire (DASH) and its shortened version, the QuickDASH [3], the American Shoulder and Elbow Surgeons questionnaire (ASES) [4], or the Shoulder Pain and Disability Index questionnaire (SPADI) [5]. Others, such as the Michigan Hand Outcomes Questionnaire (MHQ), are specific to the hand [6]. All of them have good psychometric properties, but they do not contain items directly related to CTS, offering only general information on the degree of dysfunction or the severity of symptoms in the upper limb.
In the case of CTS, the most frequently used questionnaire is the Boston Carpal Tunnel Questionnaire (BCTQ) [7], developed by Levine et al. in 1993, which is used in many countries. Available language versions include Chinese [8], Japanese [9], Korean [10], Portuguese [11], Spanish [12], Thai [13], and Turkish [14,15], among others [16,17]. Results obtained with the BCTQ are readily comparable across studies. In 2006, Leite et al. published a systematic review of the psychometric properties of the BCTQ, which confirmed that it is a standardized tool for determining the severity of symptoms and the functional status of patients with CTS. It has good psychometric properties, such as validity, reliability, and sensitivity to clinical changes [18].
The aim of our study was to conduct the cultural and linguistic adaptation of the BCTQ to a Polish version and to assess its psychometric properties—reliability, validity, and responsiveness—among patients with CTS undergoing physiotherapy.
2. Materials and Methods
2.1. Participants
Between April and August 2019, we initially recruited 126 patients diagnosed with CTS for the study. Patients were selected based on an interview with either an orthopedist or a neurologist and underwent ultrasound (n = 50) or electromyography (EMG; n = 53). Ultimately, 103 patients diagnosed with mild or moderate CTS were included in the study. We took into account factors that would exclude patients from extracorporeal shock wave therapy, such as an implantable cardioverter defibrillator, cardiovascular failure, acute inflammation, fever, blood coagulation disorders and the use of anticoagulants, pregnancy, neoplastic disease, previous injuries or fractures of the upper limb, congenital defects of the upper limb, neurological diseases (e.g., Parkinson’s disease, multiple sclerosis, amyotrophic lateral sclerosis, polyneuropathy, and stroke), advanced degenerative and rheumatic processes involving the upper limb, cervical discopathy at the C5–Th1 level with root compression, the use of medications that impair psychophysical fitness, and inability to complete the questionnaires (aphasia and dementia disorders). All patients were native Polish speakers. This study was conducted at the outpatient physiotherapy unit of Jasło Health Center, Poland.
Figure 1 presents a flow chart depicting the recruitment process and subsequent testing stages of the patients.
2.2. Sample Size
A post hoc power analysis was conducted for the Intraclass Correlation Coefficients (ICCs), testing the null hypothesis ICC = 0.7 with a sample size of 103 individuals, a significance level of 0.05, and the expected ICC value in our population. The test power exceeded 0.999 for each scale, which showed that the sample size was statistically adequate.
2.3. Design
This was a cross-sectional study with repeated measures during retest examinations. Before conducting the research, on 31 October 2017, consent was obtained from the authors of the source version of the BCTQ, represented by Professor Barry Simmons, for the translation and cultural adaptation of the BCTQ to Polish and for the assessment of its psychometric properties.
Stage 1 involved translation and cultural adaptation of the BCTQ into Polish.
The process of translation and language adaptation was conducted in line with the international guidelines proposed by Beaton et al. [19].
Figure 2 presents the six steps of the linguistic and cultural adaptation of the BCTQ-PL.
Stage 2 involved a prospective evaluation of the essential psychometric properties of the BCTQ-PL.
The patients (n = 103) underwent three tests. During test I, the patients completed the Polish versions of all the questionnaires. During test II (retest), 2–7 days after test I, the patients completed only the BCTQ. After completing the questionnaire, each patient underwent the first extracorporeal shock wave therapy (ESWT) treatment; the therapy comprised a series of four treatments at weekly intervals (frequency 10 Hz, pressure 2 bar, 2500 strokes, focus transmitter 15 mm, and average energy density 0.22 mJ/mm²). Test III was performed 3 months after the completion of the full series of ESWT and involved filling in the Polish versions of all the questionnaires. All the patients answered all the questions.
2.4. Measurements
2.4.1. PROMs
Boston Carpal Tunnel Questionnaire (BCTQ)—Polish Version (PL)
The questionnaire consists of two scales. The Symptom Severity Scale (SSS) has 11 items covering the severity of pain during the day and at night, the frequency of painful episodes, numbness, weakness, and tingling in the hand, as well as difficulty grasping and handling small objects. The Functional Status Scale (FSS) includes eight items related to functionality, i.e., difficulty in carrying out such activities as writing, buttoning, holding a book, using a telephone, opening jars, housework, carrying shopping bags, washing, and dressing. The responses are given on a 5-point scale, where 1 means the lowest degree of symptoms/no difficulties with a given activity, and 5 means the most severe symptoms/inability to perform a given activity. Each scale produces a final score in the range of 1–5, which is the total of all the responses in that scale divided by the number of items answered. The result is rounded to the nearest hundredth. The higher the score, the greater the severity of symptoms or the degree of hand disability [7].
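The scoring rule described above can be illustrated with a short Python sketch. The function name and the example responses are hypothetical and serve only to show the arithmetic (sum of answered items divided by the number of items answered, rounded to the nearest hundredth); they are not patient data from this study.

```python
from typing import Optional, Sequence


def bctq_scale_score(responses: Sequence[Optional[int]]) -> float:
    """Score one BCTQ scale (SSS or FSS): mean of answered items, 2 decimals.

    `None` marks an unanswered item (hypothetical convention for this sketch).
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("at least one item must be answered")
    if any(r not in (1, 2, 3, 4, 5) for r in answered):
        raise ValueError("responses must be integers from 1 to 5")
    return round(sum(answered) / len(answered), 2)


# Hypothetical answers: 11 SSS items and 8 FSS items (one FSS item unanswered)
sss_example = [3, 2, 3, 4, 2, 3, 3, 2, 4, 3, 3]
fss_example = [2, 1, 2, None, 3, 2, 2, 1]

sss_score = bctq_scale_score(sss_example)  # 32 / 11 -> 2.91
fss_score = bctq_scale_score(fss_example)  # 13 / 7  -> 1.86
print(sss_score, fss_score)
```

Because the divisor is the number of items answered, a skipped item does not pull the scale score toward zero.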
Disabilities of the Arm, Shoulder, and Hand Questionnaire (QuickDASH)—Shortened Polish Version
The questionnaire contains 11 items related to the difficulties in the performance of activities using the upper limb, as well as the symptoms and their impact on social activities, work, and sleep. The score is in the range of 0–100, with the higher score corresponding to a greater impediment in the upper limb [3].
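For orientation, the mapping of 11 five-point items onto the 0–100 range can be sketched as follows. The formula used here, ((mean of answered items) − 1) × 25 with at most one missing item, is the commonly published QuickDASH scoring rule and is an assumption of this sketch; the present paper does not state it. The function name and example responses are hypothetical.

```python
from typing import Optional, Sequence


def quickdash_score(responses: Sequence[Optional[int]]) -> float:
    """QuickDASH disability/symptom score on a 0-100 scale.

    Assumes the commonly published rule: 11 items scored 1-5,
    no more than one item left unanswered (None).
    """
    if len(responses) != 11:
        raise ValueError("QuickDASH has 11 items")
    answered = [r for r in responses if r is not None]
    if len(answered) < 10:
        raise ValueError("at most one item may be missing")
    if any(r not in (1, 2, 3, 4, 5) for r in answered):
        raise ValueError("responses must be integers from 1 to 5")
    return (sum(answered) / len(answered) - 1) * 25


# Hypothetical respondent answering "3" to ten items and skipping one
example_score = quickdash_score([3] * 10 + [None])
print(example_score)  # 50.0
```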
Visual Analogue Scale (VAS)
The scale is used to determine the intensity of pain experienced by the patient. A point corresponding to the patient’s pain is marked on a 10 cm scale, where 0 corresponds to no pain and 10 reflects the worst pain imaginable [21].
2.4.2. Provocative Tests
Phalen’s Test
During the test, the patient performs palmar flexion of the wrist for 60–120 s. A positive test result is recorded (representing a median nerve injury) if, after this time, the patient reports increased symptoms, e.g., tingling and numbness [22,23].
Hoffmann–Tinel Sign
This test is performed by gently tapping, with the index finger, on the median nerve trunk at the level of the carpal flexion crease. The test result is considered positive (representing a median nerve injury) if radiating pain and paresthesia occur in the area supplied by the median nerve [22,23].
2.4.3. Muscle Strength Test
Hand Grip Strength Test with Dynamometer
Grip strength was measured in kilograms using a Jamar® hand dynamometer (Fabrication Enterprises Inc., White Plains, NY, USA; manufactured in China), in compliance with the guidelines of the American Society for Surgery of the Hand (ASSH) [24,25]. Three measurements were carried out and their arithmetic mean was calculated.
2.5. Statistical Analyses
The authors used SPSS Statistics software version 24. Statistical significance was set at p < 0.05. The distribution of the results was verified using the Kolmogorov–Smirnov test.
2.5.1. Reliability
The standard error of measurement (SEM) indicates the extent to which the values of a given measure will differ in subsequent measurements made under the same conditions for purely random reasons [28].
The minimal detectable change (MDC) defines the smallest difference between two measurements that (with 95% confidence) does not result only from random fluctuations [29].
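The two quantities can be related in a brief Python sketch. The formulas below (SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM) are the conventional definitions and are an assumption here, since this section does not state which formulas were applied; the input values are hypothetical and are not results of this study.

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM from the score standard deviation and a reliability coefficient (ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change_95(sem: float) -> float:
    """MDC at the 95% confidence level for a difference between two measurements."""
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical inputs: SD of scale scores = 0.60, test-retest ICC = 0.91
sem_value = standard_error_of_measurement(sd=0.60, icc=0.91)
mdc_value = minimal_detectable_change_95(sem_value)
print(round(sem_value, 2), round(mdc_value, 2))
```

The factor √2 reflects that a change score involves two measurements, each carrying its own measurement error.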
2.5.2. Validity
The construct validity was assessed by correlating (SCC) the scores of the BCTQ-PL with those of the reference questionnaires and tests. The authors evaluated the significance of the relationship between the values obtained in the BCTQ-PL and the results of the Hoffmann–Tinel sign and Phalen’s test using the t-test for independent samples.
The following ten a priori hypotheses were made: (1) BCTQ-PL SSS will strongly correlate with QuickDASH (both tools assess the symptom severity and the functional impact of upper limb conditions, so a strong correlation is theoretically expected); (2) BCTQ-PL FSS will strongly correlate with QuickDASH (FSS and QuickDASH both focus on functional limitations of the upper extremity, which favors a strong correlation); (3) BCTQ-PL SSS will strongly correlate with VAS (SSS covers pain and sensory disturbances, which are core factors measured by the VAS); (4) BCTQ-PL FSS will correlate moderately or weakly with VAS (FSS measures functional ability rather than pain intensity, which suggests a weaker theoretical link with VAS); (5) correlations between BCTQ-PL and QuickDASH will be stronger than those between BCTQ-PL and SF-36 (QuickDASH is disease-specific for the upper limb function, whereas SF-36 is a generic quality-of-life tool, leading to expected stronger correlations between the former and the BCTQ-PL); (6) correlations between FSS and SF-36 will be stronger than between SSS and SF-36 (FSS relates more directly to daily functioning, which aligns better with the functional domains assessed by SF-36); (7) correlations between FSS and SF-36 PCS will be stronger than between FSS and SF-36 MCS (FSS reflects physical disability, thus aligning more closely with the PCS of SF-36); (8) correlations between SSS and SF-36 PCS will be stronger than between SSS and SF-36 MCS (symptom severity in CTS, e.g., pain and numbness, impacts physical health more directly than mental health); (9) both BCTQ-PL scales will correlate moderately or weakly with hand grip strength (grip strength is an objective functional test, but the function is often preserved in mild-to-moderate CTS, leading to limited correlation with subjective symptom reports); and (10) both BCTQ-PL scales will significantly differentiate patients with positive vs. negative results on the Hoffmann–Tinel sign and Phalen’s test (these provocative tests are diagnostic tools for CTS, so differences in SSS and FSS scores between positive and negative groups are theoretically expected). According to the criteria proposed by Terwee et al. [26], construct validity of the BCTQ-PL is deemed acceptable when at least 75% of a priori hypotheses are confirmed in a sample of no fewer than 50 participants.
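The acceptance rule cited above reduces to a simple check, sketched below in Python. The function name is hypothetical; the example uses the figures reported elsewhere in this paper (eight of ten hypotheses confirmed in a sample of 103 patients).

```python
def construct_validity_acceptable(
    confirmed: int,
    total: int,
    n_participants: int,
    min_share: float = 0.75,  # at least 75% of hypotheses confirmed
    min_n: int = 50,          # sample of no fewer than 50 participants
) -> bool:
    """Apply the threshold described in the text to a set of a priori hypotheses."""
    if total <= 0 or not 0 <= confirmed <= total:
        raise ValueError("confirmed must be between 0 and total, total > 0")
    return confirmed / total >= min_share and n_participants >= min_n


# Figures reported in this paper: 8 of 10 hypotheses confirmed, 103 patients
acceptable = construct_validity_acceptable(confirmed=8, total=10, n_participants=103)
print(acceptable)  # True
```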
The unidimensionality of the BCTQ-PL questionnaire was analyzed separately for both scales. For this purpose, exploratory factor analysis (EFA) was applied to examine the number of dimensions of the SSS and FSS, and to reduce the information contained in the original detailed questions to a smaller number of directly unobservable factors (an eigenvalue greater than 1.0 and an explained variance of more than 10% were assumed as criteria for each subscale). The Kaiser–Meyer–Olkin measure of sampling adequacy (KMO) was set at >0.70 to indicate adequate sampling, and the significance level of Bartlett’s test of sphericity was p < 0.001, indicating that the EFA could be used for data analysis.
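The retention criteria named above (eigenvalue greater than 1.0 and explained variance above 10%) can be illustrated with a minimal NumPy sketch. It covers only the eigenvalue step on the item correlation matrix, not a full EFA with factor extraction and rotation as performed in SPSS, and it does not compute the KMO measure or Bartlett’s test. The data are simulated under an assumed two-factor structure and are not the study data; all names are hypothetical.

```python
import numpy as np


def count_retained_factors(item_scores: np.ndarray):
    """Eigenvalues of the item correlation matrix and the number of factors
    meeting both criteria: eigenvalue > 1.0 and explained variance > 10%."""
    corr = np.corrcoef(item_scores, rowvar=False)   # items are columns
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]    # sorted in descending order
    explained = eigenvalues / eigenvalues.sum()     # share of total variance
    retained = int(np.sum((eigenvalues > 1.0) & (explained > 0.10)))
    return eigenvalues, explained, retained


# Simulated example: 200 respondents, 8 items driven by two independent factors
rng = np.random.default_rng(0)
n_respondents = 200
factor_a = rng.normal(size=(n_respondents, 1))
factor_b = rng.normal(size=(n_respondents, 1))
noise = rng.normal(scale=0.5, size=(n_respondents, 8))
items = np.hstack([factor_a.repeat(4, axis=1), factor_b.repeat(4, axis=1)]) + noise

eigenvalues, explained, n_factors = count_retained_factors(items)
print(n_factors)
```

In this simulated case two eigenvalues dominate, which is the pattern one would expect for a scale whose items fall into two subgroups.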
2.5.3. Responsiveness
The Wilcoxon test was used to assess the significance of the changes in the BCTQ-PL scores before (test I) and after physiotherapy (test III).
Effect size (ES) was calculated to detect clinical changes in patients receiving physiotherapy. Absolute ES values ≤ 0.20 represent low responsiveness, values in the range of 0.21–0.79 reflect moderate responsiveness, and values ≥ 0.80 indicate a high responsiveness of the assessment tool used to detect clinical changes in the patient’s condition.
Standardized response mean (SRM) was interpreted as in the case of ES [30,31,32].
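A minimal Python sketch of both indices is given below. It assumes the conventional definitions (ES = mean change divided by the standard deviation of baseline scores; SRM = mean change divided by the standard deviation of the change scores), which this section does not spell out, and it applies the interpretation thresholds stated above. The before/after scores are hypothetical.

```python
import statistics
from typing import Sequence


def effect_size(before: Sequence[float], after: Sequence[float]) -> float:
    """ES: mean change divided by the SD of the baseline scores."""
    changes = [b - a for b, a in zip(before, after)]  # positive = score decreased
    return statistics.mean(changes) / statistics.stdev(before)


def standardized_response_mean(before: Sequence[float], after: Sequence[float]) -> float:
    """SRM: mean change divided by the SD of the change scores."""
    changes = [b - a for b, a in zip(before, after)]
    return statistics.mean(changes) / statistics.stdev(changes)


def responsiveness_label(value: float) -> str:
    """Thresholds from the text; values between 0.20 and 0.80 count as moderate."""
    magnitude = abs(value)
    if magnitude <= 0.20:
        return "low"
    if magnitude < 0.80:
        return "moderate"
    return "high"


# Hypothetical scale scores for five patients before and after therapy
before_scores = [3.0, 2.5, 3.5, 2.8, 3.2]
after_scores = [2.0, 1.8, 2.2, 2.1, 2.0]

es = effect_size(before_scores, after_scores)
srm = standardized_response_mean(before_scores, after_scores)
print(responsiveness_label(es), responsiveness_label(srm))
```

The two indices differ only in the denominator, so they diverge when patients respond to treatment with very different magnitudes of change.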
2.6. Ethical Considerations
The study was approved by the institutional Bioethics Committee at the University of Rzeszow, Poland (Resolution No. 5/01/2019).
4. Discussion
This paper presents the process of translation and cultural adaptation of the BCTQ to a Polish version, which was carried out in accordance with international guidelines [14,18,19]. During this stage of the study, no problems were identified and conceptual equivalence of the BCTQ-PL with the source version was obtained. The subsequent validation analyses demonstrated that the Polish version of the BCTQ is characterized by high reliability, validity, and responsiveness to clinical changes, which is in line with the findings reported by the authors of other language versions [33,34,35].
The reliability of the questionnaire was investigated through an analysis of the internal consistency and repeatability of the BCTQ-PL. Internal consistency was determined by calculating Cronbach’s α coefficient, which was ≥0.70, as anticipated. Our results are almost identical to those obtained for the original version of the questionnaire, with values of 0.89 for the BCTQ SSS and 0.91 for the FSS [7]. Similarly high values of Cronbach’s α were identified in the study by Kim et al., i.e., 0.89 for the SSS and 0.90 for the FSS [36], and by other researchers [9,10,16,17,34].
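For readers unfamiliar with the coefficient, Cronbach’s α can be computed as sketched below in Python, using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the response matrix are hypothetical and do not represent the study data.

```python
import statistics
from typing import Sequence


def cronbach_alpha(item_scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix (rows = respondents)."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    columns = list(zip(*item_scores))  # one tuple of responses per item
    item_variance_sum = sum(statistics.variance(col) for col in columns)
    total_variance = statistics.variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses of five respondents to three items (1-5 scale)
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 3],
    [4, 5, 4],
    [5, 4, 5],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2), alpha >= 0.70)
```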
In order to assess the repeatability of the BCTQ-PL questionnaire, a two-to-seven-day interval between the test and retest was used. This interval was long enough for the patients to forget the questions from test I, but short enough that their state of health did not change [9,10,15,33]. The good repeatability of the questionnaire was shown by the very high ICC values (over 0.90 for both scales), and the finding was consistent with the results reported in other studies [8,9,10,15,16,36,37]. Only the Persian version was found to present poorer repeatability than the other language versions, as reflected by the ICC values of 0.53 for the SSS and 0.77 for the FSS [17]. In the source version, the repeatability of the tool was measured by performing a test–retest procedure on two consecutive days and using Pearson’s coefficient. The result was 0.91 for the SSS and 0.93 for the FSS, which also shows excellent repeatability [7]. The test–retest analysis also included the calculation of the SEM and the MDC. The current findings related to the MDC indicate that clinicians and researchers may regard differences of at least 0.43 in the SSS and at least 0.59 in the FSS as dependable. Higher MDC values were reported by the authors of the Chinese version of the questionnaire, i.e., 0.86 for the SSS and 0.75 for the FSS [8]. However, in the Arabic version of the questionnaire validated by Alanazy et al., the MDC amounted to 4.7 and 4.5 for the SSS and the FSS, respectively [37].
As a result of the analyses concerning the construct validity assessment, eight out of ten (i.e., 80%) a priori hypotheses were confirmed. In accordance with the guidelines presented by Terwee et al. [26], this result indicates an acceptable construct validity of the BCTQ-PL. Two a priori hypotheses, the ones related to hand grip strength and the results of provocative tests (Phalen’s test and the Hoffmann–Tinel sign), were not confirmed. This may be explained by the fact that mild to moderate CTS cases were included in our sample (SSS = 2.98); in such cases, global hand grip strength often remains relatively preserved. Additionally, the sensitivity of provocative tests varies widely in the literature (ranging from 42% to 85% for Phalen’s test and from 38% to 100% for the Hoffmann–Tinel sign [22]), which may explain the limited discriminant power of the BCTQ-PL in differentiating between patients with positive and negative results. These findings further highlight the complexity of CTS evaluation and support the need for a multidimensional assessment approach that combines both subjective and objective measures. The validity of the original version of the BCTQ was confirmed using the correlation between the SSS and FSS and such measures as grip strength, pinch strength, the two-point discrimination test, and the Semmes–Weinstein monofilament test, as well as the velocity of midrange sensory conduction [7]. A correlation analysis similar to ours was performed by researchers from Japan [9]. In the Dutch version of the BCTQ, the authors assessed the validity by investigating the correlation between the BCTQ and the Likert scale, grip strength, ultrasound and electrophysiological examinations; however, no correlation between the BCTQ and the above measures was found [33].
We examined the unidimensionality of the BCTQ-PL using an EFA separately for both subscales. The FSS presented a unidimensional structure, while the SSS items clustered into two subgroups. Items no. 1, 2, 6, 8, 9, and 10, addressing nocturnal symptoms, were not related to everyday life, and items no. 3, 4, 5, 7, and 11 were more related to daytime symptoms concerning functionality. Very few BCTQ validation studies have performed a factor analysis similar to that reported in the current study [9,33,38]. The factor analysis of the Dutch version of the BCTQ, in a study in which the patients were examined before and after surgical intervention, showed findings similar to those of the present study as regards the unidimensionality of the FSS. However, according to the authors of that study, the SSS completed by patients before the surgery appears to measure three different subsets of items, namely, “daytime symptoms” (α = 0.80), “night symptoms” (α = 0.83), and “hand manipulation abilities” (α = 0.72). After the surgical intervention, the factor structure of the SSS changed further and the researchers distinguished only two subgroups of questions. The subsets of items related to “nocturnal symptoms” and “hand manipulation skills” merged into one coherent group of questions, while the “daytime symptoms” items remained as a separate subgroup [33]. Imaeda et al., in the Japanese version of the BCTQ, also distinguished two factors in the SSS [9]. These findings indicate the need for further research using such analyses as confirmatory factor analysis (CFA) or Rasch modeling to confirm the factor structure of the BCTQ-PL.
In the process of assessing the responsiveness of the BCTQ-PL to clinical changes, very clear, positive effects were observed following the physiotherapy applied in the study (p < 0.001). The severity of symptoms decreased by 1.04 on average, and the functional status of the patients improved by about 0.77. The study validating the original version reported a mean decrease in the SSS after surgery of 1.5 and an improvement in the FSS score of 1.0 [7]. In the present study, the ES values were very high (1.62 and 0.99 for the SSS and FSS, respectively), which indicates that the BCTQ-PL is highly responsive to clinical changes after therapy. The ES values in the present study were almost identical to those reported for the original version of the BCTQ [7] and consistent with the findings of other researchers [9,33,38]. The SRM values obtained for both scales also confirmed a high responsiveness of the BCTQ-PL. Similar SRM values were reported in the case of the Dutch version (SSS: 1.49 and FSS: 0.76) [33] and slightly lower values were found in the Chinese version (SSS: 1.03 and FSS: 0.62) [8].
Limitations and Future Considerations
The limitations of the present study include the fact that no assessment of psychometric properties was performed on patients undergoing other forms of intervention in CTS, such as the commonly used median nerve decompression surgery, injections of corticosteroids, or other types of physical therapy. Notably, the exclusion criteria applied, some of which were directly related to the contraindications for ESWT, may have inadvertently introduced sample imbalances, particularly with respect to comorbidities and the distribution of CTS in dominant vs. non-dominant limbs. Future studies should aim to validate the BCTQ-PL in heterogeneous groups of patients receiving treatment based on different therapeutic approaches and recruited from various healthcare centers, in order to improve the generalizability of the findings. Although one objective clinical measure (hand grip strength) was included, the validation process primarily relied on subjective PROMs. To further strengthen the construct validity of the BCTQ-PL, future studies should include additional objective clinical assessments, such as electroneurography or electromyography. A key limitation of the present study is that, although the SSS and FSS were used as separate scales in line with the original and other language versions of the BCTQ, the EFA suggested the possibility of a two-factor structure within the SSS. Future research should verify this finding using a confirmatory factor analysis (CFA) or Rasch modeling on larger and more diverse samples to assess whether distinguishing subscales for “nocturnal symptoms” and “daytime symptoms related to functionality” would improve structural validity.