Interrater and Intrarater Reliability of the EasyForce Dynamometer for Assessment of Maximal Shoulder, Knee and Hip Strength

This study aimed to determine the interrater and intrarater reliability of the EasyForce dynamometer for assessing shoulder, knee, and hip muscle strength in healthy young adults. Maximal isometric shoulder, knee, and hip strength was measured with the EasyForce in healthy adults (11 women and 12 men). Three repetitions each of shoulder internal rotation and abduction, knee flexion and extension, and hip abduction and adduction were performed. The tests were performed by three raters on the same day. The results showed good to high inter- and intrarater reliability (intraclass correlation coefficient range: 0.63–0.91). However, the coefficient of variation (CV) was slightly above the acceptable threshold for all tests (CV > 10%) except hip abduction on the right leg (CV = 7.2%), indicating lower absolute reliability. The EasyForce dynamometer can be considered a reliable tool for assessing shoulder internal rotation and abduction, knee extension and flexion, as well as hip abduction and adduction strength. The EasyForce dynamometer showed no differences between the raters' measurements, which could be of great importance for professionals who want to perform the tests regardless of the influence of their own strength on the measured values.


Introduction
The assessment of muscle strength has received considerable research attention in areas where it is important to analyze the health and physical status of individuals [1]. In sports science, muscle strength is assessed primarily to set normative standards for inclusion in certain sports [2,3], identify and select potential talents [2], improve physical performance [4], or determine the effects of the training process [5]. In medical science, strength assessment is required in the rehabilitation process following surgical interventions [6], to detect the risk of potential injury [7], to assess the status of patients with neurological diseases [8,9] or disease progression [4,6], and to set normative standards of muscle strength for the general population [10]. For adequate and widespread strength assessment, it is of great importance to provide valid, reliable, discriminative, and practical equipment for that purpose [4,6,11,12].
The two most commonly used methods for assessing muscle system functionality are the manual muscle test (MMT) and an isokinetic dynamometer [4,6,12–15]. However, both of these approaches have weaknesses. MMT is successfully used in patients with neurological impairments and during the rehabilitation process [14]. Existing research has agreed that this method is inexpensive, fast, and easy to perform, but has failed to explore deficiencies in larger muscle groups [16] and minor strength deficits in relation to normal

Participants
The study included 23 healthy young participants (12 males, 11 females; age: 21.4 ± 2.1 years) who reported being active in their leisure time. Mean body height was 1.82 ± 0.09 m for males, and 1.69 ± 0.06 m for females, and the mean body mass was 78.1 ± 8.9 kg for males, and 60.1 ± 6.2 kg for females. The inclusion criteria were the absence of injuries in the past 6 months and the absence of other medical conditions. All participants were thoroughly informed about the experimental procedures and signed an informed consent form before starting with the tests. The experimental protocol was approved by the Republic of Slovenia National Medical Ethics Committee (approval no. 0120-99/2018/5) and was performed in accordance with the latest revision of the Declaration of Helsinki.

Study Design and Procedures
All measurements were performed by three raters with a background in kinesiology, who were familiarized with the procedures before the measurements. Rater 1 was male (height, 1.70 m; weight, 68 kg), rater 2 was female (height, 1.67 m; weight, 59.5 kg), and rater 3 was female (height, 1.70 m; weight, 58.5 kg). The raters undertook a period of training and familiarization in the use of the EasyForce hand-held dynamometer (HHD) (Figure 1) to ensure competency and efficiency. In addition, a pilot study was carried out on two participants prior to the commencement of testing. Throughout the testing period, each rater was blinded to the values obtained by the other raters. The EasyForce device continuously records the pulling force (with ±1% accuracy as per the manufacturer). Once the force drops to zero, data acquisition stops and the results are displayed on the device. A new measurement begins only after the reset button is pressed. This is important because it excludes any additional forces caused by movements after the force has dropped to zero.
Before testing, the anthropometric characteristics of the participants were measured. Shoulder, hip, and knee strength measurements were taken with the EasyForce dynamometer within the same visit. Prior to the measurements, participants performed a 15 min warm-up consisting of 5 min on a stationary bike, followed by 10 min of dynamic stretching and bodyweight resistance exercises (lunges, squats, push-ups, glute bridges).
The order of the tasks was randomized across participants but was kept constant across raters for each individual participant. The order of the raters was also randomized for each participant. All assessments were performed on both sides. For all tasks, three trials were performed on each side with 30 s rest in between. Prior to each task, the subjects performed three warm-up trials at submaximal intensity (~50, ~70, and ~90% of self-perceived maximal effort) to familiarize themselves with the task. During the measurements with the EasyForce, the instruction was to build up the maximal force gradually (~1–2 s) and sustain it for an additional ~3–4 s. Verbal encouragement was given throughout the tasks. After completion of the practice trials, subjects completed three trials on each leg/arm, and measures were recorded.
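The randomization scheme described above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the function name `make_schedule`, the task and rater labels, and the seed are all hypothetical. Each participant receives one shuffled task order that is shared by all raters, plus an independently shuffled rater order.

```python
import random

# Task and rater labels are illustrative assumptions based on the protocol text.
TASKS = [
    "shoulder internal rotation", "shoulder abduction",
    "knee flexion", "knee extension",
    "hip abduction", "hip adduction",
]
RATERS = ["rater 1", "rater 2", "rater 3"]


def make_schedule(participant_ids, tasks=TASKS, raters=RATERS, seed=None):
    """Return {participant: {"task_order": [...], "rater_order": [...]}}.

    The task order is shuffled once per participant and reused by every
    rater; the rater order is shuffled separately for each participant.
    """
    rng = random.Random(seed)  # seeded generator so a schedule can be reproduced
    schedule = {}
    for pid in participant_ids:
        task_order = list(tasks)
        rng.shuffle(task_order)
        rater_order = list(raters)
        rng.shuffle(rater_order)
        schedule[pid] = {"task_order": task_order, "rater_order": rater_order}
    return schedule


# Example: a schedule for 23 participants (the seed value is arbitrary).
schedule = make_schedule(range(1, 24), seed=2022)
```

Storing a single task order per participant is what keeps the sequence constant across raters, so differences between raters are not confounded with task order.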

Set-Up for the EasyForce Measurement
Measurements with the EasyForce dynamometer were performed according to the manufacturer's recommendations (Figure 2). The EasyForce is a belt-stabilized HHD that continuously records tension force, with ±1% accuracy as assured by the manufacturer. After the force drops to zero, the measurement is terminated, and peak and average force are displayed. A new measurement is started only after resetting the device, which prevents any small movements performed after the measurement (i.e., after the force reaches zero) from influencing the recorded peak and average values.

For the knee extension assessment (Figure 2A), the participants were seated on a bed table, with the knee flexed to 90°, hands resting on the thighs, and the trunk in an upright position. The dynamometer was placed 2 cm above the malleolus, and the examiner was positioned behind the participant.
For the assessment of knee flexion strength (Figure 2B), the participants were in a prone position on the table with the tested knee flexed to 90° and the dynamometer placed at the same point on the body as for the knee extension. While the dynamometer was fixed to the floor with the examiner's foot, the hip abduction strength was assessed in the side-lying position (Figure 2C). The knee of the non-tested leg was flexed to 90°, while the upper leg was extended. The dynamometer was placed 2 cm above the lateral condyle, with the other end attached to the table. The examiner stabilized the pelvis during the measurements by holding it with both hands. For the assessment of hip adduction strength (Figure 2D), the same position was adopted, with one end of the dynamometer fixed to the bottom leg and the other end firmly attached to the bed table frame. Both legs were extended, and the examiner supported the upper leg, which was in slight abduction.
For the shoulder measurements (shoulder abduction and internal rotation), participants were in the prone position with their toes, abdomen, chest, and mentum touching the portable table (Figure 2E,F). For shoulder internal rotation (shoulder IR) (Figure 2E), the HHD transducer head was positioned just proximal to the ulnar styloid process on the ventral forearm. For shoulder abduction (Figure 2F), the participant was prone with the shoulder abducted to 90° and the elbow flexed to 90°, with the upper arm resting on the table. The upper arm, shoulder, scapula, and trunk were stabilized manually by the examiner's hand, arm, and trunk, if necessary.

Data Analysis and Statistics
Statistical analyses were done with SPSS (version 25.0, SPSS Inc., Chicago, IL, USA). Descriptive statistics are reported as mean ± standard deviation. Intraclass correlation coefficients (ICC) from the two-way random, single-measure model for absolute agreement, i.e., ICC(2,1), were used to assess the relative reliability of our outcomes. We considered ICC values < 0.5 as indicative of poor reliability, values between 0.5 and 0.75 as moderate reliability, values between 0.75 and 0.9 as good reliability, and values greater than 0.90 as excellent reliability [37]. Our previous study that assessed intervisit reliability of EasyForce for knee and hip muscles showed mostly excellent reliability (ICC > 0.90); therefore, we expected the ICC scores for this study to be >0.90. According to the recommendations by Bujang et al. [38], a sample of 18 participants would be needed to assure with 90% statistical power that the reliability is excellent (ICC > 0.90; the alternative hypothesis being that the reliability is below the good threshold; ICC < 0.75). Because interrater reliability could be lower than intervisit reliability, we increased the sample size to 23. Absolute reliability was assessed with typical error [39], expressed as coefficient of variation (CV). Based on previous studies, a CV below 10% was considered acceptable. In addition, ANOVA was used to analyze the agreement between the raters and to assess systematic between-rater bias, that is, whether values obtained by one rater systematically differed from those of another rater. Values were expressed as mean ± SD and 95% confidence interval (CI). Statistical significance was set at p < 0.05.
Table 1 shows the isometric strength data reported by the EasyForce HHD during the assessed movements of the shoulder, knee, and hip joint.

Interrater Reliability
Interrater reliability results are presented in Table 2 and Figure 3. Interrater ICC values ranged from 0.82 to 0.91 for the shoulder, 0.65 to 0.83 for the knee, and 0.63 to 0.89 for the hip (Figure 3), showing moderate to excellent relative reliability. One-way ANOVA showed no significant differences between raters for any of the strength tests, indicating no systematic bias. However, there was high within-individual variation (CV = 11.24–23.54%) for all tests.
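For readers who want to reproduce this kind of relative-reliability estimate outside SPSS, the two-way random, single-measure, absolute-agreement coefficient, ICC(2,1), can be computed from a participants × raters matrix using the standard Shrout and Fleiss mean-square formula. The sketch below is illustrative only and is not the authors' analysis script. The function names are hypothetical, and the handling of values that fall exactly on a category boundary (0.75 or 0.90) is an assumption, since the text does not specify it.

```python
def icc_2_1(scores):
    """ICC(2,1) for a matrix with one row per participant, one column per rater."""
    n = len(scores)      # participants
    k = len(scores[0])   # raters (or trials)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Mean squares from the two-way ANOVA without replication.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    # Absolute agreement: rater (column) variance counts against reliability.
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def classify_icc(icc):
    """Map an ICC to the categories used in this study (boundary handling assumed)."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"
```

Because the rater mean square appears in the denominator, a rater who reads consistently higher or lower than the others lowers ICC(2,1) even when participants are ranked identically, which is why this form suits interrater comparisons.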

Intrarater Reliability
Intrarater reliability results are presented in Table 3 and Figure 4. The ICC values obtained by rater 1 for all tests and both limbs ranged from moderate to excellent (0.66–0.91). Rater 3 obtained the highest ICC values, which indicated good to excellent reliability (0.76–0.91). The CV values (Table 3) were slightly above the acceptable threshold (CV > 10%), with the exception of hip abduction on the right leg (CV = 7.2%). Moreover, one-way ANOVA showed significant differences (p < 0.05) in several tests on both sides, indicating systematic bias. For the knee measurements, there were significant differences for rater 1 (right knee flexion: p = 0.046, eta squared (η²) = 0.103; left knee extension: p = 0.021, η² = 0.137), as well as for knee extension in both legs for rater 3 (right knee extension: p = 0.001, η² = 0.218; left knee extension: p = 0.016, η² = 0.126). For the shoulder and hip, there were significant differences (p < 0.05) for rater 1 (left shoulder abduction: p = 0.023, η² = 0.136) and rater 2 (left shoulder IR: p = 0.043, η² = 0.121; left hip abduction: p = 0.001, η² = 0.242).
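The two quantities reported in this section, the CV derived from typical error and the η² effect size, can be sketched as follows. The exact computations are not given in the text, so both functions rest on stated assumptions. Typical error is taken as the square root of the residual mean square from a participants × trials table (for two trials this equals the SD of the difference scores divided by √2) and is expressed as a percentage of the grand mean. η² is computed as the between-group share of the total sum of squares in a one-way layout. The function names are hypothetical.

```python
import math


def typical_error_cv(scores):
    """Return (typical_error, cv_percent) for a participants x trials matrix.

    Assumption: typical error = sqrt(residual mean square) of the two-way
    table, and CV = 100 * typical error / grand mean.
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_err = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    typical_error = math.sqrt(ss_err / ((n - 1) * (k - 1)))
    return typical_error, 100.0 * typical_error / grand


def eta_squared(groups):
    """One-way eta squared: between-group sum of squares / total sum of squares."""
    values = [x for group in groups for x in group]
    grand = sum(values) / len(values)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand) ** 2 for group in groups
    )
    ss_total = sum((x - grand) ** 2 for x in values)
    return ss_between / ss_total
```

A CV from this sketch would be compared against the 10% boundary used in the study. A log-transformed variant of the CV is also common and would give slightly different values.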

Discussion
The purpose of this study was to evaluate the inter- and intrarater reliability of the new portable EasyForce dynamometer for assessing the isometric strength of the shoulder, knee, and hip muscles. Overall, the results demonstrate moderate to excellent inter- and intrarater reliability with high within-individual variation for average peak torques in all tests.
Traditionally, shoulder internal rotation strength is assessed in a seated [40], supine [41], or prone position [40]. All shoulder joint measures of isometric strength used in this study demonstrated clinically acceptable levels of interrater reliability (ICC 0.82–0.91) using the prone position. Inter- and intrarater reliability results for shoulder isometric strength measures were similar to those of Hayes et al. and Cadogan et al. [42,43] for abduction (ICC = 0.84–0.96). In addition, isometric testing of the shoulder abductors using the HHD has shown excellent reliability in patients with shoulder pain (ICC = 0.77–0.98) [43] and in an asymptomatic university population (ICC = 0.94) [44]. Regarding shoulder IR, Katoh [45] found high ICC values (>0.9) when examining the test-retest reliability of the HHD. Despite the high ICCs for shoulder measurements, our results revealed some systematic differences between trials and low absolute interrater reliability (CV > 10%).
Differences between trials for raters may be due to possible alterations in their technique following the performance of the initial trial. Moreover, reliable assessments with HHD require that the participants' strength does not exceed the strength of the raters [46]. Despite satisfactory reliability (ICC > 0.80) for all measurements, current results revealed some systematic differences between trials as well. Nevertheless, it can be assumed that taking three measurements is sufficient to account for trial-to-trial variability. Therefore, the validity of the EasyForce is yet to be confirmed due to the abovementioned limitations and certain differences in testing positions.
Knee isometric strength tests have been widely used to estimate knee joint strength. Although the results of strength assessments are reported with different units of measurement, the data from this investigation can be compared to other studies. Previous studies conducted with or without a stabilization device reported moderate to high reliability for the assessment of isometric knee strength [26,47–49]. Studies that used HHD without a stabilization device reported lower inter- and intrarater reliability values [47,48]. This is in line with the current results regarding knee flexion and extension, where ICC values ranged from 0.65 to 0.83, showing moderate to good relative reliability. Additionally, the inter- and intrarater CV values for knee measures were slightly above the acceptable threshold (CV > 10%). Similarly, previous studies [26,50] also showed CV values higher than 10%. In a study by Lu et al. [50], the interrater CV ranged between 21.3 and 42.5% for the HHD without stabilization. In contrast, Martins et al. [26], using belt stabilization, reported slightly lower CV values for the knee strength assessment (CV = 12.0–22.0%). We can assume that better CV and reliability results could be achieved with more experienced raters. Therefore, clinicians and practitioners could be advised to practice the test extensively before applying it to patients, or at least until they reach an acceptable CV, that is, lower than 10%. Nevertheless, given the high inter- and intrarater relative reliability, we can state that the EasyForce provides reliable data for assessing the isometric muscle strength of knee flexors and extensors, which supports the use of the dynamometer.
The interrater and intrarater reliability of HHD in assessing isometric hip strength has previously been established in healthy subjects [49,51,52]. In the study by Kollock et al. [51], intrarater reliability values ranged from 0.70 to 0.94 in the assessment of hip muscles. Maffiuletti and Mentiplay et al. [49,52] have also demonstrated good-to-excellent reliability for assessing the strength of the hip muscles. The current study also demonstrated moderate to high relative inter- and intrarater reliability for hip muscles, with ICC values ranging from 0.63 to 0.89. Studies that have investigated the reliability of HHD for measuring hip strength reported low reliability in uncomfortable positions where stabilization of the pelvis is more difficult, such as prone and standing positions [47,53]. Accordingly, it was reported that the side-lying position (used for the EasyForce measurements) provides the most valid hip abduction strength measurement [54], with slightly higher values obtained compared to the supine position.
This study had several strengths. Interrater reliability was assessed with three raters instead of two, which might have provided even more reliable information. We also avoided information bias, since the raters were blinded to the strength values obtained by the other raters. The biggest limitation was that the sample consisted of a healthy population, limiting the generalization to other populations. Therefore, future studies should examine this protocol in clinical populations. Nevertheless, we consider the results of the current study important in providing normative values in healthy people.

Conclusions
The EasyForce dynamometer can be considered a reliable tool for assessing shoulder IR and abduction, knee extension and flexion, as well as hip abduction and adduction strength. Specifically, we found good to high intra- and interrater reliability, with within-individual variation slightly above the acceptable threshold. The EasyForce dynamometer showed no differences between the raters' measurements, which could be of great importance for professionals who want to perform the tests regardless of the influence of their own strength on the measured values. Moreover, the biggest advantage of the EasyForce is its affordability and portability, which allows the device to be used in different areas and settings.
Funding: The authors were supported by the Slovenian Research Agency through the project TELASI-PREVENT [L5-1845] (Body asymmetries as a risk factor in musculoskeletal injury development: studying aetiological mechanisms and designing corrective interventions for primary and tertiary preventive care) and by the University of Primorska through internal research program KINSPO (2990-1-2/2021).

Institutional Review Board Statement: The experimental protocol was approved by the Republic of Slovenia National Medical Ethics Committee (approval no. 0120-99/2018/5) and was performed in accordance with the latest revision of the Declaration of Helsinki.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study. Written informed consent has been obtained from the patient(s) to publish this paper.