Intra- and Inter-Rater Reliability of Strength Measurements Using a Pull Hand-Held Dynamometer Fixed to the Examiner’s Body and Comparison with Push Dynamometry

Hand-held dynamometers (HHDs) are the most widely used method to measure strength in clinical settings. The assessment can be performed in two ways: pull and push. The purpose of the present study was to evaluate the intra- and inter-rater reliability of a new measurement modality for pull HHD and to compare its inter-rater reliability and agreement with those of push HHD. Forty healthy subjects were evaluated by two assessors with different body composition and manual strength. Fifteen isometric tests were performed in two sessions with a one-week interval between them. Reliability was examined using the intra-class correlation coefficient (ICC) and the standard error of measurement (SEM). Agreement between raters was examined using paired t-tests. Intra- and inter-rater reliability for the tests performed with the pull HHD showed excellent values, with ICCs ranging from 0.991 to 0.998. For tests with values higher than 200 N, push HHD showed greater differences between raters than pull HHD. Pull HHD attached to the examiner’s body is a method with excellent reliability to measure isometric strength and showed better agreement between examiners, especially for those tests that showed high levels of strength. Pull HHD is a new alternative to perform isometric tests with less rater dependence.


Introduction
Quantifying strength is useful in rehabilitation programs: it provides information for setting target values, choosing appropriate exercise loads, and monitoring the effectiveness and progress of treatment [1]. Strength evaluation is a usual practice among health professionals, both to assess healthy individuals [2–4] and to manage patients with different lower limb or upper limb pathologies [5], such as knee osteoarthritis [6], rotator cuff injuries [7], and neurogenic thoracic outlet syndrome [8]. Among the tools to measure strength in clinical settings, hand-held dynamometers (HHDs) are the most used [9], since they offer advantages such as portability, cost, and ease of use compared to other more expensive and less versatile methods (e.g., isokinetic dynamometers) [10]. In general, HHDs can be classified into two types, push or pull [1,11–13].
In push HHD, the patient pushes against the device, which is usually stabilized by the examiner's hand; this has been shown to be a reliable method [10,14]. The push mode has the disadvantage that the examiner's sex and strength influence the strength values (reliability increases when the rater is stronger than the subject) [14]. In pull HHD, the patient pulls on the device, which is generally attached to a rigid structure such as an espalier, a stretcher, or a glass suction cup; this has also been shown to be a reliable method [12,15–18].

Materials and Methods
This cross-sectional study enrolled 40 healthy volunteers, who were recruited through advertising on the Blasco Ibañez Campus of the University of Valencia (Table 1 shows the participants' characteristics). The specific inclusion criteria were: (1) age between 18 and 40 years; (2) no surgical operation on the lower or upper limb in the last two years; and (3) no pain episodes in the lower or upper limb in the two months before data collection. After a detailed explanation of the study procedures, the participants signed an informed consent form. The experimental protocol was approved by the Ethics Committee of the University of Valencia (Spain) (H1533739889520). Data collection was carried out in the clinical research laboratory of the Department of Physiotherapy (University of Valencia).
Table 1. Characteristics of the participants (n = 40).

Procedures
Two sports and health professionals (a female and a male) were chosen to carry out the isometric tests, both with 1 year of clinical experience and a master's degree. The two raters, with different body composition (body mass: 55.4 kg and 91.3 kg; stature: 166 cm and 180 cm, respectively), were chosen to reflect different profiles of clinicians working in both clinical and research settings. As in a previous study [13], the raters completed a one-repetition maximum test of seated bench press as an indicator of general upper-extremity strength (47 kg for the female tester and 81 kg for the male tester). Raters received a 1-h training session on how to perform the measurements with both the pull HHD and the push HHD. Following the training, they performed the testing procedures with 3 volunteers, supervised by a health professional with extensive HHD experience. Both examiners were blinded to the strength values; a third researcher was responsible for reading and recording them.
The pull HHD selected for the study was the DiCI (Ionclinics S.L., L'Alcudia, Spain), which registers the traction strength through two hooks in series [19]. For the DiCI measurement, one end was attached with a strap to the subject's ankle or wrist and the other end, with a belt, to the examiner's body (Appendix A). The push HHD used was the MicroFET2 (Hoggan Health Technologies Inc., Salt Lake City, UT, USA), which is widely used in the literature [20,21].
Isometric tests were performed on the dominant leg or arm in two sessions with a one-week interval between them. Both sessions began with tester 1 (male) evaluating strength with the pull HHD, which allowed the intra-tester reliability to be evaluated (both intrasession and intersession). Subsequently, the isometric strength of the participants was measured again by rater 1 or rater 2, in random order, with the pull HHD in the first session and with the push HHD in the second session, in order to examine the inter-rater reliability of each HHD (Figure 1).
Before performing the isometric tests, the anthropometric characteristics of the participants were measured. The warm-up consisted of 10 min of cycling at low resistance and a comfortable cadence (80 revolutions per minute), followed by three submaximal isometric contractions for each position. These submaximal contractions were also used to familiarize the participants with the correct execution of the tests.
All tests were performed on a stretcher. The lower limb tests were performed in the supine position for hip abduction (Hip-ABD), hip adduction (Hip-ADD), ankle flexion (Ank-F), and ankle extension (Ank-E); in the prone position for hip extension (Hip-E), hip external rotation (Hip-ER), and hip internal rotation (Hip-IR); and in the sitting position for hip flexion (Hip-F). The upper limb tests were performed in the supine position for elbow flexion (Elb-F) and extension (Elb-E), for shoulder flexion (Sho-F), extension (Sho-E), and abduction (Sho-A), and for shoulder internal rotation (Sho-IR) and external rotation (Sho-ER). These isometric tests (Figures A1 and A2), for both the lower and the upper limb, were selected because they showed small measurement variation in previous studies [13,22,23]. Test order was randomized for each participant to avoid order-related systematic bias. Two 5 s maximal voluntary isometric contractions (MVICs) were performed per movement, with 60 s of rest between measurements. A rest of 10 min was applied between rater measurements. The participants were instructed to make a maximum effort and received verbal encouragement to sustain the contraction.
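The per-participant randomization of test order described above could be sketched as follows. This is only an illustration: the source does not state how the order was generated, so the seeding scheme, the function name `randomized_order`, and the choice to shuffle all fifteen tests together (rather than within each limb) are assumptions.

```python
import random

# Test abbreviations as listed in the protocol (8 lower limb, 7 upper limb).
LOWER_LIMB = ["Hip-ABD", "Hip-ADD", "Ank-F", "Ank-E",
              "Hip-E", "Hip-ER", "Hip-IR", "Hip-F"]
UPPER_LIMB = ["Elb-F", "Elb-E", "Sho-F", "Sho-E",
              "Sho-A", "Sho-IR", "Sho-ER"]


def randomized_order(participant_id, seed=2021):
    """Return a reproducible random test order for one participant.

    Hypothetical helper: the seed value and the seeding scheme are
    illustrative assumptions, not taken from the study.
    """
    rng = random.Random(seed * 1000 + participant_id)
    order = LOWER_LIMB + UPPER_LIMB  # new list, so the constants stay intact
    rng.shuffle(order)
    return order


if __name__ == "__main__":
    for pid in (1, 2, 3):
        print(pid, randomized_order(pid))
```

Seeding per participant keeps each order reproducible, which can help when the protocol sheet has to be regenerated for the second session.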

Statistical Analysis
Participant characteristics and strength values (Newtons) are presented as mean ± standard deviation (SD) or percentages, as appropriate. The mean of the repetitions was used for the analyses. All statistical analyses were performed with custom-written MATLAB scripts (version R2019b; The Mathworks, Natick, MA, USA) by a researcher blinded to the measurements.
Second, paired t-tests were used to analyze the agreement between the strength measurements of rater 1 and rater 2 and to assess systematic between-rater bias, that is, whether the values obtained by one rater systematically differed from those of the other rater [26]. Furthermore, the differences between raters were calculated for each method and compared using a paired t-test, with a level of significance of p < 0.05. Additionally, to illustrate the differences between HHDs as a function of the strength obtained, Bland-Altman plots were produced for the tests with higher strength values.
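As an illustration of the reliability indices named in this study (ICC and SEM), the sketch below computes a two-way random-effects, absolute-agreement, single-measure ICC and an SEM from a subjects-by-raters table. The text available here does not state which ICC model or SEM formula the authors used, so the ICC(2,1) form, the formula SEM = SD × √(1 − ICC), and the sample values are assumptions made for the example; it is not a reproduction of the authors' MATLAB scripts.

```python
from statistics import mean, stdev


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: list of rows, one row per subject, one column per rater/session.
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of raters or sessions
    values = [v for row in scores for v in row]
    grand = mean(values)
    row_means = [mean(row) for row in scores]
    col_means = [mean([row[j] for row in scores]) for j in range(k)]

    # Sums of squares of the two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in values)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def sem(scores, icc):
    """SEM in the units of the scores: pooled SD * sqrt(1 - ICC)."""
    values = [v for row in scores for v in row]
    return stdev(values) * (1 - icc) ** 0.5


# Hypothetical strength values in newtons (rater 1, rater 2), not study data.
demo = [[250, 248], [310, 305], [180, 182], [220, 219], [275, 270]]

if __name__ == "__main__":
    icc = icc_2_1(demo)
    print("ICC(2,1):", round(icc, 4))
    print("SEM (N):", round(sem(demo, icc), 2))
```

Dividing the SEM by the grand mean and multiplying by 100 gives a percentage, which is one common way to express it in relative terms.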
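The between-rater comparison can be sketched as a paired t statistic plus a relative difference. The sample values are hypothetical, and the sign convention (rater 2 minus rater 1, relative to rater 1) is an assumption chosen so that negative values mean rater 1 recorded more strength; the source does not define how its percentages were computed.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(rater1, rater2):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    diffs = [b - a for a, b in zip(rater1, rater2)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


def percent_difference(rater1, rater2):
    """Mean difference (rater 2 - rater 1) as a percentage of rater 1's mean."""
    return (mean(rater2) - mean(rater1)) / mean(rater1) * 100


# Hypothetical strength values in newtons, not study data.
rater1 = [250, 310, 180, 220, 275]
rater2 = [248, 305, 182, 219, 270]

if __name__ == "__main__":
    t, df = paired_t(rater1, rater2)
    print("t =", round(t, 3), "df =", df)
    print("difference (%):", round(percent_difference(rater1, rater2), 2))
```

The p-value follows from the t distribution with `df` degrees of freedom; a statistics package (for example `scipy.stats.ttest_rel`) returns it directly.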
Sample size was calculated using the formula for reliability studies based on confidence intervals (CIs) described by [27]. With the number of instruments (k) equal to 2, the CI around r (the reliability coefficient) of 0.05, and an estimated r of 0.95, the sample size (n) was calculated to be 25 participants. However, we ultimately included 15 more participants in the final sample in order to increase the study power.
Table 2 shows the intra-rater reliability for the tests performed with the pull HHD, both intra-session and inter-session. The intra-session reliability showed excellent values, with ICCs ranging from 0.996 to 0.998. Furthermore, the SEM values were less than 1%. The inter-session reliability obtained similar values, with ICCs higher than 0.995 and SEMs lower than 1%.
Table 3 shows the inter-rater reliability and agreement for the tests performed with the pull HHD. All tests showed excellent reliability (ICCs > 0.991), with SEMs lower than 1%. The agreement between raters showed differences between the measurements of rater 1 and rater 2 ranging from −0.69% to −3.78%, always in favor of rater 1.
Figure 2 illustrates the differences between raters for the measurements of each participant, in the lower limb (Figure 2A) and the upper limb (Figure 2B). As can be seen, for some movements (e.g., hip abduction/adduction or hip rotations) both the pull and the push HHD methods showed differences lower than 20 N (rater differences ranged from 0.20% to 0.89% for pull HHD (Table 3) and from 0.26% to 1.59% for push HHD (Table 4)). On the other hand, for movements such as Hip-F, Ank-F, or Sho-E, both methods show greater differences between raters, but these are greater for the push HHD than for the pull HHD method; the differences between raters are −3.61%, −3.78%, and −2.84% for the pull HHD (Table 3) and −9.68%, −12.91%, and −9.71% for the push HHD (Table 4).

Inter-Rater Reliability
The differences by method as a function of the strength obtained in these three tests are illustrated in Figure 3 by means of Bland-Altman plots. The plots show that, for strength values above 200 N, the differences between raters for the push HHD increase progressively, while those for the pull HHD remain stable.
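For readers unfamiliar with Bland-Altman analysis, the quantities behind such a plot can be sketched as follows: each point is the mean of the two raters' values (x axis) against their difference (y axis), with the bias and the 95% limits of agreement drawn as horizontal lines. The values are hypothetical, and the conventional bias ± 1.96 SD limits are assumed because the source does not state how the limits were computed.

```python
from statistics import mean, stdev


def bland_altman(rater1, rater2):
    """Return per-subject means, differences, bias, and 95% limits of agreement."""
    means = [(a + b) / 2 for a, b in zip(rater1, rater2)]
    diffs = [b - a for a, b in zip(rater1, rater2)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return means, diffs, bias, (bias - half_width, bias + half_width)


# Hypothetical strength values in newtons, not study data.
rater1 = [250, 310, 180, 220, 275]
rater2 = [248, 305, 182, 219, 270]

if __name__ == "__main__":
    means, diffs, bias, (lower, upper) = bland_altman(rater1, rater2)
    print("bias (N):", round(bias, 2))
    print("limits of agreement (N):", round(lower, 2), round(upper, 2))
```

A pattern in which the differences grow as the mean increases, as described for the push HHD above 200 N, shows up as a fan or slope in the scatter rather than as a change in the bias line alone.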

Discussion
Our results support our initial hypothesis that a pull HHD stabilized on the examiner's body achieves excellent reliability for isometric strength measurements performed by examiners with different manual strength and in tests with different strength values. In addition, this new method presents better agreement between examiners than a push HHD held against the hand, especially for tests with strength values greater than 200 N.
To our knowledge, this study is the first to examine the reliability of a pull HHD attached to the examiner's body. Intra-rater reliability for this method proved to be excellent (ICCs > 0.998). Other studies with stabilized pull HHDs (attached to a fixed external element) have also obtained high ICCs for intra-rater reliability, both for hip and ankle tests (ICCs ranged from 0.88 to 0.98) [12] and for shoulder tests (ICCs ranged from 0.94 to 0.98) [16,17]. Moreover, compared with other studies that used a pull HHD attached to structures, our method showed ICCs for inter-rater reliability (ICCs > 0.991) similar or slightly superior to those studies (ICCs ranging from 0.69 to 0.99 for hip tests, from 0.76 to 0.99 for ankle tests, and from 0.86 to 0.98 for shoulder tests) [12,15,17,28]. Thus, the reliability of attaching a pull HHD to the examiner's body would not be inferior to attaching it to a fixed external element.
The agreement between the examiners' measurements differed between methods, especially in those tests with strength values greater than 200 N. As previous authors have described, in tests with values greater than 200 N, measurements with an HHD without fixation tend to underestimate the strength values compared to measurements with fixation [13,29]. Our results add the novel finding that fixing the HHD to the examiner's body is sufficient to reduce such underestimation. For example, in tests such as Hip-F or Ank-F (with values close to or greater than 300 N), the differences between raters for the push HHD were 9.68% and 12.91%, respectively, compared to 3.61% and 3.78% for the pull HHD. The upper limb showed a similar pattern: Sho-E, with values close to 300 N, showed differences between testers of 9.71% for push HHD versus 2.84% for pull HHD. The differences between raters for push HHD are similar to those of studies that have also used examiners with different strengths. For example, Kelln et al., 2008 found differences of 8.87% for Ank-F [30].
This study proposes a new method to perform isometric tests: stabilizing a pull HHD on the examiner's body. This method has shown excellent inter- and intra-rater reliability and, compared to other methods, its use provides clinical advantages for sports and health professionals. First, compared to other methods that have tried to solve the problem of examiner interaction (e.g., fixing the HHD to an espalier, a metal bar, or a glass suction cup) [12,15,17,28,31,32], the pull HHD method of this study presents similar reliability without sacrificing clinical applicability, as it neither needs external fixation nor is limited to specific movements. Second, pull HHD reduces the influence of the examiner's strength compared to the use of push HHD against the hand. Push HHD is a common method of strength measurement among sports and health professionals because it is easy to use, but it carries the bias of the examiner's interaction; pull HHD fixed to the examiner's body can therefore be an alternative that is easy to use and less biased. Likewise, other types of clinical tests have been used to assess muscle performance, but they showed only a weak positive correlation with HHD [33].
This study had several strengths. First, we performed multiple tests of both lower limb and upper limb movements, eight and seven, respectively. To the best of our knowledge, this is the first study to examine the reliability of fifteen isometric tests, so we propose a broad HHD measurement protocol within a single study. Second, we avoided information bias, since the raters were blinded to the strength values: a third researcher was in charge of reading and recording them. In turn, a fourth researcher, in charge of the statistical analysis, was blinded as to which HHD corresponded to the different strength records.
The main limitation was that the measurements were made on a healthy population, which limits their generalization to other populations. Although it has been shown that the reliability of HHDs is lower in healthy populations than in patients (due to greater strength and less variability), future studies should examine our protocol in clinical populations. Likewise, the inter-rater reliability was assessed with two raters, a procedure that according to the literature is sufficient for validation, although the involvement of three or more raters might have provided even more reliable information. Future studies should address this limitation by including at least three raters. Even so, we consider that this first study is essential to provide normative values in healthy people with which to compare.

Conclusions
This study examines the intra- and inter-rater reliability of a new proposal to measure isometric strength, a pull HHD attached to the examiner's body. This method showed excellent reliability and acceptable agreement between the measurements of the examiners, who had different body and strength profiles. Furthermore, compared to the traditional method of strength measurement with HHD, pushing against the examiner's hand, pull HHD showed better agreement between examiners, especially for those tests that showed high levels of strength. Thus, this new use of pull HHD may represent a new alternative for professionals who want to perform isometric tests with less influence of their strength on the values.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The datasets generated during and/or analyzed during the current study are available from the corresponding authors on reasonable request.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Diagnostics 2021, 11