Reliability and Validity of a Novel Wearable Device for Measuring Elbow Strength

Muscle strength is an important clinical outcome in rehabilitation and sport medicine, but options are limited to expensive but accurate isokinetic dynamometry (IKD) or inexpensive but less accurate hand-held dynamometers (HHD). A wearable, self-stabilizing, limb strength measurement device (LSMD) was developed to fill the current gap in portable strength measurement devices. The purpose of this study was to evaluate the reliability and validity of the LSMD in healthy adults. Twenty healthy adults were recruited to attend two strength testing sessions where elbow flexor and extensor strength was measured with the LSMD, with HHD and with IKD in random order, by two raters. Outcomes were intra-rater repeatability, inter-rater reproducibility and inter-session reproducibility using intra-class correlation coefficients (ICC). Limits of agreement and weighted least products regression were used to test the validity of the LSMD relative to the criterion standard (IKD), and calibration formulas were derived to improve measurement fidelity. ICC values for the LSMD were >0.90 for all measures of reliability and for both muscle groups, but the device over-predicted extensor strength and under-predicted flexor strength relative to IKD. Validity was established by transforming the data with the criterion standard-based calibration. These data indicate that the LSMD is reliable and conditionally valid for quantifying strength of elbow flexors and extensors in a healthy adult population.


Introduction
Strength measurements are key outcomes in surgical, clinical and academic research disciplines [1][2][3][4]. Valid, reliable strength measurements are essential not only for assessing outcomes [5,6]; they also often form the basis of clinical decision making in rehabilitation practice [7] and sport medicine [8]. However, options for reliable and accurate objective measurement of muscle strength in the clinic have not changed in three decades and remain limited to hand-held dynamometry (HHD) [9] and isokinetic dynamometry (IKD) [2], both of which have distinct advantages and disadvantages.
IKD, however, requires an apparatus that is neither portable nor easily affordable, and that requires technical expertise for operation and maintenance. In this study we propose an alternative approach using a wearable limb strength measurement device (LSMD) [29] for objective strength assessment of elbow flexor and extensor muscle groups [30,31]. The device is described in Figure 1. The LSMD includes an aluminum frame with three members: a long bar with a spring-loaded selection knob at one end (inset Figure 1) connects to a wrist pad with an internal load cell (inset Figure 2) that aligns perpendicular to the wrist. The selection knob has spring-loaded pins that snap into a plate on a third bar with foam pads at either end. Using the knob, the user can select between the extension (Figure 1ii) or flexion (Figure 1iii) configurations for testing limb strength. Alternatively, they can lock the device flat for storage/transport (Figure 1i). The portability and ease of use of the LSMD make it appropriate for a broad spectrum of applications, including: athletics [17,32], physical therapy [33][34][35], planning and tracking medical interventions [18,36], and academic research [37]. However, the measurement performance of the LSMD has yet to be established.
The purpose of this study was: (1) to evaluate the reliability of the LSMD, in comparison to HHD and IKD performance, and (2) to evaluate the validity of the LSMD using IKD as a 'gold standard', and develop correction coefficients for the LSMD to improve performance in both flexor and extensor muscle strength assessment.

Methods
Evaluation of the LSMD was accomplished using a two-stage experiment comparing isometric strength measurements of the human arm in flexion and extension using the LSMD, the HHD (MicroFET 2 hand-held dynamometer, Hoggan Scientific, LLC, Salt Lake City, UT, USA) and the IKD (Cybex Humac Norm, Computer Sports Medicine, Inc, Stoughton, MA, USA) configured for isometric strength measurement. Specifications for each device are shown below in Table 1. The study was approved by the university research ethics board, and all participants provided informed signed consent prior to enrollment in the study.

Participants and Raters
Twenty healthy adults were recruited from the university student and staff population via poster advertisement and word of mouth. Included were adults between the ages of 19 and 60. Exclusion criteria were upper limb fracture or elbow injury in the last 2 years, shoulder or elbow surgery in the last year, or stroke or other neurological disorder affecting the upper limbs.
In addition to the pool of healthy adult participants, a pair of raters was present for all tests, alternating between performing the tests and assisting the other rater, in accordance with the experimental design outlined below. Prior to the first round of testing, both raters received identical training in the use of the HHD, LSMD and IKD from qualified individuals. The two raters were denoted as rater A and rater B. Raters were blind to other raters' assessments.

Experimental Design
A flow diagram of the experimental design is shown in Figure 3. Briefly, the design consisted of two sessions spaced 7-14 days apart. In session I, all 20 participants were assessed with all three devices by rater A. Order of device testing (HHD, LSMD, IKD) was randomized, as was the order of evaluating flexor and extensor strength. Three repetitions were conducted for each device and muscle group. This data was used to evaluate intra-rater reliability. For session II, participants were randomized into two groups of 10 participants. Group 1 was assessed again by rater A, and group 2 was assessed by rater B. The former data was used to evaluate inter-session reliability, and the latter data used to evaluate inter-rater reliability, for each device and muscle group. Only the dominant arm was tested, determined by asking participants which hand they write with.

Protocol
All tests were conducted at the Andrew and Marjorie McCain Human Performance Lab at the University of New Brunswick. Standard protocols for HHD [40], IKD [41] and LSMD [30,31] were employed for isometric strength testing. Briefly, the elbow was palpated to locate the lateral epicondyle, which was designated as the axis of rotation of the elbow joint. The location of the distal forearm load from the testing device was also measured to control for moment arm differences. Figure 4 shows the positioning of the LSMD on a participant for flexor and extensor strength assessment. The LSMD fixes the joint at approximately 90 degrees flexion. Therefore, the isometric strength tests for HHD and IKD were also conducted at an elbow angle of 90 degrees. This is the typical testing angle for isometric strength measurement of elbow flexors and extensors [40]. Maximum voluntary isometric contractions (MVIC) were acquired for each device and muscle group three times each with a 30 s rest between repetitions. A minimum of 5 s rest was used between flexor and extensor tests, and a 20 min break was used between different device tests. The latter was required to ensure muscle recovery from the previous test, and to set up testing with the other devices. To minimize the impact of varied subject motivation and inter-subject variability in related psychological traits, a consistent, positive set of high-demand instructions was used to motivate the participants throughout their contractions [42].

Data Analysis
HHD and LSMD measure force, whereas the IKD measures torque. To facilitate comparison with the IKD, each MVIC recorded with the HHD and LSMD was converted to an equivalent torque at the elbow using the force measurements and the measured distances between the elbow joint center and line of action of the HHD and LSMD for each participant. As such, muscle strength was measured as a torque expressed in units of N·m.
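The force-to-torque conversion described above can be sketched as follows. This is a minimal illustration, assuming the device force acts perpendicular to the forearm so that torque is simply force times moment arm; the function name and the numerical values are hypothetical and are not taken from the study data.

```python
def force_to_torque(force_n: float, moment_arm_m: float) -> float:
    """Equivalent elbow torque in N·m for a force applied perpendicular
    to the forearm at a measured distance from the elbow joint center."""
    if moment_arm_m <= 0:
        raise ValueError("moment arm must be positive")
    return force_n * moment_arm_m


# Hypothetical example: three MVIC repetitions for one participant.
trial_forces_n = [182.0, 190.5, 187.3]   # made-up forces in newtons
moment_arm_m = 0.24                      # made-up joint-center-to-pad distance
trial_torques = [force_to_torque(f, moment_arm_m) for f in trial_forces_n]
peak_torque = max(trial_torques)         # maximum strength across repetitions
```

Measuring the moment arm for each participant and device is what allows the force-based devices to be compared with the torque-based IKD on a common scale.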

Objective 1: Reliability of LSMD for Strength Assessment
The torque data were used to determine two measures of reproducibility (inter-session and inter-rater reliability) and one measure of repeatability (intra-rater reliability), using the intraclass correlation coefficient ICC as described by Shrout and Fleiss [43].
For intra-rater reliability the three repeated measurements were treated as fixed effects in a two-way mixed effects ANOVA model (ICC model 3 for consistency, or ICC con (3,1)) to obtain the component of variance attributable to the targets (the cohort of 20 participants) and those associated with random error. The ICC con (3,1) was taken to be a measure of how well the same rater in identical conditions was able to consistently measure the strength of a participant.
For the two measures of reproducibility, each rater was assigned a different randomly selected subset of 10 participants from the full cohort: Group 1 was used to assess inter-session reliability and Group 2 was used to assess inter-rater reliability. Maximum strength from the three repetitions of each session was used in this analysis. ICC model 2 for absolute agreement, or ICC aa (2,1) using a two-way random effects ANOVA model was used to obtain the components of variance attributable to the targets (the cohort of 10 participants), judges (2 raters or 2 sessions) and those associated with random error. The ICC aa (2,1) was taken to be a measure of the agreement between two different raters and between two different sessions with the same rater.
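For readers who want to see how the two ICC forms differ, the sketch below computes both from the mean squares of a two-way ANOVA without replication, using the standard Shrout and Fleiss single-measure formulas. It is an illustration only: the study used SPSS, the function name is hypothetical, and the ratings matrix is toy data rather than study data.

```python
import numpy as np


def icc_single_measure(scores):
    """Return (ICC(2,1) absolute agreement, ICC(3,1) consistency).

    scores: n targets (rows) by k judges, trials or sessions (columns).
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)   # between targets
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)   # between judges
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols  # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    icc_31 = (msr - mse) / (msr + (k - 1) * mse)
    icc_21 = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    return icc_21, icc_31


# Toy ratings: 6 targets scored by 4 judges (not study data).
toy = [[9, 2, 5, 8], [6, 1, 3, 2], [8, 4, 6, 8],
       [7, 1, 2, 6], [10, 5, 6, 9], [6, 2, 4, 7]]
icc_aa, icc_con = icc_single_measure(toy)
```

The only difference between the two forms is the judge (column) variance term in the denominator of the absolute-agreement version, which penalizes systematic offsets between raters or sessions that the consistency version ignores.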
In addition, a post-hoc analysis using simple main effects of the ANOVA model was used to follow-up on the significance of differences between raters and between sessions.

Objective 2: Validity of the LSMD for Assessment
Using the torque data from Session I, the validity of the LSMD was evaluated by directly comparing it to the corresponding IKD data across all 20 participants. This comparison treated IKD as the 'gold standard' for isometric strength measurement, with the relative performance of the LSMD determined both qualitatively using limits of agreement (LoA) [44], and quantitatively, using weighted least products (WLP) regression [45]. The WLP regression was required rather than an ordinary least squares analysis due to the expected, and observed, heteroscedasticity of strength measures [46].
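As a rough illustration of the limits of agreement calculation, the sketch below uses the plain Bland-Altman form (mean difference plus or minus 1.96 standard deviations of the differences). Whether the study used this form or a variant adapted for heteroscedastic data is not stated here, so treat it as an assumption; the function name and the torque values are hypothetical.

```python
import numpy as np


def limits_of_agreement(test, criterion):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    test = np.asarray(test, dtype=float)
    criterion = np.asarray(criterion, dtype=float)
    diff = test - criterion
    bias = diff.mean()
    sd = diff.std(ddof=1)          # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up paired torques in N·m (test device vs. criterion device).
device_torque = [10.0, 12.0, 14.0]
criterion_torque = [9.0, 12.0, 13.0]
bias, lower, upper = limits_of_agreement(device_torque, criterion_torque)
```

In a Bland-Altman plot these three values are drawn as horizontal lines over the difference-versus-mean scatter, which is what makes a trend in the differences with strength visible.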
Although the LSMD was built with stiff steel members, the joint assembly and the foam contact rollers allow for small deflections of the device, especially with stronger users, potentially over- or under-estimating the elbow torque being applied. To correct for these effects, the WLP regression analysis between the LSMD and the IKD from the above analysis was used to form the coefficients (slope b, and intercept a) of a linear regression equation
$$\tau_{LSMD} = a + b\,\tau_{IKD} \qquad (1)$$
where $\tau_{LSMD}$ and $\tau_{IKD}$ are the measured torques from LSMD and IKD session I data, respectively. The coefficients can then be used to correct the measured LSMD data by calibrating the LSMD against the criterion standard IKD,
$$\tau^{Cal}_{LSMD} = \frac{\tau_{LSMD} - a}{b} \qquad (2)$$
where $\tau^{Cal}_{LSMD}$ is the calibrated LSMD data. Finally, to test the validity of the calibration, the LSMD data from Session II were corrected with the derived equation and compared to Session II IKD data using WLP regression as described above.
All statistical analyses were conducted with SPSS (Statistical Package for the Social Sciences, v23, IBM-SPSS, Chicago, IL, USA).
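The regression-and-calibration procedure can be sketched outside SPSS as follows. This is a hedged illustration rather than the authors' implementation: it assumes the LSMD torque is modelled as a + b times the IKD torque, uses the geometric-mean (least products) slope, and adopts inverse squared pair-mean weights as one plausible weighting choice. The function names and data are hypothetical.

```python
import numpy as np


def wlp_regression(x, y, weights=None):
    """Weighted least products fit of y = a + b*x; returns (a, b)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if weights is None:
        # Assumed weighting: down-weight pairs with larger mean magnitude.
        weights = 1.0 / ((x + y) / 2.0) ** 2
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    x_bar, y_bar = np.sum(w * x), np.sum(w * y)
    sxx = np.sum(w * (x - x_bar) ** 2)
    syy = np.sum(w * (y - y_bar) ** 2)
    sxy = np.sum(w * (x - x_bar) * (y - y_bar))
    b = np.sign(sxy) * np.sqrt(syy / sxx)   # least products slope
    a = y_bar - b * x_bar
    return a, b


def calibrate(tau_measured, a, b):
    """Invert the fitted line to map measured torque onto the criterion scale."""
    return (np.asarray(tau_measured, dtype=float) - a) / b


# Made-up session I torques in N·m: criterion (x) and test device (y).
tau_ikd = np.array([10.0, 20.0, 30.0, 40.0])
tau_lsmd = 2.0 + 1.2 * tau_ikd            # synthetic proportional bias
a, b = wlp_regression(tau_ikd, tau_lsmd)
tau_cal = calibrate(tau_lsmd, a, b)
```

Unlike ordinary least squares, the least products slope treats both devices as subject to measurement error, which is why it suits method-comparison data of this kind.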

Results
Twenty healthy adults (eight female) were recruited into the study. There were no drop-outs. Participant demographics and handedness, summarized by group and for the full cohort, are shown in Table 2. Mean peak elbow torques in extension and flexion for all three devices are summarized in Table 3 for each group and for the full cohort.

Intra-Rater Repeatability
This analysis used repeated measurement data (3 trials) from session I measured by rater A on the full cohort (n = 20). Reliability results using the ICC con (3,1) model are shown in Table 4. All three devices were found to be repeatable with all ICC values exceeding 0.96 except for flexion strength with IKD which was a little lower at 0.915 but still considered in the excellent range.
Table 4. Reliability results for intra-rater repeatability, inter-rater reproducibility and inter-session reproducibility.

Inter-Rater Reproducibility
This analysis used session I and session II data measured by rater A and rater B, respectively, on a sub-sample of participants (group 2, n₂ = 10) measured by both raters. Reproducibility results using the ICC aa (2,1) model are shown in Table 4. ICC values for this analysis were lower than for intra-rater results. Although flexor assessment had good inter-rater reliability with all three devices with ICC values >0.85, for extensor strength measurement only the LSMD achieved good results (ICC = 0.897), whereas the HHD and IKD results were poor (ICC = 0.575 and 0.654 respectively).

Inter-Session Reproducibility
This analysis used session I and session II data measured only by rater A, on a sub-sample of participants (group 1, n₁ = 10) measured by this rater in both sessions. Reproducibility results between sessions for a single rater using the ICC aa (2,1) model are shown in Table 4. Inter-session reproducibility results were excellent (ICC > 0.90) for all measures except flexion strength with the HHD, which had lower but good (ICC = 0.854) reproducibility.

Post-Hoc Analysis of Reproducibility Results
Simple main effects were examined in the ICC aa (2,1) model for reproducibility across sessions (group 1) and raters (group 2). These data are shown in Table 5. In terms of between-session differences, only the HHD for flexor strength was statistically significant (p = 0.017). Between-rater differences were more severe with a significant difference for HHD in extension (p = 0.001), and IKD in both extension (p = 0.025) and flexion (p = 0.023).

Criterion Validity
This analysis used session I data for the LSMD and the IKD devices. For the purpose of validation, the LSMD was considered the experimental measurement and IKD considered the criterion standard (i.e., 'gold standard') measurement. Model IIA regression results are shown in Figure 5. The plots show that the LSMD over-predicts elbow extensor strength and under-predicts elbow flexor strength. Analysis of the regression coefficients for LSMD versus IKD showed a significant proportional bias effect for both flexion and extension, as shown in Table 6.

Calibration Using the Criterion Standard
Regression coefficients from the above analysis of LSMD and IKD session I data were used to derive calibration formulas for LSMD elbow extensor and flexor measurement,
$$\tau^{Cal}_{e} = \frac{\tau_{e} - a_{e}}{b_{e}} \qquad (3)$$
$$\tau^{Cal}_{f} = \frac{\tau_{f} - a_{f}}{b_{f}} \qquad (4)$$
where $\tau_{e}$ and $\tau_{f}$ are the measured extensor and flexor torques, respectively, $\tau^{Cal}_{e}$ and $\tau^{Cal}_{f}$ are the corrected torques, and $a_{e}$, $b_{e}$, $a_{f}$ and $b_{f}$ denote the intercepts and slopes of the session I regressions for extension and flexion. Finally, Equations (3) and (4) were applied to LSMD data from session II and compared to IKD data for session II. Results are shown in terms of regression coefficients in Table 6 and measured torque values in Table 7. Before calibration, the LSMD overestimated extension strength by 21% and underestimated flexion strength by 14%. Calibration narrowed these effects, with the LSMD underestimating strength by 3% of the IKD measurement in both flexion and extension, falling within the random error measured during repeated measurements with the criterion device. LoA plots for extensor and flexor strength measurements are shown in Figure 6.
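The percentage figures quoted above can be read as a relative difference between device and criterion torques. The sketch below assumes the bias is computed from mean torques relative to the IKD mean, which is one plausible definition rather than a confirmed detail of the analysis; the function name and the numbers are hypothetical.

```python
def percent_bias(test_mean: float, criterion_mean: float) -> float:
    """Relative difference of the test device from the criterion, in percent.
    Positive values indicate over-estimation by the test device."""
    return 100.0 * (test_mean - criterion_mean) / criterion_mean


# Made-up mean torques in N·m chosen only to illustrate the sign convention.
over = percent_bias(121.0, 100.0)    # test device reads high
under = percent_bias(86.0, 100.0)    # test device reads low
```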

Discussion
Reliable and valid measurement of muscle strength is paramount for clinical decision making [5,6]. However, the literature shows few advances in strength assessment beyond the ubiquitous HHD and IKD devices introduced more than three decades ago. The LSMD was originally designed as a component of the BioTone™ system, intended for clinical assessment of muscle tone in persons with upper motor neuron syndrome resulting from neurological injury or disease [47]. More broadly, the LSMD has potential for assessing elbow (and knee) strength in any population. Whether quantifying strength of seniors for prescribing interventions, or evaluating contractile strength in athletes following sport-related injuries, the LSMD could provide a portable alternative to fixed IKD. Furthermore, while the HHD is a clinically accepted approach to assessing strength, the superior reliability of the LSMD could improve the quality of strength assessments used in practice and in clinical trials of new treatments and drugs.
The purpose of this study was to evaluate the reliability and validity of the LSMD for these broader applications. This study focused on healthy adults as a first step to establish the quality, and limitations, of the measurement capability of the LSMD.

Reliability of the LSMD
To put the reliability results into context, recommended ICC cut-offs from the published literature were examined. Schrama et al. [3] provided a set of recommended cut-offs for ICC results in strength measurement studies that were specific to the type of reliability being measured. These cut-offs are shown below in Table 8. The guidelines of Lohr et al. [48] and Kottner et al. [49] were used to supplement the recommendations of Schrama et al. by providing a clinical viewpoint on minimal standards for ICCs. The analysis of van Trijffel et al. [50] provided a logical set of boundaries for judging the meta-reliability of a study that lists many ICC values associated with a single protocol. Note: where more than one cut-off value was recommended in the literature, the most stringent value was chosen.
While all three devices passed the van Trijffel et al. meta-reliability standards [50], the LSMD was the only device to exceed the ICC cut-offs in all testing conditions. In fact, the ICC values for the LSMD ranked highest for all but two of the six experimental conditions (see Table 4). It can be concluded that, despite any deficits in accuracy due to small deflections of the device, measurement of MVIC strength with the LSMD was repeatable and reproducible. This makes sense given that the LSMD structure (3-point contact) must equilibrate on the arm of the user during muscle contractions. Provided that the arm geometry of the user does not change between assessments, one would expect this equilibrium point to be consistent across assessments.
The IKD and the HHD devices also performed acceptably across each reliability metric. These data agree with the HHD reliability data presented in studies by Visser et al. [51], Bohannon [52] and Aufsesser et al. [53]. Similarly, the Cybex fixed-IKD reliability data agree with the intra- and inter-rater reliabilities reported by Mathur et al. [54], Ekstrand et al. [55], and Stratford and Balsor [56].
The post-hoc analysis of inter-session and inter-rater reproducibility (Table 5) provided a better understanding of the reasons for the lower reproducibility of the HHD and IKD for some measures. The between-raters data in Table 5 revealed a significant between-raters difference for the IKD in both extension and flexion. These data suggest that, for the Cybex IKD device, the source of the inter-rater variability is independent of the type of elbow motion. Alignment of the elbow joint in the Cybex machine was most likely responsible, given that it is the only source of variability directly involving the rater. While both raters were trained with the same instructional content and hands-on practice, even small differences in alignment with the Cybex motor axis can result in less reproducible results, as reported by others [21,57]. The ability of the LSMD to self-equilibrate likely explains why rater variability in aligning the user with the device was less influential.
While comparing reliability of the LSMD to the field 'gold standard' (IKD) is important, the reliability performance of the LSMD was also examined relative to the HHD. Although manual muscle tests are often the choice of test in the clinic [58], HHD is by far the most ubiquitous form of objective strength assessment in the clinic and research laboratory, and is known to be susceptible to variability across rater physical strength/size and skill level [10,34,59]. Both raters in this study were novice, though equally trained, assessors. As shown in Table 5, a between-rater difference was observed for the HHD elbow extensor torque, and a between-session difference was noted for HHD elbow flexion torque.
The between-rater difference was likely due in part to a protocol deviation that occurred, reflecting the novice level of the two raters. One rater started with a break test rather than a make test, which was then maintained for consistency. This would account for the observed difference in elbow extension, since the ability to stabilize against a strong participant differs between the make and break tests [56]. Finally, it should be noted that a few participants in the study were, by all accounts, stronger than the raters, which likely contributed to the higher variability across raters and sessions [13,60].

Validity of the LSMD
Although excellent reliability was demonstrated for the LSMD, the validity of the device when tested against the criterion standard was less impressive before calibration. As shown in Table 3 and Figure 5, the LSMD over-predicted extensor muscle strength, and under-predicted flexor muscle strength. Fortunately, however, the disagreement between LSMD and criterion standard was a linear function of strength magnitude and was easily modelled using the WLP regression results. After applying these corrections to the measured LSMD data from a different session (data not used to create the calibration equations), the difference between LSMD and criterion standard disappeared, lending credence to this approach for improving the validity of both extensor and flexor muscle strength measurements.
There are some important implications of these findings. As noted above, the sample studied was from a population of university students and staff, representing a young and healthy population with higher than average strength compared to the general population (cf. study by Hogrel et al. [61]) but demonstrating heteroscedastic variability. Therefore, the effect of device deformation was driven by the strongest participants in the study. In clinical studies using the LSMD [30,31,62], participants were not likely strong enough to deform the LSMD. Nevertheless, the need for a correction at all remains a design concern.
To further explore this characteristic, Figure 7 shows a force diagram of the LSMD in its extensor and flexor configurations. Assuming the force through the wrist cuff remains perpendicular to the forearm, the force vectors through the proximal and distal upper arm cuffs provide insight into how the device self-stabilizes, and how it may in some circumstances negatively impact the measurement accuracy. The right side of Figure 7 shows the device in extension strength configuration. Since the device represents a simple 3-force body, all three forces must pass through a common point in order for equilibrium to be achieved. For each configuration, vectors are shown at two extremes, where the green vectors show the desired force directions, and the orange vectors show the potentially problematic force directions.
For extension it can be seen that, depending on the direction of force through the posterior-proximal cuff, the force through the anterio-distal cuff located at the crux of the elbow could transmit loads to the forearm, thus inflating the measured elbow torque (shown by the orange vectors). For flexion it can also be seen that variability in anterio-proximal cuff force vector has a potential effect. If force directions are oriented as shown with the orange vectors, the net vertical force is directed upward, potentially causing the device to migrate upward on the arm and causing apparent under-prediction of flexor strength. It is likely that these biomechanical factors play a role in device measurement fidelity, and are currently the subject of further study and design by the authors.
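The three-force-body condition used in this argument can be checked numerically. The sketch below is generic planar statics, not a model of the LSMD: the contact points and force vectors are hypothetical, and the function names are introduced only for illustration. It tests whether three lines of action meet at a common point and reports the residual force and moment.

```python
import numpy as np


def cross2(a, b):
    """Scalar 2-D cross product (z-component)."""
    return float(a[0] * b[1] - a[1] * b[0])


def lines_meet_at(p1, d1, p2, d2):
    """Intersection of two lines of action, or None if they are parallel."""
    denom = cross2(d1, d2)
    if abs(denom) < 1e-12:
        return None
    t = cross2(p2 - p1, d2) / denom
    return p1 + t * d1


def is_concurrent(points, forces, tol=1e-9):
    """True if the third line of action passes through the meeting point
    of the first two."""
    meet = lines_meet_at(points[0], forces[0], points[1], forces[1])
    if meet is None:
        return False
    return abs(cross2(meet - points[2], forces[2])) < tol


def net_force_and_moment(points, forces):
    """Resultant force and moment about the origin."""
    net_force = forces.sum(axis=0)
    net_moment = sum(cross2(r, f) for r, f in zip(points, forces))
    return net_force, net_moment


# Hypothetical geometry: three contact forces that sum to zero.
forces = np.array([[1.0, 1.0], [-1.0, 1.0], [0.0, -2.0]])
good_points = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]])   # concurrent
bad_points = np.array([[0.0, 0.0], [2.0, 0.0], [1.5, 3.0]])    # shifted pad
```

With the shifted contact point the forces still sum to zero but leave a non-zero moment, so the body would have to rotate or migrate until the lines of action become concurrent again. This is the kind of self-equilibration, and potential source of bias, discussed above.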

Limitations
There are some important limitations of the study to point out. Although the full cohort was of sufficient sample size (n = 20), the sub-groups used to test reproducibility were smaller (n = 10), making them more susceptible to sampling bias. Nevertheless, ICC values agreed well with the literature and any aberrant findings were explainable.
The study only included healthy adult participants, whereas the device is intended for assessing strength across the population. This includes seniors and clinical populations that may have different underlying force generating characteristics, and thus reliability cannot necessarily be extrapolated from the data in the present study. Future studies will be needed to evaluate the psychometrics of the LSMD in specific populations where applications exist. However, the data given in Figure 6 provide sufficient information for a clinician to infer the suitability of the LSMD for their individual needs.
This study only included the upper-extremity LSMD for the elbow joint. Although it is likely that the LSMD knee device (discussed in [30]) behaves similarly to the elbow device, since it is essentially a scaled-up version of the arm model, future studies will need to apply the testing protocol to evaluate reliability and validity of the LSMD for lower-extremity assessment.

Conclusions
The reliability (repeatability and reproducibility) of the LSMD was found to be comparable to established commercial devices for measurement of elbow flexor and extensor strength in healthy adults. Validity of the LSMD was found to be acceptable when measurements were calibrated against the accepted gold standard IKD measurement. It is concluded that a calibrated wearable device such as the LSMD can be used to obtain reliable and valid measurements of elbow strength.
Author Contributions: M.B. wrote the manuscript draft, designed and executed the experiment, and analyzed the data. A.S. managed the BioTone project, provided hardware and software support, and assisted with manuscript preparation and data collection. C.A.M. led the development of the LSMD, conceived of and acquired funding for the project, and assisted with manuscript preparation and data analysis. All authors have read and agreed to the published version of the manuscript.

Funding:
The project was funded through the Atlantic Canada Opportunities Agency, Atlantic Innovation Fund, Project # 195180.