Assessment of Shoulder Range of Motion Using a Wireless Inertial Motion Capture Device—A Validation Study

(1) Background: Joint range of motion has traditionally been measured with a universal goniometer or expensive laboratory-based kinematic analysis systems. Technological advances in wearable inertial measurement units (IMUs) enable limb motion to be measured with a small portable electronic device. This paper aims to validate an IMU, the ‘Biokin’, for measuring shoulder range of motion in healthy adults; (2) Methods: Thirty participants completed four shoulder movements (forward flexion, abduction, and internal and external rotation) on each shoulder. Each movement was assessed with a goniometer and the IMU by two testers independently. The extent of agreement between each tester’s goniometer and IMU measurements was assessed with intra-class correlation coefficients (ICCs) and Bland-Altman 95% limits of agreement (LOA). A secondary analysis compared agreement between the two testers’ goniometer measurements, and between their IMU measurements (inter-rater reliability), using ICCs and LOA; (3) Results: Goniometer and IMU measurements for all movements showed high levels of agreement when taken by the same tester; ICCs > 0.90 and LOAs < ±5 degrees. Inter-rater reliability was lower; ICCs ranged from 0.71 to 0.89 and LOAs were outside the a priori defined acceptable limits (i.e., > ±5 degrees); (4) Conclusions: The current study provides preliminary evidence of the concurrent validity of the Biokin IMU for assessing shoulder movements, but only when a single tester took measurements. Further testing of the Biokin’s psychometric properties is required before it can be confidently used in routine clinical practice and research settings.


Introduction
Accurately assessing joint range of motion (ROM) is integral in clinical orthopaedics and research settings. Several methods and instruments are available for measuring ROM, varying from visual estimation through to specialised kinematic assessment laboratories [1]. Each method has benefits and limitations, with the universal goniometer being the most commonly adopted technique due to portability and ease of use [1]. Goniometers, when used correctly, can accurately measure ROM; however, measurement quality is influenced by the tester's manual skills and the methods used [2,3]. In recent years, the growing popularity of commercially available motion tracking devices has increased the use of wearable motion capture systems to measure ROM. Inertial measurement units (IMUs) are one type of motion tracking device that has been widely adopted due to ease of use, relatively low cost, and portability [3-5]. These devices utilise recent advancements in the miniaturisation of motion capturing sensors to produce a light, non-invasive and wireless instrument that has the potential to assess human movement in a variety of environments [6,7].
The shoulder complex is one of the most complicated joint systems in the body, incorporating the glenohumeral, acromioclavicular, scapulothoracic, and sternoclavicular joints [8]. The shoulder's design allows for tri-planar upper limb movements which, due to the unique structure and coupled movements that form scapulo-humeral rhythm, produce kinematics that are harder to capture accurately than those of traditional mechanical or robotic joints [9,10]. For these reasons, measuring upper limb kinematics is regarded as the most difficult problem in human motion estimation [5]. However, this challenge must be addressed because measuring shoulder movements is important for assessing upper limb movement, which has obvious clinical significance. If clinically suitable IMUs were available to replace expensive, difficult-to-access kinematic labs, it would help clinicians better understand the impact of their interventions in routine clinical practice.
Wearable IMU devices are limited by inertial sensor drift and linear acceleration interference [11,12]. Inertial sensors measure segment orientation indirectly by integrating acceleration and angular velocity signals; during this process small errors accumulate over time, which is termed inertial sensor drift [13]. Errors also occur during relatively large linear acceleration, which is termed linear acceleration interference [12]. Investigations of human kinematics using accelerometers or gyroscopes alone have been limited by these challenges [11,14,15]. To overcome these issues, a tri-axial gyroscope, accelerometer and magnetometer have been combined into a single device and accompanied by a fusion algorithm [6,12,14]. In theory, combining data from each sensor can reduce measurement error and provide a more accurate estimation of motion [14,15].
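The effect of drift, and the way a second sensor can bound it, can be illustrated with a minimal sketch. This is not the Biokin fusion algorithm, which is not described here; a simple one-axis complementary filter is swapped in purely for illustration, and the gyroscope bias, sample rate, and `alpha` weighting are all assumed values.

```python
def integrate_gyro(rates_dps, dt):
    """Integrate angular velocity (deg/s) into an angle (deg); any bias accumulates."""
    angle = 0.0
    angles = []
    for rate in rates_dps:
        angle += rate * dt
        angles.append(angle)
    return angles


def complementary_filter(rates_dps, reference_angles_deg, dt, alpha=0.98):
    """Blend the integrated gyroscope angle with a drift-free reference angle.

    The reference stands in for an accelerometer- or magnetometer-derived
    angle. alpha (assumed value) sets how much the gyroscope is trusted.
    """
    angle = reference_angles_deg[0]
    fused = []
    for rate, reference in zip(rates_dps, reference_angles_deg):
        angle = alpha * (angle + rate * dt) + (1.0 - alpha) * reference
        fused.append(angle)
    return fused


# Hypothetical scenario: a stationary sensor whose gyroscope has a constant
# 0.5 deg/s bias, sampled at 100 Hz for 60 s. The true angle is 0 throughout.
dt = 0.01
n_samples = 6000
rates = [0.5] * n_samples
reference = [0.0] * n_samples

gyro_only = integrate_gyro(rates, dt)
fused = complementary_filter(rates, reference, dt)

print(f"gyroscope-only angle after 60 s: {gyro_only[-1]:.2f} deg")
print(f"fused angle after 60 s:          {fused[-1]:.3f} deg")
```

In this toy case the gyroscope-only estimate grows linearly with time, whereas the fused estimate settles at a small constant offset, which is the qualitative behaviour that motivates combining sensors.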
Few studies have directly compared IMU and goniometer measurements of shoulder ROM. Yoon et al. compared IMU and goniometer measurements of static shoulder positions and found high levels of agreement for some positions (i.e., 95% LOA < ±5°), but not all [16]. Agreement reduced at higher levels of shoulder elevation, indicating heteroscedasticity. A recent review by Garimella et al. examined the accuracy of portable and inexpensive motion capture devices for measuring joint angles compared to benchmark systems, typically optical systems [17]. The investigators demonstrated that from 2009 to 2017, IMUs were the most popular devices investigated, and found that mean average error ranged from 0.8° to 5.0° for upper limb measurements. Measurement accuracy was dependent on the joint under measurement; joints with a large range of movement, such as the shoulder, typically had larger measurement error. The IMU's underlying algorithm, which combines data from individual components and filters noise, played a significant role in measurement accuracy, and varied between devices. Given that almost 50% of the devices reviewed in the study were from the same developer, Garimella et al. recommended that the accuracy of other devices also be evaluated.
'Biokin' is a locally-developed ROM measurement IMU device that combines tri-axial data from a gyroscope, accelerometer and magnetometer to assess motion [18]. It is small, light-weight and wearable, and can be used for measuring various limb movements. We have previously demonstrated that the IMU can accurately measure wrist movements [18], but its ability to accurately measure ROM at other joints needs to be established.
The primary aim of this study was to investigate whether the Biokin IMU can accurately measure active shoulder movements in a typical clinical environment. Specifically, we compared Biokin shoulder ROM measurements to universal goniometer measurements in healthy adults. A secondary aim was to investigate the inter-rater reliability of Biokin and goniometer measurements.

Participant Recruitment
Participants were hospital staff or students from our institution. Participants were volunteers and were recruited via communal bulletin boards and posters. Participants were excluded if they had active upper limb pathology and/or symptoms such as pain when moving their arm.

Data Collection
Demographic information was collected from all participants including age, sex, handedness, occupation, sport participation, and prior shoulder injuries.
Data were collected in a hospital orthopaedic ward. Four active movements were tested on each participant's right and left shoulder according to a standardised protocol: abduction; flexion; and internal rotation and external rotation at 90 degrees of shoulder abduction (see Tables 1 and 2 for starting positions, and Krishnan et al. for diagrammatic representations of each movement) [10]. Prior to each test, a researcher (MR or HG) demonstrated the movement and the participant practised the movement until the researcher was satisfied it was performed correctly. Participants were instructed to move the arm in each direction as far as they comfortably could. All movements were performed with the participant standing and the participant's limb fully exposed.
One researcher (MR or HG) collected goniometer measurements from each participant, and IMU measurements were taken during each movement. A second researcher (MR or HG), who was blind to the first measurements, repeated the process. One goniometer and one IMU measurement were taken by each researcher for each movement on each arm.
The IMU was calibrated prior to ROM testing according to the device instructions. The IMU was securely attached to the participant's forearm with a self-adhering strap, 10 cm distal to the lateral epicondyle (see Figure 1). Measurements were sent wirelessly to a mobile phone and subsequently transferred to Biokin-specific software to calculate ROM. This calculation was performed by a researcher (NN or PP) who was blind to goniometer measurements.

Data Analysis
Agreement between IMU and goniometric measurements for each movement on each limb was assessed using intraclass correlation coefficients (ICC) and Bland-Altman analysis [19]. Inter-rater reliability was assessed by comparing each researcher's IMU and goniometer measurements for each movement using ICCs and Bland-Altman analysis.
ICCs (2,1) were calculated using a two-way random effects model. The magnitude of correlation required to ensure adequate reliability is contested, and clinically acceptable correlations have been suggested anywhere from 0.75 to 0.90 [20,21]. We defined ICCs of less than 0.5 as indicative of poor reliability, values between 0.5 and 0.75 as indicative of moderate reliability, values between 0.75 and 0.90 as indicative of good reliability, and values greater than 0.90 as indicative of excellent reliability [22].
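The ICC(2,1) calculation and the interpretation bands above can be sketched as follows. This is a minimal illustration using the standard Shrout and Fleiss mean-squares formula for a two-way random effects, single-measure ICC, not the statistical software used in the study. The function names, the example angles, and the handling of values that fall exactly on a band boundary are assumptions.

```python
def icc_2_1(scores):
    """ICC(2,1) for a table with one row per subject and one column per rater or method."""
    n = len(scores)        # number of subjects (here, shoulders measured)
    k = len(scores[0])     # number of raters or methods
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-subjects mean square
    ms_cols = ss_cols / (k - 1)                 # between-raters mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def classify_icc(icc):
    """Map an ICC to the bands defined in the text (boundary handling is assumed)."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"


# Hypothetical flexion angles in degrees: [goniometer, IMU] for five shoulders.
example = [[165, 166], [172, 170], [158, 159], [180, 179], [149, 151]]
value = icc_2_1(example)
print(f"ICC(2,1) = {value:.3f} ({classify_icc(value)})")
```

Because ICC(2,1) penalises systematic differences between columns as well as random disagreement, a constant offset between the two instruments would lower the coefficient even if their measurements were perfectly correlated.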
Bland-Altman analysis determines the limits of agreement (LOA) between two measurements [23]. Judgement is required to determine the clinical relevance of the results [24]. We considered a clinically significant change in ROM to be at least 10 degrees, which is consistent with suggestions by others [25]. Consistent with other investigators, acceptable agreement between measurements required the LOA to be within five degrees of no difference between measurements [16].
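The limits of agreement and the five-degree acceptance criterion can be sketched as below. This is a minimal illustration that assumes the conventional 95% limits (mean difference ± 1.96 standard deviations of the differences) and reads the criterion as both limits lying within ±5 degrees of zero difference. The function names and example angles are hypothetical.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (mean difference, lower LOA, upper LOA) for paired measurements."""
    differences = [a - b for a, b in zip(method_a, method_b)]
    mean_difference = statistics.mean(differences)
    sd_difference = statistics.stdev(differences)  # sample SD (n - 1)
    lower = mean_difference - 1.96 * sd_difference
    upper = mean_difference + 1.96 * sd_difference
    return mean_difference, lower, upper


def loa_acceptable(lower, upper, threshold_deg=5.0):
    """True when both limits fall within +/- threshold_deg of no difference."""
    return lower >= -threshold_deg and upper <= threshold_deg


# Hypothetical abduction angles in degrees for six shoulders.
imu = [168.0, 175.5, 160.0, 181.0, 152.5, 170.0]
goniometer = [167.0, 176.0, 161.5, 180.0, 153.0, 168.5]

bias, lower, upper = bland_altman(imu, goniometer)
print(f"mean difference = {bias:.2f} deg, LOA = [{lower:.2f}, {upper:.2f}] deg")
print("acceptable agreement:", loa_acceptable(lower, upper))
```

Reporting the mean difference alongside the limits separates systematic bias between the two instruments from the random scatter around that bias.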
Assuming an alpha of 0.05, power of 0.80, two observations per movement, and an ICC of 0.8-0.9, a minimum of 46 observations per movement was required [26]. We considered it feasible to assess 30 participants, where each participant would provide two measurements for each movement direction (e.g., one flexion measurement from each shoulder), producing 60 measurements for each movement direction.

Ethical Considerations
Ethical approval was provided by the organisation's Human Research Ethics Committee (ref: 16/53). All participants provided informed written consent.

Results
Participant characteristics can be found in Table 3. Twelve participants were nurses, nine were doctors, four were students, and the remainder were other hospital staff. All participants completed all measurements.

IMU versus Goniometer Measurements
ICCs demonstrated excellent reliability (ICC > 0.90) between IMU and goniometer measurements for all shoulder movements (see Table 4). The mean difference between IMU and goniometer measurements was less than one degree for all movements. The difference in IMU and goniometer measurements was within five degrees of the mean difference for approximately 95% of participants.

IMU versus IMU and Goniometer versus Goniometer Measurements
ICCs ranged from 0.71 to 0.89, indicating moderate to good reliability when each tester's measurements were compared for the IMU and goniometer (see Tables 5 and 6). Mean differences between each tester ranged between 0.9 and 5.2 degrees, and limits of agreement were wide, consistently greater than 10 degrees either side of the mean difference in each tester's scores.

Discussion
Examining the reliability of the measuring system is necessary to estimate measurement precision [26]. The current study compared Biokin IMU and goniometer measurements for active shoulder ROM and found high levels of reliability and agreement, providing evidence of concurrent validity of the IMU for measuring shoulder ROM. High agreement suggests that a single assessor's IMU or goniometer measurements could be used interchangeably. However, inter-rater reliability and agreement were considerably lower, suggesting that one assessor's measurements cannot be exchanged for another's for either the IMU or goniometer.
In the current study, LOAs were within our predefined limits and reliability estimates were consistently higher when the same tester took measurements than when different testers took measurements. Most studies of shoulder ROM using goniometers have found intra-rater reliability greater than inter-tester reliability (see Norkin and White for a summary) [25]. Potential sources of error for goniometer and IMU measurements include differences in starting positions and participant effort. Kebaetse et al. found shoulder abduction reduced by 23.6 degrees when participants' trunks were slouched versus erect [27]. Subtle differences in goniometer arm alignment can also lead to errors in repeated measurements [27]. When assessing change over time, it is recommended to use the same assessor where possible so that differences in measurement values reflect real change rather than measurement error [25].
Our results are similar to those of Yoon et al., one of the few studies that compared IMU and goniometer measurements for shoulder ROM [16]. Unlike our study, Yoon et al. compared measurements for static shoulder positions rather than movements through range. In combination, these studies provide initial evidence regarding the utility of IMUs for accurately measuring shoulder ROM.
Assessing ROM is a fundamental component of a musculoskeletal examination [16]. In clinical settings, the goniometer is commonly used to assess ROM, but is dependent on assessor availability and skill to manually assess and record measurements. Wearable motion capture systems, such as the Biokin, can potentially improve the ease of collecting measurements, and increase the amount and type of data collected. We tested an IMU in a typical clinical environment and found the device could be fitted to a participant in just a few seconds. Fitting the IMU needed one anatomical landmark to be located (lateral epicondyle), whereas the goniometer required three (axis and a landmark for each goniometer arm), which might reduce the IMU's measurement error. However, reliability and agreement estimates were similar for the IMU and goniometer in our study.
The relative simplicity of fitting and wearing the IMU could allow participants to fit and use the device by themselves, thereby increasing the variety of contexts in which ROM could be measured when compared to a goniometer. The Biokin could allow clinicians or researchers to remotely monitor participants' movements during daily activities, work, or sports; other IMUs have been used to collect shoulder movement data in the participant's workplace [28]. IMUs could also provide participants with feedback on their performance and progress during rehabilitation and help clinicians monitor and better understand the impact of their interventions. Assessing agreement between measurements taken when a participant uses an IMU independently and those taken when a clinician or researcher is present is a necessary subject of future research.
Our study has limitations, which represent opportunities for further research. First, the cohort consisted of healthy participants, and our results might not be replicated in those with shoulder pathology. Second, each tester took only one measurement with each modality, so neither intra-rater reliability for each modality nor the effect of averaging measurements across multiple attempts could be determined. Third, it is uncertain how these results relate to other joints or more complex movement patterns, such as reach-to-grasp upper limb movements.

Conclusions
Wearable motion capture devices are becoming increasingly common as sensor technology advances. Biokin is one type of wearable IMU that combines tri-axial data from a miniaturised gyroscope, accelerometer, and magnetometer to assess motion. In this study we provide evidence for the concurrent validity of IMU measurements of shoulder ROM compared to goniometer measurements when taken by a single tester. A greater understanding of the psychometric properties of IMUs for assessing different movements by a variety of testers and in different contexts is required before they can be confidently used in routine clinical practice.