Evaluation of In-Cloth versus On-Skin Sensors for Measuring Trunk and Upper Arm Postures and Movements

Smart workwear systems with embedded inertial measurement unit sensors are developed for convenient ergonomic risk assessment of occupational activities. However, their measurement accuracy can be affected by potential cloth artifacts, which have not been previously assessed. Therefore, it is crucial to evaluate the accuracy of sensors placed in the workwear systems for research and practice purposes. This study aimed to compare in-cloth and on-skin sensors for assessing upper arm and trunk postures and movements, with the on-skin sensors as the reference. Five simulated work tasks were performed by twelve subjects (seven women and five men). Results showed that the mean (±SD) absolute cloth–skin sensor differences of the median dominant arm elevation angle ranged between 1.2° (±1.4) and 4.1° (±3.5). For the median trunk flexion angle, the mean absolute cloth–skin sensor differences ranged between 2.7° (±1.7) and 3.7° (±3.9). Larger errors were observed for the 90th and 95th percentiles of inclination angles and inclination velocities. The performance depended on the tasks and was affected by individual factors, such as the fit of the clothes. Potential error compensation algorithms need to be investigated in future work. In conclusion, in-cloth sensors showed acceptable accuracy for measuring upper arm and trunk postures and movements on a group level. Considering the balance of accuracy, comfort, and usability, such a system can potentially be a practical tool for ergonomic assessment for researchers and practitioners.


Introduction
Work-related musculoskeletal disorders (MSDs) remain a substantial burden to individuals, organizations, and societies worldwide. In Europe, MSDs are the most prevalent work-related health problem: about 43% of European Union (EU) workers reported back pain, and 41% reported muscular pains in the shoulders, neck, and/or upper limbs in 2015 [1]. Work in tiring positions is still common in current workplaces: 43% of workers in the EU reported being exposed to such positions for at least a quarter of their working time [1]. In Sweden, the total costs of MSDs were estimated at 102.3 billion SEK in 2012, which equaled 2.8% of the national gross domestic product (GDP) [2].
In order to design effective intervention programs and prevent MSDs, a better understanding of the underlying mechanisms between exposures and outcomes, the development of practical and reliable risk assessment methods, and a wider use of such high-quality risk assessment methods are among the key steps as suggested by researchers [3][4][5]. However, physical exposure has generally been assessed via questionnaires [6], which suffer from low

Sensors 2023, 23, 3969

calculated and compared for each occupational activity, and the resulting differences from the comparisons can provide knowledge about the accuracy and limitations of measurements for the practical use of smart workwear systems both in the lab and in the field.

Demographic Data
Twelve volunteers (five males and seven females) were involved in this study. Before the experiment, they were informed about the study and signed informed consent. The mean (±standard deviation) age of the participants was 32.8 ± 11.3 years, the height was 174.2 ± 10.2 cm, the weight was 68.7 ± 10.2 kg, and the BMI was 22.6 ± 2.7 kg/m². Eleven participants were right-handed, and one was left-handed. The study was approved by the Regional Ethics Committee in Stockholm (Dnr: 2019-01206).

Experimental Setups
For this study, two sets of inertial measurement units (IMUs) were used (Figure 1), with each set containing three sensors (Movesense, Suunto, Helsinki, Finland). The first set of sensors was attached directly to the skin using double-sided tape, with two on the upper arms at the insertion of the deltoids and one on the upper back at the level of the T1-T2 vertebrae. An additional piece of medical tape was put over the sensors on the skin to avoid relative movement. This setup is referred to as "skin sensors" in the following text. The second set of sensors was placed in an elastic T-shirt (Wergonic AB, Stockholm, Sweden), with pockets for the IMU sensors at both upper arms and the upper back. The shape of the pocket and the extra sensor case with a matching shape feature were designed to prevent sensor rotation and limit relative movement errors (Figure 1). The second setup is referred to as "cloth sensors" in the following text. The shirt size, ranging from small to extra-large, was chosen for each participant to be comfortable and tight. The two sets of sensors were placed close to each other without overlapping.
Both the accelerometer and the gyroscope data from the IMU sensors were sampled at 104 Hz and collected by the Movesense showcase iPhone application (Amer Sports Digital Services Oy, Helsinki, Finland) using Bluetooth.

Experimental Protocol
The experiment consisted of calibration steps and simulated work tasks. The calibration was necessary for the data fusion presented in the next section. It consisted of three calibration poses, and participants were instructed to hold each pose still for three seconds.
After the calibration, participants were introduced to the work tasks and instructed to perform them as they naturally would. When possible, they were also instructed to perform the tasks mainly with their dominant hand. The duration of each task was two minutes. The tasks were chosen to represent work scenarios using the upper arms and back at low and high angle amplitudes and velocities, which allowed the shirt setup to be assessed under different conditions of use. The tasks performed were as follows (

Data Fusion and Signal Processing
Raw data from the IMUs were processed in MATLAB (version R2022a, MathWorks Inc., Natick, MA, USA). The inclination angle, inclination velocity, and generalized velocity were computed for the sensors on the arms. The sagittal inclination angle and sagittal inclination velocity were computed for the trunk. The posture and movement computations of both the arms and trunk followed the processing steps described in Fan et al. [27]. Firstly, data from accelerometers and gyroscopes were integrated with a sensor fusion algorithm to reduce the effects of non-gravitational (dynamic) acceleration and generate corrected gravitational acceleration. In the sensor fusion algorithm, the original data were resampled to 128 Hz and processed by a Kalman filter with the recommended coefficients [28]: 0.005 rad/s for the gyroscope white noise, 0.1 m/s² for the accelerometer white noise, and 0.0005 rad/s² for the gyroscope bias. Then, the corresponding angles of each body part were calculated using the reference poses:
• Inclination angles (arms): upper arm inclination angles were obtained by calculating the relative angle to the reference I-pose [29];
• Forward/sagittal inclination angles (trunk): the forward inclination angles (inclination angles on the sagittal plane) were obtained using Hansson forward/backward projections, the corresponding I-pose as the reference, and forward trunk bending to indicate the direction [30].
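As a rough illustration of the arm inclination angle, the following minimal sketch computes the angle between a gravity vector and the gravity vector recorded in the reference pose. It is written in Python rather than the MATLAB used in the study, the function name `inclination_deg` and the example vectors are hypothetical, and it does not reproduce the Kalman filter or the Hansson projection used for the trunk.

```python
import math

def inclination_deg(g, g_ref):
    """Angle in degrees between gravity vector g and the reference-pose vector g_ref.

    Both inputs are 3-element sequences in the sensor frame. The function name and
    the input convention are assumptions made for this sketch.
    """
    dot = sum(a * b for a, b in zip(g, g_ref))
    norm = math.sqrt(sum(a * a for a in g)) * math.sqrt(sum(b * b for b in g_ref))
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))

# Hypothetical example: gravity along the sensor z-axis in the reference pose,
# and along the sensor x-axis when the arm is raised to horizontal.
g_ref = (0.0, 0.0, 9.81)
g_raised = (9.81, 0.0, 0.0)
angle = inclination_deg(g_raised, g_ref)
print(round(angle, 1))
```

Because only the angle between the two vectors is used, this quantity carries no information about the direction of the elevation, which is why the trunk analysis needs an additional projection to separate forward from backward bending.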
Synchronization between the two sets of sensors was performed using cross-correlation and then visually checked for each individual participant. Finally, two types of angular velocities were calculated for comparison since both computational methods had been used and reported in previous research [31][32][33][34]. In addition, recent studies have identified large differences in the values between these two computational methods [27,29,35]. Since there are currently no standard metrics for assessing the arm's angular velocity, the performance of the in-cloth sensors vs. on-skin sensors using both metrics is worth evaluating. The two types of angular velocities are described below:

• The inclination velocities (arms and trunk): these were computed using a simple temporal derivative, i.e., dividing the difference between two samples of inclination angles by the sampling time;
• The generalized velocities (arms): the upper arm generalized velocities were obtained by dividing the angular difference of the gravitation vectors between two samples on a unit sphere by the sampling time [30,35].
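The difference between the two velocity definitions can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' MATLAB code: the function names, the 128 Hz rate taken from the resampling step, and the synthetic gravity vectors are all hypothetical choices for the example.

```python
import math

def angle_between_deg(u, v):
    """Angle in degrees between two 3-element vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def inclination_velocity(inclination, fs):
    """Signed difference of consecutive inclination angles divided by the sampling time (1/fs)."""
    return [(b - a) * fs for a, b in zip(inclination, inclination[1:])]

def generalized_velocity(gravity, fs):
    """Angle between consecutive gravity vectors divided by the sampling time (1/fs)."""
    return [angle_between_deg(u, v) * fs for u, v in zip(gravity, gravity[1:])]

# Hypothetical motion: the arm stays horizontal (90 degrees from the reference)
# while sweeping 1 degree per sample in the horizontal plane.
fs = 128.0  # Hz, assumed from the resampling rate in the text
g_ref = (0.0, 0.0, 1.0)
gravity = [(math.cos(math.radians(d)), math.sin(math.radians(d)), 0.0) for d in (0, 1, 2)]
inclination = [angle_between_deg(g, g_ref) for g in gravity]

incl_vel = inclination_velocity(inclination, fs)  # close to 0 deg/s: inclination does not change
gen_vel = generalized_velocity(gravity, fs)       # close to 128 deg/s: the vector still moves
print(incl_vel, gen_vel)
```

In this example the inclination velocity is zero while the generalized velocity is not, which illustrates why the two metrics can give very different values for the same movement.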

Statistical Analysis
After synchronizing and extracting the upper arm and trunk angles and velocities of each work task, a comparison between the skin sensors and cloth sensors was made on the following parameters. For the upper arm and trunk inclination angles, the 5th, 10th, 50th, 90th, and 95th percentiles of the angles and the percentage of time with the angles less than 20°, as well as the time over 30°, 45°, 60°, and 90°, were calculated. For the upper arm inclination and generalized velocities, as well as the trunk inclination velocities, the 5th, 10th, 50th, 90th, and 95th percentiles were calculated. A paired comparison was made by using the mean absolute error (MAE) and its standard deviation (SD) for all parameters for each work task. In addition, Bland-Altman plots of the median and the 90th percentile of the upper arm and trunk angles and inclination velocities for all tasks were used to show the differences and the limits of agreement (calculated as mean ± 1.96 SD) between the two sensor setups.
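The summary measures described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names are hypothetical, the percentile uses linear interpolation (the convention used in the study is not stated), the SD is taken as the sample SD, and the example values are invented rather than taken from the study.

```python
import statistics

def percentile(values, p):
    """p-th percentile with linear interpolation between order statistics (an assumed convention)."""
    xs = sorted(values)
    k = (len(xs) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)

def percent_time(values, condition):
    """Percentage of samples for which condition(value) is true."""
    return 100.0 * sum(1 for v in values if condition(v)) / len(values)

def mae_and_sd(cloth, skin):
    """Mean and sample SD of the absolute cloth-skin differences across participants."""
    abs_err = [abs(c - s) for c, s in zip(cloth, skin)]
    return statistics.mean(abs_err), statistics.stdev(abs_err)

def bland_altman(cloth, skin):
    """Mean difference and limits of agreement (mean ± 1.96 SD) of cloth minus skin."""
    diffs = [c - s for c, s in zip(cloth, skin)]
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return mean_diff, mean_diff - 1.96 * sd, mean_diff + 1.96 * sd

# Invented per-participant median angles (degrees), for illustration only.
skin = [20.0, 25.0, 30.0, 35.0]
cloth = [22.0, 24.0, 33.0, 34.0]
mae, mae_sd = mae_and_sd(cloth, skin)
mean_diff, lower, upper = bland_altman(cloth, skin)
print(mae, mean_diff, lower, upper)
```

Note that the MAE uses absolute differences while the Bland-Altman mean uses signed differences, so a mean difference near zero can coexist with a clearly non-zero MAE when positive and negative errors cancel.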

Angular Distributions
For the dominant upper arm, the cloth-sensor setup generally had small MAEs compared to the skin-sensor setup, ranging from 1.2° to 4.1° for the median upper arm inclination angle (Table 1). Larger errors were observed for the cleaning dishwasher and cleaning windows tasks.
The differences and limits of agreement between the skin sensors and cloth sensors during the simulated tasks for the dominant and non-dominant arms are also presented with Bland-Altman plots in Figure 4. Similarly, larger differences were observed for the cleaning dishwasher and cleaning windows tasks. For the dominant arm, the mean difference was −0.15° for the median inclination angle, and the limits of agreement were −6.5° and 6.2°. The mean difference for the 90th percentile dominant arm inclination was 0.85°, with limits of agreement of −11° and 13°. For the non-dominant arm, the limits of agreement were narrower than those for the dominant arm, with −5.4° and 4.1° for the median inclination angle and −7.5° and 9.6° for the 90th percentile inclination angle.
In addition, individual differences were observed, and larger errors between the cloth sensors and skin sensors were observed for a few participants. Figures 5 and 6 illustrate this variance in the time-series angular measurements of the cloth sensors against the skin sensors. In Figure 5, the angular measurements by the cloth sensors were in good agreement with the skin sensors, as illustrated by the example of one participant cleaning windows. As a comparison, in Figure 6, larger differences were observed, as shown by the example of one participant cleaning the dishwasher. The differences became larger when the arms were lifted higher for the upper arms, and a constant difference was observed for the trunk inclination throughout the task.
For the trunk, the MAEs between the cloth and skin sensors ranged from 2.7° to 3.7° for the median forward inclination angle (Table 2). The maximum MAEs were observed for the lifting boxes and cleaning dishwasher tasks, with MAEs equal to 6.8° and 5.8° for the 95th percentile angles, respectively. For the percentage of time spent with angles less than 20°, the largest difference was observed for the task of sorting mail, with the MAE equal to 10.7%. A potential reason could be that during this specific task, the participants spent a lot of time around 20° trunk inclination (mean time percentage of 78%), so that even a small error would lead to misclassification for trunk inclination <20°.
The Bland-Altman plots show the limits of agreement between the skin sensors and cloth sensors for the trunk inclination angle (bottom row, Figure 4). The mean difference of the median trunk inclination was 0.09°, with limits of agreement of −8.4° and 8.6°. Larger differences were observed for the 90th percentile trunk inclination, with a mean difference of −1.1° and limits of agreement of −14° and 12°. In addition, individual differences were observed, especially during the tasks of lifting boxes and cleaning the dishwasher.

Angular Velocity
For the dominant arm, the MAEs between the cloth and skin sensors were generally small, ranging from 1°/s to 4.5°/s for the median inclination velocity (Table 3). Maximum errors were found for the sorting mail and cleaning windows tasks, with MAEs equal to 15.3°/s and 26.1°/s for the 95th percentile inclination velocity, respectively. These larger differences might be due to the sleeves not following the upper arm movements properly, especially during faster motions and at high inclination angles. For the non-dominant arm, the MAEs between the two sensor setups of the median inclination velocity ranged from 0.5°/s to 2.1°/s (Table A2 in Appendix A). The MAEs of the median trunk forward inclination velocity had smaller values, ranging from 0.4°/s to 2°/s (Table 4). The lifting boxes task had the largest difference, with an MAE equal to 13.2°/s for the 95th percentile inclination velocity.
The limits of agreement between the skin sensors and cloth sensors of the upper arm and trunk inclination velocities during the simulated tasks are also shown as Bland-Altman plots in Figure 7. For the dominant arm, the mean difference was 0.75°/s, and the limits of agreement were −5.6°/s and 7.1°/s for the median inclination velocity. A larger dispersion of data points was observed for the window cleaning task. This could be partly due to the large variance in individual work techniques. For the 90th percentile inclination velocity of the dominant arm, the mean difference was 2.7°/s, and the limits of agreement were −23°/s and 28°/s. For the trunk median inclination velocity, the mean difference was 0°/s, and the limits of agreement were −3.8°/s and 3.8°/s. For the 90th percentile trunk inclination velocity, the mean difference was −1.5°/s, and the limits of agreement were −16°/s and 13°/s. A larger dispersion was observed for the box-lifting task.
The generalized angular velocities showed considerably higher differences between the two sensor setups. For the median upper arm generalized velocity, compared to the upper arm inclination velocity, the maximum MAEs increased from 3.8°/s to 15.3°/s for the dominant arm and from 2.3°/s to 3.9°/s for the non-dominant arm (Tables A3 and A4). The differences became more evident when looking at the 95th percentile of angular velocity. This could be explained by the definition of generalized angular velocity, where movements in all directions are included, compared to inclination velocity, where only the change in inclination is included.
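The relation between the two velocity definitions can be made explicit with a short derivation. The notation is introduced here for illustration and is not taken from the paper: g_k is the unit gravity vector at sample k, g_ref the gravity vector in the reference pose, Δt the sampling time, and the inclination velocity is taken in absolute value.

```latex
\begin{align*}
\theta_k &= \angle(\mathbf{g}_k, \mathbf{g}_{\mathrm{ref}}), &
\omega^{\mathrm{incl}}_k &= \frac{\lvert \theta_{k+1} - \theta_k \rvert}{\Delta t}, &
\omega^{\mathrm{gen}}_k &= \frac{\angle(\mathbf{g}_k, \mathbf{g}_{k+1})}{\Delta t} \\
% the angle between unit vectors is a metric on the sphere, so the triangle inequality applies
\lvert \theta_{k+1} - \theta_k \rvert &\le \angle(\mathbf{g}_k, \mathbf{g}_{k+1})
&&\Longrightarrow &
\omega^{\mathrm{incl}}_k &\le \omega^{\mathrm{gen}}_k
\end{align*}
```

Under these definitions the generalized velocity is never smaller than the magnitude of the inclination velocity at any sample, since it also captures movement that leaves the inclination unchanged. This bounds the velocity values themselves; it does not by itself determine how large the cloth-skin differences are.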

Discussion
This study evaluated in-cloth against on-skin sensors for measuring trunk and upper arm postures and movements for smart workwear systems during simulated work tasks. For most tasks, high agreement between the two sensor setups was observed for the upper arm and trunk angles. For the arm, slightly higher errors were observed for the 90th and 95th percentiles of arm inclination angle and velocity during cleaning windows and cleaning the dishwasher. For the trunk, slightly higher errors were observed for the 90th and 95th percentiles of trunk inclination and velocity for lifting boxes and cleaning the dishwasher. The generalized velocity had distinctly higher errors for both the upper arms and trunk. The in-cloth sensors showed acceptable accuracy on a group level for measuring upper arm and trunk inclinations and inclination velocities.
The simulated tasks in this study were chosen to cover a large range of work activities that may involve arm and trunk movements, thus evaluating the in-cloth sensors in different settings. Activities like cleaning windows and cleaning dishwashers involved higher movement amplitudes for the dominant arm. The errors of the in-cloth sensors compared to the on-skin sensors were higher in these cases, which is to be expected. These larger differences might also be due to the sleeves not following the upper arm movements properly, especially during faster motions and at high inclination angles. As shown in Table 1, the MAEs increased in general from the 5th to the 95th percentile of the upper arm angle. Still, the MAEs were at most 4.1° for all the median arm inclination values. A similar phenomenon was observed in the arm inclination velocities (Table 3). The median arm inclination velocity had MAEs of at most 4.5°/s in all tasks. Higher errors were observed when the generalized velocities were calculated (Table A3 in Appendix A). The maximum MAE for the median generalized velocity was 15.3°/s during window cleaning (the reference value was 124.2°/s), and the MAEs were considerably higher for the 95th percentile of arm generalized velocity. This is expected since the definition of generalized velocity includes motions on all planes, compared to inclination velocity, which only includes changes in the inclination. Therefore, the performance of the in-cloth sensors can be affected to a greater degree by the cloth and motion artifacts during the tasks.
For the non-dominant arm, the in-cloth sensors had lower MAEs than the dominant arm regarding the inclination angle and velocity (Table A1). This was also expected as the non-dominant arm was less used. The maximum MAE was observed for the 95th percentile inclination angle while cleaning the dishwasher, during which participants usually used their non-dominant arm to a greater degree. For the median inclination angles, the MAEs were less than 2° for all tasks. Concerning the non-dominant arm inclination velocities (Table A2), the overall MAEs were smaller than 6.6°/s. Higher MAEs were also observed for the non-dominant arm generalized velocities (Table A4 in Appendix A).
Regarding the trunk, lifting boxes and cleaning dishwashers involved higher movement amplitudes. The maximum MAE for trunk forward inclination angles across all tasks was 6.8°, which was observed during lifting boxes (Table 2). In general, the errors for trunk inclination velocity were quite small, with maximum MAEs of 2°/s and 13.2°/s for the median and 95th percentile values, respectively, observed during the lifting boxes task (Table 4).
It is worth noting that the MAEs for trunk inclination remained at a similar level from the 5th percentile to the 95th percentile throughout each task, even when the trunk's forward inclination angle was small. For the upper arms, in contrast, the MAEs generally increased for the higher percentiles of arm inclination (Table 1) and when the arms were lifted higher. This type of error is further illustrated in Figure 6. The relatively constant error for the trunk could be caused by the non-optimal fit of the clothes. The looseness of the garment where the trunk sensor was located or a potential overlap of the cloth sensor and skin sensor could lead to the cloth sensor having a slightly different tilt compared to the skin. The errors observed for the upper arms could potentially be caused by the elasticity of the sleeve fabric, leading to slightly larger cloth artifacts when lifting the arms high.
In addition to the fit of the clothes, different individual work techniques and individual height may also contribute to variance in the level of errors. For example, there was a high variance in the individual arm inclination angles and velocities during cleaning windows and the dishwasher and a high variance in trunk velocities while lifting boxes. Including this variance in the experiment is therefore useful, so that the results can represent different work scenarios and individuals.
Another limitation was the placement of the two sensor setups, which should ideally be at the same location, i.e., at the insertion of the deltoids and the level of T1-T2 vertebrae. However, since overlapping of the sensors was undesirable, they could not be placed in the same place. Therefore, the cloth sensors were placed carefully close to the skin sensors without overlapping each other. However, for a few participants, the overlapping of the cloth sensors on the skin sensors of the upper arms was observed. This can lead to overestimated errors of the cloth sensors since normal wear of the T-shirt will be tighter on the skin and potentially a better fit on the body without another sensor in between.
Future studies can look into error-correcting algorithms for the in-cloth sensor setup to improve its performance in smart workwear systems. This study highlights the existing errors in such a system and can help identify the most suitable approach in future studies. One potential method is the use of artificial intelligence-based algorithms; for example, Lorenz et al. [16] used a probabilistic neural network based on a supervised learning method to reduce loose cloth artifacts.

Conclusions
This work evaluated the in-cloth sensors against the on-skin sensors in simulated work tasks for upper arms and trunk posture assessment. Errors from in-cloth sensors were quite low for all median values of inclination angles and velocities. Larger errors were observed for the 90th and 95th percentiles of inclination angles and velocities. The performance depended on the tasks and was affected by individual factors, such as the fit of the clothes. Nevertheless, future work should compensate for the cloth artifacts and thus improve measurement accuracy. In conclusion, in-cloth sensors showed acceptable accuracy for measuring upper arm and trunk postures and movements on a group level.

>60°    0 ± 0 (0)    0.2 ± 0.3 (0.4)    2.1 ± 3.4 (4.7)    2.2 ± 2.1 (17.6)    0.9 ± 1.3 (4.3)
>90°    0 ± 0 (0)    0 ± 0.1 (0)    0 ± 0 (0.1)    1.8 ± 1.8 (3.4)    0.5 ± 0.8 (1.4)

Table A2. The mean ± standard deviation of the mean absolute errors (MAEs) of the non-dominant arm inclination velocity between cloth sensors and skin sensors during the five simulated tasks, with the reference value of skin sensors shown in brackets (n = 12).
3.5 ± 2.4 (52.7)    1.9 ± 1.6 (68.6)    5.5 ± 4.6 (106)    3.7 ± 3 (94.7)
95th

Dominant Arm, Generalized Velocity
Simulated Work Tasks