4.1. Validation
The present study evaluates the performance of a new measuring device based on inertial measurement units (IMUs). IMU technology has improved steadily over the past few years. However, few IMU systems have so far had the concordance of their measurements evaluated against previously established gold standards.
Correlating the new system with an established gold standard was necessary to ensure that the new device would return similar values. ICC scores showed an excellent correlation between both systems, with values ranging from 0.887 to 0.974. These results are in accordance with other systems in the field and even show better performance than some of them [29,30]. Correlation values above 0.7 are generally regarded as good and, with the lowest value close to 0.9, both systems can be regarded as excellently correlated. Other researchers have also calculated the correlation of IMUs with the cervical range of motion (CROM) device. In that investigation, the correlation between the IMUs and the CROM device was excellent [27]. According to their results and ours, IMUs show good accuracy in comparison with established, reliable methods of measurement.
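As an illustration of how an ICC between two devices can be computed, the sketch below implements the two-way random-effects, absolute-agreement, single-measure form, often written ICC(2,1). The ICC form used here is an assumption, since the text does not state which one was applied, and the function name `icc_2_1` and the input values are hypothetical, not study data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    ratings: list of rows, one row per subject, one column per device.
    Assumed ICC form; the source text does not specify which one was used.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of devices
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 devices")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from the two-way ANOVA decomposition
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-device mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical readings in degrees (illustrative only)
identical = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
offset = [[1.0, 3.0], [2.0, 4.0], [3.0, 5.0]]   # second device reads 2 degrees higher

icc_identical = icc_2_1(identical)
icc_offset = icc_2_1(offset)
```

In this form, a constant offset between devices lowers the coefficient even though the two columns are perfectly linearly related, which is one reason the choice of ICC form matters when reporting results.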
However, correlation does not imply agreement between the measurements. To assess the degree of agreement, a t-test for related samples and a limits of agreement analysis were conducted. Statistically significant differences were found for all the comparisons between the EBI-5 system and the BTS system. Although these comparisons returned significant p values, the maximum mean difference between the measures was 1.3°, which other studies have considered good agreement [30].
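A small, consistent offset between two devices can be statistically significant while remaining small in absolute terms. The following minimal sketch computes the mean difference and the paired t statistic for two series; the readings are hypothetical and chosen only for illustration, and the function name `paired_summary` is not taken from the study.

```python
import math
import statistics


def paired_summary(a, b):
    """Return mean difference, SD of differences and paired t statistic."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)                 # sample SD of the differences
    t_stat = mean_d / (sd_d / math.sqrt(len(diffs)))
    return mean_d, sd_d, t_stat


# Hypothetical ROM readings in degrees (illustrative only, not study data)
imu = [61.2, 58.9, 64.1, 70.3, 55.8, 66.0, 62.4, 59.7, 68.2, 63.5]
optical = [60.1, 57.6, 63.2, 69.0, 54.9, 64.8, 61.0, 58.9, 67.1, 62.3]

mean_d, sd_d, t_stat = paired_summary(imu, optical)

# Two-sided 5% critical value of Student's t for 9 degrees of freedom
T_CRIT_DF9 = 2.262
significant = abs(t_stat) > T_CRIT_DF9
```

With these made-up values the mean difference is close to 1° and the t statistic is far above the critical value, because the differences vary little from pair to pair. The test answers whether a systematic offset exists, not whether that offset is large enough to matter clinically.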
The limits of agreement analyses are depicted in the Bland-Altman plots in Figure 2, Figure 3 and Figure 4. This type of analysis shows how much agreement is to be expected when both systems are compared. The vast majority of our samples fell within the limits of agreement, and those limits were similar to, if not better than, those reported in the literature [28]. It is important to remark that the widest limits were around 7° of discrepancy. The differences found can be explained in several ways. Some authors suggest that the inner workings of IMUs might return slightly different values than those calculated from the movement of reflective markers in space [29]. Another hypothesis is that different configurations of markers and IMUs might lead to differences in the results obtained. Furthermore, the difference in sampling rates can affect the values obtained [31]. Nevertheless, the mean differences between both systems are low and, therefore, comparisons between systems can be made. Some authors argue that, even with slight discrepancies between measuring devices, validated tools with low disagreement are acceptable for clinical and research purposes [32].
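The limits of agreement discussed above are conventionally computed as the mean difference plus and minus 1.96 standard deviations of the differences. The sketch below follows that convention; the exact procedure used in the study is not stated here, so this is an assumption, and the data and the name `limits_of_agreement` are hypothetical.

```python
import statistics


def limits_of_agreement(a, b, z=1.96):
    """Bland-Altman bias and limits of agreement for paired measurements.

    Returns (bias, lower, upper, share_inside), where share_inside is the
    fraction of paired differences lying within the limits.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)        # systematic offset between devices
    sd = statistics.stdev(diffs)         # sample SD of the differences
    lower, upper = bias - z * sd, bias + z * sd
    inside = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return bias, lower, upper, inside


# Hypothetical ROM readings in degrees (illustrative only, not study data)
imu_deg = [45.0, 52.5, 60.0, 38.0, 71.0, 49.5, 66.0, 57.0]
ref_deg = [43.0, 54.0, 57.5, 39.5, 68.0, 50.0, 62.5, 58.0]

bias, lower, upper, inside = limits_of_agreement(imu_deg, ref_deg)
print(f"bias={bias:.2f} deg, limits=({lower:.2f}, {upper:.2f}) deg, inside={inside:.0%}")
```

In a Bland-Altman plot, each difference is drawn against the mean of the two readings, with horizontal lines at the bias and at both limits.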
4.2. Case-Control
The main problem when quantifying the degree of injury is choosing the reference against which to compare the patient's ROM. The AMA guidelines have long been used by health-related professions to answer this need. However, at the time they were conceived, precise measuring devices were not widespread and, thus, an approximation of the normal values a person should achieve was a fair comparison.
As our study has shown, this fair comparison becomes insufficient when precise ROM measurements are to be classified as pathologically limited or non-limited. Surprisingly, when the AMA guidelines were applied to a population of healthy subjects (the control group), more than half of them fell below the expected ROM and would therefore be categorized as pathologically limited. Only when the references were changed to age-adjusted normality intervals did the number of people considered healthy start to match reality. We would not expect any of the healthy subjects to present significant ROM limitations, since the exclusion criteria would have prevented this. Other studies have investigated the effect of age on ROM and all concluded that ROM consistently decreases with each decade of life [21,23]. Our results show that a reference not adjusted for age (control group compared to the AMA guideline for cervical ROM) leads to poor results and suggests that healthy people might have a significant limitation of ROM.
Previous research found statistically significant differences between using the AMA guidelines for cervical ROM and using age-adjusted references for every decade of life. However, that study was not conducted on a healthy population; hence, it was not clear which guidelines were best for determining the degree of limitation [33]. The present results show that the AMA guidelines for cervical ROM were unable to determine that a healthy population had no ROM impairment. These results also question the alleged sensitivity of the AMA cervical ROM guidelines to detect different degrees of injury across the decades of life.
The differences between Swinkels' values and our own are around 3%, except for flexion-extension, where the values do not match and the difference was around 20%. While the device used in the present study compensates for trunk motions, thus eliminating extra range of motion from the final measure, the CROM device does not eliminate these combined movements. A recent systematic review analyzed the published normative values of cervical ROM obtained using different technologies. Its conclusions state that only the normative values obtained with the CROM device show consistency across studies. Consequently, the authors present the pooled normative values for the CROM device as the best normative values to use. However, their statement that the CROM guidelines are the only useful guidelines should be questioned. There is a high discrepancy in the extension movement between studies using tools such as goniometers or the CROM device and studies using digital measuring devices. The CROM device cannot compensate for movement in associated joints; therefore, we might be evaluating the combined movement of multiple joints and not only the cervical range [34]. Digital devices usually account for these combined movements and remove them from the equation. The biomechanical model used for measuring can also influence the results. Similar instrumentation or reference frames may correlate better than those that differ more [27,35]. Another reason for discrepancies between normative data could be differences in the anthropometrics of the population. When the effect of anthropometrics on human motion is studied, performance is bound to subject characteristics and, therefore, these should be accounted for [36]. The population from which we obtained the normative values consisted of university students with normally distributed ROM scores; there is no reason to think that their anthropometrics would not follow the normal distribution expected for any population of Spanish students. These normative values will not be suitable for the clinical setting, as they might not reflect the whole Spanish population and may not be applicable to other age decades. Nevertheless, future work will focus on obtaining these normative values for the whole population.
Once the data were normalized to the age intervals, the normality of the data was lost. One could argue that the loss of a normal distribution in the normalized values indicates some unconsidered bias. However, the skewness resulting from this transformation suggests that most of our results gather around the 100% mark and, thus, that our healthy subjects are better classified. A similar trend is observed in the case group, with increased skewness when normalized with the reference values that correspond to their age. This behavior matches the one observed in studies of health care costs. Costs can never be lower than 0 and the most expensive treatments are also the least frequent; therefore, the results are positively skewed [37]. It seems reasonable that the most frequent values are of lesser severity, while the worst outcomes are less frequent in the sample. In this case, the skewness reflects the nature of the clinical presentation of signs in pathology, with worse outcomes being less frequent than milder ones.
Using the normalized references with our own normative values did not result in all the healthy subjects receiving 100% of movement in all planes. If we were to consider the absence of pathology as having 100% of the ROM, many healthy subjects would be incorrectly classified as injured. To include more than 90% of these subjects in the healthy category, the threshold of normalized movement had to be lowered to 90%. Only right bending would have required a lower bound; however, we consider that lowering it too much would result in a greater probability of false negatives.
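The effect of the threshold on classification can be sketched as follows, assuming that the normalized movement is the measured ROM expressed as a percentage of the reference value for the subject's age group. That formula, the reference value of 60° and the readings below are illustrative assumptions, not values from the study.

```python
def normalized_rom(measured_deg, reference_deg):
    """Measured ROM as a percentage of the reference ROM (assumed formula)."""
    if reference_deg <= 0:
        raise ValueError("reference ROM must be positive")
    return 100.0 * measured_deg / reference_deg


def is_limited(measured_deg, reference_deg, threshold_pct=90.0):
    """True when the normalized ROM falls below the chosen threshold."""
    return normalized_rom(measured_deg, reference_deg) < threshold_pct


def share_non_limited(measurements_deg, reference_deg, threshold_pct=90.0):
    """Fraction of subjects classified as non-limited for one movement."""
    ok = sum(not is_limited(m, reference_deg, threshold_pct)
             for m in measurements_deg)
    return ok / len(measurements_deg)


# Hypothetical healthy-subject readings for one movement, in degrees
control_deg = [62.0, 58.0, 55.0, 51.0, 60.0]
reference_deg = 60.0   # hypothetical age-adjusted reference value

share_at_100 = share_non_limited(control_deg, reference_deg, threshold_pct=100.0)
share_at_90 = share_non_limited(control_deg, reference_deg, threshold_pct=90.0)
```

With these made-up readings, a 100% threshold labels most of the healthy subjects as limited, while a 90% threshold labels only one. Lowering the threshold further would keep reducing false positives among healthy subjects at the cost of missing truly limited movements.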
Lowering the bound also has an effect on the case group, as more people would be considered non-limited in each movement. Lowering the bound does not turn injured patients into healthy ones; it only reflects better which movements are truly limited. The AMA guidelines will classify almost all patients as injured, when in clinical practice some patients will not show any ROM limitation at all. Applying age-adjusted normative data indicates that some patients would not show ROM impairment at the assessment. Changing the way the measurements are interpreted can allow for a more injury-oriented treatment of each patient, without wasting time treating unproblematic aspects and, therefore, improving the outcome. This information should also be cross-checked against the presence of other pathological symptoms, such as pain. As our data show in the control group, a person can exhibit mobility lower than 90% of the normalized ROM and be completely asymptomatic. Having a limitation in ROM does not directly imply having an active injury. Clinicians should pay particular attention to this point, since finding a limitation does not always imply the need to treat it.
4.3. Limitations
The present study has some limitations. The first is that a difference of some degrees must be accounted for when results are contrasted with photogrammetric systems. The present devices are intended for clinical use in the rehabilitation environment. If measurements are to be taken in fields where precision must be higher, such as surgery, this would not be the measuring device of choice. The second limitation is that the case-control phase focused only on the age interval of 18–30 years; it would be unwise to assume that this age group represents all age intervals. We would nevertheless expect to see similar results in other age intervals, since what we are seeing here is the sheer number of healthy people (control group) who would be considered injured if the AMA guidelines were used to obtain percentages of movement. It is reasonable, however, that as older individuals are evaluated this effect would be less striking than in the results obtained. Another limitation is that anthropometrics and gender were not considered. Although all distributions for ROM follow normality and gender does not affect ROM in the cervical spine, assessing these variables as confounding factors would further strengthen the results presented herein.