Clinical Implications of the General Movement Optimality Score: Beyond the Classes of Rasch Analysis

This article explores the clinical implications of the three different classes drawn from a Rasch analysis of the general movements optimality scores (GMOS) of 383 infants. Parametric analysis of the class membership examines four variables: age of assessment, brain injury presence, general movement patterns, and 2-year-old outcomes. GMOS separated infants with typical (class 3) from atypical development, and further separated cerebral palsy (class 2) from other neurodevelopmental disorders (class 1). Each class is unique regarding its quantitative and qualitative representations on the four variables. The GMOS has strong psychometric properties and provides a quantitative measure of early motor functions. The GMOS can be confidently used to assist with early diagnosis and predict distinct classes of developmental outcomes, grade motor behaviors, and provide a solid base to study individual general movement developmental trajectories.


Introduction
The General Movement Assessment (GMA), a categorical analysis of the quality of an infant's spontaneous general movements, has been established as a systematic, valid, and reliable assessment of the integrity and function of the young nervous system [1,2] and is currently the most recommended assessment for identifying a high risk for cerebral palsy (CP) [3,4].
The GMA is an observational tool based on gestalt pattern recognition of the infant's spontaneously generated movements from birth to 5 months post term. General movements occur in age-specific patterns. Normally, general movements at preterm and term age involve the entire body and manifest themselves in a variable sequence of neck, trunk, arm, and leg movements [2]; they wax and wane in intensity, speed, and amplitude. Abnormal general movements at that age are characterized by a lack of variability, especially in the movement sequence [1,2]. Experienced observers achieve high inter-scorer agreement (89-93%) [2].
In addition to the categorical assessment of general movement patterns, a detailed assessment at preterm and term age that examines different components of these movements was first introduced by Ferrari et al. [5], adapted by Einspieler et al. [6], and resulted in the General Movement Optimality Score (GMOS) [7]. The GMOS applies the Prechtl optimality concept [8], resulting in an ordinal semi-quantification of the general movements' quality, in which neck, trunk, upper, and lower extremities are scored separately, with inter-scorer agreements ranging from 0.69 to 0.82 (Cohen's Kappa) [7]. In a large group of preterm and term infants, GMOS differentiated not only infants with normal and abnormal movement patterns, but also, within the abnormal classification, poor repertoire and cramped synchronized movements [7]. The relationship between the GMOS and the GMA depended on infants' age, except for "tremulous movements" items which occurred across infants from all ages with normal and abnormal general movements [7]. The GMOS is utilized both in clinical and research areas. It has demonstrated correlations with the GMA classifications, but poor prediction for the GMA at 3 months corrected age [9]. It is worth noting, however, that significant changes in the movement patterns occur during this period, and likely contribute to the poor prediction. The GMOS has also been used to evaluate short term neurological outcomes of clinical hypoxia [10,11] and of neonatal intervention [12].
Recently, Barbosa et al. [13] explored the GMOS's psychometric properties using Rasch analysis, a statistical method used in test development. The Rasch probabilistic model [14] places item difficulties and person abilities together on the same scale. Rasch analysis transforms ordinal data into linear measures with equal-interval units, called logits, which are used to describe the measures of both individuals and items. Different Rasch analysis models were fitted to the GMOS [13], with the Mixed Rasch Model (MRM) presenting the best overall fit, with good to optimum separation indexes and reliability coefficients. This suggests that the GMOS is a reliable, interval-level, unidimensional assessment that works differently for three distinct groups of infants, which are separated into classes. In MRM, each infant is assigned to the class to which she or he has the highest probability of belonging, based on their scoring on each detailed item, so that each class (or group of infants) presents a unique functioning rating scale and a different item hierarchy. MRM validated the GMOS as a quantitative assessment and proposed improvements, including deleting two items ("tremulous movement", "upper and lower extremities") and revising the scoring criteria of five others (neck, amplitude, and speed of upper and lower extremities). MRM did not, however, explain who the infants in each class are, making it difficult to establish clinical implications. Therefore, the purpose of this study was to identify the infants who belonged to each class [13] by exploring differences in specific clinical features, including age of assessment, type of brain injury, GMA classification, and clinical outcome at two years of age, and ultimately to understand the essence of the qualitative differences across the classes.
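The logit scale described above can be illustrated with a minimal sketch. It assumes the simplest (dichotomous) Rasch model, not the polytomous Mixed Rasch Model fitted to the GMOS in [13]; the function names and the numerical values are hypothetical and are not taken from the study.

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a positive response under the dichotomous Rasch model.

    Both arguments are expressed in logits; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def to_logit(p: float) -> float:
    """Convert a probability (0 < p < 1) into log-odds, i.e. logits."""
    return math.log(p / (1.0 - p))

# Hypothetical person and item measures (illustrative only).
p_equal = rasch_probability(ability=0.5, difficulty=0.5)   # ability equals difficulty
p_easier = rasch_probability(ability=1.5, difficulty=0.5)  # person is 1 logit above the item
```

Under these assumptions, a person whose ability equals the item difficulty has a probability of 0.5, and the log-odds of `p_easier` recover the 1-logit gap between person and item, which is what is meant by persons and items sharing one equal-interval scale.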

Experimental Section
Secondary data analyses were performed on previous work by Einspieler et al. [7] and Barbosa et al. [13]. Details of the 383 infants, each videotaped once following the standards of Prechtl's GMA, as well as the original score sheet and scoring procedures, are published elsewhere [7,13]. A list with the GMOS item names, scoring criteria, and hierarchical item structure (i.e., logits by class) is presented in Appendix A (adapted from Barbosa et al. [13]). The ethical boards of all centers involved [7] approved the recording and assessment of spontaneous movements. Data analysis was performed in compliance with protocols approved by the Institutional Review Board of the Medical University of Graz (ethical approval number 27-388ex 14/15). Institutional review was deemed not necessary for secondary data analysis by the ethical board of the home institution of the lead authors. Written informed consent was obtained from all participants prior to the study.
Frequencies and percentages were used to explain class composition and explore its implications for clinical practice. ANOVAs, cross-tabulations, and chi-square tests (SPSS version 21; SPSS Inc., Chicago, IL, USA) were used to compare class composition based on age of assessment, type of brain injury, GMA classification, and clinical outcome at two years of age.
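For readers less familiar with the chi-square test of independence, the following is a minimal sketch in plain Python. The analyses in this study were run in SPSS; the counts below are hypothetical and serve only to show how observed and expected frequencies enter the statistic.

```python
import math

def chi_square_independence(table):
    """Return (statistic, degrees of freedom, expected counts) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected

# Hypothetical counts (NOT the study data):
# rows = classes 1-3, columns = age group (<34 weeks PMA, >=34 weeks PMA).
counts = [[30, 70], [5, 95], [15, 85]]
statistic, dof, expected = chi_square_independence(counts)

# Closed-form upper-tail probability; valid only because dof == 2 here.
p_value = math.exp(-statistic / 2)
```

For tables with other degrees of freedom, the p-value requires the general chi-square distribution function, which statistical packages provide.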
Data Availability: The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.

Results
Table 1 shows the specific distribution of video recordings per gender, age of assessment, brain injury (images/clinical signs: presence of intraventricular hemorrhage (IVH), periventricular leukomalacia (PVL), or hypoxic ischemic encephalopathy (HIE)/Sarnat classification), GMA classification, and outcome at two years of age; Table 2 shows the ANOVA and chi-square results. There was a significant relationship between class assignment and age of assessment, both for weeks of age (continuous, p < 0.001) and for age groups (categorical, p < 0.001) (i.e., <34 weeks post menstrual age (PMA) vs. ≥34 weeks PMA). Infants in class 2 were older (40.59 weeks) than those in class 1 (37.6 weeks) and class 3 (38.73 weeks) (Table 2). The majority of the sample consisted of older infants. Most of the very preterm and moderate preterm infants (59.4%) belonged to class 1 (27.7% of the class). The late preterm, term, and post-term infants were similarly distributed among classes 1, 2, and 3 (31%, 36.7%, and 32.3%, respectively) (Tables 1 and 2).
The distribution of infants with distinct grades of brain injury (images/clinical signs: normal and grade 1 PVL/IVH/Sarnat classification vs. grades 2, 3, and 4 PVL/IVH/Sarnat classification, combined) was significantly different among the classes, p < 0.001. Class 1 had a similar number of infants with normal/grade 1 injury (49.5%) to infants with higher grade injuries (grades 2, 3, and 4) (50.5%). Class 2 had a higher proportion of infants with higher grade injuries (77%) and class 3 was mostly composed of infants with normal/grade 1 injury (78%).
The distribution of GMA classifications (poor repertoire, normal, cramped synchronized, and chaotic movements) differed among the classes, p < 0.001. Post hoc analysis demonstrated that all but chaotic movements were differently distributed across the classes, with standardized residuals greater than +/−1.96, p = 0.0055.
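The post hoc step can be sketched as follows. This is an illustration under stated assumptions: it computes adjusted standardized residuals, one common variant of the cell-wise residual (the exact residual type used in the SPSS output is not specified here), and the counts are hypothetical. The adjusted significance level reported in the text is not reproduced by this sketch.

```python
import math

def adjusted_residuals(table):
    """Adjusted standardized residual for every cell of an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        out = []
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            # Standard error of (observed - expected) under independence.
            se = math.sqrt(exp * (1 - row_totals[i] / n) * (1 - col_totals[j] / n))
            out.append((obs - exp) / se)
        residuals.append(out)
    return residuals

# Hypothetical counts (NOT the study data):
# rows = classes 1-3; columns = poor repertoire, normal, cramped synchronized.
counts = [[90, 5, 5], [25, 5, 70], [20, 75, 5]]
residuals = adjusted_residuals(counts)

# Flag cells whose residual exceeds the +/-1.96 threshold mentioned in the text.
flagged = [[abs(r) > 1.96 for r in row] for row in residuals]
```

A positive residual marks a cell with more infants than expected under independence, and a negative residual marks a cell with fewer, which is how over- and under-represented GMA categories within a class can be read off.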
The association between class and outcome at two years of age (i.e., normal, CP, and neurodevelopmental disorders other than CP) was examined by first combining outcome into typical and atypical among the three classes, which showed a predominant proportion of infants in class 3 with typical outcome compared to classes 1 and 2, p < 0.001. Next, we examined only the atypical infants from classes 1 and 2, which showed that class 1 had the most infants with other neurodevelopmental disorders (67.10%), while class 2 had the most with CP (70.65%), p < 0.001.
As there was an overlap between classes 1 and 2, we then further explored the interaction among age of assessment, brain injury, GMA, and outcome separately within these classes with cross-tabulations and chi-square (Table 3) with the intent to further differentiate and understand their uniqueness.
Class 1 was mostly composed of infants with poor repertoire (95.6%), resulting in a nonsignificant interaction between GMA and outcome, p = 0.074. Class 2, on the other hand, showed a significant interaction between GMA and outcome, p < 0.001. Post hoc analysis showed that the distribution of GMA varied among the outcomes for all but chaotic movements, with standardized residuals greater than +/−1.96, p = 0.0055. Although 18.7% of the infants with cramped synchronized movements in class 2 turned out to have other neurodevelopmental disorders, cramped synchronized movements were mostly related to the outcome of CP (81.3%). Moreover, 93.8% of the infants (61/65) in class 2 who were diagnosed with CP had cramped synchronized movements.
We then further explored the characteristics of the infants with poor repertoire within these two classes. The overall proportion of such infants in class 1 (95.6%) was much higher than in class 2 (26.8%). Nevertheless, there was a similar chance (approximately 50%) within both classes that these infants would have other neurodevelopmental disorders (Table 3). In class 1, the other half of the infants was roughly divided between normal outcome and CP; in class 2, only 8.7% of infants with poor repertoire turned out to have CP. Infants with poor repertoire in class 1 had brain lesions similarly distributed between less (48.4%) and more (51.6%) severe, whereas in class 2 there was a higher proportion of infants with more severe injuries (63.6%). Age also varied between classes 1 and 2 in infants with poor repertoire, with a larger proportion of older infants (≥34 weeks PMA) in class 2 (93.9%) than in class 1 (71%).
No interactions were found between GMA and brain injury, GMA and age of assessment, or brain injury and age of assessment (for all infants), in either class 1 or 2 (Table 3). The relationship between brain injury and outcome was significant for all infants in both classes. Class 1 presented a higher frequency of other neurodevelopmental disorders, both in the lower and higher injury groups (p = 0.043). Normal outcome, however, was more frequent in the lower grade injuries, and CP in the higher grade injuries. Class 2 had a higher proportion of infants with CP in both injury groups (42.1% and 73% in low and high injury, respectively, p = 0.045). Normal outcome and other neurodevelopmental disorders were more frequent in the lower grade injuries. Nevertheless, when the interaction between brain injury and outcome was considered given the GMA classification, it was no longer significant in either class 1, p = 0.072, or class 2, p = 0.452. Finally, we explored the relationship between brain injury and outcome distribution given the age groups (<34 and ≥34 weeks PMA). Class 1 showed a significant relationship for the younger infants, p = 0.013, but not for the older ones, p = 0.267. Younger infants with lower grade injuries had more normal outcomes and other neurodevelopmental disorders, whereas younger infants with higher grade injuries had a higher incidence of CP. In contrast, in class 2, a significant relationship was found for the older infants, p = 0.031, but not for the younger ones, p = 1.000. Older infants had higher grade injuries (62/80, 77.5%) and a higher frequency of CP (45/62, 72.5%).

Discussion
The validity of the GMOS as a quantitative assessment to describe different aspects of the quality of general movements was recently demonstrated [13]: MRM indicated the existence of three distinguishable groups of infants (i.e., classes) for whom the GMOS needs to be considered separately as the items' difficulties and infants' abilities are class dependent. Therefore, describing who the infants were in each class allowed for understanding the specific relationship between individual GMOS items and specific clinical features, including outcomes. This information could be very useful in a clinical setting, helping to introduce therapy to the infants who need it the most.

Class Formation
Exploring the commonalities and differences of infants in each class clarified their clinical relevance. The overall class composition is described next (also Tables 1 and 2). Class 1 was the most diverse class, with 27.7% younger infants (<34 weeks PMA), mixed grades of brain injury, and mostly infants with poor repertoire classifications in GMs. Half of these infants were later diagnosed with other neurodevelopmental disorders (the other half divided between normal and having CP) at two years of age. Class 2 was mostly older infants (≥34 weeks PMA) (95.1%), higher levels of brain injury, mostly cramped synchronized GMs, and infants later diagnosed with CP. Class 3 was mostly older infants (83.7%), no to grade 1 brain injury, normal GMs, and later diagnosed as presenting a normal outcome.
While infants of all ages might have poor GMOS performance, age of assessment was a factor influencing class formation. The interaction between class membership, presence of brain injury, GMA classifications, and outcome at two years of age was also statistically significant and helped guide the interpretation of how GMOS items work differently for each group of infants. The GMOS clearly differentiated infants with typical/normal outcome from those with atypical development (i.e., CP and other neurodevelopmental disorders). Class 3, with more infants with normal features, explains the results in Barbosa et al. [13], which showed the most favorable movement qualities for this class (i.e., higher GMOS total raw and logit scores, and more individual items scored in the highest score categories). Classes 1 and 2, in contrast, presented mixed groups of infants with some overlap between them. Examining the interaction among age of assessment, brain injury, GMA, and outcome separately within each of these classes (Table 3) allowed us to further appreciate their qualitative differences and demonstrated their unique relationship to atypical development.
Class 1's composition explains the normal-like distribution of GMOS scores we previously reported [13]. It also supports the literature in which a large proportion of preterm infants have transient abnormal general movements [15,16] (categorized as poor repertoire in particular) that can either normalize or deteriorate within the first few months of life and result in different outcomes [1,2,7,17,18] (i.e., normal, CP, attention deficit/hyperactivity disorder [ADHD], cognitive impairment, developmental coordination disorder, autism). An important clinical implication for class 1 is that these infants need to be closely monitored to differentiate between transient and enduring abnormality to guide intervention as necessary. Class 2's composition explains the least optimal quality of movements reported for this class [13]. This is also in agreement with previously reported lower total GMOS scores for post-term infants [7,9] and supports that outcome tends to be worse if infants do not normalize their movement patterns sooner [1]. This is not surprising, given that most infants in this class had cramped synchronized movements, which appear at approximately 35-36 weeks PMA [6,7] and concur with a lack of fluency and variability of GMOS movement components [7]. It also confirms the predictive value of cramped synchronized movements for the outcome of CP [19,20]. Given the highest reliability for this class in our previous study [13], its composition also suggests that the GMOS best separates infants with CP from the others (Table 2).
In addition, while supporting previous research in which infants with normal GMA tend to have normal outcome and infants with cramped synchronized GMA tend to have CP [7,20], our results provided further information on the development of infants with poor repertoire GMA. Half of the infants with poor repertoire tend to present other neurodevelopmental disorders at two years of age, regardless of which class they belonged to. The other half, however, developed differently, depending on which class they belonged to. The combination of poor repertoire with brain injury and age of assessment further contributed to clarifying the trajectory of infants with poor repertoire, separating them and providing preliminary information to predict their outcome. Having poor repertoire GMA concomitant with being younger was more associated with class 1, with a higher probability of presenting neurodevelopmental disorders other than CP, regardless of brain injury level. On the other hand, having poor repertoire, being older, and presenting higher levels of lesion was more related to class 2, with a higher chance of having CP.

Relationship between GMOS Scores (Total and Item Performance) and Clinical Populations
These results allow us to interpret how infants with different clinical features and outcomes scored on the GMOS total score and on each item. Detailed comparison of item difficulties across classes is particularly informative about the qualitative differences among the classes. Specifically, in the GMOS, this helps in understanding the differences in motor performance between different groups of infants. The focus of these comparisons is on the items that are relatively more difficult for one group but easier for the others. We encourage the reader to revisit Barbosa et al. [13] and further explore specific items' scores per class, noticing that the items' total scores (logits) are all different among the classes, with the easiest items located at the bottom of the table for each group of infants (Appendix A).
A practical example of this is the item "space: lower extremities". The difficulty order of this item is similar between infants with normal outcome (class 3) and those with other neurodevelopmental disorders (class 1), being 19th and 17th, respectively. The quality of the use of space varies between these classes, however, with infants with normal outcome scoring mostly as showing variability in space, and infants with other neurodevelopmental disorders scoring mostly with limited use of space [13]. For infants with CP (class 2), on the other hand, the item "space: lower extremities" was placed in a higher difficulty order (11th) than other items, although they also tended to have limited use of space and had a similar item location (i.e., logit) as infants in class 1. This suggests that while having similar logits, it is much harder for an infant later diagnosed with CP to have variable use of space than either infants later diagnosed with other neurodevelopmental disorders or infants with normal outcome. One clinical implication of this could be to observe the ease of progression (in repeated tests) in the lower extremities' use of space, which can help differentiate those who might develop CP (slower gain in motor control) from those who will go on to have other neurodevelopmental disorders. It might also help in planning intervention, by facilitating the behaviors beginning to emerge, possibly using a passive range of motion to provide variability to self-initiated movement while also offering environmental affordances that stimulate active movement (e.g., placing toys at different locations in relation to the infant's body to stimulate kicking).

A Word of Caution
All analyses were based on secondary and group data. New studies with prospective data and a wide range of the infant population are needed to investigate the modified scoring system and its application in enhancing the understanding of general movements' subtleties and prediction of outcome. To continue the research on the GMOS, now knowing it is a working metric scale, future steps would be to (1) examine potential GMOS cut-off scores to discriminate infants with different outcomes, and (2) identify who would improve or deteriorate in their components of movement over time by studying the GMOS developmental trajectories. This would allow one to further predict long-term neurodevelopmental outcome and to track subtle changes in the quality of movement resulting from intervention (rehabilitation and/or pharmacological interventions). Another interesting question to explore in the future is whether the combination of total GMOS scores (e.g., above or below the median) with GMA classification (e.g., poor repertoire) can assist with outcome prediction.

Conclusions
Our previous study demonstrated (using Rasch analyses/MRM) that the GMOS is a reliable unidimensional assessment with functioning rating scales and different item hierarchy for qualitatively differentiating infants, grouping them into three classes [13]. In the present study, we identified the clinical features of each class. Combining the results of both studies provides valuable information and supports the psychometric validity of the GMOS as a solid and reliable quantitative assessment that works differently for three clinically meaningful groups of infants, providing further content and discriminative validity evidence for the GMOS quantitative properties. The modified GMOS separated infants with typical (class 3) from atypical development (classes 1 and 2), and then further separated CP (class 2) from other neurodevelopmental disorders (class 1) based on the combination of their specific clinical features. While further predictive studies are needed, we recommend using the GMOS to evaluate infant motor performance in more detail and assist in referring an infant sooner for targeted intervention, during times of greater brain plasticity, when a greater impact of intervention may be seen.