The Effectiveness of Robot- vs. Virtual Reality-Based Gait Rehabilitation: A Propensity Score Matched Cohort

Robot-assisted gait training (RAGT) and virtual reality plus treadmill training (VRTT) are two technologies that can support locomotion rehabilitation in children and adolescents affected by acquired brain injury (ABI). The literature provides evidence of their effectiveness in this population. However, a comparison between these methods is not available. This study aims to compare the effectiveness of RAGT and VRTT for the gait rehabilitation of children and adolescents suffering from ABI. This is a prospective cohort study with propensity score matching. Between October 2016 and September 2018, all patients undergoing an intensive gait rehabilitation treatment based on RAGT or VRTT were prospectively observed. To minimize selection bias associated with the study design, patients who underwent RAGT or VRTT were retrospectively matched for age, gender, time elapsed from injury, level of impairment, and motor impairment using propensity score in a matching ratio of 1:1. Outcome measures were Gross Motor Function Measure-88 (GMFM-88), six-minute walking test (6MWT), Gillette Functional Assessment Questionnaire (FAQ), and three-dimensional gait analysis (GA). The FAQ and the GMFM-88 showed a statistically significant increase in both groups, while the 6MWT improved in the RAGT group only. GA highlighted changes at the proximal level in the RAGT group and at the distal level in the VRTT group. Although preliminary, this work suggests that RAGT and VRTT protocols foster different motor improvements, and therefore that coupling the two therapies may be advisable in the paediatric population with ABI.


Introduction
Acquired brain injury (ABI), occurring after a period of normal development, is one of the main causes of death and neurologic disability in children after infancy [1]. Motor impairments are common and often require prolonged assistance [2]. One of the primary rehabilitation goals for children and adolescents suffering from ABI is the improvement of walking ability, in terms of pattern, quality, and independence.
In recent years, standard gait rehabilitation has been complemented by technology-based treatments. Among others, robot-assisted gait training (RAGT) is widely used in the gait rehabilitation of adults with different diseases such as stroke, spinal cord injury, and multiple sclerosis [3][4][5][6], as well as of children and adolescents with neuro-motor impairment [7]. The most common gait rehabilitation robots available for the developmental age are exoskeletons. These devices operate mechanically on the human body by means of cuffs connected to the patient's lower limbs. There are wearable exoskeletons both for overground walking and for walking on a treadmill. Advantages of exoskeletons with respect to standard gait training are the repetitiveness of the movement, the intensity of the

Participants
Between October 2016 and September 2018 all consecutive patients undergoing an intensive gait rehabilitation treatment at the Scientific Institute E. Medea (Bosisio Parini, Italy) assisted by a robot or using a treadmill plus virtual reality system were prospectively observed. Inclusion criteria were: diagnosis of ABI; age between 4 and 20 years; a level of motor impairment ranging from I to IV, classified according to the Gross Motor Function Classification System (GMFCS); adequate comprehension and cooperation; and absence of visual impairment. Exclusion criteria were: severe muscle spasticity; injection of botulinum toxin in lower limbs during the 6 months prior to the enrollment; variation in oral skeletal muscle relaxant drug dose in the month prior to treatment; previous orthopedic surgery; a diagnosis of severe learning disabilities; behavioral problems; visual or hearing difficulties that would impact on function and participation.
This study was performed in accordance with the Declaration of Helsinki, and the Ethics Committee of Scientific Institute E. Medea approved the observational study protocol (protocol code: GIP355; date of approval: 23 September 2016). Patients or their parents provided written informed consent. The trial was registered in the repository of the Italian Ministry of Health (registration number: 001095).

Intervention
The intervention lasted one month and consisted of 20 45-min sessions of conventional physiotherapy and 20 45-min sessions of either RAGT or VRTT. Being an observational study, patients were assigned to RAGT or VRTT intervention on the basis of clinical decisions.
The conventional physiotherapy included stretching of hip flexor and hamstring muscles, muscle strengthening exercises (such as squats), static and dynamic balance training, postural transitions (such as sit-to-stand), and overground walking training with particular attention to gait smoothness, stability, and endurance.

Robot-Assisted Gait Training
RAGT was performed using the Lokomat® (Hocoma AG, Volketswil, Switzerland), an active lower limb exoskeleton (Figure 1A). During training, speed, body-weight support and guidance force were personalized for each patient to ensure active participation. The initial body-weight support was set at 50% and gradually decreased according to the individual's response to the intervention. The guidance force was initially set to 100% and then gradually reduced down to 5% above the automatic stop threshold. To engage subjects and to increase their active participation and motivation in gait practice, therapists provided frequent oral encouragement, and augmented performance feedback (implemented in the exergames) was used in all the sessions. A therapist, trained and certified by Hocoma, was always present during the training sessions.

Virtual Reality plus Treadmill Training
The VRTT was performed with the Gait Real-time Analysis Interactive Lab (GRAIL, Motek, Houten, The Netherlands), an immersive VR system for gait assessment and rehabilitation (Figure 1B). It is equipped with a dual-belt treadmill, a two-degree-of-freedom platform, and a 180° cylindrical screen where virtual environments are projected and synchronized with the treadmill and the subject. A Vicon motion-capture system (Oxford Metrics, Oxford, UK) equipped with 10 optoelectronic cameras (sample frequency 100 Hz) surrounds the system. Subjects interact with virtual environments through their movements, thanks to passive markers located on different body parts depending on the activity. The system returns visual, proprioceptive and auditory feedback to the subject to support rehabilitation.
The VRTT included exercises to improve walking and balance abilities in engaging VR environments, for example, by displaying joint kinematics in real time while walking through a forest, or by transferring load from one side of the body to the other to avoid obstacles while skiing. The training was highly personalized according to the motor and cognitive performance of each patient. Experienced physiotherapists, trained and certified by Motek, defined and performed the training sessions on the GRAIL system.

Assessment
Baseline measures included: patient's age at the beginning of the therapy and at occurrence of the ABI, time elapsed from injury, gender, etiology, motor impairment, intelligence quotient (IQ) and Gross Motor Function Classification System (GMFCS) level. GMFCS is a 5-level classification system describing the gross motor function of children and adolescents [27], and has been previously used to classify subjects with ABI [28].
Participants underwent a motor assessment before (T0) and at the end of the treatment (T1), which included the following outcome measures: Gross Motor Function Measure-88 (GMFM-88), which was selected as primary outcome, 6 min walking test distance (6MWT), Gillette Functional Assessment Questionnaire (FAQ), and 3-dimensional gait analysis (GA).
The GMFM-88 is a tool designed for the assessment of gross motor function in children and adolescents (under 18 years old) with cerebral palsy. It includes 88 items, each representing a particular movement or position, which span the spectrum of gross motor activities in five dimensions: A: Lying and rolling; B: Sitting; C: Crawling and kneeling; D: Standing; E: Walking, running, and jumping. The total score and dimensions D and E were considered in this study, since they are more closely related to the interventions. The validity of GMFM-88 in the evaluation of gross motor function in children with ABI has been previously demonstrated [29].
The 6MWT rates gait endurance during self-paced walking for 6 min along the hospital corridors. Standardized verbal instructions are given to the patient during the test, which include walking at a comfortable speed, turning 180° every 25 m and covering as much distance as possible within the time limit of 6 min [30].
The FAQ is a questionnaire that assesses levels of mobility during everyday life, in a 10-level classification [31]. The FAQ is administered by asking questions to the parents or the child him/herself.
GA provides a quantitative analysis of gait movement. The GA laboratory is equipped with an eight-camera optoelectronic system (Elite, BTS Bioengineering, Milan, Italy) with a sampling rate of 100 Hz, and two force plates (Kistler Group, Winterthur, Switzerland) embedded in the floor. Patients were asked to walk at their preferred speed, and to wear their orthoses and footwear only if they were unable to walk barefoot.

The Propensity Score Algorithm
A propensity score matching (PSM) algorithm was used to identify matched cohorts as a subgroup of the unmatched cohorts, and was defined as follows. First, covariates were selected among the baseline measurements under the hypothesis that they contribute to the choice of the treatment. Age, gender, time elapsed from injury, GMFCS and motor impairment were selected as covariates. Then, a logistic regression was performed to estimate the propensity scores, considering the intervention as outcome variable and the selected covariates as predictors. The matching between the RAGT group and the VRTT group was obtained by using the 1:1 nearest-neighbor procedure: each individual of the VRTT group was matched with one individual of the RAGT group in terms of propensity score, discarding individuals with propensity scores outside the range of the other group. Finally, to check the model adequacy, the standardized differences between the groups were computed before and after matching for continuous, dichotomous, and categorical variables, according to [25]. The PSM algorithm was developed in RStudio by means of the MatchIt library. The matchit() function was used with the method "nearest" to implement the 1:1 nearest-neighbor matching.
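The three steps described above (propensity estimation, 1:1 nearest-neighbor matching, balance check) can be illustrated with a minimal Python sketch. This is not the MatchIt implementation used in the study: the gradient-descent logistic fit, the greedy matching order (highest score first) and all function names are assumptions made for illustration, and covariates are assumed to be numerically coded and scaled beforehand. The standardized difference shown is the usual formula for a continuous variable only.

```python
import math
import statistics


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit a logistic regression by batch gradient descent.
    X: list of covariate rows (assumed scaled); y: 1 = VRTT, 0 = RAGT.
    Returns the weights with the intercept first."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = propensity(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def propensity(w, x):
    """Estimated probability of receiving VRTT given covariates x."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def match_nearest(ps_vrtt, ps_ragt):
    """Greedy 1:1 nearest-neighbor matching without replacement.
    ps_vrtt, ps_ragt: dicts mapping patient id -> propensity score.
    Patients whose score lies outside the range of the other group
    are discarded before matching."""
    lo = max(min(ps_vrtt.values()), min(ps_ragt.values()))
    hi = min(max(ps_vrtt.values()), max(ps_ragt.values()))
    vrtt = {k: v for k, v in ps_vrtt.items() if lo <= v <= hi}
    ragt = {k: v for k, v in ps_ragt.items() if lo <= v <= hi}
    pairs = []
    # Assumption: VRTT patients are visited from highest to lowest score.
    for pid, score in sorted(vrtt.items(), key=lambda kv: -kv[1]):
        if not ragt:
            break
        best = min(ragt, key=lambda k: abs(ragt[k] - score))
        pairs.append((pid, best))
        del ragt[best]  # each RAGT patient can be used only once
    return pairs


def standardized_difference(a, b):
    """Standardized difference of a continuous covariate:
    (mean_a - mean_b) / sqrt((var_a + var_b) / 2)."""
    pooled = math.sqrt((statistics.variance(a) + statistics.variance(b)) / 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled
```

In such a sketch, `standardized_difference` would be evaluated for each covariate on the full groups and again on the matched pairs, so that the reduction in imbalance can be inspected.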

Gait Parameters Extraction
For each GA assessment, an expert physiotherapist collected and processed at least five trials for the left and the right limbs using dedicated software (EliteClinic, BTS Bioengineering, Milan, Italy). The most representative trial was then selected for further analyses. BTS Smart Clinic software was used to extract spatio-temporal and kinematic data for each selected gait cycle.
Spatiotemporal features included: walking velocity, cadence, bilateral stride duration, and bilateral step length and width.
Kinematic curves were analyzed in Matlab by using an ad hoc algorithm designed to extract, for the right and left leg, the foot progression in stance, maximum and minimum flexion angle and the range of motion (ROM) in the sagittal plane for ankle, knee, and hip, and the ROM of pelvic tilt, obliquity, and rotation.
Furthermore, starting from kinematic data, the BTS Smart Clinic software automatically computed the Gait Deviation Index (GDI). The GDI was developed and validated by Schwartz and Rozumalski in 2008 [32]. It is defined as the scaled distance between 15 gait feature scores (selected as those that explain 98% of the data) for a subject and the average of the same 15 gait feature scores for a control group of typically developing children. Therefore, the GDI provides an overall assessment of the deviation from a physiological gait pattern. The GDI ranges from 0 to 100, where 100 indicates the absence of gait pathology [32].
For the GA parameters, the mean value between left and right side was considered.
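As an illustration of the sagittal-plane extraction and left/right averaging described above, the following Python sketch computes maximum, minimum and ROM from a joint-angle curve sampled over one gait cycle. The study used an ad hoc Matlab algorithm whose details are not reported; the function names and the sample curves here are hypothetical.

```python
def sagittal_features(curve):
    """Maximum, minimum and range of motion (max - min) of a
    joint-angle curve, in degrees, over one gait cycle."""
    hi, lo = max(curve), min(curve)
    return {"max": hi, "min": lo, "rom": hi - lo}


def side_average(left_curve, right_curve):
    """Mean of each feature between the left and the right side."""
    left = sagittal_features(left_curve)
    right = sagittal_features(right_curve)
    return {name: (left[name] + right[name]) / 2 for name in left}


# Hypothetical knee flexion curves (degrees), coarsely sampled.
left_knee = [5, 20, 60, 30, 10]
right_knee = [0, 25, 50, 35, 5]
knee = side_average(left_knee, right_knee)
```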

Statistics
The Kolmogorov-Smirnov test was run to test data distribution; since normality was not verified, non-parametric tests were used, and data were represented with median and interquartile range values.
A Mann-Whitney U test and a Pearson Chi-squared test were performed between groups on continuous and dichotomous/categorical baseline measures, respectively, before and after the propensity score matching.
Considering the matched cohorts, the time effect was evaluated independently in each group by comparing baseline and post-treatment scores by means of the Wilcoxon signed rank test. The effect of the intervention (group effect) was evaluated by comparing the pre-post changes of each outcome between the two groups using the Mann-Whitney U test.
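The structure of this analysis can be sketched in Python as follows: the time effect uses paired pre/post scores within one group, whereas the group effect compares the pre-post changes of the two groups. The study ran these tests in SPSS; the sketch below only computes the rank-based test statistics (with midranks for ties) and deliberately omits p-values. Function names and example values are hypothetical.

```python
def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def wilcoxon_signed_rank(pre, post):
    """Time effect within one group: sums of the ranks of positive and
    negative paired differences (zero differences are dropped)."""
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    ranks = midranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


def mann_whitney_u(changes_a, changes_b):
    """Group effect: U statistics computed on the pre-post changes
    of two independent groups."""
    ranks = midranks(list(changes_a) + list(changes_b))
    n_a, n_b = len(changes_a), len(changes_b)
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a
```

A statistics package would then convert these statistics into exact or approximate p-values.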
Finally, when the minimal clinically important difference (MCID) was available in the literature, the percentages of patients who exhibited a clinically important change (pre-post improvement above the MCID) were computed for the two groups. Similarly, the percentages of patients experiencing a worsening above the MCID were computed. A Pearson Chi-squared test was performed between groups to look for differences in terms of improved, stable and worsened patients. For 6MWT and GMFM-88 (and its items D and E), MCID values were set at 30 m and 5, respectively, as suggested by [33]. For step length, MCID was set at 0.2 m as defined by [34], while for gait kinematics in the sagittal plane MCID was set at 5° as proposed in [35].
The statistical analysis was performed in SPSS v21. The significance level was established at p < 0.05.
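The MCID-based classification described above can be sketched in Python as follows. The sketch assumes an outcome for which higher values are better (as for 6MWT distance or GMFM-88 score) and treats a change as relevant only when it is strictly above the MCID; the function names and example distances are hypothetical.

```python
def classify_change(pre, post, mcid):
    """Label one patient's pre-post change against the MCID."""
    delta = post - pre
    if delta > mcid:
        return "improved"
    if delta < -mcid:
        return "worsened"
    return "stable"


def change_percentages(pre_values, post_values, mcid):
    """Percentage of improved, stable and worsened patients in a group."""
    labels = [classify_change(a, b, mcid)
              for a, b in zip(pre_values, post_values)]
    return {name: 100.0 * labels.count(name) / len(labels)
            for name in ("improved", "stable", "worsened")}


# Hypothetical 6MWT distances in metres, with the 30 m MCID.
pre_6mwt = [300, 250, 400, 320]
post_6mwt = [340, 260, 365, 350]
summary = change_percentages(pre_6mwt, post_6mwt, mcid=30)
```

The resulting counts per category for the two groups are what a Chi-squared test would then compare.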

Unmatched Cohort
The unmatched cohort was composed of 70 patients allocated into two groups in a non-randomized way: 39 were allocated to the RAGT group and 31 to the VRTT group. The IQ evaluation was available for 57 patients (28 in the RAGT group, 29 in the VRTT group). Due to equipment availability, or the inability of patients to perform a test, 6MWT was performed in 32 patients in the RAGT group and 28 patients in the VRTT group, FAQ was present for 38 patients in the RAGT group and 15 patients in the VRTT group, GMFM was performed in 34 patients in the RAGT group and 27 patients in the VRTT group, and GA was available for 20 patients in the RAGT group and 26 patients in the VRTT group. There were no dropouts in the study (Figure 2).
Differences between groups were found in the time elapsed from injury, severity of the impairment, motor impairment and etiology (Table 1). Gender, age, and IQ did not show significant differences between the two groups.

Matched Cohorts
The PSM algorithm identified 15 patients in each group. The median and interquartile range of the baseline measures in the matched cohorts are shown in Table 2. No statistically significant differences in any of the baseline variables were observed. The standardized differences (SD) of baseline measurements before and after the matching are shown in Figure 3. The SD was reduced after the matching (mean value 0.7 ± 0.6 before, 0.4 ± 0.3 after the matching procedure), except for age at injury and gender, for which it was already quite small in the unmatched cohort.

Outcomes in Matched Cohorts
Figure 4 shows GMFM, 6MWT, and FAQ in the matched groups, before and after treatment. Both groups presented statistically significant improvement for the primary outcome, with the GMFM-88 increasing in the RAGT group (Wilcoxon signed rank test p = 0.003) as well as in the VRTT group (p = 0.009). Furthermore, both groups showed statistically significant improvements in GMFM dimensions D (Wilcoxon signed rank test p = 0.005 for RAGT and p = 0.018 for VRTT) and E (Wilcoxon signed rank test p = 0.002 for both RAGT and VRTT). The percentages of patients with clinically relevant changes in GMFM-88, GMFM-D and GMFM-E were 54%, 62%, and 69% in the RAGT group and 21%, 29% and 50% in the VRTT group. Nobody experienced a worsening in his/her gross motor abilities. The FAQ significantly increased in both groups (Wilcoxon signed rank test p = 0.017 for RAGT and p = 0.046 for VRTT). The 6MWT improved significantly in the RAGT group (Wilcoxon signed rank test p = 0.003), with 53% of patients showing clinically relevant changes and 7% of patients showing a worsening above MCID, and showed a trend of improvement (p = 0.056) in the VRTT group, with 43% of patients showing clinically relevant improvements and 0% showing clinically relevant worsening.
No differences in the therapy effect were found, as demonstrated by the Mann-Whitney U test (all p-values > 0.070).
Table 3 shows spatiotemporal parameters in the matched cohorts: step length and stride length significantly improved only in the RAGT group. However, the percentage of patients with clinically relevant changes in the step length was equal to 8% in both groups. In contrast, the GDI had a statistically significant improvement only in the VRTT group. No statistically significant differences between the two groups were found in any of the analyzed parameters, as shown by the last column, which reports the pre-post change for each group and the p-values obtained with the Mann-Whitney U test. The kinematic measures evaluated with the GA highlighted that each treatment targeted different joints (see Table 4). Specifically, the VRTT group experienced improvement at the foot and ankle level. The minimum ankle flexion improved above MCID in 31% of patients in the VRTT group and in 25% in the RAGT group. ROM of ankle flexion showed clinically relevant changes in 23% of patients in the VRTT group, while in the RAGT group 17% of patients had clinically relevant improvement and 8% had clinically relevant worsening. Considering the minimum knee flexion, the VRTT group showed a significant worsening, with 0% of patients improving and 15% getting worse. However, the same parameter improved in 8% of patients and worsened in 33% of patients in the RAGT group.
In contrast, ROM of knee flexion significantly improved in the RAGT group, with 42% of patients improving above MCID, while 23% of patients improved and 8% worsened in the VRTT group. The RAGT group also significantly improved in the ROM of hip flexion, with 33% of patients above MCID (vs. 23% in the VRTT group) and 8% with a significant deterioration, the same worsening obtained in the VRTT group. Considering the pelvis, only the RAGT group showed statistically significant improvements. No significant differences between the two interventions were found, as shown by the group effect analysis, reported in the last column. Table 5 reports the number of improved, stable, and worsened patients in each group. No significant differences between the two groups were found.

Discussion
The main aim of this study was to compare two different interventions for gait rehabilitation in children and adolescents with ABI, one exploiting RAGT using the Lokomat device and one using immersive VRTT with the GRAIL system. The assignment of a patient to a treatment depended on clinicians' decisions and thus produced unmatched cohorts. Therefore, a retrospective matching procedure was mandatory before comparing the efficacy of the two interventions.
Data analysis performed on the matched cohorts revealed that gross motor abilities significantly improved in both groups. However, considering specifically the GMFM-88 and GMFM-D, there was a trend towards a higher percentage of patients in the RAGT group gaining a clinically relevant change. Results also showed that similar percentages of patients in each group improved their endurance and their step length above MCID, even though the improvements were statistically significant only for the RAGT group. Therefore, patients who underwent RAGT treatment had a slightly higher functional gain. Regarding gait analysis, data confirmed a beneficial effect of RAGT at the proximal level (i.e., pelvis and hip) and a positive effect of VR on distal districts (i.e., foot and ankle) and on the overall gait pattern quality. Both treatments had little effect on the knee joint. The statistical analysis did not find any differences between the two interventions.
These results are in accordance with previous studies describing the effectiveness of RAGT and VRTT on the locomotion of children with ABI. Indeed, previous studies on RAGT described a proximal-to-distal differential effect on the lower limbs [15] and an improvement of the main functional measures [14,16]. Interestingly, in the current work, no changes were observed in terms of gait speed and knee district, with only a small percentage of patients showing improvements in terms of knee range of motion. This may be due to differences in the participants' severity: the cohorts investigated in this work were composed of patients with mild impairment (i.e., 28 patients with GMFCS II, one patient with GMFCS I and only one patient with GMFCS III), while previous works by Beretta and collaborators also included patients with GMFCS III and IV, i.e., with more severe motor impairment and thus with more potential for improvement. Furthermore, this work confirms results obtained in previous preliminary studies showing the effectiveness of VR on gross motor abilities and on distal joints of children and adolescents suffering from ABI [21,22].
The results of this work suggest that RAGT treatment and VRTT treatment are both effective, although they work on different districts and competencies. This could be explained by the type of intervention performed with the two devices. On the one hand, exoskeletons for the lower limbs provide several repeated movements with a fixed kinematic trajectory: this is of course beneficial in terms of endurance but eliminates variations in the kinematics, which are fundamental for therapy-mediated motor re-learning. Therefore, the reduced sensory feedback might explain the small improvement of the gait pattern [36]. Furthermore, no improvements at the distal level were observed in the RAGT group, and this can be explained by the fixation of the ankle joint. On the other hand, VRTT is not susceptible to exoskeleton constraints and provides training of the lateral weight shift considering the natural variability in leg and pelvis kinematics. Furthermore, it allows for a task-oriented training that focuses on the practice of skilled motor performance (i.e., locomotion), fostering neural reorganization [37]. This may leave more room for improvement in the locomotion pattern.
This work has some limitations. First, although the initial cohorts were quite large compared to traditional studies involving RAGT or VR rehabilitation in the developmental age, the use of the PSM caused a reduction of the sample size, which was small in the matched cohorts (N = 30). Indeed, the PSM has been used in several studies with large sample sizes by matching cases between groups (see as examples [38,39]), but only once in a small sample size [40]. Nevertheless, PSM is a powerful tool that enables excellent matching of baseline characteristics and thus mimics randomization.
A second limitation is that standardized differences remained moderate after the matching, likely due to the small sample size. However, in small matched samples, moderate SD could still be consistent with a correctly specified PSM [25].
A third limitation is related to the use of PSM. Although adjustment was made for several variables, it is possible that residual confounders between the groups could have been omitted in the analysis. Nevertheless, in this study, many covariates were used in the propensity model thus maximally reducing baseline differences between cohorts.
Another issue is the absence of a follow-up assessment, and therefore we could not observe effects in the medium-or long-term. However, the main goal of this study was to assess possible differences between two gait interventions based on advanced technologies, and a pre/post study is a first step in this direction.
Finally, the age range of the participants was quite broad and the small sample size did not allow age stratification.
Despite such limitations, the authors believe that this is a valuable and novel work, even if addressing a niche topic, that can provide suggestions and open new perspectives on the use of rehabilitation technologies in the developmental age.
Future work will use larger sample sizes, compare the intervention effect at different ages and investigate long-term benefits. Finally, considering that a percentage (about 10%) of patients in both groups experienced worsening in some variables, it remains to be investigated which patient features and which environmental components or emotional/psychological aspects determine the response to treatment.

Conclusions
This work compared the effectiveness of two interventions (i.e., RAGT and VRTT) for the gait rehabilitation of children and adolescents suffering from ABI. To our knowledge, this is the first time that these rehabilitation technologies have been compared in paediatric populations. Recently, more and more rehabilitation technologies have come on the market. Each of them has its indication for use but this is often too generic, e.g., for gait rehabilitation, and does not provide specific indications on target users. Thus, it is difficult for a clinician to choose the best device for each patient. This work would like to contribute in this direction. The approach used and the results obtained, although preliminary, pave the way for the definition of guidelines for the treatment of children and adolescents suffering from ABI. Our observations suggested that RAGT and VRTT protocols foster different motor improvements, with RAGT inducing an improvement in terms of endurance and proximal joint kinematics and VRTT enhancing gait pattern and distal joint kinematics. Therefore, a good approach could be to couple the two interventions in order to achieve a more complete recovery of walking ability.

Informed Consent Statement:
Informed consent was obtained from all subjects or their parents involved in the study.

Data Availability Statement:
The authors confirm that the data supporting the findings of this study are available within the article. The complete raw data that support the findings of this study are available at 10.5281/zenodo.4630814.