A Repertoire of Virtual-Reality, Occupational Therapy Exercises for Motor Rehabilitation Based on Action Observation

Abstract: There is a growing interest in action observation treatment (AOT), i.e., a rehabilitative procedure combining action observation, motor imagery, and action execution to promote the recovery, maintenance, and acquisition of motor abilities. AOT studies have employed basic upper limb gestures as stimuli, but, in principle, the AOT approach can be effectively extended to more complex actions like occupational gestures. Here, we present a repertoire of virtual-reality (VR) stimuli depicting occupational therapy exercises intended for AOT, potentially suitable for occupational safety and injury prevention. We animated a humanoid avatar by fitting the kinematics recorded from a healthy subject performing the exercises. All the stimuli are available via a custom-made graphical user interface, which allows the user to adjust several visualization parameters like the viewpoint, the number of repetitions, and the observed movement's speed. Beyond providing clinicians with a set of VR stimuli promoting via AOT the recovery of goal-oriented, occupational gestures, such a repertoire could extend the use of AOT to the field of occupational safety and injury prevention.


Summary
Traditionally, motor rehabilitation has been pursued through motor practice, whereas recent evidence has shed light on the potential of covert approaches to promote the recovery of motor abilities [1,2]. Within this class of methods, the so-called action observation treatment (AOT) has proved extremely valuable in driving the motor rehabilitation pathway towards better outcomes and faster recovery [3]. The principle underlying AOT is based on the mirror mechanism, i.e., the neural mechanism allowing an external stimulus (visual or acoustic) to activate a part of the motor system (about 30% of motor neurons in premotor and motor parietal cortices [4]) as if the observer were performing that action himself or herself [5]. Such a "backdoor" entry to the cortical motor system allows therapists to maintain the excitability of the neural motor system even in the presence of altered motor capacities [6], ultimately preserving the motor skills of the individual [7].
Data 2022, 7, 9 2 of 10
Another major advantage intrinsic to AOT is that the to-be-observed stimuli can be designed with any degree of complexity and individualization, unlike the simplicity that often characterizes traditional rehabilitation exercises. Potentially, one could envision future motor rehabilitation procedures that stem from the patient's daily activities and aim to rehabilitate not only the functioning of the injured body part but also, more generally, the actions most often performed in his/her daily life.
A paradigmatic example in this perspective is represented by occupational activities, which often require the repeated execution of specific manual gestures every day. To name a few, carpenters, construction workers, and cleaners all possess a motor repertoire specific to their work activity, whose fast restoration after an injury is crucial for increasing the patient's quality of life and relieving him/her and the caregivers from the burden of a long-lasting rehabilitative treatment. Inspired by the possibility that AOT could meet this need, in the framework of a collaboration between the Institute of Neuroscience of the National Research Council of Italy and INAIL (the Italian National Institute for insurance against occupational injuries), we designed a set of 18 visual stimuli depicting occupational therapy exercises intended for upper-limb motor rehabilitation of specific workers.
A basic solution would have been to videotape and upload 18 videos of the chosen occupational gestures. However, modern technologies offer several solutions to maximize the efficacy of AOT [3,8,9]. We therefore opted for designing stimuli in virtual reality, offering a more immersive and engaging experience to the patient. The advantages brought by VR are not limited to the patient, as the therapist can individualize fundamental parameters like the viewpoint [10], the number of repetitions, and the speed of the observed movement. In our dataset, all these aspects can be intuitively controlled via the graphical user interface. In addition, as the dataset contains the entire Unity project, researchers can manipulate the animations according to their needs, e.g., reproducing the environment familiar to their patients or the objects they most often interact with, or replacing the humanoid avatar.
The primary aim of sharing this dataset is to provide clinicians with a set of VR stimuli promoting via AOT the recovery of goal-oriented, occupational gestures. Sharing it would also enable rehabilitation teams lacking programming and technical skills to administer VR AOT, as the stimuli are ready to be deployed on low-cost devices. In turn, this latter aspect would enlarge the experimental sample of people receiving AOT for occupational gestures and make the procedures more homogeneous, in line with recent recommendations in the field [11].
Moreover, such a repertoire could become essential for AOT-based procedures in occupational safety. Indeed, AOT can also be used outside clinical settings, especially for maintaining motor skills in at-risk populations, like workers dealing with hazardous gestures [3]. Extending AOT to occupational injury prevention would considerably broaden the applicability of our dataset to daily procedures.

Data Description
The dataset is downloadable from https://doi.org/10.5281/zenodo.5592131 (accessed on 26 October 2021). In the Supplementary Materials, we report detailed instructions on how to run the Unity project in editor mode. The dataset contains a single Unity project including 19 scenes, i.e., 18 scenes reproducing the upper limb exercises and the main scene. The latter serves as a graphical user interface from which every exercise can be launched and its parameters adjusted. In the next paragraphs, we describe these two elements separately.

Graphical User Interface
Figure 1 reports the main scene with the initial graphical user interface. It is an interactive scene allowing the user to set the parameters for the exercise visualization. Inputs are collected through the keyboard, specifically the arrow keys: the left/right keys navigate through the different controllers, and the up/down keys scroll through the values of a given menu. Each exercise is further associated with a letter of the QWERTY keyboard (from q to k), which, when pressed, immediately selects the desired exercise.
Once the exercise has been selected, the user can choose the visual perspective from which the movement is displayed through the top-right drop-down menu (Camera). Three different perspectives can be selected:

1. VFront: the camera is placed in front of the avatar performing the movement; such a third-person view makes the observer perceive the movement as made by another individual.
2. VLat: the camera is placed laterally to the avatar, ensuring better visibility of the sagittal plane; here too, the observer perceives the movement as made by another individual.
3. VSubj: the camera frames the movement from a position dynamically following the avatar's nasion. In this way, the observer perceives the movement as made by him/herself. To limit possible motion sickness and guarantee a good degree of immersion, the camera also follows the rotation of the observer's head.
Finally, two additional sliders positioned at the bottom of the screen allow the user to regulate the movement speed and the number of repetitions to display. These controllers are intended to allow the operator to adjust the difficulty of the observe-and-execute task, with faster movements and fewer repetitions representing the more challenging configuration.
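As a minimal sketch of how the settings described above fit together, the Python snippet below models the exercise shortcut keys, the three camera perspectives, and the two sliders. It is not taken from the Unity project: the names `ExerciseSettings`, `exercise_index_for_key`, and `is_harder`, the default values, and the assignment of letters to exercise numbers are hypothetical and chosen only for illustration. The one observation it relies on is that reading a QWERTY keyboard row by row from q to k yields exactly 18 letters, one per exercise.

```python
from dataclasses import dataclass

# Letters from q to k, reading a QWERTY keyboard row by row (18 keys).
EXERCISE_KEYS = tuple("qwertyuiop" + "asdfghjk")

# Camera perspectives named in the text.
CAMERA_VIEWS = ("VFront", "VLat", "VSubj")


@dataclass
class ExerciseSettings:
    """Hypothetical container for the parameters exposed by the GUI."""
    exercise_key: str = "q"
    camera: str = "VFront"
    speed: float = 1.0      # assumption: 1.0 stands for the recorded speed
    repetitions: int = 5    # assumption: illustrative default only

    def __post_init__(self):
        if self.exercise_key not in EXERCISE_KEYS:
            raise ValueError(f"unknown exercise key: {self.exercise_key!r}")
        if self.camera not in CAMERA_VIEWS:
            raise ValueError(f"unknown camera view: {self.camera!r}")
        if self.speed <= 0 or self.repetitions < 1:
            raise ValueError("speed must be positive and repetitions >= 1")


def exercise_index_for_key(key: str) -> int:
    """Map a shortcut key to a 1-based exercise index (assumed ordering)."""
    return EXERCISE_KEYS.index(key) + 1


def is_harder(a: ExerciseSettings, b: ExerciseSettings) -> bool:
    """True if `a` is at least as fast as `b`, with no more repetitions,
    and differs from `b` in at least one of the two parameters."""
    same = (a.speed, a.repetitions) == (b.speed, b.repetitions)
    return a.speed >= b.speed and a.repetitions <= b.repetitions and not same


easy = ExerciseSettings(exercise_key="q", speed=0.5, repetitions=10)
hard = ExerciseSettings(exercise_key="k", camera="VSubj", speed=1.5, repetitions=3)
print(exercise_index_for_key(hard.exercise_key), is_harder(hard, easy))
```

Validating the settings at construction time mirrors the role of the GUI, which only lets the operator pick values from predefined menus and sliders.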
Once all parameters are set, the selected exercise can be started by pressing the play button (shortcut: the space key).

Scene Description
Table 1 reports the exercise list and a detailed description of the performed movements. The eighteen occupational therapy exercises were agreed upon in collaboration with the physiotherapists of the INAIL motor rehabilitation center in Volterra, Italy. They are primarily intended for the motor rehabilitation of the upper limb, with a particular focus on the shoulder joint. Typically, these exercises are included in the rehabilitative pathway of patients following proximal humerus fractures, surgical repair of rotator cuff tears, clavicle fractures, scapula fractures, and acromioclavicular joint injuries.
Exercises can be unimanual or bimanual, with the latter requiring a higher level of coordination. The difficulty varies according to the operational range in which movements take place. For this reason, movements were divided into five levels of difficulty, as depicted in Figure 2. Some of the proposed exercises can occur in different operational ranges, and thus they are proposed in multiple versions.
Exercises were planned with various levels of difficulty to allow the design of incremental treatments. Indeed, in the initial rehabilitation stage, the therapist can choose basic exercises limited to the low/middle operational ranges, while more difficult exercises can be gradually introduced over the treatment course. Finally, the table lists some working activities whose occupational duties resemble each specific exercise.

Quality Assessment
To verify the quality of the VR stimuli produced by our pipeline, we administered an online behavioral questionnaire to 35 naive participants. They observed the video of each exercise and were then asked, for each video, to give a quantitative judgment about (a) the goodness/quality of the overall movement, (b) the realism of the postural attitude of the avatar, (c) the fidelity in the animation of the hand-object interactions, and (d) the realism of the scenario surrounding the avatar. These scores are intended to provide a quality assessment of each video, returning not only a general quality index (a) but also targeting the quality of specific aspects of the animation, namely, the whole-body posture (b), the hand-object interaction (c), and the context in which the action takes place (d).
All four questions required a score ranging from 1 (very low/bad) to 5 (very high/good), with a score of 3 representing a neutral judgment. For data analysis, we considered the 35 scores collected for each video and feature and conducted one-sample t-tests against a reference mean of 3. In other words, we evaluated whether each 35-element distribution was significantly larger than the sufficiency threshold (i.e., 3). The resulting p-values were Bonferroni corrected (n = 72, i.e., 18 videos × 4 features) to account for multiple comparisons and reduce the false-positive rate. All these results are reported in Table 2 to let the reader appreciate the perceived fidelity of the individual videos.
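The analysis described above can be sketched in a few lines of Python. The snippet is illustrative only: the ratings are made-up numbers rather than the collected data, the function names are hypothetical, the test is assumed to be one-sided (scores greater than 3), and the p-value is obtained by numerically integrating the Student's t density rather than by calling a statistics package. The correction factor is 72 because each of the 18 videos is rated on 4 features.

```python
import math
from statistics import mean, stdev


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_sf(t: float, df: int, steps: int = 2000) -> float:
    """Upper-tail probability P(T > t), using Simpson's rule on [0, |t|]."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                # integral of the density from 0 to |t|
    upper = max(0.0, 0.5 - area)        # clamp tiny negative rounding errors
    return upper if t >= 0 else 1.0 - upper


def one_sample_t_greater(scores, mu0=3.0):
    """One-sample t-test of H1: mean(scores) > mu0. Returns (t, p)."""
    n = len(scores)
    t = (mean(scores) - mu0) / (stdev(scores) / math.sqrt(n))
    return t, t_sf(t, n - 1)


def bonferroni(p: float, n_tests: int) -> float:
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)


N_TESTS = 18 * 4  # 18 videos x 4 rated features = 72 comparisons

# Made-up ratings from 35 raters for one video/feature pair (1-5 scale).
example_scores = [4] * 20 + [3] * 8 + [5] * 5 + [2] * 2
t_stat, p_raw = one_sample_t_greater(example_scores)
p_adj = bonferroni(p_raw, N_TESTS)
print(f"t = {t_stat:.2f}, raw p = {p_raw:.2g}, corrected p = {p_adj:.2g}")
```

A test would then count as significant when the corrected p-value falls below the chosen threshold (0.05 is assumed here as a conventional choice; the text does not state it).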
Statistical results revealed that 60 out of the 72 tests (83%) were significant, supporting the overall quality of the stimulus dataset. In terms of videos, twelve stimuli were significantly above 3 in all the scores, while the six remaining videos had non-significant results spread across multiple features. In terms of the explored features, the scores attributed to the scenarios were significantly higher than 3 in all the videos, indicating that subjects systematically perceived the scenarios surrounding the avatar as realistic, while each of the other features presented low scores for some of the stimuli. Examining the individual stimuli (see Table 2), the paint-rolling stimulus presented a non-significant overall quality score, while the weightlifting stimulus had a non-significant score concerning the avatar posture. In general, more critical results were reported for the other four stimuli (housework, office work, sawing off high branches, screw), with multiple scores not reaching statistical significance for each of these stimuli.
In light of these results, it is worth discussing the possible underlying reasons. One caveat of this questionnaire is that we conducted the experiment via an online platform, with participants watching 2D videos depicting the whole movement. This aspect is not negligible, as it could explain some of the spatial discrepancies perceived in the videos. For instance, the contact between the tools (e.g., the paint-roller or the saw) and the surrounding objects (e.g., the wall or the branch) was defined by using colliders (i.e., built-in Unity components that provide collision detection between virtual objects). In other words, taking paint-rolling as an example, the paint-roller/wall contact is ensured for the entire action duration. However, when switching to a 2D rendering, some videos lose part of the spatial fidelity between the objects.
In summary, the results of the behavioral validation suggest that the majority of the developed stimuli faithfully reproduce the chosen occupational gestures. Although the results are satisfactory overall, six of the stimuli can be improved; this number might decrease when the stimuli are experienced in virtual reality, i.e., the environment in which our dataset should express its highest potential.

Methods
To make the VR stimuli as realistic as possible in terms of movement kinematics, we performed a high-density recording of the whole-body kinematics of a healthy volunteer (24 years old) during an in-lab execution of the 18 exercises. The performance was monitored and reviewed by physiotherapists to ensure compliance with the requirements. According to the Edinburgh Handedness Inventory (Oldfield, 1971) [12], the subject is right-handed. The study was conducted according to the principles expressed in the Declaration of Helsinki of 1975, revised in 2008. The participant provided written informed consent before the experimental sessions. The local ethics committee approved the study (Comitato Etico dell'Area Vasta Emilia Nord, 10084, 12.03.2018).
Kinematic data were acquired via multiple systems (Figure 3). The motion-capture system (inertial sensors, Biomech, Awinda, XSens, The Netherlands) consists of a hardware station (Figure 3A) connected to a dedicated PC via a USB cable. The station is equipped with an antenna that enables wireless communication with the motion trackers positioned on the subject. The motion trackers (Figure 3B) are miniature inertial measurement units embedding 3D linear accelerometers, 3D rate gyroscopes, 3D magnetometers, and a barometer. They are placed on the subject's skin with the dedicated straps, according to the manufacturer's protocol reported in Figure 3C. In particular, 17 sensors were positioned on the subject to capture the movement of the following 23 body segments: head, neck, eighth and tenth thoracic vertebrae, third and fifth lumbar vertebrae, right and left shoulder, right and left arm, right and left forearm, right and left hand, pelvis, right and left thigh, right and left shank, right and left foot, and right and left forefoot.
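As a quick consistency check on the sensor setup, the short Python sketch below enumerates the body segments exactly as they are listed above and confirms that they add up to 23. The variable names are hypothetical, and the snippet does not model how the 17 sensors map onto the segments.

```python
# Segments without a left/right counterpart, as listed in the text.
MIDLINE_SEGMENTS = [
    "head", "neck",
    "thoracic vertebra 8", "thoracic vertebra 10",
    "lumbar vertebra 3", "lumbar vertebra 5",
    "pelvis",
]

# Segments that appear once per body side.
BILATERAL_SEGMENTS = [
    "shoulder", "arm", "forearm", "hand",
    "thigh", "shank", "foot", "forefoot",
]

SEGMENTS = MIDLINE_SEGMENTS + [
    f"{side} {name}"
    for name in BILATERAL_SEGMENTS
    for side in ("right", "left")
]

N_SENSORS = 17  # number of inertial sensors reported in the text

print(f"{N_SENSORS} sensors, {len(SEGMENTS)} segments")
```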
Additionally, the Manus Prime II Xsens gloves (Manus, The Netherlands) were used to track hand and finger movements (Figure 3D). Such gloves support 11 degrees of freedom for each finger by incorporating industrial-grade flex sensors fused with inertial measurement units, so that even the finest motor movements are tracked. Before the exercise execution, a calibration procedure was performed to ensure the correct temporal and spatial alignment between the recording sensors.
The experimental subject was required to perform the 18 exercises reported in Table 1. During each exercise, the subject had to interact with objects identical, or at least similar, to those shown in the VR stimuli, to ensure that the adopted kinematics were suitable for the required exercise.
The Xsens MVN Analyze software was used to compute the orientation and position data for each body joint and finger. A detailed description of the biomechanical model used can be found in paragraph 23.5 ("Anatomical Model") of the MVN user manual available online (see https://www.xsens.com/hubfs/Downloads/usermanual/MVN_User_Manual.pdf, accessed on 7 January 2022). The processed data containing the position and orientation of all the segments were exported into a joint/bone hierarchy, usually named the skeleton, which defines the bones inside the mesh and their reciprocal movements. The MVN Analyze software allows kinematic data to be exported as an FBX file containing the skeleton in a format compatible with Unity. Such a file was used to animate humanoid models.
The FBX files containing the motion capture data were imported into the Unity 3D game engine (version 2019.4.13f1). The data were then used to animate a rigged humanoid avatar (available at https://renderpeople.com/free-3d-people/, accessed on 7 January 2022; selected from the 3D ANIMATED PEOPLE list, FBX file named "rp_nathan_animated_003_walking").
The animations were then isolated from the FBX files and, if necessary, edited through the Unity animation tool to remove artifacts or adjust the postures of the body segments. All the animation files can be found in the Kinematics folder within the project. Subsequently, a single scene was implemented for each exercise.
For each exercise, colliders were placed on the avatar's hand and on the objects to ensure a proper rendering of the hand/object interaction. Once a collision occurs between the two, the object's position follows the finger kinematics until the object is released. The timing of the object release is controlled through delegated events on the animations.
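The attach-and-release behavior described above can be illustrated with a small, engine-independent Python sketch. It does not reproduce the project's Unity scripts: the class and method names (`GraspableObject`, `on_collision`, `on_release_event`, `update`) are hypothetical, collision detection is reduced to a distance threshold, a single hand position stands in for the finger kinematics, and the release is triggered by a plain method call standing in for an event raised by the animation.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class GraspableObject:
    position: Vec3
    grab_radius: float = 0.05            # assumed collider size, in metres
    held: bool = False
    armed: bool = True                   # prevents re-grabbing right after release
    offset: Vec3 = (0.0, 0.0, 0.0)

    def on_collision(self, hand: Vec3) -> None:
        """Hand collider touches the object: remember the offset and follow."""
        self.offset = tuple(o - h for o, h in zip(self.position, hand))
        self.held = True

    def on_release_event(self) -> None:
        """Stand-in for a timed animation event: stop following the hand."""
        self.held = False
        self.armed = False

    def update(self, hand: Vec3) -> None:
        """Per-frame update: detect contact, then follow the hand while held."""
        if not self.held:
            inside = math.dist(self.position, hand) <= self.grab_radius
            if inside and self.armed:
                self.on_collision(hand)
            elif not inside:
                self.armed = True        # hand has moved away; allow a new grab
        if self.held:
            self.position = tuple(h + o for h, o in zip(hand, self.offset))


obj = GraspableObject(position=(0.0, 0.0, 0.0))
hand_path = [(0.2, 0.0, 0.0), (0.02, 0.0, 0.0), (0.1, 0.1, 0.0), (0.3, 0.2, 0.0)]
for frame, hand in enumerate(hand_path):
    if frame == 3:
        obj.on_release_event()           # release timed at a chosen frame
    obj.update(hand)
print(obj.held, obj.position)
```

Storing the hand-to-object offset at the moment of contact keeps the object from snapping to the hand's origin, which is one simple way to keep the grasp visually consistent while the object is carried.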
The Unity project is ready to be deployed on the most common commercial VR head-mounted displays, since the Oculus and Steam VR plugins have already been imported into the project. Moreover, the project was developed for the Android platform, and it is also available in editor mode, allowing the end-user to modify the contents according to specific needs. The graphical user interface was designed to take keyboard input. While wired keyboards are used in editor mode and with wired headsets, a Bluetooth keyboard can be paired with standalone head-mounted displays.

User Notes
Download all the Unity project folders and subfolders and open them with Unity version 2019.4.13f1. When installing Unity, make sure to also install the Android Build Support. The project is ready to be opened in editor mode. Open the main scene and run the project. Press a letter from q to k, and the corresponding scene will appear. Use the right arrow to move across the selectors and set the desired parameters. Once the desired setting appears, press "play" with the space key. The GUI will disappear, and the animation will begin. To stop the animation and return to the GUI, press the space key again.
To create an application for a standalone head-mounted display (e.g., Oculus Quest), in the Main Scene hierarchy, select the child "Panel" of the "MainUI" canvas, and in the script "World2ScreenUISwich" click on the "worldspace" field. The Oculus prefab for the camera setting (OVR Camera Rig) is already in the hierarchy. Connect the Oculus Quest to a computer through a USB cable. Put on the Quest and enable USB debugging for this computer. Open the Build Settings window in Unity and click on "Build and Run." For a quick view of the content of each scene, the folder "Recording" within the Unity project contains eighteen videos, each showing one exercise from the three different perspectives.

Conclusions
The dataset provides a repertoire of virtual-reality (VR) stimuli depicting occupational therapy exercises intended for AOT. Eighteen occupational therapy exercises were rendered by fitting the kinematics of a healthy subject onto a humanoid avatar. An online validation of the perceived quality of the videos was performed, indicating that the videos were perceived as satisfactory in terms of overall movement quality, realism of the avatar's postural attitude, fidelity in the animation of the hand-object interactions, and realism of the scenario surrounding the avatar.
We strongly believe that such a dataset could stimulate interest among clinicians who practice occupational therapy daily and aim to integrate conventional rehabilitation paradigms with action observation treatment. To date, while the scientific and clinical community that could take advantage of AOT is quite extensive, the technical difficulties implicit in the creation of VR AOT stimuli have prevented a similarly extensive diffusion of VR AOT in clinical practice. Sharing our dataset is intended to overcome such limitations, at least in the field of occupational therapy. Dedicated clinical trials are needed to evaluate whether an action observation treatment accompanying conventional rehabilitation favors a faster recovery of occupational gestures, shortening the time of return to work.
In addition, such a repertoire of VR AOT stimuli could be employed in non-clinical settings, like occupational safety and prevention. Indeed, the same principles underlying the efficacy of AOT during motor rehabilitation apply to the maintenance of motor skills in people at risk of events with dramatic consequences, like workers dealing with hazardous gestures during their working routine.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are openly available in Zenodo at https://doi.org/10.5281/zenodo.5592131, accessed on 7 January 2022.