Design and Analysis of Cloud Upper Limb Rehabilitation System Based on Motion Tracking for Post-Stroke Patients

In order to improve the convenience and practicability of home rehabilitation training for post-stroke patients, this paper presents a cloud-based upper limb rehabilitation system based on motion tracking. A 3-dimensional reachable workspace virtual game (3D-RWVG) was developed to achieve meaningful home rehabilitation training. Five movements were selected as the criteria for rehabilitation assessment. Analysis was undertaken of the upper limb performance parameters: relative surface area (RSA), mean velocity (MV), logarithm of dimensionless jerk (LJ) and logarithm of curvature (LC). A two-headed convolutional neural network (TCNN) model was established for the assessment. The experiment was carried out in the hospital. The results show that the RSA, MV, LC and LJ could reflect the upper limb motor function intuitively from the graphs. The accuracy of the TCNN models is 92.6%, 80%, 89.5%, 85.1% and 87.5%, respectively. A therapist could check patient training and assessment information through the cloud database and make a diagnosis. The system can realize home rehabilitation training and assessment without the supervision of a therapist, and has the potential to become an effective home


Introduction
At present, about 2,000,000 people suffer a stroke each year in China, with approximately 66 percent surviving, usually with motor function deficits [1]. Due to the increasing number of stroke patients and the relatively small number of professional physicians, the task of upper limb rehabilitation training and assessment is arduous.
For many patients, hospitals are relatively far away. Therapists can only provide one-to-one treatment. Home upper limb rehabilitation training is a more flexible method; it can provide patients with timely assistance, according to the needs of patients. Therefore, it is very meaningful to design an effective home rehabilitation training and evaluation method.
Virtual game technology has developed rapidly and become an effective tool for rehabilitation training of post-stroke patients at present [2][3][4][5]. Preliminary studies have shown that a virtual reality game system can provide subjects with a high level of enjoyment and increase motivation to engage in activity [6]. Virtual reality can make the measurement more immersive, and improve the enthusiasm and participation of the subjects. In addition, a virtual game system is very natural and active; it can provide some potential benefits for therapists and patients to supplement the lack of traditional methods [7]. Mobini et al. designed a ball game to measure the hand movements and assess the intra-session and inter-session reliability of upper limb performance indices in patients [8]. Adams

Platform
The architecture of the motion-tracking cloud-based rehabilitation system is shown in Figure 2. It comprises a patient side, a cloud platform and a therapist side. On the patient side, the Kinect v1.0 [27] was used to collect the motion information of the patient to control the movement of an avatar in a 3D-RWVG scene. The rehabilitation training scene was a virtual game in the 3-dimensional reachable workspace, which could stimulate the movement of the upper limb. The rehabilitation assessment scene contained a guidance action video, recorded by a therapist, for evaluation of the upper limb workspace. The video, audio and patient information, the evaluation of the upper limb performance parameters, and the assessment results from the TCNN were stored on the cloud platform as the rehabilitation training and assessment results. On the therapist side, the therapist could check the results of patient rehabilitation training and assessment. According to the information from the cloud platform, the therapist could analyze the effectiveness of the patient's training and guide the next rehabilitation training. The therapist had full access to all the patients' information, while each patient only had access to his or her own information. Unity3d was used to design the reachable workspace experimental platform on the computer. The Asset Store packages Kinect with MS-SDK (Microsoft Software Development Kit) and iTween Visual Editor were used to build the virtual scene.

The Cloud System Architecture
The cloud upper limb rehabilitation system was established based on the browser/server structure. The browser/server is a distributed client server architecture in which the client computer has access to many server software applications distributed across the network simply by installing a browser. The browser interface is uniform and minimizes the time and cost of training. Logically, it is divided into three layers: client, application server (Web server) and data server. The client is mainly responsible for human-computer interaction, including some graphics and interface operations related to data and applications. The Web server is mainly responsible for the centralized management of client applications. The application server is mainly responsible for the centralized management of application logic, that is, transaction processing. Application servers can also be divided into several types according to the specific business they handle. The data server is mainly responsible for the storage and organization of data, the distributed management of database, the backup and synchronization of database, etc.
In the cloud rehabilitation training and evaluation system, the client is the interface between the user and the whole system, and is the input and output interface for data. The Web server starts the corresponding process to respond to the client's request, carries out the business processing, and then returns the result to the client layer. If the request submitted by the client includes access to data, the Web server logs on to the database server and cooperates with it to complete the processing. Unity uses HTTP (Hypertext Transfer Protocol) to interact with the Web server to realize telecommunication between stroke patients and hospitals.
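The client-to-Web-server exchange described above can be illustrated with a minimal sketch. The paper's client is written in Unity; the Python below is only an analogue, and the endpoint path `/sessions`, the JSON field names and the function name `build_upload_request` are hypothetical, since the source does not describe the server's actual interface.

```python
import json
from urllib import request


def build_upload_request(base_url, patient_id, session):
    """Build (but do not send) an HTTP POST carrying one training session.

    The URL layout and the payload fields are illustrative assumptions,
    not the interface of the system described in the paper.
    """
    body = json.dumps({"patient_id": patient_id, "session": session}).encode("utf-8")
    return request.Request(
        url=f"{base_url}/sessions",          # hypothetical endpoint
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` would then hand it to the Web server, which, in the three-layer structure above, would forward any data access to the data server.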

The 3-Dimensional Rehabilitation Training Virtual Game
A time-limited virtual game can stimulate the participants' motivation and a task-oriented virtual game can exercise the subject's responsiveness. Therefore, the 3-dimensional reachable workspace virtual game was developed to encourage post-stroke patients to stretch their upper extremities and swing their arms to train their upper limbs.
The game scene was designed on a sea floor environment with the aim of obtaining fish and gold coins. Figure 3a shows the game scene. In the figure, the side view is in the lower left corner, and the subject's image is in the lower right corner. The avatar, coral, seaweed, stones, and fish models (i.e., various kinds of fish, including clownfish, green turtle, starfish) are added to the scene. In order to make the scene more vivid and increase the immersive sense of participants, special effects are added in the scene, including an atomization effect, rising bubbles, swaying plants in the water, and fish with a swimming animation.
The Kinect can track the human skeleton and obtain the coordinates of 20 skeletal joints in real time. In the 3D-RWVG, the skeleton of the subject identified by the Kinect was bound to the skeleton of the virtual character model, as shown in Figure 3b, in which each joint in the left figure corresponds to a joint of the avatar, so that the subject could control the character model to play the game. Considering the age of the patients, mirror control was used for convenience. This means that the avatar displayed on the screen reflects the subject's arm movements, as if the subject were standing in front of a mirror.
Collision detection is conducted between the hand and the fish or gold coins. When the hand collides with a fish, the fish disappears and gold coins are created at the fish's position; when the hand then collides with the gold coins, the score increases. To make the game more interesting, additional gold coins are dropped randomly from above. To provide multisensory feedback, various sounds were added, namely background music, falling gold coins, voice prompts and scoring cheers. An additional window was set up to show the distance in the depth direction between the fish (or coin) and the subject. The game has three levels (fast, medium, and slow), defined by the swimming speed of the fish, and an appropriate level is selected according to the condition of the patient. In this game, the subject swings his or her arm to catch fish and collect coins; at the same time, arm kinematics are collected by the Kinect to analyze the motion performance parameters.
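The catch rule described above can be sketched as follows. This is a minimal illustration that assumes a simple sphere-overlap test; the class `GameObject`, the function names and the collision radii are hypothetical and are not taken from the Unity implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class GameObject:
    kind: str          # "fish" or "coin"
    position: tuple    # (x, y, z) in game-space units
    radius: float      # hypothetical collision radius


def collides(hand_pos, hand_radius, obj):
    """Sphere-sphere overlap test between the hand and one object."""
    return math.dist(hand_pos, obj.position) <= hand_radius + obj.radius


def step(hand_pos, objects, score, hand_radius=0.05):
    """Apply the catch rule for one frame and return (objects, score)."""
    remaining = []
    for obj in objects:
        if not collides(hand_pos, hand_radius, obj):
            remaining.append(obj)                      # untouched object stays
        elif obj.kind == "fish":
            # A caught fish disappears and leaves a coin at its position.
            remaining.append(GameObject("coin", obj.position, obj.radius))
        else:
            score += 1                                 # a collected coin scores
    return remaining, score
```

In this sketch a fish must be touched once to turn it into a coin and the coin must be touched on a later frame to score, which mirrors the two-stage description in the text.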
Occlusion may occur when patients sit in front of the Kinect and perform upper limb movements, as shown in Figure 3c. Occlusion occurs when the hand, the elbow joint, the shoulder joint and the Kinect lie on the same straight line; the Kinect and each joint are treated as points. The positions of the Kinect and of each upper limb joint when occlusion occurs are shown in Figure 3c.
The angle between the arm and the horizontal plane is α, and the angle between the arm and the vertical plane is β. When occlusion occurs, the following method is used to reduce the acquisition of erroneous signals. Before the patient starts to exercise, the joint information of the upper limb is collected, and the lengths of the upper arm and forearm are calculated as $L_{se}$ and $L_{eh}$, respectively. The locations of the occluded points can then be calculated according to Equation (1), where the coordinates of each joint are hand (Hx, Hy, Hz), elbow (Ex, Ey, Ez) and shoulder (Sx, Sy, Sz), and K stands for the Kinect.
The paths of the fish were designed according to the upper extremity motions in references [28,29]. In the game, fish come into sight randomly from three directions: upper, left, and right. When the distance between a fish and the avatar equals the length of the tested arm, the fish swims along the designed path. Figure 4a shows the designed paths; the coordinate system of the game space is determined based on the Kinect's coordinate system, with the shoulder joint used as the reference point. The paths were designed separately in four directions: vertical, horizontal, oblique, and backward. The datum planes of the path division are the horizontal plane and the vertical plane passing through the shoulder joint, similar to the latitude and longitude of the earth. Figure 4b shows a subject playing the fish game. The sphere is the reachable workspace of the subject; the red space curve is the planned path of the fish. To catch a fish the subject must extend his or her arm.

Rehabilitation Assessment Module Data Analyses
To reduce occlusion during movement, five different movements were selected as the evaluation criteria for home rehabilitation, based on the movements in the Fugl-Meyer test, the action research arm test (ARAT) and reference [30], as shown in Figure 5.

Preprocessing
A sixth-order low-pass Butterworth filter with a cut-off frequency of 30 Hz was used to filter the trajectory data. The shoulder center is taken as the origin. To eliminate offsets and scaling during data acquisition, all joint coordinates were normalized by the following equations.
where $X_0(x_0, y_0, z_0)$ are the original coordinates, $SC(x_{sc}, y_{sc}, z_{sc})$ are the shoulder center coordinates, $X_k(x_k, y_k, z_k)$ are the coordinates after normalization, $\mu$ is the mean, and $\sigma$ is the standard deviation.
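One plausible reading of this normalization is sketched below: each joint is first expressed relative to the shoulder center and then standardized with its mean and standard deviation. The function name `normalize_joints`, the array layout, and the choice to compute the statistics per joint and per axis over time are assumptions made for illustration, since the normalization equations themselves are not reproduced here.

```python
import numpy as np


def normalize_joints(joints, shoulder_center):
    """Shift joints to a shoulder-center origin, then z-score them.

    joints:          array of shape (T, J, 3), T frames and J joints
    shoulder_center: array of shape (T, 3)
    Statistics are taken over time for each joint and axis (an assumption).
    """
    joints = np.asarray(joints, dtype=float)
    sc = np.asarray(shoulder_center, dtype=float)
    centered = joints - sc[:, None, :]           # remove the positional offset
    mu = centered.mean(axis=0, keepdims=True)
    sigma = centered.std(axis=0, keepdims=True)
    sigma = np.where(sigma == 0, 1.0, sigma)     # avoid division by zero
    return (centered - mu) / sigma               # remove the scale
```

After this step every coordinate series has zero mean and unit standard deviation, which removes differences in sitting position and body size between subjects.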

Reachable Workspace Relative Surface Area (RSA)
The ideal upper limb reachable workspace is part of a sphere whose radius is the length of the extended arm. The reachable workspace is divided into four quadrants (I, II, III, and IV), each of which corresponds to a quarter of a sphere. The shoulder joint is defined as the coordinate origin, the horizontal direction is the direction in which the arm extends horizontally, and the vertical direction is parallel to the direction in which the subject stands. The avatar is mirror controlled by the subject, so the coordinate system and the four quadrants are shown mirrored in Figure 6. The right hand is selected as an example to analyze the reachable workspace, and the left hand can be analyzed in the same way. The collected 3D upper extremity motion trajectory is analyzed using the methods described previously [21,28], with some improvements.
Briefly, according to the trajectory of the upper extremity, the sphere is fitted by the least squares method to determine the parameters of the working space sphere. Then, the trajectory data are projected into the spherical coordinate system, and the maximum boundary of the trajectory is determined by the α-shape geometry. Catmull-Rom splines are applied to smooth the boundary. Next, all information is projected back into Cartesian coordinates to extract the corresponding accessible surface patches. The application of a triangular mesh avoids the problem of jagged edges, making the surface boundary smoother than before.
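The first two steps of this pipeline, the least squares sphere fit and the projection into spherical coordinates, can be sketched as follows. This is an algebraic (linear) least squares fit, which may differ from the exact formulation in [21,28]; the function names and the axis convention used for azimuth and elevation are assumptions.

```python
import numpy as np


def fit_sphere(points):
    """Algebraic least-squares sphere fit.

    Uses |p|^2 = 2 c.p + (r^2 - |c|^2), which is linear in the
    unknowns (c, r^2 - |c|^2). Returns (center, radius).
    """
    p = np.asarray(points, dtype=float)
    A = np.column_stack([2.0 * p, np.ones(len(p))])
    b = (p ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = sol[:3]
    radius = np.sqrt(sol[3] + center @ center)
    return center, radius


def to_spherical(points, center):
    """Project points into (azimuth, elevation, r) about the fitted center."""
    d = np.asarray(points, dtype=float) - center
    r = np.linalg.norm(d, axis=1)
    azimuth = np.arctan2(d[:, 1], d[:, 0])
    elevation = np.arcsin(np.clip(d[:, 2] / r, -1.0, 1.0))
    return azimuth, elevation, r
```

The boundary extraction (α-shape), spline smoothing and triangular meshing steps would then operate on the azimuth and elevation values; they are not sketched here.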
The quadrants and the total accessible surface area are calculated. The relative surface area is the ratio of the actual surface area to the total surface area, which can be calculated by Equation (6) to facilitate the comparison between different subjects. Therefore, the value of RSA is between 0.0 and 1.0, where 1.0 represents the area of the entire sphere.
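As a hedged restatement of the ratio described above, and not necessarily the exact form of Equation (6), let $A_q$ denote the reached surface area in quadrant $q$ and $A_{\mathrm{total}}$ the total surface area used for normalization (both symbols are introduced here for illustration):

```latex
\mathrm{RSA}_q = \frac{A_q}{A_{\mathrm{total}}}, \quad q \in \{\mathrm{I}, \mathrm{II}, \mathrm{III}, \mathrm{IV}\},
\qquad
\mathrm{RSA} = \sum_{q} \mathrm{RSA}_q, \qquad 0 \le \mathrm{RSA} \le 1 .
```

Because both areas scale with the square of the arm length, the ratio is independent of arm length, which is what allows subjects of different sizes to be compared.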

Smoothness Analysis
In the rehabilitation training, it is important that the affected limb can carry out a task smoothly and quickly [4,25,31,32]. In the 3D reachable workspace, the movement smoothness can reflect the coordination and recovery of the patient's upper limbs. In this paper, the mean velocity (MV), logarithm of dimensionless jerk (LJ), and logarithm of curvature (LC) are selected as further performance indices to assess movement smoothness of affected upper limbs.
Mean velocity is the mean value of the hand velocity. The logarithm of the curvature is the logarithm of the median of the path curvature, and the curvature-based measure enables quantification of the motion irregularity. The logarithm of dimensionless jerk is the median of the dimensionless jerk of the hand's path. It does not depend on amplitude and duration, but instead reflects changes in the shape of the motion and general deviations in smoothness. They are defined as follows:
where (X, Y, Z) is the three-dimensional position of the tested hand collected by the Kinect, $V_i$ is the speed of the hand at the $i$-th data sample, $N$ is the number of samples, $t_1$ is the start time and $t_2$ is the finish time of the $i$-th motion, and $V_{mean}$ is the mean velocity over the whole movement.
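The three indices can be sketched numerically as below. The formulas are common textbook forms and are assumptions rather than the paper's exact equations: the dimensionless jerk is taken as the time integral of squared jerk scaled by $(t_2 - t_1)^3 / V_{mean}^2$, and the curvature as $|v \times a| / |v|^3$. The function name `smoothness_indices` is hypothetical.

```python
import numpy as np


def smoothness_indices(positions, dt):
    """Return (MV, LJ, LC) for hand positions of shape (N, 3) sampled every dt s."""
    p = np.asarray(positions, dtype=float)
    v = np.gradient(p, dt, axis=0)               # velocity
    a = np.gradient(v, dt, axis=0)               # acceleration
    j = np.gradient(a, dt, axis=0)               # jerk
    speed = np.linalg.norm(v, axis=1)
    duration = dt * (len(p) - 1)                 # t2 - t1

    mv = speed.mean()                            # mean velocity

    # Assumed dimensionless jerk: integral of |jerk|^2, scaled by T^3 / MV^2.
    jerk_sq = (j ** 2).sum(axis=1)
    dj = jerk_sq.sum() * dt * duration ** 3 / mv ** 2
    lj = np.log(dj)

    # Path curvature |v x a| / |v|^3, skipping samples with near-zero speed.
    cross = np.linalg.norm(np.cross(v, a), axis=1)
    moving = speed > 1e-9
    curvature = cross[moving] / speed[moving] ** 3
    lc = np.log(np.median(curvature))
    return mv, lj, lc
```

For a circular arc of radius 0.5 traversed at constant speed, the curvature is 2 everywhere, so LC is close to ln 2; a perfectly straight, constant-speed path would give zero jerk and zero curvature, for which both logarithms are undefined.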

Convolutional Neural Network Assessment
Convolutional neural networks (CNNs) are commonly used in the analysis of images and time series. Some researchers have used different CNNs for image-based human action classification [33][34][35][36]. However, to our knowledge, CNNs have not previously been used in upper limb assessment.
We define a two-headed convolutional neural network (TCNN) to assess upper limb performance at home for each movement. The shoulder abduction (90°) motion is taken as an example to analyze the TCNN structure, as shown in Figure 7.
In order to better analyze the characteristics of the upper limbs, the trajectory data of the shoulder joint, elbow joint, wrist joint, hand joint and shoulder center joint were collected. These data were preprocessed by the methods described above. There were 12 variables at each time step and 120 time steps, that is, 120 data points on the timeline, so each sample contained 120 × 12 features. The input data shape was 244 × 120 × 12, and the test data shape was 80 × 120 × 12.
The TCNN had two heads, each consisting of several layers. The left head had three 1D convolutional layers with 16, 32 and 64 filters, respectively; the kernel size of each layer was three, and the activation was ReLU (rectified linear unit). Each convolutional layer was followed by a MaxPooling1D layer with a pool size of 2, and a dropout layer was placed after each MaxPooling1D layer to prevent overfitting. The learned features were then flattened into one long vector. The right head had the same structure as the left one, except that its kernel size was five. Before the prediction was made, the outputs of both heads were concatenated within the model and interpreted by a fully connected layer with ReLU activation. The network was optimized with Adam (adaptive moment estimation), a variant of stochastic gradient descent. The loss function was categorical_crossentropy. The model was trained for 50 epochs with a batch size of 32.
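The structure described above can be sketched in Keras as follows. Several details are assumptions because the text does not state them: both heads share one input, convolutions use the default 'valid' padding, the pool size is 2, and the dropout rate, the width of the fully connected layer and the number of output classes (`dropout`, `dense_units`, `n_classes`) are placeholder values. The helper `head_output_length` only reproduces the layer-size arithmetic and needs no deep learning library.

```python
def head_output_length(n_steps, kernel_size, n_blocks=3, pool_size=2):
    """Sequence length left after n_blocks of Conv1D ('valid') + MaxPooling1D."""
    length = n_steps
    for _ in range(n_blocks):
        length = length - kernel_size + 1    # Conv1D, stride 1, no padding
        length = length // pool_size         # MaxPooling1D, stride = pool size
    return length


def build_tcnn(n_steps=120, n_features=12, n_classes=3,
               dropout=0.5, dense_units=100):
    """Two-headed 1D CNN; n_classes, dropout and dense_units are assumed values."""
    from tensorflow.keras import layers, Model   # imported lazily on purpose

    inputs = layers.Input(shape=(n_steps, n_features))
    heads = []
    for kernel_size in (3, 5):                   # left head, right head
        x = inputs
        for filters in (16, 32, 64):
            x = layers.Conv1D(filters, kernel_size, activation="relu")(x)
            x = layers.MaxPooling1D(pool_size=2)(x)
            x = layers.Dropout(dropout)(x)
        heads.append(layers.Flatten()(x))
    merged = layers.concatenate(heads)
    hidden = layers.Dense(dense_units, activation="relu")(merged)
    outputs = layers.Dense(n_classes, activation="softmax")(hidden)
    model = Model(inputs, outputs)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Under these assumptions the left head reduces the 120 time steps to 13 and the right head to 11, so the flattened vectors have 13 × 64 = 832 and 11 × 64 = 704 features before concatenation. Training would then call `model.fit(..., epochs=50, batch_size=32)` with one-hot encoded labels.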
Because neural networks are stochastic, different results may be obtained even with the same data and the same configuration. This is a feature of the network, which provides adaptive capabilities for the model, but requires a slightly more complex approach to evaluation. We therefore evaluated the model 10 times and recorded the score of each run. The mean and standard deviation of these scores were used to evaluate the performance of the model.
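The repeated-evaluation procedure can be sketched as below. The callable `evaluate_once` is a hypothetical stand-in for training and testing one freshly initialized model, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics


def summarize_runs(evaluate_once, n_repeats=10):
    """Run a stochastic evaluation n_repeats times; return (mean, std, scores).

    evaluate_once: zero-argument callable returning one accuracy score.
    The sample standard deviation is used here (an assumption).
    """
    scores = [evaluate_once() for _ in range(n_repeats)]
    return statistics.mean(scores), statistics.stdev(scores), scores
```

A small standard deviation across the runs indicates that the reported mean accuracy is not an artifact of one favorable random initialization.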

Experiments
The experiment was conducted in the Nanjing Tongren Hospital in Nanjing, China. A total of 35 patients were selected to participate in the experiment. The exclusion criteria were cognitive impairment or inability to cooperate. All subjects volunteered to take part in the experiments and signed written informed consent. The local scientific and ethics committee approved this work.
Before the experiment, patients were instructed in the use of the system until all were familiar with it. In the experiment, each patient used the cloud rehabilitation system 10 times, and a therapist assessed the motor score of the Fugl-Meyer assessment (FMA) at the same time. The motion trajectories and the FMA scores were recorded as the dataset used to establish the TCNN model.
A questionnaire was conducted to investigate the feedback of the subjects on the use of the system. The questions were designed to examine five aspects: practicality, attractiveness, convenience, operability and interest. The highest score for each item was five, and the lowest was one.

Subjects
The total FMA score for upper limbs is 66. In order to facilitate the experiment and conform to the purpose of this study, home rehabilitation training and evaluation, the FMA score of the selected patient is higher than 10. Patients with FMA scores below 10 can hardly move their arms.
The detailed information about the patients is shown in Table 1. It contains the test arm side, the stroke reason (cerebral hemorrhage or infarction), as well as the information of each patient. Figure 8 shows the rehabilitation scene in Nanjing Tongren Hospital. The questionnaire survey result is shown in Table 2. As can be seen from the table, the score of each item is more than four and the standard deviation is small, indicating that the patients were relatively satisfied with the system. Most patients found the system very interesting.

Reachable Workspace Results
To analyze the RSA of the reachable workspace, the motion paths of the subjects were collected in the 3-dimensional rehabilitation training virtual game. The 3D surface is divided into four quadrants: area1, area2, area3 and area4. Figure 9 shows the intuitive graphical visualization of the 3D reachable workspace for different post-stroke patients. In Figure 9a, representing patient 1 with an FMA score of 96, the RSA distribution between the quadrants is as uniform as in healthy subjects. The arm of patient 2 (Figure 9b) mainly moves in the sagittal plane and the lateral coronal plane, but the force on the medial side is small. For patient 3 the accessibility at the top of the quadrant is slightly reduced (Figure 9c), and the hand could not reach quadrant I. Patient 4 (Figure 9d) has almost no force to lift his arm to the upper quadrants. Due to muscle weakness, the hand of patient 5 (Figure 9e) only moves within the ipsilateral quadrant of the lower side, that is to say, the affected limb is mainly active in quadrant IV.
As can be seen from Figure 9, as the FMA score decreases, the RSA decreases, especially in quadrants I, II, and III, and the trajectory appears wild and bumpy in the intuitive view. Thus, as upper limb impairment increases, the smoothness of the trajectory in the upper limb reachable workspace is reduced, the movement quality decreases and the movement becomes clumsy. The RSA of the patient's arm can be obtained to facilitate the understanding of the patient's condition.

Smoothness Analysis Results
The further performance indices are analyzed for a trajectory selected from the shoulder abduction (90°) motion. Figure 10 shows the motion trajectory and the analysis of velocity (V), the mean of LJ (MLJ) and the mean of LC (MLC). The patient's hand movement trajectory is clearly clumsier than that of the healthy subject, and the changes in V are more irregular in the patient than in the healthy subject. The MLJ of the patient (7.2611) is greater than that of the healthy subject (2.8654). The MLC was 2.0029 for the patient and 1.59 for the healthy subject, so the difference in MLC is small. However, when focusing on the small values of LC, the difference is obvious. Because the trajectory is a back-and-forth movement, there is an obvious difference between the MLC values of each single pass. Overall, the patient's motion fluctuates more than that of the healthy subject. The curves of the performance indices thus reflect the motor ability of the affected upper limb.

Convolutional Neural Network Assessment Results
To assess the motor function of the affected upper limb, the TCNN models were established for the five motions. The 10-time evaluation accuracy for the motion shoulder abduction (90°) is shown in Figure 11. The mean is 92.6% and the standard deviation is 0.0088. This indicates that the recognition result of the TCNN model is stable. Figure 12 shows the loss and accuracy on training and validation data in 50 epochs. This graphic indicates that after 25 epochs, the classification results tend to be stable, and the accuracy of the model is 93.75%. The accuracy of the five motion models is shown in Figure 13. The model accuracy means for motions 1-5 are 92.6%, 80%, 89.5%, 85.1% and 87.5%, respectively.
The shoulder adduction model has the lowest accuracy (80%), because many patients often place the affected arm on the leg, which may affect the accuracy of the model to some extent. For the motion shoulder flexion (180°), the model accuracy was 89.5%. During arm flexion, occlusion may occur when the arm is raised to shoulder level.
Although there was some occlusion in the motion "Hand to lumbar spine", the position information of each joint changed in the direction of depth, which made up for this flaw. The accuracy of the classification model is acceptable (85.1%).

Discussion
To relieve the burden on occupational therapists in the rehabilitation of upper extremity motor function, our study provides a relatively practical, interesting and simple way to carry out home rehabilitation training and assessment, namely, the cloud-based upper limb rehabilitation system based on motion tracking.
In this system, the 3D-RWVG is developed to allow home rehabilitation training, encouraging post-stroke patients to stretch their arm and catch fish. RSA, motion smoothness and motion paths are combined for motion performance assessment. RSA evaluates performance in terms of the range of motion (ROM). MV, LJ and LC evaluate performance in terms of the smoothness of motion. The TCNN assesses the overall performance from the motion path.
The five types of motion selected do not include nose-touching movements, because when the subjects do this motion, the occlusion phenomenon is obvious. In order to improve the accuracy of the assessment, based on reference [30] and the Fugl-Meyer scale, the five motions in this paper were selected.
The analysis of RSA shows that there is relatively obvious loss in reachable workspace, especially for the affected side and upper quadrants (I, II, and III) for post-stroke patients as compared to healthy subjects. There are big differences between patients with different FMA scores in upper quadrants (I, III).
The cloud motion-based rehabilitation system for upper limbs is an automated data collection system, which is attractive and inexpensive, and designed to reduce the burden on participants. In the experiment, seated post-stroke patients waved their arms in the fishing game for upper limb training, and the computer is used for upper limb motion assessment. It is convenient and incentivizes patients' home rehabilitation.
Visual feedback and auditory feedback in the 3D-RWVG could increase the immersion and pleasure of subjects. Interacting with the game could motivate and encourage patients to stretch their arm; the patients could strengthen the exercise unknowingly and achieve the effect of rehabilitation.
Comparing the assessment method with those in references [30] and [37], it can be seen that the TCNN method is more appropriate and convenient than the evaluation method in reference [30], which found a strong linear relationship between qualitative scores and quantitative scores derived from both standard and low-cost motion capture. The TCNN model accuracy is higher than that of the PCA-ANN (Principal Component Analysis-Artificial Neural Networks) model in reference [37], which ranged from 65% to 87%.

Conclusions
In this paper, a cloud-based upper limb rehabilitation system based on motion tracking is proposed for home rehabilitation training and assessment. The 3D-RWVG could motivate and encourage the patients to stretch their arm unknowingly to undertake rehabilitation training. The MV, LJ and LC could reflect the smoothness of the hand motion paths. The TCNN model could assess the level of the five selected movements. The model accuracy is 92.6%, 80%, 89.5%, 85.1% and 87.5%, respectively. A therapist could obtain the training and assessment results of the patients and instruct the patient to do further rehabilitation training. The system is convenient, inexpensive and practical, and can enable patients to carry out rehabilitation training and evaluation at home without the supervision of a therapist. Future work will focus on the rehabilitation assessment of upper limb muscle strength and the design of various multi-objective and task-oriented virtual games.