The Automatization of the Gait Analysis by the Vicon Video System: A Pilot Study

The quality of modern measuring instruments has a strong influence on the speed of diagnosing diseases of the human musculoskeletal system. The research is focused on automatization of the method of gait analysis. The study involved six healthy subjects. The subjects walk straight. Each subject made several gait types: casual walking and imitation of a non-standard gait, including shuffling, lameness, clubfoot, walking from the heel, rolling from heel to toe, walking with hands in pockets, and catwalk. Each type of gait was recorded three times. For video fixation, the Vicon Nexus system was used. A total of 27 reflective markers were placed on the special anatomical regions. The goniometry methods were used. The walk data were divided by steps and by step phases. Kinematic parameters for estimation were formulated and calculated. An approach for data clusterization is presented. For this purpose, angle data were interpolated and the interpolation coefficients were used for clustering the data. The data were processed and four cluster groups were found. Typical angulograms for cluster groups were presented. For each group, average angles were calculated. A statistically significant difference was found between received cluster groups.


Introduction
Today, there are many reasons why a person's motor functions are impaired, in particular, a normal gait worsens. Such diseases include blood stroke after which one side of the body is partially or completely paralyzed. Disabled persons with spinal cord injury in various regions, patients with scoliosis, osteochondrosis, hemiparesis (unilateral paresis) of cerebral genesis [1,2], etc. have similar problems. Infantile cerebral palsy also influences movement disorders. Gait features depend on the physiological characteristics of a person, age, and mental state [3,4]. Gait analysis is an important clinical tool for planning clinical treatments. It is being actively introduced into the sport industry to predict injury.
There are various studies in this field. Locomotion research is important for a patient's diagnosis. D. Kaplun et al. [5] describe an automatic orientation algorithm based on the alignment of the virtual anatomical axis of the lower leg of a 3D model with the vertical axis of rectangular coordinates in a three-dimensional space. The accuracy of the orientation of the stump greatly influences the patient's locomotion. A digital twin was obtained with the 3D. The spatial orientation of the model was determined by the angles between the projections of the anatomical axis on the plane of the Cartesian coordinate system. To rotate a digital 3D model, a set of transformations was used following the approach of rotation matrices. In 3D space, rotation around the Z axis is described by a matrix transformation. The development shows that the use of a 3D model simplifies and speeds up the process of solving physiological problems.
M. Baltin et al. [6] studied changes in the range of motion of the hind limbs of rats induced by epidural stimulation in the acute phase of SCI and recovery of movements was assessed. A kinematic analysis of complete stepping cycles was performed for animals of each group. Vicon Nexus 2.5 software was used to manually complete the 3D motion model and remove artifacts from the recording. In each group of test rats, the data of the average values of 30 steps in the angles in the phases of one step were researched. For each group of test rats, the average angles (beyond 30 steps) for each step phase were researched. The data were obtained in the form of angulograms. This approach is convenient for visualizing the change of the values of the angles in the phases of the step, as well as for further statistical analysis.
Research by Hiroaki Hobara et al. [7] investigates the movement of a person with prostheses upstairs. The study was carried out using an eight-camera motion analysis system (VICON 512, Oxford Metrics, Oxford, UK). Kinematic parameters such as the angle at the hip and knee (and their swing average flexion rate) were calculated. During the swing, the average flexion rate of the hip and knee was found. Changes in angles in the step phases were analyzed. Research by A. Gawłowska et al. [8] similarly describes a person's upper body movement using different sets of prostheses. To obtain kinematic data, a motion capture system OptiTrack Flex 13 (NaturalPoint, Corvallis, OR, USA) was used. Average movement speed and average vertical ground reaction force were also calculated. Present-day practice is more focused on angles benchmarking. It can be not only joint angles but, e.g., abduction and adduction or flexion and extension [9,10]. These studies illustrate motion capture system benefits and provide numerical results from experiments. However, an urgent task is to obtain a technique that helps to decrease the time of diseases diagnosis, as well as to adjust the applied therapy, depending on the current state of the patient [11,12].
Considering clusterization approaches, the proposed method allows taking into account the form of angle time series. Thus, the most common approach is to use peaks of angle functions [13] or even average values [14] for clusterization. There are more complicated methods to form data for clusterization [15], but that involves manual processing or questions the accuracy of automatic processing. The proposed parameters describe change range (in different part of gait phase) of some kinematic parameter (e.g., peaks). Such description is rather sketchy and does not take into account the curve form of kinematic parameter time-series. The proposed approach offers a solution. Obviously, the method or/and the order of approximation influence the quality of the following data processing, and this topic should also be investigated.
Therefore, in most articles, the researchers aim to analyze time-series of measured data and piecewise linear (usually constant) functions used for clusterization. The research is focused on convolution of the time-series data to parameters vector using higher order polynomial interpolation. Such technique allows obtaining the classification of patients by kinematic parameters, and the approach allows segregate patients in cluster groups for the following diagnostics.

Experiment Design
Optoelectronic motion capture is widely used in biomechanics. Vicon is a system consisting of several Vicon Vero 2.2 digital infrared cameras (Vicon Motion Systems, Oxford, UK) with adjustable lenses and focus, and elaborated software for primary data processing and visualization Vicon Nexus 2.9.3. The scheme of cameras mounting is shown in Figure 1. Angle orientation is set by two angles a 1 and a 2 and shown in Figure 1. The field of view  Figure 1 respectively). In Figure 1, the area covered by cameras is marked by the gray area.
Optoelectronic motion capture is widely used in biomechanics. Vicon is a system consisting of several Vicon Vero 2.2 digital infrared cameras (Vicon Motion Systems, Oxford, UK) with adjustable lenses and focus, and elaborated software for primary data processing and visualization Vicon Nexus 2.9.3. The scheme of cameras mounting is shown in Figure 1. Angle orientation is set by two angles a1 and a2 and shown in Figure 1. The field of view of Vicon Vero 2.2 digital infrared cameras was 50.1° x 98.1° (FoV1 and FoV2 in Figure 1 respectively). In Figure 1, the area covered by cameras is marked by the gray area. Calibration and synchronization were carried out using the Active Wand calibration marker (Vicon Motion Systems, Oxford, UK). The standard video was obtained through a Sony camera. Kinematic data were collected at a sampling rate of 100 Hz and transferred to a personal computer using Vicon Nexus 2.9.3 software. The walkway was about 4 m.
The study involved six healthy subjects: 3 men and 3 women aged 20-26 years (mean age 24.16 years; SD 1.83), mean height 1.73 m (SD 0.07), and mean weight 69.5 kg (SD 14.6). The study protocol was approved by the Local Ethics Committee of the Federal State Autonomous Educational Institution of Higher Education KFU (protocol No. 12 dated 18 September 2018). All subjects were in socks and thin, tight clothing. A total of 27 reflective markers (14 mm diameter sphere) were placed on the following anatomical regions: calcaneus, toe, first metatarsal head, outer lateral malleolus, inner medial malleolus, tibia, outer or inner knee, femur, pelvic bones, and ilium. Markers' location scheme is shown in Figure 2. Calibration and synchronization were carried out using the Active Wand calibration marker (Vicon Motion Systems, Oxford, UK). The standard video was obtained through a Sony camera. Kinematic data were collected at a sampling rate of 100 Hz and transferred to a personal computer using Vicon Nexus 2.9.3 software. The walkway was about 4 m.
The study involved six healthy subjects: 3 men and 3 women aged 20-26 years (mean age 24.16 years; SD 1.83), mean height 1.73 m (SD 0.07), and mean weight 69.5 kg (SD 14.6). The study protocol was approved by the Local Ethics Committee of the Federal State Autonomous Educational Institution of Higher Education KFU (protocol No. 12 dated 18 September 2018). All subjects were in socks and thin, tight clothing. A total of 27 reflective markers (14 mm diameter sphere) were placed on the following anatomical regions: calcaneus, toe, first metatarsal head, outer lateral malleolus, inner medial malleolus, tibia, outer or inner knee, femur, pelvic bones, and ilium. Markers' location scheme is shown in Figure 2.
To increase the number of observations, the subjects performed different types of gait. The subjects walked straight. Each subject carried out several gait types: casual walking (T 1 ), run (T 2 ), and additionally imitation of a non-standard gait, including shuffling (T 3 ), lameness (T 4 ), clubfoot (T 5 ), walking from the heel (T 6 ), rolling from heel to toe (T 7 ), walking with hands in pockets (T 8 ), and catwalk (T 9 ). Each type of gait was retested three times. The subjects did not have any vascular disease, skin problems, or balance limitations. They did not have operations on their lower limbs or serious injuries. The subjects were marked by letter F or M (depending on sex) and number. Non-standard gaits were used by subjects to increase the number of observations. Additionally, it allows to analyze the deviations in gait parameters in case of simulations. To increase the number of observations, the subjects performed different types of gait. The subjects walked straight. Each subject carried out several gait types: casual walking (T1), run (T2), and additionally imitation of a non-standard gait, including shuffling (T3), lameness (T4), clubfoot (T5), walking from the heel (T6), rolling from heel to toe (T7), walking with hands in pockets (T8), and catwalk (T9). Each type of gait was retested three times. The subjects did not have any vascular disease, skin problems, or balance limitations. They did not have operations on their lower limbs or serious injuries. The subjects were marked by letter F or M (depending on sex) and number. Non-standard gaits were used by subjects to increase the number of observations. Additionally, it allows to analyze the deviations in gait parameters in case of simulations.
The data from the cameras were transferred into Vicon Nexus 2.9.3 software. For the primary check, 3D reconstruction was used. It allows identifying and removal artifacts in the record. The following steps were taken: carry out identification for each marker on the screen; build a skeleton using the created markers; create areas (for example, foot, calf, thigh, etc.); create links between areas; calibrate this skeleton for each frame; and import a .c3d file. The created skeleton can be saved and successfully applied to the tracing. Thus, a three-dimensional array of x, y, z coordinates was obtained. It is possible to restore the trajectory of markers in case of their loss [16][17][18][19].

Data Processing
The input data are presented in the following structure: where ti-time data, mi-marker data, xi, yi, zi-marker coordinates, m-number of frames, and n-number of markers. The X and Y axes were placed in the horizontal plane. The Z-axis was normal to the plane X-Y. The subject moved in the X-Z plane. The data were used to assess joint angle estimation [20].
The algorithm performing the following operations: (1) Reading a file in .c3d format; (2) Data visualization; (3) Dividing the entire dataset into steps (touching the toe of the floor was step completing criteria); (4) Assignment the stance phase and the swing phase at each step by searching for the local minimum of the heel movement trajectory [21]. The output of the aforementioned algorithm is an indices array, which allows dividing the data into steps and step phases. The developed software makes it possible to The data from the cameras were transferred into Vicon Nexus 2.9.3 software. For the primary check, 3D reconstruction was used. It allows identifying and removal artifacts in the record. The following steps were taken: carry out identification for each marker on the screen; build a skeleton using the created markers; create areas (for example, foot, calf, thigh, etc.); create links between areas; calibrate this skeleton for each frame; and import a .c3d file. The created skeleton can be saved and successfully applied to the tracing. Thus, a three-dimensional array of x, y, z coordinates was obtained. It is possible to restore the trajectory of markers in case of their loss [16][17][18][19].

Data Processing
The input data are presented in the following structure: where t i -time data, m i -marker data, x i , y i , z i -marker coordinates, m-number of frames, and n-number of markers. The X and Y axes were placed in the horizontal plane. The Z-axis was normal to the plane X-Y. The subject moved in the X-Z plane. The data were used to assess joint angle estimation [20].
The algorithm performing the following operations: (1) Reading a file in .c3d format; (2) Data visualization; (3) Dividing the entire dataset into steps (touching the toe of the floor was step completing criteria); (4) Assignment the stance phase and the swing phase at each step by searching for the local minimum of the heel movement trajectory [21].
The output of the aforementioned algorithm is an indices array, which allows dividing the data into steps and step phases. The developed software makes it possible to calculate the stride length, maximum and average height, and swing around the movement. These operations were calculated using a local minimum in projection onto each of the three orthogonal planes. Vectors constituting the required angles can be found by equations: Next, the angles in the hip, knee, and ankle joint at each moment of time can be calculated according to the equations [6]: The obtained values of the angles (3), divided by phases, were saved in a file for further processing. To study the pattern of the movement, the changes in the angles in the stance phase and the swing phase were approximated by a seventh-degree polynomial function [22]. In Figure 3, initial data values (blue bubble) and polynomial approximation of swing and stance phases (orange lines) in knee angle are shown. Figure 4 illustrates the same but for hip angle. As a result of the operations, approximation coefficients were found for each record.
Next, the angles in the hip, knee, and ankle joint at each moment of time can be calculated according to the equations [6]: The obtained values of the angles (3), divided by phases, were saved in a file for further processing. To study the pattern of the movement, the changes in the angles in the stance phase and the swing phase were approximated by a seventh-degree polynomial function [22]. In Figure 3, initial data values (blue bubble) and polynomial approximation of swing and stance phases (orange lines) in knee angle are shown. Figure 4 illustrates the same but for hip angle. As a result of the operations, approximation coefficients were found for each record.  The next step was high-dimensional clustering for received data. As a result of the operations, approximation coefficients were found for each record. The main idea is to divide all dataset into groups with similar features. For clustering, the k-means method  The next step was high-dimensional clustering for received data. As a result of the operations, approximation coefficients were found for each record. The main idea is to divide all dataset into groups with similar features. For clustering, the k-means method was used. The algorithm seeks to minimize the total square-law deviation of cluster points from the centers of these clusters.
where k-number of clusters, S i -the full dataset, µ i -the cluster center. The Euclidean distance was used in Equation (4). To determine the optimal number of clusters, the Calinski-Harabasz criterion was applied [23][24][25].
Angle results were averaged for all steps. To estimate deviation in movement, standard deviation was used. In this case, the resulting distribution can be presented as follows [6]: where ϕ(τ)-angle function, mean (·, k) and std (·, k)-mean and standard deviation of function according to steps, N step -number of steps.
To quantify the volume of motion degree, the following equation was used to calculate the range of motion [6]: Another parameter for estimating range of motions is the calculation of the triangle area. The vertices of triangle are the toe position at the initial moment of time; toe position at maximum height; and the hip position at the initial moment of time. The area of a triangle can be calculated as half of the vector product:

General Pipeline
For further analysis, the code was implemented in MATLAB software. Let us introduce the general pipeline for data processing (Algorithm 1). Then, the clusterization (4) was performed for parameter vectors vParam for all datasets.

Results and Discussion
For each record of each subject and each gait type, all aforementioned calculations were carried out (according to Algorithm 1). Then, clusterization by the coefficients of the approximating polynomials was carried out. As a result of clustering, the dataset was divided into 4 groups. Using t-Distributed Stochastic Neighbor Embedding, clusters are illustrated and shown in Figure 5.
Then, the clusterization (4) was performed for parameter vectors vParam for all datasets.

Results and Discussion
For each record of each subject and each gait type, all aforementioned calculations were carried out (according to Algorithm 1). Then, clusterization by the coefficients of the approximating polynomials was carried out. As a result of clustering, the dataset was divided into 4 groups. Using t-Distributed Stochastic Neighbor Embedding, clusters are illustrated and shown in Figure 5. Therefore, two subjects F1, F2, F3, and M3 had robust results via gait type. Subjects M1 and M2 appear in the 1st and the 4th cluster groups. For each cluster group, numbers of gait type records were counted. The gait types with maximum records amount for each cluster group are presented in Table 1. The 1st group consists of run (T2), walking from the heel (T6), and walking with hands in pockets (T8). Shuffling (T3) and walking with hands in pockets (T8) were in the 2nd group. Normal walk (T1) and catwalk (T9) were both in the 4th group. According to the gait type, the 3rd cluster group major consists of running type (T2). Therefore, two subjects F 1 , F 2 , F 3 , and M 3 had robust results via gait type. Subjects M 1 and M 2 appear in the 1st and the 4th cluster groups. For each cluster group, numbers of gait type records were counted. The gait types with maximum records amount for each cluster group are presented in Table 1. The 1st group consists of run (T 2 ), walking from the heel (T 6 ), and walking with hands in pockets (T 8 ). Shuffling (T 3 ) and walking with hands in pockets (T 8 ) were in the 2nd group. Normal walk (T 1 ) and catwalk (T 9 ) were both in the 4th group. According to the gait type, the 3rd cluster group major consists of running type (T 2 ). Average hip angle in the swing phase for group 1 was equal 169 • ± 2 • , for group 2-161 • ± 3 • , for group 3-155 • ± 5 • , for group 4-165 • ± 4 • . Average value of the hip angle in the stance phase for group 1 was equal 168 • ± 2 • , for group 2-161 • ± 2 • , for group 3-161 • ± 4 • , and for group 4-167 • ± 3 • . Average knee angle in the swing phase for group 1 was equal 153 • ± 4 • , for group 2-150 • ± 4 • , for group 3-139 • ± 8 • , and for group 4-147 • ± 5 • . Average knee angle in the stance phase for group 1 was equal 163 • ± 4 • , for group 2-161 • ± 5 • , for group 3-153 • ± 5 • , for group 4-161 • ± 4 • (all data presented in Table 1).
Statistical differences between all groups in the case of the swing phase (p < 0.05) were shown by paired-samples t-test. For stance phase, partial statistical differences (p < 0.05) were found. In Figure 6, box plot and pair statistical differences are shown for hip angle in the swing phase (Figure 6a), hip angle in the stance phase (Figure 6b), knee angle in the swing phase (Figure 6c), andknee angle in the stance phase (Figure 6d). Table 1).
Statistical differences between all groups in the case of the swing phase (p < 0.05) were shown by paired-samples t-test. For stance phase, partial statistical differences (p < 0.05) were found. In Figure 6, box plot and pair statistical differences are shown for hip angle in the swing phase (Figure 6a), hip angle in the stance phase (Figure 6b), knee angle in the swing phase (Figure 6c), andknee angle in the stance phase (Figure 6d). It should be noted that the average angles in the swing phase change in proportion to the average angles in the stance phase for the hip and knee angles. This information, together with the results of the paired-samples t-test, makes it possible to conclude about the veracity of the distribution into groups. The stance phase takes about 60% of the It should be noted that the average angles in the swing phase change in proportion to the average angles in the stance phase for the hip and knee angles. This information, together with the results of the paired-samples t-test, makes it possible to conclude about the veracity of the distribution into groups. The stance phase takes about 60% of the stride cycle, which helps to align the body. Perhaps, for this reason, there is no statistical difference between some groups in the stance phase.
In Figure 7, hip and knee angle results are presented. Here, the thick line-mean values of angle, dashed line area-standard deviation (function ϕ ± (τ) in Equation (5)) of an angle [23][24][25]. The change in the hip angle in the swing phase (Figure 7a), and in stance phase (Figure 7b), the change in the knee angle in the swing phase ( Figure 7c) and in stance phase (Figure 7d) are shown. Curves have different colors: group 1 is highlighted in red; group 2-blue; group 3-black; group 4-green. All curves in each of the four images are located to a greater degree in the same order. Figure 8 shows the position of the right leg markers for one step in each frame. The blue line is the trajectory of the ankle movement. Red lines enclose a triangle describing the subject's range of motion (7). This method of assessing movement can be used as an additional criterion for the distribution of subjects into clusters.
In Figure 7, hip and knee angle results are presented. Here, the thick line-mean values of angle, dashed line area-standard deviation (function φ ± (τ) in Equation (5)) of an angle [23][24][25]. The change in the hip angle in the swing phase (Figure 7a), and in stance phase (Figure 7b), the change in the knee angle in the swing phase ( Figure 7c) and in stance phase (Figure 7d) are shown. Curves have different colors: group 1 is highlighted in red; group 2-blue; group 3-black; group 4-green. All curves in each of the four images are located to a greater degree in the same order.  Figure 8 shows the position of the right leg markers for one step in each frame. The blue line is the trajectory of the ankle movement. Red lines enclose a triangle describing the subject's range of motion (7). This method of assessing movement can be used as an additional criterion for the distribution of subjects into clusters. After applying the approach described above, clustered groups of gaits were found. Therefore, classification can be used to allocate new data of subjects to existing groups. This will speed up the process of identifying health deviations of new test subjects in future.
Currently, there are new techniques for motion capture and analysis of the kine- After applying the approach described above, clustered groups of gaits were found. Therefore, classification can be used to allocate new data of subjects to existing groups. This will speed up the process of identifying health deviations of new test subjects in future.
Currently, there are new techniques for motion capture and analysis of the kinematics of the movement of an object [26][27][28]. Each researcher strives to make his concept more optimal, easier to use, and cheaper. Therefore, the use of 2D technologies such as video cameras [29,30] is being actively introduced. On one hand, it is easier and cheaper, but still there is no protocol for such measurements [28]. That leads to different number of data deviation and questions the received results. Of course, general information such as body position and orientation can be found by 2D technologies. In contrast, 3D analysis equipment such as Vicon transmits the most accurate marker position. Placing the markers on anatomical positions allows to calculate joint angles more accurately. Moreover, while using video, it is possible to calculate only a plane angle, in contrast to using Vicon. When it comes to tracking the dynamics of a patient's condition and adjusting his therapy, it is necessary to obtain results of high accuracy.
Other studies [31,32] have shown the possibility of preventing a patient from falling through real-time gait analysis. The proposed method was evaluated by recording the normal gait and simulating the pathological gait of the subjects. To determine the phases of the step, it was proposed to use pressure sensors installed in the shoe linings. This solution allows to simultaneously determine all phases of the gait and the distribution of the load on the feet, which is important when assessing gait discoordination. This technique provides a minimum level of discomfort for the object, since there are no direct contacts between the sensor surface and the human body in the system. The patient can use this measuring equipment all the time, which means continuous monitoring is possible.
Conducting research in the field of kinematic analysis of motion is possible using various devices and techniques. Each of them has its advantages and disadvantages; however, some of them are more applicable in a particular clinical case. Combination of video fixation and feet pressure sensor is optimal for full-scale research, but it limits by laboratory capabilities. Researchers should emphasize data processing taking into account not only the physics of measurement equipment but mathematical formalism.
Comparing to other studies, it can be noticed that there is dispersion in patterns. It can be explained by common limitations. Not only clusterization methods [14] but input variables greatly influence the results. Usually easily measured parameters (velocity, cadence, stride length, etc.) are used [13]. Of course, directly increasing the number of parameters can solve the problem. On one hand, dependent variables can appear; on the other hand, it can reduce robustness. Such problems can be solved by dimensionality reduction methods [14], but still there is no general solution. As for clinic practice, there are also obstacles. Direct correlation between clusters groups (received by gait data) and injuries or diseases does not always appear [15,33]. Studies offer ways to meet the challenges: increase data, increase number of gait phases, etc. Nonetheless, gait parameters for clustering are a priori selected, and usually describe time series in terms of constants peaks, excursions, ranges, etc.) [34,35].
Regarding the proposed approach, the time series form was used for clusterization. We cannot be sure of the stability of the received groups and do not pretend that they will be the same in case of data augmentation. The research was focused on method and software development. Conducted research shows results which can be used in the future. However, more importantly, the paper brought up a point of methods for processing time-series for clusterization. Focused on interpolation results, a case with minimal deviation from interpolation (see Figure 4a,b) and a case with deviations (see Figure 3b) were found. We can only question, how does it influence the final clusterization results? Herewith, two obvious solutions can be pointed out: applying more complicated functions for interpolation (e.g., non-uniform rational B-spline) or using methods of topological data analysis.

Conclusions
This article provides a detailed description of the methodology for distributing data into groups with different characteristics. It can be helpful to speed up diagnosis of the patient's disease and start the treatment in the shortest possible time. Undoubtedly, this technique can be applied to all patients with musculoskeletal system problems. Moreover, the results obtained indicate the possibility of using the technique for prosthetics to monitor the progress of rehabilitation and adjust the treatment method. An approach for data clusterization was presented. For this purpose, angle data were interpolated and the interpolation coefficients were used for clustering the data. The data were processed and four cluster groups were found. Typical angulograms for cluster groups were presented.
The proposed method has obvious limitations involving the degree of approximation. The noisy input data can influence polynomial vector and consequently clustering results. To improve the robustness of the method, it is planned to try topological data analysis. Despite the limitations, the proposed method allows to get objective results.  Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.