A Gait-Based Real-Time Gender Classification System Using Whole Body Joints

Gait-based gender classification is a challenging task since people may walk in different directions with varying speed, gait style, and occluded joints. The majority of research studies in the literature have focused on gender-specific joints, while less attention has been paid to the comparison of all of a body's joints. To consider all of the joints, it is essential to determine a person's gender based on their gait using a Kinect sensor. This paper proposes a logistic-regression-based machine learning model using whole body joints for gender classification. The proposed method consists of different phases, including gait feature extraction based on three-dimensional (3D) positions, feature selection, and classification of human gender. The Kinect sensor is used to extract 3D features of different joints. Different statistical tools, such as Cronbach's alpha, correlation, the t-test, and ANOVA, are exploited to select significant joints. The Cronbach's alpha technique yields an average result of 99.74%, which indicates the reliability of the joints. Similarly, the correlation results indicate that there are significant differences between male and female joints during gait. The t-test and ANOVA approaches demonstrate that all twenty joints are statistically significant for gender classification, because the p-value for each joint is zero, i.e., less than 1%. Finally, classification is performed based on the selected features using a binary logistic regression model. A total of one hundred (100) volunteers participated in the experiments in a real scenario. The suggested method successfully classifies gender based on 3D features recorded in real time using a machine learning classifier, with an accuracy of 98.0% using all body joints. The proposed method outperforms existing systems, which mostly rely on digital images.


Introduction
The term "gait" describes a person's manner of walking. Every person has a unique gait that can be used as a behavioral trait to identify them. Since gait capture may be performed from a distance, the system does not require any assistance from the person. Gait recognition has therefore become more common in recent years and has a variety of applications, including forensics, security systems, surveillance, and the investigation of criminal activity. As CCTV is currently implemented in nearly all public areas [1][2][3], it unobtrusively records the person's gait.
In the existing research work, the majority of the literature has focused on gait recognition [4][5][6][7][8][9][10][11][12][13][14], whereas gender classification based on gait offers a lot of potential for real-time applications [15]. The difficulty of discovering lost people, especially youngsters, at train stations, airports, bus stops, and other public locations can be solved with the help of gait-based gender classification, which can act as a clue in an investigation. The gender classification based on gait can also be applied in commercial activities. For instance, relevant advertisements can be shown on the digital screen if the gender of the person entering the store is known. A talking robot that has been installed in supermarkets can recognize customers through cameras and assist them with gender-specific needs including finding washrooms, food stores, men's wear stores, and women's wear stores.
The majority of gender classification studies make use of different numbers of biometric features, e.g., Preis et al. [16] used 13 features, while Alharbi et al. [17] used 7 different joints. In the literature, only a few joints are typically used to achieve gait-based gender recognition; therefore, little focus has been given to studying gender differences based on all joints while walking. Unfortunately, very few studies in the recent research literature have used a self-created dataset for gender classification, which may be thought of as the main motivation for this work.

Research Gap and Motivation
• The existing gait recognition approaches have compared a limited set of male and female joints during gait, and less focus has been paid to a complete study/analysis of joints.
• Gender has been classified based on gait in recent research [18][19][20][21][22], although the accuracy of this classification mostly depends on 2D camera images.
• Many researchers have recognized the gender of walking participants [23][24][25][26], but they have analyzed the results using comparatively limited datasets, which may adversely affect the system's accuracy.

Main Contributions
The following are the key innovations and contributions of our work:
• Using information obtained from MS Kinect, the authors created their own dataset for gender classification.
• Applying a straightforward binary logistic regression model to the challenging issue of gender classification with better accuracy.
• The suggested system provides better accuracy compared to the state-of-the-art systems using the same dataset.
In short, there is little information on gender-related differences in joints during a casual walk; therefore, the purpose of this study is to compare/investigate every joint difference that contributes during walking. Finally, the gender of the walking subject is classified based on the complete set of joint positions. This paper is organized as follows. Section 2 presents related work, Section 3 describes the proposed work, and Section 4 describes the experiments and results. Finally, Section 5 concludes and outlines future work.

Related Work
Gender classification is performed in a number of ways using gait. According to the format of the gait data, the existing gait identification techniques can be classified into 2D-based and 3D-based techniques [27][28][29]. The 2D-based gait identification techniques depend on a human silhouette, which is often collected by a single RGB camera in video surveillance. RGB-image-based methods are mostly used in this area of gait recognition. These methods are often categorized as model-free and model-based [18,23,30,31,32].
Model-free methods are often referred to as appearance-based techniques. The silhouettes, which are extracted from the video sequences via background subtraction, are used to directly generate gait signatures. A gait energy image (GEI) is the most common appearance-based representation of gait [6], whereas 3D-based approaches typically make use of numerous calibrated cameras or cameras with depth sensors to extract 3D gait data. Moreover, 2D-based methods are typically simpler to execute because they simply require a normal camera. A 3D skeleton model based on 10 joints and 24 degrees of freedom (DOFs), recorded by various cameras, was suggested by Zhao et al. [33]. Yamauchi et al. [34] captured dense 3D range gait data, which may be used to identify people in various orientations. A system with four calibrated and synchronized cameras was developed by Krzeszowski et al. [35] to predict 3D gait motion from video sequences and identify view-variant gaits.
In terms of identifying gaits, both 2D-based and 3D-based approaches offer significant advantages; however, 3D-based approaches are more realistic than 2D-based/image-based approaches because their robustness against variation in observation perspective [36], walking speed [37], and clothing [38] is better than that of a 2D-based system.
Recently, the majority of researchers have focused on view-invariant Kinect-based gait detection methods, because it is challenging to extract viewpoint information from silhouettes alone [39]. In order to identify persons, Kale et al. [40] introduced a gait recognition algorithm based on static body metrics that are collected from walking across multiple views. A framework was presented by Jean et al. [41] to compute and assess the view-normalized trajectories of the head and feet as seen in monocular videos. Hu et al. [42] suggested using a unitary linear projection to reduce the dimensions of the original gait parameters that were taken from any view(s) and increase their capacity to differentiate between different views. Because of its ability to collect skeletons with less distortion from clothes and lighting than prior depth sensors, the Kinect has become a popular tool for gait detection. Sivapalan et al. [43] used the Kinect sensor to convert the GEI from a 2D to a 3D representation. The lengths of body parts were calculated from joint positions and used as static anthropometric features by Araujo et al. [44] for gait recognition. The coordinates of all the joints recorded by the Kinect were used to produce an RGB image, which was then combined into a video to represent the walking sequence; this method was used by Milos et al. [45] to detect the gait. Preis et al. [16] evaluated 11 skeletal features, with step length and speed used as dynamic features; both static and dynamic features were combined for recognition. Yang et al. [46] developed relative distance-based gait features as an alternative to anthropometric measurements, which are better able to retain the periodic patterns of gait. Ahmed et al. [47] developed a gait signature using joint relative angle (JRA) sequences throughout a walking cycle.
In order to determine the distance between two JRA sequences, they also created a dynamic time warping (DTW)-based kernel. Kastaniotis et al. [48] proposed a strategy for leveraging the Kinect for gait-based recognition. A person can be identified with 90% accuracy using a Kinect-based gait recognition system, as suggested by [16], which uses 13 biometric parameters. Similarly, X et al. [49] used 18 dynamic features that covered the dynamic angles in the lower limb joints; the system's accuracy was still just 43.6%. A similar approach [50] used 14 features to identify subjects, mostly based on the horizontal distance feature (HDF) and vertical distance feature (VDF); the system's recognition accuracy was 92% when using the KNN method. An SVM-based method for gender classification using the Kinect obtained an average accuracy of 87% [51]. Azhar et al. used a limited set of joints in [52] for gender classification in real time using the Kinect sensor. With five subjects, another human identification system that used skeleton data from the Kinect sensor showed an accuracy of 86% [53]. Similarly, the researchers in [47] used the joint relative angle (JRA) technique for a 3D skeleton Kinect-based gait identification system with a 93.3% recognition rate. A different gait recognition and gender classification method based on the Kinect was suggested in [54]; the KNN classifier produced results with a reasonable accuracy of 96%. A gait-based recognition system for identifying individuals in baggy and lengthy clothing was proposed in [17]; using nine joints, the system employed SVM and claimed a 97% accurate result.
The existing gait recognition approaches have compared a limited set of male and female joints during gait, and less focus has been paid to a complete study/analysis of joints. Furthermore, the majority of researchers have created systems that classify both age and gender in a single system, but they have typically divided age into various groups (classes), creating the issue of a classification system with a restricted age range. The key innovations and contributions of our work are the creation of our own dataset, the comparison of all body joints of males and females using different statistical tools, the comparison with state-of-the-art systems, and the application of a straightforward binary logistic regression model to the challenging issue of gender classification with better accuracy.

Proposed System
This paper presents a comparative method for calculating the contribution of male and female body joints to walking. The contribution of male body joints is compared to the contribution of female body joints using statistical methods such as correlation, the t-test, and the ANOVA technique. Then, Cronbach's alpha is applied to find the reliability of the joints. The findings also include a comprehensive list of all human body joints and their roles in gender classification. Furthermore, the proposed system model has two working stages, i.e., training and testing. In the training phase, the 3D joint positions of the subjects are recorded through MS Kinect to find the best-fitted model for gender classification. Finally, in the testing phase, the gender of the walking subject is classified via the binary logistic-regression-based machine learning model using the most contributing body joints. The proposed system model is shown in Figure 1.

Dataset Generation
Since no other datasets provided both skeleton joint positions and sufficient gender information, we had to develop our own. A total of 373 participants ranging in age from 7 to 70 took part in this study. Inside the Kinect recognition space, the authors instructed each participant to walk in a free position while facing various directions (see Figure 2). A low-cost MS Kinect V1 camera is utilized to record the three-dimensional (3D) joint movements of the subjects. This sensor captures not only the traditional RGB image of the scene but also the depth information between the objects and the camera.
The Microsoft Kinect sensor is made up of a 3D depth sensor, an RGB camera, a tilt motor, and a microphone array. The Kinect sensor extracts the 3D joints of the human skeleton using the Kinect software development kit (SDK). The Kinect SDK uses a skeletal tracking algorithm to extract the 20 human joints. As seen in Figure 3, MS Kinect offers a full set of 20 usable joints, ranging from the head to the foot. Multiple records for each individual were taken in order to evaluate MS Kinect's accuracy. The measurement was calculated from several distances, and 6 feet was found to be the ideal distance. The MS Kinect was placed at a height of 3.4 feet and an angle of 0 degrees, recording at 30 frames per second (fps). Every volunteer was free to adopt any behavior or stance at any time. There was no time restriction because the feature extraction process used the average of all frames.

Pre-Processing of Data
Our system for recording human gait consists of a single Kinect depth camera installed on a tripod and placed in a naturally lit room. Humans are seen moving smoothly along the length of the room as the camera records their frontal motion and walking direction. Because the field of vision is 3 m and the depth field is 4.5 m, longer sequences can be captured. The Kinect makes it possible to capture a person at 30 fps and extract their 20-joint skeleton in real time; however, as our focus is on developing a reliable and effective gait processing approach, simple and generic low-cost acquisition setups are sufficient. A motion sequence in time t[1 . . . T] is represented by a collection of skeletons, each of which has 20 joints. The jth joint at time t is represented by its 3D position

J_j(t) = (x_j(t), y_j(t), z_j(t)). (1)
Equation (1) indicates that the data are recorded in 3D, i.e., along the x, y, and z axes. We use the observation that the Euclidean distance between two joints should remain constant while the subject moves to create a scale-invariant skeleton. As a result, we normalize the skeleton joints in relation to how far they are from each other. Euclidean distance is a logical approach to compare two joints. The Euclidean distance between two joints J_i and J_j of a person's body during the walk is defined as

d(J_i, J_j) = sqrt((x_i − x_j)² + (y_i − y_j)² + (z_i − z_j)²).

In order to describe our gait data, we encode the body joint data using a covariance matrix, which calculates the variance of and covariance between the various joints. We describe joints using a local relative coordinate system to achieve efficiency and consistency of body positions. In order to measure the efficiency of each joint, the sample variance-covariance matrix (covariance matrix) is used. Its main diagonal values are the variances of particular joints, and the remaining entries are the covariances. The sample covariance matrix is calculated using the following formula:

S = (1/(n − 1)) Σ_{i=1}^{n} (X_i − X̄)(X_i − X̄)ᵀ,

where S is the covariance matrix, X̄ is the mean vector, and X is the data matrix of joints.
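The Euclidean distance and sample covariance matrix above can be sketched as follows. This is a minimal illustration using NumPy with toy data, not the authors' implementation; the function names and the example frames are hypothetical.

```python
import numpy as np

def euclidean_distance(j1, j2):
    """Euclidean distance between two 3D joint positions."""
    j1, j2 = np.asarray(j1, float), np.asarray(j2, float)
    return float(np.sqrt(((j1 - j2) ** 2).sum()))

def covariance_matrix(X):
    """Sample variance-covariance matrix S of a joints data matrix.

    X has one row per frame and one column per joint coordinate;
    the diagonal of S holds variances, the off-diagonal covariances."""
    X = np.asarray(X, float)
    D = X - X.mean(axis=0)             # deviations from the mean vector
    return D.T @ D / (X.shape[0] - 1)  # S = D^T D / (n - 1)

# toy data: 4 frames of 2 joint coordinates (not real Kinect output)
frames = np.array([[0.0, 1.0], [1.0, 2.0], [2.0, 3.0], [3.0, 4.0]])
S = covariance_matrix(frames)
```

Dividing by n − 1 gives the unbiased sample covariance; the same result is produced by `np.cov(frames, rowvar=False)`.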

Correlation Matrix
The correlation coefficient between the ith and jth joints is given by

r_ij = s_ij / sqrt(s_ii · s_jj),

where s_ij denotes the covariance between the ith and jth joints. The p × p matrix of correlations for all joints is defined as R = (r_ij), where r_ij = r_ji and −1 ≤ r_ij ≤ +1. If R = I, then all joints are uncorrelated. Alternately, some fundamental statistical techniques are used to assess the contribution and effectiveness of joints. In order to identify all skeletons as having the same component length and compute an average value for each of the skeletal parts, we also compute each joint's mean, variance, and coefficient of variation in the x, y, and z coordinates, respectively. For example, the arithmetic mean of the hip-left joint in the x coordinate is

X̄_HipL = (1/n) Σ_{i=1}^{n} x_i, where i = 1, 2, 3, . . . , n,

and X̄_HipL represents the arithmetic mean of the hip-left joint. The arithmetic means for the other axes, i.e., the y and z axes, can be determined in the same way using Equation (4). Further, to obtain a single value for each joint, the combined mean of the x, y, and z coordinates of an individual is computed for the comparison of male and female joints during gait. The combined mean is statistically defined for the hip-left joint as

X̄_C = (X̄_HipL + Ȳ_HipL + Z̄_HipL) / 3,

where X̄_C is the combined mean of the hip-left joint and X̄_HipL, Ȳ_HipL, and Z̄_HipL are the means of its x, y, and z coordinates.
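The correlation matrix R and the combined mean can be sketched as follows. The trajectories and names (`correlation_matrix`, `combined_mean`) are hypothetical toy data for illustration, not the authors' code.

```python
import numpy as np

def correlation_matrix(X):
    """p x p correlation matrix R from a data matrix X (frames x coordinates)."""
    S = np.cov(np.asarray(X, float), rowvar=False)  # covariance matrix
    d = np.sqrt(np.diag(S))                         # standard deviations
    return S / np.outer(d, d)                       # r_ij = s_ij / (s_i * s_j)

def combined_mean(x, y, z):
    """Combined mean of a joint's x, y and z coordinate means."""
    return (np.mean(x) + np.mean(y) + np.mean(z)) / 3.0

# hypothetical hip-left trajectories in the x, y and z coordinates
x = [0.10, 0.12, 0.11]
y = [0.90, 0.92, 0.91]
z = [2.00, 2.10, 2.05]
R = correlation_matrix(np.column_stack([x, y, z]))
```

Each diagonal entry of R is 1 and every off-diagonal entry lies in [−1, +1], matching the constraints stated above.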

Variance and Pooled Variance of Joints
To capture the variation of joint features from person to person for male and female joints during gait, the authors calculated variances and standard deviations. In this phase, the variance of each joint is calculated along the 3D axes; e.g., the variance of the hip-left joint on the x-axis is measured using Equation (10):

σ²_X = (1/N) Σ_{i=1}^{N} (X_i − μ_X)²,

where N is the size (number of observations) of the population, σ² is the population variance, X_i are the variables, and μ_X is the population mean at the x coordinate. Similarly, the variances of the hip-left joint in the y and z coordinates can be computed in the same way. The authors calculated the variances of the other joints using Equation (11). After computing the variance of each joint, a pooled variance of all joints is calculated as

δ²_P = (N_1 δ²_1 + N_2 δ²_2 + N_3 δ²_3 + . . . + N_k δ²_k) / N,

where δ²_P is the pooled variance and δ²_1, δ²_2, δ²_3, . . . , δ²_k are the separate variances of the joints. Similarly, N = N_1 + N_2 + N_3, where N represents the total size over all axes and N_1, N_2, and N_3 are the sizes of the x, y, and z axes. The pooled variance for the joint Hip_L may be computed as

δ²_P HipL = (N_1 δ²_x + N_2 δ²_y + N_3 δ²_z) / (N_1 + N_2 + N_3),

where N_1, N_2, and N_3 are the sizes of the x, y, and z axes, respectively, and δ²_P HipL is the pooled variance of the joint Hip_L. Similarly, the pooled variances of the other joints can be computed in the same way.
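The size-weighted pooled variance described above can be sketched as follows (a minimal sketch with hypothetical coordinate samples; population variances are used, matching the σ² formula with divisor N):

```python
import numpy as np

def pooled_variance(samples):
    """Pooled variance of several coordinate samples,
    weighted by sample size: sum(N_i * var_i) / sum(N_i)."""
    sizes = [len(s) for s in samples]
    variances = [np.var(s) for s in samples]  # population variances (divisor N)
    return sum(n * v for n, v in zip(sizes, variances)) / sum(sizes)

# hypothetical hip-left coordinates along the x, y and z axes
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
z = [1.0, 1.0, 1.0]
v = pooled_variance([x, y, z])
```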

Coefficient of Variation of Joints
The coefficient of variation (C.V) is used to assess which of the twenty joints are consistent for both males and females. The coefficient of variation is defined as

C.V = (S.D / mean) × 100,

where, for example, S.D represents the standard deviation of the hip-left joint and the mean corresponds to the mean of the hip-left joint.
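As a minimal sketch, the C.V can be computed as below (hypothetical sample values; the population standard deviation is assumed here, since the text does not specify which estimator was used):

```python
import numpy as np

def coefficient_of_variation(values):
    """C.V = (standard deviation / mean) * 100, in percent."""
    values = np.asarray(values, dtype=float)
    return values.std() / values.mean() * 100.0  # population S.D assumed

# hypothetical head-joint measurements: identical values give C.V = 0
cv = coefficient_of_variation([10.0, 10.0, 10.0, 10.0])
```

Being a ratio, the C.V is unit-free, which is what makes joints with different coordinate scales comparable.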

Average Variation in Particular Joints of Human During Gait
The mean, standard deviation, and coefficient of variation of male and female joints were computed using various statistical software packages.
The results show a comparatively minimal score (10.31 and 9.05) for the coefficient of variation of the head joint of males and females, respectively. This suggests that the head joint is a consistent joint for both males and females during the walk, but more consistent for females than for males. On the other hand, the C.V for the foot-left joint of males is a maximum of 17.58 compared to the other joints, while the C.V for the foot-right joint of females is likewise a maximum of 17.82 compared to the other joints. This shows that the foot-left and foot-right joints of males and females vary more from individual to individual. The C.V values for the hand-left and knee-right joints of males are the same, and the C.V values for the spine and hip-center joints of females are also the same, which shows that these joints play the same role during gait. Table 1 shows the mean, standard deviation (S.D), and coefficient of variation (C.V) for the various joints of males and females.

Comparison of Male and Female Joints
To assess the mean difference among all joints of male and female during the walk, different approaches are used.

Comparison of Male and Female Joints by Correlation
To identify the relationship between various joints and various directions during the walk, the authors used the Pearson correlation method. Pearson correlation is a statistical method used to measure the relationship between any two variables. The correlation always lies between −1 and +1; a correlation near +1 indicates a strong and direct relationship between the variables, and a correlation near −1 indicates a strong and inverse relationship. A correlation near zero indicates a weak relationship between the variables. The Pearson correlation is defined as

r = Σ(x_i − x̄)(y_i − ȳ) / sqrt(Σ(x_i − x̄)² Σ(y_i − ȳ)²).

To determine the difference between male and female joints during the walk, the authors checked the correlations between the various joints. The correlations reveal a significant difference between male and female joints during the walk.
According to Table 2, there are both strong and weak relationships between the different joints of males and females during gait. During a walk, the relationship between the x and y directions at the hip-center joint is 84% for males, while for females it is 14%. At the hip center, males show a relationship between the y and z directions of 88%, while females show 20%. Males at the hip center have a 90% relationship between the x and z directions, while females have 80%. The relationships between the different joints thus differ between men and women. The relationship between the x and z directions at the wrist-left joint is 94% for males, while for females it is 79%.
The relationship between the x and z directions at the hand-left joint is 93% for males, while for females it is 72%. Similarly, the relationship between the x and z directions at the knee-right joint is 91% for males, while for females it is 79%. The relationship between the x and z directions at the ankle-right joint is 93% for males, while 82% for females. The relationship between the x and z directions at the foot-right joint is 93% for male joints, while for female joints it is 88%. At the hip-left joint, the correlation for males is 94% while for females it is 88%, which is a strong and significant relationship. The correlation results in Table 2 indicate that there is a significant difference between male and female joints during gait.

Comparison of Male and Female Joints Using T-test
To compare the various joints of males and females during gait, the t-test is applied to identify significant differences, e.g., whether at the hip-center joint there is a significant difference between male and female walking. As each human walks in three directions, x, y, and z, we calculate the means of the x, y, and z directions, respectively, and compute their combined mean. The combined means were compared using the t-test; the t-tests assuming equal and unequal variances provide the same conclusion. The results of the two-sample t-test are shown in Table 3. The formulation of our hypothesis is given below:

H0: μ_HipL(M) = μ_HipL(F) versus H1: μ_HipL(M) ≠ μ_HipL(F),
where μ_HipL(M) is the mean of the hip-left joint for males and μ_HipL(F) is the mean of the hip-left joint for females. To compare each joint of males and females pairwise during gait, the t-test is used, while the ANOVA technique compares all joints at the same time. Assume that the variances of the male and female joint contributions are the same for the t-test. The t-test statistic is defined as

t = (X̄_1 − X̄_2) / (S_p sqrt(1/n_1 + 1/n_2)),

where X̄_1 is the combined mean of each joint for males, X̄_2 is the combined mean of each joint for females, n_1 is the number of males, and n_2 is the number of females. The degrees of freedom for the t-test are

ν = n_1 + n_2 − 2,

where ν represents the degrees of freedom. S_p is the pooled standard deviation for both male and female joints and is defined as

S_p = sqrt(((n_1 − 1)s²_1 + (n_2 − 1)s²_2) / (n_1 + n_2 − 2)),

where s²_1 and s²_2 are the variances of the various joints for males and females. As shown in Table 3, all twenty joints are statistically significant at the 1% level of significance, because the p-value of every joint is zero, i.e., less than 1%; this suggests that there is a clear difference between males and females at each joint during a casual walk. With the help of the t-test, it is concluded that all joints of males and females play different roles during gait.
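The equal-variance two-sample t-test described above can be sketched as follows (a minimal sketch with hypothetical combined means for male and female subjects; `scipy.stats.ttest_ind` would give the same statistic):

```python
import numpy as np

def pooled_sd(s1, s2, n1, n2):
    """Pooled standard deviation of two samples with SDs s1 and s2."""
    return np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def two_sample_t(x1, x2):
    """Equal-variance two-sample t statistic and degrees of freedom."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    n1, n2 = len(x1), len(x2)
    sp = pooled_sd(x1.std(ddof=1), x2.std(ddof=1), n1, n2)
    t = (x1.mean() - x2.mean()) / (sp * np.sqrt(1.0 / n1 + 1.0 / n2))
    return t, n1 + n2 - 2  # statistic and nu = n1 + n2 - 2

# hypothetical combined hip-left means for male and female subjects
t, dof = two_sample_t([1.0, 1.1, 0.9, 1.2], [0.5, 0.6, 0.4, 0.55])
```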

Comparison of Gender Joints Based on ANOVA Technique
The t-test is used for the comparison of two samples (e.g., one joint of males versus females) at a time. To compare more than two joints at the same time, the authors used the analysis of variance (ANOVA) technique. Table 4 shows the ANOVA results for particular joints during the gait of both males and females; the twenty joints of males are compared with the twenty joints of females, so k = 40 and d.f. = 39.
Further, n = 1440 and k = 40; therefore, the error d.f. = n − k = 1400, with the corresponding sum of squares (SS) and mean squares (MS), respectively. The p-value is less than 5% and the difference between the various joints is significant, i.e., there is a significant difference between males and females at the hip-center joint. This suggests that, at the hip-center joint, the female contribution is higher than that of males, which further indicates that the contribution of the hip center during walking differs between males and females, as shown in Table 3. The t-test and the ANOVA technique provide the same conclusion. The constant k denotes the total number of joints, i.e., male = 20 and female = 20.
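A one-way ANOVA of the kind used above can be sketched as follows (toy groups standing in for joint samples; `one_way_anova` is an illustrative name, and `scipy.stats.f_oneway` computes the same F statistic):

```python
import numpy as np

def one_way_anova(groups):
    """One-way ANOVA: returns (F, df_between, df_within).

    F = MS_between / MS_within, for k groups and n total observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = np.mean(np.concatenate([np.asarray(g, float) for g in groups]))
    ss_between = sum(len(g) * (np.mean(g) - grand) ** 2 for g in groups)
    ss_within = sum(((np.asarray(g, float) - np.mean(g)) ** 2).sum()
                    for g in groups)
    ms_between = ss_between / (k - 1)  # mean square between groups
    ms_within = ss_within / (n - k)    # mean square within groups (error)
    return ms_between / ms_within, k - 1, n - k

# toy example: three groups of joint measurements
F, df1, df2 = one_way_anova([[1.0, 1.1, 0.9],
                             [2.0, 2.1, 1.9],
                             [3.0, 3.1, 2.9]])
```

A large F relative to the F distribution with (df1, df2) degrees of freedom yields a small p-value, i.e., a significant difference between the groups.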

Feature Selection
We first calculate the adjusted mean and adjusted standard deviation to determine the Cronbach's alpha value. By evaluating a feature's consistency, i.e., how closely related a set of items is, we can use Cronbach's alpha to assess it.
The Cronbach's alpha yields an average result of 99.74% as shown in Table 5. It indicates that the data can be analyzed very effectively.

Classification Using Machine Learning
Machine learning classification can often capture key information to represent subjects' identities that other classifiers cannot capture, by iteratively feeding gait features and corresponding labels into machine learning models in the training phase. Regression models are well-known machine learning classifiers for gait biometrics. The relationship between a categorical target variable and one or more independent factors is measured using logistic regression. It is helpful when there are just two alternative outcomes for the target variable (in other words, it is binary). In order to predict the target variable classes, binary logistic regression classification uses one or more predictor variables that may be continuous or categorical. With the aid of this technique, it is possible to pinpoint crucial variables (Xi) that have an impact on the dependent variable (Y), as well as the nature of their interactions. The binary logistic regression model can be represented mathematically as

P_1 = e^(β_0 + β_1X_1 + . . . + β_kX_k) / (1 + e^(β_0 + β_1X_1 + . . . + β_kX_k)), P_0 = 1 − P_1.

The main advantage of the proposed system is identifying the gender (male/female) of the walking subject on the basis of the 3D joint positions of the gait detected via the MS Kinect controller. The proposed system calculates the values of P_1 and P_0 using Equations (22) and (23). When the value of P_1 exceeds the threshold value, the walking person is classified as male, and when the value of P_0 exceeds the threshold value, the walking person is classified as female. To compare the results of the fitted binary logistic regression model, we used the threshold value of the binary logistic regression model for gender classification.
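The P_1/P_0 thresholding step can be sketched as follows. The coefficients, feature vector, and the 0.5 threshold below are hypothetical illustrations, not the fitted values from the paper.

```python
import numpy as np

def logistic_probability(x, beta):
    """P1 = e^(b0 + b.x) / (1 + e^(b0 + b.x)); P0 = 1 - P1."""
    z = beta[0] + np.dot(beta[1:], x)   # linear predictor b0 + b1*x1 + ...
    p1 = 1.0 / (1.0 + np.exp(-z))       # numerically stable sigmoid form
    return p1, 1.0 - p1

def classify_gender(x, beta, threshold=0.5):
    """Label the walking subject 'male' when P1 exceeds the threshold,
    'female' otherwise."""
    p1, _ = logistic_probability(x, beta)
    return "male" if p1 > threshold else "female"

# hypothetical coefficients and a feature vector of joint statistics
beta = np.array([-1.0, 2.0, 0.5])
label = classify_gender(np.array([1.0, 1.0]), beta)
```

In practice the coefficients β would be estimated from the training data, e.g., with `sklearn.linear_model.LogisticRegression`.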

Experimental Result
To evaluate our proposed method, we used a dataset of 373 different individuals of different ages and genders wearing their typical shoes and clothes. Our setup placed the Kinect camera on a tripod at a height of 1.0 m in a typical room measuring 6 m × 8 m with normal daylight. The recorded data were processed on a PC with 64 GB of RAM and an Intel Core i3 3.12 GHz processor. The training computations require only a few seconds of the subjects' walking to complete. The MS Kinect tracks the subjects' 3D joint positions at a rate of 30 frames per second (fps). We recorded people walking toward and away from the camera in front of the sensor in order to capture lengthy sequences. As a result, each walking sequence had between 30 and 150 frames. Different step sizes and speeds among the participants caused variations in sequence duration. For 3D joint detection, the visibility of the full body joints in front of the MS Kinect is mandatory; therefore, the maximum distance between the MS Kinect and the walking subject is defined as 6 feet (see Figure 4). The main factors in optimizing the distance to 6 feet were the height of the person and the resolution (640 × 480) of the camera: when a tall person walks near the camera, some of the body joints are occluded due to their height, which affects the accuracy because fewer joints are detected. In contrast, a person walking beyond the working area of the MS Kinect can result in the false detection of 3D joints. The distance between the subject and the MS Kinect does not affect the detection of the walking person itself, but only the detection of the full set of body joints. For a better depiction of the data, the data also include additional details, such as participants' age and gender. Figures 5 and 6 show screenshots of the detected gender.

Classification Accuracy
We have evaluated several parameters and features of our method in order to fully assess it, including the efficiency of joint combinations. The accuracies achieved by our system using different sets of joints, namely 9, 12, 14, 16, 18, and 20, were 93.0%, 95.0%, 90.0%, 96.0%, 97.0%, and 98.0%, respectively. The best accuracy was 98.0% when all 20 joints were used in our suggested system. We considered several combinations of walking sequences for training and testing in order to choose the optimal model for gender classification. For each set, we maintained a minimum difference of two to three joints. By contrasting all of the male and female joints, we find a natural overall difference. The complete set of joints greatly enhanced the output; however, the variance is minimal and the rate of increase is moderate, demonstrating the robustness of our classifier to the training data. To measure the accuracy of classification, we utilize the following definition:

Accuracy = (TP + TN) / (TP + TN + FP + FN).

Our classification accuracy for each of the classes is summarized in Table 6, which lists the classification accuracy of the proposed system with different sets of joints. The proposed technique achieves the same accuracy for males as for females, and comparatively better accuracy than other systems. Accuracy decreases with larger standard errors and increases with lower standard errors; these joints provide better gender detection because their standard errors are minimal. Moreover, the p-value for each joint is less than 0.05, which indicates that the joints are statistically significant in the prediction of gender, as shown in Table 7. The standard error of a joint can be computed using the formula

S.E = S.D / √n.

We tested the result of the proposed system against a number of conventional algorithms, such as naive Bayes, multi-layer perceptron (MLP), random forest (RF), and k-nearest neighbor (KNN). The comparison results are shown in Table 8.
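The accuracy definition and the standard error can be sketched as follows (a minimal sketch; the confusion-matrix counts are hypothetical and the S.D/√n form of the standard error is an assumption, since the paper's formula is not shown in full):

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)

def standard_error(sd, n):
    """Standard error of an estimate: S.D / sqrt(n) (assumed form)."""
    return sd / n ** 0.5

# hypothetical confusion counts: 49 males and 49 females correct out of 100
acc = accuracy(49, 49, 1, 1)  # 0.98, i.e., 98.0%
```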
Performance evaluation parameters are used to assess the experimental results of gender classification. Classification accuracy and the true positive rate (recall) for male and female individuals are considered as the evaluation criteria. Among the conventional algorithms, the best performance is shown by the KNN algorithm, with a classification accuracy of 92.0%. In comparison, the classification accuracy of the proposed system is 98.0%, which is better than all of the existing systems. Most of the conventional methods showed unbalanced recall for males and females, which indicates that the data are difficult for them to classify. On the other hand, the proposed system is better because it shows balanced recall for both males and females.
To the best of the authors' knowledge, very few datasets are publicly available for gait-based gender recognition captured by Microsoft Kinect. After an extensive literature review, we found one public dataset (UPCVgait) of gait data captured by Microsoft Kinect for gender recognition. We applied the proposed model to this dataset for gender classification and compared it with other conventional methods [55]. As shown in Table 9, the existing system produced an accuracy of 96.67% [55]. The proposed system outperformed it with an accuracy of 98.5% on the same dataset (see Table 8).

Discussion
The statistical findings of this research reveal that the coefficients of variation for male and female joints during gait show significant differences. The coefficient of variation at the head joint for males is 10.31 (see Table 1), which shows less variability than the other male joints; the head joint of males is therefore the most consistent during walking. Likewise, the coefficient of variation at the head joint of females is 9.05, which also shows less variability than the remaining female joints, so the head joint is also the most consistent joint for females during a casual walk. In contrast, the coefficient of variation at the foot-left joint of males is 17.58, which shows high variability compared to the remaining male joints; the male contribution at the foot-left joint therefore varies from person to person. For the foot-left joint of females, the coefficient of variation is 17.01, which shows less variability than the foot-left joint of males, so female performance at this joint is more consistent than that of males. The coefficients of variation at the foot-right joint for males and females are 16.34 and 17.82, respectively, which shows that male performance is more consistent than female performance at the foot-right joint.
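The coefficient of variation driving these comparisons can be sketched as follows; the two trajectories are synthetic stand-ins for a stable joint (head) and a swinging joint (foot-left), not the paper's data.

```python
# Coefficient of variation, CV = 100 * s / mean, used above to compare
# joint consistency. The sample trajectories are synthetic: a head-like
# stable signal versus a foot-like swinging signal.

def coefficient_of_variation(values):
    n = len(values)
    mean = sum(values) / n
    s = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return 100.0 * s / mean

head_y = [1.70, 1.71, 1.69, 1.70, 1.71]       # stable joint -> low CV
foot_left_y = [0.10, 0.14, 0.06, 0.12, 0.08]  # swinging joint -> high CV

cv_head = coefficient_of_variation(head_y)
cv_foot = coefficient_of_variation(foot_left_y)
# cv_head < cv_foot: the head is the more consistent joint
```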
Comparing all joints using the t-test, there is a significant difference between male and female joints. The p-values for all joints are zero (see Table 3), which indicates that male and female joints play different roles during gait.
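The two-sample t statistic underlying this comparison can be sketched as below; the male and female samples are synthetic, chosen so that the group means clearly differ.

```python
# Sketch of the two-sample (pooled) t statistic used to compare a joint
# coordinate between male and female walkers; the samples are synthetic.
import math

def t_statistic(a, b):
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)  # pooled variance
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))

male_head_y = [1.75, 1.78, 1.74, 1.77, 1.76, 1.79]
female_head_y = [1.62, 1.65, 1.61, 1.64, 1.63, 1.60]

t = t_statistic(male_head_y, female_head_y)
# |t| far above the ~2.0 critical value -> the difference is significant
```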
According to the correlation analysis, both males and females move widely in the X and Z coordinates: the correlation between the X and Z coordinates is high and highly significant for both males and females, because the p-value for each joint is zero. In contrast, the correlation is weak (below 50%) for most male and female joints between the X and Y coordinates, as well as between the Y and Z coordinates. The correlations of the elbow-left and hip-left joints for males and of the head joint for females are more than 90%, which means that these joints are highly significant (see Table 2). The correlation analysis thus reveals a clear difference between male and female joints. The t-test results show that all male and female joints are significantly different from each other. The ANOVA technique, applied to reduce the type-I error and to compare all male and female joints, provided the same conclusion: all joints of males and females are statistically significant at the 1% and 5% levels of significance.
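The Pearson correlation used here can be sketched as follows; the trajectory is a synthetic diagonal walk in which the X (lateral) and Z (depth) coordinates advance together, mirroring the strong X-Z correlation reported above.

```python
# Pearson correlation between two coordinates of a joint trajectory.
# The diagonal walk below is synthetic: X and Z advance together, so
# their correlation is high.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

x = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]   # lateral position per frame
z = [1.0, 1.3, 1.5, 1.8, 2.0, 2.3]   # depth position per frame

r = pearson(x, z)  # close to 1: strong X-Z correlation
```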
The results of the t-test, ANOVA, and correlation analyses suggest that male and female joints contribute differently during walking. Furthermore, Cronbach's alpha yields a result of 99.74%, which suggests the reliability of the data. Finally, the gender of the walking subjects is detected with an accuracy of 98.0% using a machine learning approach.
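Cronbach's alpha, the reliability statistic quoted here, can be sketched as below; the per-joint score lists are illustrative, built so the items co-vary strongly and the resulting alpha is close to 1.

```python
# Sketch of Cronbach's alpha, the reliability statistic quoted above.
# `items` holds one list of scores per joint (rows) across subjects
# (columns); the numbers are illustrative only.

def variance(v):
    m = sum(v) / len(v)
    return sum((x - m) ** 2 for x in v) / (len(v) - 1)

def cronbach_alpha(items):
    k = len(items)                       # number of joints (items)
    n = len(items[0])                    # number of subjects
    item_var_sum = sum(variance(it) for it in items)
    totals = [sum(it[i] for it in items) for i in range(n)]
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

items = [
    [1.0, 2.0, 3.0, 4.0],   # joint 1 scores for four subjects
    [1.1, 2.0, 3.1, 4.0],   # joint 2 tracks joint 1 closely
    [0.9, 2.1, 2.9, 4.1],   # joint 3 likewise
]
alpha = cronbach_alpha(items)  # close to 1 -> highly reliable items
```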

Conclusions and Future Work
In this paper, a machine learning approach for gait-based gender classification using all body joints was proposed. The proposed method includes several stages, such as gait feature extraction based on three-dimensional (3D) features, feature selection, and human gender classification. The proposed system extracts the 3D positions of all human joints using the Kinect sensor, and these positions are compared to identify gender during walking. Different statistical tools, such as Cronbach's alpha, correlation, the t-test, and ANOVA, were used to select significant joints. The Cronbach's alpha technique yielded an average result of 99.74%, which indicates the reliability of the joints. Similarly, the correlation results indicated that there is a significant difference between male and female joints during gait. The t-test and ANOVA approaches demonstrated that all twenty joints are statistically significant for gender classification, because the p-value for each joint was zero and less than 1%. It was observed that all male joints contribute differently than female joints during gait. The results showed that the head joint is consistent for both males and females during walking. Furthermore, the results showed that the foot-left and foot-right joints of both males and females vary more from individual to individual. The hand-left, knee-right, hip-center, and spine joints played the same role for all males during gait, while the spine, shoulder-center, head, and shoulder-right joints made the same contribution for all females. According to the coefficients of variation of male and female joints, male joints are more consistent than female joints during walking, which shows that there is a larger variance in female joints than in male joints. There was high variation at the knee-left joint of males during walking, which suggests that the contribution of the knee-left joint of males varies from person to person.
There was high variation in the foot-right, ankle-right, and foot-left joints of females, which indicates that the female contribution varies from female to female. Finally, the gender of the walking person was correctly identified with an accuracy of 98.0% using all body joints. To demonstrate the improved performance of the proposed system, the experimental results were compared with deep learning as well as conventional machine learning techniques. These comparative studies demonstrated the robustness of the proposed machine learning approach, which outperforms the state-of-the-art models. The majority of studies on gait-based gender recognition published in the literature have relied on very small sample sizes, which may affect gender identification.
In future work, we will strive to further strengthen the results by studying a much larger database. We will also investigate which joints of males and females are more significant during running, jumping, and sports activities. Furthermore, these results can be used for modeling human identification.