Classification of Diabetic Walking for Senior Citizens and Personal Home Training System Using Single RGB Camera through Machine Learning

Abstract: Senior citizens have higher plasma glucose and a higher risk of diabetes-related complications than young people. However, elderly diabetics are difficult to diagnose and manage because they often show no clear symptoms under current diagnostic criteria, and they dislike invasive blood sampling. This study aimed to classify, with a non-invasive method, differences in gait and physical fitness characteristics between senior citizens with and without diabetes, and to propose a machine-learning-based personal home-training system with which users can train abnormal gait motions by themselves. For classification, we used a dataset of 200 elders over 65 years old who walked a flat, straight 15 m route under three walking-speed conditions while wearing an inertial measurement unit, and who completed physical fitness tests. Questionnaires were also included to identify the participants' life patterns. The results showed abnormalities in gait and physical fitness characteristics related to balance ability and walking speed. Using a single RGB camera, the developed training system for improving these abnormalities corrected exercise posture and speed in real time. Finally, we discuss the risks and errors of training systems based on human pose estimation as directions for future work.


Introduction
Diabetes is a metabolic disease that causes problems with either the secretion or the assimilation of insulin, a natural hormone whose function is to reduce the sugar concentration in the blood [1]. If diabetes is not managed, high blood sugar levels and other risk factors can lead to blood vessel and nerve damage [2]. Moreover, complications of diabetes can develop and affect nearly every organ system in the body. In particular, the body relies on large and small blood vessels to deliver blood. Damage to the large blood vessels can lead to heart attacks and several kinds of strokes or can impair blood flow to the lower extremities, and damage to the small blood vessels can affect the eyes, kidneys, teeth and gums, and nerves. In addition, nerve damage can affect the digestive system, sexual organs, and excretory system. These wide-ranging complications are a significant problem because there is no complete cure for diabetes yet. Current management of diabetes is limited to checking the amount of glucose in the blood, adjusting the diet, and exercising every day by oneself [1,3].
Walking is the activity recommended for most diabetic patients because it is effective for weight loss and maintenance and for improving glucose control [4]. This recommendation comes from a meta-analysis of many small, short-term randomized controlled trials (RCTs) and from some more recent research, which show a clinically appreciable improvement in HbA1c [5,6]. The effect on insulin resistance is not apparent. Walking is easily applicable in daily life for most patients, requires neither expertise nor logistic support, and can be performed in different places [7,8]. Limited information indicates an improvement in several alterations involved in the increased cardiovascular risk associated with diabetes. Moreover, a small study reported favorable changes in several functional aspects of diabetic neuropathy, although the presence of this condition requires specific monitoring of patients and may also limit walking activities [4].
Many researchers have shown that diabetes in elders is associated with a greater risk of falls, and that this association is clearer in insulin-treated patients [9]. For example, across six studies involving 14,685 participants, the fall rate was 25.0% in diabetic and 18.2% in non-diabetic participants [10]. In addition, diabetes increased the risk of falls by 94% in insulin-treated patients and by 27% in non-insulin-treated patients [11]. Hence, preliminary screening before starting any physical activity program in older adults with diabetes mellitus should include a general medical examination, with specific attention to symptoms and signs of chronic complications (cardiovascular disease, nephropathy, retinopathy, and neuropathy), and an assessment of metabolic control [12]. However, a significant problem is that analysis results vary widely because the diabetic population is diverse in the presence and severity of diabetes complications. Accordingly, although the prescription of walking for patients with diabetes should be preceded by a tailored medical and functional assessment, it is difficult for patients to assess their walking capacity by themselves [13]. Thus, senior citizens with diabetes need a way to recognize diabetic walking abnormalities for self-preservation and to evaluate their walking function without expert medical knowledge, using a simple sensing system.
Several studies have assessed alterations of walking biomechanics in diabetic neuropathic subjects [11]. Regarding kinetics and muscle activation patterns, the results vary considerably, with both reductions and increases reported in gastrocnemius activity and in lower limb joint moments. In terms of plantar pressure [14,15], a shorter center of pressure (CoP) excursion and a higher peak pressure over the forefoot have been found [16]. Although diabetic symptoms are apparent, the neuromuscular, kinematic, and kinetic changes do not show a distinct pattern associated with diabetes and diabetic neuropathy (DPN) [17–23]. Compared to these analyses of movement symmetry, continuous relative phase (CRP) is one of the most sensitive analyses for detecting the mutual relationship among the joints and asymmetries in coordination during walking, particularly for identifying cyclic movement deviations caused by diabetes [24,25]. However, because CRP also relies on biomechanical walking variables, such as ground reaction forces and the angles and moments of the trunk, hip, knee, and ankle, the assessment has to be performed at a well-equipped hospital by staff with expert medical knowledge using an expensive motion capture system. Thus, with conventional approaches, it is difficult for diabetic senior citizens to protect themselves in daily life.
Therefore, this study aimed to classify differences in gait and physical fitness characteristics between senior citizens with and without diabetes through automated machine learning (AutoML) using a simple sensor, and to propose a machine-learning-based personal home training system with which users can train abnormal gait motions by themselves. For this study, a dataset was constructed from 200 senior citizens over 65 years old who walked a flat, straight 15 m pathway under three walking-speed conditions (slow (20% slower than the preferred walking speed), preferred, and fast (20% faster than the preferred walking speed)) while wearing an inertial measurement unit (IMU), and who completed physical fitness tests. Then, for training abnormal gait motions, the proposed home training system corrected exercise posture and speed in real time through a machine-learning-based similarity evaluation between training experts and novices using a single RGB camera.
The contribution of this study is to clarify the association between diabetic walking abnormalities and the measured feature vectors. These results could be applied to early detection and to therapeutic interventions that rehabilitate walking function in diabetic senior citizens, by grouping them not only by clinical features but also by their motor control strategies. Furthermore, the proposed home training system helps senior citizens with diabetes continue exercising with the correct posture and speed.
The rest of this paper is structured as follows. Section 2 describes how the dataset was constructed, Section 3 explains the classification through AutoML, and Section 4 presents the results, including which gait characteristics are essential for classification. To train the detected abnormality, Section 5 introduces the developed machine-learning-based personal home training system, and Section 6 presents the results that verify its usefulness. Finally, Section 7 discusses classification for healthcare through AutoML and the limitations of the developed home training system, and Section 8 concludes.

Human Subjects
The human subjects in this study participated in community activities in the Busan metropolitan city from 2018 to 2019. A total of 200 community-dwelling subjects aged 65 years and over were recruited; 59 were in the healthy group, and the remaining 141 were in the group with diabetes. No one in the diabetes group took anti-diabetes drugs to control blood sugar levels because their symptoms were not severe. Thus, the treatment of diabetes in this study was centered on diet, exercise, and weight loss. In addition, a subject was excluded if he or she could not walk without a walking aid, had a history of severe orthopedic problems, or had neurosurgical and neurophysiological problems in the preceding six months. Figure 1 summarizes the characteristics of the two groups. There is no significant difference in age or body mass index (BMI) between subjects with and without diabetes. Although there are more females than males, the gender ratio is similar in the two groups. The mean age of the subjects is 74 years, which places them among senior citizens. Between 80% and 91% of the subjects completed compulsory education, including elementary, middle, and high school. All participants who agreed to take part in all experiments read and signed the informed consent document approved by the Institutional Review Board of Dong-A University (IRB number: 2-104709-AB-N-01-201808-HR-023-02). All experimental procedures were performed in accordance with the Declaration of Helsinki.
Figure 2 shows the experimental environment for the measurement and analysis phases under steady-state conditions. (a) The three arrows, from left to right, indicate the acceleration, consecutive, and deceleration steps of the measurement phase, respectively. (b) Walking abnormalities are detected with the shoe-type inertial measurement unit (IMU) system by analyzing extracted features, such as heel strike (HS) and toe-off (TO) [26,27].
The shoe-type IMU sensor (DynaStab™, JEIOS, Busan, Korea) consists of a data logger (Smart Balance SB-1, JEIOS, Busan, Korea) and a data acquisition device. The IMU sensor (IMU-300™, InvenSense, San Jose, CA, USA) in the data logger measures tri-axial acceleration (up to ±6 g) and tri-axial angular velocity (up to ±500° s⁻¹) along three orthogonal axes. The IMU sensors are mounted on the outsoles of both shoes, and data are transmitted to the data acquisition device via Bluetooth. The walking data were collected at 100 Hz and filtered using a second-order Butterworth low-pass filter with a 10 Hz cut-off frequency. Although data are also measured during the acceleration and deceleration steps, they are discarded; only data from the consecutive steps are used for feature extraction.
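The filtering step described above can be sketched as follows. This is a minimal illustration rather than the processing code used in the study: it assumes a single forward (causal) pass of a second-order Butterworth low-pass section, whereas the text does not state whether forward-backward (zero-phase) filtering was applied. The function names `butter2_lowpass_coeffs` and `lowpass` and the synthetic test signal are hypothetical.

```python
import math


def butter2_lowpass_coeffs(cutoff_hz, fs_hz):
    """Second-order Butterworth low-pass biquad (bilinear transform, Q = 1/sqrt(2))."""
    w0 = 2.0 * math.pi * cutoff_hz / fs_hz
    q = 1.0 / math.sqrt(2.0)  # quality factor of a 2nd-order Butterworth section
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    # Coefficients normalized so that the leading denominator term is 1.
    b = [(1.0 - cos_w0) / (2.0 * a0), (1.0 - cos_w0) / a0, (1.0 - cos_w0) / (2.0 * a0)]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def lowpass(samples, cutoff_hz=10.0, fs_hz=100.0):
    """Filter a sequence with one forward pass (assumption: causal filtering)."""
    b, a = butter2_lowpass_coeffs(cutoff_hz, fs_hz)
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


# Synthetic example: a 1 Hz "gait-like" component plus 40 Hz noise, sampled at 100 Hz.
fs = 100.0
t = [n / fs for n in range(300)]
noisy = [math.sin(2 * math.pi * 1.0 * ti) + 0.5 * math.sin(2 * math.pi * 40.0 * ti) for ti in t]
filtered = lowpass(noisy, cutoff_hz=10.0, fs_hz=fs)
```

In practice a signal-processing library would normally supply the coefficients; they are written out here only to keep the sketch self-contained.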

Experiment for Data Acquisition
All subjects performed three trials of the overground walking test along the straight 15-m walkway at slower, preferred, and faster speeds while wearing the shoe-type IMU sensor. The preferred walking speed is defined as the subject's comfortable, stable, and usual walking speed. The slower and faster speeds were chosen subjectively by each subject, with the target of walking about 20% slower or faster than the preferred speed. A metronome was used to support the choice of walking speed: before each test, the subjects were asked to walk at the preferred speed so that their cadence could be measured with the metronome.
All subjects also completed nine tests covering four physical fitness domains: muscle strength, flexibility, balance, and cardiorespiratory fitness. To measure upper extremity strength, all subjects performed a grip strength test with a hand-grip dynamometer (TKK 5401 Grip-D, Takei Scientific Instruments, Tokyo, Japan) and biceps curls with a dumbbell (3 kg for men; 2 kg for women). The five times sit-to-stand (STS) test and the standing time from a long sitting position were used to assess lower extremity strength. To assess flexibility, the back scratch test (upper extremity) and the chair sit and reach test (lower extremity) were performed. Single-leg balance on the dominant leg (static balance) and a three-meter timed-up-and-go (TUG) test (dynamic balance) were performed to assess balance. Finally, a 6-min walk test was performed to assess cardiorespiratory (or functional) fitness. The score for each physical fitness test was the mean of two attempts.

Dataset
The dataset consists of features extracted by non-invasive methods, namely the survey and the measurement with the shoe-type IMU sensor. The average age of the human subjects in this study is 74 years, so they are senior citizens. Invasive methods were considered impractical because they place a heavy burden on such participants. It is therefore crucial to extract the features for classification with non-invasive methods. In total, 43 features are used for training and prediction: 7 from the survey and 36 from the measurement.
First, all human subjects were surveyed to check their health condition. Figure 3 shows the walking-related features extracted by the non-invasive methods, including the survey and the measurement. The features extracted from the survey are shown under numbers 1, 2, and 3: Age, Gender, Education level, BMI, MET-min/week, Hypertension, MMSE score, Insomnia score, and Quality of Life (QOL). These features are widely used to estimate personal life patterns. The total number of extracted features was 9.
Then, all human subjects took part in the walking experiment while wearing the shoe-type IMU sensor. The features extracted from the measurement are shown under numbers 4 and 5 in Figure 3: walking speed, stride length, CV stride length, CV stance phase, gait asymmetry, cadence, stride time, CV stride time, grip strength, five times sit-to-stand (STS), biceps curl, chair sit and reach, three-meter TUG, back scratch, single leg balance, six min walk, and standing time from a long sitting position. These features are also widely used to evaluate walking and physical fitness characteristics. All walking experiments were conducted under the three conditions of slower, preferred, and faster speeds. The total number of extracted features was 36.
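The variability features above (CV stride length, CV stance phase, CV stride time) are coefficients of variation. A minimal sketch of how such a feature could be computed is given below; it assumes the sample standard deviation and a percentage scale, which the text does not specify, and the stride times are hypothetical values used only for illustration.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the coefficient of variation in percent: 100 * sample SD / mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; the coefficient of variation is undefined")
    return 100.0 * stdev(values) / m


# Hypothetical stride times (seconds) from the consecutive steps of one trial.
stride_times_s = [1.02, 0.98, 1.05, 1.00, 0.95]
cv_stride_time_percent = coefficient_of_variation(stride_times_s)
print(f"CV stride time: {cv_stride_time_percent:.2f}%")
```

A larger value means less regular strides; the same function applies unchanged to stride length or stance phase.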

Classification through Automated Machine Learning
Most AutoML tools follow the typical three-step pipeline described in Figure 4, which shows the typical components of a machine learning pipeline [28]. The first step consists of preparing the data [29]. This step involves loading and cleaning the data for use in the system and applying any transformations, normalizations, or encodings. The next step involves selecting the model [30]. This step might also involve feature engineering, which uses domain knowledge to generate new features that support and improve the machine learning model. The final step consists of an iterative process in which one builds, trains, optimizes, validates, and selects a machine learning algorithm for a given problem. In general, these three components are optimized iteratively to obtain the best results.
The learning stage seeks a function $f : X \to Y$, where $Y$ is a finite set of labels for classification or a continuous range for regression. A learning algorithm $A$ maps a set $\{d_1, \ldots, d_n\}$ of training data points $d_i = (x_i, y_i) \in X \times Y$ to such a function, which is usually represented by a parameter vector. The algorithm also has hyperparameters $\lambda \in \Lambda$, which change the way the algorithm $A_\lambda$ works. Examples of hyperparameters are a length penalty, the number of neurons in a hidden layer, and the number of data points in a leaf of a decision tree. An outer loop evaluates the performance of each hyperparameter configuration using cross-validation.

Data Preparation and Feature Engineering
Data pre-processing still requires considerable human intervention. This stage involves detecting the data types and the schema, a task that AutoML tools have so far supported only to a limited extent. However, once the data types are identified, the tools support appropriate feature engineering.

Model Selection
After many candidate models have been trained on the derived features with different parameters, the most suitable model can be selected. Given a set of learning algorithms $\mathcal{A}$ and a limited amount of training data $D = \{(x_1, y_1), \ldots, (x_n, y_n)\}$, the goal of model selection is to determine the algorithm $A^* \in \mathcal{A}$ with the best generalization performance. The performance of each model is estimated by splitting $D$ into training and validation sets, the latter denoted $D_{\mathrm{valid}}$, which allows model selection to be posed as minimizing the average loss on the validation sets. We use k-fold cross-validation, which splits the training data into k equal-sized partitions.
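The selection rule described above, averaging a validation loss over k folds and keeping the algorithm with the lowest average, can be sketched as follows. This is an illustration under stated assumptions, not the AutoML implementation: the loss is the plain error rate, folds are assigned round-robin without shuffling, and the two candidate "algorithms" (`majority_class`, `nearest_neighbor`) and the toy data are hypothetical.

```python
def k_fold_indices(n, k):
    """Assign indices 0..n-1 to k folds of (nearly) equal size, round-robin."""
    return [list(range(i, n, k)) for i in range(k)]


def majority_class(train):
    """Candidate algorithm 1: always predict the most frequent training label."""
    labels = [y for _, y in train]
    guess = max(set(labels), key=labels.count)
    return lambda x: guess


def nearest_neighbor(train):
    """Candidate algorithm 2: predict the label of the closest training point."""
    return lambda x: min(train, key=lambda d: abs(d[0] - x))[1]


def cv_error(algorithm, data, k):
    """Average validation error rate of `algorithm` over k folds."""
    errors = []
    for fold in k_fold_indices(len(data), k):
        held_out = set(fold)
        train = [d for i, d in enumerate(data) if i not in held_out]
        valid = [data[i] for i in fold]
        f = algorithm(train)  # fit on the training part only
        errors.append(sum(f(x) != y for x, y in valid) / len(valid))
    return sum(errors) / k


def select_model(algorithms, data, k=5):
    """Return the name of the algorithm with the lowest cross-validated error."""
    scores = {name: cv_error(alg, data, k) for name, alg in algorithms.items()}
    return min(scores, key=scores.get), scores


# Hypothetical toy data: (feature, label), with the label switching at feature 10.
data = [(i, 0 if i < 10 else 1) for i in range(20)]
best, scores = select_model({"majority": majority_class, "1-NN": nearest_neighbor}, data, k=5)
print(best, scores)
```

Each validation fold is never seen during fitting, which is what makes the averaged error an estimate of generalization performance rather than of training fit.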
A hyperparameter $\lambda_i$ is said to be conditional on another hyperparameter $\lambda_j$ when $\lambda_i$ is active only if $\lambda_j$ takes values from a given set $V_i(j) \subsetneq \Lambda_j$ [32]; in this case, we call $\lambda_j$ a parent of $\lambda_i$. Conditional hyperparameters may themselves be parents of other conditional hyperparameters [33]. When the structure of $\Lambda$ is given, the choice of algorithm and the setting of its hyperparameters can be optimized together as a single problem.
Figure 5 shows an example of the constructed dataset for this study. Statistical methods such as principal component analysis (PCA) did not work well because the total number of feature vectors was 43. The seven feature vectors obtained from questionnaires rely on subjective judgment. However, there is preliminary evidence that an interview on self-care and regimen adherence is a reliable and valid instrument that efficiently assesses self-care behaviors associated with glycemic control. The features used in this study are related to physical activities in daily life. The remaining 36 feature vectors are related to physical fitness and to gait characteristics of senior citizens with diabetes walking under imposed, challenging speed conditions. The dimension of the matrix is (human subjects × feature vectors = 200 × 43).
Figure 6 shows the data distribution under the three walking-speed conditions. The blue data (mean ± standard deviation (SD) = 0.898 ± 0.155) represent the slow condition, which is on average 22% slower than the preferred speed, and the orange data (1.157 ± 0.219) represent the preferred condition. Finally, the gray data (1.441 ± 0.275) represent the fast condition, which is on average 25% faster than the preferred speed. These results confirmed that the constructed dataset was suitable for analyzing gait and physical fitness characteristics, although the walking-speed conditions depended on each subject's subjective criteria.
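As a quick check of the reported speed conditions, the relative changes can be recomputed from the group means quoted above. This is only a sketch: it uses the ratio of group means, which need not equal the average of per-subject ratios that the study may have used.

```python
# Group mean walking speeds as reported for Figure 6.
slow_mean, preferred_mean, fast_mean = 0.898, 1.157, 1.441

slow_change_percent = 100.0 * (slow_mean - preferred_mean) / preferred_mean
fast_change_percent = 100.0 * (fast_mean - preferred_mean) / preferred_mean

print(f"slow: {slow_change_percent:+.1f}%")  # slow: -22.4%
print(f"fast: {fast_change_percent:+.1f}%")  # fast: +24.5%
```

Both values are close to the nominal ±20% targets and to the 22% and 25% averages stated in the text.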
Figure 7. Results of training through automated machine learning (AutoML). "auc" is the area under the receiver operating characteristic curve, which summarizes how well the trained model separates the two classes; "logloss" is the cross-entropy between the model predictions and the target values; "rmse" is the root-mean-square error; and "mse" is the mean squared error. "mean_per_class_error" is the average of the per-class error rates and is one of the metrics available for classification.
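The leaderboard metrics named in this caption can be illustrated with a small, library-free sketch. It is not how the AutoML tool computes them internally; in particular, the fixed 0.5 decision threshold used for `mean_per_class_error` is an assumption (tools may choose the threshold differently), and the labels and predicted probabilities are hypothetical.

```python
import math


def mse(y_true, p):
    """Mean squared error between labels and predicted probabilities."""
    return sum((y - q) ** 2 for y, q in zip(y_true, p)) / len(y_true)


def rmse(y_true, p):
    """Root-mean-square error."""
    return math.sqrt(mse(y_true, p))


def logloss(y_true, p, eps=1e-15):
    """Binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, q in zip(y_true, p):
        q = min(max(q, eps), 1.0 - eps)
        total -= y * math.log(q) + (1 - y) * math.log(1.0 - q)
    return total / len(y_true)


def auc(y_true, p):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [q for y, q in zip(y_true, p) if y == 1]
    neg = [q for y, q in zip(y_true, p) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def mean_per_class_error(y_true, p, threshold=0.5):
    """Average of the error rates of class 0 and class 1 at a fixed threshold."""
    rates = []
    for cls in (0, 1):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        wrong = sum((p[i] >= threshold) != bool(cls) for i in idx)
        rates.append(wrong / len(idx))
    return sum(rates) / 2


# Hypothetical labels (1 = diabetes, 0 = control) and predicted probabilities.
y_true = [1, 1, 1, 0, 0]
p_hat = [0.9, 0.8, 0.4, 0.3, 0.6]
print(auc(y_true, p_hat), logloss(y_true, p_hat), rmse(y_true, p_hat),
      mse(y_true, p_hat), mean_per_class_error(y_true, p_hat))
```

Because the two groups in this study are unbalanced (141 versus 59 subjects), a per-class average such as `mean_per_class_error` is less dominated by the larger class than a plain error rate.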

Results of Classification
The training results showed that differences in gait and physical fitness characteristics between senior citizens with and without diabetes could be classified with 80% accuracy using the non-invasive method. Figure 8 shows the confusion matrix for prediction with the distributed random forest (DRF), which achieved the highest accuracy among the algorithms used. Of the 200 samples, 165 were used for training (about 80%) and 35 for prediction (about 20%). The error rate was low, which indicates that the applied algorithm was suitable for classifying the difference between senior citizens with and without diabetes.
Figure 9 shows the variable importance plot, which gives the relative importance of the most important variables in the model. In H2O, variable importance is not available for Stacked Ensemble models; therefore, when h2o.explain() [34] is called on an AutoML object with a Stacked Ensemble at the top of the leaderboard, it shows the variable importance of the top "base model" instead, which is DRF in this study, as in Figure 7. The variable single_leg_balance_s was the most dominant, which suggests that balance stability may be lower in senior citizens with diabetes than in those without. We also found that senior citizens exhibited poorer gait stability at the slower and faster walking speeds. Thus, testing at different walking speeds helps evaluate the gait characteristics that distinguish senior citizens with diabetes from controls. Additional important variables in the model were fast_CV_stance_phase_percent, standing_time_from_a_long_sitting_position_s, six_min_walk_s, slow_CV_stride_length_percent, fast_CV_stride_time_percent, three_meter_TUG_s, and non-physical performance variables, such as MMSE_score, hypertension, and total_physical_activity/MET_min_week. The variability (CV) domain of gait proved to be an important factor in senior citizens with diabetes.

Development of Machine-Learning-Based Personal Training System
Our machine-learning-based analysis of the gait and physical fitness of senior citizens found that the symmetry between the left and right feet differed at fast walking speed because elderly patients with diabetes had worse balance than healthy elderly adults. Therefore, even when elderly adults have to exercise alone because of social distancing during the COVID-19 pandemic, they need to train and evaluate their balance ability.
The authors have become interested in developing a home training system with an algorithm that allows users to evaluate their exercise poses alone at home using low-cost, readily available devices. This study confirms the feasibility of an algorithm that evaluates dynamic exercise poses using a low-cost single RGB camera instead of an IMU sensor, which serves as the gold standard but requires technical knowledge to analyze. Furthermore, the skeleton extracted from the camera image includes information on the movement of the lower limbs as well as of other body parts. An inexpensive USB-connected RGB camera is a device that anyone can obtain easily. Recent advances in technology such as OpenPose, which extracts each joint of the human body through human pose estimation, have been remarkable. However, key points representing each joint are not sufficient by themselves to evaluate an exercise pose. Therefore, this study develops an algorithm that evaluates the exercise pose by comparing the skeletons of experts and users.

Experimental Condition
The experimental system in this study is composed of an RGB camera for acquiring data and a computer for processing and analyzing data. When the camera is not built into the computer, a high-resolution camera can be connected through a USB connector without any problem. The resolution of the camera in this study is 640 × 480. The expert dataset was constructed from videos that can be downloaded from the web. Although there are many exercise activities for home training, we adopt yoga and squats, which are frequently used for home training.
Human pose estimation is performed by MediaPipe Pose, developed by Google [35]. MediaPipe Pose is an ML solution for high-fidelity body pose tracking that infers 33 3D joints (landmarks) on the whole body from RGB video frames, utilizing the BlazePose research that also powers the ML Kit Pose Detection API. Whereas current state-of-the-art approaches rely primarily on powerful desktop environments for inference, this method achieves real-time performance on most modern mobile phones, on desktops and laptops, in Python, and even on the web [35].
Figure 10 presents a flow chart that explains how the image from the RGB camera is processed and how the similarity in exercise activities between experts and users is then verified in this study. First, camera calibration is necessary. Camera calibration is the process of estimating intrinsic or extrinsic parameters. Extrinsic parameters describe the position and orientation of the camera in the world, and intrinsic parameters describe the camera's internal characteristics, such as its focal length, skew, distortion, and image center. Thus, camera calibration is an essential first step for 3D computer vision, as it allows us to estimate the scene's structure in Euclidean space and removes lens distortion, which degrades accuracy.
Next, it is necessary to reduce noise in the time-series data. The location of each key point, obtained through human pose estimation based on a convolutional neural network (CNN), forms a time series in which the x-axis is the sampling time and the y-axis is the location value. In time series forecasting, dirty and messy data can hurt the final predictions, and temporal dependency plays a crucial role when dealing with temporal sequences. After these two phases are completed, we can calculate each link length (mean ± standard deviation (SD)) while the user holds a stationary pose.
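The link-length initialization mentioned above can be sketched as follows. This is a minimal illustration under assumptions: each frame is represented as a dictionary of named 3D key points, the names "hip" and "knee" and their coordinates are hypothetical, and the sample standard deviation is used.

```python
import math
from statistics import mean, stdev


def link_length(frame, joint_a, joint_b):
    """Euclidean distance between two named key points in one frame."""
    return math.dist(frame[joint_a], frame[joint_b])


def link_length_stats(frames, joint_a, joint_b):
    """Mean and sample SD of a link length over the frames of a stationary pose."""
    lengths = [link_length(f, joint_a, joint_b) for f in frames]
    return mean(lengths), stdev(lengths)


# Hypothetical key points (x, y, z) for three frames of a stationary pose.
frames = [
    {"hip": (0.50, 0.50, 0.00), "knee": (0.50, 0.80, 0.00)},
    {"hip": (0.50, 0.51, 0.00), "knee": (0.50, 0.80, 0.00)},
    {"hip": (0.50, 0.49, 0.00), "knee": (0.50, 0.80, 0.00)},
]
thigh_mean, thigh_sd = link_length_stats(frames, "hip", "knee")
print(f"thigh length: {thigh_mean:.3f} ± {thigh_sd:.3f}")
```

A small SD relative to the mean suggests that key point detection is stable for that link; a large SD would indicate residual noise that should be reduced before similarity evaluation.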
Finally, it is necessary to evaluate the similarity in exercise activities between experts and users after the kind of exercise activity has been selected. Human coaches are very good at visually detecting such patterns, even when trainees perform at different speeds. Nevertheless, programming machines to do the same is a complex problem. Successful recognition strategies are based on the ability to approximately match the amplitude of each key point despite wide variations in timing.
Figure 10. A flowchart to explain how to process the image acquired from the RGB camera and then to verify the similarity in exercise activities between experts and users.

Dynamic Time Warping Algorithm for Similarity Evaluations
Extracting features from continuously measured signals involves many essential aspects of pattern detection in time series. Feature extraction is usually based on matching templates against the waveform of a continuous signal that has been converted into a discrete time series. Thus, successful recognition strategies are based on the ability to match signals approximately, despite wide variation in timing and amplitude. Because the speed of an exercise activity differs between experts and the user, it is necessary to evaluate the similarity between two time series of different durations. Any two time series can be compared using Euclidean distance or similar distances on a one-to-one basis along the time axis: the amplitude of the first time series at time T is compared with the amplitude of the second at the same time T. This lock-step comparison gives a poor similarity score even when the two time series have very similar shapes but are out of phase in time. Dynamic time warping (DTW) instead compares the amplitude of the first signal at time T with the amplitude of the second signal at neighboring times, such as T ± 1 or T ± 2, so that signals with similar shapes but different phases do not receive a low similarity score [36,37].
The DTW technique uses dynamic programming to align a time series with a template so that a distance measure is minimized. Because the time axis is stretched or compressed to achieve a good fit, a single template can match time series with a wide variety of timings. Consider a time series $S$ in which we look for the pattern given by a template $T$:
$$S = s_1, s_2, \ldots, s_i, \ldots, s_n \tag{3}$$
$$T = t_1, t_2, \ldots, t_j, \ldots, t_m$$
The sequences $S$ and $T$ can be arranged to form an $n$-by-$m$ matrix, in which each element $(i, j)$ represents the alignment between the two elements $s_i$ and $t_j$. A warping path $W$ aligns the elements of $S$ and $T$ such that the distance between them is minimized:
$$W = w_1, w_2, \ldots, w_p$$
where $W$ is a sequence of grid points and each $w_k$ corresponds to a point $(i, j)_k$. Solving the dynamic programming problem requires a distance measure between two elements. Although there are many candidates, two common choices for the distance function $\delta$ are the absolute value and the square of the difference, $\delta(i, j) = |s_i - t_j|$ or $\delta(i, j) = (s_i - t_j)^2$.
The function $\delta$ measures the distance between elements of the two time series. Since the cumulative distance along each path characterizes the potential warping paths, the DTW problem can be defined as a minimization over warping paths:
$$\mathrm{DTW}(S, T) = \min_{W} \left[ \sum_{k=1}^{p} \delta(w_k) \right]$$
Dynamic programming describes legal state transitions in terms of stage, state, and decision variables. Although the decision variables are less easily recognized, together these variables define the possible paths between two elements in the matrix. Several constraints, listed below, restrict the permissible paths and thereby improve efficiency.

1. Monotonicity: The points must be monotonically ordered with respect to time, $i_{k-1} \le i_k$ and $j_{k-1} \le j_k$.
2. Continuity: The steps in the grid are confined to neighboring points, $i_k - i_{k-1} \le 1$ and $j_k - j_{k-1} \le 1$.
3. Warping window: Allowable points can be constrained to fall within a given warping window, $|i_k - j_k| \le w$, where $w$ is a positive integer window width.
4. Slope constraint: Allowable warping paths can be constrained by restricting the slope, avoiding extensive movements in a single direction.
5. Boundary conditions: The start and end points of the path can be constrained, such as $i_1 = 1$, $j_1 = 1$ for the first point and $i_k = n$, $j_k = m$ for the last point; relaxed variants allow some offset at the endpoints.

The dynamic programming algorithm is based on the following recurrence relation, which defines the cumulative distance, $\gamma(i, j)$, for each point:
$$\gamma(i, j) = \delta(i, j) + \min\left[\gamma(i-1, j),\ \gamma(i-1, j-1),\ \gamma(i, j-1)\right]$$
Filling the matrix with the lowest cumulative distances allows us to find the optimal warping path.
Figure 11 shows the graphical user interface (GUI) of the developed fitness software. The user's current pose is used to initialize each link length from the detected key points. The initialization results make it possible to calculate the mean ± standard deviation (SD) of each link length in approximate 3D space. Below the user's picture, "Reps" counts the repetitions of the exercise activity, "Feedback" gives advice for a good exercise pose, and "Timer" shows the elapsed exercise time. On the right side of the figure, the bar plot represents the achievement per exercise repetition.
Figure 12 shows the similarity results for time-series data obtained with the DTW algorithm. The horizontal axis of the left-side plots in Figure 12b,c indicates the number of frames, which corresponds to the sampling time, and the vertical axis indicates the calculated angles $\theta_{trunk}$ and $\theta_{leg}$, which are shown on the left of Figure 11. The orange curve, "pro", comes from the training experts, and the blue curve, "nov", comes from the user. The user, who did not have much experience with this fitness system, hesitated for 40 frames (1.3 s) at the beginning because she did not know when to start. The many lines connecting one point on the experts' curve to one or several points on the user's curve are the result of the DTW comparison. DTW compares the amplitude of the first signal at time T with the amplitude of the second signal at neighboring times, such as T ± 1 or T ± 2.
Comparing along a warped time axis ensures that signals with similar shapes but different phases do not receive a low similarity score. The right-side plots in Figure 12b,c show the cost matrix and the warping path: the horizontal axis represents data for the user, and the vertical axis represents data averaged over the experts. The closer the warping path is to the diagonal, the more similar the user's exercise activity is to that of the experts. The closer the path is to the horizontal axis, the lower the similarity. The line indicates zero DTW distance. Although the user was late to start, the DTW algorithm had no problem evaluating similarities in the time-series data of exercise activities. Figure 13 shows the error points visualized in real time through the DTW algorithm. These results show that the DTW algorithm helps evaluate exercise activities. Furthermore, this system can monitor users and provide real-time feedback if they extend their knees too far or place their legs too close together.
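The cumulative-distance recurrence and the warping-window constraint used in this section can be sketched in a few lines. This is an illustrative implementation under assumptions, not the code of the developed system: it uses the absolute difference as the local distance, strict boundary conditions (the path runs from the first to the last samples), and hypothetical angle sequences in which the novice starts two frames late.

```python
import math


def dtw_distance(s, t, window=None):
    """Cumulative DTW distance between s and t with |s_i - t_j| as local cost.

    `window` is the optional warping-window width w (|i - j| <= w).
    """
    n, m = len(s), len(t)
    # The window must be at least |n - m| so that the end point stays reachable.
    w = max(n, m) if window is None else max(window, abs(n - m))
    gamma = [[math.inf] * (m + 1) for _ in range(n + 1)]
    gamma[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = abs(s[i - 1] - t[j - 1])
            # Recurrence: local cost plus the cheapest of the three predecessors.
            gamma[i][j] = cost + min(gamma[i - 1][j], gamma[i - 1][j - 1], gamma[i][j - 1])
    return gamma[n][m]


def lockstep_distance(s, t):
    """One-to-one comparison along the same time axis, for contrast."""
    return sum(abs(a - b) for a, b in zip(s, t))


# Hypothetical angle sequences: same shape, but the novice starts two frames late.
expert = [0, 0, 1, 2, 3, 2, 1, 0, 0, 0]
novice = [0, 0, 0, 0, 1, 2, 3, 2, 1, 0]

print(lockstep_distance(expert, novice))       # 10
print(dtw_distance(expert, novice))            # 0.0
print(dtw_distance(expert, novice, window=2))  # 0.0
```

The lock-step comparison penalizes the late start, whereas DTW absorbs the two-frame delay; a window narrower than the delay (for example `window=1`) would again produce a non-zero distance.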

Discussions
Although many people want home training systems to replace human coaches, the methods are still not perfect. Therefore, it is necessary to know what kinds of failure cases still exist. Among many different exercise activities, squats have become an excellent example for applying human pose estimation technology and are suitable for describing common errors that can result in severe health problems. While some athletes perform power-lifting, the most common exercise, most athletes request a personal coach because heavy exercise equipment can lead to poor posture. It is therefore time to clarify whether home training systems based on human pose estimation can substitute for human training coaches.
(1) Body specifics according to gender: When human-pose-estimation-based models are trained on images of humans, it is necessary to consider the physiological differences between males and females. For example, if the training dataset includes mostly men's images, the prediction accuracy is reliable only for male users. If female users then use the trained model, wrong results can be returned even though the exercise posture is good. Thus, the model should account for the difference between the two genders when a home training system is developed based on human pose estimation.
(2) Physiology specifics: A model based on human pose estimation recognizes the user's body through a dataset of images. However, it is not easy to guarantee that the training images show body structures similar to the user's. Thus, when the training does not use a dataset with general body-part proportions, the prediction results tend to be low even if the exercise posture is correct.
(3) Decision of the exercise start: There is no problem when the user follows a human coach to start and finish the exercise. However, it is difficult to tell the home training system when the activity begins and ends. Thus, the dataset should be time-series data because it is doubtful that the exercise duration can be decided from a few images.
(4) Decision of frontal view: When we need to compare exercise postures in two different videos, there is no guarantee that the recording conditions of the camera, such as the angle, height, and lighting, are similar. Thus, finding the frontal view from the recorded video is not always possible, and the results may be insufficient. The dataset still does not contain enough information for alignment.
(5) Problem for quick movements of the body part: The frame rate of a web camera is generally 30 or 60 Hz. This means that a model based on human pose estimation cannot detect exact key points during fast exercise movements. Although deep learning improves pose estimation technology, blurred image data are not helpful for training.
(6) Decision of horizontal position: It is not easy to find the flat plane from an image. Thus, it is difficult to find the horizontal and vertical axes from a single image when someone performs exercises. That is why images need to be calibrated before training.
(7) Decision of occluded joints: Occlusion is a difficult issue when finding key points in an image through human pose estimation. Depending on the recording conditions, body parts and objects often hide the target joints. In that case, it is necessary to decide how to estimate hidden or lost joints. However, there is still no clear way to solve the occlusion problem.
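For the occluded-joint case, one simple baseline, not a general solution to occlusion, is to treat key points with low detection confidence as missing and fill them by linear interpolation between the nearest reliable frames. The sketch below assumes that each joint coordinate comes with a per-frame confidence score; the function name `fill_occluded`, the 0.5 threshold, and the sample values are hypothetical.

```python
def fill_occluded(track, conf, threshold=0.5):
    """Fill low-confidence samples of one joint coordinate over time.

    track: per-frame coordinate values of a single joint (e.g., knee x).
    conf:  per-frame confidence scores (assumed to lie in [0, 1]).
    Frames with conf < threshold are treated as occluded and replaced by
    linear interpolation between the nearest reliable frames; gaps at the
    start or end copy the nearest reliable value.
    """
    n = len(track)
    good = [k for k in range(n) if conf[k] >= threshold]
    out = list(track)
    if not good:
        return out  # nothing reliable to interpolate from

    for k in range(n):
        if conf[k] >= threshold:
            continue
        prev = max((g for g in good if g < k), default=None)
        nxt = min((g for g in good if g > k), default=None)
        if prev is None:
            out[k] = track[nxt]
        elif nxt is None:
            out[k] = track[prev]
        else:
            w = (k - prev) / (nxt - prev)
            out[k] = track[prev] + w * (track[nxt] - track[prev])
    return out


# Hypothetical knee x-coordinates: frames 1 and 3 are occluded.
knee_x = [10.0, 0.0, 30.0, 0.0]
scores = [0.9, 0.1, 0.8, 0.2]
print(fill_occluded(knee_x, scores))  # [10.0, 20.0, 30.0, 30.0]
```

Such gap filling only smooths short dropouts; it cannot recover a joint that stays hidden for a whole repetition.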

Conclusions
This study addressed two issues: (1) using sensor data and AI/ML to classify senior citizens with diabetes based on their gait and physical fitness characteristics and (2) developing a personal training program using AI/DL based on 3D skeleton detection. Using IMU sensor data and ML, we could classify the elderly with diabetes based on their gait and physical fitness characteristics, and we learned how to develop a personal training program using AI/DL, e.g., 3D skeleton detection.
Specifically, this study aimed to show that abnormalities in senior citizens with diabetes could be classified through AutoML under imposed challenging walking speed conditions: slow (22% slower than the preferred speed), preferred, and fast (25% faster than the preferred speed). The training dataset was constructed with the support of senior citizens in the community by using the IMU. AutoML, which was applied for classification, is an emerging research field within computer science that can help non-experts use machine learning off the shelf. Furthermore, the developed ML-based personal home training system using a single RGB camera showed a high possibility of correcting the exercise posture and speed in real time. Therefore, the results of this study may help senior citizens with diabetes manage their own health with a single RGB camera.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement: The dataset is available for non-commercial research. Please contact any author. You can download the dataset after submitting the agreement document.