Identification of Smartwatch-Collected Lifelog Variables Affecting Body Mass Index in Middle-Aged People Using Regression Machine Learning Algorithms and SHapley Additive Explanations

Abstract: Body mass index (BMI) plays a vital role in determining the health of middle-aged people, and a high BMI is associated with various chronic diseases. This study aims to identify important lifelog factors related to BMI. The sleep, gait, and body data of 47 middle-aged women and 71 middle-aged men were collected using smartwatches. Variables were derived to examine the relationships between these factors and BMI. Because height is, by the definition of BMI, the most influential variable, the data were divided into groups according to height. The data were analyzed using regression and tree-based models: Ridge Regression, eXtreme Gradient Boosting (XGBoost), and Category Boosting (CatBoost). Moreover, the importance of the variables for BMI was visualized and examined using the SHapley Additive Explanations (SHAP) technique. The results showed that total sleep time, average morning gait speed, and sleep efficiency significantly affected BMI. However, the variables with the most substantial effects differed among the height groups. This indicates that the factors most profoundly affecting BMI differ according to body characteristics, suggesting the possibility of developing efficient methods for personalized healthcare.


Introduction
A lifelog is an integrated digital record consisting of personal data collected from various digital sensors [1] such as activity, sleep information, weight change, body mass, muscle mass, and fat mass. With the development of wearable devices, more accurate and precise measurements are possible. Lifelog information obtained by wearable devices, such as gait, sleep, and weight, is now used for chronic disease occurrence monitoring and health care [2][3][4]. However, healthcare services using lifelogs are currently limited to simple records or incomplete statistics. Even if they include exercise and lifestyle feedback functions, the feedback provided is not personalized according to user characteristics. Therefore, this study aims to identify factors that can be used to develop personalized healthcare through lifelog analysis. This study interprets machine learning results using an interpretable model rather than a black box model.
Most previous studies on the correlation of BMI and weight with disease incidence have used medical data [5,6]. In contrast, we used lifelogs of sleep, steps, and weight recorded in daily life. Individual analysis was subsequently performed using machine learning algorithms.
The rest of this paper is organized as follows. Section 2 describes the use and importance of lifelog data. Section 3 analyzes the association between lifelog data and BMI using

Relationship between Weight and Disease
Weight gain is known to be associated with an increased risk of type 2 diabetes, coronary artery disease, high blood pressure [25], cholelithiasis [26], and several cancers [27]. One study used cohort survey data from 92,837 women and 25,303 men in the U.S. to investigate how weight changes from adolescence to middle age are associated with various chronic diseases after the age of 55 years. It found that weight gain increased the risk of type 2 diabetes, hypertension, cardiovascular disease, obesity-related cancer, cholelithiasis, severe osteoarthritis, and cataracts.
In this study, weight was used as an indicator of health status; sleep and gait were used as independent variables related to weight. However, as each person's physical characteristics are unique, BMI was calculated and used as the response variable instead of weight.
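As a brief illustration of why BMI rather than raw weight serves as the response variable, the sketch below computes BMI from weight and height using the standard definition (weight in kilograms divided by the square of height in meters). The function name and the example values are illustrative assumptions and are not taken from the study data.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight in kilograms divided by squared height in meters."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


# Illustrative values only (not from the study data).
example = bmi(70.0, 175.0)
print(round(example, 2))
```

Because height enters the formula squared, two people of the same weight can have quite different BMI values, which is why height later appears as a dominant feature.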

The Association between Lifelog Data and BMI Using Regression Machine Learning Algorithms
Existing relevant studies can generally be categorized as follows.

Digital Healthcare Research Using Machine Learning and Data Generated by Smartphones and Smartwatches
One study proposed developing a severity score for Parkinson's disease using smartphone sensor data and machine learning, which can provide helpful information for the clinical management and treatment of patients with the disease [28]. Other studies have proposed a motion recognition model related to the user's meal intake using a smartwatch sensor [29].

Research on Men and Women's Health Using Machine Learning
One study proposed a predictive model using individual health data that affects the mortality rate in women with breast cancer [30]; another study developed and tested an early prediction model that could predict diabetes in women with an accuracy of 81.1% using a factor analysis that was highly correlated with diabetes [31]. Other studies have used machine learning predictive models to identify women at high risk of postpartum depression [32].

Research on Weight and Weight Change Using Machine Learning
In a study that summarized the risk factors for obesity and overweight using machine learning, age and gender were selected as significant relevant risk factor variables [33]. In addition, some studies have classified and predicted high, medium, and low weight loss potential levels using machine learning algorithms and the dietary and exercise data of obese patients [34].

Data Collection
Data on sleep, gait, and weight of 47 women and 71 men aged 35-59 years were obtained using the GiVita Inc. app on Samsung Galaxy Watch Active2 smartwatches, collected from 1 February to 9 August 2021. We targeted middle age (35 to 59 years) because it is the age at which people begin to manage their health and at which interest in health care is greatest.
We collected the data over these six months because the app had been updated from the previous version, improving usability, stabilizing the data, and reducing missing and abnormal values. First, the update resulted in fewer errors, making data collection more stable. Second, the update significantly improved the user experience and introduced rewards. The data included the users' bedtimes, wake-up times, steps per minute and day, walking distance per minute and day, walking speed per minute and day, and daily weight. Based on these records, datasets of daily sleep, per-minute sleep, daily steps, per-minute steps, and daily weight were created. The dataset sizes were 6223 rows of sleep data collected by day, 241,068 rows of sleep data collected by minute, 6380 rows of step data collected by day, 1,797,590 rows of step data collected by minute, and 6729 rows of body weight data collected by day.

Data Preprocessing
To find the optimal variables explaining individual BMI variance, several variables, such as total daily sleep time and sleep efficiency, were generated using daily sleep data and sleep data in minutes. Likewise, derivative variables, such as the total number of steps per day and average walking speed in the morning, were generated using the daily step data and step data in minutes. Additionally, the users' body data, such as height and weight, were integrated with each day's step and sleep data. The derivative variables are shown in Appendix A: Table A1.
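The derivation of daily variables from per-minute records can be sketched as follows. This is a minimal illustration, not the study's preprocessing code: the record layout, the units, the feature names, and the 06:00-12:00 definition of "morning" are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical per-minute step records: (timestamp, steps, speed in km/h).
# Field layout, units, and the morning window are assumptions.
minute_records = [
    ("2021-02-01 07:30", 90, 4.8),
    ("2021-02-01 07:31", 110, 5.2),
    ("2021-02-01 19:00", 60, 3.5),
    ("2021-02-02 08:15", 100, 5.0),
]


def derive_daily_features(records, morning=(6, 12)):
    """Aggregate per-minute rows into daily total steps and mean morning speed."""
    steps = defaultdict(int)
    morning_speeds = defaultdict(list)
    for stamp, n_steps, speed in records:
        t = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        day = t.date().isoformat()
        steps[day] += n_steps
        # Keep the speed only if the minute falls inside the morning window.
        if morning[0] <= t.hour < morning[1]:
            morning_speeds[day].append(speed)
    features = {}
    for day in sorted(steps):
        speeds = morning_speeds.get(day, [])
        features[day] = {
            "total_steps": steps[day],
            "avg_morning_speed": sum(speeds) / len(speeds) if speeds else None,
        }
    return features


daily = derive_daily_features(minute_records)
```

Each daily row produced this way could then be joined with the same day's sleep and body data, as described above.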

Feature Selection
If all 55 derived variables (Table A1) were used as inputs in the model, there would be a risk of overfitting. Therefore, feature selection was performed to remove unnecessary variables. This study selected features using the Boruta SHapley Additive exPlanations (BorutaSHAP) method, which combines the Boruta feature selection algorithm with SHAP values [35]. The execution procedure of the Boruta algorithm can be summarized as follows [36], and Figure 1 shows the procedure of Boruta feature selection.

1. Create a replicated random variable called "shadow features" for all features.
2. Randomly mix and combine the original and replicated data to remove possible correlations between dependent variables and features.
3. Create a random forest on the combined data and calculate the variable's importance.
4. Calculate the Z-score.
5. Search for the maximum Z-score among shadow attributes (MZSA).
6. For raw data, if the Z-score is greater than the MZSA, it is an important variable.
7. Repeat the above process as often as the random forest is performed, or until each variable is marked as either important or non-significant.
In previous studies, algorithm experiments on multiple datasets have shown that the Boruta method is better at feature selection than the Chi-Square method [37]. Boruta is a method of selecting variables based on a random forest. In addition, the BorutaSHAP process uses the Light Gradient Boosting Machine (LGBM), Category Boosting (CatBoost) (which robustly addresses categorical variables, as well as random forests), or other boosting-type models, such as eXtreme Gradient Boosting (XGBoost), to calculate feature importance. BorutaSHAP provides flexibility in model selection and allows visualization of the selected features by applying the SHAP [35]. Therefore, the BorutaSHAP algorithm was used in this study for flexible model selection and convenient visualization of the key selected variables. This method extracts important features using thresholds and t-tests on data with random shadow variables added.
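The shadow-feature idea behind Boruta can be sketched in a few lines. The sketch below is a simplified stand-in, not the BorutaSHAP implementation used in the study: it replaces the random forest or XGBoost importance with the absolute Pearson correlation, compares raw scores instead of Z-scores, and uses a majority vote in place of a statistical test. All function names and the synthetic data are assumptions made for illustration.

```python
import random


def pearson_abs(xs, ys):
    """Absolute Pearson correlation, used here as a stand-in importance score."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / (vx * vy) ** 0.5


def boruta_like_selection(columns, target, n_iter=50, seed=0):
    """Count how often each feature beats the best shuffled 'shadow' feature."""
    rng = random.Random(seed)
    hits = {name: 0 for name in columns}
    for _ in range(n_iter):
        # Build shadow features by shuffling a copy of each column.
        shadow_scores = []
        for values in columns.values():
            shadow = values[:]
            rng.shuffle(shadow)
            shadow_scores.append(pearson_abs(shadow, target))
        best_shadow = max(shadow_scores)  # simplified analogue of the MZSA
        # A real feature scores a hit if it beats the best shadow feature.
        for name, values in columns.items():
            if pearson_abs(values, target) > best_shadow:
                hits[name] += 1
    # Simplified decision rule: majority vote instead of a statistical test.
    return {name: count > n_iter / 2 for name, count in hits.items()}


# Synthetic data: 'signal' drives the target, 'noise' is unrelated to it.
data_rng = random.Random(42)
signal = [data_rng.gauss(0, 1) for _ in range(200)]
noise = [data_rng.gauss(0, 1) for _ in range(200)]
target = [2.0 * s + data_rng.gauss(0, 0.5) for s in signal]
selected = boruta_like_selection({"signal": signal, "noise": noise}, target)
```

Here the informative feature is consistently retained because no shuffled copy can match its association with the target, which is the intuition the shadow features are meant to capture.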
As there are few categorical variables in this study, and the feature importance computed using random forest can be biased in some cases [38], to calculate the feature importance, the XGBoost model was selected instead [39]. The strengths of the BorutaSHAP algorithm are the consistency of feature importance [40] and the use of intuitive colors to visualize the feature importance.
As shown in Figure 2, most of the sleep variables in the variable list extracted using BorutaSHAP, except for the total number of hours of sleep per day, are shown in red and do not significantly influence BMI prediction. The step variables can directly affect BMI; most of them are shown in green and can be identified as primary variables. In addition, height, which is very closely related to BMI, was also identified as a significant variable. The four blue boxes represent the minimum, median, mean, and maximum shadow attributes. The yellow box marks a tentative attribute, whose importance is difficult to determine because it lies close to the maximum shadow attribute, which makes it hard to resolve in the basic random forest execution of the Boruta algorithm.

Data Modeling
The final features obtained using BorutaSHAP were used to train three models: XGBoost, CatBoost, and Ridge Regression. Table 1 shows the hyperparameters found by GridSearchCV in the scikit-learn (Sklearn) library and used in the three models. The data were split into training and test sets at a ratio of 8:2, and 5-fold cross-validation was used for more reliable validation. In this study, the data were not normalized, both because outliers were removed during data preprocessing and to facilitate the interpretation of the study results. Thus, we used ridge regression and tree-based machine learning models, such as XGBoost and CatBoost, which can operate relatively robustly without normalization.
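The 8:2 split followed by 5-fold cross-validation can be sketched with plain index lists. This is a minimal illustration of the splitting logic only, not the scikit-learn routines used in the study; the function names, the seed, and the sample count of 100 are assumptions.

```python
import random


def train_test_split_indices(n_samples, test_ratio=0.2, seed=0):
    """Shuffle row indices and hold out a `test_ratio` share for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_ratio))
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


train_idx, test_idx = train_test_split_indices(100)
folds = list(k_fold_indices(train_idx, k=5))
```

In a grid search, each hyperparameter combination would be scored on the five validation folds, and the held-out test indices would be used only once, for the final evaluation.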

Evaluation
This study mainly used five evaluation indicators: Explained Variance Score, R-squared score, adjusted-R-squared score, Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE).
The description and calculation formulas of the performance indicators are as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - y_{p,i})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

where $y_i$ is the actual value, $y_{p,i}$ is the predicted value, and $\bar{y}$ is the average of the actual values.

$$R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$$

where $n$ is the number of samples and $k$ is the number of explanatory variables.

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - y_{p,i} \right|, \qquad RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - y_{p,i})^2}$$

The Explained Variance Score (EVS) is 1 - ((Sum of Squared Residuals - Mean Error)/Total Variance). Writing $e_i = y_i - y_{p,i}$ for the residuals and $\bar{e}$ for their mean,

$$EVS = 1 - \frac{\sum_{i=1}^{n} (e_i - \bar{e})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

The only difference between the EVS and the R-squared score is that the EVS subtracts the mean error from the residuals, whereas the R-squared score does not. If the mean error is not close to zero, the error is biased to one side and, thus, the model is biased. In other words, if the R-squared score and the Explained Variance Score differ, the error is biased, and there is a high possibility of incorrect fitting.
Compared with MAE, RMSE has the advantage of penalizing large errors more heavily.
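To make the relationship between these indicators concrete, the following sketch computes all five from the standard definitions. The function name and the toy BMI values are assumptions for illustration; the predictions are deliberately shifted by a constant so that the gap between the Explained Variance Score and the R-squared score becomes visible.

```python
def regression_metrics(y_true, y_pred, n_features):
    """Compute R2, adjusted R2, EVS, MAE, and RMSE from their definitions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_res = sum(residuals) / n
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # EVS uses residuals centered on their mean, so a constant bias is ignored.
    ss_res_centered = sum((r - mean_res) ** 2 for r in residuals)
    r2 = 1 - ss_res / ss_tot
    return {
        "r2": r2,
        "adjusted_r2": 1 - (1 - r2) * (n - 1) / (n - n_features - 1),
        "evs": 1 - ss_res_centered / ss_tot,
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": (ss_res / n) ** 0.5,
    }


# Illustrative BMI values; every prediction is too high by exactly 1.
y_true = [22.0, 24.0, 26.0, 28.0]
y_pred = [23.0, 25.0, 27.0, 29.0]
scores = regression_metrics(y_true, y_pred, n_features=1)
```

With this constant bias, the EVS equals 1.0 while the R-squared score is 0.8, which is the kind of discrepancy that signals a biased model.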

Results
This study made predictions on the test dataset using a model based on the entire training data. The BMI values predicted by the model were analyzed using SHAP. Figure 2 shows the feature importance in the XGBoost model using SHAP, which indicates that height is a highly important feature. However, according to the BMI calculation formula, height is inherently highly correlated with BMI regardless of model performance, so it may not be possible to measure the influence of other variables correctly. Therefore, a clustering method was used to divide the data according to height, and the relationships between the variables and BMI were examined in each height group.
The reason for dividing the groups based on height was to accurately identify the degree of influence of the changeable activity variables on the BMI of users in the same group, as height is an immutable variable. According to the National Statistical Office of Korea and previous studies, the average heights of women in their 30s, 40s, and 50s in Korea are 161.59, 159.91, and 157.23 cm, respectively; the average heights of men in their 30s, 40s, and 50s in Korea are 174.05, 172.15, and 169.39 cm, respectively [41,42]. Figure 3 also shows that height was an important feature. Based on height, we divided the women's data into two groups, 150-160 cm and 160-170 cm, and the men's data into three groups, 165-170 cm, 170-175 cm, and 175-180 cm. To reduce the imbalance in the data due to excessive grouping, we divided the men's data into 5 cm intervals. For women, we formed two groups because the amount of data in a given group became too small when divided into 5 cm intervals, resulting in performance problems.
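The resulting group assignment can be sketched as a simple lookup over the reported height ranges. The group boundaries come from the text; treating each range as half-open (lower bound included, upper bound excluded) and the names `HEIGHT_GROUPS` and `height_group` are assumptions, since the text does not state how boundary heights were assigned.

```python
# Group boundaries in cm as reported in the text. Treating each range as
# half-open [low, high) is an assumption made for this sketch.
HEIGHT_GROUPS = {
    "female": [(150, 160), (160, 170)],
    "male": [(165, 170), (170, 175), (175, 180)],
}


def height_group(sex, height_cm):
    """Return the 1-based height group for a participant, or None if out of range."""
    for number, (low, high) in enumerate(HEIGHT_GROUPS[sex], start=1):
        if low <= height_cm < high:
            return number
    return None


print(height_group("female", 158.0), height_group("male", 172.5))
```

Fitting a separate model per group keeps height nearly constant within each model, so the remaining variation in BMI has to be explained by the activity and sleep variables.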
Tables 2 and 3 summarize the performance indicators of the three models in each cluster for men and women, respectively. For modeling women's data, it was decided to use only XGBoost based on the men's data analysis results. The XGBoost algorithm is generally similar to or superior to Ridge Regression and CatBoost algorithms. The benefits of using CatBoost are limited because the features used for analysis have few categorical features.
The main relevant variables for women's group 1 were calories burned, distance, number of steps, time walked at night, and total sleep time per day (Figure 4). A figure illustrating the data of women's group 2 can be found in Appendix A (Figure A1). The main variables of women's group 2 were calories burned by walking, total sleep time per day, total sleep time variability, gait variability, and average walking speed at night.
The men's data were analyzed in the same way. The main variables of men's group 1 were average morning walking speed, calories burned, total sleep time variability, daily average walking speed, and daily total sleep time. Considering the accumulated SHAP values for this group, we found that average walking speed in the morning based on the approximate distance lowered BMI and that total sleep time influenced increases in BMI ( Figure 5).
Appl. Sci. 2022, 12, x FOR PEER REVIEW
Figures illustrating the data from men's groups 2 and 3 are presented in Appendix A (Figures A2 and A3). As for the main variables of men's group 2, calories burned, distance walked, and the number of steps generally influenced whether BMI would be lower, as confirmed by the accumulated SHAP values of the group. The primary relevant variables of men's group 3 were step variability, calorie consumption by walking, daily average walking speed, length of time spent walking at night, and walking distance variability. The accumulated SHAP values of the group confirmed that variability in the number of steps, distance walked, and amount of time walked at night influenced lowering BMI, and calorie consumption by walking influenced increasing BMI.
For the local interpretation of arbitrary data, the SHAP force plot was used. For the women in group 1, the number of steps, distance walked, amount of time walking at night, and average walking speed at night influenced lowering BMI. In contrast, calories and total sleep time affected increasing BMI (Figure 6).
Figure A4 in Appendix A summarizes the data on women's group 2. In this group, the number of calories consumed, total sleep time, bedtime, step variability, and average speed walking at night affected a reduction in BMI. In contrast, total sleep time variability and average walking speed increased BMI.
Similarly, men's group 1 data showed that step variability, total sleep time, average morning walking speed, and overall daily average walking speed affected a reduction in BMI. In contrast, variability in the length of time spent walking in the morning and total sleep time affected increasing BMI (Figure 7). Figures illustrating the data from men's groups 2 and 3 can be found in Appendix A (Figures A5 and A6). An interpretation of the data for men's group 2 showed that calorie consumption, number of steps walked per day, and distance walked affected a reduction in BMI, whereas step variability and distance walked in the morning affected the increase in BMI. For men's group 3, the number of steps walked per day, step variability, calorie consumption through walking, and amount of time walking at night affected a reduction in BMI. In contrast, total sleep time affected the increase in BMI.
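The local interpretations above rest on the additive property of Shapley values: the per-feature contributions for one observation sum to the difference between that observation's prediction and a reference prediction. The sketch below computes exact Shapley values by enumerating feature coalitions for a made-up linear model. It is not the SHAP library or the tree-based estimator used in the study; the toy model, its coefficients, the feature values, and the baseline are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = value(set(subset) | {i})
                without_i = value(set(subset))
                total += weight * (with_i - without_i)
        phi.append(total)
    return phi


# Hypothetical toy model of BMI; the coefficients are made up for illustration.
def toy_bmi_model(features):
    steps_k, sleep_h, speed = features
    return 30.0 - 0.4 * steps_k + 0.3 * sleep_h - 1.0 * speed


x = [8.0, 7.0, 4.5]          # one day's (thousand steps, sleep hours, km/h)
baseline = [6.0, 6.5, 4.0]   # reference values, e.g., group averages
phi = shapley_values(toy_bmi_model, x, baseline)
```

In this toy case, negative contributions (more steps, faster walking) push the predicted BMI below the reference prediction and the positive contribution (longer sleep) pushes it above, which mirrors how the force plots separate BMI-lowering from BMI-raising variables.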

Discussion
This study aimed to identify the lifelog variables with the most decisive influence on BMI, which is closely related to health, by utilizing changeable gait and sleep lifelogs. The variables related to increases or decreases in BMI were also analyzed and specified.
Although lifelog data have recently been collected more efficiently and constantly by wearable devices, there are still limitations in the accuracy and quality of sleep data. Data collection can be unstable due to various external factors, such as battery and Wi-Fi communication. In addition, the accuracy of the data can be low because it is difficult for users to wear the devices continuously.
To improve the quality of sleep data, the study attempted to compensate for the limitations in accuracy by minimizing bias and anomalies. Various preprocessing steps were applied, and derived variables such as total daily sleep time, variability in sleep time compared to the previous day, and daily sleep efficiency were generated. Further complementary research on sleep and on accurate data collection and preprocessing of lifelogs is needed.
To determine the influence of each variable on BMI more accurately, we divided the data into two groups of women and three groups of men, and we compared the data using the representative machine learning regression models Ridge Regression, XGBoost, and CatBoost. We also used the SHAP method for explainable artificial intelligence to visualize the relative importance of variables to BMI changes. The predicted results were interpreted by combining the machine learning model and SHAP. In the case of a deep learning model, also known as a black box model, the performance of the prediction can be high, but the interpretation can be difficult.
Therefore, a SHAP-based interpretable machine learning model was used in this study. The advantages over the black box model include the following:

1. Improved confidence in the machine learning model, providing a clear explanation of the results of the inference path.
2. Deriving insights by interpreting the results: extracting associations and patterns.
3. Improving overall problem solving and eliminating bias errors: debugging the way predictions are performed can increase predictive power, and the cause of bias can be analyzed and improved.
The integrated lifelog data analysis of walking, sleep, and weight with machine learning revealed the key variables and how walking and sleep affect body weight. For middle-aged individuals, lifelogs can inform specifically and individually tailored health analyses beyond simple predictions, and they can influence weight regulation through interpretable techniques and visualization.
In this study, the most common influential variables were calories burned, number of steps per day, distance walked per day, and sleep quality. These findings are consistent with those of previous studies, indicating that the number of steps [43], walking speed [44,45], and sleep quality [46,47] affect BMI.
The analysis highlights daily calorie consumption as an essential variable for predicting BMI. In the case of women's group 1 (150-160 cm), calories burned per day, the number of steps per day, distance walked per day, and amount of time walking at night had the most significant effect on BMI. In women's group 2 (160-170 cm), daily calorie consumption by walking, total sleep time, total sleep time variability, and step variability had the highest impact on BMI.
For men's group 1 (165-170 cm), average morning walking speed, calorie consumption per day, and total sleep variability had the most substantial effects on BMI. In men's group 2 (170-175 cm), calories burned per day, distance walked per day, and the number of steps per day had the greatest impact on BMI. Finally, in the case of men's group 3 (175-180 cm), gait variability, calorie consumption per day, and average walking speed had the most significant effects on BMI.
Thus, factors such as height, diet, and physical activity may affect physical changes and the incidence of diseases in different ways [48][49][50]. In addition, the effects of training methods may vary according to BMI and weight (body type) [51,52]. Our experiments likewise found that different variables, including walking variables, may affect each group differently.
These findings provide evidence that the factors with the most decisive influence on BMI depend on the height and lifelog of an individual, suggesting the possibility of developing an efficient method for personalized healthcare in the future. Although this study has proposed various sleep and gait variables that affect BMI, it would be valuable for a follow-up study to determine specific values or ranges for each variable to support a healthy BMI.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Conflicts of Interest: The authors declare no conflict of interest.