Residential Characteristics as Factors Related to Healthy Behavior Practices—Decision Tree Model Analysis Using a Community Health Survey from Korea

In this study, we sought to identify factors relevant to healthy behavior practices, including not only individual-level variables but also regional and physical environments. We used data from the 2018 Korea Community Health Survey (KCHS) for Gyeongsangnam-do, analyzing 16,519 of the 17,947 survey participants after excluding 1428 individuals with missing values. Healthy behavior practices were defined as meeting the criteria for all three modifiable healthy behaviors (non-smoking, moderate alcohol consumption, regular walking). A decision tree analysis was performed. In men, healthy behavior practices were lower in the unemployed population, in those aged 40–50 years, in those living in rural residential areas, and in those with stress. For women who lived in areas with small populations (<100,000 population), healthy behavior practices were below average. Men and women who had below-average healthy behavior practices reported poor access to places for exercise and fair or poor self-rated health status. It is necessary to implement a health behavior practice intervention that considers not only individual characteristics but also access to local exercise facilities and residential area characteristics (urban, rural). Since age is an important variable in healthy behaviors for both men and women, customized programs that consider age should be provided.


Introduction
In recent years, the importance of the prevention and management of chronic diseases has increased. According to the World Health Organization (WHO), non-communicable diseases (NCDs) such as cardiovascular disease, cancer, and diabetes caused 71% of global deaths in 2016 [1]. To achieve the sustainable development goals (SDGs) target of a one-third reduction in premature deaths from NCDs by 2030, it is necessary to modify the following risk factors for NCDs: tobacco use, high-risk drinking, lack of physical activity, and an unhealthy diet [1].
Smoking, high-risk drinking, and lack of physical activity are risk factors for poor health, but improving them can prevent or delay the worsening of chronic diseases [2]. Engaging in multiple risky behaviors is associated with a greater risk of chronic disease and mortality as compared with engaging in one or no risk behaviors [3][4][5]. In addition, mortality risk and the risk of poor health-related quality of life increase when the number of poor behaviors increases [6][7][8][9].
An increasing number of studies have explored the clustering of health behaviors [6,10,11]. Clustering at both ends of the spectrum has been reported, with larger numbers of individuals than expected exhibiting all or none of a range of risk factors [6,11,12,13].
Given that risk behaviors rarely occur in isolation, tackling multiple rather than single behaviors may be a more effective approach [3]. Multiple-behavior change interventions may therefore have a greater potential for a positive impact on health outcomes than single-behavior change interventions [6,14,15].
Health behaviors are influenced in a complex way by an individual's demographic characteristics, health status, and the characteristics of the region in which they live.
Population-based health promotion through health interventions is needed to improve healthy lifestyles. However, so far, the primary focus has been on high-risk-based health promotion interventions that find and modify individual risk factors. To carry out health promotion interventions, it is important to select subjects with various risk factors. Recently, in the field of public health, the decision tree analysis has been widely used as a useful method of classifying subjects with risk factors [16].
Decision tree analysis classifies an entire group into several subgroups and makes predictions by expressing decision rules in a tree structure. Because classification and prediction are described as a tree, the model is easy to interpret and yields meaningful knowledge. In other words, it enables an accurate analysis of the target group and provides a basis for specific classification, prediction, and policy development [17,18], which makes it well suited to selecting the targets of health promotion programs (group profiling). Recently, this analysis has been applied in various ways, such as in metabolic syndrome prediction [19] and in the health management behavior of diabetic patients [20]. However, most studies so far have included only individual-level variables, and studies that included regional (including environmental) characteristics have been rare.
The purpose of this study was to identify factors relevant to healthy behavior practices that include not only individual-level variables but also regional (including environmental) variables, and to find the basis for the goal of improving health behaviors and the selection of the priority population. Ultimately, it was intended to be used as evidence for the need to bridge the health gap.

Data Collection and Subjects
Data from the Korea Community Health Survey (KCHS) for Gyeongnam province in 2018 were used. The KCHS [21] is a repeated cross-sectional study that has been conducted annually since 2008 by the Korea Centers for Disease Control and Prevention (KCDC) on about 900 residents in each of the 252 districts. The purpose of this survey is to produce regionally representative statistics in Korea; its target population is Korean adults aged 19 years or above. All of the surveys are conducted by trained interviewers who visit the sample households, using a face-to-face computer-assisted personal interviewing (CAPI) method.
The current study used data from 16,519 of the 17,947 individuals who participated in the KCHS in Gyeongnam province in 2018, after excluding 1428 individuals with missing questionnaire data (Figure 1). The current study was approved by the Gyeongsang National University Institutional Review Board (GIRB-A20-X-0063).

Dependent Variables
All data were based on self-reported information. We examined three modifiable healthy behaviors in this study: non-smoking, not high-risk drinking, and regular walking. The subjects were divided into two groups according to whether or not they engaged in each healthy behavior.
Healthy behavior practices were defined as meeting the criteria for all three modifiable healthy behaviors (non-smoking, not high-risk drinking, and regular walking).
Non-smokers were defined as individuals who did not smoke at the time of the survey. This group comprised never smokers (those who had never smoked or had smoked fewer than 100 cigarettes in their lifetime) and former smokers (those who had smoked at least 100 cigarettes in their lifetime but were not smoking at the time of the survey). Current smokers were defined as those who had smoked five packs (100 cigarettes) or more in their lifetime and were currently smoking every day or on occasion.
Not-high-risk drinking was defined using the KCHS definition of high-risk drinking, which measures alcohol consumption through the number of drinks per day and the frequency of drinking. One standard drink in Korea contains 7-8 g of ethanol, and high-risk drinking is defined as two or more drinking occasions per week with more than seven (for men) or five (for women) standard drinks per drinking occasion. Those who did not meet this threshold were considered not-high-risk drinkers.
The definition of regular walking exercise was walking for at least 30 min per day and at least 5 days per week. Walking activity was measured through the following two questions: (1) "How many days did you walk for at least 10 min at a time in the last week? This includes at work and at home, walking to travel from place to place, and any other walking that you managed to do solely for recreation, sport, exercise, or leisure", and (2) "How much time did you typically spend walking on each of those days?"
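The three criteria above can be combined into a single binary outcome. The following is a minimal sketch of that derivation, not the authors' code: the `Respondent` fields and function names are hypothetical and do not correspond to actual KCHS variable names, and the thresholds simply follow the wording of the definitions in this section.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    """One survey respondent; all field names are hypothetical, not KCHS variables."""
    sex: str                            # "M" or "F"
    current_smoker: bool                # smokes every day or on occasion
    drinking_occasions_per_week: float
    drinks_per_occasion: float          # standard drinks
    walking_days_per_week: int
    walking_minutes_per_day: float


def is_high_risk_drinker(r: Respondent) -> bool:
    # Two or more occasions per week with more than 7 (men) or 5 (women) drinks each.
    limit = 7 if r.sex == "M" else 5
    return r.drinking_occasions_per_week >= 2 and r.drinks_per_occasion > limit


def is_regular_walker(r: Respondent) -> bool:
    # At least 30 min per day on at least 5 days per week.
    return r.walking_days_per_week >= 5 and r.walking_minutes_per_day >= 30


def practices_healthy_behavior(r: Respondent) -> bool:
    # The outcome is positive only when all three criteria are met.
    return (not r.current_smoker
            and not is_high_risk_drinker(r)
            and is_regular_walker(r))


example = Respondent("F", False, 1, 2, 5, 40)
print(practices_healthy_behavior(example))  # True
```

Former smokers count as non-smokers here because only `current_smoker` is checked, which matches the definition given above.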

Variables Related to Healthy Behavior Practice
In this study, demographics, health status, and regional variables were used as relevant factors for healthy behavior practices. The demographic variables collected were age, gender, monthly household income, education level, job, and marital status. Age was classified into 19-29, 30-39, 40-49, 50-59, 60-69, and ≥70 years. Monthly household income was classified as <2.0 and ≥2.0 million Korean won (KRW) (KRW 1.0 million is approximately USD 1000). Educational level was classified as middle school graduate or less and high school graduate or higher. Marital status was classified as living together (married/living with partner) or living alone (divorced/separated/widowed or never married).
Health status variables included body mass index (BMI), stress, experience of depression, self-rated health status (SRH), and the presence of hypertension and diabetes. BMI was defined using self-reported height and weight, and it was divided into three groups: BMI < 18.5 kg/m² for underweight, BMI ≥ 18.5 and <25 kg/m² for normal weight, and BMI ≥ 25 kg/m² for obese. SRH was classified as good, fair, or poor. For the mental health variables, stress was classified as high for those who felt "very much" or "a lot" of stress in daily life, and low for those who did not. Perceived depression symptoms were defined as experiencing a feeling of depression that interfered with daily life for at least two consecutive weeks during the past year. High blood pressure and diabetes mellitus were marked with a "yes" if the subjects answered that they were suffering from these ailments, and "no" if they did not suffer from them.
Regional variables included satisfaction with house type, urban or rural area, city size according to the population size, and access to places for exercise; the atmospheric environment and green space were perceived as environmental factors. Housing types were divided into apartments and houses. The administrative district units in Korea have three structures: parish (si/do), county (si/gun/gu), and town (eup/myeon/dong). Each residential area was defined as urban or rural by classifying "dong" as an urban area, and "eup/myeon" as a rural area. In addition, city size was categorized based on the administrative districts with a population size of city 1 (≥300,000 population), city 2 (<300,000 and ≥100,000 population), city 3 (<100,000 and ≥50,000 population), and city 4 (<50,000 population). Access to places for exercise was defined as being able to easily find a place to exercise within one's neighborhood in the past year. The perceived environment was divided into good and bad after participants evaluated their satisfaction with the atmosphere and the green space environments of the area in which they currently resided on a 5-point Likert scale.
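The cut points used for age, BMI, and city size can be written as simple binning functions. This is an illustrative sketch only; the function names and category labels are assumptions chosen for readability, and the boundaries are taken from the definitions in this section.

```python
def age_group(age: int) -> str:
    """Map age in years to the six age bands used in the analysis."""
    if age < 19:
        raise ValueError("the survey targets adults aged 19 years or above")
    if age >= 70:
        return ">=70"
    for lower, upper in ((19, 29), (30, 39), (40, 49), (50, 59), (60, 69)):
        if lower <= age <= upper:
            return f"{lower}-{upper}"


def bmi_category(weight_kg: float, height_m: float) -> str:
    """Classify BMI (kg/m^2) computed from self-reported weight and height."""
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal weight"
    return "obese"


def city_size(population: int) -> str:
    """Classify an administrative district by its population."""
    if population >= 300_000:
        return "city 1"
    if population >= 100_000:
        return "city 2"
    if population >= 50_000:
        return "city 3"
    return "city 4"


print(age_group(45), bmi_category(70, 1.75), city_size(99_999))
```

Writing the boundaries as half-open intervals (for example, `bmi < 25` for normal weight) keeps every value in exactly one category.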

Statistical Analysis
The general characteristics of the subjects and the healthy behavior practice variables, including smoking and drinking, were analyzed separately by gender because they differed between men and women. A chi-square test was performed, and a p-value < 0.05 was considered statistically significant. Statistical analysis was performed using SPSS version 25.0 (IBM Corp., Armonk, NY, USA).
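To illustrate the kind of comparison involved, the sketch below computes a Pearson chi-square statistic for a 2x2 table using only the Python standard library. It is not a reproduction of the SPSS analysis: it handles only 2x2 tables, applies no continuity correction, and the counts are invented for illustration rather than taken from Table 1.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows are two groups, columns are practice yes / no.
chi2, p = chi_square_2x2([[30, 70], [50, 50]])
print(round(chi2, 3), p < 0.05)
```

For variables with more than two categories, the same expected-count formula applies, but the p-value must come from a chi-square distribution with (rows - 1) x (columns - 1) degrees of freedom.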
A decision tree analysis was performed with healthy behavior practice as the dependent variable. Groups were classified using the decision tree model, and the feature importance of the variables determining the healthy behavior practice groups was calculated, which helped in prioritizing variables for improving healthy behavior practices (Figure 2). Decision tree analysis was performed using Python (Python 3.7).
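The text does not state which library or splitting criterion was used. As a rough sketch under the assumption of a Gini-based criterion, the code below shows how a single split is scored by its decrease in Gini impurity and how those decreases can be normalized into importance scores. It covers only one level of splitting on binary features; in a full tree, the weighted decreases are summed over every node that uses a feature. The feature names and toy data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)


def impurity_decrease(rows, labels, feature):
    """Weighted Gini decrease obtained by splitting on one binary feature."""
    left = [y for x, y in zip(rows, labels) if x[feature] == 0]
    right = [y for x, y in zip(rows, labels) if x[feature] == 1]
    n = len(labels)
    children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - children


def rank_features(rows, labels, features):
    """Return (feature, normalized importance) pairs, most important first."""
    gains = {f: impurity_decrease(rows, labels, f) for f in features}
    total = sum(gains.values())
    scored = [(f, g / total if total else 0.0) for f, g in gains.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (invented): 1 = practices all three healthy behaviors.
rows = [
    {"easy_access": 1, "employed": 1}, {"easy_access": 1, "employed": 0},
    {"easy_access": 1, "employed": 1}, {"easy_access": 1, "employed": 0},
    {"easy_access": 0, "employed": 1}, {"easy_access": 0, "employed": 0},
    {"easy_access": 0, "employed": 1}, {"easy_access": 0, "employed": 0},
]
labels = [1, 1, 1, 0, 0, 0, 0, 1]
print(rank_features(rows, labels, ["employed", "easy_access"]))
```

In this toy example, splitting on `easy_access` separates the outcome better than splitting on `employed`, so it receives the higher score; this mirrors how an importance ranking of variables is read.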

General Characteristics of the Subjects
Of the total number of subjects, 9302 were women (56.3%). With regard to the demographic variables, those in their 70s or older accounted for the highest number of participants (27.0%), those who earned KRW 2 million or more accounted for 58.5%, and 60.4% were employed. In terms of the level of education, 68.2% had graduated from high school or higher. In total, 67.6% lived together (married/living with partner). For health status, 30.7% had good self-rated health, 20.4% had high stress, and 5.0% had perceived depression symptoms. According to the BMI, 34.9% of the participants were obese. In total, 29.3% of the participants were diagnosed with high blood pressure and 11.2% were diagnosed with diabetes mellitus. For the regional variables, 67.1% were living in houses, 65.1% resided in rural areas, 74.9% had easy access to places for exercise, 66.6% had a good perceived atmosphere in their living environment, and 70.4% had a good perceived green space environment (Table 1).
Notes to Table 1: Values are presented as numbers (%). p-values were determined by chi-square test.
1 Unit of currency: Korean won (KRW); KRW 1.0 million is approximately USD 1000.
2 "Dong" as urban area and "eup/myeon" as rural area.
3 City 1: ≥300 thousand population, city 2: <300 thousand and ≥100 thousand population, city 3: <100 thousand and ≥50 thousand population, city 4: <50 thousand population.
4 Healthy behavior was defined as engagement in all three healthy behaviors (non-smoking, not-high-risk drinking, and regular walking).
5 High-risk drinking: individuals who had more than seven (for men) or five (for women) drinks on the same occasion on at least 2 of the past 7 days.
6 Regular walking: participating in walking activities for at least 30 min, five or more days a week.
Men and women differed significantly in the demographic variables of age, income, job status (employed), education level, and marital status (married/living with partner) (p < 0.001). SRH, stress, perceived depression symptoms, BMI, and hypertension also differed significantly (p < 0.001). With regard to regional variables, housing type and access to places for exercise differed significantly (p = 0.011 and p = 0.010, respectively). Diabetes mellitus, residential area ("dong" as urban area and "eup/myeon" as rural area), city size (based on the population of the administrative district), and perceived environment (atmosphere, green space) did not differ significantly (Table 1).
The healthy behavior practice group included 29.2% of the total participants (Table 1). There was a significant difference in the percentage of healthy behavior practices according to sex (22.0% for men and 34.7% for women) (Table 1).
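The reported percentages can be cross-checked with simple arithmetic. The sketch below uses only figures stated in the text (sample sizes, the number of women, and the sex-specific rates) and confirms that the overall rate is the size-weighted average of the two sex-specific rates. The variable names are illustrative.

```python
total_surveyed = 17947
excluded = 1428
analyzed = total_surveyed - excluded          # analytic sample

women = 9302
men = analyzed - women

rate_men, rate_women = 0.220, 0.347           # healthy behavior practice rates

# The overall rate is the average of the two rates weighted by group size.
pooled = (men * rate_men + women * rate_women) / analyzed

print(analyzed, round(100 * women / analyzed, 1), round(100 * pooled, 1))
# prints: 16519 56.3 29.2
```

Because women make up the larger share of the sample and have the higher rate, the overall figure lies closer to the rate for women than a simple mean of 22.0% and 34.7% would suggest.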

Results of the Decision Tree Analysis for Healthy Behavior Practices (Men)
The group of men who were over 60 years old and had difficulty accessing places for exercise had the lowest level of healthy behavior practices regardless of BMI (5.9% for underweight; 21.6% for normal weight and obesity). The rate was 11.1% in the group that had no job, was in their 40s to 50s, and lived in rural areas (eup/myeon). In the group of participants who had a job, were under the age of 60, and had a fair SRH status, the practice of healthy behavior was 13.1%. In addition, healthy behavior practices were low in the group of participants who were over 60 years old and had easy access to places for exercise, high stress, and no diabetes (16.8%) (Figure 3).

Results of the Decision Tree Analysis for Healthy Behavior Practices (Women)
When it was difficult to gain access to places for exercise, the practice of healthy behavior was low regardless of age (over 70 years, 22.3%; 50 to 70 years, 25.5%; under 50 years, 33.1%). Women who had easy access to places for exercise, were over 50 years old, had fair SRH, had no spouse, had high blood pressure, and lived in a house had a lower rate of healthy practices (28.3%). In addition, the rate was 30.5% among women who had easy access to places for exercise and lived in an area with a population of less than 300,000. In contrast, the rate was higher (32.9%) among women who had easy access to places for exercise, were over 50 years old, had fair SRH, had a spouse, had a job, and lived in a region with a population of less than 100,000 (Figure 4).

Importance of Factors Affecting Healthy Behavior Practices
The importance of the variables for men was in the following order: age, job, access to places for exercise, SRH, stress, city size, BMI, residential area ("dong" as urban areas and "eup/myeon" as rural areas), diabetes, and perceived environment (green space). For women, the order was: access to places for exercise, age, job, SRH, living alone, city size, housing type, education level, and hypertension (Table 2).

Discussion
Men and women had below-average healthy behavior practices if they had poor access to places for exercise and fair SRH. The healthy practices of men were lower in the group without a job, in their 40s-50s, living in a rural area, and with a lot of stress. The healthy practices of women living in less populated cities (<100,000 population) were also low.
Several factors influence participation in physical activity (PA), and studies have recently focused on the role of the environment in promoting healthy behaviors as determined by PA [22][23][24][25]. A short distance to urban green spaces and the availability of exercise equipment are positively associated with the frequency of PA [26]. The environment can impact healthy adults' capacity to perform health-enhancing PA [22], while perceived green space characteristics are strong predictors of wellbeing [27]. In this study, healthy behavior was also below average in the cases where access to places for exercise was difficult. The strategies for increasing PA using place-based interventions were more common than those using person-based interventions [22,28]. Place-level interventions included changing the physical structure of parks (i.e., adding walking trails) to facilitate physical activity [29]. To increase healthy behaviors among both males and females, it is necessary to improve the access to places for exercise since the practice of physical activity is greatly influenced by the environment of the residential areas [30]. When the neighborhood exercise environment and public facilities were sufficient, there was a correlation with the SRH [31]. Kang et al. [32] found that SRH and social support had a positive correlation with healthy behavior: the more positive the SRH regardless of disease, the better it predicted individual wellbeing [33]. In this study, healthy behavior was also below-average in the cases of fair-poor SRH.
Research on specific ways for men to cope with high stress is generally considered insufficient [34]. In this study, high stress was associated with fewer healthy behavior practices, so a program for stress management is needed.
Healthy behavior practices were lower among older men and women. According to the results of previous studies, as age increases, the prevalence of chronic diseases also increases due to a decrease in physical activity and to habit-based health risk behaviors [35][36][37]. In particular, rural elderly people experience more socioeconomic inequality and social exclusion than urban elderly people, and their standard of living during old age is poor [38]. Groups with low socioeconomic status are more sensitive to local resources because they lack alternative health resources [39]. Older people are more affected by local resources than younger people because they stay in residential areas for a longer time and spend more time using local facilities or spaces [31]. The job variable was significantly associated with healthy behavior. In particular, more men than women were employed, and regular walking was also more common among men. If women are significantly less likely to work outside of the home, this may explain the difference in the results between men and women. The impacts of physical and social environmental factors on behavior have been increasingly recognized. Individuals are influenced not only by their motivation and capability to make behavioral changes, but also by opportunities afforded by their social and physical environment [40,41]. Even though factors such as information and skills training are important for behavioral changes, other behavior-influencing factors, e.g., social support, should also be considered [3,42].
Health inequality by region is not any individual's fault but is caused by an unequal distribution of health-protective and health-harming factors in the region [43]. If the living environment, such as the neighborhood space or the local government's socioeconomic capabilities and policies, is health-friendly, it directly and positively affects health [44]. In other words, an individual's health-related behavior is directly or indirectly influenced by the socioeconomic, built environment, and policy characteristics of the region, and by the operation of local health services throughout the region [45]. In previous studies, there was a difference in health behavior according to the area of residence [46][47][48][49], and in this study, both men and women were affected by access to local exercise facilities and the type of residential area (urban, rural, etc.).
Therefore, cultivating elderly-friendly physical environments in each region and developing health programs based on supportive social activities for community residents would contribute to promoting healthy lifestyle practices. The results of this study highlight the need to implement policies and strategies that are tailored to personal and environmental factors in order to boost healthy lifestyle practices in older adults [50].
Individual health behaviors (smoking, drinking, exercise, nutrition, etc.) tend to form clusters [6,51]. Therefore, it is necessary to implement a health project with the use of a comprehensive index rather than approaching each health behavior individually.
A limitation of this study is that temporal relationships could not be established because the KCHS is a cross-sectional study. In addition, because secondary data were used, the range of available variables was limited. Self-reported health behavior data (smoking, drinking, walking) may not provide accurate information because of recall and social desirability biases. These limitations may influence the interpretation of the study findings.
Despite these limitations, this study has several strengths. We used data from a large survey designed to produce regionally representative statistics, which provided evidence for the development of implementation strategies for the low health behavior practice group that consider not only individual characteristics but also access to local exercise facilities and residential area characteristics (urban, rural). In addition, this study applied a decision tree analysis that included regional as well as individual-level variables, which has been rare in previous studies. The results suggest that health projects that enhance health behaviors should be carried out, giving priority to those who are old, have no job, and have poor perceived health levels. In particular, a project aimed at improving access to local exercise facilities should be implemented.

Conclusions
It is necessary to implement an intervention on health behavior practices that considers not only individual characteristics (age, job, self-rated health status), but also access to local exercise facilities and residential area characteristics (urban, rural). In particular, the elderly and employed individuals with poor self-rated health should be prioritized as target groups of health projects. For women, the group with low access to community exercise facilities should also be considered a priority target group.