Relations of Lifestyle Behavior Clusters to Dyslipidemia in China: A Compositional Data Analysis

Dyslipidemia is associated with lifestyle behaviors, and several lifestyle behaviors tend to co-occur in some populations. This study aims to identify lifestyle behavior clusters and their relations to dyslipidemia. This cross-sectional study was conducted in Wuhai City, China. Cluster analysis combined with compositional data analysis was conducted, with 24-h time-use on daily activities and dietary patterns as input variables. Multiple logistic regression was conducted to compare dyslipidemia among clusters. A total of 4306 participants were included. A higher prevalence of newly diagnosed dyslipidemia was found among participants in cluster 1 (long sedentary behavior (SB), the shortest sleep, and a high-salt and oil diet) and cluster 5 (the longest SB and short sleep), relative to the other clusters in the respective age groups (<50 years and ≥50 years). In conclusion, unhealthy lifestyle behaviors may exist together among some of the population, suggesting that these people are potential subjects of health education and behavior interventions. Future research should be conducted to investigate the relative significance of specific lifestyle behaviors in relation to dyslipidemia.


Introduction
Cardiovascular disease (CVD) is one of the most threatening diseases, accounting for 31% of mortality globally in 2016 [1]. An estimated 422.74 million cases of CVD and 17.92 million deaths from CVD occurred worldwide in 2015 [2]. In China in particular, the estimated 93.8 million prevalent cases of CVD in 2016 were more than twice the number in 1990 (40.6 million) [3]. The number of deaths owing to CVD increased from 2.51 million in 1990 to 3.97 million in 2016 [3].
Dyslipidemia refers to several lipid disorders that are characterized by at least one of following: raised total cholesterol (TC), raised triglycerides (TG), raised low-density lipoprotein cholesterol (LDL-C), and low high-density lipoprotein cholesterol (HDL-C) [4]. Dyslipidemia, a leading contributor to CVD, has become a global public health concern [5,6]. According to previous studies, the prevalence of dyslipidemia increased from 18.60% in 2007 to 36.46% in 2017 in China [7,8].
In addition, abnormal biomarkers of dyslipidemia could also lead to other chronic non-communicable diseases among some populations [9,10]. For example, a study showed increased atherosclerosis extension in prediabetes and newly diagnosed type 2 diabetes subjects with high TG/HDL compared to those with low TG/HDL [10].
In the past decades, with the rapid development of the economy, the living standard of Chinese people has constantly improved. Animal fat and oily and salty foods make up an increasing proportion of people's diets, while Chinese people engage in less physical activity (PA) and more sedentary behavior (SB). Regular lifestyle behaviors play significant roles in the prevention of dyslipidemia and other cardio-metabolic disorders [5]. Recent studies indicated that several lifestyle behaviors were associated with the risk of dyslipidemia [11][12][13]. They showed that proper PA, less SB, and a healthy diet might decrease the occurrence and development of dyslipidemia. However, previous studies mainly examined each behavior separately [11]. It is crucial to investigate the combined impact of these lifestyle behaviors instead of focusing on individual behaviors. Determining the cluster patterns of lifestyle behaviors will contribute to an in-depth understanding of how people's lifestyles affect their well-being. Recent research has explored lifestyle behavior patterns and their associations with chronic non-communicable diseases among adults [14,15]. These studies used cluster analyses to identify several clusters that might be beneficial in the prevention of chronic non-communicable diseases, and put forward relevant policy recommendations.
People's daily activities mainly comprise PA, SB, and sleep [16,17]. Time spent on one activity necessarily displaces that on others, that is, time spent on different activities is intrinsically co-dependent, finite, and subject to collinearity. Therefore, this kind of data is identified as compositional data [17,18]. It may be inappropriate to use traditional statistical analyses to process compositional data, since it may generate misleading results with some effects being misestimated or obscured [16]. To date, little about the combined effect of time-use on daily activities is known.
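The displacement property described above can be illustrated with a minimal sketch. The function name `reallocate` and the example day (60 min MVPA, 120 min LPA, 780 min SB, 480 min sleep) are hypothetical and not taken from the study data; the sketch only shows that moving minutes between activities leaves the 24-h total unchanged, which is why the parts cannot vary independently.

```python
DAY_MINUTES = 1440  # a 24-h day expressed in minutes


def reallocate(day, source, target, minutes):
    """Move `minutes` from one activity to another, keeping the daily total fixed."""
    if minutes > day[source]:
        raise ValueError("cannot remove more time than the source activity has")
    updated = dict(day)  # copy so the original day is left untouched
    updated[source] -= minutes
    updated[target] += minutes
    return updated


# Hypothetical day, for illustration only (not study data).
day = {"MVPA": 60, "LPA": 120, "SB": 780, "sleep": 480}
changed = reallocate(day, "SB", "MVPA", 30)

# The total is still 1440 min: any gain in MVPA is offset by a loss elsewhere.
print(sum(day.values()), sum(changed.values()))
```

Because the four parts always sum to the same constant, each part is fully determined by the other three, which is the source of the collinearity noted above.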
To address these gaps in the research literature, we aimed to identify lifestyle behavior clusters among participants in Wuhai City, Inner Mongolia, China, by applying compositional data analysis, and to explore the associations between these clusters and dyslipidemia.

Ethical Consideration
This study was approved by the Peking University Biomedical Ethics Committee (approval number IRB00001052-16022). The approval conformed to the provisions of the 1995 Declaration of Helsinki (revised in Edinburgh in 2000). All participants provided their written informed consent prior to the conduct of the survey.

Participants
This cross-sectional study was based on the Adult Chronic Disease and Prevalence Factors Monitoring project, conducted from June 2014 to October 2014 in Wuhai City. People who were aged 18-79 years and who had lived locally for at least six months in the past year were eligible to participate in this study. Multistage stratified cluster random sampling was performed to ensure the representativeness of the sample, comprising two steps. First, 20 streets of Haibowan District, Hainan District, and Wuda District were selected by random sampling. Second, stratified sampling was performed according to participants' employment status in each street. Employed participants (males under 55 years and females under 50 years) were sampled from 105 work units according to local industry structure. Systematic sampling was conducted by employee number to determine the final sample. Unemployed participants (males aged at least 55 years and females aged at least 50 years) were sampled from 100 households in 2 resident committees, and the final participant in each household was selected by the Kish method. Participants were excluded if they were pregnant or had neurocognitive difficulties.

Data Collection
Participants were required to complete a questionnaire, laboratory measurements, and a physical examination with the assistance of trained surveyors. The questionnaire was well designed and contained: (1) sociodemographic information, such as gender, age, education, and occupation; (2) information about personal lifestyle behaviors, including activities (PA, SB, and sleep), dietary intake, smoking status, and alcohol intake; and (3) information about history of chronic non-communicable diseases (dyslipidemia, hypertension, and diabetes).
Activities under investigation comprised PA, SB, and sleep. PA was measured with the long form of the International Physical Activity Questionnaire (IPAQ), an effective and widely used questionnaire for examining PA intensity, frequency, and cumulative time in adults [19][20][21]. Dietary intake was measured by a food frequency questionnaire (FFQ), which has been shown to have moderate reliability and validity in Chinese adults [22][23][24]. The FFQ comprised thirteen food categories relevant to local culture, and the frequency (never, monthly, weekly, daily) and quantity consumed each time during the past month were investigated. Data on history of chronic non-communicable diseases were obtained from answers to the question "Have you ever been diagnosed as having dyslipidemia, diabetes, or hypertension by a doctor in a community hospital or above?" Physical examination and laboratory measurements were performed to obtain data on physiological indexes. Fasting venous blood samples were collected with disposable vacuum blood collecting tubes. The concentrations of TC, TG, HDL-C, and LDL-C were measured by local second-class hospitals, where the equipment, instruments, and reagents conformed to the national standards for production and use. Weight and height were measured by standard instruments, with participants wearing light clothing and no shoes.

Definition and Group
Lifestyle behavior clusters in this study were identified as the aggregation of 24-h time-use on activities (PA, SB, and sleep) and dietary patterns, which were the most important modifiable behaviors to improve dyslipidemia. PA intensity was classified according to IPAQ criteria [25], that is, PA was divided into 3 intensities, comprising vigorous intensity physical activity (VPA), moderate intensity physical activity (MPA), and light intensity physical activity (LPA), namely walking. Some researchers have shown that MPA and VPA might decrease the risk of chronic non-communicable diseases, including dyslipidemia [11,26,27,28]. Therefore, MPA and VPA were merged and identified as moderate-to-vigorous intensity PA (MVPA). Dietary pattern was identified by principal component analysis, which is described in detail in the Statistical Analyses section.
Dyslipidemia was defined according to the Chinese Guidelines for the Management of Dyslipidemia in Adults (2016 Revision) criteria, as the presence of one or more of the following factors: (1) serum TC concentration of 5.2 mmol/L or greater; (2) TG concentration of 1.7 mmol/L or greater; and (3) HDL-C concentration of under 1.0 mmol/L [29]. Specifically, a newly diagnosed case was defined as a participant who met the diagnostic criteria mentioned above and had not been diagnosed with dyslipidemia by a doctor in a community hospital or above. A prevalent case was defined as a participant who had been diagnosed with dyslipidemia by a doctor. BMI was calculated as the weight (kg) divided by the square of the height (m²), with a normal range of 18.5-23.9 kg/m², considering the physiological characteristics of Chinese people [30].
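The case definitions above can be written as a small decision rule. The following is a minimal sketch, not the authors' code: the function names and the three status labels are illustrative assumptions, while the thresholds (TC ≥ 5.2 mmol/L, TG ≥ 1.7 mmol/L, HDL-C < 1.0 mmol/L) and the BMI formula are those stated in the text.

```python
def meets_lipid_criteria(tc, tg, hdl_c):
    """True if any of the three stated thresholds is met (all values in mmol/L)."""
    return tc >= 5.2 or tg >= 1.7 or hdl_c < 1.0


def dyslipidemia_status(tc, tg, hdl_c, previously_diagnosed):
    """Classify a participant as 'prevalent', 'newly diagnosed', or 'none'.

    `previously_diagnosed` stands for a self-reported doctor diagnosis
    (community hospital or above), as asked in the questionnaire.
    """
    if previously_diagnosed:
        return "prevalent"
    if meets_lipid_criteria(tc, tg, hdl_c):
        return "newly diagnosed"
    return "none"


def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by the square of height (m)."""
    return weight_kg / height_m ** 2


# Hypothetical participant, for illustration only.
print(dyslipidemia_status(tc=5.3, tg=1.0, hdl_c=1.2, previously_diagnosed=False))
print(round(bmi(70, 1.75), 1))
```

Note that under this reading a previously diagnosed participant counts as a prevalent case regardless of the current laboratory values, which follows the wording of the definition above.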

Statistical Analyses
Flow chart of statistical analysis is presented in Figure 1. Data analyses consisted of: (1) describing sociodemographic characteristics by lifestyle behavior clusters of two age groups, (2) identifying the lifestyle behavior clusters of two age groups, and (3) exploring the relationships between clusters and dyslipidemia in two age groups.
Considering potential differences in lifestyle behaviors and risk of dyslipidemia across age groups, participants were divided into a <50 years group and a ≥50 years group [31][32][33][34][35]. In addition, in view of prevalence-incidence bias, newly diagnosed and prevalent cases were analyzed separately. Twenty-four-hour time-use on activities and dietary pattern were used as inputs to determine lifestyle behavior clusters. Twenty-four-hour time-use (compositional data) was transformed to isometric log ratios because of its closed nature, which helped to resolve the problem of collinearity in compositional data [36]. Compositional means were used to describe the central tendency of twenty-four-hour time-use. Principal component analysis was conducted, with daily intake of food as input variables, and three dietary patterns were identified according to the scree plot and professional interpretability: (1) healthy dietary pattern (positive loadings for vegetables, fruit, egg, dairy, and soy and its products), (2) high-salt and oil dietary pattern (positive loadings for grease, salt, and sauce), and (3) high-staple dietary pattern (positive loadings for cereal and tubers). Principal component scores representing the above dietary patterns were calculated for each participant and described with arithmetic means; an average score of at least 0.15 for a dietary pattern indicated that a participant tended towards the corresponding pattern.
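The isometric log-ratio (ilr) transformation and the compositional mean mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: the text does not say which ilr basis was used, so the sketch assumes pivot coordinates (one common choice), and the function names and example values are hypothetical. Zero durations would need separate handling before taking logarithms.

```python
import math


def closure(parts, total=1440.0):
    """Rescale the parts so that they sum to `total` (minutes per day)."""
    s = sum(parts)
    return [total * p / s for p in parts]


def ilr_pivot(parts):
    """ilr pivot coordinates: D strictly positive parts -> D - 1 real coordinates.

    Coordinate i contrasts part i with the geometric mean of the parts after it.
    The choice of pivot coordinates is an assumption of this sketch.
    """
    logs = [math.log(p) for p in parts]  # fails on zeros by design
    coords = []
    for i in range(len(parts) - 1):
        rest = logs[i + 1:]
        mean_log_rest = sum(rest) / len(rest)
        scale = math.sqrt(len(rest) / (len(rest) + 1))
        coords.append(scale * (logs[i] - mean_log_rest))
    return coords


def compositional_mean(rows, total=1440.0):
    """Part-wise geometric mean of the rows, closed back to `total`."""
    n = len(rows)
    geo = [math.exp(sum(math.log(r[j]) for r in rows) / n)
           for j in range(len(rows[0]))]
    return closure(geo, total)


# Hypothetical days in the order (MVPA, LPA, SB, sleep), in minutes.
days = [[60.0, 120.0, 780.0, 480.0], [120.0, 180.0, 600.0, 540.0]]
print(ilr_pivot(days[0]))        # three unconstrained coordinates
print(compositional_mean(days))  # four parts summing to 1440
```

The resulting coordinates depend only on the ratios between parts, so they are the same whether time is recorded in minutes, hours, or proportions of the day, and they can be passed to standard multivariate methods such as k-means.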
A scree plot was generated to determine the potential number of clusters. Subsequently, k-means partitioning clustering was used. An optimal number of 4 clusters was identified in each of the two age groups based on the results of the scree plot and professional interpretability. To evaluate the stability of the cluster solution, a random subsample of each group (50% of participants in each age group) was clustered with the same procedure. Agreement between solutions was substantial (Cohen's kappa = 0.78 in the <50 years group and 0.70 in the ≥50 years group).
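The clustering and stability check described above can be sketched in a few lines. This is a simplified illustration, not the procedure used in the study: it runs a basic Lloyd-style k-means from caller-supplied starting centers, and it assumes that cluster labels from the two solutions have already been matched to each other, a step the text does not describe. All names and data points are hypothetical.

```python
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centers, n_iter=100):
    """Basic Lloyd iteration; returns (labels, centers)."""
    for _ in range(n_iter):
        # Assign every point to its nearest current center.
        labels = [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
                  for p in points]
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centers.append(centers[k])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # assignments have stabilized
        centers = new_centers
    return labels, centers


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two labelings of the same items."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical two-dimensional points forming two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centers = kmeans(points, [[0.0, 0.0], [10.0, 10.0]])
print(labels, cohens_kappa(labels, labels))
```

In practice the inputs would be the ilr coordinates together with the dietary pattern scores, and kappa would compare the labels from the full sample with those obtained when the subsample is clustered on its own.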
Dyslipidemia was compared across lifestyle behavior clusters in the two age groups by multinomial logistic regression with adjustment for covariates, including age, gender, education, occupation, family annual income, BMI, smoking status, alcohol intake, and history of hypertension and diabetes.
Differences among clusters in sociodemographic characteristics and lifestyle behaviors were investigated by ANOVA (continuous variables) or χ² tests (categorical variables). In this study, statistical significance was determined with p < 0.05 (two-tailed). Statistical analyses were performed with R 3.6.2 (R Development Core Team, Vienna, Austria).
The differences in gender, education, occupation, BMI, smoking status, alcohol use status, and family annual income among clusters were statistically significant in the <50 years group. Age, gender, education, occupation, smoking status, alcohol use status, and family annual income differed significantly among clusters in the ≥50 years group.

Lifestyle Behavior Cluster Characteristics
Activity and diet characteristics of participants in the two age groups are shown in Table 2, and descriptions of clusters are shown in Figure 1, based on a comparison of the average activity time of each cluster with the corresponding mean time of the overall population in each age group. Four clusters were identified in each age group, and the cluster characteristics differed between the two age groups. Specifically, daily cumulative time of MVPA differed little between the <50 years group and the ≥50 years group (187.32 min vs. 166.08 min), time of LPA in the <50 years group (111.05 min) was shorter than that in the ≥50 years group (175.15 min), and time of SB (472.63 min vs. 297.94 min) and sleep (471.10 min vs. 438.04 min) was longer in the <50 years group than in the ≥50 years group. In particular, (1) clusters 1/5 in both age groups were characterized by long cumulative time of SB and short cumulative time of sleep, but a high-salt and oil diet was present in the <50 years group and not in the ≥50 years group. (2) Lifestyle behavior characteristics of clusters 2/6 were similar in both groups, characterized by short cumulative time of SB and long cumulative time of sleep, but cumulative time of sleep was the longest in the <50 years group and cumulative time of SB was the shortest in the ≥50 years group.

Cluster Memberships Relationship to Dyslipidemia
Blood lipid profiles among lifestyle behavior clusters are shown in Table 3. The results of MANOVA showed that the lipid profiles were significantly different among clusters in both age groups. In particular, the differences in TC and LDL-C in the <50 years group, and in TC and TG in the ≥50 years group, were statistically significant. Among all the participants, 43.80% (1886/4306) were prevalent cases of dyslipidemia and 13.59% (585/4306) were newly diagnosed dyslipidemia cases, with the remaining 42.61% (1835/4306) not suffering from dyslipidemia. The associations between dyslipidemia and cluster memberships by multinomial logistic regression are shown in Table 4. A higher prevalence of newly diagnosed dyslipidemia was found among participants in cluster 1 relative to the other clusters (p < 0.01) in both age groups. No statistically significant difference in prevalent cases was found among clusters in either age group. In particular, among people <50 years, participants in cluster 2 and cluster 4 had a significantly decreased prevalence of newly diagnosed dyslipidemia (cluster 2: odds ratio [OR] 0.688, 95% confidence interval [CI] 0.475-0.995; cluster 4: OR 0.421, 95% CI 0.277-0.640) relative to cluster 1 after covariate adjustment. Similarly, in the ≥50 years group, participants in cluster 2 and cluster 4 had a significantly decreased prevalence of newly diagnosed dyslipidemia (cluster 2: OR 0.638, 95% CI 0.412-0.988; cluster 4: OR 0.365, 95% CI 0.221-0.602) relative to cluster 1 after adjusting for potential confounders.
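The relationship between a reported odds ratio and its confidence interval can be made explicit with a short calculation. The sketch below assumes that the intervals are Wald intervals, that is, symmetric on the log-odds scale with a multiplier of 1.96; the text does not state how the intervals were computed, so this is an assumption, and the function names are illustrative.

```python
import math


def wald_interval(log_odds, std_err, z=1.96):
    """Return (OR, lower, upper) from a logistic coefficient and its standard error."""
    return (math.exp(log_odds),
            math.exp(log_odds - z * std_err),
            math.exp(log_odds + z * std_err))


def implied_log_odds_and_se(ci_low, ci_high, z=1.96):
    """Recover the coefficient and standard error implied by a Wald interval."""
    log_odds = (math.log(ci_low) + math.log(ci_high)) / 2
    std_err = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return log_odds, std_err


# Reported interval for cluster 2 vs. cluster 1 in the <50 years group.
b, se = implied_log_odds_and_se(0.475, 0.995)
odds_ratio, low, high = wald_interval(b, se)
print(round(odds_ratio, 3), round(low, 3), round(high, 3))
```

Under this assumption the point estimate is the geometric mean of the interval bounds, and the four reported odds ratios agree with their intervals to within rounding. An upper bound below 1 corresponds to a lower adjusted odds of newly diagnosed dyslipidemia than in the reference cluster.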

Discussion
This study aggregated compositional data of daily activities and dietary patterns into lifestyle behavior clusters and explored their relationships with dyslipidemia. Cluster analyses were conducted among participants aged < 50 years and ≥50 years because age might be a confounding factor influencing the association between lifestyle behavior and dyslipidemia. Cluster analysis is a statistical classification technique for discovering whether the individuals of a population fall into different groups, and the characteristics of the clusters were summarized by the results of cluster analysis. Time length of PA, SB, and sleep of a cluster was classified by comparing their time-use with the mean of PA, SB, and sleep in the total population in each age group.
Different lifestyle behavioral characteristics were found among clusters in the two age groups. In addition, daily cumulative time of SB and sleep was longer in the <50 years group than in the ≥50 years group. This phenomenon may result from a greater need for sleep, prolonged sitting at work, pressure from many sources, and relatively unhealthy lifestyle behavior among young people [34]. MVPA did not differ markedly between the two age groups, whereas cumulative time of LPA was considerably shorter in the <50 years group; the longer LPA time in the ≥50 years group may be due to more housework and physical exercise, instead of activities at work, after retirement. This change has been found in previous research [37,38].
The average cumulative time of sleep in the <50 years group was higher than that in the ≥50 years group, but both groups met the criteria of the American Sleep Time Duration Recommendation [39]. The variation of sleep by age was similar to that in previous research [33]. However, the short cumulative time of sleep and the highest prevalence of newly diagnosed dyslipidemia in clusters 1/5 in both age groups indicated that a relationship between sleep and dyslipidemia might exist. Furthermore, several studies have shown that too long or too short sleep duration increased the risk of CVD, stroke, hypertension, diabetes, and metabolic syndrome [40][41][42]. Besides sleep duration, other aspects of sleep have been shown to be related to chronic non-communicable diseases [43][44][45]. One study found that sleep fragmentation was associated with higher blood glucose levels among African-Americans, and that poor sleep efficiency and long wake after sleep onset increased the risk of incident CVD [43]. As a result, comprehensive assessment of sleep should be conducted to explore its relationship with health outcomes.
Furthermore, results showed that part of the population focused on a healthy lifestyle, but some unhealthy lifestyle behaviors might exist together, such as less PA, more SB, and an unhealthy diet. This finding is in line with previous research [11,46,47,48], suggesting these people are potential subjects of health education and behavior interventions in the future.
The results of multinomial logistic regression showed that risk of newly diagnosed dyslipidemia might be lower in cluster 2 and cluster 4, but not in cluster 3, compared with cluster 1 in both age groups. This finding indicated that less SB, combined with more MVPA, might decrease risk of newly diagnosed dyslipidemia effectively. However, conducting MVPA may have difficulties in relieving the harm of extremely long cumulative time of SB, so decreasing SB time may be relatively more effective than increasing MVPA time in decreasing risk of newly diagnosed dyslipidemia. However, no significant difference of prevalent dyslipidemia was observed among clusters in both age groups, which might be due to changes in lifestyle behaviors after suffering from dyslipidemia.
Age varied significantly among clusters in the ≥50 years group, with cluster 5 being the oldest. Therefore, age might influence the choice of lifestyle behavior, and older participants might engage in more SB and less PA. The difference in BMI among clusters in the <50 years group was statistically significant, and cluster 5 had the highest BMI, which indicated that BMI might be associated with lifestyle behavior. Cluster 4 in the <50 years group showed different characteristics from the others. Extremely long cumulative time of MVPA, the shortest cumulative time of SB, and more staple food intake were the major characteristics of this cluster. In addition, participants in cluster 4 tended to be male, less educated, consumers of cigarettes and alcohol, and to have low annual family income. This phenomenon may be explained by the "Healthy Worker Effect", which means that these young people, who were comparatively better in physical fitness, often chose manual work without paying much attention to a healthy lifestyle. Therefore, the association between extremely long cumulative time of MVPA and dyslipidemia is unclear. Previous research showed different findings for the health effects of MVPA [49,50], so further research on this theme is needed.
The main strength of this study is the application of compositional data analysis to process 24-h time-use on activities, which is recommended for cluster analysis of compositional data. Furthermore, daily cumulative time of activities was considered to develop recommendations on daily PA, SB, and sleep. In addition, participants' lifestyle behaviors were explored comprehensively, considering cumulative time of all intensities of PA, SB, and sleep, as well as dietary patterns, which are the most crucial modifiable behavior factors to prevent dyslipidemia. To our knowledge, this study is the first to combine compositional data analysis with cluster analysis to identify lifestyle behavior clusters among adults and their relationship with dyslipidemia. Previous studies have combined compositional data analysis and linear regression to examine the associations between 24-h time-use on activities and physiological indicators related to chronic non-communicable diseases without considering dietary pattern [17,51,52,53,54,55]. Recent studies have integrated compositional data analysis with cluster analysis to explore the relationships of adiposity or its indexes with 24-h time-use on activities among children and adolescents [16,56,57,58], but no such research has been conducted among adults.
Several limitations exist in our study. First, the cross-sectional design employed in this study impedes the inference of causation. Further prospective studies should be conducted. Second, information on activities and dietary intake was obtained by IPAQ and FFQ, so slight differences between self-reported data and the actual situation might exist. However, these questionnaires have been widely used among the Chinese population because of their substantial validity and reliability. Finally, the results cannot be generalized to other populations because of the exploratory, data-driven nature of cluster analysis and principal component analysis.

Conclusions
In summary, this study adds to current evidence that less SB and more MVPA may decrease risk of dyslipidemia, and health education and behavior intervention should focus on the target population. Future studies should also be conducted to investigate the relative significance of specific lifestyle behaviors in relation to dyslipidemia, and to develop effective interventions.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Data sharing is not applicable to this article.