Individualized Biological Age as a Predictor of Disease: Korean Genome and Epidemiology Study (KoGES) Cohort

Chronological age (CA) predicts health status but its impact on health varies with anthropometry, socioeconomic status (SES), and lifestyle behaviors. Biological age (BA) is, therefore, considered a more precise predictor of health status. We aimed to develop a BA prediction model from self-assessed risk factors and validate it as an indicator for predicting the risk of chronic disease. A total of 101,980 healthy participants from the Korean Genome and Epidemiology Study were included in this study. BA was computed based on body measurements, SES, lifestyle behaviors, and presence of comorbidities using elastic net regression analysis. The effects of BA on diabetes mellitus (DM), hypertension (HT), combination of DM and HT, and chronic kidney disease were analyzed using Cox proportional hazards regression. A younger BA was associated with a lower risk of DM (HR = 0.63, 95% CI: 0.55–0.72), hypertension (HR = 0.74, 95% CI: 0.68–0.81), and combination of DM and HT (HR = 0.65, 95% CI: 0.47–0.91). The largest risk of disease was seen in those with a BA higher than their CA. A consistent association was also observed within the 5-year follow-up. BA, therefore, is an effective tool for detecting high-risk groups and preventing further risk of chronic diseases through individual and population-level interventions.


Introduction
As the national life span increases, the prevalence of chronic diseases including hypertension (HT), diabetes (DM), and chronic kidney disease (CKD) is accelerating globally [1][2][3]. Chronological age (CA) is a major indicator of health status; however, the effects of CA on diseases may differ based on anthropometry, socioeconomic status (SES), and lifestyle behaviors [4][5][6]. As a result of this difference, people of the same CA have varied biological ages (BA). Therefore, BA, which is calculated using aging markers, has been regarded as a more precise index for assessing health status than CA [7][8][9][10].
Substantial changes in body shape and composition occur with age, and these changes can have an impact on health [6]. In particular, waist circumference is positively associated with the risk of chronic disease [11][12][13]. Differences in aging and health outcomes have also been associated with socioeconomic status [4]. According to the World Health Organization (WHO), chronic diseases share risk factors, including poor lifestyle behaviors such as cigarette smoking, heavy drinking, physical inactivity, and excess body weight [14]. Lifestyle behaviors are also regarded as mediators between SES and health, and they may help alleviate health inequities [15][16][17]. During the coronavirus disease 2019 (COVID-19) pandemic, health inequities based on socioeconomic disparities became increasingly pronounced [18]. As a result, the need for a personalized health assessment tool has been emphasized [19][20][21][22].
Several previous studies have estimated BA from clinical information such as laboratory blood tests, frailty-related physical factors, physiological factors, metabolomics, and DNA methylation [7,23–25]. These studies helped to clarify the biological mechanisms of aging, but the markers they used cannot readily be modified by individuals to control the aging process. Moreover, few studies have assessed BA as an index for predicting the risk of disease [26,27]. In this study, we propose that BA calculated from self-assessed risk factors can be a useful indicator of health status. The combination of these risk factors may relate to increasing BA, which is positively associated with the risk of developing chronic diseases. This suggests that people can regulate the pace of their biological aging by promoting healthy lifestyles and addressing population-level determinants of health.
Therefore, this study aimed to develop a personalized BA prediction model based on individual- and population-level risk factors and assess it as a useful indicator for predicting the risk of chronic disease (Figure 1).

Study Population
Participants were drawn from the Korean Genome and Epidemiology Study (KoGES), which integrates three cohorts: the Ansan and Ansung baseline study (2001-2002), the Health Examinees Study (HEXA; 2004-2013), and the Cardiovascular Disease Association Study (CAVAS; 2005-2011). The HEXA study consisted of a total of 173,353 participants aged 40-79 years and was the largest population dataset in the KoGES. A total of 28,338 individuals aged 40-91 years participated in the CAVAS, and the Ansan and Ansung baseline study consisted of 10,030 individuals aged 40-69 years. All participants in these three cohorts were interviewed using structured questionnaires, and blood samples were collected by well-trained researchers. The detailed design of the KoGES has been described elsewhere [28]. Of the 211,721 participants in the integrated data, 101,980 healthy individuals with a Charlson comorbidity index [29] of 0, who had body measurements taken (height, weight, and waist and hip circumference) and had completed self-reported information such as SES, comorbidities, and lifestyle behaviors, were included to develop the BA prediction model. To estimate the risk of developing complex chronic disease, 43,143 participants with at least 2 years of follow-up were included (Figure S1).

Biological Age
Among the 128 measurements, variables with missing rates of more than 20% and laboratory data (blood tests and calculated dietary intake), which need to be managed by medical personnel, were excluded. Based on previous aging studies, BA was computed from the following: (1) body measurements (height, weight, and waist and hip circumference); (2) SES (income, education level, marital status, and occupation); (3) lifestyle behaviors (smoking duration [years], smoking consumption [packs per day], second-hand smoking [yes/no], drinking frequency [none, 1 time, 2-3 times, 4-6 times/week, and daily], and frequency of regular exercise [none, 1 time, 2-3 times, 4-6 times/week, and daily]); and (4) disease comorbidity (dyslipidemia, asthma, allergy, and thyroid disease).
As there are substantial changes in body shape that occur during the aging process [6], body measurements are useful indicators for estimating biological aging. The relationship between SES and acceleration of aging has been well established [4]. According to previous studies, lifestyle behaviors are also related to chronic diseases and aging [5,14,30,31]. So, lifestyle behaviors were selected as factors affecting BA. Comorbidities were also included. Thus, BA was estimated as a single index using a combination of these self-assessed variables. For women, reproductive factors including age at menarche, oral contraceptive use, and parity were included.
To quantify the effect of BA relative to CA, we defined the age difference (Age-Diff) as the difference between BA and CA (Age-Diff = BA − CA). Age-Diff was classified into 4 groups: "Very young BA", where BA was at least 5 years younger than CA (BA-CA ≤ −5); "Young BA", where BA was at least 1 year but less than 5 years younger than CA (−5 < BA-CA ≤ −1); "Same BA as CA", where BA was within 1 year of CA (−1 < BA-CA ≤ 1); and "Older BA", where BA was more than 1 year older than CA (BA-CA > 1).
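The grouping rule above can be written as a small function. This is a minimal sketch: the function name `age_diff_group` is hypothetical and not part of the study's code, and the cut-points are taken from the inequalities given in the text.

```python
def age_diff_group(ba, ca):
    """Classify Age-Diff = BA - CA (in years) into the four groups in the text.

    The function name and signature are illustrative assumptions.
    """
    diff = ba - ca
    if diff <= -5:
        return "Very young BA"   # BA - CA <= -5
    if diff <= -1:
        return "Young BA"        # -5 < BA - CA <= -1
    if diff <= 1:
        return "Same BA as CA"   # -1 < BA - CA <= 1
    return "Older BA"            # BA - CA > 1
```

Because each upper bound is inclusive, a difference of exactly −1 falls in "Young BA" and a difference of exactly 1 falls in "Same BA as CA".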

Outcome Assessments
HT was defined as systolic blood pressure ≥ 130 mmHg, diastolic blood pressure ≥ 80 mmHg, or use of any antihypertensive drug [30]. DM was defined as a fasting plasma glucose level ≥ 126 mg/dL, HbA1c ≥ 6.5%, or use of any anti-diabetic drug [31]. CKD was defined as an estimated glomerular filtration rate (eGFR) < 60 mL/min/1.73 m², with eGFR calculated using the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation [32].
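These outcome definitions reduce to simple threshold checks. The sketch below is illustrative only: the function and argument names are assumptions, missing values are not handled, and CKD is taken as reduced kidney function (eGFR below 60 mL/min/1.73 m²), the conventional cut-off.

```python
def has_hypertension(sbp_mmhg, dbp_mmhg, on_antihypertensive):
    """HT: SBP >= 130 mmHg, DBP >= 80 mmHg, or antihypertensive drug use."""
    return sbp_mmhg >= 130 or dbp_mmhg >= 80 or on_antihypertensive


def has_diabetes(fpg_mg_dl, hba1c_pct, on_antidiabetic):
    """DM: fasting glucose >= 126 mg/dL, HbA1c >= 6.5%, or anti-diabetic drug use."""
    return fpg_mg_dl >= 126 or hba1c_pct >= 6.5 or on_antidiabetic


def has_ckd(egfr):
    """CKD: eGFR below 60 mL/min/1.73 m^2 (eGFR assumed to be precomputed)."""
    return egfr < 60
```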

Statistical Analysis
Baseline characteristics were compared according to each chronic condition using Student's t-test for continuous variables and the chi-squared test for categorical variables. Z-score standardization was performed for continuous elements, including body measurements and lifestyle information. Based on the standardized elements, elastic net linear regression [33] was applied to find the optimized coefficients for the selected variables that minimize the sum of squared errors, and these coefficients were used to generate our BA prediction model (Figure S2). Our model was then evaluated using 10-fold cross-validation (Figure S3) [34]. To estimate the correlation between CA and BA, correlation coefficients (r) were calculated. Logistic regression analyses were performed to calculate the odds ratios (ORs) of chronic diseases according to CA (<50, 50-59, 60-69, and ≥70 years), BA (<50, 50-59, 60-69, and ≥70 years), and Age-Diff (very young BA, young BA, same BA as CA, and older BA). Cox proportional hazards regression analyses were further performed to estimate hazard ratios (HRs) for the association between BA and the risk of complex chronic diseases. Further analyses were conducted to estimate the risk of disease within a 5-year follow-up period. All analyses were performed using SAS 9.4 software (SAS Institute, Cary, NC, USA) and R (version 3.3.3) with the 'glmnet' package.
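To make the standardization and elastic net steps concrete, the following is a minimal pure-Python sketch of z-scoring followed by coordinate descent for an elastic net linear model. It is not the authors' code, which used the R 'glmnet' package; all function names are hypothetical, and the penalty parameters `lam` and `alpha` would in practice be chosen by cross-validation rather than fixed by hand. The response `y` is left generic, since the text does not state which variable the markers were regressed on.

```python
import math


def zscore_columns(rows):
    """Standardise each column to mean 0 and population SD 1."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
           for j in range(p)]
    # A constant column carries no information; map it to zeros.
    return [[(r[j] - means[j]) / sds[j] if sds[j] > 0 else 0.0
             for j in range(p)] for r in rows]


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma (the lasso part of the update)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_fit(Z, y, lam, alpha, n_sweeps=500):
    """Coordinate descent for
        (1/2n) * sum((y - b0 - Z @ beta)^2)
        + lam * (alpha * |beta|_1 + (1 - alpha)/2 * |beta|_2^2),
    assuming the columns of Z are z-scored as in zscore_columns.
    """
    n, p = len(Z), len(Z[0])
    b0 = sum(y) / n                      # optimal intercept for centred columns
    beta = [0.0] * p
    resid = [yi - b0 for yi in y]        # residuals of the current fit
    for _ in range(n_sweeps):
        for j in range(p):
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(Z[i][j] * (resid[i] + Z[i][j] * beta[j])
                      for i in range(n)) / n
            new = soft_threshold(rho, lam * alpha) / (1.0 + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Z[i][j] * delta
                beta[j] = new
    return b0, beta


def predict(Z, b0, beta):
    """Linear prediction on standardised features."""
    return [b0 + sum(z * b for z, b in zip(row, beta)) for row in Z]
```

With `lam = 0` the procedure reduces to ordinary least squares, and a sufficiently large `lam` with `alpha = 1` sets every coefficient to zero, which is how the elastic net performs variable selection.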

Calculation and Assessment of Biological Age
Based on the differences in demographic backgrounds and lifestyle behaviors between sexes, this study developed sex-specific BA prediction models. BA was calculated from a self-reported questionnaire covering body measurements, SES, comorbidities, and lifestyle behaviors. In the elastic net regression variable selection process, a total of 20 markers for men and 23 for women were selected. The computed BA was significantly correlated with CA for both men (r = 0.709, p < 0.001) and women (r = 0.688, p < 0.001). Among the BA predictors, waist circumference, alcohol consumption, and smoking duration were positively associated with BA (Equations (S1)-(S3)).
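The correlation coefficient reported between BA and CA is assumed here to be the Pearson coefficient, which the text does not state explicitly. A minimal sketch of its computation, with a hypothetical function name:

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Applied to paired lists of chronological and computed biological ages, this yields a value between −1 and 1; it is undefined when either list is constant.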

Discussion
In this study, we developed a machine learning-based self-assessment of BA using body measurements, SES, lifestyle factors, and the presence of comorbidity. We found that BA was more strongly associated with the risk of developing chronic diseases than CA. Individuals with a BA lower than their CA had a decreased risk of developing chronic diseases, and the risk increased rapidly within a short period of follow-up (within 5 years). A recent study forecasted a continued increase in global life expectancy [35]. However, the prevalence of comorbidities also increases with age, which decreases quality of life and increases the burden of disease [36,37]. Thus, increasing healthy life expectancy has become more important since the turn of the 21st century. Based on the different health statuses of people of the same CA, several previous studies have proposed BA as an index to represent biological health status. However, most of these studies were based on clinical data such as laboratory blood tests [9,38], physical tests (grip strength and vertical jump) [8], physiological factors (body mass index and percent body fat mass) [9,38], metabolomics [7], and DNA methylation [25]. Although clinical information may reflect biological aging status, it is difficult for the general population to interpret, and there are restrictions on its collection.
In this study, we computed BA based on self-assessed variables including body measurements, SES, modifiable lifestyle factors, and the presence of comorbidities. As there are substantial changes in body shape that occur during the aging process [6], body measurements are useful indicators for estimating biological aging. The relationship between SES and acceleration of aging has been well established [4]. According to previous studies, lifestyle modification is effective in decreasing the risk of cardiovascular disease in primary prevention [14,39]. However, there are increasing health inequities due to socioeconomic disparities [18]. Thus, BA might be a valuable predictor of health status, which has an impact on health equity.
Among the risk factors, we found a positive association between waist circumference and BA. This relationship is supported by prior studies showing that excess body weight is associated with DM, CKD, and cardiovascular diseases [11][12][13]. We also found that lifestyle factors including smoking duration and drinking frequency were significantly associated with BA. This is in line with the J-shaped relationship between alcohol consumption and all-cause and all-cancer mortality in Korea [40]. The association with smoking duration is also consistent with evidence that smoking increases oxidative stress and inflammation, which accelerate aging [41][42][43]. These findings support the idea that lifestyle modification is effective in delaying biological aging. Further research including dietary intake and type of exercise is needed to assess the association between healthy lifestyles and BA more comprehensively.
In this study, we used a machine learning algorithm, specifically elastic net regression, to estimate BA. Previous studies have used multiple linear regression and principal component analysis to compute BA, but these methods resulted in overfitting and low interpretability, respectively [44]. Therefore, we selected elastic net linear regression with 10-fold cross-validation to produce a model that minimizes overfitting, reduces bias, and is easy to interpret [33,34].
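For reference, the elastic net objective in the parameterization used by the 'glmnet' package can be written as follows. The symbols are introduced here for illustration and do not appear in the original text: y_i is the response for participant i, x_i the vector of standardized markers, λ ≥ 0 the overall penalty strength, and α ∈ [0, 1] the mixing weight between the lasso (α = 1) and ridge (α = 0) penalties.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\left(y_i - \beta_0 - x_i^{\top}\beta\right)^2
+ \lambda\left[\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2
+ \alpha\,\lVert\beta\rVert_1\right]
```

The L1 term drives some coefficients exactly to zero, which yields a sparse and interpretable set of markers, while the L2 term stabilizes estimates for correlated predictors such as weight and waist circumference. In a cross-validated fit, λ is typically chosen to minimize the held-out prediction error.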
In addition, previous studies on BA have observed the likelihood of diseases by BA in cross-sectional data [8] or the risk of death in cohort data [7,9] rather than observing the risk of developing diseases. In this study, using data from a large cohort of 101,980 Koreans aged 40 to 89 years, we validated that BA could be used as a significant indicator of the risk of developing chronic disease.
This study has several strengths. First, it has a large sample size. Second, to our knowledge, this is the first approach to advance the study of BA by using factors that are well-measured, well-understood, and easily collected. Because BA consists of modifiable factors, it could be useful for detecting high-risk groups and preventing further risks through healthy lifestyle promotion. Third, we reduced the risk of model overfitting by using elastic net regression to predict BA. Finally, we validated that BA is an important indicator of the risk of chronic diseases and their combination in both short (within 5 years) and long follow-up periods.
However, some limitations of this study need to be considered. First, because BA relies on self-reported factors, recall and misclassification biases should be considered. Second, although we examined the combined risk of DM and HT, we could not estimate the risk of their combination with CKD because of the small number of CKD events. Further research is needed to investigate the role of BA in various combinations of chronic diseases. Lastly, due to the lack of data, we were unable to examine the association of BA with mortality and frailty. Further study on the effects of BA on mortality and frailty is needed.

Conclusions
In conclusion, this study suggests that self-assessment of BA could be an effective tool for detecting high-risk groups and reducing disease burden through individual and population-level health promotion.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jpm12030505/s1, Table S1: Baseline characteristics of healthy participants at baseline among 103,912 cohort participants in the Korean Genome and Epidemiology Study (KOGES), Figure S1: Selection algorithm of study subjects for the analyses among 101,980 cohort participants in the Korean Genome and Epidemiology Study (KOGES), Figure S2
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Raw data were generated from the Korea Genome and Epidemiology Study (KOGES). The derived data supporting the findings of this study are available from the corresponding author upon request.