The Significant Association between Health Examination Results and Population Health: A Cross-Sectional Ecological Study Using a Nation-Wide Health Checkup Database in Japan

In Japan, population health, as indicated by life expectancy (LE) and healthy life expectancy (HALE), varies across the 47 prefectures (administrative regions). This study investigates how health examination results, including attitude toward improving life habits, are associated with population health. The association between health checkup variables and summary population health outcomes (i.e., life expectancy and healthy life expectancy) was investigated using a cross-sectional ecological design with prefectures as the unit of analysis. The medical records, aggregated by prefecture, gender, and age in the National Database of Health Insurance Claims and Specific Health Checkups of Japan (NDB) Open Data Japan, were used as health checkup variables. Body weight, blood pressure, liver enzymes, drinking habits, smoking habits, diabetes, serum lipids, and answers to questions regarding attitude toward improving health habits were significantly correlated with population health outcomes. Multiple regression analysis also revealed significant associations between these variables and population health. This study highlights that health examination results, including attitude toward improving health habits, are significantly associated with population health. Consequently, implementing measures to improve health habits in response to the examination results could help the population maintain a healthy life.


Introduction
Population health significantly varies based on geography. Japan consists of 47 administrative regions called prefectures. Each prefecture has its unique natural features and culture. While health habits (e.g., diet, smoking, and drinking) are substantially associated with population health [1][2][3], these differ across prefectures. Life expectancy (LE), a commonly used summary population health indicator, varies significantly across prefectures [4]. Differences in LE among prefectures are increasing [5,6].
The Japanese health care system is based on Bismarck-type health insurance, in which all citizens are covered under one of the public health insurance programs. The programs for individuals aged 40 to 74 years are divided into 2 categories, i.e., the employment-based health insurance system, in which company employees and their family members are enrolled, and the residence-based National Health Insurance system, which is for people not eligible for the employment-based insurance system [7]. Public health insurers are obliged to provide a specific health checkup for the insured individuals aged 40 to

Study Design
This study investigated the association between health checkup variables and summary population health outcomes (i.e., LE and HALE) using a cross-sectional ecological design with the 47 Japanese prefectures as the unit of analysis. We analyzed the NDB Open Data Japan, which contains approximately 27.8 million medical records of adults nationwide aged 40 to 74 years in the 2016 fiscal year [9]. The Research Ethics Committee of the International University of Health and Welfare waived approval for this study.

Health Checkup Variables
This study used medical records aggregated by prefecture, gender, and age in the fourth NDB Open Data Japan [9] as health checkup variables. These were divided into two groups: laboratory tests and answers to questions regarding health habits. Specifically, the prefecture-level average values in laboratory tests and the percentage of respondents who selected each answer option in the questions were used as health checkup variables in the correlation analysis. The health checkup program included the following laboratory tests: body mass index (BMI), abdominal circumference, fasting plasma glucose, HbA1c, systolic blood pressure, diastolic blood pressure, triglyceride, high density lipoprotein cholesterol (HDL-C), low density lipoprotein cholesterol (LDL-C), aspartate transaminase (AST), also known as glutamic oxaloacetic transaminase (GOT); alanine transaminase (ALT), also known as glutamate-pyruvate transaminase (GPT); γ-glutamyltransferase (γ-GT or γ-GTP), and hemoglobin. Table 1 summarizes the laboratory tests and the nationwide average value in each test [9]. Table 2 summarizes the questions about health habits, the answer options for each question, and the nationwide average percentages of respondents who selected each answer [9].

Standardization of Health Checkup Data
The health checkup raw data were standardized by age using the following formula: Standardized data = (∑ age-specific raw data in a 5-year age group × standard population in that age group)/(total population in standard population); the Japanese population in 2015 [14] was used as the standard population.
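The standardization formula above can be illustrated with a minimal Python sketch. The function name, the age groups shown, and all numbers are hypothetical and chosen only for illustration; the sketch also assumes that the total standard population is taken over the same age groups as the raw data.

```python
def age_standardize(age_specific_values, standard_population):
    """Directly standardize age-specific values to a standard population.

    Both arguments are dicts keyed by 5-year age group. The result is the
    weighted mean of the age-specific values, with the standard population
    of each age group as the weight.
    """
    if age_specific_values.keys() != standard_population.keys():
        raise ValueError("age groups of the two inputs must match")
    total = sum(standard_population.values())
    weighted_sum = sum(
        age_specific_values[group] * standard_population[group]
        for group in age_specific_values
    )
    return weighted_sum / total


# Hypothetical prefecture-level means (e.g., systolic blood pressure, mmHg)
# and hypothetical standard population counts; not values from the study.
mean_by_age = {"40-44": 118.0, "45-49": 121.0, "50-54": 124.0}
standard_pop = {"40-44": 1000, "45-49": 900, "50-54": 800}

standardized = age_standardize(mean_by_age, standard_pop)
print(round(standardized, 2))
```

Because every prefecture is weighted by the same standard population, differences in the standardized values are not driven by differences in age structure between prefectures.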

Summary Population Health Outcomes
The MHLW has published LE by prefecture and gender every five years since 1965. Table 3 shows the results of the LE survey conducted in 2015 [10], which were used as a summary population health variable. The MHLW has also published HALE, defined as the average duration of time spent without limitation in daily activities, by prefecture and gender every three years since 2010. Table 3 also shows the results of the HALE survey conducted in 2016, which were used as a summary population health variable [13]. Due to the massive earthquake in Kumamoto prefecture in 2016, there were no HALE data for this location. Thus, we conducted the correlation analysis between health checkup variables and HALE for the remaining 46 prefectures.

Statistical Analysis
We evaluated the association between health checkup variables and summary population health outcomes (i.e., LE and HALE) by gender, with the 47 prefectures as the unit of analysis. For each pair of variables, the correlation was quantified with Pearson's correlation coefficient (r) and tested for significance with a t-test. Additionally, a multiple regression analysis was performed using a model with each summary population health outcome as the objective variable and two groups of health checkup variables (i.e., the laboratory tests and the answers to the questions regarding health habits) as candidate explanatory variables. The backward stepwise selection procedure was used to select the explanatory variables from the candidate variables for final inclusion in the regression model, with the criterion set at p < 0.2. In the regression analysis using the laboratory tests, BMI, fasting plasma glucose, HbA1c, systolic blood pressure, triglyceride, HDL-C, LDL-C, γ-GT, and hemoglobin were included as candidate explanatory variables (BMI was used as the body weight-related variable among BMI and abdominal circumference. Systolic blood pressure was used as the blood pressure-related variable among systolic and diastolic blood pressure. γ-GT was used as the liver enzyme-related variable among GOT, GPT, and γ-GT). In the regression analysis using the answers to the questions regarding health habits, all question items (Q1-22) were included as candidate explanatory variables. Only answer option 1 was the target of analysis for question items with 3 or more answer options. A quantile-quantile (Q-Q) plot was used to verify the normality assumption of the residuals in the regression analysis. The significance of individual regression coefficients was tested by t-test. p values < 0.05 were considered statistically significant. Statistical analysis was performed with BellCurve for Excel version 3.20 (Social Survey Research Information Co., Ltd., Tokyo, Japan).
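The correlation test described above can be sketched in a few lines of Python. This is a generic illustration, not the BellCurve for Excel procedure used in the study; the function names and the sample values are hypothetical. It relies on the standard result that, under the null hypothesis of no correlation, t = r·sqrt((n − 2)/(1 − r²)) follows a t distribution with n − 2 degrees of freedom.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length with n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical prefecture-level values (mean BMI vs. LE), for illustration only.
bmi = [23.1, 23.5, 24.0, 24.2, 24.8]
le = [81.6, 81.3, 81.0, 80.9, 80.4]

r = pearson_r(bmi, le)
t = t_statistic(r, len(bmi))
print(r, t)
```

The resulting |t| is compared with the two-sided critical value for n − 2 degrees of freedom. With 47 prefectures (45 degrees of freedom) the tabulated 5% critical value is roughly 2.01, which is quoted here as an approximate reference value rather than a figure from the study.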
Table 4 summarizes the results of correlation analysis between laboratory tests as health checkup variables and summary population health outcomes. A positive correlation implying statistically significant linear relation was found between the following pairs of variables: (1) HbA1c-HALE in females and (2) LDL-C-LE in males. Meanwhile, a negative correlation implying statistically significant linear relation was present between the following pairs of variables: (1) BMI-LE in both genders, (2) systolic blood pressure-LE in both genders, (3) GOT-LE in both genders and HALE in males, (4) GPT-LE in both genders and HALE in males, (5) γ-GT-LE in both genders and HALE in males, (6) diastolic blood pressure-LE in both genders, (7) fasting plasma glucose-LE in males, (8) triglyceride-LE in males, (9) HDL-C-HALE in females.

Multiple Regression Analysis Using Laboratory Tests as Explanatory Variables
Multiple regression analysis for predicting population health outcomes using laboratory tests as explanatory variables was performed. Table 5 summarizes the results by gender. BMI, systolic blood pressure, and γ-GT were significantly and negatively associated with LE in males (R² = 0.5725, F = 19.1987, p < 0.001). HbA1c showed a significant positive association and γ-GT a significant negative association with HALE in males (R² = 0.3001, F = 9.2195, p < 0.001). BMI, systolic blood pressure, and HDL-C showed a statistically significant negative association with LE in females (R² = 0.2964, F = 6.0374, p = 0.0016). γ-GT was significantly and negatively associated with HALE in females (R² = 0.2959, F = 3.3614, p = 0.0125). The normality of the residual distribution in each regression analysis was confirmed by Q-Q plot.
Table 6 summarizes the results of the correlation analysis between answers to the questions about health habits as health checkup variables and summary population health outcomes. A positive correlation implying a statistically significant linear relation was found between the following pairs of variables: (1) Question (Q)2/Yes (Y) (hypoglycemic drug)-HALE in females, (2) Q6/Y (chronic kidney failure)-HALE in females, (3) Q7/Y (anemic)-LE in males, (4) Q18/answer (A)3 (drink rarely)-LE in males and HALE in females, (5) Q19/A1 (drink less than 180 mL of sake)-LE in males, (6) Q21/A1 (no intention of improving health habits)-HALE in females, and (7)
Ans.: answer selected; r: Pearson's correlation coefficient. The cells filled with red and blue represent positive and negative correlations, respectively, at a statistically significant level (p < 0.05) by t-test.

Multiple Regression Analysis Using Questions about Health Habits as Explanatory Variables
Multiple regression analysis for predicting population health outcomes using questions about health habits as explanatory variables was performed. Table 7 summarizes results by gender. Q1/Y and Q8/Y were factors significantly associated with LE in males (R² = 0.5273, F = 9.1472, p < 0.001). Q4/Y, Q5/Y, Q8/Y, Q13/Y, Q15/Y, Q18/A1, and Q22/Y were factors significantly associated with HALE in males (R² = 0.7216, F = 9.0702, p < 0.001). Q8/Y, Q9/Y, and Q17/Y were factors significantly associated with LE in females (R² = 0.5798, F = 4.9678, p < 0.001). Q2/Y, Q13/Y, Q18/A1, and Q20/Y were factors significantly associated with HALE in females (R² = 0.5701, F = 7.1982, p < 0.001). Normality of residual distribution in each regression analysis was met as assessed by Q-Q plot.
Question items (Q1-22) were included as candidate explanatory variables. The explanatory variables, which were finally included in the regression model and presented in this table, were selected from the candidate variables using the backward stepwise selection procedure with the criteria of p < 0.2. CI: confidence interval; B: partial regression coefficient; β: standardized partial regression coefficient; VIF: variance inflation factor; R²: coefficient of determination. * p < 0.05, ** p < 0.01 by t-test.
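As a reading aid for the regression summaries, the overall F statistic of an ordinary least squares model with an intercept is tied to its coefficient of determination by F = (R²/k)/((1 − R²)/(n − k − 1)), where n is the number of observations and k the number of explanatory variables in the final model. The sketch below applies this standard identity to the laboratory-test model for LE in males (R² = 0.5725, F = 19.1987). It assumes n = 47 prefectures and that the final model retained exactly the three reported predictors (k = 3); that value of k is an assumption of this example, not a figure stated in the text, and the function name is hypothetical.

```python
def f_from_r2(r2, n, k):
    """Overall F statistic of an OLS model with intercept, from R^2.

    r2: coefficient of determination, n: number of observations,
    k: number of explanatory variables retained in the model.
    """
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must be in [0, 1)")
    if k < 1 or n - k - 1 <= 0:
        raise ValueError("need k >= 1 and n > k + 1")
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Reported R^2 for LE in males (laboratory tests); n and k as assumed above.
f_le_males = f_from_r2(0.5725, n=47, k=3)
print(round(f_le_males, 2))
```

Under these assumptions the recomputed value lies close to the reported F, with the small gap attributable to rounding of R². Because backward selection at p < 0.2 can retain variables that are not individually significant at p < 0.05, k is not necessarily equal to the number of significant predictors listed for a model.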

Discussion
This cross-sectional ecological study, with prefectures as the unit of analysis, used a nationwide health checkup database in Japan to examine whether health examination results are associated with summary population health.
Body weight-related variables: BMI and LE were negatively associated in both genders in the correlation study and regression analysis. A previous study suggested that the association between BMI and relative mortality risk was J-shaped. This implied that high and extremely low BMI values are associated with increased all-cause mortality. The increased mortality in those with extremely low BMI was partly due to the inclusion of individuals with diseases that cause weight loss and premature death and residual confounding by smoking [15]. The linear negative relation between BMI and LE observed in this study probably arose because individuals with diseases causing weight loss and premature death tend not to participate in health checkups.
The percentage for Q13/Y (Have you experienced body weight fluctuation of ±3 kg or more during the last year?/Yes) was negatively correlated with LE in females and HALE in males. Its negative association with HALE was also demonstrated by the regression analysis in both genders. Relapse of weight gain often occurs after dieting, and such weight fluctuation is associated with diseases, including cardiovascular diseases [16]. A recent meta-analysis concluded that weight fluctuation is associated with an elevation of all-cause mortality risk [17]. Additionally, the percentage for Q14/Y (Is your eating speed faster than others?/Yes) was negatively correlated with HALE in males. This result was expected based on the assumption that fast eating is associated with the prevalence of obesity and diabetes [18,19]. This in turn increases the risk of cardiovascular diseases, one of the primary causes of morbidity and mortality.
Blood pressure-related variables: Elevated blood pressure was substantially associated negatively with LE in both genders in the correlation study and regression analysis. This was consistent with existing evidence that high blood pressure is a major risk for mortality and morbidity [20,21]. Hypertension is associated with a wide range of acute and chronic cardiovascular diseases, such as angina and heart failure [22]. The percentage for Q1/Y (Are you currently under any medication for high blood pressure?/Yes) was negatively associated with LE in both genders in the correlation study and in males in the regression study. Since those participants who selected "Yes" presumably suffer from hypertension, this negative correlation could be due to the negative impact of hypertension on LE [20,21].
Liver enzymes-related variables: Liver enzymes were negatively correlated with LE in both genders and with HALE in males. A negative association of γ-GT with LE and HALE in males and with HALE in females was also demonstrated in the regression analysis. This negative association was expected in light of existing evidence that the elevation of these liver enzymes is a sensitive marker of various liver diseases, including alcohol- and drug-induced liver injury, hepatitis, hereditary hemochromatosis, and cirrhosis [23]. Consequently, high levels of these enzymes are predictors of all-cause mortality among the elderly [24] and the general population [25,26].
Drinking habit-related variables: The percentage for Q18/A1 (How often do you drink alcohol (e.g., sake, shochu, beer, whisky, wine)?/Every day) was negatively correlated with LE in males and HALE in both genders. Its negative association with HALE in both genders was also demonstrated by the regression analysis. The percentage for Q19/A3 (How much do you drink in terms of sake per day?/360-540 mL of sake) was negatively correlated with LE in males. Conversely, the percentage that selected A1 (less than 180 mL of sake) was positively correlated with LE in males. Further, the percentage for Q18/A3 (rarely drink) was positively correlated with LE in males and HALE in females. These results are consistent with the existing evidence that habitual heavy drinking is associated with high mortality and morbidity, although light drinking may reduce the risk of some cardiovascular diseases. Indeed, the relationship between alcohol consumption and mortality risk is generally J-shaped, i.e., the mortality risk is reduced by light consumption compared to abstinence but increases steeply as consumption increases [27,28]. Excessive alcohol consumption is one of the leading causes of premature mortality [29], and there is a strong association between heavy drinking and various diseases, such as cancer, cardiovascular disease, liver disease, and diabetes [30].
Smoking habit-related variables: The percentage for Q8/Y (Are you a habitual cigarette smoker (defined as a person who smoked a total of over 100 cigarettes or for over 6 months and has smoked in the last month) at present?/Yes) was negatively correlated with LE in both genders. Its negative association with LE in both genders was also supported by the regression analysis. This was in accordance with the accumulated evidence that a smoking habit is associated with a high risk of various chronic diseases (e.g., cancers and cardiovascular diseases) and a decrease in LE [31][32][33][34]. On the other hand, the correlation with HALE was not significant in this study. This result was inconsistent with previous studies which showed that a smoking habit is associated with a reduction of HALE [35][36][37]. This inconsistency could be due to different study designs. For instance, subjects aged 65 years or older were targeted in the previous study [35], while those aged 40-74 years were included in this study.
Diabetes-related variables: Fasting blood glucose level was negatively correlated to LE in males. This was consistent with the evidence that diabetes is associated with premature death caused by various diseases, such as cardiovascular diseases, cancers, and infectious diseases [38]. HbA1c exhibited a positive correlation with HALE in females. The relation between HbA1c and all-cause mortality in non-diabetic examinees was reportedly reverse J-shaped with an HbA1c of 5.4% as the lowest mortality risk. All-cause mortality risk does not increase significantly above an HbA1c level of 5.4% for non-diabetic examinees, although the risk is significantly higher in the low range, i.e., less than 5.0% [39]. Given that most participants were non-diabetic individuals, the result seemed consistent with this literature.
The percentage for Q2/Y (Do you take insulin injections or other medications to reduce blood glucose level at present?/Yes) was negatively correlated with LE in males. Since the participants who selected "Yes" presumably suffer from diabetes, this negative correlation could be due to the negative impact of diabetes on population health outcomes [40,41]. Meanwhile, the positive association between the percentage for Q2/Y and HALE in females was demonstrated by the correlation study and regression analysis. One possible interpretation of this paradoxical result is that the participants who selected "Yes" take measures to control blood glucose and prevent diabetic complications, such as retinopathy, neuropathy, and nephropathy. Such preventive interventions (including hypoglycemic drugs) may prevent or delay disabilities caused by diabetic complications, as a previous study [42] established that intensive diabetes management delays or prevents serious diabetic complications.
Serum lipids-related variables: Triglyceride was negatively associated with LE in males, in line with the reported negative impact of triglyceride on health. High triglyceride seems to be associated with high mortality and morbidity from cardiovascular diseases and cancer based on epidemiological and genetic evidence [43]. Excessive accumulation of triglyceride in somatic cells is involved in the pathophysiology of obesity [44], which is, in turn, associated with mortality and various comorbidities including diabetes, nonalcoholic fatty liver disease, and cardiomyopathy [45].
LDL-C and LE were positively correlated in males in the correlation study. Additionally, a negative association between HDL-C and HALE in females was observed in the correlation study and regression analysis. These results were inconsistent with the roles of LDL-C and HDL-C in the pathophysiology of atherosclerosis. Currently, LDL-C is considered a major causal factor for cholesterol transport to atherosclerotic lesions, whereas HDL-C performs a reverse transport of cholesterol to the liver [46]. Elevated LDL-C and decreased HDL-C levels are considered risk factors for atherosclerosis in cardiovascular disease [47]. However, there are also alternative observations that are closer to this study's results. Studies have demonstrated a significant trend wherein LDL-C is negatively associated with all-cause mortality, i.e., low LDL-C is associated with high mortality risk [48][49][50]. Regarding HDL-C, a meta-analysis suggested that high HDL-C does not reduce the risk of cardiovascular diseases [51]. According to a recent study, the association between HDL-C level and all-cause mortality is U-shaped, and both extremely high and extremely low HDL-C are associated with a high risk of all-cause mortality [52]. One possible interpretation is that the participants who showed high LDL-C or low HDL-C levels may have started dyslipidemia treatment early, which could consequently help such populations maintain a healthy life.
Variables related to attitude toward improving life habits: Answers to questions about the attitude toward improving health habits were significantly associated with the population health. The percentage for Q21/A5 (Are you going to improve your life habits such as diet and exercise?/already working on health habit improvement for 6 months), i.e., the most positive attitude to improving health habits, was positively correlated to LE in both genders. Paradoxically, the percentage for Q21/A1 (no intention of improving health habits) was also positively correlated to HALE in females. Presumably, the participants who selected A1 do not need to improve health habits because they already adopted healthy habits. Consequently, they maintain a healthy life for a longer period. The percentage for Q21/A2 (health habit improvement within 6 months from now) was negatively correlated to HALE in both genders. The participants selecting A2 hesitated to practice improvement of health habits immediately despite recognizing the need to improve. They might be susceptible to developing diseases associated with disability.
The percentage for Q22/Y (Would you like to receive instructions on life habit improvements?/Yes) was negatively correlated to HALE in both genders. This negative impact on HALE was also indicated by the regression analysis. Given the assumption that the participants selecting "Yes" have concerns about their health and find it difficult to improve life habits on their own, they might be vulnerable to diseases associated with disabilities.
Limitations: This study employed a cross-sectional ecological study design and used the health checkup data and the summary population health data collected in 2016 and 2015, respectively. One of the limitations of this study is that the health checkup data were aggregated by prefecture to protect privacy. Hence, individual-level analysis could not be conducted, and the findings can only indicate that health examination results are significantly associated with population health at the prefecture level. A study that uses individual data remains to be conducted to determine the factors associated with individual health. The other limitation is related to the selection bias attributable to the relatively low participation rate (slightly higher than 50%) for the health checkup. Since participation is not mandatory for eligible adults (aged 40 to 74 years), the participation rate remains low and varies based on the insurers providing the checkup programs [53]. For example, the participation rate was 75% for people insured by large companies' insurance associations, a type of Employee Health Insurance, and 37% for those insured by municipality-based National Health Insurance [53]. Thus, the participation rate is generally higher for employees than for self-employed persons, retirees, and non-working dependents. Such differences in characteristics between participants and non-participants could affect the findings. Indeed, characteristics such as socioeconomic condition, education level, financial status, and mental condition could affect population health outcomes. For example, a recent study which performed multiple regression analysis using Japanese population data aggregated by prefecture demonstrated that LE at 65 years of age in females was significantly affected by healthcare resource factors (beds per capita, doctors per capita, and medical expenses for the elderly) and an environmental factor (air pollution) [54].
Future research should focus on multifactorial analyses that include and control for multiple factors, such as socioeconomic, mental condition-related, and environmental variables.

Conclusions
This study highlights that health examination results, including attitude toward improving health habits, are significantly associated with population health at the prefecture level. Thus, implementing measures to improve health habits in response to the examination results could help the population maintain a healthy life.