Article

CVD Mortality Disparities with Risk Factor Associations Across U.S. Counties

Harvard University, Massachusetts Hall, Cambridge, MA 02138, USA
Healthcare 2025, 13(22), 2937; https://doi.org/10.3390/healthcare13222937
Submission received: 10 October 2025 / Revised: 13 November 2025 / Accepted: 14 November 2025 / Published: 17 November 2025

Abstract

Introduction: Cardiovascular disease (CVD) remains a leading cause of mortality worldwide, with persistent geographic disparities driven by a complex interplay of risk factors. Regularly updated evidence on localized variations in CVD mortality is essential for developing targeted interventions and optimizing disease and healthcare management. Methods: This study investigated associations between CVD mortality and a comprehensive set of biological, environmental, behavioral, and socioeconomic factors across all U.S. counties, employing correlation, geospatial visualization, stepwise multiple regression, and machine learning models to evaluate the importance of risk associations. Results: Significant disparities in CVD mortality trends were observed across race, age, sex, and region, with elevated rates among older adults, men, and Black populations, particularly in southeastern states exhibiting severe social vulnerability. Correlation analysis identified disease management (e.g., COPD, hypertension, medication non-adherence), environmental factors (PM2.5), lifestyle behaviors (e.g., smoking, sleep duration), and socioeconomic status (e.g., poverty, single-parent households, education) as important contributors to CVD mortality. Conversely, higher household income, physical activity, and cardiac rehabilitation participation were strongly protective. Multiple regression explained 66.9% of the variance in CVD mortality, identifying PM2.5, smoking, and medication non-adherence as the top associated factors. Random Forest models underscored COPD’s predictive dominance, followed by medication non-adherence, smoking, and sleep duration. Conclusions: The findings highlight the geospatial connection of risk factors to CVD mortality disparities across U.S. counties. They emphasize the critical importance of data-driven strategies targeting air quality, tobacco control, social inequities, and chronic disease management to mitigate CVD burden and promote health equity.

1. Introduction

Cardiovascular disease (CVD) is recognized as one of the foremost challenges to global public health in the 21st century [1]. It claims millions of lives annually, with nearly 80% of deaths occurring in lower-income countries, reflecting disparities in societal and economic development. Even in the U.S., CVD remains a primary healthcare concern despite extensive advances in prevention and treatment [1]. Projections indicate a rise in prevalence and associated costs through 2050 due to population aging and persistent risk factors, especially within racially diverse communities [2]. While improvements in medical technology and healthcare have reduced age-adjusted mortality, regional and social disparities in CVD outcomes persist. Understanding the regional distribution and impact of diverse risk factors is crucial for designing targeted public health interventions.
Multiple risk factors, including hypertension, diabetes, obesity, smoking, physical inactivity, and insufficient sleep, have been consistently recognized as biological and behavioral determinants of adverse cardiovascular outcomes [1,2,3,4,5]. Beyond these traditional factors, emerging evidence highlights environmental and social determinants, such as air pollution exposure (PM2.5) [6], heavy metal contamination [7], transportation noise [8], limited green spaces [9], and socioeconomic disadvantage [10,11,12], in exacerbating CVD risk across regions and populations. Particularly, household socioeconomic status, encompassing education, income, and Internet access, has been linked to cardiovascular health, indicating systemic inequities in medical access and information [10,13,14,15].
The interplay among these determinants is complex and often synergistic. More than half of CVD cases may result from the combined impact of five major modifiable factors [5]. This intricate interplay often fosters bidirectional relationships between CVD and other conditions, such as cancer, mental disorders, and chronic obstructive pulmonary disease (COPD) [12,16]. Moreover, the relationships between the interrelated risk factors and CVD outcomes vary across populations and geographic areas [10,11,17]. This underscores the necessity for geographically fine-grained insights into their associations to inform targeted public health interventions.
Numerous studies have explored CVD risk and factor associations across various geographic scales. Specifically, state-level analyses in Georgia identified household income and PM2.5 as the leading contributors [18], while national studies emphasized demographic composition, education, income inequality, and social vulnerability [19]. Research from South Korea linked high CVD mortality to poor air quality and insufficient green infrastructure [9]. Cross-national studies suggest that CVD mortality in one country may be influenced by conditions in surrounding countries, with income and other socioeconomic variables being key influences [14]. Despite these efforts, most analyses have relied on geographical averages or limited variable sets, often evaluating individual factors in isolation. There remains a critical need for geographically resolved analyses that consider all relevant factors simultaneously [2].
The absence of such integration has precluded measuring factors’ joint effects or spatial interactions, thereby constraining our understanding of CVD disparities at a fine-grained geographic level. To address this gap, the present study conducted a county-level analysis across all U.S. regions to capture the complex interdependencies between CVD mortality and a comprehensive set of risk factors. The study’s novelty resides in its high-resolution integration of multi-domain variables, covering traditional, environmental, behavioral, and socioeconomic perspectives, within a unified, data-driven framework. This framework significantly advances prior work by simultaneously evaluating the comprehensive set of factors through a combination of conventional statistical techniques and machine learning (ML) models and visualizing their geospatial relationships.
Accordingly, this study aimed to (1) investigate macro-scale geographic associations between CVD mortality and a complex mixture of human, social, and environmental factors; (2) visualize the spatial distribution of CVD mortality and risk factors across U.S. counties; (3) employ advanced statistical and ML methods to analyze complex interdependencies; (4) inform data-driven and targeted intervention strategies to reduce CVD mortality and promote health equity. Ultimately, the findings will contribute to a more comprehensive understanding of the multifactorial and geographically heterogeneous nature of CVD mortality in the United States.

2. Methodology

2.1. Data Collection

The data used for the analyses were collected from publicly accessible databases of the Centers for Disease Control and Prevention (CDC) on 21 November 2024. The data were organized at national-by-county geographic resolution and comprised CVD mortality rates and a range of other variables. The CVD mortality rates, expressed as the number of total cardiovascular disease deaths per 100,000 population, were sourced from the CDC Heart Disease & Stroke Interactive Atlas (http://nccd.cdc.gov/DHDSPAtlas/Reports.aspx (accessed on 21 November 2024)). This dataset, spanning the period 2006–2021, was disaggregated by age, gender, and ethnicity, allowing for longitudinal analysis of mortality trends before and during the COVID-19 pandemic across the U.S. Other datasets were gathered from the National Environmental Public Health Tracking Network (https://ephtracking.cdc.gov/download (accessed on 21 November 2024)) or the CDC National Center for Health Statistics (NCHS) (https://data.cdc.gov/browse (accessed on 21 November 2024)), wherever they were available.
Specifically, drug poisoning mortality rates were downloaded from NCHS. The Interactive Atlas was used to extract data in four domains: (i) disease and lifestyle prevalence (coronary heart disease, high blood pressure, stroke, high cholesterol, diabetes, obesity, physical inactivity, alcohol use, insufficient sleep, and smoking status); (ii) social and economic status (broadband Internet, computer ownership, education, food stamps, median home value, median household income, income inequality, poverty, housing cost burden, and unemployment rate); (iii) physical environment (air quality, park access, and urbanization); and (iv) healthcare delivery status (insurance coverage, care costs, blood pressure medication, diuretic non-adherence, renin-angiotensin system antagonist non-adherence, cholesterol-lowering medication, cholesterol screening, cardiac rehabilitation, hospitals, pharmacies, physicians, and specialists). The Tracking Network was used to collect the prevalence of asthma, cancer, and COPD; demographic and socioeconomic indices (community capital resilience, economic resilience, environmental resilience, infrastructural resilience, institutional resilience, social resilience, and social vulnerability); individual components such as household composition, transportation, and single family; and built-environment measures such as age of housing, land cover and use, traffic safety, sunlight, and exposure to hazards.
To ensure temporal consistency across all domains in correlation and statistical modeling analyses, both CVD mortality and exposure variables were averaged over a three-year period (2019–2021). This timeframe was chosen to minimize the influence of short-term fluctuations and anomalous events, such as the immediate COVID-19 shock in 2020, while still capturing contemporary conditions reflective of current CVD risk patterns.
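The three-year averaging step can be illustrated with a minimal sketch. The table layout, column names (`FIPS`, `year`, `cvd_mortality`), and values below are assumptions made for illustration and are not taken from the study data.

```python
import pandas as pd

# Hypothetical long-format table: one row per county (FIPS) and year.
# Column names and values are illustrative, not the study's data.
records = pd.DataFrame({
    "FIPS": ["01001", "01001", "01001", "01003", "01003", "01003"],
    "year": [2019, 2020, 2021, 2019, 2020, 2021],
    "cvd_mortality": [410.0, 450.0, 430.0, 380.0, 400.0, 390.0],
})


def three_year_mean(df, value_col, start=2019, end=2021):
    """Average value_col per county over the inclusive year window."""
    window = df[df["year"].between(start, end)]
    return window.groupby("FIPS")[value_col].mean()


county_mean = three_year_mean(records, "cvd_mortality")
print(county_mean)
```

Averaging within the window dampens a single anomalous year (here the middle value) without discarding it.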

2.2. Data Preparation

The gathered datasets contained either FIPS codes (Federal Information Processing Standards codes that uniquely identify U.S. counties) or county-state geographic information. For datasets missing FIPS codes, a geographic identifier was added by cross-checking the county-state information against a standardized code reference table using a Python-based approach. The individual datasets were then merged into a single unified dataframe, with the ‘FIPS’ code as the primary index key, using the pandas data manipulation library in Python. This allowed for the integration of data from various sources into a coherent structure for subsequent analyses. For the regression and ML modeling described below in the Data Analyses subsection, entries with missing values (NA) were removed via listwise deletion to maintain analytical integrity and statistical reliability. This approach was chosen instead of imputation because missing values were not clustered within particular variables, minimizing the risk of bias. It also avoided introducing artificial variance or assumptions about the missingness pattern that could distort real associations. Future research with denser or individual-level data could benefit from imputation strategies; however, for the present county-level dataset, deletion provided the most transparent and replicable method. Exploratory data analyses (EDAs) based on scatter plots were initially performed to visualize the distribution of the variables of interest. This visual inspection helped identify the need for data transformation. Box–Cox transformation was considered to stabilize variance if the scatter plot indicated a non-normal distribution or heteroscedasticity; otherwise, Johnson transformation was explored as an alternative. At this stage, twenty of the 111 total variables (18%) were transformed to ensure comparable scale and variance prior to data analysis.
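The preparation steps described above (merging on FIPS, listwise deletion, and a Box–Cox transformation) might look roughly as follows. This is a sketch under stated assumptions: the column names and values are invented, and `scipy.stats.boxcox` is used for the transformation even though SciPy is not among the libraries the article lists.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical inputs; column names and values are illustrative only.
mortality = pd.DataFrame({
    "FIPS": ["01001", "01003", "01005", "01007", "01009", "01011"],
    "cvd_mortality": [430.0, 390.0, 520.0, 470.0, 455.0, 610.0],
})
exposure = pd.DataFrame({
    # Stored as integers here, so the leading zero has been lost.
    "FIPS": [1001, 1003, 1005, 1007, 1009, 1011],
    "pm25": [8.1, 7.4, np.nan, 9.0, 12.5, 6.8],
})

# Normalize FIPS to a zero-padded five-character string before joining.
for frame in (mortality, exposure):
    frame["FIPS"] = frame["FIPS"].astype(str).str.zfill(5)

merged = mortality.merge(exposure, on="FIPS", how="outer").set_index("FIPS")

# Listwise deletion: drop any county with a missing value in any column.
complete = merged.dropna(how="any")

# Box-Cox needs strictly positive input; lambda is chosen by maximum likelihood.
transformed, lam = stats.boxcox(complete["pm25"].to_numpy())
print(len(merged), len(complete), round(float(lam), 3))
```

Treating FIPS as text rather than as a number is the design choice that matters most here, because integer storage silently drops the leading zero of codes such as 01001 and breaks the join.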

2.3. Data Analyses

The study was structured with counties as the units of analysis. One-way ANOVA was used to determine differences in CVD mortality across race, age, gender, and year. This initial step allowed for the identification of potential trends and disparities in CVD mortality over time and across demographic groups. To evaluate risk factors influencing CVD mortality, a stepwise analytical framework was employed, as detailed below.
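As a hedged illustration of the one-way ANOVA step, the snippet below compares three groups of county-level rates. The group values are invented, and `scipy.stats.f_oneway` is an assumed implementation choice; the article does not state which routine was used.

```python
from scipy import stats

# Hypothetical county-level mortality rates (per 100,000) for three groups.
group_a = [410.0, 430.0, 455.0, 420.0]
group_b = [350.0, 365.0, 340.0, 372.0]
group_c = [300.0, 310.0, 295.0, 320.0]

# One-way ANOVA tests the null hypothesis that all group means are equal.
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4g}")
```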
Correlation analysis was first used to examine relationships between CVD mortality and exposure variables. Pearson’s correlation coefficient was calculated to measure the strength and direction of linear associations between the variables. The factors exhibiting low to strong correlation with CVD mortality were selected for later statistical modeling and visualization. Geographic information systems (GISs) were used to map the spatial distribution of CVD mortality and selected risk factors. This enabled the visualization of regional disparities in CVD mortality and risk factor associations across U.S. counties.
Multiple regression analysis with stepwise least-squares selection was used to estimate the independent and combined effects of the chosen factors on CVD mortality. Variables with high collinearity or non-significant effects (p-value > 0.05) were excluded to improve the regression. This framework accounted for potential confounders and interactions among variables to ensure robust inference. To assess regression robustness, 10-fold cross-validation was performed on the final model. Model adequacy was further visualized through the spatial mapping of residuals at both the county and state levels, allowing for a geographic diagnosis of under- or over-estimation.
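The 10-fold cross-validation of a fitted linear model can be sketched as below. The data are synthetic, and the shuffling and seed are assumptions; the article does not describe how the folds were formed.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the county table: 200 rows, 3 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 1.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Ten folds; each fold is held out once while the model is fit on the rest.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")
print(f"mean R2 = {scores.mean():.3f} (sd {scores.std():.3f})")
```

A small spread of the fold-wise R2 values around their mean is the kind of evidence that supports a claim of robustness.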
To complement the regression analysis, two popular ML models, Random Forest (RF) and Support Vector Machine (SVM), were implemented to better capture potential nonlinear relationships. This dual-model strategy enabled a comparative and complementary assessment of predictive performance: RF provided feature importance metrics, while SVM served as a nonlinear benchmark to validate observed relationships. Hyperparameters for the RF configuration included n_estimators = 500, max_depth = None, min_samples_split = 2, min_samples_leaf = 1, max_features = ‘sqrt’, bootstrap = True, random_state = 42, and n_jobs = −1. For SVM, the parameters were kernel = ‘rbf’, C = 10, epsilon = 0.1, and gamma = ‘scale’. The dataset was randomly partitioned into 80% for training and 20% for subsequent evaluation. Fivefold cross-validation within the training set was used to provide additional error control for ML performance. Model performance was assessed with Mean Squared Error (MSE) and R-squared (R2).
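The stated configuration can be reproduced in scikit-learn roughly as follows. This is a sketch on synthetic data, and two points are assumptions: the SVM is taken to be the regression variant (`SVR`), since an epsilon parameter and MSE are reported, and the seed for the 80/20 split is set to 42 for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVR

# Synthetic data standing in for the merged county table.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 5))
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.3, size=300)

# 80% training, 20% held out for evaluation (split seed is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "RF": RandomForestRegressor(
        n_estimators=500, max_depth=None, min_samples_split=2,
        min_samples_leaf=1, max_features="sqrt", bootstrap=True,
        random_state=42, n_jobs=-1,
    ),
    "SVM": SVR(kernel="rbf", C=10, epsilon=0.1, gamma="scale"),
}

results = {}
for name, model in models.items():
    # Fivefold cross-validation inside the training set only.
    cv_r2 = cross_val_score(model, X_train, y_train, cv=5, scoring="r2").mean()
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "cv_r2": cv_r2,
        "mse": mean_squared_error(y_test, pred),
        "r2": r2_score(y_test, pred),
    }

for name, metrics in results.items():
    print(name, {k: round(float(v), 3) for k, v in metrics.items()})
```

Keeping the cross-validation inside the training partition means the 20% test set is touched only once, for the final MSE and R2.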
All analyses were performed in Python (v 3.11) using open-source libraries: pandas (2.2.2) and numpy (1.26) for data processing, scikit-learn (1.4.2) for ML algorithms and metrics, and matplotlib (3.8) and seaborn (0.13) for visualization. Together, the comparative and complementary design of statistical inference and predictive modeling allows the results of each approach to be cross-checked against the others. Convergence of key variables across these frameworks enhances analytical confidence in a unified understanding of the risk associations of CVD mortality. The inclusion of spatial residual mapping improves the transparency of the analysis and supports a comprehensive understanding of the geographic and multifactorial determinants of CVD mortality across the U.S.

3. Results

3.1. Demographic Disparities

The data were initially explored to visualize trends and disparities in CVD mortality across racial, age, and gender groups before examining the associated risk factor profiles. Mortality rates declined notably from 2006 to 2019 (Figure 1A), consistent with the effects of public health interventions and advances in medical treatment. However, the trend rebounded in 2020 and reached levels similar to those of 2011 and 2012, suggesting that COVID-19-related factors had a significant impact on CVD mortality. Significant disparities in CVD mortality were observed between men and women (Figure 1B): men consistently exhibited higher mortality rates than women across all age groups. Age was strongly associated with CVD mortality, with the highest death rates occurring in the 65+ age group (Figure 1C), emphasizing the critical need for early prevention and risk management strategies. The data also revealed stark disparities in CVD mortality across racial and ethnic groups (Figure 1D), with Black populations experiencing the highest rates, followed by Native American, White, Mixed, Hispanic, and Asian populations. Unsurprisingly, mixed-race groups showed the highest variation in CVD mortality. The pronounced disparities across racial and ethnic groups might reflect long-standing inequities in healthcare access, socioeconomic conditions (e.g., poverty, food insecurity, inadequate housing, and unemployment), and exposure to other risk factors such as hypertension, diabetes, smoking, mental health conditions, and environmental toxins [20,21]. To address disparities across demographic groups, further research is needed that integrates comprehensive behavioral, socioeconomic, and environmental data into cardiovascular health strategies to develop targeted interventions [20].

3.2. Key Correlation Factors

The correlation analyses between CVD mortality and the various factors were used to grade the strength of each association. Conventionally, correlation was considered strong (absolute R ≥ 0.7), moderate (0.5 ≤ absolute R < 0.7), low (0.3 ≤ absolute R < 0.5), or negligible (absolute R < 0.3) [22]. The strong to low correlations identified in this study are summarized in Table 1, while the full matrix is available in the Supplementary Materials (Table S1). A large portion of factors had p-values ≤ 0.0001, indicating significant relationships with CVD mortality.
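The cut-offs quoted above translate directly into a small helper. The function name and the sample arrays are illustrative assumptions; only the thresholds come from the text.

```python
import numpy as np


def correlation_strength(r):
    """Label a Pearson coefficient using the cut-offs quoted in the text."""
    magnitude = abs(r)
    if magnitude >= 0.7:
        return "strong"
    if magnitude >= 0.5:
        return "moderate"
    if magnitude >= 0.3:
        return "low"
    return "negligible"


# Hypothetical paired values for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
r = np.corrcoef(x, y)[0, 1]  # Pearson correlation coefficient
print(round(float(r), 3), correlation_strength(r))
```

Because the label depends on the absolute value, a coefficient of −0.55 is graded "moderate" in the same way as +0.55; the sign only indicates direction.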
Among the risk factors, COPD exhibited the strongest positive correlation with CVD mortality. The scatter plot visually confirms COPD’s significance (Figure 2), but the moderate spread of data points also reflects the multifactorial nature of CVD risk. Smoking was the second most strongly associated factor, showing a moderate correlation, which underlines the critical importance of tobacco control in public health management. Likewise, the correlations for high blood pressure and stroke are consistent with their direct links to CVD. Insufficient sleep emerged as another significant risk factor, possibly linked to chronic stress and hypertension [23]. Indicators of socioeconomic disadvantage (e.g., poverty rate and reliance on food assistance programs) were moderately correlated with increased CVD mortality. Together with unhealthy lifestyles, they indicate how limited healthcare access and chronic stress may exacerbate CVD risk.
Social vulnerability indicators such as vulnerability rank, single-parent households, and vulnerability indices displayed low but significant correlations, reflecting the influence of social determinants on health outcomes. Health-related conditions and behaviors such as coronary heart disease, diabetes, physical inactivity, and non-adherence to medications were weakly associated, reflecting their contribution to CVD mortality. The low but significant R values for surrounding-environment factors such as household Internet access and PM2.5 indicate that living conditions have a measurable association with cardiovascular health.
Interestingly, obesity, though a well-established health risk factor, demonstrated only a negligible correlation (R = 0.09; p-value < 0.0001), suggesting a more nuanced relationship. Conversely, the negative correlations of factors such as median household income, social resilience, alcohol use, and institutional resilience indicate low to moderate protective associations. Although previous research has been inconclusive and conflicting as to whether alcohol consumption offers cardioprotection [24], the correlation observed in this study favors a protective association. The negative correlation for households with smartphones implies a weak to negligible protective relationship with CVD mortality. Given the high penetration of smartphones among groups with low socioeconomic status, health-related mobile applications might provide an opportunity to overcome traditional barriers to cardiac rehabilitation access [25].
In terms of accessibility, factors related to urbanization and healthcare infrastructure, such as the availability of cardiac rehabilitation hospitals and walkability, were negligibly but inversely linked to CVD mortality, reflecting marginal benefits of improved healthcare access and active transportation. Surprisingly, the geospatial correlation showed little evidence of a close relationship between CVD and cancer, even though they are two of the leading causes of death worldwide, with known common mechanisms and risk factors that predispose individuals to both conditions [12]. This negligible correlation, though statistically significant, suggests that their interconnection is not reflected at a geographic level, a phenomenon also observed in the case of obesity.
Finally, a few factors, such as arsenic in water and certain healthcare access measures (e.g., cardiovascular physicians), showed no significant relationships (p-value > 0.05). The arsenic result conflicts with a previous meta-analysis that indicated a positive association of CVD with chronic exposure to drinking water arsenic at concentrations below the WHO provisional guideline value [26]. One explanation could be the limited number of data points for this factor in the study. Collectively, these correlation patterns provide an empirical foundation for subsequent multivariable regression and ML modeling.

3.3. Geospatial Visualization

Three correlation factors at moderate (social vulnerability index—SVI), low (air quality PM2.5), and negligible (sunlight UV exposure) levels were chosen to showcase the geospatial disparities and potential environmental and social contributors to cardiovascular health outcomes across the U.S.
The geographic distribution revealed regional disparities with high CVD mortality rates prominently visible in the southeastern U.S., particularly around the Mississippi River Basin, including states like Mississippi, Alabama, Louisiana, and parts of Arkansas (Figure 3A). In contrast, areas in the western and northeastern U.S. displayed significantly lower mortality. These disparities underscore the potential role of public health interventions targeting high-risk regions to reduce CVD burden.
The SVI, as a composite measure of community resilience to external stressors such as natural disasters, economic shocks, or public health crises, demonstrated strong spatial alignment with CVD mortality (Figure 3B). High SVI scores, reflecting populations facing poverty, limited access to education and healthcare, and transportation challenges, were predominantly concentrated in the southern and southeastern regions. The geographic overlap between elevated SVI and high CVD mortality underscores how social vulnerabilities can exacerbate cardiovascular risk and worsen outcomes. Conversely, lower SVI scores in the Midwest and western states suggest stronger social and economic resilience, likely correlating with favorable cardiovascular outcomes.
Higher concentrations of PM2.5 were observed in regions with significant industrial activity, urbanization, or reliance on fossil fuel combustion, such as parts of California, the Midwest (including the Ohio River Valley), and the Northeast (Figure 3C). Long-term exposure to PM2.5 has been linked to cardiovascular conditions, including atherosclerosis, hypertension, and myocardial infarction, through mechanisms such as systemic inflammation and oxidative stress [27]. Although the visual overlap between poor air quality and high CVD mortality was not complete, many regions with high PM2.5 levels coincided with elevated death rates, indicating that chronic environmental exposures might amplify existing health disparities.
The distribution of annual sunlight exposure across the U.S. demonstrated a clear latitudinal gradient, with higher levels in southern states (Figure 3D). Sunlight has complex effects on health, with moderate exposure promoting vitamin D synthesis but excessive exposure increasing oxidative stress and inflammation, potentially influencing cardiovascular health. The southern regions, experiencing both higher sunlight intensity and elevated CVD mortality, may thus represent areas where climatic and behavioral factors interact to influence outcomes.
Taken together, the spatial patterns illustrate that CVD mortality is highly heterogeneous across the U.S., shaped by a convergence of social, environmental, and geographic factors. The high CVD mortality occurring in regions with social and environmental vulnerability highlights the critical need for location-based public health strategies that address multiple determinants simultaneously.

3.4. Multiple Regression Modeling

Variables with absolute R values ≥ 0.3 in the correlation were subjected to a stepwise multiple regression analysis. Factors with non-significant p-values (>0.05) or multicollinearity concerns in the regression were sequentially excluded to optimize model performance. During resolution of the multicollinearity, Principal Component Analysis (PCA) was applied to interrelated variables. The first principal component (PC1) from the PCA of COPD, coronary heart disease, smoking, high blood pressure, and stroke initially showed a variance inflation factor (VIF) value greater than 5, suggesting strong correlation and synergistic effects among these factors. Through stepwise removal and comparison of R2 and VIF values, smoking status was retained as it provided the optimal model balance (higher R2 and lower VIF). After optimization, all variables participating in the final regression fell below the VIF threshold of 5, confirming that multicollinearity was adequately resolved.
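The VIF screening against the threshold of 5 can be illustrated with a small NumPy sketch. The helper name and the synthetic columns are assumptions; the calculation uses the standard definition VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing predictor j on the remaining predictors.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return VIF_j = 1 / (1 - R_j^2) for each column of X."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])  # add an intercept
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        r2 = 1.0 - resid.var() / target.var()
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs


# Synthetic example: column 1 is nearly a copy of column 0; column 2 is independent.
rng = np.random.default_rng(1)
a = rng.normal(size=500)
X = np.column_stack([a, a + rng.normal(scale=0.1, size=500), rng.normal(size=500)])
vifs = variance_inflation_factors(X)
print(np.round(vifs, 2))
```

In this toy case the two near-duplicate columns exceed the threshold while the independent column stays close to 1, which mirrors the situation in which one of several interrelated predictors is retained and the others are dropped.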
The optimized regression model achieved an R2 of 66.93%, closely matching the adjusted (66.72%) and predicted (66.34%) values (Table 2). This suggests that the model has moderate predictive ability without overfitting. The regression coefficients and t-values further pointed to the strength and significance of the associations. The analysis highlighted the interplay of socioeconomic, behavioral, and environmental factors influencing CVD mortality. In particular, PM2.5 emerged as the strongest factor, followed by smoking status, blood pressure medication (BPM) non-adherence, cardiac rehabilitation eligibility, and blood pressure medication. These results highlight modifiable behavioral and clinical targets for CVD prevention.
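The relationship among R2, adjusted R2, and predicted R2 can be made concrete with a short sketch. It assumes the usual definitions (adjusted R2 penalizes the number of predictors; predicted R2 is 1 − PRESS/SST from leave-one-out residuals); the article does not state how its predicted value was computed, and the data here are synthetic.

```python
import numpy as np


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


def ols_r2_and_predicted_r2(X, y):
    """Return (R2, predicted R2) for an OLS fit with an intercept.

    Predicted R2 = 1 - PRESS / SST, where PRESS sums the squared
    leave-one-out residuals e_i / (1 - h_ii).
    """
    design = np.column_stack([np.ones(len(y)), X])
    hat = design @ np.linalg.pinv(design.T @ design) @ design.T
    resid = y - hat @ y
    sst = np.sum((y - y.mean()) ** 2)
    press = np.sum((resid / (1.0 - np.diag(hat))) ** 2)
    return 1.0 - np.sum(resid ** 2) / sst, 1.0 - press / sst


# Synthetic data: 100 observations, 3 predictors.
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -0.5, 0.25]) + rng.normal(scale=0.8, size=100)

r2, pred_r2 = ols_r2_and_predicted_r2(X, y)
adj_r2 = adjusted_r2(r2, n_obs=100, n_predictors=3)
print(round(float(r2), 4), round(float(adj_r2), 4), round(float(pred_r2), 4))
```

Both derived values are always below the plain R2, so a small gap among the three, as reported in the text, is what one would expect from a model that is not overfitted.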
Socioeconomic factors, such as single-parent households, food stamp usage, no college degree, and disability, were positively associated with CVD mortality. Other lifestyle and health-related factors, such as diabetes, sleep duration, and post-acute care cost, were also significant. Protective socioeconomic factors such as household income, park access, and mobile-home housing offer opportunities for structural improvements in cardiovascular health outcomes. Factors like no high school diploma and alcohol use were inversely associated with CVD mortality rates, potentially reflecting confounding lifestyle variables. Interestingly, factors of no college degree and no high school diploma contributed to the regression model differently with one positive and the other negative. This divergence might be attributed to the broader trend and association between higher education and healthier lifestyles. Individuals with higher educational backgrounds are less prone to risk factors like smoking, high salt intake, air pollution exposure, and depression. Conversely, they are more likely to engage in physical activity and benefit from increased household income, highlighting the importance of mitigating educational inequality in efforts to address CVD mortality disparities [28].
Geographic distribution maps of the regression residuals further visualized the spatial heterogeneity of model performance across the U.S., clearly indicating where the regression fit between CVD mortality and the exposure variables deviated. The county-level patterns show where the model fits well or poorly, indicating that socio-environmental or demographic factors unexplored in this study may contribute to the observed systematic deviation (Figure 4A). The state-level distribution of larger positive residual clusters in certain areas (e.g., Nevada, Alabama, Mississippi, and Maryland) suggests localized underestimation of CVD mortality, whereas regions with significant negative residuals (e.g., Arizona, New Mexico, Minnesota, and West Virginia) imply potential overestimation (Figure 4B). These spatial trends reveal the importance of incorporating regional and contextual variables in future modeling frameworks.
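The sign convention behind the residual maps, and the aggregation from county to state level, can be sketched as follows. The observed and predicted values are invented, and averaging county residuals within each state is an assumed aggregation rule rather than the article's documented method.

```python
import pandas as pd

# Hypothetical observed and fitted values; numbers are illustrative only.
fit = pd.DataFrame({
    "FIPS": ["01001", "01003", "04001", "04003"],
    "observed": [430.0, 390.0, 300.0, 310.0],
    "predicted": [400.0, 380.0, 320.0, 315.0],
})

# Residual = observed - predicted; a positive value means the model underestimated.
fit["residual"] = fit["observed"] - fit["predicted"]

# The first two digits of a five-digit county FIPS code identify the state.
fit["state_fips"] = fit["FIPS"].str[:2]
state_residual = fit.groupby("state_fips")["residual"].mean()
print(state_residual)
```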

3.5. Machine Learning Prediction

The RF and SVM models were tested to evaluate the predictive power of various factors on CVD mortality. Both models demonstrated strong predictive performance in estimating county-level CVD mortality, with results closely aligned with the correlation and regression analyses. The RF model achieved higher overall predictive accuracy (MSE = 0.0984, R2 = 0.696), outperforming the SVM model (MSE = 0.1119, R2 = 0.654). The performance gap likely reflects RF’s greater ability to capture complex feature interactions and handle nonlinearities in this dataset. The comparative evaluation of RF and SVM performance underscores their mutual validity and complementarity in modeling CVD mortality across U.S. counties. RF effectively captures nonlinear, multicollinear, and interaction effects among predictors. SVM, though slightly less accurate, indicates that the relationships identified by RF are structurally consistent. The close alignment between SVM and RF outcomes strengthens confidence in the identified determinants.
Feature importance estimates from RF weighted the relative contributions of individual factors to the prediction of CVD mortality. COPD emerged as the most critical factor, contributing nearly half (45.42%) of the model’s predictive power (Figure 5). Following COPD, the most influential factors included non-adherence to blood pressure medications, diuretic non-adherence, smoking, sleep duration, and PM2.5 exposure. In addition, cardiac rehabilitation and high blood pressure were notable features in the model. This highlights the critical role of medication adherence in maintaining cardiovascular health, especially for patients on first-line treatment for hypertension and heart failure, as non-adherence can lead to fluid retention, increased cardiac workload, and higher risks of stroke [29]. The SVM yielded a comparable rank ordering of key variables; however, its nonlinear kernel-based architecture does not provide directly interpretable feature weights, which limits transparency in understanding factor contributions. Nevertheless, both models demonstrated high consistency in identifying respiratory and behavioral factors as dominant contributors, reinforcing the reliability of the analytical framework.
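Feature rankings of the kind discussed above can be obtained as sketched below. Impurity-based importances are read from the fitted Random Forest, while permutation importance is shown as one possible way to rank variables for a kernel SVM, which has no native feature weights. The use of permutation importance is an assumption; the article does not state how the SVM ranking was derived. The data and feature names are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.svm import SVR

# Synthetic data in which the first feature dominates the outcome.
rng = np.random.default_rng(7)
names = ["feat_a", "feat_b", "feat_c"]  # hypothetical feature names
X = rng.normal(size=(300, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=300)

# Impurity-based importances from the forest; the shares sum to 1.
rf = RandomForestRegressor(n_estimators=200, random_state=42).fit(X, y)
rf_share = dict(zip(names, rf.feature_importances_))

# Permutation importance: drop in score when one feature is shuffled.
svr = SVR(kernel="rbf", C=10, epsilon=0.1, gamma="scale").fit(X, y)
perm = permutation_importance(svr, X, y, n_repeats=10, random_state=42)
svr_rank = [names[i] for i in np.argsort(perm.importances_mean)[::-1]]

print({k: round(float(v), 3) for k, v in rf_share.items()})
print(svr_rank)
```

In practice, permutation importance is better computed on held-out data than on the training set used here for brevity.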

4. Discussion

This study presented a comprehensive analysis of CVD mortality and its associations with a wide spectrum of risk factors across U.S. counties. A rigorous stepwise analytical framework, integrating correlation, regression, and ML, was employed to identify and prioritize risks for targeted interventions. The multi-stage approach helps ensure that variables with subtler effects are not overshadowed by stronger factors in a multivariate context, thereby enhancing the comprehensiveness and reliability of the analytical results. The findings are broadly consistent with large national and international investigations, including the Global Burden of Disease (GBD), Institute for Health Metrics and Evaluation (IHME), and American Heart Association (AHA) studies, which collectively emphasize the roles of air pollution, smoking, hypertension, and socioeconomic deprivation as principal factors in CVD mortality [1,2]. Whereas GBD studies often rely on national averages to model global trends, this research extends current understanding by offering finer spatial resolution and incorporating biological, behavioral, socioeconomic, and environmental variables within a unified data-driven framework.

4.1. Demographic and Temporal Disparities

The prominent racial disparities, with Black and Native American populations experiencing the highest CVD mortality, indicate the cumulative impact of socioeconomic disadvantages, environmental exposures, and healthcare access barriers [20,21]. The strong association between age and CVD mortality in the 65+ group reemphasizes the importance of early prevention strategies, such as lifestyle interventions and regular screenings, to mitigate risk accumulation over time [5]. Notably, the higher mortality observed in men across all age groups challenges previous conclusions and calls for a re-examination of gender-specific risk factors and interventions in CVD management [30,31]. The decline in CVD mortality from 2006 to 2019, followed by a significant rebound in 2020, is a critical observation that likely reflects the impact of COVID-19 on healthcare systems, lifestyle behaviors, and cardiovascular complications [1]. This warrants further investigation into its long-term effects and underscores the need for resilient healthcare systems capable of maintaining CVD management during public health crises.

4.2. Biological and Behavioral Determinants

Among individual-level exposures, COPD emerged as the dominant factor in CVD mortality, reaffirming the established bidirectional relationship involving shared inflammatory and vascular pathways [16]. Hypertension and stroke were also significant, advocating for patient-centered approaches to CVD and related comorbidities, incorporating multimodal interventions that target shared pathways to yield dual benefits [12,32,33]. Behavioral factors, including smoking, insufficient sleep, and medication non-adherence, ranked among the most influential contributors across all analyses, reinforcing their importance in CVD prevention [4,5,34]. Consistent with prior evidence, physical activity and cardiac rehabilitation participation were protective, highlighting the potential for lifestyle-based interventions to reduce risk [35,36]. These findings reinforce the need for cost-effective, practical solutions to enhance adherence to lifestyle interventions, particularly through targeted patient education [4,33,37].
Certain factors, such as alcohol consumption and obesity, exhibited counterintuitive correlations with CVD mortality, despite their traditional role as risk contributors. Their associations with CVD mortality should be interpreted cautiously, as they likely reflect residual confounding rather than true causal effects. For instance, regions with moderate alcohol consumption might concurrently benefit from higher income and healthcare access, providing alternative explanations to a direct cardioprotective effect [24]. Similarly, the association between obesity and CVD mortality aligns with the concept of an “obesity paradox”, wherein more frequent cardiac imaging may allow for earlier detection, closer medical follow-up, and improved survival in obese patients with underlying CVD [38]. Such observations underscore the need for longitudinal analyses to disentangle behavioral, socioeconomic, and clinical interactions.
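
The confounding argument above can be made concrete with a first-order partial correlation, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The sketch below is illustrative only: the data are simulated, the variable names are hypothetical, and the study itself does not report partial correlations. It shows how a negative raw correlation between alcohol use and mortality can shrink toward zero once a shared driver such as income is controlled for.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Simulated counties (assumption): income raises alcohol use and lowers
# mortality, but alcohol use has no direct effect on mortality.
rng = random.Random(7)
n = 2000
income = [rng.gauss(0, 1) for _ in range(n)]
alcohol = [z + rng.gauss(0, 1) for z in income]
mortality = [-z + rng.gauss(0, 1) for z in income]

raw = pearson(alcohol, mortality)
adjusted = partial_corr(alcohol, mortality, income)
print(f"raw r = {raw:.2f}, partial r given income = {adjusted:.2f}")
```

In this simulation the raw correlation is clearly negative while the income-adjusted value is close to zero, which is the pattern expected when an apparent protective association is produced by a shared socioeconomic driver.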

4.3. Socioeconomic and Environmental Influences

Beyond individual behaviors, the study highlights the profound impact of socioeconomic and environmental inequalities on cardiovascular health. Counties with high social vulnerability and low family incomes correlated with elevated CVD mortality, confirming the central role of social determinants as fundamental contributors to health disparities [10,11,12]. Conversely, protective factors such as higher household income, park access, and community resilience point to the potential of structural improvements in reducing cardiovascular risks [35].
Environmental exposure to PM2.5 was identified as an important contributor to spatial disparities, supporting previous research linking long-term air pollution to elevated cardiovascular morbidity [7,18]. The prominence of PM2.5 and smoking status as strong factors highlights the interconnectedness of environmental and behavioral factors in cardiovascular health, emphasizing the critical need for integrated public health initiatives. Practically, these initiatives should simultaneously target respiratory and cardiovascular health through efforts such as smoking cessation programs and the promotion of green infrastructure in vulnerable communities.

4.4. Complementary Analytical Framework

The complementary application of traditional regression and dual ML models enhances the reproducibility, interpretability, and predictive strength of the findings. Stepwise regression provided transparent parameter estimates and statistical significance for identifying independent associations, whereas ML models captured nonlinear interactions and variable hierarchies. Together, these approaches move the evidence base beyond purely retrospective analysis toward proactive, geographically tailored prevention.
From a translational standpoint, the agreement between regression and ML models provides actionable confidence for public health policy. Both highlight COPD, medication non-adherence, smoking, and PM2.5 exposure as convergent targets for intervention. In practice, these results could inform data-driven prioritization of resource allocation and prevention efforts. For instance, the spatial patterns derived from regression residuals could help state and county health agencies visualize “hotspots” and prioritize counties with high-risk profiles for enhanced surveillance and subsidized medication access.
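
The residual-based "hotspot" idea can be sketched as follows. This is a minimal illustration with made-up numbers, not the study’s pipeline: the record layout, the function names, and the one-standard-deviation cutoff are assumptions. Following the convention in Figure 4, a positive residual (observed minus predicted) means the regression underestimated mortality.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical records: (state, county, observed rate, predicted rate).
records = [
    ("AL", "County A", 300.0, 250.0),
    ("AL", "County B", 280.0, 270.0),
    ("MN", "County C", 180.0, 200.0),
    ("MN", "County D", 190.0, 195.0),
    ("MS", "County E", 340.0, 260.0),
    ("MS", "County F", 310.0, 300.0),
]


def residuals(rows):
    """Residual = observed - predicted; positive means underestimation."""
    return [(state, county, obs - pred) for state, county, obs, pred in rows]


def state_mean_residuals(res):
    """Average county residuals within each state (as in a state-level map)."""
    by_state = defaultdict(list)
    for state, _, r in res:
        by_state[state].append(r)
    return {state: mean(values) for state, values in by_state.items()}


def hotspots(res, k=1.0):
    """Counties whose residual exceeds the overall mean by k standard deviations."""
    values = [r for _, _, r in res]
    cutoff = mean(values) + k * pstdev(values)
    return [county for _, county, r in res if r > cutoff]


res = residuals(records)
state_means = state_mean_residuals(res)
flagged = hotspots(res)
print(state_means)
print(flagged)
```

Counties flagged this way are those where observed mortality exceeds what the measured factors predict, so they may merit closer surveillance for unmeasured local drivers; the cutoff multiplier k is a tuning choice rather than a fixed rule.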

4.5. Public Health Implications

From a public health and policy perspective, the identified risk factors provide actionable entry points for mitigating cardiovascular disparities at regional and national scales. First, the strong influence of COPD, smoking, and PM2.5 underscores the need for integrated respiratory and cardiovascular health initiatives, such as joint screening programs, tobacco taxation, expansion of smoking cessation, pulmonary rehabilitation, and air-quality improvement in high-burden counties. Second, enhancing medication adherence and chronic disease management through mobile health technologies, pharmacist-led monitoring, and subsidized antihypertensive programs could directly reduce mortality, particularly in low-income and rural communities. Third, the spatial overlap between CVD mortality and social vulnerability suggests a role for place-based policy interventions, including investment in affordable housing, green spaces, and public transportation infrastructure to reduce environmental stressors. Finally, local health departments could use the ML-based predictions and spatial distributions from this study to create data-driven risk maps that guide prevention funding and health equity initiatives. Collectively, these strategies illustrate that mitigating CVD disparities requires not only individual-level behavior modification but also structural reforms addressing environmental, economic, and healthcare inequities.

5. Limitations

Despite offering valuable insights, this study has certain limitations that should be acknowledged. First, variability in county-level data quality may affect precision, and county-level aggregates cannot capture intra-county variations in risk factor exposure and health outcomes. Second, the ecological design, with its reliance on publicly available datasets, precludes inference at the individual level, and unmeasured confounders such as genetic predispositions or local healthcare characteristics may remain. Third, the cross-sectional design prevents causal inference, restricting interpretation to associations rather than temporal relationships. Future research could benefit from longitudinal designs that integrate more granular individual-level data to track changes in risk factor prevalence and their impact on CVD mortality rates over time.

6. Conclusions

Management of CVD burden requires a multifaceted approach encompassing public health initiatives, clinical care, and policy interventions. This study provides a robust framework to understand the major linkage pathways to CVD mortality disparities across the U.S., offering actionable insights for the development of data-driven interventions to improve population health outcomes. Key associated factors, including COPD, smoking, PM2.5, and medication non-adherence, provide opportunities for targeted interventions. The persistent geographic disparities, particularly elevated mortality in the southeastern U.S. coinciding with areas of high social vulnerability, highlight the profound influence of systemic inequalities on cardiovascular health outcomes. The protective associations of structural factors such as income and healthcare access further emphasize the need for policies that address systemic inequities to reduce CVD burden. The significant 2020 rebound in CVD mortality signals the potential long-term consequences of public health crises like COVID-19 on cardiovascular health. In practical terms, this research advocates for a multi-pronged approach to mitigate CVD mortality disparities. Further investigation into the cost-effectiveness of tailored interventions in high-risk counties would be valuable for guiding resource allocation. Such work should particularly focus on strategies that combine lifestyle modifications with pharmacological treatments while addressing environmental and socioeconomic determinants.
Effective strategies may prioritize integrated initiatives focused on (1) addressing key environmental and behavioral risk factors; (2) implementing primary prevention for patients with chronic conditions through increased physical activity, smoking reduction, and enhanced medication adherence support; (3) improving early detection and treatment via regular screenings for chronic diseases like COPD and hypertension; (4) mitigating socioeconomic and educational inequalities; and (5) developing innovative interventions such as mobile health technologies, telehealth services, and community-based care programs to enhance healthcare access and support for high-risk communities. Practicing data-driven interventions at the local and regional levels may help reduce the CVD burden throughout the nation. Future research should explore longitudinal changes and assess the effectiveness of targeted interventions over time, enabling more reliable long-term forecasting and management of cardiovascular health.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/healthcare13222937/s1, Table S1: Correlation of various factors to CVD mortality rates across U.S. counties.

Funding

This research received no external funding.

Institutional Review Board Statement

Ethics review and approval were waived for this study as all the datasets were collected from publicly accessible databases of the Centers for Disease Control and Prevention (CDC).

Informed Consent Statement

Patient consent was waived as publicly accessible data were used in this study, and no individual-level identifiers were accessed.

Data Availability Statement

The data presented in this study were derived from publicly available resources provided by the CDC, accessed on 21 November 2024. These sources include: CDC Heart Disease & Stroke Interactive Atlas (http://nccd.cdc.gov/DHDSPAtlas/Reports.aspx); CDC National Environmental Public Health Tracking Network (https://ephtracking.cdc.gov/download); and the CDC National Center for Health Statistics (NCHS) (https://data.cdc.gov/browse). No new raw data were created. The processed, merged county-level dataset and associated analysis scripts supporting this article are available from the corresponding author upon reasonable request.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Martin, S.S.; Aday, A.W.; Almarzooq, Z.I.; Anderson, C.A.; Arora, P.; Avery, C.L.; Baker-Smith, C.M.; Gibbs, B.B.; Beaton, A.Z.; Boehme, A.K.; et al. 2024 heart disease and stroke statistics: A report of US and global data from the American Heart Association. Circulation 2024, 149, e347–e913. Available online: https://www.ahajournals.org/doi/suppl/10.1161/CIR.0000000000001209 (accessed on 25 June 2025).
  2. Joynt Maddox, K.E.; Elkind, M.S.; Aparicio, H.J.; Commodore-Mensah, Y.; de Ferranti, S.D.; Dowd, W.N.; Hernandez, A.F.; Khavjou, O.; Michos, E.D.; Palaniappan, L.; et al. Forecasting the burden of cardiovascular disease and stroke in the United States through 2050-Prevalence of risk factors and disease: A presidential advisory from the American Heart Association. Circulation 2024, 150, e65–e88.
  3. Welsh, A.; Hammad, M.; Piña, I.L.; Kulinski, J. Obesity and cardiovascular health. Eur. J. Prev. Cardiol. 2024, 31, 1026–1035.
  4. Addo, P.N.; Mundagowa, P.T.; Zhao, L.; Kanyangarara, M.; Brown, M.J.; Liu, J. Associations between sleep duration, sleep disturbance and cardiovascular disease biomarkers among adults in the United States. BMC Public Health 2024, 24, 947.
  5. Magnussen, C.; Ojeda, F.M.; Leong, D.P.; Alegre-Diaz, J.; Amouyel, P.; Aviles-Santa, L.; De Bacquer, D.; Ballantyne, C.M.; Bernabé-Ortiz, A.; Bobak, M.; et al. Global effect of modifiable risk factors on cardiovascular disease and mortality. N. Engl. J. Med. 2023, 389, 1273–1285.
  6. Krittanawong, C.; Qadeer, Y.K.; Hayes, R.B.; Wang, Z.; Virani, S.; Thurston, G.D.; Lavie, C.J. PM2.5 and cardiovascular health risks. Curr. Probl. Cardiol. 2023, 48, 101670.
  7. Lamas, G.A.; Bhatnagar, A.; Jones, M.R.; Mann, K.K.; Nasir, K.; Tellez-Plaza, M.; Ujueta, F.; Navas-Acien, A.; American Heart Association Council on Epidemiology and Prevention; Council on Cardiovascular and Stroke Nursing; et al. Contaminant metals as cardiovascular risk factors: A scientific statement from the American Heart Association. J. Am. Heart Assoc. 2023, 12, e029852.
  8. Münzel, T.; Molitor, M.; Kuntic, M.; Hahad, O.; Röösli, M.; Engelmann, N.; Basner, M.; Daiber, A.; Sørensen, M. Transportation noise pollution and cardiovascular health. Circ. Res. 2024, 134, 1113–1135.
  9. Kang, E.; Cho, D.; Lee, S.; Im, J.; Lee, D.; Yoo, C. An explainable AI framework for spatiotemporal risk factor analysis in public health: A case study of cardiovascular mortality in South Korea. GIScience Remote Sens. 2024, 61, 2436997.
  10. Bevan, G.; Pandey, A.; Griggs, S.; Dalton, J.E.; Zidar, D.; Patel, S.; Khang, S.U.; Nasir, K.; Rajagopalan, S.; Al-Kindi, S. Neighborhood-level social vulnerability and prevalence of cardiovascular risk factors and coronary heart disease. Curr. Probl. Cardiol. 2023, 48, 101182.
  11. Minhas, A.M.K.; Jain, V.; Li, M.; Ariss, R.W.; Fudim, M.; Michos, E.D.; Virani, S.S.; Sperling, L.; Mehta, A. Family income and cardiovascular disease risk in American adults. Sci. Rep. 2023, 13, 279.
  12. Wilcox, N.S.; Amit, U.; Reibel, J.B.; Berlin, E.; Howell, K.; Ky, B. Cardiovascular disease and cancer: Shared risk factors and mechanisms. Nat. Rev. Cardiol. 2024, 21, 617–631. Available online: https://www.nature.com/articles/s41569-024-01017-x (accessed on 13 July 2025).
  13. Kundrick, J.; Rollins, H.; Mullachery, P.; Sharaf, A.; Schnake-Mahl, A.; Roux, A.V.D.; Bilal, U. Heterogeneity in disparities by income in cardiovascular risk factors across 209 US metropolitan areas. Prev. Med. Rep. 2024, 47, 102908.
  14. Baptista, E.A.; Queiroz, B.L. Spatial analysis of cardiovascular mortality and associated factors around the world. BMC Public Health 2022, 22, 1556.
  15. Cotton, A.; Salerno, P.; Deo, S.; Virani, S.; Nasir, K.; Neeland, I.; Rajagopalan, S.; Sattar, N.; Al-Kindi, S.; Elgudin, Y.E. The association between county-level premature cardiovascular mortality related to cardio-kidney-metabolic disease and the social determinants of health in the US. Sci. Rep. 2024, 14, 24984.
  16. Fabbri, L.M.; Celli, B.R.; Agustí, A.; Criner, G.J.; Dransfield, M.T.; Divo, M.; Krishnan, J.K.; Lahousse, L.; de Oca, M.M.; Salvi, S.S.; et al. COPD and multimorbidity: Recognising and addressing a syndemic occurrence. Lancet Respir. Med. 2023, 11, 1020–1034.
  17. Aggarwal, R.; Yeh, R.W.; Maddox, K.E.J.; Wadhera, R.K. Cardiovascular risk factor prevalence, treatment, and control in US adults aged 20 to 44 years, 2009 to March 2020. JAMA 2023, 329, 899–909.
  18. Adepu, S.; Berman, A.E.; Thompson, M.A. Socioeconomic determinants of health and county-level variation in cardiovascular disease mortality: An exploratory analysis of Georgia during 2014–2016. Prev. Med. Rep. 2020, 19, 101160.
  19. Sun, F.N.; Yao, J.; Du, S.C.; Qian, F.; Appleton, A.A.; Tao, C.; Xu, H.; Liu, L.; Dai, Q.; Joyce, B.T.; et al. Social determinants, cardiovascular disease, and health care cost: A nationwide study in the United States using machine learning. J. Am. Heart Assoc. 2023, 12, e027919.
  20. Borkowski, P.; Borkowska, N.; Mangeshkar, S.; Adal, B.H.; Singh, N. Racial and socioeconomic determinants of cardiovascular health: A comprehensive review. Cureus J. Med. Sci. 2024, 16, e59497.
  21. Zuma, B.Z.; Parizo, J.T.; Valencia, A.; Spencer-Bonilla, G.; Blum, M.R.; Scheinker, D.; Rodriguez, F. County-level factors associated with cardiovascular mortality by race/ethnicity. J. Am. Heart Assoc. 2021, 10, e018835.
  22. Akoglu, H. User’s guide to correlation coefficients. Turk. J. Emerg. Med. 2018, 18, 91–93.
  23. Wang, S.S.; Li, Z.X.; Wang, X.Y.; Guo, S.; Sun, Y.J.; Li, G.H.; Zhao, C.H.; Yuan, W.H.; Li, M.; Li, X.L.; et al. Associations between sleep duration and cardiovascular diseases: A meta-review and meta-analysis of observational and Mendelian randomization studies. Front. Cardiovasc. Med. 2022, 9, 930000.
  24. Toma, A.; Paré, G.; Leong, D.P. Alcohol and cardiovascular disease: How much is too much? Curr. Atheroscler. Rep. 2017, 19, 13.
  25. Neubeck, L.; Lowres, N.; Benjamin, E.J.; Freedman, S.B.; Coorey, G.; Redfern, J. The mobile revolution-using smartphone apps to prevent cardiovascular disease. Nat. Rev. Cardiol. 2015, 12, 350–360.
  26. Xu, L.Q.; Mondal, D.; Polya, D.A. Positive association of cardiovascular disease (CVD) with chronic exposure to drinking water arsenic (As) at concentrations below the WHO provisional guideline value: A systematic review and meta-analysis. Int. J. Environ. Res. Public Health 2020, 17, 2536.
  27. Al-Kindi, S.G.; Brook, R.D.; Biswal, S.; Rajagopalan, S. Environmental determinants of cardiovascular disease: Lessons learned from air pollution. Nat. Rev. Cardiol. 2020, 17, 656–672.
  28. Hu, M.J.; Yang, T.; Yang, Y.J. Causal associations of education level with cardiovascular diseases, cardiovascular biomarkers, and socioeconomic factors. Am. J. Cardiol. 2024, 213, 76–85.
  29. Roush, G.C.; Kaur, R.; Ernst, M.E. Diuretics: A review and update. J. Cardiovasc. Pharmacol. Ther. 2014, 19, 5–13.
  30. Rodgers, J.; Briesacher, B.A.; Wallace, R.B.; Kawachi, I.; Baum, C.F.; Kim, D. County-level housing affordability in relation to risk factors for cardiovascular disease among middle-aged adults: The National Longitudinal Survey of Youths 1979. Health Place 2019, 59, 102194.
  31. DuPont, J.J.; Kenney, R.M.; Patel, A.R.; Jaffe, I.Z. Sex differences in mechanisms of arterial stiffness. Br. J. Pharmacol. 2019, 176, 4208–4225.
  32. Barbera, M.; Lehtisalo, J.; Perera, D.; Aspö, M.; Cross, M.; De Jager Loots, C.A.; Falaschetti, E.; Friel, N.; Luchsinger, J.A.; Gavelin, H.M.; et al. A multimodal precision-prevention approach combining lifestyle intervention with metformin repurposing to prevent cognitive impairment and disability: The MET-FINGER randomised controlled trial protocol. Alzheimer’s Res. Ther. 2024, 16, 23.
  33. Nelson, A.J.; Pagidipati, N.J.; Bosworth, H.B. Improving medication adherence in cardiovascular disease. Nat. Rev. Cardiol. 2024, 21, 396–416.
  34. Perry, A.S.; Dooley, E.E.; Master, H.; Spartano, N.L.; Brittain, E.L.; Gabriel, K.P. Physical activity over the lifecourse and cardiovascular disease. Circ. Res. 2023, 132, 1725–1740.
  35. Baran, C.; Belgacem, S.; Paillet, M.; de Abreu, R.M.; de Araujo, F.X.; Meroni, R.; Corbellini, C. Active commuting as a factor of cardiovascular disease prevention: A systematic review with meta-analysis. J. Funct. Morphol. Kinesiol. 2024, 9, 125.
  36. Taylor, R.S.; Dalal, H.M.; McDonagh, S.T. The role of cardiac rehabilitation in improving cardiovascular outcomes. Nat. Rev. Cardiol. 2022, 19, 180–194.
  37. Bakhit, M.; Fien, S.; Abukmail, E.; Jones, M.; Clark, J.; Scott, A.M.; Glasziou, P.; Cardona, M. Cardiovascular disease risk communication and prevention: A meta-analysis. Eur. Heart J. 2024, 45, 998–1013.
  38. Powell-Wiley, T.M.; Poirier, P.; Burke, L.E.; Després, J.P.; Gordon-Larsen, P.; Lavie, C.J.; Lear, S.A.; Ndumele, C.E.; Neeland, I.J.; Sanders, P.; et al. Obesity and cardiovascular disease: A scientific statement from the American Heart Association. Circulation 2021, 143, E984–E1010.
Figure 1. CVD mortality trends across various demographic groups. (A) The fluctuations in CVD mortality rates over a 12-year period from 2006 to 2018. (B) Comparison of CVD mortality rates between men and women. (C) Comparison of CVD mortality rates across different age groups. (D) Comparison of CVD mortality rates among various racial and ethnic groups. The letters (a, b, c, etc.) on the graphs indicate statistical significance between groups.
Figure 2. Scatter plot displaying the correlation between COPD and CVD mortality across the U.S. counties.
Figure 3. Spatial patterns of CVD mortality and risk factors across the U.S. counties. (A) Spatial distribution of CVD mortality rates (per 100,000) with higher depicted in red and lower in blue. (B) Social vulnerability index (SVI) scores ranging from low (white) to high (dark purple). (C) Average annual air quality PM2.5 concentrations represented by shades of green, highlighting regions with different levels of air pollution. (D) Sunlight UV index distribution with darker orange indicating higher exposure levels.
Figure 4. Geographic distribution of regression residuals across the United States. (A) The spatial patterns present the county-scale residual distribution using individual data points located at each county’s centroid. Marker size is proportional to the absolute residual value. Marker shape differentiates residual direction, with ‘X’ corresponding to positive residuals or underestimation and ‘O’ to negative residuals or overestimation. (B) The spatial patterns illustrate the state-level mean residuals from the multiple regression model, averaged across all counties within each U.S. state. The diverging color scale indicates underestimation (positive residuals in red tones) or overestimation (negative residuals in blue tones) of the observed CVD mortality by the regression model, and the color intensity reflects the magnitude of bias.
Figure 5. Dominance of risk factors in predicting CVD mortality with random forest models across the U.S. counties.
Table 1. Correlation of ranked risk factors to CVD mortality across U.S. counties.

| Factor | R | p | N | Factor | R | p | N |
|---|---|---|---|---|---|---|---|
| COPD Prevalence | 0.700 | <0.0001 | 3054 | Socioeconomic Vulnerability | 0.534 | <0.0001 | 3121 |
| Current Smoker Status | 0.650 | <0.0001 | 3070 | Population with Disability | 0.526 | <0.0001 | 3122 |
| High Blood Pressure | 0.644 | <0.0001 | 3070 | Adults No College Degree | 0.513 | <0.0001 | 3200 |
| Less Sleeping < 7 h | 0.644 | <0.0001 | 3121 | Alcohol Use | −0.509 | <0.0001 | 3054 |
| Population Living in Poverty | 0.591 | <0.0001 | 3128 | Social Resilience | −0.533 | <0.0001 | 3123 |
| Food Stamp Percentage | 0.547 | <0.0001 | 3136 | Median Household Income | −0.590 | <0.0001 | 3128 |
| Stroke Prevalence | 0.540 | <0.0001 | 3070 | | | | |
| Overall Vulnerability Rank | 0.481 | <0.0001 | 3121 | Renin Angiotensin Antagonist NA | 0.393 | <0.0001 | 3147 |
| Social Vulnerability | 0.478 | <0.0001 | 3113 | High Cholesterol Prevalence | 0.388 | <0.0001 | 3070 |
| Single-parent Households | 0.478 | <0.0001 | 3122 | Air Quality PM2.5 | 0.383 | <0.0001 | 3118 |
| Asthma Prevalence | 0.470 | <0.0001 | 3054 | Leisure-time Physical Inactivity | 0.379 | <0.0001 | 3070 |
| Coronary Heart Disease | 0.439 | <0.0001 | 3070 | Incremental Post-Acute Care Cost | 0.374 | <0.0001 | 3199 |
| Diuretic Non-Adherence | 0.428 | <0.0001 | 3108 | Population without HS Diploma | 0.345 | <0.0001 | 3200 |
| Diagnosed Diabetes | 0.417 | <0.0001 | 3070 | Family without Internet | 0.339 | <0.0001 | 3205 |
| Mobile Housing Units | 0.414 | <0.0001 | 3122 | Blood Pressure Medication Use | 0.307 | <0.0001 | 3070 |
| Post-Acute Care Cost | 0.411 | <0.0001 | 3199 | BRIC Resilience | −0.331 | <0.0001 | 3123 |
| Cardiac Rehabilitation Eligibility | 0.401 | <0.0001 | 3044 | Park Access Percent | −0.336 | <0.0001 | 3137 |
| Blood Pressure Medication NA | 0.397 | <0.0001 | 3161 | Housing-Infrastructural Resilience | −0.413 | <0.0001 | 3123 |
| DEHP in Water | 0.396 | <0.0001 | 123 | Cardiac Rehabilitation Participation | −0.413 | <0.0001 | 2399 |
| Household Composition Disability | 0.395 | <0.0001 | 3121 | Median Home Value | −0.453 | <0.0001 | 3197 |
Table 2. Stepwise multiple regression of CVD mortality and highly correlated factors.

| Coefficient Term | Coef | SE Coef | T-Value | p-Value | VIF |
|---|---|---|---|---|---|
| Constant | −0.018 | 0.234 | −0.08 | 0.940 | |
| Single-parent Households | 0.593 | 0.219 | 2.71 | 0.007 | 3.03 |
| Disability | 0.434 | 0.210 | 2.07 | 0.039 | 2.73 |
| Mobile-home Housing | −0.512 | 0.098 | −5.25 | 0.000 | 2.29 |
| Alcohol Use | −0.826 | 0.359 | −2.30 | 0.021 | 2.53 |
| Blood Pressure Medication Use | 1.230 | 0.256 | 4.79 | 0.000 | 1.93 |
| Cardiac Rehabilitation Eligibility | 1.070 | 0.129 | 8.31 | 0.000 | 1.31 |
| Diabetes | 1.010 | 0.496 | 2.03 | 0.042 | 1.65 |
| Food Stamp | 0.364 | 0.164 | 2.22 | 0.026 | 3.59 |
| Median Household Income | −0.458 | 0.071 | −6.47 | 0.000 | 3.77 |
| No College Degree | 0.301 | 0.115 | 2.61 | 0.009 | 3.78 |
| No High School Diploma | −1.060 | 0.177 | −5.97 | 0.000 | 2.70 |
| Air Quality PM2.5 | 4.720 | 0.423 | 11.15 | 0.000 | 1.41 |
| Park Access | −0.072 | 0.029 | −2.50 | 0.012 | 1.65 |
| Less Sleeping < 7 h | 0.931 | 0.287 | 3.25 | 0.001 | 3.09 |
| Post-Acute Care Cost | 0.019 | 0.005 | 3.66 | 0.000 | 1.43 |
| Blood Pressure Medication Non-adherence | 0.089 | 0.006 | 13.79 | 0.000 | 3.28 |
| Smoking Status | 4.480 | 0.308 | 14.55 | 0.000 | 3.92 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

An, D.H. CVD Mortality Disparities with Risk Factor Associations Across U.S. Counties. Healthcare 2025, 13, 2937. https://doi.org/10.3390/healthcare13222937
