
Predictors of Death Rate during the COVID-19 Pandemic

Department of Geography and Political Sciences, Front Range Community College, Longmont, CO 80501, USA
Memorial Hermann Hospital, Houston, TX 77024, USA
School of Health Administration, Texas State University, San Marcos, TX 78666, USA
Acushnet Corporation, Boston, MA 02743, USA
Author to whom correspondence should be addressed.
Healthcare 2020, 8(3), 339
Submission received: 17 August 2020 / Revised: 7 September 2020 / Accepted: 9 September 2020 / Published: 14 September 2020
(This article belongs to the Special Issue Health Disparities and Stigma in the Era of COVID-19)


Coronavirus disease (COVID-19) is a potentially fatal viral infection. This study investigates geography, demography, socioeconomics, health conditions, hospital characteristics, and politics as potential explanatory variables for death rates at the state and county levels. Data from the Centers for Disease Control and Prevention, the Census Bureau, the Centers for Medicare and Medicaid Services, and Definitive Healthcare, among other sources, were used to evaluate regression models. Yearly pneumonia and flu death rates (state level, 2014–2018) were evaluated as a function of the governors’ political party using a repeated measures analysis. At the state and county level, spatial regression models were evaluated. At the county level, we discovered a statistically significant model that included geography, population density, racial and ethnic status, three health status variables, and a political factor. A state level analysis identified health status, minority status, and the interaction between governors’ parties and health status as important variables. The political factor, however, did not appear in a subsequent analysis of 2014–2018 pneumonia and flu death rates. The pathogenesis of COVID-19 has a greater and disproportionate effect within racial and ethnic minority groups, and the political influence on the reporting of COVID-19 mortality was statistically relevant at the county level and as an interaction term only at the state level.

1. Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the etiologic agent of the coronavirus (COVID-19) pandemic. As of 31 August 2020, the associated death toll in the United States is reported to have surpassed 180,000 [1], the highest of any country in raw numbers but comparable to that of many other developed countries when adjusted for population [2]. The proper recognition and remediation of the disease are pressing concerns and each will likely be subject to debate in the months prior to the 2020 presidential election [3,4]. However, there is some concern surrounding the veracity of the data and factors contributing to COVID-19 deaths. Media outlets provide daily updates on the number of cases and deaths but draw this information from data collection and reporting agencies that have adjusted their methods over time [5]. The resulting inconsistencies have led to charges of underreporting [6,7] and overreporting [8,9], and have contributed to the politicization of the pandemic.
COVID-19 data inconsistencies and potential political bias in data reporting can have significant implications. If the data that politicians rely on are faulty, subsequent policies may harm public health, the economy, and other aspects of society. Testing differences, false positives, false negatives, and other factors that likely differ from state to state and county to county make the official reports of deaths with COVID-19 somewhat suspect; however, this study leverages the official data used by the Centers for Disease Control and Prevention (CDC). Several recent studies examine COVID-19 at the county level. Badr et al. (2020) evaluated mobility patterns and COVID-19 transmission [10]. This study provided county level spread data but did not focus on deaths. Scannell et al. (2020) demonstrated racial disparities at the county level for COVID-19 cases and deaths [11]. Cases, unfortunately, suffer from severe measurement problems, as will be discussed. Ives and Bozzuto (2020) analyzed county level estimates of R0, the basic reproduction number for COVID-19 [12]. Altiera et al. (2020) estimated county level deaths used in estimating required medical supplies [13]. Two articles consider political factors: Flanders et al. (2020) assessed voter turnout as related to COVID-19 [14], and Makridis and Rothwell (2020) evaluated the effects of political polarization but not in terms of death rates [15]. We found no other paper that addresses death rate disparities by including a political variable. Thus, given the novel nature of the virus and its progression and the known inconsistencies in the reported data, we sought to gain a deeper understanding of the factors that contribute to reported deaths from COVID-19.

1.1. Research Questions

We investigated three research questions. First, what attributes of geography, demography, population density, economy, population health, hospital characteristics, and politics might explain the deaths per 100,000 (death rate) at the county level as of 31 August 2020? Second, did COVID-19 death rates at the state level differ based upon governor party affiliation after accounting for other relevant variables? Third, as a control for the second question, did flu/pneumonia death rates in previous years (2014–2018) vary with the governor’s party affiliation?

1.2. Significance and Motivation

To our knowledge, this research is the first to evaluate COVID-19 using combined data from multiple areas covering demographic, socioeconomic, health system, population health, and political factors using a spatial regression. It is also the first study to evaluate the effects of state and county political affiliation on COVID-19 death rates. The motivation behind this study is to address the media promulgation of explanatory factors that may or may not be scientifically verifiable (e.g., population density and political factors), particularly when placed in the context of other known factors established at the individual unit of analysis (e.g., race).

2. Methods

2.1. Sample Sizes and Data Sets

Sample sizes for the three research questions were 3116 (counties), 51 (states plus Washington, D.C.), and 250 (50 states by 5 years). The dependent variable was the death rate per 100,000 population. Cumulative COVID-19 deaths as of 31 August 2020 were obtained from [1]. Flu data for 2014–2018 were from the Centers for Disease Control and Prevention (CDC) [16]. Definitive Healthcare data provided descriptive hospital-related information [17]. Population and demographic data were from the Census Bureau [18]. The Centers for Medicare and Medicaid Services (CMS) provided relevant patient morbidity proportions by state and county [19]. Geographic variables in the analysis included the shapefiles from the Census Bureau’s state and county TIGER files [20].
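As a small illustration of the dependent variable, the sketch below computes cumulative deaths per 100,000 residents. The function name and the county figures are hypothetical and are not taken from the study's data.

```python
def death_rate_per_100k(deaths: int, population: int) -> float:
    """Cumulative deaths per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population * 100_000


# Hypothetical county: 17 cumulative deaths among 50,000 residents.
rate = death_rate_per_100k(17, 50_000)
print(round(rate, 2))
```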

2.2. Variables

The race and ethnicity variables included the proportion of African Americans, Native Americans, Asians, and Hispanics. The proportion of Caucasians was omitted due to collinearity considerations. Population density (population per square kilometer) and the proportion of people aged 65 and older served as additional control variables, although we anticipated (correctly) that the former might not enter the model, particularly when geospatial effects were considered. Economic variables included the median household income and unemployment. Population health status variables included the population proportions with chronic obstructive pulmonary disease (COPD), heart failure, diabetes, obesity, and cancer, all of which have been identified by the CDC as risk factors at the individual level [21], as well as other health-related variables including smoking, alcohol abuse, Alzheimer’s Disease, asthma, atrial fibrillation, depression, drug abuse, HIV, hepatitis B, and stroke. Health system capability variables included the number of acute beds in the county or state and the average case-mix index in the county or state. The case-mix index, or CMI, adjusts inpatients based on severity, with 1.0 being the “typical” visit and higher average numbers meaning more acute visits than would be expected.

2.3. Reasons for Variable Inclusion and Expected Effects

Geography was included as a known predictor of COVID-19 [22]. Similarly, demographics [23,24], population density [25], the proportion of people aged 65 and older [21], economic considerations [26], population health status (comorbidities) [27], and political considerations [28] have been proposed as factors that affect infection and death rates, although the reasons for the associations between individual variables and death rates are not fully understood [24]. We include hospital system characteristics to account for the possibility that a lack of resources increases death rates [29].
Based on these research studies, we surmise that higher population densities might initially be associated with higher death rates, because increases in population density may place individuals at an increased risk of exposure, but that these effects will disappear once spatial models are included. A better economic status (e.g., lower poverty rates) should result in better access to healthcare systems and thus lower death rates. Poverty, for example, results in reduced compliance with COVID-19 protocols [30]. Higher rates of comorbidities (i.e., poorer health status) are likely to be associated with higher death rates [31]. An improved hospital capability and lower patient severity might reduce death rates [29]. Finally, there is much speculation that political considerations are influencing both death rates and the reporting of death rates, where Democratically affiliated geographies are anticipated to have higher death rates [32].

2.4. Transformations

Quantitative variables were standardized. At the state level of analysis, the small number of observations (51) necessitated data reduction. We used the first three principal components of all health status variables to proxy the effects of population health. These three components accounted for 75% of the variability of the original 19 variables.
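The standardization step can be sketched as a z-score transform. This is a minimal illustration that assumes the sample standard deviation (n − 1 denominator), which is the default of R's scale(); the text does not state which convention was used, and the input values are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1 denominator); an assumption
    if s == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - m) / s for v in values]


# Hypothetical values of one predictor across four counties.
z = standardize([2.0, 4.0, 6.0, 8.0])
print([round(v, 3) for v in z])
```

After this transform every predictor has mean 0 and standard deviation 1, so coefficients can be compared as standardized effect sizes.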

2.5. Models

We evaluated least absolute shrinkage and selection operator (lasso) models [33] to generate a subset of variables associated with deaths per 100,000 using adaptive p-values as presented by Lockhart et al. [34] and implemented in the covTest package [35] in R [36]. The adaptive p-values address Lindley’s paradox, which often requires that the significance level change as sample size increases [37]. We also used 10-fold cross-validation to evaluate R² and the root mean squared error (RMSE) along with associated standard deviations (SDs). Table A1 in Appendix A lists the independent variables evaluated.
After fitting the ordinary least squares (OLS) model and constrained models, we repeated the same process to fit geospatial models. Specifically, we used a residual analysis to fit appropriate geospatial models with all of the variables and the subset suggested by lasso. Moran’s I and Lagrange multiplier diagnostics were used to recommend the appropriate geospatial model to be fitted (none, spatial lag, or spatial error).
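To show how the lasso shrinks some coefficients exactly to zero, here is a minimal coordinate-descent sketch in Python. It is not the covTest procedure and does not compute adaptive p-values; it assumes centered data with no intercept, and the function names, penalty value, and toy data are all hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # covariance of feature j with the partial residual
            norm = 0.0  # squared length of feature j
            for i in range(n):
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
                norm += X[i][j] ** 2
            if norm == 0:
                continue  # skip constant-zero columns
            beta[j] = soft_threshold(rho / n, lam) / (norm / n)
    return beta


# Toy data: y depends only on the first of two orthogonal predictors.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2, 2, -2, -2]
print(lasso_cd(X, y, lam=0.5))
```

In this toy case the first coefficient is shrunk from its least squares value of 2 and the irrelevant second coefficient stays at zero, which is the variable-selection behavior the lasso is used for here.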
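The Moran's I diagnostic applied to the residuals can be sketched as follows. This illustration assumes a binary contiguity weight matrix (1 for neighbors, 0 otherwise); the weighting scheme actually used in the study is not stated in this section, and the four-region example is hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and spatial weights w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    ss = sum(d * d for d in dev)
    return (n / s0) * (cross / ss)


# Four hypothetical regions in a row; each borders its immediate neighbors.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 2, 3, 4], W)      # similar values sit together
alternating = morans_i([1, -1, 1, -1], W)  # neighbors are dissimilar
print(round(clustered, 3), round(alternating, 3))
```

Positive values indicate that neighboring regions have similar residuals (spatial clustering), and values near or below zero indicate no clustering or dispersion. A significance test, such as the Monte Carlo simulation mentioned in the results, is a separate step not shown here.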
We also investigated reporting differences that might exist for flu and pneumonia deaths at the state level. Using a repeated measures analysis, we modeled the logarithm of flu and pneumonia deaths as a function of year and governor party. All analyses were performed in R Statistical Software [36].

3. Results

All code is available for replication. County level R code (updated through 31 August 2020) is available online [38]. State level code (also updated through 31 August) and influenza analyses are available online as well [39].

3.1. Descriptive Statistics

Table 1 summarizes the descriptive statistics at the county level of analysis. At the county level (as of 31 August 2020), the mean COVID-19 death rate was 33.84 per 100,000. The mean county population was 9% African American, 2% Native American, 9% Hispanic, 1% Asian, and 20% aged 65 and over. Population density, median household income, and unemployment averaged 106.45 people per square kilometer, USD 53,000, and 4%, respectively. The largest comorbidity proportion average was adult obesity (32.85%), and the mean number of acute beds was 215 with a median of 35. The average CMI was 1.06 with a median of 1.17. Sixteen percent of counties voted for the Democratic candidate in 2016.
Figure 1 is a notched boxplot of the death rates of Democratic counties versus Republican counties. The notches indicate statistical significance (median test) at the α = 0.05 level. There appears to be a statistically significant difference between the two groups’ death rates per 100,000 people.
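The notch comparison can be sketched numerically. The example below assumes the convention documented for R's boxplot, in which notches extend to the median ± 1.58 × IQR / √n, and it uses Python's inclusive quartile method, which can differ slightly from R's hinges. The two groups of values are hypothetical.

```python
from math import sqrt
from statistics import median, quantiles


def notch_interval(data, k=1.58):
    """Approximate notch limits: median +/- k * IQR / sqrt(n)."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    half_width = k * (q3 - q1) / sqrt(len(data))
    m = median(data)
    return m - half_width, m + half_width


def notches_overlap(a, b):
    """True when the two notch intervals share any value."""
    lo_a, hi_a = notch_interval(a)
    lo_b, hi_b = notch_interval(b)
    return lo_a <= hi_b and lo_b <= hi_a


# Hypothetical death rates for two groups of nine counties each.
group_a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
group_b = [x + 20 for x in group_a]
print(notch_interval(group_a), notches_overlap(group_a, group_b))
```

Non-overlapping notches are conventionally read as evidence that the two medians differ at roughly the 5% level, which is how the figure is interpreted in the text.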
Figure 2 provides a scatterplot of the proportion voting Democrat in a state versus the deaths per 100,000 with symbols showing which states voted for Clinton versus Trump. Seven states have at least 75 deaths per 100,000. Of those states, six voted for Clinton. The red and blue dots indicate the current party of the state governor.
Table 2 presents a county level summary of the association between 2016 presidential election results, population density, and deaths from COVID-19. The population density is higher for counties that voted Democratic (116.2 versus 23.5), as are the death rates (71.0 versus 36.8).
At the state level (Table 3), descriptive statistics are provided for variables considered for the final model. The deaths per 100,000 for COVID-19 were 45.74 versus flu deaths of 15.10 per 100,000. The proportions of African Americans, Native Americans, Hispanics, and people 65 years of age (and older) were 11.27%, 1.62%, 12.01%, and 16.39%, respectively. Unemployment in 2019 averaged 3.62%, and about 49% of the states had Democratic governors.

3.2. COVID-19 Death Analysis, County

The four models estimated for the county analysis are depicted in Table 4. Column 1 shows the estimates for the full OLS model. The lasso model is shown in column 2. The geospatial models (full and reduced based on residual analysis) are shown in columns 3 and 4.

3.2.1. Ordinary Least Squares (OLS) Full Model

The full OLS model (“OLS Full”) is depicted in the first column of Table 4. The highest variance inflation factor (VIF) was 3.706 (poverty). The model accounted for 37.9% of the variability (R²). No statistically significant effect for the county’s winning party was apparent in the first model evaluation (p = 0.242). Figure 3 shows the map of the residuals for the full OLS model, indicating that some spatial autocorrelation exists in the northeast and the southwest areas of the country. Moran’s I analysis suggested a geospatial correlation as well (I = 0.253, p < 0.001).

3.2.2. Lasso Model

The best-tuned lasso model RMSE was 0.800 with a standard deviation (SD) of 0.045. The predicted R² was 0.352 with a standard deviation of 0.028. The lasso model (“Lasso”, Table 4) using adaptive p-values identified likely predictors such as race, ethnicity, and three health status variables (Alzheimer’s Disease, COPD, and diabetes). The model produced a similar R² to the unconstrained model (R² = 0.374). This constrained regression model also suggested that the political factor (winning party) should be considered as a potential explanatory variable (p = 0.089). Residual patterns were similar to Figure 3, and Moran’s I was statistically significant, indicative of a spatial correlation (I = 0.265, p < 0.001). The Lagrange multiplier diagnostics again recommended a lag model.

3.2.3. Generalized Spatial Two-Stage Least Squares Model, All Variables

A generalized spatial two-stage least squares model (GS2SLS) [40] was used on the full set of independent variables. This model (“GIS Full”, Table 4) identified that geospatial location was important for explaining the death rate (ρ = 0.634). Variables in the model again included the political factor (winning party). The residuals from the geospatial model no longer exhibited an autocorrelation (Moran’s I = −0.098, p = 0.980).

3.2.4. Generalized Spatial Two-Stage Least Squares Model, Lasso Variables

A final reduced model included the variables identified by the lasso as part of a geospatial lag model. This final model (Table 4, “GIS Reduced”) also included the political factor, and again, the residuals were stable based on a Monte Carlo simulation of Moran’s I (I = −0.070, p = 0.980). For interpretability, the unscaled geospatial model is shown in Table 5.
In Table 5, the reduced geospatial analysis with unscaled variables suggests that geospatial effects, population density, ethnicity and race, unemployment, three health status variables, and the winning party are important in explaining the death rates per 100,000. The Native American, Hispanic, and African American proportions are associated with increases of 42.728, 23.226, and 52.703 deaths per 100,000 individuals, respectively. A Democratic win in the county in the 2016 presidential election (a dichotomously coded variable) is associated with an increase of 4.503 deaths per 100,000 individuals. Moran’s I was not significant (I = −0.070, p = 0.9804).
An important result is that while we evaluated population density, its standardized effect size was almost zero (0.003) when other factors were considered. This county level analysis is congruent with Pew Research findings that death rates are higher in Democratic-led counties [32]. This study suggests that the racial/ethnic composition and geographic relationships with the outbreak are important considerations along with political considerations. Further, we note that the results of the spatial analysis are similar to those of the nonspatial analysis. The implication may be that our county level models are robust.

3.3. COVID-19 Death Analysis, State

Given the results of the political analysis at the county level, we further evaluated political leadership at the state level, examining a subset of variables found from the county level analysis. Since only 51 observations were available, the analysis was restricted to the minority proportion in the state (1 − proportion Caucasian only), the first three principal components of health status variables (accounting for 75% of the variability), population density, unemployment, the governor’s party, and plurality [20]. Plurality was dichotomously coded with 0 = plurality (the 2016 voting consensus matching the governor’s party) and 1 = no plurality (voting bloc different from the governor’s party). We also surmised that there might exist an interaction effect between the governors’ party and health status and modeled the interaction terms accordingly. Death rates were mapped, and states in the Northeast (New Jersey, New York, Massachusetts, and Connecticut) had higher death rates than other areas of the country. These states were omitted in a secondary analysis to ensure that the results found were not due strictly to outliers.
An OLS model using the aforementioned variables captured 66% of the variability with the highest VIF of 3.24. Statistically significant variables included the minority population, all three health status principal components, and the interaction term between the governor’s party and the first principal component (the linear combination representing the primary comorbidities of the population). Moran’s I did not suggest that a spatial model was required at the state level (I = 0.060, p = 0.162). A map of the residuals is shown in Figure 4. When removing the outliers of New Jersey, New York, Massachusetts, and Connecticut, minority status was the remaining statistically significant variable. Health status and the governor’s party interaction with health status fell out of the model (Table 6).
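The coding of two of the state level variables can be sketched as below. The function names, the party labels "D" and "R", and the example values are hypothetical; only the coding rules (0 = plurality, 1 = no plurality, and minority proportion as one minus the Caucasian-only proportion) come from the text.

```python
def plurality_code(governor_party: str, state_2016_winner: str) -> int:
    """0 when the 2016 vote matches the governor's party, 1 otherwise."""
    return 0 if governor_party == state_2016_winner else 1


def minority_proportion(prop_caucasian_only: float) -> float:
    """Minority proportion defined as 1 minus the Caucasian-only proportion."""
    if not 0.0 <= prop_caucasian_only <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    return 1.0 - prop_caucasian_only


# Hypothetical states: (governor's party, 2016 winning party, Caucasian-only share).
states = [("D", "D", 0.75), ("R", "D", 0.60)]
for gov, vote, share in states:
    print(plurality_code(gov, vote), round(minority_proportion(share), 2))
```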

3.4. Flu Death Analysis, State

As a final analysis, we investigated death rates from past influenza outbreaks and governors’ parties, a proxy for party politics. Since we found an effect at the county level and an interaction effect at the state level, we wanted to see if this was constant over time based on another respiratory disease. To investigate, we ran a repeated measures (by state) analysis of variance on the log-transformed death rate for 2014–2018. The model identified no effects associated with the governor party affiliation (F(1, 244) = 1.531, p = 0.217), only the reporting year (F(4, 244) = 2.382, p = 0.040).

4. Discussion

4.1. Summary of Results

In this study, we first ran a county level analysis for death rates based on geographical, socioeconomic, health status, health capability, and political groupings. Our investigations were reduced to two OLS models (full and lasso) and two geospatial models. From our analysis, it was clear that geospatial models with lags were preferred to the OLS models. Further, the reduced GIS model using only variables identified from lasso produced nearly the same R² as the full GIS model (0.500 versus 0.507, respectively). Thus, the reduced model performs nearly as well as the full model in estimating county death rates. In that model, we see significant geospatial effects (ρ), as well as those associated with population density, race, and the winning party in the 2016 election. The estimate for Democratic counties (untransformed) was 4.503 deaths per 100,000.
For the state level analysis, we found effects associated with the proportion minority, three principal components associated with health status variables, and the interaction between the governor’s party and the first health status variable. However, when removing the four states with the highest death rates (New Jersey, New York, Massachusetts, and Connecticut), we found that the only predictive variable was the minority proportion in the state. Further, an analysis of influenza death rates showed no effect associated with political party.

4.2. Population Density Effects

Population density has been identified as a predictive factor in disease progression [41,42]. A superficial examination of county level data indicates that a relationship might exist between population density and death rate from COVID-19 (see Table 2). Consistent with prior analysis [43,44], Table 2 also shows urban areas tended to vote Democrat in the 2016 presidential election. Due to these associations, media outlets have presented the urban–rural divide as a viable explanation for the difference in death rates between counties that voted Democrat in 2016, and those that voted Republican [45,46]. This divide has also provided an explanation for the divergent response to the disease based on party affiliation. For example, Democrats are more concerned about COVID-19 than Republicans, and are more likely to wear a facemask and practice other forms of social distancing [28,47,48]. However, the effect size of population density at the county level is negligible when other factors are considered. For example, in the reduced GIS model for counties, the standardized coefficient is only 0.051. Population density does not appear as a significant variable in the state level models. The failure of population density to provide a more significant explanation for deaths from COVID-19 has been one of the surprising results from our analysis.

4.3. Race and Ethnicity/Minority Effects

At the county level, our study confirms the findings of numerous researchers pertaining to healthcare disparities in the United States, particularly with respect to Native American, Hispanic, and African American populations [49,50,51]. We found an increase in the percentage of these populations to be associated with an increase in mortality from COVID-19 at the county and state levels of analysis. McLaren (2020) attributes this difference to disparities in education, occupation, and commuting patterns [51]. Although we did not include these factors in our analysis, we did find that the mortality disparities do not appear to be attributable to differences in unemployment rates or household income. The causes of the disparity thus remain unexplained by the covariates in this study (see Carl, 2020 [52]). Our county findings suggest that there are healthcare disparities in the United States, but may also be indicative of a pathogenesis of COVID-19 that has a greater and disproportionate effect within these three racial groups [53,54]. At the state level, increases in minority population proportions were also associated with increases in death rates per 100,000.

4.4. Health Status Effects

At the state level, health status (measured by three principal components and the interaction between the governor’s party and the first principal component) was a predictor for the n = 51 state observations. These health status effects disappeared after removing the four outlier states from the model. Thus, it would appear that minority status is the predominant predictor such that increases in the proportion of minorities are associated with increases in deaths per 100,000.

4.5. Unemployment Effects

At the county level (and consistent with prior research), unemployment characteristics were identified as having a significant association with COVID-19-related deaths [44,45]. While this association is clear, its causation is not. It is possible that unemployment increases exposure to the disease; for example, cost-cutting might lead to increased use of public transportation. It is possible that unemployment increases vulnerability to the disease through elevated stress levels and poor nutrition. The unemployed may also be left without access to healthcare, which increases mortality from disease. However, it is also possible that unemployment increases the incidence of deaths of despair (deaths due to drug, alcohol, and suicide), and that these excess deaths (defined by the CDC as the difference between the observed numbers of deaths and expected number of deaths in a specific time period) [55] are being reported as COVID-related. For example, on 13 April 2020, New York City added more than 3700 people to the COVID-19 death total – people who were presumed to have died of the coronavirus but had never tested positive [56,57]. Without a positive test, it is impossible to know if these additional deaths—at the time, 37% of the city’s total—were actually COVID-related, were deaths of despair, or were due to other causes.
Periods of economic downturn have long been found to be associated with declines in health status and higher suicide rates compared with periods of relative prosperity [46,47,48]. Recent research has found a 17% increase in drug overdose nationally during April and May 2020 [58]. Compounding the problem, there are indications that a prolonged and overly restrictive COVID response is deepening an already deleterious economic cycle, the result of which is increased unemployment [49]. As unemployment increases, so does the mortality rate either directly or indirectly from the disease. In short, extended efforts to eradicate the disease may cause additional harmful secondary and tertiary effects that may be worse than the disease itself.

4.6. Political Party Effect

The influence of politics on the reporting of COVID-19 mortality was a significant finding in our analysis. County level Democratic affiliation was significantly associated with increased COVID-19 deaths, even after controlling for factors such as population density. To the best of our knowledge, this is the first time that population density and urbanization are used as controls when evaluating death rates between Democratic and Republican states.
In past years, the CDC retrospectively tabulated the number of flu-associated illnesses, hospitalizations, and deaths—a process that takes up to two years to generate an estimate. The process relies on estimation modeling in and out of hospitals based on behavioral algorithms [59]. The CDC never relies solely on death certificate data because it recognizes that there is never large-scale testing and that the clinicians do not routinely list influenza data on death certificates if the patient died of pneumonia, heart failure, or deteriorating lung disease. According to the CDC, this leads to significant underreporting of deaths due to flu every year [59].
On 20 February 2020, the CDC published guidelines for the diagnosis and mandatory reporting of COVID-19 for any patients evaluated with “COVID related” illnesses. This applied to all healthcare practitioners and included a comprehensive set of instructions and codes to document any relationship to COVID-19 on the death certificates [60]. This represents a significant change in reporting of the disease and consequently the inclusion on the death certificate. Three separate additional guidelines put out in March and April affirmed these measures. In addition, the new CDC guidance stated that: “In cases where a definite diagnosis of COVID–19 cannot be made, but it is suspected or likely, it is acceptable to report COVID–19 on a death certificate as ‘probable’ or ‘presumed’” [60]. This change introduced significant potential variations in the tabulation of COVID-19 death tolls.
At approximately the same time, the Centers for Medicare and Medicaid Services (CMS) authorized an additional 20% reimbursement for patients carrying a diagnosis of COVID-19 pursuant to Sections 3710 and 3711 of the CARES Act [61]. These changes created a financial incentive for hospitals to classify patients as positive for COVID-19. Importantly, at the time these measures were introduced, the dominant model used by policy-makers—based on Ferguson et al. [62]—predicted an exceptionally high mortality rate [63]. By late March, more accurate estimates predicted a mortality rate well below original expectations [64]. This should have triggered a policy reversal from the CDC and CMS, but no changes were noted. In short, in the politically charged landscape of 2020, the CDC’s new way of collecting data, combined with CMS’ monetary incentives, may have resulted in the overreporting of COVID-19 deaths. The introduction of these two new sources of reporting bias makes historical comparisons unreliable at best. Without reliable data, it is difficult to effectively fight a pandemic. This conundrum associated with the reliability of data on COVID-related deaths highlights the need for objective and uniform standards for case identification and data collection.

5. Conclusions

During our analysis, we evaluated the data that pointed toward political interference in the reporting of COVID-related deaths. As of 31 August 2020, it is clear that the national death rate from COVID-19 is higher than from other flu pandemics, but the increase in the reported death rate in states with Democratic governors has been greater than the increase in states with Republican governors. Much more research in the area of politicization of medical reporting is needed, particularly given the political climate of the United States.
One of the major limitations of this study is that the associated methods are unable to estimate causality. Any variable found to be unimportant in this analysis might have its effects mediated out by others. The coefficient estimates are associated with the model built, and the associated p-values suggest the importance of that model. A second important limitation is that this analysis is current only as of 31 August 2020. The analysis will continue to change as the pandemic peaks and subsides.
Future research should supplement this analysis by investigating whether states with contested gubernatorial elections (e.g., those with ballot purges, an issue that is becoming more commonplace [65]) report higher mortality rates than those with normal elections. Additional research should focus on time series models as well as simulations to generate forecasts with the external regressors identified by this research.

Author Contributions

Conceptualization, I.F. and B.C.; methodology, L.F.; software, L.F.; validation, B.B. and J.H.; formal analysis, L.F.; writing—original draft preparation, I.F., B.C., J.H., B.B., L.F. All authors have read and agreed to the published version of the manuscript.


Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Independent Variables Considered in the Analysis.
| Variable | Source | Variable | Source |
| --- | --- | --- | --- |
| State FIPS Code | USA Facts | % Hypertension | CMS |
| State Name | USA Facts | % Ischemic Heart Disease | CMS |
| County Name | USA Facts | % Stroke | CMS |
| County FIPS Code | USA Facts | Civilian Labor Force, February 2020 | BLS |
| Population 2020 | USA Facts | Employed, February 2020 | BLS |
| Land Area, square kilometers | CB | Unemployed, February 2020 | BLS |
| People per sq. kilometer | Calculated | Percent Unemployed, February 2020 | BLS |
| Urban–Rural Classification | NCHS | Civilian Labor Force, 2019 | USDA ERS |
| % < Poverty Line | USDA ERS | Employed, 2019 | USDA ERS |
| % for Clinton in 2016 | MIT | Unemployed, 2019 | USDA ERS |
| Winning Party in 2016 | MIT | % Unemployed, 2019 | USDA ERS |
| % Below Poverty Line, 2018 | USDA ERS | MHI, 2018 | USDA ERS |
| % Smokers | RWF | Population ≥ 65, 2019 | CB |
| % Adult Obesity | RWF | % age 65 and over, 2019 | CB |
| % Abusing Alcohol | CMS | Median Age, 2019 | CB |
| % Alzheimer’s | CMS | Total Population, 2018 | IPUMS |
| % Asthma | CMS | Racial Data | IPUMS |
| % Atrial Fibrillation | CMS | # Hospital Physicians | DHC |
| % Cancer | CMS | # Acute Care Beds | DHC |
| % Chronic Kidney Disease | CMS | # Intensive Care Beds | DHC |
| % COPD | CMS | # Staffed Beds | DHC |
| % Depression | CMS | # Discharges | DHC |
| % Diabetes | CMS | Sum Average Daily Census | DHC |
| % Drug Abuse | CMS | Hospital average length of stay | DHC |
| % HIV | CMS | Average market concentration index | DHC |
| % Heart Failure | CMS | Average hospital case mix index | DHC |
| % Hepatitis B or C | CMS | Geographic shape files | CB |
| % Hyperlipidemia | CMS | | |
# = Number, MHI = Median Household Income, CB = Census Bureau [18], NCHS = National Center for Health Statistics [66], USDA ERS = United States Department of Agriculture Economic Research Service [67], MIT = MIT Election Lab [68], RWF = Robert Wood Johnson Foundation County Health Rankings and Roadmaps [69], CMS = Centers for Medicare & Medicaid Services [19], BLS = Bureau of Labor Statistics [70], IPUMS = Integrated Public Use Microdata Series [71], DHC = Definitive Healthcare [17].


References

  1. USAFacts. Nonpartisan Government Data. Available online: (accessed on 7 August 2020).
  2. JohnsHopkins. COVID-19 Map. Available online: (accessed on 4 August 2020).
  3. Weller, M.W.; Wheat, A.D.; Fleischmann, J.F. COVID-19 Will Reshape the 2020 Presidential Campaign. Available online: (accessed on 31 August 2020).
  4. Newall, M. Most Americans Support Single, National Strategy to Combat COVID-19. Available online: (accessed on 7 August 2020).
  5. Patino, M. Coronavirus Data in the U.S. Is Terrible, and Here’s Why. Available online: (accessed on 4 August 2020).
  6. Brown, E.; Reinhard, B.; Davis, A.C. Coronavirus Death Toll: Americans Are Almost Certainly Dying of COVID-19 but Being Left Out of the Official Count; The Washington Post: Washington, DC, USA, 2020.
  7. Kliff, S.; Bosman, J. Official counts understate the U.S. coronavirus death toll. New York Times, 5 April 2020.
  8. COVID Doctors Challenge CDCs Rules on Cause of Death, Concerned about Inflated Numbers. Available online: (accessed on 7 August 2020).
  9. Ingold, J.; Paul, J. Nearly a Quarter of the People Colorado Said Died From Coronavirus Don’t Have COVID-19 on Their Death Certificate; The Colorado Sun: Denver, CO, USA, 2020.
  10. Badr, H.S.; Du, H.; Marshall, M.; Dong, E.; Squire, M.M.; Gardner, L.M. Association between mobility patterns and COVID-19 transmission in the USA: A mathematical modelling study. Lancet Infect. Dis. 2020.
  11. Scannell, C.A.; Oronce, C.I.A.; Tsugawa, Y. Association Between County-Level Racial and Ethnic Characteristics and COVID-19 Cases and Deaths in the USA. J. Gen. Intern. Med. 2020.
  12. Ives, A.R.; Bozzuto, C. Predictable county-level estimates of R0 for COVID-19 needed for public health planning in the USA. MedRxiv 2020.
  13. Altieri, N.; Barter, R.; Duncan, J.; Dwivedi, R.; Kumbier, K.; Li, X.; Netzorg, R.; Park, B.; Singh, C.; Tan, Y.S.; et al. Curating a COVID-19 data repository and forecasting county-level death counts in the United States. arXiv 2020, arXiv:2005.07882.
  14. Flanders, W.D.; Goodman, M. The association of voter turnout with county-level coronavirus disease 2019 occurrence early in the pandemic. Ann. Epidemiol. 2020, 49, 42–49.
  15. Makridis, C.; Rothwell, J.T. The real cost of political polarization: Evidence from the COVID-19 pandemic. SSRN 2020. Available online: (accessed on 31 August 2020).
  16. CDC. Stats of the States-Influenza/Pneumonia Mortality. Available online: (accessed on 7 August 2020).
  17. Definitive Healthcare. Healthcare Analytics & Provider Data|Definitive Healthcare. Available online: (accessed on 7 August 2020).
  18. U.S. Census Bureau. Available online: (accessed on 9 July 2020).
  19. Research, Statistics, Data & Systems|CMS. Available online: (accessed on 7 August 2020).
  20. U.S. Census Bureau. TIGER/Line Shapefiles. Available online: (accessed on 7 August 2020).
  21. CDC. People with Certain Medical Conditions. Available online: (accessed on 4 August 2020).
  22. Bialek, S.; Bowen, V.; Chow, N.; Curns, A.; Gierke, R.; Hall, A.; Hughes, M.; Pilishvili, T.; Ritchey, M.; Roguski, K.; et al. Geographic differences in COVID-19 cases, deaths, and incidence—United States, February 12–April 7, 2020. Morb. Mortal. Wkly. Rep. 2020, 69, 465–471.
  23. Dowd, J.B.; Andriano, L.; Brazel, D.M.; Rotondi, V.; Block, P.; Ding, X.; Liu, Y.; Mills, M.C. Demographic science aids in understanding the spread and fatality rates of COVID-19. Proc. Natl. Acad. Sci. USA 2020, 117, 9696.
  24. Lynn, R.; Meisenberg, G. Race Differences in Deaths from Coronavirus in England and Wales: Demographics, Poverty, Pre-existing Conditions, or Intelligence? Mankind Q 2020, 60, 511–524.
  25. Rocklöv, J.; Sjödin, H. High population densities catalyse the spread of COVID-19. J. Travel Med. 2020, 27.
  26. Dragano, N.; Rupprecht, C.J.; Dortmann, O.; Scheider, M.; Wahrendorf, M. Higher risk of COVID-19 hospitalization for unemployed: An analysis of 1,298,416 health insured individuals in Germany. MedRxiv 2020.
  27. Richardson, S.; Hirsch, J.S.; Narasimhan, M.; Crawford, J.M.; McGinn, T.; Davidson, K.W. Presenting Characteristics, Comorbidities, and Outcomes Among 5700 Patients Hospitalized With COVID-19 in the New York City Area. JAMA 2020, 323, 2052–2059.
  28. Grossman, G.; Kim, S.; Rexer, J.; Thirumurthy, H. Political Partisanship Influences Behavioral Responses to Governors’ Recommendations for COVID-19 Prevention in the United States. SSRN 2020.
  29. Fang, D.; Pan, S.; Li, Z.; Yuan, T.; Jiang, B.; Gan, D.; Sheng, B.; Han, J.; Wang, T.; Liu, Z. Large-scale public venues as medical emergency sites in disasters: Lessons from COVID-19 and the use of Fangcang shelter hospitals in Wuhan, China. BMJ Glob. Health 2020, 5, 002815.
  30. Driscoll, J.; Sonin, K.; Wilson, J.; Wright, A.L. Poverty and Economic Dislocation Reduce Compliance with COVID-19 Shelter-in-Place Protocols CEPR Discussion Paper No. DP14618 2020. Available online: (accessed on 31 August 2020).
  31. CDC. People Who Are at Higher Risk for Severe Illness | Coronavirus | COVID-19 | CDC. Available online: (accessed on 7 August 2020).
  32. Jones, B. Coronavirus Death Toll is Heavily Concentrated in Democratic Congressional Districts. Available online: (accessed on 7 August 2020).
  33. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Methodol. 1996, 58, 267–288.
  34. Lockhart, R.; Taylor, J.; Tibshirani, R.J.; Tibshirani, R. A significance test for the lasso. Ann. Statist. 2014, 42, 413–468.
  35. Lockhart, R.; Taylor, J.; Tibshirani, R.; Tibshirani, R. CovTest: Computes Covariance Test for Adaptive Linear Modelling. R Package Version 1.02. Available online: (accessed on 29 August 2020).
  36. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2018.
  37. Lindley, D.V. A statistical paradox. Biometrika 1957, 44, 187–192.
  38. Fulton, L. County Analysis. Available online: (accessed on 31 August 2020).
  39. Fulton, L. State Analysis. Available online: (accessed on 31 August 2020).
  40. Drukker, D.M.; Prucha, I.R.; Raciborski, R. Maximum Likelihood and Generalized Spatial Two-Stage Least-Squares Estimators for a Spatial-Autoregressive Model with Spatial-Autoregressive Disturbances. Stata J. 2013, 13, 221–241.
  41. Reyes, R.; Ahn, R.; Thurber, K.; Burke, T.F. Urbanization and Infectious Diseases: General Principles, Historical Perspectives, and Contemporary Challenges. In Challenges in Infectious Diseases; Fong, I.W., Ed.; Springer: New York, NY, USA, 2013; pp. 123–146.
  42. Neiderud, C.-J. How urbanization affects the epidemiology of emerging infectious diseases. Infect. Ecol. Epidemiol. 2015, 5, 27060.
  43. Badger, E.; Bui, Q.; Pearce, A. The election highlighted a growing rural-urban split. The New York Times, 11 November 2016.
  44. Evich, H.B. Revenge of the Rural Voter. Available online: (accessed on 7 August 2020).
  45. Davis, E. Study: COVID-19 Deaths Concentrated in Urban Areas Represented by Democrats; U.S.News: Washington, DC, USA, 2020.
  46. Elving, R. NPR: What Coronavirus Exposes About America’s Political Divide. Available online: (accessed on 7 August 2020).
  47. Civiqs. Coronavirus: Outbreak Concern. Available online: (accessed on 7 August 2020).
  48. Pew. Health Concerns From COVID-19 Much Higher Among Hispanics and Blacks Than Whites. Available online: (accessed on 7 August 2020).
  49. Wilder, J.M. The Disproportionate Impact of COVID-19 on Racial and Ethnic Minorities in the United States. Clin. Infect. Dis. 2020.
  50. Dorn, A.v.; Cooney, R.E.; Sabin, M.L. COVID-19 exacerbating inequalities in the US. Lancet 2020, 395, 1243–1244.
  51. McLaren, J. Racial disparity in COVID-19 deaths: Seeking economic roots with census data. Natl. Bur. Econ. Res. 2020.
  52. Carl, N. An analysis of COVID-19 mortality at the local authority level in England. SocArXiv 2020.
  53. Nédélec, Y.; Sanz, J.; Baharian, G.; Szpiech, Z.A.; Pacis, A.; Dumaine, A.; Grenier, J.-C.; Freiman, A.; Sams, A.J.; Hebert, S.; et al. Genetic Ancestry and Natural Selection Drive Population Differences in Immune Responses to Pathogens. Cell 2016, 167, 657–669.
  54. Price-Haywood, E.G.; Burton, J.; Fort, D.; Seoane, L. Hospitalization and Mortality among Black Patients and White Patients with Covid-19. N. Engl. J. Med. 2020, 382, 2534–2543.
  55. CDC. Excess Deaths Associated with COVID-19. Available online: (accessed on 13 September 2020).
  56. Goodman, J.D.; Rashbaum, W.K. N.Y.C. Death Toll Soars Past 10,000 in Revised Virus Count. Available online: (accessed on 13 September 2020).
  57. Fichera, A. Hospital Payments and the COVID-19 Death Count. Available online: (accessed on 13 September 2020).
  58. Alter, A.; Yeager, C. ODMap: COVID-19 Impact on US National Overdose Crisis. Available online: (accessed on 13 September 2020).
  59. CDC. How CDC Estimates the Burden of Seasonal Influenza in the U.S. | CDC. Available online: (accessed on 7 August 2020).
  60. CDC. Guidance for Certifying Deaths due to Coronavirus Disease 2019 (COVID-19). Available online: (accessed on 7 August 2020).
  61. CMS. New Waivers for Inpatient Prospective Payment System (IPPS) Hospitals, Long-Term Care Hospitals (LTCHs), and Inpatient Rehabilitation Facilities (IRFs) due to Provisions of the CARES Act. Available online: (accessed on 7 August 2020).
  62. Ferguson, N.; Laydon, D.; Nedjati Gilani, G.; Imai, N.; Ainslie, K.; Baguelin, M.; Bhatia, S.; Boonyasiri, A.; Cucunuba Perez, Z.; Cuomo-Dannenburg, G.; et al. Report 9: Impact of Non-Pharmaceutical Interventions (NPIs) to Reduce COVID19 Mortality and Healthcare Demand. Available online: (accessed on 7 August 2020).
  63. Adam, D. Special report: The simulations driving the world’s response to COVID-19. Nature 2020, 580, 316–318.
  64. Holmdahl, I.; Buckee, C. Wrong but Useful—What Covid-19 Epidemiologic Models Can and Cannot Tell Us. N. Engl. J. Med. 2020, 383, 303–305.
  65. Purges: A Growing Threat to the Right to Vote. Available online: (accessed on 8 August 2020).
  66. NCHS. National Center for Health Statistics. Available online: (accessed on 7 July 2020).
  67. ERS. United States Department of Agriculture Economic Research Service. Available online: (accessed on 7 July 2020).
  68. MIT. Data. Available online: (accessed on 7 July 2020).
  69. RWF. 2020 County Health Rankings Report. Available online: (accessed on 7 July 2020).
  70. BLS. Available online: (accessed on 7 July 2020).
  71. IPUMS. Download U.S. Census Data Tables and Mapping Files. Available online: (accessed on 7 July 2020).
Figure 1. Boxplots of Republican versus Democratic county death rates per 100,000.
Figure 2. Coronavirus (COVID-19) death rates per 100,000 (y axis) as a function of proportion voting for Clinton in 2016 (x axis) and the current party of the governor as a red or blue dot.
Figure 3. Residual plot from the Ordinary Least Squares (OLS) model shows clusters in the Northeast and Southwest.
Figure 4. Residuals, state level initial analysis.
Table 1. County level descriptive statistics.
| Variable (n = 3116 Counties) | Mean | SD | Median | Minimum | Maximum |
| --- | --- | --- | --- | --- | --- |
| Population in 2020 | 105,237 | 334,733.38 | 26,163.00 | 169 | 10,039,107 |
| Population Density (persons per km²) | 106.45 | 696.94 | 17.50 | 0 | 27755 |
| Native American % | 1.57% | 6.48% | 0.30% | 0.00% | 89.60% |
| Hispanic % | 9.30% | 13.84% | 4.10% | 0.00% | 99.10% |
| African American % | 8.99% | 14.51% | 2.20% | 0.00% | 87.40% |
| Asian % | 1.31% | 2.59% | 0.60% | 0.00% | 43.08% |
| % 65 or older | 19.79% | 4.76% | 19.40% | 4.90% | 58.20% |
| Unemployment % (2019) | 3.96% | 1.39% | 3.70% | 0.70% | 18.30% |
| Household Income USD (2018) + | USD 52,714.43 | USD 13,851.63 | USD 50,531.00 | USD 25,385.00 | USD 140,382.00 |
| Poverty % | 15.17% | 6.11% | 14.10% | 2.60% | 54.00% |
| Smoke % | 17.44% | 3.56% | 16.95% | 5.91% | 41.49% |
| Adult Obesity % | 32.85% | 5.43% | 33.10% | 12.40% | 57.70% |
| Alcohol Abuse % | 2.24% | 1.01% | 2.21% | 0.00% | 10.36% |
| Alzheimer’s % | 10.17% | 2.18% | 10.15% | 0.00% | 25.02% |
| Asthma % | 4.31% | 1.34% | 4.35% | 0.00% | 11.64% |
| Atrial Fibrillation % | 8.03% | 1.61% | 8.12% | 0.00% | 17.50% |
| Cancer % | 7.41% | 1.40% | 7.43% | 0.00% | 12.10% |
| Kidney % * | 22.85% | 4.51% | 22.94% | 0.00% | 51.45% |
| COPD % | 12.81% | 3.77% | 12.44% | 0.00% | 32.15% |
| Depression % | 17.44% | 3.57% | 17.48% | 0.00% | 35.87% |
| Diabetes % | 26.93% | 5.09% | 27.11% | 0.00% | 49.62% |
| Drug Abuse % | 3.14% | 1.83% | 2.93% | 0.00% | 16.70% |
| HIV % | 0.11% | 0.25% | 0.00% | 0.00% | 4.51% |
| Heart Failure % | 14.39% | 3.28% | 14.15% | 0.00% | 33.75% |
| Hepatitis B % | 0.47% | 0.42% | 0.49% | 0.00% | 4.10% |
| Hyperlipidemia % ** | 38.04% | 8.80% | 39.35% | 0.00% | 67.55% |
| Hypertension % | 56.51% | 8.77% | 58.30% | 0.00% | 74.95% |
| Ischemia % *** | 26.84% | 5.44% | 26.68% | 0.00% | 46.91% |
| Stroke % | 3.32% | 1.09% | 3.35% | 0.00% | 9.46% |
| Number of Acute Beds | 215 | 720.47 | 35 | 0 | 19274 |
| Case Mix Index | 1.061 | 0.587 | 1.170 | 0.000 | 2.710 |
| 2016 Winning Party (1 = Democrat) | 0.158 | 0.364 | 0.000 | 0.000 | 1.000 |
+ collinear with poverty, r = −0.771, * collinear with diabetes, r = 0.78, ** collinear with hypertension, r = 0.80, *** collinear with heart failure and hypertension, r = 0.71 for both.
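The collinearity notes under Table 1 rest on pairwise Pearson correlations between candidate predictors. A minimal sketch of that kind of screening is given here, assuming a cutoff of |r| ≥ 0.7 (the footnote reports values from 0.71 to 0.80 but states no threshold). The function name `collinear_pairs` is hypothetical and the data are synthetic, not the study's.

```python
import numpy as np

def collinear_pairs(data, names, threshold=0.7):
    """Return (name_a, name_b, r) for column pairs with |Pearson r| >= threshold."""
    corr = np.corrcoef(data, rowvar=False)  # columns are variables
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], round(float(corr[i, j]), 3)))
    return pairs

# Synthetic illustration only; these are NOT the study's county data.
rng = np.random.default_rng(0)
poverty = rng.normal(15, 6, size=500)
income = 80 - 2.0 * poverty + rng.normal(0, 5, size=500)  # built to move inversely with poverty
obesity = rng.normal(33, 5, size=500)                     # generated independently of both
X = np.column_stack([poverty, income, obesity])

flagged = collinear_pairs(X, ["poverty", "income", "obesity"])
print(flagged)
```

A flagged pair would then be handled by dropping one member or combining the two, which is one way to read the markers in Table 1.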
Table 2. Population density and COVID-19 deaths by 2016 electoral outcome (31 August 2020).
| Candidate | Counties Won | Avg. Density | Deaths | Death Rate |
| --- | --- | --- | --- | --- |
Table 3. State level descriptive statistics.
| Variables (n = 51) | Mean | SD | Median | Minimum | Maximum |
| --- | --- | --- | --- | --- | --- |
| % African American | 11.27% | 10.72% | 7.50% | 0.40% | 46.90% |
| % Native American | 1.62% | 2.87% | 0.50% | 0.20% | 14.40% |
| % Hispanic | 12.01% | 10.31% | 9.52% | 1.43% | 49.09% |
| % 65 and over | 16.39% | 1.99% | 16.40% | 11.10% | 20.60% |
| % Unemployment | 3.62% | 0.82% | 3.50% | 2.40% | 6.10% |
| % Democratic Governor | 49.02% | 50.49% | 0.00% | 0.00% | 100.00% |
| COVID-19 Deaths/100 K | 45.74 | 39.58 | 32.95 | 5.01 | 179.53 |
| Flu Deaths/100 K | 15.10 | 3.76 | 14.65 | 7.00 | 29.60 |
Table 4. Model results (scaled variables).
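| Variable | OLS Full | p | Lasso | Adaptive p | GIS Full | p | GIS Reduced | p |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| R² (Predicted R² for Lasso) | 0.377 | | 0.352 +/- 0.800 | | 0.507 | | 0.500 | |
| Rho | | | | | 0.634 | <0.001 | 0.589 | <0.001 |
| Pop. Density | 0.163 | 0.017 | 0.138 | 0.038 | 0.066 | <0.001 | 0.051 | <0.001 |
| % Native American | 0.090 | 0.018 | 0.057 | 0.038 | 0.070 | <0.001 | 0.059 | <0.001 |
| % Hispanic | 0.133 | 0.022 | 0.132 | <0.001 | 0.082 | <0.001 | 0.071 | <0.001 |
| % Black | 0.408 | 0.029 | 0.369 | <0.001 | 0.178 | <0.001 | 0.169 | <0.001 |
| % Asian | 0.008 | 0.019 | | | −0.009 | 0.581 | | |
| % 65 and older | 0.022 | 0.019 | | | 0.022 | 0.182 | | |
| % Unemployed | 0.079 | 0.018 | 0.075 | 0.007 | 0.052 | 0.001 | 0.062 | <0.001 |
| Poverty | 0.018 | 0.027 | | | 0.012 | 0.621 | | |
| % Smoke | −0.061 | 0.026 | | | −0.006 | 0.815 | | |
| % Adult Obesity | −0.045 | 0.019 | | | 0.006 | 0.721 | | |
| % Alcohol | 0.041 | 0.020 | | | 0.024 | 0.170 | | |
| % Alzheimer’s | 0.112 | 0.021 | 0.149 | <0.001 | 0.073 | <0.001 | 0.097 | <0.001 |
| % Asthma | −0.049 | 0.020 | | | −0.022 | 0.217 | | |
| % Atrial Fib. | 0.017 | 0.021 | | | 0.011 | 0.563 | | |
| % Cancer | −0.010 | 0.020 | | | −0.016 | 0.379 | | |
| % COPD | −0.074 | 0.027 | −0.104 | <0.001 | −0.047 | 0.048 | −0.053 | 0.006 |
| % Depression | 0.036 | 0.023 | | | 0.043 | 0.034 | | |
| % Diabetes | 0.183 | 0.027 | 0.162 | <0.001 | 0.078 | 0.001 | 0.079 | <0.001 |
| % Drug Abuse | −0.027 | 0.022 | | | −0.033 | 0.096 | | |
| % HIV | −0.074 | 0.021 | | | −0.047 | 0.011 | | |
| % Heart Failure | −0.027 | 0.021 | | | −0.009 | 0.636 | | |
| % Hepatitis B | −0.048 | 0.021 | | | −0.031 | 0.095 | | |
| % Stroke | 0.092 | 0.022 | | | 0.026 | 0.182 | | |
| Number of Acute Beds | −0.006 | 0.018 | | | 0.009 | 0.565 | | |
| Case Mix Index | 0.038 | 0.017 | | | 0.045 | 0.004 | | |
| Winning Party | 0.029 | 0.019 | 0.024 | 0.089 | 0.046 | 0.007 | 0.032 | 0.033 |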
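Table 4 reports coefficients for scaled variables, which puts predictors measured in different units on a comparable footing. A minimal sketch of one common scaling, z-score standardization, is given here. That the authors used exactly this transformation is an assumption, the helper name `zscore` is hypothetical, and the input values are made up for illustration.

```python
import numpy as np

def zscore(column):
    """Center a column on its mean and divide by its sample standard deviation."""
    column = np.asarray(column, dtype=float)
    return (column - column.mean()) / column.std(ddof=1)

# Made-up density values (persons per square kilometer), not the study's data.
density = np.array([2.0, 17.5, 106.0, 950.0, 27755.0])
scaled = zscore(density)

# After scaling, the column has mean 0 and sample standard deviation 1,
# so a coefficient reads as "change in outcome per one SD of the predictor".
print(scaled.mean(), scaled.std(ddof=1))
```

Under this reading, coefficients fitted on unscaled variables are expressed in the original units of each predictor, which is why their magnitudes can differ so widely from the scaled ones.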
Table 5. Unscaled geospatial model.
| Variable | Coefficient | p |
| --- | --- | --- |
| Population Density | 0.003 | 0.001 |
| % Native American | 42.728 | <0.001 |
| % Hispanic | 23.226 | <0.001 |
| % African American/Black | 52.703 | <0.001 |
| Unemployment Rate | 2.112 | <0.001 |
| Alzheimer’s Disease | 2.077 | <0.001 |
| Chronic Obstructive Pulmonary Disease (COPD) | −0.664 | 0.005 |
| Winning Party, 2016 Election (1 = Democrat) | 4.503 | 0.021 |
Table 6. Results of the regression analyses for the state models.
| Variable | OLS Full | p | OLS without State Outliers | p |
| --- | --- | --- | --- | --- |
| % Minority | −0.231 | 0.083 | 0.421 | 0.070 |
| Governor’s Party | −0.056 | 0.609 | −0.260 | 0.137 |
| % in Poverty | 0.198 | 0.243 | −0.270 | 0.273 |
| Population Density | −0.258 | 0.116 | −0.013 | 0.959 |
| Health PC1 | 0.201 | 0.000 | −0.074 | 0.272 |
| Health PC2 | 0.388 | 0.000 | 0.005 | 0.977 |
| Health PC3 | −0.213 | 0.029 | 0.145 | 0.263 |
| Governor’s Party × Health PC1 | 0.084 | 0.027 | 0.053 | 0.332 |

Share and Cite

MDPI and ACS Style

Feinhandler, I.; Cilento, B.; Beauvais, B.; Harrop, J.; Fulton, L. Predictors of Death Rate during the COVID-19 Pandemic. Healthcare 2020, 8, 339.

AMA Style

Feinhandler I, Cilento B, Beauvais B, Harrop J, Fulton L. Predictors of Death Rate during the COVID-19 Pandemic. Healthcare. 2020; 8(3):339.

Chicago/Turabian Style

Feinhandler, Ian, Benjamin Cilento, Brad Beauvais, Jordan Harrop, and Lawrence Fulton. 2020. "Predictors of Death Rate during the COVID-19 Pandemic" Healthcare 8, no. 3: 339.
