Counties with Lower Insurance Coverage and Housing Problems Are Associated with Both Slower Vaccine Rollout and Higher COVID-19 Incidence

Equitable vaccination distribution is a priority for outcompeting the transmission of COVID-19. Here, the impact of demographic, socioeconomic, and environmental factors on county-level vaccination rates and COVID-19 incidence changes is assessed. In particular, using data from 3142 US counties with over 328 million individuals, correlations were computed between cumulative vaccination rate and change in COVID-19 incidence from 1 December 2020 to 6 June 2021, with 44 different demographic, environmental, and socioeconomic factors. This correlation analysis was also performed using multivariate linear regression to adjust for age as a potential confounding variable. These correlation analyses demonstrated that counties with high levels of uninsured individuals have significantly lower COVID-19 vaccination rates (Spearman correlation: −0.460, p-value: <0.001). In addition, severe housing problems and high housing costs were strongly correlated with increased COVID-19 incidence (Spearman correlations: 0.335, 0.314, p-values: <0.001, <0.001). This study shows that socioeconomic factors are strongly correlated to both COVID-19 vaccination rates and incidence rates, underscoring the need to improve COVID-19 vaccination campaigns in marginalized communities.


Introduction
Since it was first declared a global pandemic by the World Health Organization (WHO) on 11 March 2020 [1], the novel coronavirus disease 2019 (COVID-19) has developed into the worst pandemic in over 100 years [2]. As of 13 August 2021, there have been more than 200 million cases of COVID-19 reported worldwide, including more than 4.3 million reported deaths [3]. In the United States alone, there have been over 36 million reported COVID-19 cases and 600,000 reported deaths [4]. This has resulted in the deepest global economic recession since World War II [5].
To combat this deadly pandemic, companies and researchers around the world have been racing to develop treatments [6] and vaccines [7], and national governments have been working to obtain access to vaccines and rapidly administer them to their populations. There are currently seven vaccines for COVID-19 approved for use by the WHO, which are manufactured by Pfizer/BioNTech, Moderna, Johnson & Johnson, Oxford/AstraZeneca, Serum Institute of India, Sinopharm, and Sinovac [8]. To date, 30.8% of the world's population has received at least one dose of a COVID-19 vaccine, and 16.1% are fully vaccinated [9].
The COVID-19 vaccine rollout in the United States has been among the fastest in the world [10]. However, this rapid vaccine rollout has not benefited all Americans equally, and the vaccination rate in some marginalized communities has lagged significantly behind the average [11]. Social determinants of health (SDoH) and aspects of an individual's life that occur "outside of the four walls of healthcare" have a tremendous impact on actual health status [12,13]. A recent study by the CDC showed that vaccine coverage is lower in counties with high social vulnerability based upon socioeconomic indicators (poverty, unemployment, low income, no high school diploma) [14]. This study did not, however, assess the interplay between these factors and new COVID-19 incidence rates. In addition, another recent analysis of 580 US counties found that the change in COVID-19 incidence from 1 December 2020 to 1 March 2021 is significantly correlated with cumulative vaccination rate through 1 March 2021 [15]. Outside of the US, researchers have found significant correlations between socioeconomic status and vaccination acceptance rates in Israel [16], and vaccine hesitancy is a worldwide issue [17]. However, it remains unclear whether disparities in vaccine rollout and associated COVID-19 infection rate fluctuations have been driven by specific socioeconomic and population health factors.
The objective of this study is to determine which socioeconomic and environmental factors at the county level affect vaccination and COVID-19 incidence in the US. The following research questions were considered:
(1) Which county-level socioeconomic factors are most strongly correlated with low vaccination rates and high COVID-19 incidence?
(2) Which county-level socioeconomic factors are most strongly correlated with low vaccination rates and high COVID-19 incidence, after adjusting for age as a confounding factor?
(3) What are the characteristics of counties with the lowest vaccination rates?
To address these research questions, publicly available data on US county-level vaccination rates and COVID-19 incidence rates were considered, along with a large dataset of 44 county-level socioeconomic factors. Pairwise correlation analysis between each of the socioeconomic factors with vaccination rates and COVID-19 incidence rates was performed, along with an age-adjusted pairwise correlation analysis to account for age as a confounding factor in the vaccine rollout. Enrichment analysis was performed to determine the socioeconomic factors that differentiate the counties in the top and bottom quartiles of vaccination rate. Finally, multivariate analysis was performed to determine which socioeconomic factors are heavily correlated with each other and which are significant predictors of vaccination rate in a model, controlling for all of the other socioeconomic factors.

Materials and Methods
The study analyzed 3142 counties and county equivalents in the United States that have cumulative vaccination data available through 6 June 2021. These counties include over 328 million individuals from all 50 states and the District of Columbia. For each county, the vaccination rate was defined as the percentage of individuals in the county with at least one dose of an FDA-authorized COVID-19 vaccine as of 6 June 2021. In addition, county-level COVID-19 incidence data were obtained from the CDC COVID Data Tracker [4]. The change in COVID-19 incidence is defined as the 7-day rolling average COVID-19 incidence rate on 6 June 2021 minus the 7-day rolling average COVID-19 incidence rate on 1 December 2020, where the COVID-19 incidence rate is the number of new COVID-19 cases reported in the county per 100,000 individuals. Additional analysis was performed using multivariate linear regression to adjust for age as a potential confounding variable.
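As a rough illustration of the incidence-change definition above, the following Python sketch computes the difference between two 7-day rolling averages for a single county. The function names, the dictionary-of-dates input, and the use of a trailing (rather than centered) 7-day window are assumptions made for this example; the source does not specify them.

```python
from datetime import date, timedelta


def rolling_incidence(daily_cases, population, day, window=7):
    """Mean daily new cases per 100,000 over the `window` days ending on `day`.

    `daily_cases` maps a datetime.date to the number of new cases reported that day.
    A trailing window is an assumption of this sketch.
    """
    days = [day - timedelta(days=k) for k in range(window)]
    mean_cases = sum(daily_cases[d] for d in days) / window
    return mean_cases / population * 100_000


def incidence_change(daily_cases, population,
                     start=date(2020, 12, 1), end=date(2021, 6, 6)):
    """Rolling-average incidence on `end` minus rolling-average incidence on `start`."""
    return (rolling_incidence(daily_cases, population, end)
            - rolling_incidence(daily_cases, population, start))
```

A negative value means that the county's incidence rate was lower on 6 June 2021 than on 1 December 2020.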
Demographic and socioeconomic data for each county were obtained from the 2020 County Health Rankings [18] resource provided by the County Health Rankings & Roadmaps program at the University of Wisconsin Population Health Institute. A data completeness threshold of 70% was set and redundant variables were filtered out, resulting in 44 out of a total of 131 variables from the 2020 County Health Rankings. Most of the variables with limited data availability were race-specific variables for minority populations (e.g., number of firearm fatalities-Black, motor vehicle crash deaths-Hispanic), so these could not be included in the analysis. A complete list of demographic and socioeconomic variables with data available is included in Table 1. For each of the 44 county-level features, Spearman rank correlations were computed between: (1) the feature of interest and county-level vaccination rate, and (2) the feature of interest and county-level change in COVID-19 incidence. Spearman rank correlations and corresponding p-values were computed using the SciPy package (version 1.6.0) [19] in Python. Plots were created using Python's Matplotlib package (version 3.3.4) [20].

Table 1. Spearman rank correlations for county-level features with vaccination rates and change in COVID-19 incidence rates. The county-level vaccination rate is defined as the percentage of individuals in the county with at least one COVID-19 vaccine dose as of 6 June 2021. The county-level COVID-19 incidence rate increase is defined as the 7-day rolling average COVID-19 incidence rate (number of new COVID-19 cases/total population) in the county on 6 June 2021 minus the 7-day rolling average COVID-19 incidence rate in the county on 1 December 2020. For each county-level feature, we show the Spearman rank correlation coefficients for the feature vs. vaccination rate, and the feature vs. COVID-19 incidence rate increase, along with the associated p-values. Rows are sorted by correlation with vaccination rate.

In order to evaluate the effect of each standardized county-level feature on vaccine coverage after controlling for age as a confounding factor, a linear regression model was implemented to get standardized coefficients, along with their 95% confidence intervals. The p-values were corrected using the Benjamini-Hochberg procedure [21] to control the false discovery rate across the multiple comparisons.
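The two statistical building blocks named above, Spearman rank correlation and the Benjamini-Hochberg correction, can be sketched as follows. This is a minimal illustration on synthetic data, not the study's pipeline: the feature names and simulated values are hypothetical, `benjamini_hochberg` is a helper written for this example, and the correction is applied here to Spearman p-values only to keep the sketch short (the study applies it to the regression p-values).

```python
import numpy as np
from scipy.stats import spearmanr


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity from the largest p-value downwards, then cap at 1.
    adjusted_sorted = np.minimum(np.minimum.accumulate(scaled[::-1])[::-1], 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted
    return adjusted


# Synthetic stand-in for county-level data (values are made up).
rng = np.random.default_rng(0)
n_counties = 500
vaccination_rate = rng.normal(35, 8, n_counties)
features = {
    "pct_uninsured": -0.5 * vaccination_rate + rng.normal(0, 5, n_counties),
    "unrelated_feature": rng.normal(0, 1, n_counties),
}

results = {}
for name, values in features.items():
    rho, p_value = spearmanr(values, vaccination_rate)
    results[name] = (rho, p_value)

adjusted = benjamini_hochberg([p for _, p in results.values()])
```

In this synthetic setup, `pct_uninsured` is constructed to correlate negatively with the vaccination rate, so its coefficient comes out negative, while `unrelated_feature` is pure noise.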

Next, counties were grouped into quartiles based on percent vaccinated through 6 June 2021. For a select number of county-level features, rates were computed in the top and bottom quartiles, and relative risks and Fisher exact test p-values were reported. For the relative risk values, 95% confidence intervals were computed using a delta-method approximation [22].
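One common form of the delta-method interval for a relative risk works on the log scale; the sketch below shows that form using only the Python standard library. It is an assumption that this matches the exact variant used in the study, and the function name and the count-based inputs (number of individuals with the feature and group size in each quartile) are hypothetical.

```python
import math


def relative_risk_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Relative risk of group A vs. group B with a log-scale delta-method CI.

    events_* are counts with the feature; n_* are group sizes (assumed inputs).
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    rr = p_a / p_b
    # Delta-method standard error of log(RR): sqrt(1/a - 1/n_a + 1/b - 1/n_b).
    se_log_rr = math.sqrt((1 - p_a) / events_a + (1 - p_b) / events_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper
```

A relative risk below 1 with an upper bound that is also below 1 indicates that the feature is less common in group A (for example, the top vaccination quartile) than in group B.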
To analyze the relationship between each pair of features, a correlation matrix was calculated using the Spearman method, and the results are presented in a heatmap. In addition, principal component analysis was used to explore multivariate relationships in the dataset. For this analysis, principal components were computed using features standardized around the mean, and missing values were filled in using the expectation-maximization algorithm. Afterwards, Spearman correlations between each feature and each of the principal components were computed. The results are presented in a heatmap.
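The feature-by-feature correlation matrix and the feature-by-component correlations described above can be sketched with NumPy and SciPy. This is an illustrative example on synthetic data with no missing values, so the expectation-maximization imputation step is not shown; the helper name `pca_scores` and the three simulated features are assumptions.

```python
import numpy as np
from scipy.stats import spearmanr


def pca_scores(X, n_components=2):
    """Standardize columns, then compute principal component scores via SVD."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return Z, scores, explained[:n_components]


# Synthetic data: two features driven by one latent factor, one independent feature.
rng = np.random.default_rng(1)
latent = rng.normal(size=300)
X = np.column_stack([
    latent + rng.normal(scale=0.3, size=300),
    -latent + rng.normal(scale=0.3, size=300),
    rng.normal(size=300),
])

# Spearman correlation matrix between every pair of features (columns).
feature_corr, _ = spearmanr(X)

# Spearman correlation between each feature and each principal component.
Z, scores, explained = pca_scores(X)
feature_pc_corr = np.empty((Z.shape[1], scores.shape[1]))
for j in range(Z.shape[1]):
    for k in range(scores.shape[1]):
        rho, _ = spearmanr(Z[:, j], scores[:, k])
        feature_pc_corr[j, k] = rho
```

Either matrix can then be passed to a heatmap routine. The sign of each principal component is arbitrary, so only the relative signs of the feature correlations within a component are meaningful.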
Finally, multivariate regression analysis was performed in order to determine how much each feature influences vaccination rates when all other features are kept constant. In particular, a logistic regression model with L1 regularization was trained to predict whether a county was in the top or bottom quartile based upon vaccination rate, using all of the 44 socioeconomic factors as predictors. The logistic regression model was implemented using the statsmodels (version 0.12.2) package in Python, and the optimal value of the L1 penalty term hyperparameter was computed using cross-validation.
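To show how an L1 penalty drives some coefficients to exactly zero, here is a small NumPy sketch of L1-regularized logistic regression fitted by proximal gradient descent. It is a stand-in for illustration only: the study used the statsmodels package, whereas the solver, the function name `fit_l1_logistic`, the step size, and the synthetic data below are all assumptions, and the cross-validation used to choose the penalty is not shown.

```python
import numpy as np


def fit_l1_logistic(X, y, alpha, step=0.1, n_iter=3000):
    """Minimize mean logistic loss + alpha * ||w||_1 by proximal gradient descent."""
    n, p = X.shape
    w = np.zeros(p)
    b = 0.0
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad_w = X.T @ (prob - y) / n
        grad_b = np.mean(prob - y)
        w = w - step * grad_w
        # Soft-thresholding: the proximal step that sets small weights to exactly 0.
        w = np.sign(w) * np.maximum(np.abs(w) - step * alpha, 0.0)
        b = b - step * grad_b  # the intercept is not penalized
    return w, b


# Synthetic data: only the first of five standardized features affects the outcome.
rng = np.random.default_rng(2)
n = 400
X = rng.normal(size=(n, 5))
true_logits = -1.5 * X[:, 0]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logits))).astype(float)

w, b = fit_l1_logistic(X, y, alpha=0.05)      # mild penalty
w_heavy, _ = fit_l1_logistic(X, y, alpha=10.0)  # penalty large enough to zero everything
odds_ratios = np.exp(w)
```

With standardized predictors, `np.exp(w)` gives the odds ratio associated with a one standard deviation increase in each feature, and a feature with a zero coefficient has an odds ratio of exactly 1.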

Results
Results from the correlation analyses are synthesized together in Figure 1. Each socioeconomic variable is plotted to show the strength of its relationship with county-level vaccine coverage (x-axis) and county-level new COVID-19 incidence (y-axis). The upper left quadrant contains variables that are associated with both increased incidence and poor vaccine coverage, and the bottom right quadrant contains variables that are associated with decreased incidence and better vaccine coverage. Intervariable correlations are shown in Figure S1.

Insurance Coverage and Vaccination Rates
Factors related to housing and income were shown to have strong correlations with lower vaccination rates and higher COVID-19 incidence. Two of these factors, the percentage of uninsured individuals and the percentage of children eligible for free lunch, were both significantly negatively correlated with the percentage of vaccinated individuals (Spearman correlation: −0.460, p-value: <0.001; Spearman correlation: −0.328, p-value: <0.001) (see Figure 2 and Table 1). The correlations between these two factors and the change in COVID-19 incidence from 1 December 2020 to 6 June 2021 were significantly positive (uninsured individuals: Spearman correlation: 0.252, p-value: <0.001; children eligible for free lunch: Spearman correlation: 0.225, p-value: <0.001) (see Figure 3 and Table 1).

Figure 1. Relationship between county-level vaccine coverage and change in COVID-19 incidence rate for county-level features. The x-axis shows the Spearman rank correlation between the county-level feature and cumulative vaccination rate (percent of individuals in the county with 1+ vaccine dose as of 6 June 2021). The y-axis shows the Spearman rank correlation between the county-level feature and the change in COVID-19 incidence rate (defined as the 7-day rolling average COVID-19 incidence rate on 6 June 2021 minus the 7-day rolling average COVID-19 incidence rate on 1 December 2020). Factors are only shown here if their Spearman coefficient is greater than 0.1 along at least one dimension. Factors in pink are related to housing and income, factors in orange are related to environmental risk, factors in purple are related to education level, and factors in blue are related to race.

A county's percentage of motor vehicle crash deaths and teen births were both significantly correlated with the percentage of the county that had been vaccinated (Spearman correlation: −0.543, p-value: <0.001; Spearman correlation: −0.515, p-value: <0.001) (see Table 1). However, the relationships between these two factors and the incidence change in COVID-19 cases from 1 December 2020 to 6 June 2021 (Spearman correlation: −0.047, p-value: 0.71; Spearman correlation: 0.054, p-value: 0.70) were insignificant.
Similar results were seen with the percentage of firearm fatalities. This factor was significantly correlated with the percentage of the county that had been vaccinated (Spearman correlation: −0.487, p-value: <0.001), but the factor's relationship with the incidence change in COVID-19 cases from 1 December 2020 to 6 June 2021 was not as strong (Spearman correlation: 0.091, p-value: 0.001).
There was a slight negative correlation between unemployment level and percentage of individuals that had been vaccinated (Spearman correlation: −0.107, p-value: <0.001) and a positive correlation between unemployment levels and new cases (Spearman correlation: 0.182, p-value: <0.001). There was no significant correlation, however, between unemployment rate and insurance coverage (Spearman correlation: 0.003, p-value: 0.85), indicating that these findings with insurance coverage are not driven by unemployment.
Table 2 shows estimated coefficients of a linear regression for each standardized county-level feature, indicating their effect on vaccination rates after adjusting for age as a confounding variable when that feature is increased by one standard deviation. These regression coefficients, along with their corresponding 95% confidence intervals, are visualized in Figure

Linear regression coefficients showing the relationship between vaccination rates and county-level features. County-level vaccination percentage is defined as 100% × (the number of individuals in the county with at least one COVID-19 vaccine dose as of 6 June 2021)/(total population of the county). An increase in a county-level feature of one standard deviation corresponds to a change in the county-level vaccination rate in percentage after controlling for age as a confounding variable. For each coefficient, error bars corresponding to the 95% confidence intervals are shown as well.

Table 3 demonstrates that the relative risk related to the percent of uninsured adults in the population is 0.577, where counties in the top quartile of vaccine coverage have 9.579% uninsured adults and those in the bottom quartile have 16.614%. In other words, the uninsured rate in counties in the top quartile of vaccination coverage is about 42% lower than in counties in the bottom quartile.
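The arithmetic behind these two figures can be checked directly, assuming that the relative risk is the simple ratio of the two quartile rates quoted above (the symbol RR is introduced here only for convenience):

$$
\mathrm{RR} = \frac{9.579\%}{16.614\%} \approx 0.577, \qquad 1 - \mathrm{RR} \approx 0.423 \approx 42\%.
$$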

Table 3. Comparison of county-level features in the top and bottom quartiles of vaccinated counties. The county-level vaccination rate is defined as the percentage of individuals in the county with at least one COVID-19 vaccine dose as of 6 June 2021. Counties in the top quartile have vaccination rates greater than or equal to 39.12%, and counties in the bottom quartile have vaccination rates less than or equal to 26.58%. Rows have been sorted by relative risk in increasing order.

Table 4 looks at the top and bottom quartiles of counties, ranked by percent of the county that is uninsured. On average, counties in the bottom quartile have an uninsurance rate of 18.7%. Overall, counties with lower insurance coverage rates tended to be more rural, have higher populations of minorities, and have higher populations of young people. The states contributing to these counties the most are Texas, Georgia, Oklahoma, Mississippi, and Florida. The top 25 US counties with the highest proportions of uninsured individuals are detailed in Table S1.

Table 4. General characteristics of all counties and counties with the highest and lowest levels of insurance coverage. The first column (Overall) shows the characteristics for all 3087 counties with vaccination data available. The second column (Top 25%) shows the characteristics for counties with the fewest uninsured individuals per capita (≤7.36%). The third column (Bottom 25%) shows the characteristics for counties with the most uninsured individuals per capita (≥14.57%). Information on state, county population, major town/city, cumulative vaccination till date, and increase in COVID-19 incidence as of 12 April 2021 relative to 1 December 2020 is provided for each group of counties. States with at least one county in the bottom 25% based on insurance coverage are highlighted in red.

Housing Problems and COVID-19 Incidence
The strongest correlates of COVID-19 incidence in 2021 were the percent of households in a county with high housing costs and the percent of households with severe housing problems (Spearman correlations: 0.314, 0.335, p-values: <0.001, <0.001). In addition, other housing- and income-related factors had some of the most positive correlations with COVID-19 incidence among all of the included variables. These factors were rates of households with high housing costs, income inequality, children eligible for free or reduced-price lunch, and unemployment.

Environmental Risk Factors, Education, and Vaccination Rates
Annual incidence of motor vehicle crash deaths and incidence of firearm fatalities were both negatively correlated with vaccine coverage and positively correlated with new COVID-19 case incidence (Table 1). Violent crime was also negatively associated with vaccine coverage. On the other hand, access to exercise opportunities was positively correlated with vaccine coverage and COVID-19 incidence rates. In addition, college completion rates by 2020 were positively correlated with vaccine coverage; however, this variable was only weakly negatively correlated with COVID-19 incidence rates. Finally, social association ranking, as reflected by the number of civic organizations in the county, was weakly positively correlated with vaccine coverage and strongly negatively correlated with COVID-19 incidence rates (Table 1).

Principal Component Analysis
From the principal component analysis, factors associated with poverty and environmental risks had the strongest negative correlations with the first principal component, which accounts for the greatest variation in the data. This includes variables such as teen births, children eligible for free or reduced-price lunch, uninsured population, annual incidence of motor vehicle crash deaths, incidence of firearm fatalities, and low birthweight (Figure S2). In contrast, factors that are highly associated with affluent communities had strong positive correlations with the first principal component. This includes variables such as college degree, access to exercise opportunities, dentists, primary care physicians, and high school graduation (Figure S2).
The second principal component, which captures the second highest variation in the data, was strongly negatively correlated with high housing costs, severe housing cost burden, and severe housing problems. On the other hand, the second principal component was also strongly positively correlated with rural counties and homeownership (Figure S2). These results suggest that factors related to housing problems contribute a significant source of variation in the dataset, and these factors are distinct from the number of uninsured individuals, number of primary care physicians, and other socioeconomic factors that are strongly correlated with the first principal component.
Figure S3 shows the magnitude of coefficients in the L1 logistic regression model to predict which counties are in the top vs. bottom quartile based upon vaccination rate. The logistic regression coefficients, their 95% confidence intervals, and associated p-values are also presented in Table S2. In this model, each coefficient gives the change in the predicted log odds of a county being in the top quartile of vaccination coverage for an increase of one standard deviation in the corresponding county-level feature, holding all other features constant. Among the 44 socioeconomic features considered, 31 features were selected by the model to have non-zero coefficients. The feature representing uninsured adults per 100,000 people (odds ratio: 0.30, 95% CI: (0.23, 0.41), p-value: <0.001) has the strongest negative association with vaccination rate, with all other variables held constant.
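For readers who want to move between the odds ratio reported here and the log-odds coefficient scale used by the model, the conversion is the natural logarithm. The symbol β below is introduced only for this worked example, and the values are derived from the odds ratio and interval quoted above:

$$
\beta = \ln(\mathrm{OR}) = \ln(0.30) \approx -1.20, \qquad \text{95\% CI: } (\ln 0.23,\ \ln 0.41) \approx (-1.47,\ -0.89).
$$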

Discussion
At a high level, this study highlights the fact that socioeconomic factors are highly correlated with county-level vaccination rates and COVID-19 incidence rates. In particular, the proportion of uninsured individuals was observed to be significantly negatively correlated with vaccination rates and positively correlated with COVID-19 incidence rates, and the proportion of individuals with housing problems was observed to be significantly correlated with COVID-19 incidence rates. Prior studies in the United States [14] and Israel [16] have found that socioeconomic vulnerability is linked with lower vaccination rates; however, these studies focus on the concept of a socioeconomic vulnerability index more broadly, rather than on individual socioeconomic factors. Another recent US-based study found that vaccination rates are strongly correlated with housing problems [14,23]; however, this analysis did not consider COVID-19 incidence rates as an additional outcome measure. Furthermore, this current study is novel because it includes a large number of socioeconomic factors in the analysis.
Despite the US government's financial sponsorship of the COVID-19 vaccine [24], this study shows a strong relationship between county-level health insurance status, percent of the county that has been vaccinated, and the incidence of new cases since the beginning of 2021. Of all variables studied, insurance coverage was one of the most strongly associated with vaccination coverage. This may be due to the fact that many individuals receive information about general health, and also about their vaccine eligibility status from their primary care provider [25]. In particular, individuals without health insurance may receive less information about their eligibility for COVID-19 vaccines and less information about the precautions that they can take to reduce their risk of COVID-19 infection in general. Direct messaging from the government to inform individuals that they are eligible for the vaccine regardless of insurance status may have a significant impact on both vaccine coverage and new case incidence rates.
When assessing socioeconomic factors related to COVID-19 incidence in 2021, the strongest relationships were with factors relating to severe housing problems. According to County Health Rankings, this is defined as the percentage of households with at least one of the following four housing problems: overcrowding, high housing costs, lack of kitchen facilities, or lack of plumbing facilities [26]. Given that one of the most effective ways to avoid the spread of COVID-19 is social distancing, the findings related to housing problems and new spread of disease are expected. Along similar lines, prior studies have also shown positive correlations between population density and case incidence [27].
Factors pertaining to race, wealth, housing, and education status are tightly intertwined when it comes to healthcare [28,29]. To this end, it is not surprising that lower education, poorer housing status, income inequality, and racial minority status moved in the same direction in the analyses. All of these factors show some relationship with poorer vaccine coverage and higher recent incidence rates. This work highlights that there are also environmental risk factors that fall into the same pattern. Many factors that fall into the bottom-right quadrant of Figure 1, the quadrant with the most favorable outcomes, pertain to having a higher education, a higher-paying career, general quality of life, social connectivity, and being white. For the most part, factors that fall into the top-right and bottom-left quadrants with mixed outcomes are educational factors signifying a mid-range level of education, or pertain to age-related factors that directly impact vaccine eligibility, such as being under 18 years old.
There are several limitations of this study. First, only 44 of the original 131 variables could be used due to limitations with data availability. Many of the variables lacking complete data were those at the intersection of racial minority status and other socioeconomic factors, such as homicide rates within specific racial segments. Specifically, 52 of the 63 incomplete variables were specific to racial minority groups, and all data variables with less than 35% completeness were specific to racial minority groups. Had these data been available, the study may have been able to parse out more specific relationships between COVID-19, vaccination coverage, and racial minorities. Additionally, one of the challenges in assessing both vaccine coverage and new incidence rates is the diversity of state roll-out plans in terms of timeline and eligibility criteria. A future retrospective analysis comparing individual states is an important next step to be taken when more data have been collected across the nation.
There are multiple promising areas for future research. Targeted questionnaires and patient focus groups could be used to determine the reasons that patients without insurance coverage are vaccinated at lower rates. In addition, this line of research could be used to come up with interventions and public policy to make vaccine distribution more equitable, especially among the uninsured population. Similar follow-up research focusing on populations with housing problems could be used to determine the reasons that COVID-19 incidence rates are elevated in this population and which interventions may be most effective.

Conclusions
The main implications of this research are that socioeconomic factors are strongly associated with vaccination rates and COVID-19 incidence rates. In particular, the results show that populations without health insurance and with housing problems are particularly vulnerable, highlighting the need to advance COVID-19 vaccination campaigns for these groups. These findings reinforce and build upon what is known about the vast socioeconomic disparities that are still ongoing in the US surrounding the COVID-19 pandemic.
It was shown that the most significant factors associated with low vaccination rates at the county level are those related to poverty and environmental safety, such as uninsurance prevalence, teen births, firearm fatalities, and motor vehicle crash deaths. On the other hand, the most significant protective factors are related to college education, social connectivity, and high prevalence of medical professionals. Among the factors associated with high COVID-19 incidence rates, severe housing problems and high costs of housing were found to have the strongest correlations. Taken together, these findings suggest that addressing socioeconomic inequalities will be important in order to increase vaccine coverage across the United States and to reduce future COVID-19 surges in counties with socioeconomically vulnerable populations.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/vaccines9090973/s1, Figure S1: Heatmap of Spearman's rank correlation for selected county-level features, Figure S2: Correlation matrix between principal components and county-level features, Figure S3: LASSO logistic regression coefficients and confidence intervals, Table S1: Top 25 US Counties ranked by percent of uninsured population, Table S2: LASSO logistic regression coefficients to differentiate top and bottom quartile vaccinated counties.
Data Availability Statement: After publication, the data will be made available to others upon reasonable requests to the corresponding authors (colin@nference.net, venky@nference.net). A proposal with a detailed description of study objectives and the statistical analysis plan will be needed for evaluation of the reasonability of requests.
Conflicts of Interest: GD, MC, EL, CP, and VS are employees of nference and have financial interests in the company. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.