Competing Heterogeneities in Vaccine Effectiveness Estimation

Understanding the waning of vaccine-induced protection is important for both immunology and public health. Population heterogeneities in underlying (pre-vaccination) susceptibility and vaccine response can cause measured vaccine effectiveness (mVE) to change over time, even in the absence of pathogen evolution and any actual waning of immune responses. We use multi-scale agent-based models parameterized using epidemiological and immunological data, to investigate the effect of these heterogeneities on mVE as measured by the hazard ratio. Based on our previous work, we consider the waning of antibodies according to a power law and link it to protection in two ways: (1) motivated by correlates of risk data and (2) using a within-host model of stochastic viral extinction. The effect of the heterogeneities is given by concise and understandable formulas, one of which is essentially a generalization of Fisher’s fundamental theorem of natural selection to include higher derivatives. Heterogeneity in underlying susceptibility accelerates apparent waning, whereas heterogeneity in vaccine response slows down apparent waning. Our models suggest that heterogeneity in underlying susceptibility is likely to dominate. However, heterogeneity in vaccine response offsets <10% to >100% (median of 29%) of this effect in our simulations. Our study suggests heterogeneity is more likely to ‘bias’ mVE downwards towards the faster waning of immunity but a subtle bias in the opposite direction is also plausible.


Introduction
Providing accurate estimates of vaccine-induced protection is essential in guiding public health policy.However, many factors complicate our ability to estimate population level vaccine effectiveness (VE) such as prior immunity, underlying health risks, timing of vaccination, inconsistent hazards in different locations, and other confounders.Further adding to uncertainty is the common presence of observed waning which may reflect actual waning of immune responses, introduction of different strains, or may be an artifact coming from heterogeneity among individuals.
Many studies report fast, intraseasonal waning of vaccine-induced protection, particularly for viruses such as influenza and SARS-CoV-2 (1-3); however, various effects can bias this conclusion.Depletion of susceptible individuals (also called the frailty effect in biostatistics) can bias estimates (4,5).Heterogeneity in exposure risk, even if exactly the same in the vaccinated and unvaccinated groups, tends to bias the vaccine effectiveness estimates downwards potentially leading to spurious claims of waning (6,7).If natural immunity is not taken into account, merely having a "leaky" vaccine (i.e. a vaccine that provides partial protection) can bias estimates downwards (5,7).This complicates the estimation of actual waning of vaccine-induced protection which is expected to occur as many correlates of protection, e.g.neutralizing antibodies, have been shown to decrease over time (8)(9)(10)(11).
In this paper, we focus on the hazard ratio as the measure of vaccine effectiveness, as the hazard ratio corresponds to relative risk at a particular moment in time.To determine the direction and magnitude of bias, we simulate an epidemic in a population under various frailty and vaccine protection distributions in the absence and presence of waning and evaluate the interplay between these factors.Commonly, hazard ratios are estimated with the Cox proportional hazards model and there are several standard extensions utilized in real world studies (12)(13)(14)(15)(16)(17)(18)(19).While Cox proportional hazards models were not intended to be time-varying, several approaches exist in order to make it applicable for measuring waning vaccine effectiveness.We utilize time category-vaccine interactions (henceforth TVI) which, in contrast to the commonly used Cox method utilizing the scaled Schoenfeld residuals, should accurately measure the hazard ratios even in the presence of extreme (observed) waning (20).
If vaccine effectiveness is assessed via the hazard ratio and the outcome of interest is the first infection post-vaccination, heterogeneity in baseline (pre-vaccination) susceptibility causes measured vaccine effectiveness (mVE) to decline over time, whereas heterogeneity in response to vaccination causes mVE to increase over time.Hence any apparent change in mVE may reflect any combination of selection on these heterogeneities in addition to the biological processes of pathogen evolution and waning of immune responses.In this paper, we first illustrate the problem using standard statistical distributions for the heterogeneities.We then provide concise formulas that give the net effect of these heterogeneities on the hazard rates and hazard ratio.Next, using parameter values based on epidemiological and immunological data incorporating waning of antibodies, we use agent-based simulations to investigate the magnitude of these opposing effects.Our models suggest that the larger effect is from heterogeneity in baseline susceptibility but that heterogeneity in vaccine response may offset a substantial fraction of that effect.This exacerbation of observed waning may explain the sometimes negative VE reported in some studies (21,22).

Methods
We consider an agent-based model of acute viral infection with a constant background force of infection where we introduce heterogeneity in individual infection risk, heterogeneity in vaccineinduced protection, or both.For most scenarios to be described, vaccine protection follows the "leaky" model, wherein each vaccinated individual experiences a certain percent reduction in risk.Additionally, we model 40% out of a cohort of 100,000 to be vaccinated, in line with the CDC estimate for influenza vaccine coverage (23).Since we model an acute viral infection, we assume sterilizing immunity upon infection for the remainder of the one-year time period considered.All simulations were run in Julia version 1.3.1 and statistical analysis was completed in R version 4.2.1.
For heterogeneity in underlying risk (risk in the absence of vaccination), we select a daily risk rate for both the vaccinated and unvaccinated groups from a single gamma distribution.For heterogeneity in vaccine protection, we select vaccinated individuals' protection from a variety of distributions including beta distributions, in contrast to leaky, homogeneous vaccination.To establish a comparison, we use the mean vaccine efficacy in the context of no epidemic, VENE, which thereby removes the effect of selection.We then calculate vaccine effectiveness using a time categoryvaccine-interaction (TVI) as the independent variable of the Cox proportional hazards model in order to find a time-varying estimate of protection.The TVI method has been shown to behave accurately in circumstances where waning occurs rapidly (20).

Susceptibility to Infection (Frailty) Distribution
When considering only heterogeneity in underlying frailty, the given gamma distributions (parameterized as Gamma(ag, bg) where ag/bg is the mean and ag -0.5 is the coefficient of variation (CoV)) can produce the appearance of waning vaccine effectiveness though as ag increases this effect lessens.This appearance of waning corresponds to what many statistical studies have posited would occur and termed the "frailty effect" or "frailty phenomena" (24)(25)(26)(27); in epidemiological studies, this is sometimes alternatively called "survivor bias" or "depletion of susceptibles" effect.Some studies have also simulated this effect but have not compared the qualitative effect of different distributions (5,6).In Figure 1 we show how different gamma distributions with the same mean can cause differing amounts of perceived waning (waning in mVE) when the true vaccine protection is in fact constant and leaky.

Vaccine Efficacy Distribution
In simulated studies, two modes of vaccine efficacy are often compared: "leaky" vaccination where protection is incomplete but reduces each individual's chance to become infected by a specified amount (e.g.50%) or "all-or-nothing" vaccination where a fraction of individuals receive complete protection from the vaccine and other receive no protection.A limited number of studies have also considered normal-like distributions for vaccine protection (7).We consider two main beta distributions parameterized by Beta(ab, bb) where ab/(ab + bb) is the mean (held here at 0.5): the normal-like Beta(2,2) and the U-shaped Beta(0.5,0.5).These distributions as well as their resultant dynamics and mVE estimates are compared in Figure 2 which shows that for both of these cases, distributions in vaccine protection bias VE estimates upwards.Non-symmetric vaccine protection distributions were also tested (see Supplement Figure S1) but did not change the direction of bias, showing increase in mVE.
As seen in Figures 1 and 2, singly the two types of heterogeneity appear to contribute in opposite directions; beta distributed vaccine protection tends to skew the estimate upwards while gamma distributed underlying risk tends to skew the estimates downwards and to a greater extent.As these effects compete when combined, we constructed a predictor to determine which direction, if any, the competing distributions change mVE from VENE.

Effect of Selection on Heterogeneities
Assuming hazard rates for a given individual are not time-varying and ignoring stochasticity, ̅ , the average hazard rate in not-yet-infected individuals, is given by the following equation Here () is the probability density function for the hazard rates at time 0; we allow () to be a generalized function (e.g. a delta function) so the formula also applies to discrete probability distributions.Let  denote the random variable for ().Let () and () be the moment generating and cumulant generating functions of −, respectively.Since the denominator of Equation 1is () and the numerator is − ) () and () = ln/()0, we get the following relation, Hence, the first derivative of −̅ () is the second cumulant (i.e. the variance) of −, the second derivative of −̅ () is the third cumulant of − (related to the skewness), the third derivative is the fourth cumulant (related to excess kurtosis), and so on.Since − can be viewed as fitness, the above is essentially equivalent to a generalization of Fisher's fundamental theorem of natural selection, according to which the n-th derivative of mean fitness over time is the n+1 cumulant (28,29).
Since in most cases the force of infection is not constant, we further extend this relation between the hazard rates and cumulants.If we let  , = () ⋅  , where  , is individual i's hazard rate, FOI is the force of infection at time t, and qi is the individual's relative hazard, we can recover the above relation in terms of a transformation of time  = ∫ () where  / is the cumulant generating function for the distribution of -qi.
We now consider the hazard ratio comparing vaccinated to unvaccinated individuals, () =  v ;/ 0 ; , where  v ; is the average hazard rate in not-yet-infected vaccinated individuals and  0 ; the analogous for unvaccinated individuals.To find the rate of change of the hazard ratio and recalling that the first derivative of the mean hazard rate is the second cumulant (i.e.variance) of the hazard rates, we apply the quotient rule (or the quotient and chain rules for the case of time-varying FOI) which yields the following equation where  v / is the variance of the vaccinated group's hazard rates and  0 / is the variance of the unvaccinated group's hazard rates.
If at time t=0 underlying risk is distributed Gamma(ag, bg) and vaccine protection Beta(ab, bb) and the two are independent,  6 −  7 −  7 determines the direction in which these heterogeneities affect HR(t) as sign[ ) (0)] = sign/ 6 −  7 −  7 0. ( Hence, even in this simple scenario, heterogeneity can cause either an increase or decrease in mVE.

Vaccine Effectiveness Under Competing Heterogeneities
We consider the interplay of heterogeneities in both underlying frailty and vaccine protection and compare vaccine effectiveness estimates to our predicted value based on Equation 4. We find that overall estimates tend to oscillate around the predicted value (purple dashed), as seen in Figure 3.
Here we find that, depending on the underlying distributions, the mVE can increase, decrease, or remain steady around VENE, the vaccine efficacy under no epidemic.Likewise, the extent of observed change is dependent on the interplay of both distributions with some changing a negligible amount and others shown here changing >20% compared to the VENE value.Furthermore, for all of the combinations shown in Figure 3, we maintained a VENE of 50%, but the initial protection level given also mediates how far from VENE a given distribution can change, as seen in Figure S2.
Even larger changes of mVE can be found by either extending the study period, allowing for more individuals to get infected and thus contribute to the over-or underestimation, or by considering a more extreme distribution.Additionally, it is theoretically possible to generate nonmonotonic behavior, as shown in Figure S3, where mVE can go both up and down from the VENE value.

Modeling Waning Protection
Without direct challenge studies, estimating heterogeneity in vaccine effectiveness can be fraught.However, many studies use antibody titers as a correlate of protection, including those for SARS-CoV-2 (30)(31)(32).Using data on waning SARS-CoV-2 antibodies, we created a distribution for initial protection in a population that then wanes over time.We model waning of antibodies as a power law of the form in line with our previous studies (9)(10)(11).The exponent -1 corresponds to relatively fast waning.Here C for each individual is drawn from a log normal distribution with a standard deviation of 0.75-1.5 natural log in line with (11,33,34).In previous studies, we analyzed antibodies and waning starting at day 42 post-infection or vaccination.We assume here that all individuals in the vaccinated group are fully vaccinated before the study begins.After which we correlate an individual's antibody level to their individual VE using where the antibody to VE conversion exponent is based on an adjustment to the relationships given in (34) for HAI titers and risk of infection with exponents of approximately -0.35.Because HAI titer is only one component of the antibody response, we slightly increased the strength of the relationship and used -0.5.We call this the risk-correlate model.
We also consider a different relationship between VE and antibody based on a within-host stochastic extinction model where here a = 10 is the death rate of virions, R0 =10 is the basic reproductive number at the between-cell level (35),  ⋅ Ab represents the scaled level of antibody, and m = 0.5 is the product of the number of viral particles per inoculum and a virion's probability of successfully infecting a cell in the absence of antibodies (see Supplement for derivation and additional details).This relationship gives qualitatively similar results to the risk-correlate model.
Antibody is scaled to give an approximately 90% initial vaccine protection in the population when the standard deviation for natural log of antibody is equal to 1, again in line with (11,33,34); inherently, as standard deviation is varied, this causes the initial average of vaccine-induced protection in the population to vary slightly.This distribution replaces the beta distributions used in Section 3.4 for vaccine protection while underlying frailty in both groups continues to be modeled with gamma distributions with CoV based on (36) who estimate CoV of 0.7-1.5 (mean of 0.9) based on contact surveys of very short duration (e.g. two days) and CoV of 0.3-0.9(mean of 0.5) based on contact surveys when aggregating by 1 year age categories.As elaborated by (36), the former is likely an over estimate whereas the latter is likely an underestimate, hence we consider CoV of 0.5-1.
Using the risk-correlate model, Figure 4 compares a simulation without any heterogeneity (Figure 4A) to simulations with just heterogeneity in (antibody-induced) vaccine protection or underlying frailty and simulations with heterogeneity in both protection and underlying frailty, where underlying frailty is characterized by the coefficient of variation (CoV) of the gamma distribution where CoV=1/ R  6 and the mean of the gamma distributions are held the same as the previous figure.Recapitulating the earlier simulations, heterogeneity in vaccine protection results in an increase relative to VENE.However, heterogeneity in underlying frailty overwhelms this positive trend and causes VE estimates to be underestimated.Similar qualitative results are also given using the unadjusted power law exponent estimated from the HAI titers (a likely underestimate) as shown in the Supplement.
Using the within-host stochastic model given by Equation 8 yields similar results, as seen in Figure 5 to the risk-correlate model.For both models the degree of the over-or underestimation (relative to VENE) at the end of the season is given in Table S1.Again, for plausible acute infectious disease parameters, mVE tends to be approximately the same as or underestimates VENE.

Discussion
Heterogeneity complicates the ability to accurately estimate the extent of vaccine-induced protection in a population as well as if protection is truly waning or merely appears to be waning.While this has been extensively investigated for underlying frailty (4-6) the confounding effect of heterogeneity in vaccine protection has been less thoroughly explored.In many cases (7,37,38), only all-or-nothing and leaky vaccines are investigated but we argue that these are edge cases that are wonderful for illustrating theory but are unlikely to accurately model real-world responses to vaccination.In particular, this study considers a much wider array of distributions and shows that the net effect of selection on these heterogeneities can cause either an increase or a decrease in mVE with the effect given by concise and interpretable formulas.
We parameterize our model using data from epidemiological and immunological studies and also incorporate within-host modeling of the immune system and pathogen.We find that, within the estimated ranges, mVE is likely to be underestimated; however, the degree of underestimation is quite varied with heterogeneity in vaccine response offsetting anywhere from <10% to >100% (median of 29%) of the effect of heterogeneity in underlying susceptibility alone (Table S1).Therefore, vaccine effectiveness estimates should be interpreted with caution, especially over time as the heterogeneities continue to accumulate differential outcomes.While mVE seems more likely to underestimate than overestimate VENE, underestimation should not be assumed as our range of plausible parameters includes cases without any underestimation.
Previous studies have used all or nothing or beta distributions to model vaccine induced protection (7,37,38).However, modeling waning with such distributions is not straightforward.Our technique of modeling decay of immune responses (in our case, antibodies) at the individual level and converting these immune responses into individual level VE is more transparent and possibly easier to implement than modeling waning by shifting a beta distribution over time.
There are some important caveats to interpreting our results.Although Equations 2-4 give the effect of selection on the hazard rates and hazard ratios, the hazard rates and ratios are also affected by regression towards (or away from) the mean.Regression towards the mean would tend to mitigate the effects of both heterogeneities.Secondly in our simulations heterogeneity in underlying susceptibility and heterogeneity in vaccine response are uncorrelated at baseline.Allowing for correlations permits for more diverse outcomes and affects not only waning but also the initial level of mVE.It should be noted that Equations 2-4 are valid even in the presence of such correlations.Extending our simulations to include such correlations and also regression towards or away from the mean is straightforward.In the current study, we focused only on the hazard ratio as a measure of VE.We do not consider the effect of seasonality, spatial structure, or epidemic waves.We also did not consider all possible combinations of distributions, but rather limited ourselves to those implied by data for acute respiratory infections.
Overall, we give concise statistical formulas for understanding the effect of selection on mVE and give estimates of the magnitude of this effect under a variety of situations based on both antibody and frailty data (10,36).Our results suggest that VENE is likely but not certain to be higher than mVE due to variation at the individual level and that the level of discrepancy is dependent on the specifics of the population and vaccine meaning that a simple overall correction cannot be applied.Further exploration for how to correct for these factors statistically or via study design is essential to more accurately understanding vaccine-induced protection.which increased 6%.Keeping the mean the same but changing the distribution shape, as seen in Panel D, to Beta(0.5, 0.5), we likewise see similar infection dynamics but higher mVE, increasing by approximately 10%.Here, VENE is the vaccine efficacy if there was no epidemic and vaccine protection is constant and leaky at 50%.Under no heterogeneities, as in Panel A, mVE is extremely close to VENE; however, the introduction of heterogeneity biases the estimate.In the first column, showing simulations lacking heterogeneity in underlying susceptibility, heterogeneity in antibody biases the estimate upwards.In the first row, without any heterogeneity in antibody, the bias is downwards.With both, the underlying heterogeneity in susceptibility outcompetes heterogeneity in VE and leads to an underestimate relative to VENE, though not as extreme as would be seen if a vaccine was purely homogeneously leaky except in Panel H where the two effects approximately cancel each other out.SD indicates the standard deviation of antibody (at a given time) in natural logs; higher SD in antibody translates to higher variability in vaccine protection via Equation 6. CoV indicates the coefficient of variation in underlying susceptibility at the beginning of the simulation.8) for antibody-mediated, vaccine-induced protection yields similar results to the risk-correlate model (Equation 7).As expected, increasing heterogeneity in underlying susceptibility (moving left to right) pushes mVE downwards while increasing heterogeneity in vaccine-induced protection (moving top to bottom) pushes mVE upwards.When these effects are mixed, as in Panels E, F, H, and I, the heterogeneities compete.vaccination caused no more 1% observed change in mVE in either direction.When ag is 1, change in mVE ranged from -7 to -17% while when agis 2 the change in mVE ranged from -4 to -10% with the largest decreases being at the mid-range values for both.These changes are plotted explicitly in Figure S2.In Panel A, when there is no heterogeneity in individual level VE, regardless of chosen gamma distribution mid-range values of VE have the greatest capacity to change based on the difference from the initial value and the mean of the last five time points.In Panel B, if there is no heterogeneity in susceptibility but only in vaccine-induced protection as given by either a beta distribution or all-or-nothing protection, again midrange values have the largest capacity for change but this time bias upwards.For the beta distributions, F=1+ab +bb indicates the fold reduction in variance relative to all-or-nothing protection with the same mean.For VE = 50%, the F=2 point corresponds to the main text Figure 2F and the F=5 point corresponds to Figure 2C.

Predictions Not Necessarily Monotonic
While in the main text a variety of distributions were considered, we also tested additional distribution parameters for both the underlying heterogeneity in susceptibility and vaccine protection.Most of these combinations yielded monotonic predictions for change in VE; however, some, like the one pictured in Figure S3, can be non-monotonic.Here the predicted value decreases after a short period as can be seen by the dashed purple line appearing below the original value and then later increases past the original line around day 225 before dipping down and up yet again.as follows from the assumption that Rs, the actual number of cells infected by a given cell early in infection, is Gamma Poisson distributed with Rs ~ Poisson(mean=M) and M~Gamma(ag=1, bg= - * -1 ).Additionally, we assume that  - * is proportional to the infectivity of the virus, so where  -is the basic reproduction number of a cell in the absence of antibody.Building upon Equations S2 and S3, we then consider an inoculum with n virions where the probability of infection which escapes early stochastic extinction is This then becomes a hazard ratio HRW by normalizing the above, yielding where we set  = $D E .
We take the viral death rate to be a=10 in line with [2].For the basic reproduction number in the absence of antibody, we use  -= 10 which is in the range estimated in [2].
For our combined parameter  = $D E , we have additional considerations.
In the absence of antibody, the probability of infection taking hold is the denominator of Equation S7.Here, m in (0,1] corresponds to Pn of up to 59% per exposure in naive individuals and hence covers both small and moderately large exposures.As seen in Figure S5 changing m from near 0 to 1 does not result in large changes in the hazard ratio and therefore we chose m=0.5 as a representative for this range.If m is extremely large, this causes the hazard ratio to essentially become all-ornothing.

Figure S5
: Effect of m on the hazard ratio.Hazard ratio results are not particularly sensitive to choice of m in (0,1] and thus a midrange value was selected (m=0.5).

Comparison of Risk-Correlate and Within-Host Stochastic Model Outcomes
The difference in end-season estimates from VENE for both the risk-correlate model and the withinhost stochastic model are given in Table S1, below.

Table S1: Average of the difference between mVE and VENE over the last three time points.
These values indicate the effect size of heterogeneity on mVE in our simulations.Heterogeneity in underlying susceptibility alone led to mVE underestimating VENE by 3.6%, 8%, 15.4%, and 22.9%.Adding heterogeneity in vaccine response offset anywhere from a negligible (<10%) to >100% (median: 29%) of the effect of heterogeneity in underlying susceptibility alone.

Figure 1 :
Figure 1: Gamma distributed underlying risk with vaccine protection (VE) homogeneous at 50% protection.For underlying risk described by Gamma(0.2, 80), the distribution of the unvaccinated and vaccinated population's daily risk is given in blue and purple as shown in Panel A. Panel B shows the fractions of the susceptible individuals in each group.Panel C shows the estimated vaccine effectiveness (mVE) which drops markedly below the given level we expect of the leaky vaccine (black), decreasing to 17% from the original 50%.Panels D-F, display the same results for Gamma(1, 400).As seen in F, the estimated vaccine effectiveness is below the true value but is not as severe as Panel C, only decreasing to 38.5%.

Figure 2 :
Figure 2: Beta distributed vaccine protection with homogeneous underlying risk.Panels A-C, give the results for vaccine protection distributed Beta(2,2), where Panel A displays the resultant

Figure 3 :
Figure 3: Competing heterogeneities allow for diverse outcomes in mVE.Per Equation 5, we predict an increase in observed vaccine effectiveness in Panels A-C, no change in Panels D-F, and a decrease in Panels G-I.For all panels our predicted value (purple dashed) closely matches the mVE (green).Given that an individual's vaccine protection is constant for these simulations (black), this reiterates the difficulty in interpreting changes in mVE.

Figure 4 :
Figure4: mVE for plausible acute infectious disease parameters is likely to underestimate VENE but not necessarily in all circumstances.Under no heterogeneities, as in Panel A, mVE is extremely close to VENE; however, the introduction of heterogeneity biases the estimate.In the first column, showing simulations lacking heterogeneity in underlying susceptibility, heterogeneity in antibody biases the estimate upwards.In the first row, without any heterogeneity in antibody, the bias is downwards.With both, the underlying heterogeneity in susceptibility outcompetes heterogeneity in VE and leads to an underestimate relative to VENE, though not as extreme as would be seen if a vaccine was purely homogeneously leaky except in Panel H where the two effects approximately cancel each other out.SD indicates the standard deviation of antibody (at a given time) in natural logs; higher SD in antibody translates to higher variability in vaccine protection via Equation6.CoV indicates the coefficient of variation in underlying susceptibility at the beginning of the simulation.

Figure 5 :
Figure 5: Within-host stochastic model.Utilizing the within-host stochastic model (Equation8) for antibody-mediated, vaccine-induced protection yields similar results to the risk-correlate model (Equation7).As expected, increasing heterogeneity in underlying susceptibility (moving left to right) pushes mVE downwards while increasing heterogeneity in vaccine-induced protection (moving top to bottom) pushes mVE upwards.When these effects are mixed, as in Panels E, F, H, and I, the heterogeneities compete.

Figure S2 :
Figure S2: Effect of starting VE protection level.Different VE(0) values affect how much mVE can change.In Panel A, when there is no heterogeneity in individual level VE, regardless of chosen gamma distribution mid-range values of VE have the greatest capacity to change based on the difference from the initial value and the mean of the last five time points.In Panel B, if there is no heterogeneity in susceptibility but only in vaccine-induced protection as given by either a beta distribution or all-or-nothing protection, again midrange values have the largest capacity for change but this time bias upwards.For the beta distributions, F=1+ab +bb indicates the fold reduction in variance relative to all-or-nothing protection with the same mean.For VE = 50%, the F=2 point corresponds to the main text Figure2Fand the F=5 point corresponds to Figure2C.