Effectiveness of Statins for Primary Prevention of Cardiovascular Disease in Low- and Medium-Risk Males: A Causal Inference Approach with Observational Data

In this study, we analyzed the effectiveness of statin therapy for the primary prevention of cardiovascular disease (CVD) in low- and medium-risk patients. Using observational data, we estimated effectiveness by emulating a hypothetical randomized clinical trial comparing statin initiators with statin non-initiators. Two approaches were used to adjust for potential confounding factors: matching and inverse probability weighting in marginal structural models. The estimates of effectiveness were obtained by intention-to-treat and per-protocol analysis. The intention-to-treat analysis revealed an absolute risk reduction of 7.2 (95% confidence interval (CI95%), −6.6–21.0) events per 1000 subjects treated for 5 years in the matched design, and 2.2 (CI95%, −3.9–8.2) in the marginal structural model. The per-protocol analysis revealed an absolute risk reduction of 16.7 (CI95%, −3.0–36) events per 1000 subjects treated for 5 years in the matched design and 5.8 (CI95%, 0.3–11.4) in the marginal structural model. The indication for statin treatment for primary prevention in individuals with low and medium cardiovascular risk appears to be inefficient, but improves with better adherence and in subjects with higher risk.


Introduction
Multiple randomized clinical trials (RCTs) have shown that lipid-lowering statins are effective in reducing cardiovascular disease (CVD) morbidity and mortality in individuals with high risk of CVD [1]. Several secondary analyses of RCT data have shown similar relative efficacy in low- and medium-risk individuals [2][3][4]. However, it is difficult to quantify the effect of statins in absolute terms as this is largely dependent on the characteristics and the baseline risk level of the population studied.
Moreover, RCTs are usually carried out in controlled settings, where the population is carefully selected and subjects are closely followed. These conditions, which are ideal for demonstrating the efficacy of a drug, do not allow us to measure the effectiveness of a drug in real-world conditions. Observational studies that emulate a "target trial" can be used to overcome this limitation, and can provide estimators that are as valid as those of RCTs [5][6][7], with the added advantage of studying a population in the context of real-world clinical practice.
However, non-experimental studies have some limitations. It should be noted that in routine practice, not all individuals with low or medium CVD risk have an indication for statin therapy [8]. Therefore, it is difficult to evaluate the effectiveness of statins in a population with these characteristics using observational data, since certain risk profiles (e.g., very low CVD risk) will be very scarce or entirely absent from the study group. In other words, certain individuals, owing to their baseline characteristics, will have zero or very low probability of receiving the treatment. The absence of a non-zero probability of being assigned to one of the treatment levels indicates that the positivity condition is not fulfilled, and therefore the results obtained may be biased [9]. To avoid this, various statistical techniques such as matching or sample restriction can be used to eliminate subjects with extreme values for key covariates. However, the application of these approaches implies creating different populations, which can lead to distinct outcomes in terms of absolute risk reduction, and can therefore complicate the comparability of the results.
Our objective was to analyze the effectiveness of statin therapy for primary prevention of CVD in low- and medium-risk patients by applying a target trial emulation design and comparing the results of distinct analytical approaches.

Materials and Methods
We conducted an observational study that emulated the design of several successive clinical trials, as proposed by Hernán et al. [5,10,11], in subjects undergoing treatment between July 2010 and June 2019.
Observational data from the Aragon Workers Health Study (AWHS) cohort [12] were used. The AWHS is a prospective study designed to evaluate the evolution of CVD risk factors and their association with subclinical atherosclerosis in a cohort of 5650 middle-aged workers at an automobile factory in Spain. Follow-up began in 2009 and continues today.

Study Population
In each "trial", the Systematic Coronary Risk Evaluation (SCORE) [13] was performed for each subject and their cardiovascular risk was calculated according to current European guidelines [8], on the trial start date. After calculating this risk, which combines the SCORE with cholesterol and blood pressure levels, and with history of CVD, diabetes mellitus, and chronic kidney disease, only subjects with low or medium CVD risk, according to the guidelines, were included. Furthermore, in order to only include subjects who were candidates to begin treatment for primary prevention, the following exclusion criteria were applied: (i) subjects who received a statin prescription during the 6 months preceding the trial start date; (ii) subjects with less than 6 months of follow-up; and (iii) subjects who experienced a CVD event at some point prior to the start of the trial. To ensure data quality and to control for confounding factors, subjects for whom there were no data available on tobacco use, body mass index (BMI), systolic blood pressure (SBP), low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), or blood glucose in the previous 12 months were excluded.

Data Sources
Information on pharmaceutical dispensing was obtained from the Aragon pharmaceutical consumption information system. Events were identified using the administrative database of the Minimum Basic Data Set (MBDS), which codes hospital discharges, and the Aragon hospital emergency information system. Data on the number of visits to primary care were obtained from the Aragon primary care information system. The remaining clinical and analytical variables necessary to calculate CVD risk and to control for confounding were obtained from AWHS databases. Mortality data were obtained from the Spanish National Mortality Registry.

Variables Used
The studied drugs were identified using Anatomical Therapeutic Chemical (ATC) codes, as proposed by the World Health Organization in its ATC/DDD Index 2021. ATC codes corresponding to statin therapy were as follows: C10AA (hydroxymethylglutaryl-CoA reductase inhibitors); C10BA (combinations of various lipid modifying agents); and C10BX (lipid modifying agents in combination with other drugs). Statin prescriptions filled on a monthly basis at a pharmacy during the study period were recorded. Patients were considered to have stopped using statins when at least 2 months passed without filling a prescription at a pharmacy.
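The discontinuation rule above can be sketched in a few lines. This is a minimal illustration in Python (the authors worked in R), and it assumes that months are represented as consecutive integers and that "at least 2 months without filling a prescription" means two consecutive calendar months with no recorded fill. The function name and its parameters are hypothetical.

```python
def discontinuation_month(fill_months, start_month, end_month, gap=2):
    """Return the first month in which `gap` consecutive months have passed
    without a filled statin prescription, or None if that never happens.

    fill_months: iterable of integer month indices with a recorded fill
    start_month, end_month: inclusive range of follow-up months to scan
    gap: number of consecutive months without a fill that defines stopping
         (2 in the text; treating them as consecutive is an assumption)
    """
    filled = set(fill_months)
    months_without_fill = 0
    for month in range(start_month, end_month + 1):
        if month in filled:
            months_without_fill = 0          # a fill resets the counter
        else:
            months_without_fill += 1
            if months_without_fill >= gap:
                return month                 # patient considered to have stopped
    return None


# One missed month (month 3) is tolerated; months 6 and 7 trigger discontinuation.
stopped_at = discontinuation_month([0, 1, 2, 4, 5], start_month=0, end_month=12)
```

In a per-protocol analysis, the returned month would be a natural candidate for the censoring time of an initiator.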
The first diagnoses of major adverse cardiovascular events (MACE) in the emergency department or upon admission to hospital were considered main events, as well as deaths in which a MACE was the cause of death. To assess the effectiveness of statins in preventing cardiovascular events, a conservative definition of MACE was chosen [14]. Thus, the ICD codes used to identify MACE were I21 and I22 (acute myocardial infarction) for coronary artery disease and I60-I63 (nontraumatic intracranial hemorrhage and cerebral infarction) for cerebrovascular disease.
Information on the following covariates was collected: (i) age at the beginning of the study; (ii) number of visits to primary care in the 6 months prior to MACE; (iii) smoking, divided into 3 categories (smoker, non-smoker, and ex-smoker); (iv) BMI; (v) LDL-C levels; (vi) HDL-C levels; and (vii) blood glucose levels. For each of these covariates each patient was assigned the value recorded prior and closest to the trial start date. LDL cholesterol was calculated using the Friedewald formula [15].
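For reference, the Friedewald formula estimates LDL-C from total cholesterol, HDL-C, and triglycerides, all in mg/dL. The Python sketch below is illustrative only: the function name is hypothetical, and rejecting triglyceride values of 400 mg/dL or more reflects the usual validity limit of the formula, not a rule stated in this study.

```python
def friedewald_ldl(total_cholesterol, hdl_c, triglycerides):
    """Estimate LDL-C (mg/dL) as total cholesterol - HDL-C - triglycerides / 5.

    All inputs are in mg/dL. The formula is generally regarded as unreliable
    when triglycerides are 400 mg/dL or higher, so such inputs are rejected
    here (an assumption; the study does not say how these cases were handled).
    """
    if triglycerides >= 400:
        raise ValueError("Friedewald formula is not valid for triglycerides >= 400 mg/dL")
    return total_cholesterol - hdl_c - triglycerides / 5.0


# Example with made-up values: 200 - 50 - 150 / 5 = 120 mg/dL
ldl_example = friedewald_ldl(200, 50, 150)
```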

Analyses
Applying the aforementioned selection and exclusion criteria, a clinical trial was emulated for each month between July 2010 and June 2019. The unit of analysis was "subject-trial", since each subject could participate in more than one trial throughout the follow-up period.
For the selected subjects, two groups were established depending on treatment status during the month the trial began: "initiators" and "non-initiators". "Initiators" were subjects who began statin treatment during the month the trial began. "Non-initiators" were those who were not receiving statin therapy during the month of study initiation.
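The construction of subject-trials can be sketched as follows. This is a simplified Python illustration, not the authors' R code: the data structures `eligible` and `initiations` are hypothetical, and the eligibility check (risk category, exclusion criteria, complete covariates) is assumed to have been done beforehand.

```python
def monthly_trials(first=(2010, 7), last=(2019, 6)):
    """List the (year, month) start of every emulated trial, inclusive."""
    year, month = first
    trials = []
    while (year, month) <= last:
        trials.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return trials


def subject_trials(eligible, initiations):
    """Build one row per subject-trial.

    eligible:    dict subject_id -> set of (year, month) in which the subject
                 meets all inclusion criteria (assumed precomputed)
    initiations: dict subject_id -> (year, month) of statin initiation
    Returns tuples (subject_id, trial_start, is_initiator).
    """
    rows = []
    for trial in monthly_trials():
        for subject_id, months in eligible.items():
            if trial in months:
                is_initiator = initiations.get(subject_id) == trial
                rows.append((subject_id, trial, is_initiator))
    return rows


# Toy example: subject "a" is a non-initiator in July 2010 and an initiator in August 2010.
rows = subject_trials(
    eligible={"a": {(2010, 7), (2010, 8)}, "b": {(2010, 8)}},
    initiations={"a": (2010, 8)},
)
```

July 2010 to June 2019 yields 108 monthly trials, which is why a cohort of a few thousand subjects can contribute more than 100,000 subject-trials.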
Patient follow-up depended on the analysis performed (intention-to-treat or per-protocol analysis). In the intention-to-treat analysis, each patient was followed until the onset of the main event, death, or loss to follow-up, whichever occurred first. In this analysis, each patient remained in the group to which they were assigned at the beginning, regardless of whether they discontinued treatment (in the case of "initiators") or started it (in the case of "non-initiators"). In the per-protocol analysis, each patient was followed until the onset of the main event, death, loss to follow-up, or deviation from assigned treatment, whichever occurred first. Therefore, in this analysis, "initiators" were censored when they stopped statin treatment and "non-initiators" were censored when they started statin treatment. In all analyses, the first diagnosis of MACE in an emergency episode or upon hospital admission was considered the main event.
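The difference between the two follow-up rules can be made concrete with a short sketch. It is written in Python for illustration (the study used R), the class and field names are hypothetical, and counting an event that falls in the same month as a treatment deviation is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubjectTrial:
    initiator: bool
    event_month: Optional[int]      # months from trial start to first MACE, None if none
    end_month: int                  # death, loss to follow-up, or end of data
    deviation_month: Optional[int]  # initiator stops or non-initiator starts statins


def follow_up(st, per_protocol):
    """Return (months_of_follow_up, had_event) under the chosen analysis."""
    stop = st.end_month
    if per_protocol and st.deviation_month is not None:
        # Per-protocol: censor at the first deviation from the assigned treatment.
        stop = min(stop, st.deviation_month)
    had_event = st.event_month is not None and st.event_month <= stop
    months = st.event_month if had_event else stop
    return months, had_event


# An initiator who stops statins at month 12 and has a MACE at month 30.
example = SubjectTrial(initiator=True, event_month=30, end_month=60, deviation_month=12)
itt_result = follow_up(example, per_protocol=False)
pp_result = follow_up(example, per_protocol=True)
```

In this example the event is counted in the intention-to-treat analysis but not in the per-protocol analysis, where follow-up ends at month 12. This is also why per-protocol follow-up of treated subjects is much shorter when adherence is poor.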
To ensure compliance with the positivity condition (i.e., that all subjects had some probability of receiving or not receiving treatment), two distinct approaches were used. The first was matched analysis, whereby each treated subject was matched with an untreated subject from the same trial with similar values for potential confounding variables. This allowed us to obtain effectiveness estimates for a population resembling that which actually receives statin treatment, since both treated and untreated subjects in the matched sample have a CVD risk similar to that of the treated group. The second approach consisted of sample restriction, whereby subjects with extreme values for key covariates were eliminated. This approach allowed us to obtain effectiveness estimates for the global population with low or medium risk, since the resulting pseudo-population had a risk similar to the global risk of the selected population. To this end, subjects with extreme values for confounding quantitative variables were excluded from the analysis. Extreme values were those above the maximum value observed in treated subjects plus 0.1 standard deviations or below the minimum value observed in treated subjects minus 0.1 standard deviations. Using this restricted sample, a marginal structural analysis was performed, creating a pseudo-population by weighting the subjects according to the inverse probability of receiving the assigned treatment (inverse probability weighting). Figure 1 depicts the sample restriction procedure for the variable LDL-C, and shows the distribution of LDL-C for subject-trials assigned to each treatment arm. LDL-C levels are lower in untreated versus treated subjects, and the minimum values correspond exclusively to untreated subjects.
In the population resulting from the matched analysis and in the restricted pseudo-population resulting from the marginal structural model, we calculated the overall incidence, the incidence per treatment group, the difference in incidence, the number of patients needed to treat (NNT) for 5 years to avoid an event, and the incidence ratio, using both intention-to-treat and per-protocol analyses.
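The sample restriction rule and the weighting step can be sketched as follows. This Python code is an illustration under stated assumptions, not the authors' R implementation: the standard deviation is taken over all subject-trials (the text does not say which SD was used), the propensity of treatment is assumed to be estimated elsewhere (for example with a logistic model), and plain rather than stabilized weights are shown. All function names are hypothetical.

```python
import statistics


def restriction_bounds(values, treated, margin_sd=0.1):
    """Bounds for one quantitative covariate: [min_treated - 0.1 SD, max_treated + 0.1 SD].

    Assumption: the SD is computed over all subject-trials.
    """
    sd = statistics.stdev(values)
    treated_values = [v for v, t in zip(values, treated) if t]
    return min(treated_values) - margin_sd * sd, max(treated_values) + margin_sd * sd


def keep_mask(values, treated, margin_sd=0.1):
    """True for subject-trials whose covariate value lies inside the bounds."""
    low, high = restriction_bounds(values, treated, margin_sd)
    return [low <= v <= high for v in values]


def ip_weight(is_treated, propensity):
    """Inverse probability of the treatment actually received (unstabilized)."""
    return 1.0 / propensity if is_treated else 1.0 / (1.0 - propensity)


# Toy LDL-C values (mg/dL): the lowest and highest values occur only in untreated subjects.
ldl = [50.0, 100.0, 120.0, 130.0, 150.0, 160.0, 250.0]
is_treated = [False, False, True, False, True, True, False]
mask = keep_mask(ldl, is_treated)
```

In the toy data, the untreated subjects with LDL-C of 50, 100, and 250 mg/dL fall outside the range spanned by treated subjects and are dropped, which mirrors the situation described for Figure 1.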
All analyses were performed using R version 4.1.1 (2021, The R Foundation for Statistical Computing).

Intention-to-Treat Analysis
The intention-to-treat analysis included 133,048 subject-trials, corresponding to 4253 subjects. Of these, 473 subject-trials were considered to be treated with statins. Table 1 lists the characteristics of the subject-trials, according to treatment. Table S1 in the Supplementary Materials shows the distribution of subject-trials according to the type of statin prescribed.

Matched Analysis
The matched analysis included a total of 946 subject-trials (473 pairs). Their characteristics, listed according to treatment group, are shown in Table 2. In total, 25 events occurred over a total follow-up period of 85,310 months (I = 17.6 events per 1000 subjects followed for 5 years). Among treated subjects, there were 10 events in 42,682 months (I = 14.0 per 1000 subjects followed for 5 years) and among untreated subjects, 15 events occurred in 42,448 months (I = 21.2 per 1000 subjects followed for 5 years).
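The incidence figures above follow from the event counts and the follow-up time. The Python sketch below is illustrative (the authors used R) and the function name is hypothetical. It converts events per person-month into events per 1000 subjects followed for 5 years (60 months) and derives the difference and ratio of incidences from the treated and untreated groups.

```python
def incidence_per_1000_5y(events, person_months):
    """Events per 1000 subjects followed for 5 years (60 months each)."""
    return events / person_months * 60 * 1000


# Counts reported for the matched intention-to-treat analysis.
incidence_treated = incidence_per_1000_5y(10, 42_682)
incidence_untreated = incidence_per_1000_5y(15, 42_448)

# Absolute risk reduction (per 1000 subjects over 5 years) and incidence ratio.
arr = incidence_untreated - incidence_treated
incidence_ratio = incidence_treated / incidence_untreated
```

Recomputed this way, the values agree with the reported incidences, with the absolute risk reduction of 7.2 given in the abstract, and with the incidence ratio of 0.66 quoted in the Discussion to within about 0.1 events per 1000 subjects; small differences are presumably due to rounding of the published counts and person-months.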

Marginal Structural Model
The pseudo-population created for the marginal structural model included 125,198 subject-trials, of which 441 were considered to be treated. Subject characteristics are shown in Table 3. In the pseudo-population, 1498.2 events occurred over a total follow-up period of 10,923,864 months (I = 8.23 per 1000 subjects followed for 5 years). In treated subjects, 4.0 events occurred in 39,079.1 months (I = 6.1 per 1000 subjects followed for 5 years) and in untreated subjects, 1494.2 events occurred in 10,884,784.7 months (I = 8.2 per 1000 subjects followed for 5 years).

Per-Protocol Analysis
The per-protocol analysis included 133,048 subject-trials, corresponding to 4253 subjects. Of these, 473 subject-trials were considered to be treated with statins.

Matched Analysis
The matched analysis included the same pairs as in the intention-to-treat analysis (473 pairs).
In this analysis, 14 events were recorded in a total of 40,212 months of follow-up (I = 20.89 events per 1000 subjects followed for 5 years). In treated subjects, 1 event occurred in 8014 months (I = 7.5 per 1000 subjects followed for 5 years), and in untreated subjects 13 events occurred in 32,198 months (I = 24.2 per 1000 subjects followed for 5 years).

Marginal Structural Model
The pseudo-population created for the marginal structural model included 125,198 subject-trials, of which 440 were considered treated. The characteristics of the subject-trials are shown in Table 3.
In the pseudo-population, 1076.9 events occurred over a total of 9,494,031 months of follow-up (I = 6.81 per 1000 subjects followed for 5 years). In treated subjects, 0.1 events occurred in 7465.1 months (I = 1.0 per 1000 subjects followed for 5 years), and in untreated subjects 1076.8 events occurred in 9,486,566 months (I = 6.8 per 1000 subjects followed for 5 years). The absolute risk reduction was 5.8 cases per 1000 subjects treated for 5 years (CI95%, 0.3–11.4 per 1000 subjects followed for 5 years), which implies the need to treat 172 patients for 5 years to avoid a cardiovascular event (5-year NNT = 172; CI95%, 3548–88). The incidence ratio of treated to untreated individuals was 0.15 (IR = 0.15; CI95%, <0.01–38.82). Table 4 summarizes the results obtained. Abbreviations used in Table 4: ARR, absolute risk reduction; NNT, number of patients needed to treat for 5 years to avoid an event; IR, incidence ratio.
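The link between the absolute risk reduction and the NNT can be written out explicitly. In this Python sketch (illustrative only; the function name is hypothetical), the ARR is expressed per 1000 subjects treated for 5 years, so the 5-year NNT is 1000 divided by the ARR.

```python
def nnt_from_arr(arr_per_1000):
    """5-year NNT from an absolute risk reduction per 1000 subjects over 5 years."""
    if arr_per_1000 <= 0:
        # With no benefit (or harm) the NNT is not a finite positive number.
        raise ValueError("NNT is only defined here for a positive risk reduction")
    return 1000.0 / arr_per_1000


nnt_point = nnt_from_arr(5.8)          # point estimate of the ARR
nnt_at_upper_arr = nnt_from_arr(11.4)  # upper ARR bound gives the smallest NNT
```

Because the NNT is the reciprocal of the ARR, the bounds swap: the upper ARR bound (11.4) gives the smallest NNT (about 88), and the lower ARR bound gives the largest. The reported upper NNT of 3548 does not equal 1000/0.3 exactly, presumably because the published lower ARR bound is rounded. In the intention-to-treat analyses, where the ARR interval includes zero, the corresponding NNT interval is unbounded on one side.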

Discussion
Our results suggest a beneficial effect of statin treatment for primary prevention of CVD in subjects with low or medium risk. Our findings indicate that in order to prevent a cardiovascular event, statins should be prescribed for 5 years to between 139 and 464 patients, depending on the level of risk of the population. Assuming adequate adherence by treated patients, statins should be prescribed to 60-172 patients to prevent a cardiovascular event, depending on the level of risk.
The incidence ratio values estimated using an intention-to-treat analysis are similar to those previously reported in the literature. A meta-analysis carried out by the CTT Collaborators using individual data from 27 RCTs [2] reported risk ratios of 0.57-0.77 (depending on the event and risk group) for each reduction in LDL-C of 38.61 mg/dL among subjects with low and medium risk of CVD. Furthermore, Danaei et al. [5], using a study design similar to ours, reported a hazard ratio (HR) of 0.89 between initiators and non-initiators in the general population without discriminating by baseline risk, a result comparable to our IR estimates of 0.66 and 0.74. Compared with previous reports [2,5], the results of our per-protocol analysis appear to overestimate the effect of statins. However, it should be noted that our results may be imprecise given the low number of events included in our analysis, together with the short follow-up of treated subjects (subjects that discontinued treatment were censored), which resulted in very wide confidence intervals.
On the other hand, the ARR and NNT results are more difficult to compare, as they depend largely on the specific population analyzed. Glynn et al. [3], in their secondary analysis of an RCT, reported a 5-year NNT of 38, which is far from the value of 60 obtained in our matched per-protocol analysis. While the population included in that study had low levels of LDL-C, both mean age and C-reactive protein levels were higher. This may help explain the higher incidences of CVD in their two treatment groups, and the higher ARR despite similar IR and HR values.
The main limitation of our study is the wide confidence intervals of the results obtained. This is due to the low incidence of cardiovascular events in a population categorized as low- or medium-risk, and the low number of subjects who began treatment during the study period. These problems were exacerbated in the per-protocol analysis due to poor treatment adherence [16,17]. Attempts were made to address these shortcomings by emulating successive trials and using more statistically efficient techniques such as matching. However, our study population of just over 4000 subjects was insufficient to yield more precise results. Despite this limitation, our estimated incidence ratio values in the intention-to-treat analysis are consistent with those previously reported [2,5], suggesting that the techniques used to avoid confounding proved successful. Therefore, it is reasonable to assume that our estimates of absolute risk reduction in 5 years are also reliable, albeit imprecise.
The marked differences observed between the results of the intention-to-treat and the per-protocol analyses are mainly due to poor statin treatment adherence among treated subjects. In other words, given that the intention-to-treat analysis includes the complete follow-up regardless of adherence and that the per-protocol analysis includes only the follow-up period in which adherence is maintained, the fact that the results are better with "optimal adherence" (i.e., per-protocol analysis) than with "suboptimal adherence" (i.e., intention-to-treat analysis) shows the relevance of adherence to the effectiveness of statins. In individuals with low risk of disease, especially in observational studies, poor treatment adherence is expected, since the perceived risk is lower. This poor adherence to statin treatment, which has already been measured in the AWHS population [16,17], results in a treatment persistence of less than 30% at 1 year. By contrast, persistence in RCTs is usually greater than 95% [18]. This should not constitute a problem in the context of the present study, the main objective of which was to evaluate the effectiveness of statin treatment in a real population in real-world conditions. In this sense, our intention-to-treat analysis measures the effectiveness of the medical decision to prescribe, while the per-protocol analysis measures the causal effect of the treatment taken according to the medical prescription.
Another limitation is the possible violation of the positivity condition. To overcome this limitation, we applied two distinct techniques: matching and sample restriction. Because the matched analysis included fewer treated subjects with higher CVD risk, this particular sample had a higher cardiovascular risk than the global population. By contrast, sample restriction yielded a population more similar to the global population, with low and medium cardiovascular risk. Thus, the risk reduction values obtained using the matching approach correspond to the effect obtained with the current prescription system, while those obtained for the pseudo-population (using sample restriction) would better correspond to the effect obtained after treatment of any subject with low or medium CVD risk.
Although our population consisted exclusively of male industrial workers, we believe that our estimators are applicable to the general population, given that our population was selected based on CVD risk profile. Regardless, the inclusion of women, who have a lower risk of CVD, would have rendered the treatment even more ineffective.
Finally, neither of the two approaches resulted in comparable LDL-C levels in the two treatment groups; this parameter was slightly higher in treated subjects. However, if these differences caused confounding, our estimators would underestimate the true effectiveness of statins, since treated subjects would have a higher risk of CVD than untreated subjects. Given that the incidence ratios we obtained were similar to those observed in RCTs [2], such underestimation is unlikely. Additionally, although we know that HbA1c better represents glycemic status, we were forced to use fasting blood glucose as an approximation in our study, since HbA1c was only measured in 30% of blood tests. However, we carried out exploratory analyses with the subjects who had this information and the results were equivalent to those shown.
Our results suggest that the indication of statin treatment for primary prevention in subjects with low and medium CVD risk is inefficient, given the low adherence observed: all approaches used resulted in high estimated NNT values. The current use of statins in our sample, as represented by the matched analysis, is more efficient than if prescription were extended to all subjects with this cardiovascular profile, as it avoids a greater number of events for the same number of people treated, and can be improved with adequate treatment adherence. It is advisable to take into account these results and those of other similar studies when including specific treatment recommendations in clinical practice guidelines, and to emphasize the need to improve treatment adherence, as this enables more realistic evaluation of the impact of the intervention than in RCTs.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jpm12050658/s1, Table S1. Number of subject-trials according to the type of statin prescribed.

Funding:
This research was funded by Proyecto del Fondo de Investigación Sanitaria, Instituto de Salud Carlos III and the European Fund for Regional Development (FEDER), grant number PI17/01704. The APC was funded by the same funders.

Institutional Review Board Statement:
This study was carried out in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of Aragon (Project Identification Code PI17/00042).

Informed Consent Statement:
All subjects provided informed consent to participate in the AWHS cohort.

Data Availability Statement:
Data are available upon request.