Sample Size for Oxidative Stress and Inflammation When Treating Multiple Sclerosis with Interferon-β1a and Coenzyme Q10

Studying multiple sclerosis (MS) and its treatments requires the use of biomarkers for underlying pathological mechanisms. We aim to estimate the required sample size for detecting variations of biomarkers of inflammation and oxidative stress. This is a post-hoc analysis of 60 relapsing-remitting MS patients treated with Interferon-β1a and Coenzyme Q10 for 3 months in an open-label crossover design over 6 months. At baseline and at the 3- and 6-month visits, we measured markers of scavenging activity, oxidative damage, and inflammation in the peripheral blood (180 measurements). Variations of laboratory measures (treatment effect) were estimated using mixed-effect linear regression models (including age, gender, disease duration, baseline expanded disability status scale (EDSS), and the duration of Interferon-β1a treatment as covariates; creatinine was also included for uric acid analyses), and were used for sample size calculations. Hypothesizing a clinical trial aiming to detect a 70% effect in 3 months (power = 80%, alpha-error = 5%), the sample size per treatment arm would be 1 for interleukin (IL)-3 and IL-5, 4 for IL-7 and IL-2R, 6 for IL-13, 14 for IL-6, 22 for IL-8, 23 for IL-4, 25 for regulated on activation-normal T cell expressed and secreted (RANTES), 26 for tumor necrosis factor (TNF)-α, 27 for IL-1β, and 29 for uric acid. Peripheral biomarkers of oxidative stress and inflammation could be used in proof-of-concept studies to quickly screen the mechanisms of action of MS treatments.


Introduction
Monitoring multiple sclerosis (MS) and developing new disease-modifying treatments (DMTs) require the use of biomarkers for underlying pathological mechanisms [1,2]. Thus, it is crucial to define a set of biomarkers that can be easily measured (e.g., in accessible body fluids), are quickly responsive to change, and reflect MS clinical features accurately [2,3].
Experimental evidence supports the important role of inflammation and oxidative stress in the pathogenesis of MS [4]. In the initial relapsing-remitting (RR) phase, oxidative stress is strictly associated with inflammatory activity, whereas the progressive phase is characterized by chronic inflammation and neurodegeneration, further amplifying the oxidative damage [4,5]. In our recent study [6], supplementation with Coenzyme Q10, a natural anti-oxidant, along with Interferon-β1a 44 mcg treatment, was associated with an improved oxidative balance, with a shift toward an anti-inflammatory milieu, and with related clinical benefits. However, in that study we used a large number of peripheral biomarkers of oxidative stress and inflammation, which was time- and resource-consuming and ultimately posed a significant statistical challenge due to multiple comparisons [6]. Thus, future studies would benefit from a subset of biomarkers that are sensitive to change in a short time and on a small sample.
In the present post-hoc analysis of our previous longitudinal study, we aim to estimate the sample size needed in RR-MS for different peripheral biomarkers of oxidative stress and inflammation.

Study Design and Population
This is a post-hoc analysis of a prospective cohort that was fully described elsewhere [6]. Briefly, in 2016-2017, we included 60 clinically stable RR-MS patients on treatment with subcutaneous high-dose Interferon-β1a (Rebif®, 44 mcg, Merck, Rome, Italy), either alone or with Coenzyme Q10 (Skatto®, 100 mg/mL, Chiesi Farmaceutici SpA, Parma, Italy) for 3 months, with a cross-over design. In particular, group 1 (n = 30) was treated with Interferon-β1a and Coenzyme Q10 from baseline to the 3-month visit, and then with Interferon-β1a alone until the 6-month visit; meanwhile, group 2 (n = 30) was treated with Interferon-β1a alone from baseline to the 3-month visit, and then with Interferon-β1a and Coenzyme Q10 until the 6-month visit. This design used a within-subject comparison of treatments, and therefore minimized confounding by removing natural biological variation that may have affected the outcome measures [6,7].
CellROX ® Orange Reagent (Life Technologies, Carlsbad, CA, USA) was used for measuring intracellular reactive oxygen species (ROS) production in peripheral blood mononuclear cells (PBMCs) using a FACScanto II analyzer (Becton-Dickinson, San Diego, CA, USA) and Flow-Jo v10 software (Tree Star Inc., Ashland, OR, USA); intracellular ROS production (CellROX) was measured as percent positive cells (%) and mean fluorescence intensity (MFI).

Statistics
The sample size needed to detect a treatment effect on different markers of oxidative stress and inflammation was computed using the formula $n = 2(Z_{\alpha} + Z_{1-\beta})^{2}\sigma^{2}/\Delta^{2}$, where $n$ is the required sample size per treatment arm in 1:1 controlled trials, $Z_{\alpha}$ and $Z_{1-\beta}$ are constants (set at 5% alpha-error and 80% power, respectively), $\sigma$ is the standard deviation, and $\Delta$ is the estimated effect size [8,9]. The treatment effect was defined as the actual observed effect in our previous study (i.e., variation in each laboratory measure between treated and untreated groups), estimated using mixed-effect linear regression models (including age, gender, disease duration, baseline expanded disability status scale (EDSS), and duration of Interferon-β1a treatment prior to study inclusion as covariates; creatinine was also included for uric acid analyses) [6,8,9]. The crossover model included random effects for patient ID, and fixed effects for time (baseline, 3 and 6 months) and for the visit after Coenzyme Q10 exposure, overall accounting for possible carry-over effects. Adjusted beta-coefficients of 3-month variations were obtained for each laboratory measure. We assumed that the observed variation, as estimated by the adjusted beta-coefficients, was the highest achievable treatment effect (100%) over 3 months. From there, with a conservative approach, we hypothesized a number of effect sizes (e.g., 30%, 50%, 70%, and 90%) that were smaller than the observed effect. Standard deviations were calculated from the variation of each laboratory measure after 3 months. Then, we hypothesized a clinical trial where two different biomarkers were included as primary outcome measures for sample size estimates (alpha-error was set at 2.5%). Finally, we considered that the study was designed to include one or two interim analyses in addition to the final analysis (alpha-error was set at 2.94% and 2.21%, respectively, according to the Pocock method) [10,11].
Stata 15.0 (StataCorp LLC, College Station, TX, USA) was used for data processing and analysis.
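The calculation described in this section can be illustrated with a minimal Python sketch. It is not the Stata code used for the analysis. It assumes the standard two-arm formula for comparing means, n = 2(Z_α + Z_1−β)²σ²/Δ², with a two-sided alpha (an assumption, since sidedness is not stated here) and with Δ taken as a chosen fraction of the observed effect. The function name, its parameters, and the inputs in the demo loop (SD = 1, observed effect = 1) are hypothetical placeholders, not study data.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(sd, observed_effect, fraction=0.7, alpha=0.05, power=0.80):
    """Per-arm sample size for a 1:1 two-arm trial comparing means.

    Assumptions (not taken from the paper): two-sided alpha, and a
    hypothesized effect equal to `fraction` times the observed effect.
    """
    if sd <= 0 or observed_effect == 0 or fraction <= 0:
        raise ValueError("sd and fraction must be positive; effect must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for power = 0.80
    delta = fraction * abs(observed_effect)        # hypothesized effect size
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2
    return ceil(n)                                 # round up to whole patients


# Alpha levels for the scenarios described in the text.
SCENARIOS = {
    "single primary outcome": 0.05,
    "two primary outcomes": 0.025,
    "one interim analysis (Pocock)": 0.0294,
    "two interim analyses (Pocock)": 0.0221,
}

if __name__ == "__main__":
    # Placeholder inputs (SD = 1, observed effect = 1), not study data.
    for label, alpha in SCENARIOS.items():
        n = sample_size_per_arm(sd=1.0, observed_effect=1.0, alpha=alpha)
        print(f"{label}: n = {n} per arm")
```

Lowering the alpha level, as in the multiple-outcome and interim-analysis scenarios, increases Z_α and therefore the required sample size, whereas a larger hypothesized fraction of the observed effect reduces it.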

Figure 1.
Profile plot for sample size estimates for a treatment arm. Figure shows sample sizes for laboratory markers of oxidative stress and inflammation (<30 patients for a treatment arm with a 70% treatment effect). Sample size per treatment arm is reported hypothesizing a 30%, 50%, 70%, and 90% treatment effect compared with the observed effect. Power was set at 80% and alpha-error at 5%. Abbreviations: interleukin (IL), regulated on activation-normal T cell expressed and secreted (RANTES), and tumor necrosis factor (TNF).
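As a worked illustration of why the required sample size rises steeply as the hypothesized effect shrinks, assume the standard two-arm formula for comparing means and write the hypothesized effect as a fraction f of the observed effect (the symbols f and Δ_obs are introduced here for illustration only). The required sample size then scales with 1/f², before rounding up to whole patients:

```latex
n(f) = \frac{2\,(Z_{\alpha} + Z_{1-\beta})^{2}\,\sigma^{2}}{(f\,\Delta_{\mathrm{obs}})^{2}}
     = \frac{n(1)}{f^{2}},
\qquad
\frac{n(0.9)}{n(1)} \approx 1.23,\quad
\frac{n(0.7)}{n(1)} \approx 2.04,\quad
\frac{n(0.5)}{n(1)} = 4,\quad
\frac{n(0.3)}{n(1)} \approx 11.1
```

Under this assumption, a trial powered for 70% of the observed effect needs roughly twice the sample required for the full observed effect, and one powered for 30% needs roughly eleven times as many patients.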
Sample size estimates for a study with one or two interim analyses (Pocock method, setting alpha-error = 2.94% and 2.21%, respectively), in addition to the final analysis, are presented in Table 1; this design would reduce study participants' exposure to an inferior or useless treatment.

Table 1.
Table shows absolute values of biomarkers of oxidative stress and inflammation at the baseline visit. Adjusted beta-coefficients (adj. coeff.) of 3-month variation for each laboratory measure were obtained with mixed-effect linear regression models (including age, gender, disease duration, baseline EDSS, and duration of Interferon-β1a treatment prior to study inclusion as covariates; creatinine was also included for uric acid analyses) (* indicates p < 0.05). Standard deviation (SD) was calculated from the variation of each laboratory measure after 3 months. Sample size per treatment arm is reported, hypothesizing a 70% treatment effect, compared with the observed effect, over 3 months (power was set at 80%, alpha-error was set at 5%). Then, we also performed calculations hypothesizing additional scenarios: (i) two different biomarkers were included as combined primary outcome measures for sample size estimates (alpha-error was set at 2.5%); (ii) the study was designed to include one or two interim analyses in addition to the final analysis in order to obtain early evidence of inferior or useless treatment (alpha-error was set to be 0.

Discussion
Peripheral biomarkers of inflammation, scavenging activity, and oxidative damage gave realistically achievable sample size estimates, and could be used in exploratory clinical trials and observational studies to screen new or already existing medications with putative effects on inflammation and oxidative stress over a 3-month period. Not least, interim analyses could detect an inferior or useless treatment even earlier, with subsequent study termination or treatment switch within adaptive designs [12].
Current sample size calculations were rather conservative. In particular, in the Results (Section 3) and in Figure 1, we specifically focused on a 70% treatment effect, which was smaller than what we actually observed (100% treatment effect) [6,9]. However, greater treatment effects could be hypothesized with different medications and doses, leading to even smaller sample size estimates. Also, the inclusion of multiple markers as primary outcome measures would remain feasible for sample size calculations. Of note, present estimates are based on the combination of subcutaneous high-dose Interferon-β1a (Rebif®, 44 mcg, Merck, Rome, Italy) and Coenzyme Q10. For a subgroup of patients (50%), the Interferon-β1a treatment was also administered prior to study inclusion. Drug-naïve patients were equally distributed between Coenzyme Q10 treatment groups, and we also included the duration of the Interferon-β1a treatment as a covariate in the statistical models, but, of course, we cannot exclude the possibility that previous treatment affected the study outcomes. However, if Interferon-β1a had already exerted its effects before inclusion in the study, we would have observed smaller Coenzyme Q10-related effects, resulting in more conservative sample size estimates. Interferon-β1a is an approved treatment for MS, with a well-established long-term efficacy and safety profile [13]. By contrast, Coenzyme Q10 has a proven effect on biomarkers of oxidative stress and inflammation and on MS symptoms [14-16], but its disease-modifying effect remains to be established. As such, future studies should evaluate the reproducibility of our findings with more recent medications (e.g., cladribine).
Of note, for some inflammatory biomarkers (e.g., IL-3 and IL-5) sample size estimates were unexpectedly low and should be interpreted with caution. If we assume we are studying a compound with a specific molecular target (e.g., anti-TNF-α or anti-CD20 antibodies), then only a very small sample is necessary to detect a biological effect [30,31]. By contrast, for compounds with multimodal mechanisms of action, a larger sample would be needed or, at least, profiles of inflammatory pathology should be considered [26].
Limitations of this study include possible confounding factors. In our previous study, we excluded patients with possible confounding factors (e.g., contraceptive and immunosuppressive medication), we used within-patient comparison of treatments (minimizing confounding effects by removing any natural biological variation), and we accounted for a number of covariates in our statistical models [6], but factors influencing oxidative stress and inflammation are multiple and virtually impossible to exclude completely. For instance, four patients (6.7%) presented with a clinical relapse, which we did not account for because these patients were equally distributed between the Coenzyme Q10-treated and untreated groups. The specificity of peripheral biomarkers to MS-related pathology remains to be further investigated and, based on current knowledge, these markers cannot replace conventional biomarkers of disability (e.g., neuroimaging) [32]. We included 180 measurements at three timepoints from 60 patients to estimate coefficients of variation for sample size estimates. The included sample could have been larger, but it was based on sample size calculations from our previous study and was in line with previous studies with similar goals [6,8,33]. Also, measurements over short intervals may be prone to increased measurement error, leading to greater variability and larger sample size estimates, but this was apparently not the case in our cohort. A control group (untreated or treated with a medication different from Interferon-β1a) was unfortunately not available, which makes it difficult to draw formal conclusions on the observed effects.

Conclusions
In conclusion, peripheral biomarkers of oxidative stress and inflammation could be used in exploratory, proof-of-concept studies aiming to evaluate the activity profile of new or already existing medications. Medications with putative anti-oxidant and anti-inflammatory effects could be tested in a short time (3 months) and on small samples (<30 per treatment arm) by using a limited subset of biomarkers, before being moved toward larger and more expensive clinical trials.