Uncertainty of Size-Exclusion Chromatography Method in Quality Control of Bevacizumab Batches

Abstract: In addition to the analytical challenges related to the size and complexity of biopharmaceutical drugs, the inherent variability that arises from their manufacturing process requires monitoring throughout production to ensure the safety and efficacy of the finished product. In this step, validation data should demonstrate that the process is controlled and reproducible, whereas the manufacturing process must ensure the quality and consistency of the product. For this, the manufacturer sets specification limits in accordance with regulatory guidance. In such a situation, the comparison of different batches is required in order to describe and analyze the variability between them. However, it is unclear how much of the variability comes from the analytical method and how much from the production of the batches. The estimation of the β-expectation tolerance intervals based on the variance components, to account for both between-batch and within-batch variability, was proposed as a specification limit to control the heterogeneity between batches at the time of manufacture and to verify whether batches meet specification limits. The variance components were computed by the maximum likelihood method using a linear random model. The protein content, expressed as a percentage of the actual concentration relative to the claim value, and the dimer content (expressed as a percentage) were used as critical quality attributes (CQAs) in the monitoring and control process. We used real data from six bevacizumab commercial batches.


Introduction
Monoclonal antibodies (mAbs) represent highly targeted and efficient biotherapeutic agents with widespread uses in the treatment of different disorders, including cancer, autoimmune, and inflammatory diseases [1]. Compared to small biotherapeutic molecules, mAbs are formulated in elevated concentrations, making them highly susceptible to physical and chemical changes [2]. This increases their critical attributes and makes their development a challenging task. Therefore, all critical quality attributes (CQAs) of mAbs that can impact efficacy and safety must be controlled and fully characterized.
The characterization of biopharmaceutical drugs, including biosimilar drugs, involves a variety of analytical methods that permit comparing their physicochemical properties, biological activities, impurities and stability. A panel of highly sensitive and orthogonal methods is used to compare the physicochemical and biological properties of this type of product [3,4]. Characterization includes biochemical studies (primary structure, glycosylation, disulfide structure, charge variants and size variants), biophysical studies (secondary structure, tertiary structure and thermal stability), biological studies (mechanism of action, including antigen specificity and Fc functionality) and forced degradation studies under specific stress conditions. Under these circumstances, product-related impurities that may be formed during this stress step can compromise not only the safety and potency of mAbs, but also their immunogenicity [5,6].
Several research groups have published a list of analytical methods used for physicochemical and biological characterization of mAbs. All methods were demonstrated to be fit for their intended use [7,8]. There are several CQAs as well as different analytical methods for each one, many of which report on the status of only a single quality attribute.
Despite the popularity of therapeutic mAbs and their characterization by reverse phase liquid chromatography (RPLC), the available information in the literature on their RPLC behavior is very limited. Generally, mAb separations by RPLC are performed at elevated temperatures (80-90 °C), using gradient times of >15-20 min, and in the presence of relatively high concentrations of TFA (0.1%). Under such conditions, it is possible to achieve a suitable performance in terms of peak shape and recovery [9]. However, this analysis can cause on-column hydrolysis and the generation of artefact peaks. Recently, Bobaly et al. [10] used a new phenyl-based column to analyze a wide range of mAbs applying milder mobile phase conditions. These authors proposed reducing the working temperature to 75 °C with two aims: (1) to avoid on-column sample degradation and biased quantitative analysis, and (2) to obtain an appropriate recovery level and peak shape.
Size-exclusion chromatography (SEC) is the reference method for the quantitative measurement of high molecular weight species (HMWS) of mAbs. SEC separates molecules according to their differences in size as they pass through a resin packed in a column [11]. The introduction of sub-3 µm UHP-SEC columns resulted in greater peak efficiency and greatly reduced the run time, usually to less than 10 min. However, further reduction in column dimensions would not be desirable, due to the predominant impact of the liquid chromatographic system on the non-retentive separation mechanism in SEC [12]. This fact could distort the estimated amount of HMWS [13]. Non-volatile mobile phases containing phosphate buffers in combination with NaCl or KCl have been widely used for the analysis of proteins under native conditions in SEC [14]. Goyon et al. [15] performed the characterization of 30 therapeutic mAbs by means of SEC using a column packed with sub-3 µm particles and both non-volatile (phosphate salts) and volatile (ammonium acetate) mobile phase conditions. The levels of HMWS varied between 0.1% and 13.1%, and the majority of them were below the 5% specification limit for therapeutic mAbs in the biopharmaceutical industry [16]. Lee et al. [17] and Oliva and Llabrés [14] each proposed a SEC method for the quantitative analysis of mAbs. In the latter case, a pre-study validation was conducted to prove that the method could deliver quality results following the ICH-Q2-(R1) guideline [18].
In addition to the analytical challenges related to the size and complexity of biopharmaceutical drugs, the inherent variability that arises due to their manufacturing process requires monitoring throughout the production process to ensure the safety and efficacy of the finished product. For this, the manufacturing process also needs to be controlled through process validation. In this process, a reasonable range of analytical and manufacturing variability is expected.
During the development and manufacture of a biopharmaceutical product, "in-house" reference standards are used to support the characterization and traceability of the product's CQAs throughout its life cycle, either between batches or over extended periods. For example, the bioactivity of various biosimilar drugs has been altered due to post-approval manufacturing changes. In such a situation, regulatory authorities could assess this alteration using this type of standard; however, international standards, where available, offer an alternative means of identifying the causes of variation and would facilitate the determination of their clinical significance [19]. For this reason, regulatory guidelines request that manufacturers monitor and control the CQAs of biosimilar drugs to keep them within appropriate ranges, so that their quality and clinical properties remain consistent over time [20]. To achieve this aim, it is necessary to compare different batches in order to describe and analyze the variability between them. However, it is unclear how much of the variability comes from the analytical method and how much from the production of the batches. In the majority of previous studies, the analytical method uncertainty was neglected, or its contribution was small in comparison with other sources of variation. We found values from 0.2% in real data studies to 30% in simulation studies [21].
In this study, we aimed to control the heterogeneity between batches at the time of manufacture in order to verify whether the batches met specification limits. The estimation of the β-expectation tolerance intervals based on the variance components, to account for both between-batch and within-batch variability, was proposed as a specification limit. The variance components were computed by means of the maximum likelihood method using a linear random model. The protein content, expressed as a percentage of the actual concentration relative to the claim value, and the dimer content (expressed as a percentage) were used as CQAs in the monitoring and control process. We used real data from six bevacizumab commercial batches with different numbers of measurements for each batch.

Size Exclusion Chromatography System
The chromatographic system used was a Waters apparatus (Milford, MA, USA) consisting of a pump (600E Multisolvent Delivery System), an autosampler (700 Wisp model) and a differential refractive index (RI) detector (Waters model 2414). Elution was performed at room temperature in a Protein KW-804 column (8 × 300 mm, Waters). The mobile phase was phosphate-buffered saline (300 mM NaCl, 25 mM sodium phosphate, pH 7.0) at a flow rate of 1.0 mL/min, and the injection volume was 25 µL. The data were collected and analyzed using the Millennium 32 ® chromatography program (Waters). This software was used for chromatogram integration and for estimating monomer and dimer content through relative peak area percentage values.

Analytical Method Validation
The proposed analytical method was validated at the time of testing and was demonstrated to be fit for the intended use. The validation was carried out at two levels: (1) a pre-study validation process in accordance with the ICH-Q2-(R1) guidelines, and (2) an in-study validation procedure using different control charts combined with the analytical target profile (ATP) approach. For more details, see Oliva and Llabrés [14].

Statistical Model
The statistical model used was

$$y_{ij} = \mu + B_i + e_{ij} \qquad (1)$$

where $y_{ij}$ is observation $j$ ($j = 1, \ldots, n_i$) from batch $i$ ($i = 1, \ldots, m$), $\mu$ is the general mean, $B_i$ is the batch random effect, and $e_{ij}$ is the residual random term accounting for sampling variability and analytical method uncertainty. The random terms $B_i$ and $e_{ij}$ are assumed to be independent, with distributions $N(0, \sigma_B^2)$ and $N(0, \sigma^2)$, respectively. Variance components of the statistical model (Equation (1)) can be estimated using the least squares method (i.e., analysis of variance, ANOVA) when the experimental design is balanced. However, under these circumstances, there is some probability of obtaining a negative estimate of $\sigma_B^2$ when $\sigma_B^2 \ll \sigma^2$. To avoid this situation, and because the method is also applicable to unbalanced designs, we chose to estimate the variance components of the statistical model by the maximum likelihood method. This was done using the function lmer(), included in the library lme4 [22] for the R program [23]. β-expectation tolerance intervals were computed using the method described by Hoffman and Kringle [24].
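The negative-estimate problem mentioned for the ANOVA approach can be illustrated with a minimal Python sketch. It implements the method-of-moments (ANOVA) estimators for the one-way random model, not the maximum likelihood fit obtained with lmer(); the function name and the data are hypothetical and chosen only for illustration.

```python
from statistics import mean


def anova_variance_components(batches):
    """Method-of-moments (ANOVA) estimates for y_ij = mu + B_i + e_ij.

    `batches` holds one list of observations per batch (sizes may differ).
    Returns (sigma2_between, sigma2_within, raw_between), where raw_between
    is the untruncated between-batch estimate and may be negative.
    """
    m = len(batches)
    sizes = [len(b) for b in batches]
    n_total = sum(sizes)
    grand_mean = sum(sum(b) for b in batches) / n_total
    batch_means = [mean(b) for b in batches]

    # Between-batch and within-batch sums of squares
    ss_between = sum(n * (bm - grand_mean) ** 2
                     for n, bm in zip(sizes, batch_means))
    ss_within = sum((y - bm) ** 2
                    for b, bm in zip(batches, batch_means) for y in b)

    ms_between = ss_between / (m - 1)
    ms_within = ss_within / (n_total - m)

    # Effective number of observations per batch for unbalanced designs
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (m - 1)

    raw_between = (ms_between - ms_within) / n0
    # Truncate at zero, since a variance cannot be negative
    return max(raw_between, 0.0), ms_within, raw_between


# Hypothetical, unbalanced data (not the bevacizumab measurements)
example = [
    [24.9, 25.1, 24.7],
    [25.0, 24.8, 25.2, 24.9],
    [24.8, 25.1, 25.0],
]
sigma2_b, sigma2_w, raw_b = anova_variance_components(example)
print(sigma2_b, sigma2_w, raw_b)
```

In this made-up example the batch means are close together relative to the scatter within batches, so the raw between-batch estimate is negative and has to be truncated at zero. A likelihood-based fit avoids this because the estimate is constrained to be non-negative.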

Analytical Method Validation
The proposed analytical method was previously validated using the pre- and in-study validation procedure [14]. However, the main objective in any internal quality control is the maintenance of validation conditions over a long time period, especially in small laboratories. First, a reproducibility study was carried out to confirm that the analytical method continues to be in control and stable, providing appropriate accuracy and precision. To this end, a drug nominal concentration of 25 µg/mL was prepared three times (i.e., assay effect), and each sample was analyzed under repeatability conditions (n = 6). This concentration is within the linearity range [14]. The function aov() from the R program [23] was used to compute the between-assay variance ($\sigma_{assay}^2$) and the method precision, expressed as repeatability ($\sigma_r^2$). Thus, the reproducibility is equal to:

$$\sigma_R^2 = \sigma_{assay}^2 + \sigma_r^2$$

The ANOVA results show that the null hypothesis $H_0: \sigma_{assay}^2 = 0$ was accepted for the assay effect (p = 0.954), and thus the reproducibility is equal to the precision, expressed as repeatability, which is calculated as an average of the estimated variance for each assay. In this case, a value of 0.293 was obtained.
In a second step, the Bayesian posterior distribution of the mean and standard deviation of drug nominal content was used to calculate the overall uncertainty [25]. Figure 1 shows the contours of the posterior distribution for the 5%, 95% and 99% probability levels, together with the claimed value of bevacizumab content (25 mg/mL). In addition, the contour lines for overall uncertainty values of 1.0% and 1.25% were included. The precision was obtained from the reproducibility study ($\sigma_R^2$). The overall uncertainty calculated using the Bayesian posterior distribution was lower than 1% for drug nominal content. However, the overall uncertainty for dimer content was higher, at around 2.0%. The uncertainty is expressed as the sum of the variances of the precision and the accuracy. To calculate the latter, it is necessary to know the target or true value $T$ in $(X_i - T)^2$. For the first CQA, we know this value (25 mg/mL), whereas for the second one, it is unknown. To facilitate the calculations, we used the mean content as the true value. In addition, precision was lower (i.e., high dispersion) in comparison with that observed for the first CQA.
In the course of this study, two SEC columns from the same manufacturer but from different batches were used. The statistical model described in Equation (1) can be applied to analyze the column effect. The samples from batches #1 and #2 and three of the four samples from batch #3 were analyzed using column #1, while the rest were analyzed using column #2. The data analysis shows no difference between them (p > 0.05), and thus the results obtained do not depend on the column used (Figure 2). In this case, a value of 0.0560 was obtained for the residual variance. A similar result was obtained for dimer content (data not shown).
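The idea of expressing the overall uncertainty as the sum of a precision term and an accuracy term, described earlier in this section, can be sketched with simple point estimates. The Python example below is only an illustration under stated assumptions: it uses the sample variance and the squared bias against a known target rather than the Bayesian posterior distribution used in this study, and the function name and readings are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def overall_uncertainty_percent(values, target):
    """Combine precision and accuracy into one relative uncertainty (%).

    Precision is taken as the sample variance of the readings and accuracy
    as the squared bias of their mean against the target value.
    """
    bias = mean(values) - target          # accuracy term (systematic error)
    precision_var = variance(values)      # precision term (n - 1 denominator)
    total_sd = sqrt(precision_var + bias ** 2)
    return 100.0 * total_sd / target


# Hypothetical readings in mg/mL against the claimed value of 25 mg/mL
readings = [24.8, 25.1, 24.9, 25.0, 24.7, 25.1]
u = overall_uncertainty_percent(readings, target=25.0)
print(round(u, 3))
```

When the true value is unknown, as for the dimer content, replacing the target with the sample mean makes the bias term vanish, so the result reflects precision only.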

Figure 2. Experimental data obtained for two SEC columns used in this study. The mean and standard deviation obtained were 24.97 and 0.271 mg/mL for column #1, whereas for column #2, they were 24.84 and 0.254 mg/mL, respectively. The dashed line corresponds to the limits established at $\bar{x} \pm 3\sigma$. In this case, σ = 0.236 mg/mL.

Data Analysis
In this study, six batches of bevacizumab monoclonal antibody were used. Figure 3 shows individual observations for each batch, together with β-expectation tolerance intervals computed using the method described by Hoffman and Kringle [24]. Table 1 shows the summarized statistics: the number of observations per batch, and the mean and standard deviation for both quality attributes included in this study. Overall, the mean bevacizumab content was 24.89 mg/mL; the within-batch standard deviation ranged from 0.035 to 0.209 (coefficient of variation (CV) from 0.14% to 0.84%). The overall mean dimer content of bevacizumab was 1.543%, and the within-batch standard deviation ranged from 0.071 to 0.172 (CV from 4.49% to 11.0%). We must highlight that the within-batch variability for the drug content is low (on average, the CV is equal to 0.55%), whereas the within-batch variability for dimer content (%) is higher, with the overall CV being about 8.72%. This difference could be related to the methodology used for their determination. In the first case, the drug content is estimated from a classical calibration approach, whereas for the dimer content, if the resolution between peaks is not complete, the uncertainty in the integration process increases. Table 2 shows the variance components estimated by the maximum likelihood method with their 95% confidence intervals. As shown in Table 2, we must accept the null hypothesis $H_0: \sigma_B^2 = 0$ for percent dimer content because the confidence interval includes zero. In such a situation, the observed variability is due to within-batch factors. These are mainly analytical method uncertainty and sampling variability, with the CV being 9.20%.

For the drug nominal content, the null hypothesis $H_0: \sigma_B^2 = 0$ is rejected, and therefore, there are differences between batches. We must consider the within-batch and between-batch variability to estimate the overall variability. In our case, the between-batch variability was much higher than the within-batch variability (i.e., analytical error), with the total variability being around 1.0%. The classical application of statistical process control implies that when the manufacturing process is under statistical control, only "natural variability" operates, and therefore batch-to-batch variability is explained solely by within-batch variability. Under these circumstances, an easy way to estimate the overall variability would be from the combined variance of several samples from several batches (as in our case). However, the manufacturing of biopharmaceuticals is a complex process and, because of the lack of published manufacturing records, we cannot be sure that batch-to-batch variability is negligible. For example, in the European Medicines Agency (EMA) assessment report on the ABP 215 biosimilar, the sponsor indicated that one batch presented a purity profile falling just below the presented comparability range. However, this minor difference has no clinical relevance, and the ABP 215 product is analytically similar to the reference product [26].
The tolerance interval calculated from the attribute measurements of drug product batches is one of the factors considered when establishing acceptance limits to ensure drug product quality. In the case of repeated measurements in all batches, the correlation between the values within a batch must be properly modeled to correctly calculate tolerance intervals.
For a design with unequal numbers of repeated measures in each batch, we can use the approach proposed by Hoffman and Kringle for two-sided intervals [24]. Montes et al. [27] developed different functions implemented in R to facilitate the calculations, although these can also be performed using Microsoft Excel ® spreadsheets. In our case, the 100(1 − α)% two-sided tolerance intervals that contain at least 100P% of the population were calculated using the R program [23].
The resulting two-sided β-expectation tolerance interval for drug nominal content was (24.22, 25.54) mg/mL, with the 95% confidence interval being (24.08, 24.36) for the lower bound and (25.35, 25.74) for the upper bound. For the dimer (%) content, the estimated interval was (1.233, 1.868), with the 95% confidence interval being (1.028, 1.342) for the lower bound and (1.696, 2.098) for the upper bound. In all of these calculations, a confidence level of 90% was used (β = 0.90), whereas the percentage covered (i.e., 100P%) was established at 95% (P = 0.95). Figure 3 shows that at least 95% of the values are within the specification limits for both CQAs, with these limits being relatively wider for the dimer (%) content. Given this specification range, a shift in the mean of 3% for the dimer (%) content would not have much practical impact, since this value is below the 5% specification limit for therapeutic mAbs in the biopharmaceutical industry [16]. However, this shift in the mean for the second CQA implies that the test values from a new batch would fall outside the specification limits.
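To make the link between variance components and tolerance limits concrete, the following Python sketch computes a naive large-sample interval from a mean and the two variance components. It is not the Hoffman and Kringle procedure used in this study: it relies on a normal quantile and ignores the uncertainty in the estimated parameters, so it gives narrower limits than a proper small-sample method. The function names and input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def naive_tolerance_interval(mu_hat, sigma2_between, sigma2_within,
                             coverage=0.95):
    """Normal-approximation interval expected to cover `coverage` of values.

    The total variance of a single future observation from a new batch is
    taken as the sum of the between-batch and within-batch components.
    """
    total_sd = sqrt(sigma2_between + sigma2_within)
    z = NormalDist().inv_cdf((1.0 + coverage) / 2.0)  # two-sided quantile
    return mu_hat - z * total_sd, mu_hat + z * total_sd


def within_limits(value, limits):
    """Return True when a test value lies inside the interval."""
    lower, upper = limits
    return lower <= value <= upper


# Hypothetical estimates (not the values reported in Table 2)
limits = naive_tolerance_interval(100.0, sigma2_between=3.0,
                                  sigma2_within=1.0, coverage=0.95)
print(limits, within_limits(101.5, limits), within_limits(105.0, limits))
```

A check of this kind shows how a new test value, or a shifted mean, can be compared against limits derived from both sources of variability rather than from the within-batch scatter alone.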

Conclusions
The reproducibility assay confirmed that the method is in control and stable, providing appropriate accuracy and precision. In addition, the analytical method validation provides useful information for predicting the uncertainty of the analytical method. A linear random model was used to assess whether the batch and the column used had an effect on the attribute measurement. First, the drug concentration does not depend on the column used, whereas the residual variance depends on the analytical method uncertainty and batch effect. The data analysis according to the batch effect showed that the drug nominal content depends on the between-batch and within-batch variability. The latter corresponds to the analytical method uncertainty. The determination of the dimer content only depends on the analytical method uncertainty, which is of the same order of magnitude as the value estimated for drug content (0.142 vs. 0.162) and that already published (0.17 mg/mL) [14]. Thus, this process allows us to know the contribution of both the batch-to-batch variability and the analytical method uncertainty to the total variation. The use of β-expectation tolerance intervals as specification limits can be a starting point for the determination of the final product's acceptance criteria from the manufacturing process.
During analytical method validation, it is vital to know the systematic bias and random variability of the analytical method as well as the ranges within which individual release values will deviate from their true value. In this line of research, tolerance intervals are very useful to obtain specification limits within which a future observation from a new batch is expected to fall. This issue will be the subject of future research.
Data Availability Statement: The full database will be available upon request.