The arrangement of matter within a semisolid product, known as its microstructure [1], is determined by the components employed in its formulation (e.g., polymorphism, size and shape of the dispersed particles of the active substance, grade and rheology of the excipients, interfacial tension between phases, partition coefficient of the active substance between the different phases) and by the manufacturing process. The microstructure and the excipient composition affect not only the organoleptic properties of the product but also drug permeation through the skin [1].
The European Medicines Agency (EMA) has released for consultation a draft guideline on the quality and equivalence of topical products [4]. The characterization of topical semisolids includes the investigation of their physical properties, which are defined by their microstructure, influence the bioavailability and usability of the product, and are indicative of the stability and consistency of the manufacturing process. This draft guideline assumes that if the physical properties of two products with similar qualitative and quantitative composition are sufficiently similar, the microstructure of the formulation can also be considered similar.
According to this draft guideline, the demonstration of therapeutic equivalence of two products requires comparison of the qualitative (Q1) and quantitative (Q2) composition of the formulations, as well as a comparison of these physical properties (Q3). This comparison should be conducted in at least three batches of the reference product and three batches of the test product with at least 12 replicates per batch.
The microstructure characterization includes rheology, type of emulsion and physical state of the drug in the semisolid system (polymorphic form, solubilized drug vs. dispersed solid drug, particle size of drug particles) [1]. The draft guideline specifies that non-Newtonian rheological behavior has to be characterized by investigating the complete flow curve (viscosity or shear stress versus shear rate) and the linear viscoelastic response (creep-recovery tests or oscillatory measurements at different frequencies). This allows the calculation of parameters such as viscosities at specified shear rates, yield stress values, thixotropic relative area (SR), elastic and viscous moduli (G′ and G″), or loss tangent (tan δ). To conclude equivalence of these microstructure characteristics, the 90% confidence interval (CI) of the difference between the means of both formulations should not deviate by more than 10% from the reference product mean, assuming normal distribution of data. According to this draft guideline, equivalence has to be concluded for all rheological parameters that define the microstructure of the product. Neither the parameters that define the microstructure of semisolid pharmaceuticals nor the acceptance limits that ensure equivalence in those parameters have so far been defined in the literature. Moreover, since the rheological parameters mentioned above are not part of the routine analysis performed when releasing new batches, their inter-batch variability has not been characterized. When evaluating rheological parameters, inter-batch variability can be due to multiple factors, such as excipient or formulation manufacturing, storage conditions and aging of the formulation.
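As a reading aid, the acceptance criterion described above can be written compactly. This is only a sketch: the symbols (test and reference means $\bar{x}_T$ and $\bar{x}_R$, sample sizes $n_T$ and $n_R$, pooled standard deviation $s_p$) and the use of a pooled two-sample standard error are assumptions introduced here, not notation taken from the draft guideline.

```latex
\[
\left[\,(\bar{x}_T - \bar{x}_R) \pm t_{0.95,\,\nu}\; s_p \sqrt{\frac{1}{n_T} + \frac{1}{n_R}}\,\right]
\subseteq
\left[-0.10\,\bar{x}_R,\; +0.10\,\bar{x}_R\right],
\qquad \nu = n_T + n_R - 2
\]
```

Under this simplified form all replicates are treated as independent observations, so any batch-to-batch component of the variance is not modelled separately.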
Daivobet ointment (LeoPharma, Dublin, Ireland) is a combination product of the vitamin D analogue calcipotriol monohydrate and the corticosteroid betamethasone dipropionate. It is indicated for topical treatment of stable plaque psoriasis vulgaris in adults. A maximum daily dose of 15 g and maximum body surface area to be treated of 30% should not be exceeded to avoid vitamin D-related adverse events such as hypercalcemia.
The objective of the present work is to assess if the inter-batch variability of the rheological parameters in the reference product Daivobet allows concluding equivalence within a ±10% acceptance range, as defined in the EMA draft guideline on the quality and equivalence of topical products.
3. Results and Discussion
The rheological study of the Daivobet formulation demonstrated its pseudoplastic behavior (Supplementary Figure S1). Consequently, various physical properties of 10 batches (12 replicates each) were characterized by calculating relative thixotropic area (SR), yield stress (σ0), zero shear viscosity (η0), viscosity at 100 s−1 (η100), loss tangent (tan δ), calculated elastic and viscous moduli at 1 Hz (G′ and G″, respectively), the parameters m′ and m″ of the fit, and spreadability (Table 1). Raw data are shown in Table S1.
Yield stress (i.e., the minimum force required for a formulation to flow) and the viscosity of the sample at low shear rates (η0) relate to the area occupied when weight is applied to the formulation. This area is obtained through spreadability measurements, which are simple routine tests for semisolid preparations that indicate how a formulation spreads when applied to the skin, while the other two parameters correspond to a more rigorous analysis of flow properties. The internal structure of formulations and their viscoelastic properties are characterized by determining the storage and loss moduli (G′ and G″) and their dependence on oscillation frequency. In the formulation tested here, elastic behaviour clearly predominated over viscous behaviour (G′ > G″) and both moduli varied with frequency, as characterized by their slopes (i.e., m′ and m″) on a double logarithmic scale.
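The slope of a modulus against frequency on a double logarithmic scale can be estimated by ordinary least squares on the log-transformed data. The sketch below illustrates that idea only; the function name `loglog_slope`, the frequency sweep and the power-law data are hypothetical, and the fitting procedure actually used for m′ and m″ is not described in this section.

```python
import math

def loglog_slope(frequencies_hz, moduli_pa):
    """Least-squares slope of log10(modulus) against log10(frequency)."""
    xs = [math.log10(f) for f in frequencies_hz]
    ys = [math.log10(g) for g in moduli_pa]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical frequency sweep in which G' follows 1000 * f**0.2 exactly,
# so the recovered slope should be close to 0.2.
freqs = [0.1, 1.0, 10.0, 100.0]
g_storage = [1000.0 * f ** 0.2 for f in freqs]
m_prime = loglog_slope(freqs, g_storage)
```

A slope near zero would indicate a modulus that barely changes with frequency, whereas a larger slope indicates a stronger frequency dependence.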
For all parameters tested, total variability, measured as the coefficient of variation (CV) of all replicates of all batches (n = 120), was <15%. The parameters η100, tan δ, m′, m″ and spreadability showed an inter-batch CV ≤ 5.7%. Inter- and intra-batch variability contributed similarly to the total variability of each parameter.
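One way to separate these sources of variability is sketched below. It is an assumption-laden illustration, not the calculation used for Table 1: here the inter-batch CV is taken as the CV of the batch means, and the intra-batch CV as the pooled within-batch standard deviation relative to the grand mean. The function names and the small data set are hypothetical.

```python
import math
import statistics

def cv_percent(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def variability_summary(batches):
    """Total, inter-batch and intra-batch CV for a list of batches of replicates."""
    all_values = [v for batch in batches for v in batch]
    grand_mean = statistics.mean(all_values)
    batch_means = [statistics.mean(batch) for batch in batches]
    # Pooled within-batch variance, weighted by degrees of freedom (assumed definition)
    dof = sum(len(batch) - 1 for batch in batches)
    pooled_var = sum((len(b) - 1) * statistics.variance(b) for b in batches) / dof
    return {
        "total_cv": cv_percent(all_values),
        "inter_batch_cv": 100.0 * statistics.stdev(batch_means) / grand_mean,
        "intra_batch_cv": 100.0 * math.sqrt(pooled_var) / grand_mean,
    }

# Hypothetical data: three batches with three replicates each
batches = [[9.0, 10.0, 11.0], [11.0, 12.0, 13.0], [7.0, 8.0, 9.0]]
summary = variability_summary(batches)
```

In this toy data set the batch means differ more than the replicates within a batch, so the inter-batch CV exceeds the intra-batch CV.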
To evaluate whether these physical parameters followed a normal distribution, the Shapiro–Wilk test [10] was performed. As shown in Figure 1, the experimental distributions of these ten batches were notably different. Based on the p
value of the Shapiro–Wilk test, neither SR
, tan δ
follow a normal distribution. Those physical parameters that do not follow a normal distribution would not qualify for comparison according to the EMA draft guideline [4].
Next, parametric comparison was conducted using three different calculation methods: one batch vs. one batch, the mean of five batches vs. the mean of five batches, and the median batch within five batches vs. the median batch within five batches (Table 2). The values in bold identify the acceptance range that provides the conclusion of equivalence in at least 80% of the comparisons. For the in vitro parameters with inter-batch variability below 3.7% (tan δ, m′, m″ and spreadability), equivalence with a ±10% acceptance range for the 90% CI of the test/reference ratio could be concluded in ≥80% of comparisons, regardless of the comparison method used. For the parameter with an inter-batch CV around 5.7% (η100), 100% of comparisons were successful when five batches were compared, whether by mean values or by the median batch.
However, for those parameters with high inter-batch variability (CV ≥ 9.6%; SR, σ0, η0, G′ and G″), equivalence between batches of the same reference product could not be concluded in more than 80% of comparisons by any of the comparison methods. Thus, according to the EMA draft guideline, these batches of the same reference formulation would not be considered equivalent in half of the physical parameters evaluated [4], even though more than the minimum required number of batches (five instead of only three) were tested. Since the equivalence criterion seems to be inappropriate when inter-batch variability is large, the acceptance range could be widened under these circumstances.
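The comparison schemes above can be pictured as enumerations over the available batches. The sketch below assumes that every possible pairing is evaluated, which this section does not state; the function names and the placeholder predicate `is_equivalent` are hypothetical, and the equivalence test itself is deliberately left abstract.

```python
from itertools import combinations

def one_vs_one(batch_ids):
    """All unordered pairs of distinct batches."""
    return list(combinations(batch_ids, 2))

def group_vs_group(batch_ids, group_size=5):
    """All unordered splits into two disjoint groups of `group_size` batches."""
    ids = list(batch_ids)
    splits = []
    for group_a in combinations(ids, group_size):
        rest = [b for b in ids if b not in group_a]
        for group_b in combinations(rest, group_size):
            if group_a < group_b:  # keep each unordered split only once
                splits.append((group_a, group_b))
    return splits

def success_rate(comparisons, is_equivalent):
    """Percentage of comparisons for which the supplied predicate returns True."""
    passed = sum(1 for a, b in comparisons if is_equivalent(a, b))
    return 100.0 * passed / len(comparisons)

batch_ids = range(10)                 # ten batches, as in the study
pairs = one_vs_one(batch_ids)         # 45 pairs
splits = group_vs_group(batch_ids)    # 126 splits of 5 vs. 5
# Placeholder predicate: a real one would apply the 90% CI criterion to the data
rate = success_rate(pairs, lambda a, b: True)
```

With ten batches, exhaustive enumeration gives 45 one-vs-one pairs and 126 five-vs-five splits; a success percentage is then the share of those comparisons in which the CI falls inside the acceptance range.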
When widening the acceptance range to ±15%, a range applied for orally inhaled products [12], equivalence could be concluded in ≥80% of comparisons for almost all parameters (Table 2). The only parameter for which equivalence could not be concluded was SR, which, with 79% successful comparisons, was very close to being considered equivalent.
The frequency with which equivalence could be concluded varied between comparison methods. The comparison of the population means (“5 vs. 5 batches” method) exhibited a higher success percentage than the comparison of individual batches (“1 vs. 1 batch” method) or the comparison of the batches with the median values within five batches (“median batch vs. median batch”) (Table 2). With a 15% acceptance range, for instance, equivalence for
, the parameter with the highest inter-batch variability, could be concluded in 42%, 86% and 62% of cases with the “1 vs. 1”, “5 vs. 5” and the “median batch vs. median batch” method, respectively.
Selection of the most representative batch for comparison, as done when comparing the median batch within five batches vs. the median batch within five batches, is required for in vivo studies [13], as evaluation of more than one batch is not feasible. According to the data presented here, this approach would not be appropriate for in vitro equivalence evaluations, while the comparison of means, as proposed by the EMA [4], is the preferable method.
Using conventional sample size calculation, a sample size of 56 per group would provide 80% power to conclude equivalence within an acceptance range of ±10%, assuming a difference between test and reference of 5% and an inter-individual CV of 12%. However, as the data presented here show, an even larger sample size (five batches per group and 12 replicates per batch, i.e., 60 samples per group) led to a conclusion of equivalence in less than 80% (76%) of the comparisons between groups of 5 batches (Table 2) for a rheological parameter (η0) with a total CV of 11.8% (and an inter-batch CV of 9.6%) (Table 1) and a theoretical difference between batches of zero (batches of the same reference product). Therefore, a sample size calculation that takes inter-batch variability into account would be necessary to ensure that the experiments have the desired power. In the absence of proper sample size calculations, pharmaceutical companies might be tempted to conduct pilot experiments to select those batches that behave similarly before performing a formal comparison for regulatory submission, which could be considered data manipulation.
Since the parametric comparison relies on the assumption of normality of the evaluated parameters, a requirement not met by most of the rheological parameters evaluated here, a bootstrap analysis was performed. Bootstrap methodology is currently used in several disciplines including drug research, where it has improved the comparison of dissolution formulations under high variability conditions [14], as well as the evaluation of population pharmacokinetic/pharmacodynamic models [15]. The bootstrap methodology randomly samples batches (without replacement) and replicates, and constructs a non-parametric CI from the percentile distribution of the geometric means. Thus, this methodology does not require the assumption of a normal distribution.
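The resampling scheme described above can be sketched as follows. This is an illustrative reading rather than the authors' implementation: batches are drawn without replacement into disjoint test and reference groups, replicates within each batch are assumed here to be resampled with replacement, and the 90% CI is taken from the 5th and 95th percentiles of the resampled geometric mean ratios. The function names and the demonstration data are hypothetical.

```python
import math
import random

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

def bootstrap_ratio_ci(batches, n_batches=5, n_replicates=12,
                       n_iter=10_000, level=0.90, seed=0):
    """Percentile CI of the test/reference geometric mean ratio."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_iter):
        # Batches are drawn without replacement into two disjoint groups
        chosen = rng.sample(range(len(batches)), 2 * n_batches)
        test_ids, ref_ids = chosen[:n_batches], chosen[n_batches:]
        # Assumption: replicates are resampled with replacement within a batch
        test = [v for i in test_ids for v in rng.choices(batches[i], k=n_replicates)]
        ref = [v for i in ref_ids for v in rng.choices(batches[i], k=n_replicates)]
        ratios.append(geometric_mean(test) / geometric_mean(ref))
    ratios.sort()
    tail = (1.0 - level) / 2.0
    lower = ratios[int(tail * n_iter)]
    upper = ratios[int((1.0 - tail) * n_iter) - 1]
    return lower, upper

def within_limits(ci, margin=0.10):
    """True if the whole CI lies inside 1 ± margin."""
    lower, upper = ci
    return (1.0 - margin) <= lower and upper <= (1.0 + margin)

# Hypothetical data: ten batches of 12 replicates with small differences
demo_batches = [[100.0 + b + 0.1 * r for r in range(12)] for b in range(10)]
ci = bootstrap_ratio_ci(demo_batches, n_iter=2000)
equivalent = within_limits(ci)
```

Setting `n_batches=1` would correspond to a one-batch-versus-one-batch comparison, and changing `n_replicates` allows the effect of the number of replicates to be explored.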
A bootstrap analysis of the physical parameters was performed using the experimental data described above. The 10,000 geometric mean ratios with 12 replicates and the non-parametric 90% CI were represented, using either the “1 batch vs. 1 batch” (Figure 2) or the “5 batches vs. 5 batches” (Figure 3) comparison method. The non-parametric 90% CI of the physical parameters tan δ, m′, m″ and spreadability lay within the ±10% limits irrespective of the method of comparison. Thus, equivalence among batches could be concluded for parameters with total variability <5%. However, for parameters with a total CV between 5 and 10% (i.e., η100), equivalence could only be concluded with the “5 vs. 5” comparison method, the probability distributions of the geometric means of the “1 vs. 1” comparison method being notably wider (Figure 2 and Figure 3). For rheological parameters with a total CV > 10% (SR, σ0, η0, G′ and G″), equivalence could not be concluded by either the “1 vs. 1” or the “5 vs. 5” method.
Next, the impact of the number of replicates on the bootstrap analysis was evaluated. The conclusion regarding confirmation of equivalence was independent of whether 6, 12 or 24 replicates were used (Supplementary Figures S2, S3, S4 and S5). The distribution values were quite similar and had no impact on the 90% CI estimation. Altogether, these data indicate that 6 replicates are enough to conclude equivalence by bootstrap methodology. For physical parameters with high inter-batch variability, however, the acceptance limit of ±10% was too strict to permit a conclusion of equivalence between batches of the same reference product.
To sum up, the same conclusions on equivalence could be drawn with either parametric or non-parametric analysis, using 12 or six replicates, respectively. Equivalence in microstructure between batches of the same reference drug for topical use could not be demonstrated for most of the parameters tested when following the EMA draft guideline for comparison of test and reference drugs for topical use [4]. According to this guideline, the 90% CI for the difference of means of the test and comparator products should be contained within the acceptance range of ±10% of the comparator product mean, assuming normal distribution of data, with at least three batches of test product and three batches of reference product and at least 12 replicates per batch [4]. When comparing individual batches, equivalence could only be concluded for those parameters with a total CV < 5%. When comparing the median values of five batches, equivalence could only be concluded for parameters with a total CV < 10%. If the comparisons of the means had been done with only three batches, strictly following the minimum requirement of the EMA guideline, equivalence would only have been concluded for parameters with less than 5.7% inter-batch variability and, consequently, fewer microstructure parameters would have been considered equivalent.
Altogether, the data presented here suggest that an acceptance range of ±10% to conclude equivalence of microstructure parameters of semisolid dosage forms [4] is too strict, given the high inter-batch difference observed between batches of the same reference product. The following approaches could overcome this handicap: (1) a sample size calculation that takes the inter-batch variability into account, to ensure that the number of batches is appropriate to obtain the desired power, or (2) widening the limits to ±15%, since this acceptance range seems to be suitable when the products are supposed to be the same (Table 2 and Figure 3), or to ±20%, which is the conventional acceptance range for AUC and Cmax in pharmacokinetic bioequivalence studies [18] and is also proposed by other authors for the comparison of some in vitro parameters [19].