Quality by Design Approach Using Multiple Linear and Logistic Regression Modeling Enables Microemulsion Scale Up

The development of pharmaceutical nanoformulations has accelerated over the past decade. However, nano-sized drug carriers continue to meet substantial regulatory and clinical translation challenges. To address some of these key challenges in early development, we adopted a quality by design approach to develop robust predictive mathematical models for microemulsion formulation, manufacturing, and scale-up. The presented approach combined risk management, design of experiments, multiple linear regression (MLR), and logistic regression to identify a design space in which microemulsion colloidal properties were dependent solely upon microemulsion composition, thus facilitating scale-up operations. The developed MLR models predicted microemulsion diameter, polydispersity index (PDI), and diameter change over 30 days of storage, while logistic regression models predicted the probability of a microemulsion passing quality control testing. A stable microemulsion formulation was identified and successfully scaled up tenfold to 1 L without impacting droplet diameter, PDI, or stability.


Introduction
The publication rate in drug delivery using nanoformulations has dramatically increased in the past decade, reaching 24,665 publications by 2017 [1]. However, this rapid increase in publications has coincided with a limited number of clinical trials and approved treatments. Several explanations have been proposed to justify the lack of clinical translation of nanoformulations for drug delivery, including a lack of translatability from animal models to humans, an overemphasis on the advantages of nanoformulations, confirmation bias, and the development of increasingly complex nanoformulations [1]. These problems are all relevant and need to be addressed in order for nanoformulations to reach the market. We postulate that quality by design approaches used throughout the pharmaceutical industry could be adapted to nanoformulation development. Doing so may allow challenges to be identified and corrected early on, resulting in higher quality nanoformulations during early developmental stages.
Quality by design (QbD) is defined by the International Council for Harmonization (ICH) Q8(R2) document as "a systematic approach to development that begins with predefined objectives and emphasizes product and process understanding and process control, based on sound science and quality risk management". A QbD approach includes but is not limited to (1) identification of critical quality attributes (CQAs), (2) risk analysis, (3) design of experiments (DoE), and (4) identification of critical process parameters (CPPs). CQAs are measurable quality attributes that have a significant impact on final product quality. After CQA identification, risk analyses such as a failure mode, effects, and criticality analysis (FMECA) are used to strategically rank potential failure modes and narrow down the list of process parameters to those that are most likely to significantly affect the CQAs [2]. DoE is then developed to study the relationships between the process parameters and the CQAs in an efficient way. DoE allows for sophisticated studies in which multiple factors are changed at a time, minimizing the required number of experimental runs and increasing efficiency compared to traditional one factor at a time approaches [3]. CQAs are measured for each run of the DoE, and statistical regression approaches (multiple linear regression, logistic regression, etc.) or other modeling approaches are used to identify the CPPs, the process parameters that have a significant impact on the CQAs.
QbD approaches can lead to the development of quality nanoformulations while minimizing risk, time, and resources. QbD has been applied to a wide variety of nano-sized formulations in recent years, including liposomes, emulsions, particles, micelles, and suspensions [4]. Recent examples include liposomes co-encapsulating doxorubicin and curcumin [5], topical microemulsion-based hydrogels containing itraconazole [6], aceclofenac-loaded nanostructured lipid carriers [7], rosuvastatin calcium solid lipid nanoparticles [8], quercetin-salicylic acid nanomixed micelles [9], and furosemide nanosuspensions [10]. Here, microemulsions are used as a model to demonstrate the usefulness and application of QbD approaches to nanoformulation development. Microemulsions are optically transparent emulsions that typically range from 20-100 nm in diameter [11]. Microemulsions are thermodynamically stable, allowing them to undergo emulsification spontaneously, a process known as emulsion inversion. During emulsion inversion, the emulsion proceeds through an intermediate phase of minimal surface tension to invert from a water in oil to an oil in water emulsion, or vice versa [12]. Emulsification can be directed through modification of the system's salinity, pH, or temperature, all of which can modify the hydrophilic/lipophilic balance [13]. Alternatively, emulsification can occur through composition modification (e.g., modification of the water to oil ratio), a process known as the water titration method [12]. Microemulsion formulation via water titration is common, and many groups have used design of experiments or QbD to develop microemulsions using water titration. Many of these reports focus solely on microemulsion composition. However, several reports have found that two process parameters, water addition rate and stir rate, impact microemulsion diameter [14-16].
To increase the chances of successful scale up, it is critical to understand the impact, if any, of process parameters on microemulsion properties, such as diameter. Therefore, we propose that a thorough understanding of both composition and process parameters is necessary for quality microemulsion development. Approaches that study both composition and process are known as mixture process variable studies. Mixture process variable approaches have been utilized in the development of microemulsion electrokinetic chromatography methods for the identification of diclofenac [17] and almotriptan [18] and their impurities. To the best of our knowledge, the mixture process variable approach for the development of a microemulsion via water titration has not been reported.
As mentioned previously, the QbD approach utilizes a statistical experimental design. Statistical regression approaches are used to identify CPPs and describe the relationships between CPPs and CQAs. A variety of modeling methods have been applied to microemulsions, including multiple linear regression (MLR) [19], partial least squares [20,21], logistic regression [22], and artificial neural networks [23-26]. Artificial neural networks are advantageous in that they can detect nonlinear relationships between variables and all possible interactions between variables [27]. However, artificial neural networks cannot explicitly approximate the significance of each input variable on the output [27]. Conversely, MLR assigns a numerical value (regression coefficient) to each input variable and evaluates the significance of each input variable on the output. We therefore used MLR to evaluate the impact of microemulsion composition and water titration parameters on day 1 diameter, day 1 polydispersity index (PDI), and 30-day percent diameter increase. A unique aspect of partial least squares is that it can handle more input variables than runs [28], and a unique aspect of logistic regression is that the output variables are binary. Therefore, logistic regression can be used to model a binary response, such as whether a microemulsion formulation meets or fails to meet a CQA specification. For this reason, logistic regression was used to model the probability of microemulsion formulations meeting CQA specifications.
We postulate that nanoformulation quality and process understanding can be improved through the utilization of QbD. In the presented work, we applied QbD methodology to a microemulsion as an example of nanoformulation. The aims of the presented work were (1) to identify stable, robust microemulsions and (2) to understand the processes that impact microemulsion diameter, PDI, and stability. To achieve these aims, we identified CQAs that would facilitate identification of stable microemulsions. We then used FMECA as an example risk analysis approach to identify potential CPPs. These potential CPPs were used to develop a screening mixture process variable DoE. The screening DoE enabled us to identify a design space in which microemulsion diameter, PDI, and stability were dependent solely upon microemulsion composition. We augmented the screening DoE to further explore this design space using MLR and logistic regression models. Specifically, we used MLR to predict microemulsion diameter, PDI, and 30-day percent diameter change as a function of microemulsion composition, and we used logistic regression to evaluate CQAs representative of stress stability. Extensive quality control (thermal cycling and shelf life studies) enabled the development of more accurate logistic models. Improved understanding of these processes allowed us to identify stable, robust microemulsions and successfully scale up a microemulsion formulation tenfold. The marriage of DoE, multiple linear regression, and logistic regression can be used to collect and analyze process information in an efficient manner. This work serves as a demonstration of how QbD approaches can be applied to other classes of nanoformulations.

Identification of Critical Quality Attributes (CQAs)
The aims of the presented work were (1) to identify stable, robust microemulsions and (2) to understand the processes that impact microemulsion diameter, PDI, and stability. The selected CQAs reflect this. Microemulsions underwent rigorous evaluation to simulate the stresses experienced upon sterilization, evaluation in in vitro pharmacological studies, transportation, and storage. These evaluative quality control tests included: (1) filtration through a 0.22 µm syringe filter; (2) centrifugation at 1620xg for 30 minutes; (3) thermal cycling test; (4) storage at ambient temperature for 30 days. CQAs were defined as microemulsion diameter change and PDI in response to each of these quality control tests and are listed in Table 1.

Selection of Microemulsion Excipients
Microemulsions consisted of oil, surfactant, co-surfactant, two solubilizers (transcutol and propylene glycol), and water. Microemulsion excipients, all on the Food and Drug Administration generally regarded as safe list, were selected, such that the final formulation could be adjusted to incorporate different drugs or multiple drugs, depending upon the disease target. Miglyol 812N was chosen as the oil, because it solubilizes lipophilic compounds better than many hydrocarbon oils [29]. Kolliphor EL and transcutol were chosen as the surfactant and primary solubilizer, respectively, because they are the second and third most commonly used excipients in microemulsion formulations [30]. PEG 400 was chosen as the co-surfactant, because multi-drug delivery microemulsion formulations have been developed with a surfactant/co-surfactant combination of Kolliphor EL and PEG 400 in a 1:1 w/w ratio [31,32]. Finally, propylene glycol was chosen as the second solubilizer, because it is a commonly used solubilizer [30] that is compatible with transcutol.

Risk Assessment and Selection of Critical Process Parameters
In practice, quality risk assessment considers each unit operation in the manufacture of a pharmaceutical product. In the case of emulsification by water titration, there is one primary unit operation. The water titration method has many variables to consider, including water addition rate, stir rate, vessel size and geometry, temperature, composition, and batch size. Given material and time limitations, it is important to select the most valuable factors to study. In the presented work, failure mode, effects, and criticality analysis (FMECA) was used as a risk assessment tool to strategically rank failure modes during microemulsion production and identify potential critical process parameters. As part of this risk assessment, a risk priority number was assigned to each failure mode. The risk priority number is calculated as the product of three factors: severity, frequency of occurrence, and detectability. Each factor is rated on a scale from 1 to 5 (low to high risk), as shown in Table 2. Potential failure modes were defined for each CQA listed in Table 1, and the process parameters that received the highest risk priority numbers were identified as potential critical process parameters. FMECA was chosen as the risk analysis method for this work due to its wide applicability in design and manufacturing, its strength in assessment of individual failure modes, and its common use in pharmaceutical manufacturing [33]. An abridged version of the risk assessment is shown in Table 3, and the full risk assessment is shown in the Supplemental Information. FMECA was used in the development of solid self-nanoemulsifying oily formulations (S-SNEOFs) to identify S-SNEOF lipid, surfactant, and co-solubilizer concentrations as high-risk factors [34]. Other groups found that water addition rate and stir rate during water titration impact emulsion diameter [14-16], though FMECA was not utilized in these reports.
Given this, we postulated that microemulsion composition and water titration process parameters could significantly impact the defined CQAs. These points are reflected in the FMECA (Table 3). The highest risk priority numbers correspond to high oil to surfactant ratio, fast water addition rate, and slow stir rate. To the best of our knowledge, this is the first example of FMECA applied to microemulsion formulation development.
Table 3. Abridged risk analysis with rankings. Using ranked risk priority numbers (RPNs), a methodical risk assessment identified the most influential parameters on microemulsion critical quality attributes. These parameters were studied in a screening mixture process variable design of experiments.
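The risk priority number calculation described in this section can be illustrated with a short Python sketch. The failure modes are the three named in the text plus one extra entry, but every score is a hypothetical placeholder chosen only to show the arithmetic and ranking; the actual scores are in Table 3 and the Supplemental Information. The `FailureMode` class and its field names are assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One FMECA row; each score runs from 1 (low risk) to 5 (high risk)."""
    description: str
    severity: int
    occurrence: int
    detectability: int

    def rpn(self) -> int:
        # Risk priority number = severity x occurrence x detectability.
        for score in (self.severity, self.occurrence, self.detectability):
            if not 1 <= score <= 5:
                raise ValueError("each score must be between 1 and 5")
        return self.severity * self.occurrence * self.detectability


# Hypothetical scores, NOT the values reported in the study.
failure_modes = [
    FailureMode("vessel geometry", 2, 2, 2),
    FailureMode("slow stir rate", 4, 3, 2),
    FailureMode("high oil to surfactant ratio", 5, 4, 3),
    FailureMode("fast water addition rate", 4, 3, 3),
]

# Rank failure modes from highest to lowest risk priority number.
ranked = sorted(failure_modes, key=lambda mode: mode.rpn(), reverse=True)

for mode in ranked:
    print(f"{mode.rpn():3d}  {mode.description}")
```

The highest-ranked rows are the ones that would be carried forward as potential critical process parameters in a screening design.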

Design of Experiments-Screening Design
Based upon the FMECA, a 15-run D-optimal screening design of experiments was developed to study the impact of stir rate, water addition rate, and microemulsion composition on microemulsion CQAs (runs 1-15, Table 4). CQAs were measured for each run, and the results are presented in Table 5. To assess microemulsion reproducibility, four runs from the screening DoE (2, 3, 8, and 13) were replicated in triplicate, and CQAs were measured for each replicated microemulsion. The average and standard deviation of each triplicated CQA measurement are reported in Table 6. Microemulsions were consistently reproduced and performed similarly under all quality control tests.
Table 4. Runs in the design of experiments. Column headers include Stir Rate (rpm) and Water Addition Rate (mL/min). Runs 1-15 were part of a screening design that was used to identify parameters that significantly contributed to microemulsion diameter, PDI, and 30-day percent diameter increase. The design was then augmented to include runs 16-30 that enabled the study of interactions between these significant parameters. ** Indicates that the run was replicated in triplicate.
MLR was used to identify the process parameters that were most likely to significantly impact the CQAs of (1) day 1 diameter, (2) day 1 PDI, and (3) 30-day percent diameter change. All process parameters were included in these screening MLR models, and p-values for parameter estimates are shown in Table 7. Oil and transcutol significantly contribute to day 1 diameter, while oil, transcutol, and water significantly contribute to day 1 PDI (p-value < 0.05). Oil content was the only significant parameter in the 30-day percent diameter change model (p-value = 0.0008). Interestingly, water addition rate and stir rate did not have significant p-values in any of the MLR models. We concluded that in this specific design space, microemulsion diameter, PDI, and 30-day percent diameter change are dependent solely upon microemulsion composition. Specifically, microemulsion oil content appears to be the most significant predictor of the three studied CQAs, as oil had the lowest p-value in all screening MLR models.
Two microemulsions (runs 5 and 10) from the screening design failed to meet the day 1 diameter and PDI specifications (Table 5). Specifically, day 1 diameters were 116.50 and 66.49 nm for runs 5 and 10, respectively. The day 1 diameter for run 5 was found to be a statistical outlier (Grubbs' test, p-value 0.01). Interestingly, runs 5 and 10 had the same composition, but different stir rates and water addition rates. This is inconsistent with the MLR results, which suggest that these parameters do not significantly impact microemulsion diameter.
Upon closer inspection, three additional microemulsion pairs had identical composition but a different stir rate or water addition rate (runs 6 and 11, runs 9 and 14, and runs 4 and 15, Table 4). All three pairs had comparable day 1 diameters (less than 1 nm difference, Table 5), which is consistent with the MLR results. We hypothesized that runs 5 and 10 behave differently because they have a higher internal phase (oil, transcutol, and propylene glycol) to surfactants ratio than any other formulation. The internal phase to surfactants ratio was equal to 0.6 for runs 5 and 10, but 0.51 or less for all other formulations. We hypothesized that there is a specific design space in which stir rate and water addition rate do not significantly impact microemulsion diameter and other CQAs, and that a high internal phase to surfactants ratio causes runs 5 and 10 to fall out of this design space. Therefore, runs 5 and 10 were removed from all further analysis.
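The internal phase to surfactants ratio used to explain runs 5 and 10 can be sketched in Python as follows. The function names, the example compositions, and the use of 0.51 as a cutoff are assumptions made for illustration: the text reports only that the ratio was 0.6 for runs 5 and 10 and 0.51 or less for all other formulations, not the individual excipient weights.

```python
def internal_phase_ratio(oil, transcutol, propylene_glycol,
                         surfactant, cosurfactant):
    """Ratio of internal phase (oil + both solubilizers) to total surfactants.

    All arguments are weight percentages of the formulation.
    """
    surfactants = surfactant + cosurfactant
    if surfactants <= 0:
        raise ValueError("total surfactant content must be positive")
    return (oil + transcutol + propylene_glycol) / surfactants


# Assumed cutoff: the highest ratio among formulations that behaved
# consistently in the screening design, as reported in the text.
RATIO_LIMIT = 0.51


def within_design_space(ratio, limit=RATIO_LIMIT):
    """True when the ratio does not exceed the assumed cutoff."""
    return ratio <= limit


# Hypothetical compositions (weight %), NOT taken from Table 4.
high_ratio = internal_phase_ratio(6.0, 4.5, 3.0, 11.25, 11.25)   # about 0.6
low_ratio = internal_phase_ratio(4.0, 3.0, 3.0, 12.5, 12.5)      # 0.4

print(round(high_ratio, 2), within_design_space(high_ratio))
print(round(low_ratio, 2), within_design_space(low_ratio))
```

A check of this kind only flags formulations that resemble runs 5 and 10; it does not establish where the true boundary of the design space lies.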

Design of Experiments-Augmented Design
Through the screening design of experiments, we were able to (1) identify a subset of parameters that significantly impacted microemulsion diameter, PDI, and 30-day percent diameter change; and (2) identify a narrower design space that enabled production of microemulsions in the desired diameter and PDI range. We used this information to add 15 experimental runs to the design of experiments. In this augmented design, the internal phase was adjusted from its original maximum of 13.5% formulation weight to 12%, thus reducing the maximum internal phase to surfactant ratio. Further, the process parameters of water addition rate and stir rate were eliminated from the augmented design. Limiting the number of studied parameters increased the number of levels for each parameter and allowed us to study the impact of interaction terms. Including interaction terms has the potential to improve the predictive capabilities of the model. The 15 additional runs added to the DoE are shown in Table 4 (runs 16-30). All microemulsions in the augmented DoE underwent the same quality control tests as the formulations in the screening DoE. The results of these quality control tests are shown in Table 5.
Results from the augmented design of experiments were used to develop MLR models for day 1 diameter, day 1 PDI, and 30-day percent diameter change as a function of microemulsion composition. Figure 1 shows a comparison of actual and predicted plots for each model, and model R² and RASE values are shown for training and validation sets in Table 8. Model terms, parameter estimates, standard errors, and p-values are shown in Table 9.
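For readers unfamiliar with the two fit statistics, the following Python sketch shows how R² and RASE can be computed from actual and predicted values. It assumes RASE denotes the root average squared error, that is, the square root of the mean squared residual; the diameter values are hypothetical and are not the data behind Table 8.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rase(actual, predicted):
    """Root average squared error (assumed definition): sqrt(SSE / n)."""
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(sse / len(actual))


# Hypothetical day 1 diameters (nm), NOT values from the study.
actual_nm = [30.0, 35.0, 40.0, 45.0]
predicted_nm = [31.0, 34.0, 41.0, 44.0]

fit_r2 = r_squared(actual_nm, predicted_nm)
fit_rase = rase(actual_nm, predicted_nm)
print(f"R^2 = {fit_r2:.3f}, RASE = {fit_rase:.2f} nm")
```

Computing both statistics separately on training and validation runs, as in Table 8, gives a quick check for overfitting: a validation RASE much larger than the training RASE would be a warning sign.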

Predicting Microemulsion Stability with Logistic Regression
The majority of microemulsions presented here met the CQA specifications for filtration and centrifugation. However, fourteen formulations failed to meet thermal cycling or 30-day CQA specifications. We therefore chose to focus on the development of logistic regression models that could predict the probability that a microemulsion would meet the CQA specifications for thermal cycling and 30-day stability.
Upon closer inspection of the data, there appeared to be a trend between day 1 diameter and whether the microemulsion met all CQA specifications. Logistic regression models were developed to predict the probability of a microemulsion meeting the CQA specifications of 30-day percent diameter change (Figure 2A) and day 30 PDI (Figure 2C) as a function of day 1 diameter. The left-hand confusion tables (Table 10) demonstrate the accuracy of these models. The 30-day percent diameter change logistic model has a single misclassification in both the training and validation data sets, and the day 30 PDI logistic model has a misclassification in the training data set. To improve predictive accuracy, logistic models were modified to predict the probability of a microemulsion meeting both the thermal cycling and 30-day CQA specifications. The first of these logistic models predicted whether a microemulsion would meet the 30-day percent diameter change and the thermal cycling diameter change CQA specifications (Figure 2B). Similarly, a logistic model was developed to predict the probability that a microemulsion would meet the day 30 PDI and the thermal cycling PDI CQA specifications (Figure 2D). Modifying the response to incorporate two CQA specifications from different stability tests resulted in more accurate logistic models, as both developed models had zero misclassifications in the training and validation data sets (Table 10).

Figure 2. Logistic regression models predict the probability that a microemulsion will meet one or more CQA specifications based upon its day 1 diameter measurement. (A) 30-day percent diameter change; (B) 30-day percent diameter change and thermal cycling percent diameter change; (C) day 30 PDI; (D) day 30 PDI and thermal cycling PDI. The predictive accuracy of the logistic models improves when two CQA specifications must be met.
Table 10. Confusion tables for the logistic models that use day 1 microemulsion diameter to predict whether a microemulsion will meet the CQA specifications of thermal cycling diameter change and 30-day diameter change. Panels: 30-Day % Diameter Change; 30-Day % Diameter Change and Thermal Cycling % Diameter Change. Each panel lists predicted counts for the training set.
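A minimal sketch of the kind of single-predictor logistic model and confusion table described in this section is given here in plain Python. It is not the authors' implementation: the fitting routine (Newton's method), the 0.5 classification threshold, and all diameters and pass/fail labels are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, iterations=25):
    """Fit P(pass) = sigmoid(b0 + b1 * x) by Newton's method.

    x is centered internally for numerical stability; the returned
    intercept and slope refer to the original (uncentered) x.
    """
    x_mean = sum(x) / len(x)
    xc = [xi - x_mean for xi in x]
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p = [sigmoid(b0 + b1 * xi) for xi in xc]
        # Gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for yi, pi, xi in zip(y, p, xc))
        # Entries of the 2x2 information matrix X'WX.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, xc))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, xc))
        det = h00 * h11 - h01 * h01
        # Newton step: beta += inverse(X'WX) * gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0 - b1 * x_mean, b1


def confusion_table(actual, probabilities, threshold=0.5):
    """Count (actual, predicted) outcome pairs at the given threshold."""
    table = {(1, 1): 0, (1, 0): 0, (0, 1): 0, (0, 0): 0}
    for a, p in zip(actual, probabilities):
        table[(a, 1 if p >= threshold else 0)] += 1
    return table


# Hypothetical day 1 diameters (nm) and pass (1) / fail (0) labels.
diameters = [25.0, 28.0, 30.0, 33.0, 35.0, 38.0, 40.0, 42.0, 45.0, 50.0]
passed = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

intercept, slope = fit_logistic(diameters, passed)


def pass_probability(diameter_nm):
    return sigmoid(intercept + slope * diameter_nm)


fitted = [pass_probability(d) for d in diameters]
table = confusion_table(passed, fitted)
print(f"slope = {slope:.3f} per nm")
print(table)
```

The example labels overlap deliberately so that the maximum likelihood estimate exists. With perfectly separated data, such as a model with zero misclassifications, the unpenalized estimates grow without bound, and a penalized or exact method would be a more appropriate choice.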

Microemulsion Scale-Up to 1 L
Microemulsions were reproduced consistently on a 100 mL scale (Table 6). Further, we found that for this specific design space, water addition rate and stir rate did not significantly impact microemulsion diameter or PDI. We hypothesized that this would make the microemulsion formulation robust and therefore easier to scale up. We selected formulation 6 (Table 4), because this formulation met all CQA specifications, had a low concentration of surfactants, and had high concentrations of both solubilizers. Therefore, formulation 6 has the potential to solubilize higher concentrations of one or more lipid-soluble drugs. Formulation 6 was scaled up to 1000 mL in triplicate. All scaled up batches underwent the same CQA testing as the 100 mL batches. Microemulsion scale-up was reproducible and comparable to the 100 mL batch (Figure 3, Table 11).
Table 11. Summary of CQA specification testing for scaled up (1000 mL) microemulsions. The average and standard deviation of three scaled up microemulsions were calculated for each CQA test and compared to the result for the same formulation produced on a 100 mL scale.

Discussion
To fully evaluate stress stability of a product, it is critical to have adequate quality control analyses in place. However, there is a lack of extensive, orthogonal quality control analyses in early microemulsion development. Orthogonal quality control analyses such as centrifugation and thermal cycling are becoming more commonly used in microemulsion literature reports [35,36], but this use is inconsistent. Many microemulsion literature reports only investigate droplet diameter and PDI over the span of several months, and microemulsions exhibiting little change in these properties are deemed stable. The time it takes to identify unstable formulations using this approach is prohibitive. Additionally, this approach is not representative of the stresses that microemulsions endure upon transportation or evaluation in in vitro pharmacological studies. In the presented work, we demonstrated that logistic regression could be used to predict the probability that a microemulsion formulation would meet one or more CQA specifications. We also demonstrated that model prediction accuracy was improved when multiple CQA specifications needed to be met simultaneously. This highlights the importance of rigorous quality control analyses early in the development of microemulsions and other nanoformulations. For example, consider run 28, Table 5. This formulation met the CQA specifications for 30-day stability but failed to meet the CQA specifications for thermal cycling. Conversely, runs 13, 16, and 24 (Table 5) met the CQA specifications for thermal cycling but failed to meet the CQA specifications for 30-day stability. Had these formulations been subjected to only one stability test, they may have been deemed stable and proceeded to further testing. 
The extensive quality control analyses presented here demonstrate that thorough quality control testing early in the development process can aid in the identification of unsuitable or unstable formulations early, saving time and resources.
Interestingly, the MLR models for day 1 diameter and day 1 PDI were similar. The only difference between these two models was that the diameter model included propylene glycol, while this main effect was not significant in the PDI model. It was also interesting that day 1 diameter was able to accurately predict the probability that the CQA specifications for thermal cycling PDI and day 30 PDI were met (Figure 2C,D). To further investigate this potential relationship between diameter and PDI, day 1 PDI was plotted as a function of day 1 diameter, and a single regression line was fit to the data (Figure 4). There is a significant correlation between microemulsion diameter and PDI (R² = 0.9016, p-value < 0.0001). This is not surprising when considering the methods used to calculate these parameters. Diameter measurements reported here are calculated with Zetasizer Nano software (Malvern, UK) from the signal intensity, using a cumulants analysis that is the fit of a polynomial to the log of the G1 correlation function [37]. The fitted polynomial is shown in the equation below:
$$\ln[G_1] = a + bt + ct^2$$
The terms a, b, and c are the coefficients of the fitted polynomial, and t is time. The second order term is used to define the degree to which the data bear a resemblance to single decay, and the first order term defines diffusion rate, where the coefficient b is the z-average diffusion coefficient, used to calculate particle diameter. This coefficient b, along with the coefficient c, is used to calculate PDI as shown in the equation below:
$$\mathrm{PDI} = \frac{2c}{b^2}$$
Since both diameter and PDI are dependent upon the z-average diffusion coefficient b, it is reasonable that a correlation would exist between diameter and PDI. The fact that both MLR and logistic regression models confirm this correlation is encouraging and suggests that these statistical regression approaches were able to facilitate our understanding of the relationships between microemulsion composition, diameter, and PDI.
Models capable of predicting microemulsion stability, including the 30-day percent diameter change MLR model (Figure 1C, Tables 8 and 9) and the logistic regression models (Figure 2, Table 10), are particularly useful, because they enable the prediction of microemulsion stability prior to production (MLR) or only 24 h after production (logistic regression). This has the potential to save significant time and resources, as microemulsions unlikely to meet CQA specifications can be discarded without running further tests.
Through a screening DoE followed by subsequent augmentation to the DoE, we were able to identify a design space in which stir rate and water addition rate did not have a significant impact on microemulsion diameter, PDI, or stability. This suggests that the presented microemulsion production approach can tolerate changes in the manufacturing process parameters, as long as composition is unchanged. This was confirmed, as we were able to consistently scale up a select microemulsion formulation to ten times its original volume. Successful and reproducible scale up has the potential to ease the transition into extensive animal work or pre-clinical trials, and is therefore useful for nanoformulations in particular, as scale up can prove challenging for these types of formulations.
Risk assessment using FMECA, in combination with a screening DoE, led to the rapid and efficient identification of several stable microemulsion formulations and the identification of a robust design space that enabled microemulsion scale up. Risk analysis narrowed down the number of process parameters that were studied, and the use of DoE further saved time by minimizing the number of runs that were needed to understand the relationship between the process parameters and the CQAs. To the best of our knowledge, this is the first report of a combined risk analysis, DoE, MLR, and logistic regression approach applied to a microemulsion formulation. This combination of experimental design and modeling techniques was powerful, because it allowed us to use microemulsion composition to predict diameter, PDI, 30-day percent diameter change, and probability of meeting multiple CQA specifications. The QbD approach presented here can be adapted to the development of other nanoformulations and has the potential to reduce expenses through the prediction of unstable and/or unsuitable formulations in these early developmental stages.
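The dependence of both diameter and PDI on the cumulants coefficients discussed above can be illustrated with a small Python sketch. It assumes the standard cumulants forms, a quadratic fit ln[G1] = a + bt + ct² and PDI = 2c/b², and uses synthetic noise-free data in arbitrary time units; the function names and coefficient values are hypothetical, and this is not the instrument software's algorithm.

```python
def fit_quadratic(t, y):
    """Least-squares fit of y = a + b*t + c*t**2 via the normal equations."""
    power_sums = [sum(ti ** k for ti in t) for k in range(5)]
    rhs = [sum(yi * ti ** k for ti, yi in zip(t, y)) for k in range(3)]
    # Augmented 3x4 matrix of the normal equations.
    m = [[power_sums[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, 3):
            factor = m[row][col] / m[col][col]
            for k in range(col, 4):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        tail = sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (m[i][3] - tail) / m[i][i]
    return tuple(coef)  # (a, b, c)


def pdi_from_cumulants(b, c):
    """Polydispersity index from the first and second order coefficients."""
    return 2.0 * c / (b * b)


# Synthetic ln[G1] values generated from assumed coefficients.
true_a, true_b, true_c = 0.0, -2.0, 0.2
times = [i * 0.01 for i in range(51)]
log_g1 = [true_a + true_b * ti + true_c * ti * ti for ti in times]

a, b, c = fit_quadratic(times, log_g1)
pdi = pdi_from_cumulants(b, c)
print(f"b = {b:.4f}, c = {c:.4f}, PDI = {pdi:.4f}")
```

Because b appears in both the diameter calculation and the PDI denominator, any change in composition that shifts b moves both quantities together, which is consistent with the correlation reported in Figure 4.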

Materials
Since both diameter and PDI are dependent upon the z-average diffusion coefficient b, it is reasonable that a correlation would exist between diameter and PDI. The fact that both MLR and logistic regression models confirm this correlation is encouraging and suggests that these statistical regression approaches were able to facilitate our understanding of the relationships between microemulsion composition, diameter, and PDI.
Models capable of predicting microemulsion stability, including the 30-day percent diameter change MLR model (Figure 1C, Tables 8 and 9) and the logistic regression models (Figure 2, Table 10), are particularly useful, because they enable the prediction of microemulsion stability prior to production (MLR) or only 24 h after production (logistic regression). This has the potential to save significant time and resources, as microemulsions unlikely to meet CQA specifications can be discarded without running further tests.
Through a screening DoE followed by subsequent augmentation to the DoE, we were able to identify a design space in which stir rate and water addition rate did not have a significant impact on microemulsion diameter, PDI, or stability. This suggests that the presented microemulsion production approach can tolerate changes in the manufacturing process parameters, as long as composition is unchanged. This was confirmed, as we were able to consistently scale up a select microemulsion formulation to ten times its original volume. Successful and reproducible scale up has the potential to ease the transition into extensive animal work or pre-clinical trials, and is therefore useful for nanoformulations in particular, as scale up can prove challenging for these types of formulations.
Risk assessment using FMECA, in combination with a screening DoE, led to the rapid and efficient identification of several stable microemulsion formulations and of a robust design space that enabled microemulsion scale up. Risk analysis narrowed down the number of process parameters that were studied, and the use of DoE further saved time by minimizing the number of runs needed to understand the relationship between the process parameters and the CQAs. To the best of our knowledge, this is the first report of a combined risk analysis, DoE, MLR, and logistic regression approach applied to a microemulsion formulation. This combination of experimental design and modeling techniques was powerful, because it allowed us to use microemulsion composition to predict diameter, PDI, 30-day percent diameter change, and the probability of meeting multiple CQA specifications. The QbD approach presented here can be adapted to the development of other nanoformulations and has the potential to reduce expenses through the prediction of unstable and/or unsuitable formulations in these early developmental stages.

Microemulsion Production
Microemulsions were produced on a 100 mL scale via water titration. All microemulsion components (except water) were added to the beaker, with Kolliphor EL and PEG 400 always added at a 1:1 w/w ratio, so that their total weight was equal to the specified surfactants weight. Excipients were stirred at a speed of 350 rpm for 30 min. Then, the stir rate was set to 350 rpm or 700 rpm, and water titration was performed at the specified rate (4 mL/min or 12 mL/min). When water addition was complete, the microemulsion continued to be stirred for 60 min. All microemulsions were produced at ambient temperature, which varied between 18 and 22 °C.

Dynamic Light Scattering Measurements
All dynamic light scattering measurements were performed using a Zetasizer Nano ZS series (Malvern Instruments, Worcestershire, UK). Microemulsions were diluted 1:40 v/v in de-ionized water. Measurements were performed at a temperature of 25 °C and a light scattering angle of 173°.
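As an illustration of how a z-average diameter and PDI can be derived from a correlation function under these measurement conditions, the following is a minimal sketch of a second-order cumulants fit, ln g1(t) = a + b·t + c·t². It is not the instrument's software. It assumes the usual convention in which the first-order coefficient b equals the negative mean decay rate, from which the z-average diffusion coefficient follows by dividing by the squared scattering vector, and PDI = 2c/b². The laser wavelength (633 nm), the refractive index (1.33) and viscosity (0.89 mPa·s) of water at 25 °C, and all function names are assumptions introduced for this example.

```python
import numpy as np

K_B = 1.380649e-23  # Boltzmann constant (J/K)


def scattering_vector(wavelength_m=633e-9, refractive_index=1.33, angle_deg=173.0):
    """Magnitude of the scattering vector q (1/m); wavelength and index are assumed values."""
    return 4.0 * np.pi * refractive_index / wavelength_m * np.sin(np.radians(angle_deg) / 2.0)


def cumulants_fit(t, g1, temp_k=298.15, viscosity_pa_s=0.89e-3, q=None):
    """Return (hydrodynamic diameter in m, PDI) from a field correlation function g1(t)."""
    if q is None:
        q = scattering_vector()
    # Fit ln g1(t) = a + b*t + c*t**2; np.polyfit returns the highest power first.
    c, b, a = np.polyfit(t, np.log(g1), 2)
    decay_rate = -b                    # mean decay rate (1/s)
    diffusion = decay_rate / q**2      # z-average diffusion coefficient (m^2/s)
    # Stokes-Einstein relation for a sphere
    diameter = K_B * temp_k / (3.0 * np.pi * viscosity_pa_s * diffusion)
    pdi = 2.0 * c / b**2
    return diameter, pdi


if __name__ == "__main__":
    # Synthetic, noise-free single-exponential decay for a 150 nm sphere (not study data).
    q = scattering_vector()
    d_true = 150e-9
    gamma = K_B * 298.15 / (3.0 * np.pi * 0.89e-3 * d_true) * q**2
    t = np.linspace(0.0, 2.0 / gamma, 50)
    d_fit, pdi_fit = cumulants_fit(t, np.exp(-gamma * t))
    print(f"diameter = {d_fit * 1e9:.1f} nm, PDI = {pdi_fit:.3g}")
```

Because both outputs are computed from the same coefficient b, the sketch also makes visible why diameter and PDI are not independent quantities.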

CQA Specification Testing
Filtration: After production, microemulsions were stored at ambient temperature. On day 1 (24 h) after production, 25 mL of microemulsion was filtered through a Millex-GS syringe filter with a pore size of 0.22 µm and stored in a non-sterile, 50 mL plastic centrifuge tube. Diameter and polydispersity index (PDI) of filtered and unfiltered microemulsions were measured. Filtered microemulsion was stored at 4 °C, and unfiltered microemulsion was stored at ambient temperature.
Centrifugation: Five days (120 h) after production, filtered microemulsions were diluted 1:40 v/v in de-ionized water and centrifuged at 1620 × g for 30 min. Microemulsion diameter and PDI were then measured without additional sample dilution. Same-day measurements of filtered microemulsion stored at 4 °C were used for comparison.
Thermal Cycling: Immediately after filtration, 5 mL of undiluted, filtered microemulsion was added to a glass vial, sealed with parafilm, and stored at 4 °C. After 24 h, samples were moved to 50 °C. Every 24 h, vials were moved between 4 and 50 °C for a total of four thermal cycles (8 days). Upon completion, diameter and PDI of thermal cycling samples were measured after 1 h equilibration to room temperature. Same-day measurements of filtered microemulsion continuously stored at 4 °C were used for comparison.

Design of Experiments
Screening Design: A two-level, seven-factor, D-optimal screening mixture-process variable design of experiments was developed using JMP Pro 13 software. The following constraints (lower limit, upper limit) were defined for the mixture variables: Miglyol (2.0, 6.0% weight), transcutol (2.5, 7.5% weight), propylene glycol (0, 2.0% weight), and surfactants (Kolliphor EL and PEG 400, 1:1 w/w; 22.5, 27.5% combined weight). An additional constraint specified that the sum of Miglyol and propylene glycol could not exceed 6% weight for any formulation. Levels of 350 and 700 rpm and 4 and 12 mL/min were defined for stir rate and water addition rate, respectively. Factor ranges were selected based upon our prior knowledge in the field in combination with previous reports that studied the same parameters [14][15][16].
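The screening constraints can be expressed as a simple feasibility check on a candidate formulation. The sketch below is illustrative only and is not part of the JMP workflow. It assumes that water is the fifth mixture component and makes up the balance to 100% weight; the function name and tolerance are introduced here for the example.

```python
def is_feasible_screening(miglyol, transcutol, propylene_glycol, surfactants, water, tol=1e-9):
    """Check a composition (all values in % weight) against the screening constraints."""
    in_bounds = (
        2.0 - tol <= miglyol <= 6.0 + tol
        and 2.5 - tol <= transcutol <= 7.5 + tol
        and -tol <= propylene_glycol <= 2.0 + tol
        and 22.5 - tol <= surfactants <= 27.5 + tol  # Kolliphor EL + PEG 400, 1:1 w/w
        and water >= 0.0
    )
    # Additional constraint: Miglyol + propylene glycol may not exceed 6% weight.
    combined_ok = miglyol + propylene_glycol <= 6.0 + tol
    # Assumption: the five components sum to 100% weight.
    total = miglyol + transcutol + propylene_glycol + surfactants + water
    return in_bounds and combined_ok and abs(total - 100.0) <= 1e-6


if __name__ == "__main__":
    print(is_feasible_screening(4.0, 5.0, 2.0, 25.0, 64.0))  # inside all limits
    print(is_feasible_screening(6.0, 5.0, 2.0, 25.0, 62.0))  # Miglyol + propylene glycol = 8
```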
Augmented Design: Based upon statistical analysis of the screening DoE, this DoE was augmented to refine the design space and facilitate investigation of composition main effects and second order interactions. The augmented design consisted of the 15 screening runs plus 15 additional runs and modified the upper limit of internal phase such that it could not exceed 12% total weight. Additionally, the upper limit of propylene glycol was increased to 5% weight to capitalize on the finding that propylene glycol did not contribute significantly to studied CQA specifications. The selected augmented design maximized the power to detect regression coefficients.

Multiple Linear Regression (MLR) Modeling
JMP Pro 13 software was used to develop MLR models that predicted day 1 diameter, day 1 PDI, and 30-day percent diameter change as a function of composition and process parameters. For screening studies, runs 1-15 (Table 4) were used to develop models that studied main effects terms only (five composition parameters, stir rate, and water addition rate). Models contained all seven parameters and were used solely to identify parameters that were likely to significantly impact (p-value < 0.05) the CQA specifications of interest.
After screening studies, stir rate and water addition rate were determined to not significantly impact microemulsion CQA specifications (Table 7), and the design of experiments was augmented to include an additional 15 runs (runs 16-30, Table 4). This augmented, 30-run DoE was used to develop MLR models that predicted day 1 diameter, day 1 PDI, and 30-day percent diameter change as a function of microemulsion composition. All main effects and interactions were studied using a stepwise forward approach. All terms with a p-value < 0.05 were included in the models. Models were developed using 21 (75%) runs and validated using the remaining 7 (25%) runs. Validation sets were selected using a stratified random sampling of the output of interest. All models were developed without intercept terms.
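The development and validation split described above can be sketched as follows. This is a simplified illustration rather than the JMP procedure: it stratifies by rank of the response, draws one validation run from each stratum, and fits a least-squares model without an intercept. The stepwise forward term selection (p-value < 0.05) is not reproduced, and the function names and the synthetic data are assumptions made for the example.

```python
import numpy as np


def stratified_split(y, n_valid, rng):
    """Split run indices into development and validation sets, stratified on the response y."""
    order = np.argsort(y)                    # runs ranked by the output of interest
    strata = np.array_split(order, n_valid)  # n_valid strata of near-equal size
    valid = np.array([rng.choice(s) for s in strata])  # one run drawn per stratum
    train = np.setdiff1d(np.arange(len(y)), valid)
    return train, valid


def fit_no_intercept(X, y):
    """Ordinary least squares without an intercept term."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for 28 runs with three mixture terms (not study data).
    X = rng.dirichlet(np.ones(3), size=28)
    y = X @ np.array([120.0, 180.0, 90.0]) + rng.normal(0.0, 2.0, size=28)
    train, valid = stratified_split(y, n_valid=7, rng=rng)
    coef = fit_no_intercept(X[train], y[train])
    rmse = np.sqrt(np.mean((X[valid] @ coef - y[valid]) ** 2))
    print(len(train), len(valid), coef, rmse)
```

Omitting the intercept is the usual choice for mixture models, because the component proportions sum to a constant and an intercept would be collinear with them.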
In all MLR models, mixture terms were coded as L-pseudocomponents. This transformation is $x'_i = (x_i - L_i)/(\mathrm{Total} - L)$, where $x'_i$ is the i-th pseudocomponent, $x_i$ is the original component value, $L_i$ is the lower constraint for the i-th component, $L$ is the sum of the lower constraints for all components, and Total is the mixture total. This linear transformation allows the regression coefficients for the mixture components to be comparable in size.
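A short sketch of the L-pseudocomponent coding is given below. The lower limits for Miglyol, transcutol, propylene glycol, and surfactants are the screening constraints; the lower limit for water (59% weight) is an assumption inferred from the other constraints and is used here only for illustration, as is the function name.

```python
import numpy as np


def to_l_pseudocomponents(x, lower, total=100.0):
    """Code mixture components as L-pseudocomponents: x'_i = (x_i - L_i) / (Total - L)."""
    x = np.asarray(x, dtype=float)
    lower = np.asarray(lower, dtype=float)
    return (x - lower) / (total - lower.sum())


if __name__ == "__main__":
    # Order: Miglyol, transcutol, propylene glycol, surfactants, water (% weight).
    lower = [2.0, 2.5, 0.0, 22.5, 59.0]   # water lower limit is an assumed value
    x = [4.0, 5.0, 1.0, 25.0, 65.0]       # hypothetical formulation summing to 100
    coded = to_l_pseudocomponents(x, lower)
    print(coded, coded.sum())
```

When the original components sum to the mixture total, the coded values sum to one, so each pseudocomponent spans a comparable range regardless of how narrow its original constraint was.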

Logistic Regression Modeling
Logistic regression models were developed to predict the probability that a microemulsion would meet one or more CQA specifications. Microemulsions that met the CQA specification(s) were assigned a value of 1, and microemulsions that failed to meet one or both CQA specifications were assigned a value of 0. Models were developed using 21 (75%) runs and validated using the remaining 7 (25%) runs. The validation set was selected using a stratified random sampling of day 1 diameter (the predictor variable).
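For illustration, a single-predictor logistic regression of this kind can be fit with a few Newton-Raphson iterations. The sketch below is not the software implementation used in the study; the function names are introduced here, the predictor is standardized internally only for numerical stability, and the example data are synthetic pass/fail outcomes (1 = met the specification, 0 = failed) rather than measured values.

```python
import numpy as np


def fit_logistic(x, y, n_iter=25):
    """Fit P(pass) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson; returns (b0, b1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mu, sd = x.mean(), x.std()
    z = (x - mu) / sd                       # standardize the predictor
    X = np.column_stack([np.ones_like(z), z])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    # Convert coefficients back to the original scale of x.
    return beta[0] - beta[1] * mu / sd, beta[1] / sd


def predict_pass_probability(b0, b1, x):
    """Predicted probability of meeting the specification at predictor value x."""
    return 1.0 / (1.0 + np.exp(-(b0 + b1 * np.asarray(x, dtype=float))))


if __name__ == "__main__":
    # Synthetic day 1 diameters (nm) and pass/fail outcomes, for illustration only.
    diameter = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190]
    passed = [1, 1, 1, 0, 1, 1, 0, 0, 1, 0]
    b0, b1 = fit_logistic(diameter, passed)
    print(predict_pass_probability(b0, b1, [110, 150, 190]))
```

Note that Newton-Raphson does not converge when the two outcome classes are perfectly separated by the predictor, so a sketch like this is only appropriate when passing and failing runs overlap in diameter.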

Microemulsion Scale Up to 1 L
A selected microemulsion formulation (see results section for selection explanation) was scaled from 100 mL to 1000 mL in triplicate. Scaled up microemulsions were produced using a stir rate of 500 rpm and a water addition rate of 80 mL/min. Scaled up formulations underwent the same CQA specification testing as the 100 mL scale microemulsions.

Conclusions
In the present work, we used quality by design methodology to efficiently identify stable, robust microemulsions and understand the processes that impact microemulsion diameter, PDI, and stability. We used FMECA to identify the process parameters that were most likely to impact microemulsion diameter, and through a screening design of experiments, we were able to identify a design space in which microemulsion diameter, PDI, and stability were dependent solely upon microemulsion composition. We hypothesized that microemulsions that were robust to changes in production processing parameters (stir rate and water addition rate) could undergo successful scale up. This hypothesis was confirmed, as we successfully and consistently scaled up a selected microemulsion tenfold, from 100 mL to 1000 mL. Using MLR, we were able to develop predictive models for microemulsion diameter, PDI, and 30-day percent diameter change. We also developed logistic regression models that predicted the probability that a microemulsion would meet one or more CQA specifications as a function of day 1 diameter. This unique combination of MLR and logistic regression was powerful in this specific application, as it could be used to predict not just the basic colloidal properties (diameter and PDI), but also the probability that a formulation will pass quality control testing. The present work is an example meant to demonstrate the usefulness of adapting QbD approaches to nanoformulation development. Adapting QbD approaches to nanoformulations has the potential to reduce expenses through early identification of unsuitable formulations, as well as increase the likelihood that the product can be scaled up for further study.