Robust Optimization Approaches for a Natural Pharmaceutical Complex Product of Atractylodes Japonica Koidz and Schisandra Chinensis

: Experimental results pertaining to natural pharmaceutical complex products (NPCPs) often exhibit large variabilities in their associated response variables. To improve the quality of an NPCP, systemic studies (i.e., statistical analysis and mathematical optimization), including variability analysis and robust optimization, are often required. To this end, a systemic approach for an NPCP development process is proposed by integrating robust design and optimization methodologies. A quality function deployment method can be used to systematically deﬁne a standardized manufacturing process and relevant process variables for Chong Kun Dang (CKD)-497. Based on those variables, an experiment is designed using response surface methodology to mathematically estimate the output response functions associated with input variables. In addition, a design space (DS), which can guarantee the quality of an NPCP, is demonstrated by utilizing the overlaid contour plots of the estimated response functions. Finally, a CKD-497 case study is conducted for veriﬁcation and validation.


Research Background
Product quality in the pharmaceutical industry is increasingly being recognized as important by researchers and companies. Stringent regulations, such as good manufacturing practice (GMP) and quality by design (QbD), were proposed and applied in the pharmaceutical industry to improve drug quality. To comply with GMP and QbD regulations, many pharmaceutical industries have undertaken several methods, such as quality factors and process improvement activities [1]. In particular, the manufacture of natural pharmaceutical complex products (NPCPs) requires much effort as natural substances, which are the main constituents of NPCP, generally exhibit considerable variability in their properties that cannot be controlled easily.
This study targets Chong Kun Dang (CKD)-497 as NPCP under development in CKD Pharm. CKD pharm aimed to investigate the effects of the extract CKD-497, composed of Atractylodis Rhizoma Alba and Schisandra chinensis, in relieving cough and facilitating expectoration of phlegm. CKD-497 was found to inhibit inflammatory mediators, such as interleukin-8 (IL-8) and tumor necrosis factor (TNF-α) in lipopolysaccharide (LPS)-treated mouse macrophages and transient receptor potential cation channel 1(TRPV-1)-overexpressed human bronchial epithelial cells stimulated by capsaicin.
CKD-497 decreased the viscosity of the mucin solution. During in vivo experiments, CKD pharm found that CKD-497 reduced coughing numbers and increased expectoration of phlegm via mucociliary clearance enhancement. Collectively, these data suggest that CKD-497 possesses the potential for cough and phlegm expectoration treatment [2]. However, the content of crucial index components varied depending on the source and extraction conditions of Atractylodis Rhizoma Alba and Schisandra chinensis. In this situation, to maintain the stable target quality of the CKD-497, it is essential to identify the critical process variables that affect the response variables of CKD-497 and find the optimum relationship between these variables.
The design of experiments (DoE) approach, a statistical optimization method, offers several benefits for optimizing drug dosage [3,4]. This approach determines the minimum number of experiments required to screen and define critical parameters. Simultaneous consideration of all the variables improves the speed of optimization to a large extent. Determination of the relevant factors and obtained responses are crucial for the DoE approach. In this approach, the factors are manipulated for the desired responses. Then, interactions between the factors and responses can be interpreted using a suitable mathematical model [5]. In this study, we propose a robust optimization approach based on DoE, which may be widely used in many industries and in medicine to achieve the target quality of CKD-497.

Literature Review
Experimental design is widely used in industries as an innovative method for experimental parameter selection. [3] To solve mixed constraint experiments and multipurpose optimization problems in the design of pharmaceutical production processes, [1] studied the process design method for process improvement. This study presented L-pseudo components and U-pseudo components to handle physical constraints. Finally, a numerical example shows that the proposed robust desirability function (RDF) can be efficiently applied to pharmaceutical process design. Reference [5], described the successful preparation of nano-emulsions of eplerenone (EP) using non-ionic surfactants. Those formulations were designed and optimized using the design of experiments approach, which accelerates the formulation process and provides desirable properties for the formulations. Reference [3] showed that the coupling of the DoE approach with modern high-throughput automation systems can potentially maximize the capabilities of these systems and improve the productivity of many drug discovery procedures. Another study [6] used a robust design to determine the optimal combination of six major catechins from green tea. Besides, several optimal design processes for experimental design methods have been proposed for various industries [7,8], and other studies suggest several optimal schemes related to the field of design [9][10][11][12].
Atractylodes rhizome white, one of the natural components of CKD-497, has been extensively studied for the effect of leach. Atractylon, a main sesquiterpenic constituent of Atractylodes rhizomes, was examined for its inhibitory effects on tert-butyl hydroperoxide (t-BHP)-induced cytotoxicity and lipid peroxidation in the primary culture of rat hepatocytes [13]. In an attempt to elucidate the physiological activities of (6E, 12E)-tetradecadiene-8, 10-diyne-1, 3-diol diacetate (TDEYA), which was detected in a hydrolyzed form (TDEY) in the plasma after rats were orally administered with a decoction of Atractylodes rhizome, [14] examined the inhibitory effects of various enzymes assumed to participate in the regulation of body fluid levels and inflammatory reactions. From the examination of the effects of several acetylene compounds on xanthine oxidase inhibition, it is clear that the active structure of the compounds is due to the presence of conjugated triple and double bonds. Other studies related to Atractylodes rhizome included a study that aimed to determine the origin of the rhizome based on anatomical features and sequence characterized amplified region (SCAR) marker [15].
Schisandra chinensis is the remaining important natural component of CKD-497 and there have been several studies related to it. In [16], the Schisandra chinensis pollen extract has been reported to have strong antioxidant activity and significant protective effect against the acute hepatotoxicity induced by carbon tetrachloride (CCl 4 ). These have been supported by the evaluation of liver histopathology in mice. The role of Schisandra chinensis in the treatment of insomnia has been studied by examining the influence of insomnia on the levels of the neurotransmitters [17]. The effects and molecular mechanisms of the Schisandra chinensis fruit extract and its primary compound gomisin A (GA) on the contractility of rabbit penile corpus cavernosum smooth muscle (PCCSM) were also evaluated [18]. In [19], the enhanced antitumor and reduced toxicity effects of a low molecular weight purified polysaccharide from Schisandra (SCPP11) were investigated in 5-fluorouracil (5-Fu) treated Heps-bearing mice. A group of researchers [20] also showed that Schisandra polysaccharide increased the thymus and spleen indices, pinocytic activity of peritoneal macrophages, and hemolysin formation in cyclophosphamide (CTX)-induced immunosuppressed mice. In the [21] study, they attempted to investigate the effects of the lignan-riche extract of Schisandra chinensis fruits (ESP-806) on neurotoxicity and memory impairment induced by Aβ1-42 injection in mice. Moreover, their experimental results suggested that the extract of Schisandra chinensis fruits rich with dibenzocyclooctadiene lignans may be useful in the prevention and treatment of Alzheimer's disease.

Research Motivation and Significance
Available literature shows that several optimization studies based on experimental design are actively pursued in various industries and the pharmaceutical field. In this study, we propose a robust optimization approach to minimize the variability in the production of CKD-497. Firstly, a quality function deployment (QFD) is used to define the standardized manufacturing process of CKD-497 and derive the essential process variables. Based on this, an experimental design (i.e., response surface methodology, RSM) is then performed to define the process variables and output response variables (ORVs) and to estimate the combined effects of the critical process variables. The design space (DS) is derived from the overlaid contour plot, and the interval optimization of important independent variables is performed. Reproducibility experiments (RE) are performed based on the optimized results, and a comparison between the optimal predicted value and the RE results is also performed. Finally, the results of the analysis are summarized, and the conclusions are presented.

Extraction of Raw Materials
CKD-497 was extracted from the Atractylodis Rhizoma Alba and the fruit of Schisandra chinensis. The rhizome root was dried and cut horizontally, and the fruit of Schisandra chinensis was also dried and chopped. These two sources were extracted with 50% ethanol for 48 h at room temperature and then evaporated under pressure. Figure 1 shows the natural substance of CKD-497, Atractylodis Rhizoma Alba, and Schisandra chinensis.

Primary Characteristics
The index components for Atractylodis Rhizoma Alba and Schisandra chinensis were 6,12tetradecadiene-8,10-diyne-1,3-diol (TDD) and Schizandrin (SZD), respectively. High-performance liquid chromatography (HPLC) analysis conditioned using Kromasil (4.5 × 250 mm, 5 µm) column in 1 mL/min of water and acetonitrile and was detected at 254 nm by ultraviolet (Waters, Taunton, MA, USA). The retention time of TDD was 13 min, and SZD was slightly less than 15 min. The content and concentration of TDD and SZD were calculated using the area compared to commercial standard compounds in the HPLC spectrum. AU is a measurement unit in HPLC by UV and Area is an integral calculus of peak in HPLC comparing to standard material. Therefore, we can analyze the quantification of the target molecules in the extracts. AU and Area were detected by UV, and Areas were measured by integration in the software (Empower ver.3.0 Waters, Taunton, MA, USA).
The yield was calculated by dividing the total weight of the evaporated extract by the plant's total weight and obtaining the corresponding percentage. Brix was an index for the degree of evaporated extract and detected by digital Brix meter (HI96801, Hanna instrument, Korea). The first dilution of CKD-497 was 1 mL in 2 mL distilled water; however, the density was too thick to measure. We try to dilute until the density was transparency for the UV spectrometer, which was 1/2000 dilution. After CKD-497 was diluted in 1/2000, the color of diluted CKD-497 was measured by a colorimeter (LICO 620, Yuil Lab Tech, Gyeonggi-do, Korea) in the light, a and b according to the manufacturer's instruction. Transparency was detected by a UV spectrometer (V-650, Tokyo, Japan) at 400, 430, 451, and 480 nm.

Process Standardization Method
The manufacturing process of CKD-497 consists of the following: (i) weighing and mixing of natural materials, (ii) solvent injection and cutting, (iii) primary and secondary extraction, (iv) primary and secondary filtering, (v) primary and secondary concentration of filtrate, (vi) sterilization. Variables for controlling each process include the cutting size of natural substances, extraction temperature, filter paper size, and of concentration completion determination. To standardize these process variables, QFD analysis is performed to identify the process values most relevant to the main customers (Internal and Ministry of Food and Drug Safety). QFD is a widely used technique applied to meet customer demands and expectations and transfer them to the product [22].
Several researchers from the CKD-497 lab and the researchers who were directly involved in the experiments participated in the QFD analysis. Several DoE experts and industrial engineering experts participated in the study. Figure 2 shows the results of the QFD analysis. The entire set of rows indicated as area 2 ( 2 ) in the figure represents the manufacturing process parameters for CKD-497. The series of processes begins with weighing of the natural substances, mixing, and finishing with sterilization. The rows designated as area 3 ( 3 ) represents the required quality. Area 4 ( 4 ) illustrates the company's internal needs, and area 5 ( 5 ) represents a part of the quality requirements of the Ministry of Food and Drug Safety. In area 6 ( 6 ), the degree of correlation between area 2 and areas 4 and 5 is evaluated by a score. The correlation was evaluated as 9 points for extremely high, 7 points for high, 3 points for medium, 1 point for low, and 0 points irrelevant. For example, in the "primary and secondary extraction" process, the temperature was rated as 9 because it was positively correlated to Tanine. Area 7 ( 7 ) evaluates areas 4 and 5 in terms of weight, quality, and sales point, and finally multiply them to calculate the absolute importance. Demand qualities in areas 4 and 5 are evaluated based on their absolute importance. Area 8 ( 8 ) represents the importance of the process input variables. Here, the importance is calculated as the sum of the absolute importance of area 7 and the product of the scores of area 6. For example, in the "weighing and mixing of natural substances" process, the weight is the key point. The total weight is equal to the sum of absolute importance. Area 9 ( 9 ) indicates the unit, goal, and limit for items of area 2. Lastly, area 1 ( 1 ) summarizes the conditions and selection of essential process variables considered in DoE based on the importance of area 8.

Critical Quality Attributes of CKD-497
Critical quality attributes (CQAs) for DoE are defined based on the QFD results. In area 8 in Figure 2, the importance was high, but in the case of the types and weight of a natural substance and the amount of solvent input, which are uncontrollable variables, management (recording) and fixation was made, and the process of cutting and attainment of the required concentration was standardized. On the other hand, the more influential controllable variables in area 8 are the extraction temperature, stirring wing length, circulation velocity, and filter paper size. Among these, the effect on the stirring wing length and circulation speed is not determined when the circulation velocity is zero, thereby violating the independence, which is an essential factor in DoE. Therefore, the circulation velocity, which has a more significant influence on the required quality is chosen as the main factor, and the effect was minimized by fixing the stirring wing length to 10. In conclusion, the extraction temperature, circulation velocity, and filter paper size were chosen as the main input variables (IVs).
Next, the ORVs were selected from customer requirements based on the absolute importance of area 7. Table 1 summarizes the IVs and ORVs derived through QFD. Based on this, DoE is performed, and a design area that satisfies the target quality is drawn.

Robust Optimization Approach
The CQAs of CKD-497 are selected through QFD, and a robust optimization design based on DoE is performed. As a first step, the significance of the effects of IVs on ORVs is statistically evaluated through a full-factorial experiment. Second, based on the response surface method, an optimal combination of IVs that satisfies the target value of ORVs is derived. To this end, we first derive a DS based on the overlaid contour plot. Based on this, the response surface design and analysis are performed. At this time, the optimal solution is derived by applying the weighting scenario that gives priority to the ORVs. Finally, RE is conducted to compare the results of the existing experiments with the optimum predictive value and the RE. Figure 3 is a schematic of the robust optimal approach used here.
In the case of RSM, which is used for optimally deriving, polynomial regression of the second-order model is mainly used, and it has the advantage of producing curvature effects and precision optimization through repetition and axial points of the center point. The RSM regression formula is as followŝ whereβ denotes the estimated coefficient in the regression functions, and X is the design matrix of IVs. The coefficient can be estimated by using the least square method as follows [23] β = X T X −1 X T y.
The main goal of the optimization was to identify the optimal settings of the process variables. Among the various optimization methods available, the desirability function (DF)-based optimization method was utilized to solve the simultaneous dual-objectives [6], including enhancing the significant ingredients of CKD-497 where "the nominal better (N)-type" was applied. The N-type can be defined by whereŷ i denotes the estimated function between ingredients of CKD-497, and L i and T i denote the lower and upper values, respectively [23]. The superscripts w 1 and w 2 represent the importance of the individual ranges. The overall desirability, D, a geometric mean of all the DFs, can be expressed as follows [23] The optimal combinations of ingredients are obtained when the overall desirability D is maximized.

DoE Results
The full factorial experiment was performed by adding one center point to the third level with two factors. The total number of test points is nine, and the total number of tests is 81, which can be obtained by extracting three samples per test point and repeating them thrice. Table 2 represents the level and range of the input variables.
In this study, statistical software (Minitab 18) was used for the analysis of the base of RSM as previously described in Equations (1)-(4) (see Section 2.5). Based on the experimental results, the analysis of variance (ANOVA) of the response surface custom design was conducted to understand how the selected IVs influence the ORVs. The results of the analysis are shown in Table 3. If the p-value is less than 0.05, it is considered statistically significant; R 2 represents the coefficient of regression, which is deemed better as it increases in value. In the case of Y1 (TDD), Y2 (SZD), and Y3 (yield), the p-value of the model was less than 0.05, which was significant. In addition, the R 2 value was over 80%, so the regression model was highly explained. On the other hand, in the case of Y4 (solid flour), the p-value of the model was 0.11, indicating no significant effect. In addition, we analyzed the variances generated by repeated experiments. In all results, the p-value was greater than 0.05, indicating that it was not statistically significant. In conclusion, based on the ANOVA results, Y1 (TDD), Y2 (SZD), and Y3 (yield) were optimized except for Y4 (solid flour), which were not statistically significant.  To derive DS, first, fix the two factors of IVs to the X-axis and Y-axis. Next, we analyze all cases of overlapping with one remaining factor and various levels. Then, after confirming that a designable area that satisfies the upper and lower limits of the output response variable is derived, the final overlap contour plot is also derived. The derivation of DS in this study used the statistical software Design Expert Version 12. Figures 4-6 show the contour plots of X1, X2, and X3, respectively. In Figures 4-6, there is no case where a designable area is derived when all three levels (−1, 0, +1) overlap. This is because there are many upper and lower limits to be satisfied due to a multitude of output response variables, as well as large variances in the experiment. To improve this condition, it was decided to set the X3 (filter paper size) to −1 level and set the X-and Y-axes to X1 (extraction temperature) and X2 (circulation velocity), respectively. Figure 7 shows the overlapping contour plots derived by applying the 95% confidence interval. The yellow area is the designable area and the gray area is outside the upper and lower limits. In Figure 7, there is a yellow area, but it is overlaid by the gray area, which means that there is no designable area. The reason for this appeared to be due to the variance of the experimental results.   As a result, it was decided to derive the designable area using the overlapping contour plot without applying the confidence interval. Figure 8 shows the final DS, and the DS is derived from the narrower area in consideration of variance optimization results and reproducibility. The range of IVs for the final DS was fixed at 0.25 to 0.95 extraction temperature, 0.5 to 0.8 circulation velocity, and −1 filter paper size.
Subsequently, for the response surface design, the optimization target values (TVs) and the upper and lower limit values of the ORVs were arbitrarily set, as shown in Table 4. Because CKD-497 is a product under development, it is not possible to present a certain target value or upper and lower limits. Therefore, the values necessary for the experiment were set based on the previous experimental values. In addition, the scenario was derived by weighting the ORVs considered relatively crucial to find a combination of IVs that best meets the target value of the ORVs. Table 5 shows the optimization scenarios according to the weights, and all 10 cases were set by assigning weights between a minimum of 0.1 and a maximum of 10 points. Finally, the response surface user design was conducted based on the range of IVs derived from the DS, TVs of ORVs, and weighted scenarios.   Table 5. Optimization scenario based on weight by ORVs. The optimization result was derived based on the response surface user design. The optimization results were analyzed by considering only the mean value and considering both mean and variance values. In conclusion, we decided to use the result that considered only the mean value. This is because, when considering the variance, the result of optimizing the output response variable tended to be unsatisfactory because of the minimization of the variance and satisfying the target value at the same time. In addition, considering that there were no statistically significant results in the ANOVA of the IVs, it was judged reasonable to perform optimization except for the variance. As a result of the analysis based on the weights, Case 9 s results were found to be closest to the target value. Therefore, X1 (extraction temperature) = 44.15, X2 (circulation velocity) = 175.76, and X3 (filter size) = 2.5 were selected as the RE conditions. For the RE, the decimal point of the input variable was removed, and the coded and uncoded levels are summarized, as shown in Table 6. Besides, the predicted values (PVs) of the RE are shown in Table 7 together with the TVs.

Reproducibility Experiment Results
The reproducibility experiment results are summarized in Table 8 in terms of mean and standard deviation. 1st row, except the title row, indicates the TVs of ORVs. The 2nd and 3rd row show the PVs and experimental values (EVs) of the RE, respectively. The 4th row shows PV minus EV. The 5th row summarizes PV according to the IV conditions of previously performed CQ series experiments of CKD-497. From the 6th row, the EV and PV minus EV are summarized, respectively. In addition, as shown in the 1st column of Table 8, natural substances used for each experiment were different and labeled as ( ). Further, the conditions of IVs for each experiment are summarized in the last column. First, the EVs and TVs of the RE and CQ series were compared in terms of the mean. Figure 9 shows the EVs of each experiment with TVs with dotted lines around Y1 and Y2. The EVs of RE tended to be slightly higher overall than the TVs, but they were most similar to TVs in all cases except Y1 (area). On the other hand, the CQ series were much higher than TVs except for CQ1. Next, in Table 8, comparing the PVs and EVs of each experiment, EVs were higher than PVs in both the RE and CQ series. Next, Figure 10 shows a simple comparison of EVs in terms of standard deviation. In all the graphs, the CQ2 was the largest. Next, the standard deviations were higher in the order of CQ3, CQ4, and CQ1. On the other hand, the EV of the RE showed a uniformly small standard deviation.  Figure 11 shows the reduction of the standard deviation of the EV of the CQ series compared to that of the RE. The standard deviation of RE decreased significantly in all cases except Y1 (Area) of CQ1. In particular, the standard deviation of RE was reduced by 80% on average compared to CQ2.
Lastly, in the case of Y3 (Yield), the TV, EV, and PV of RE can be compared because there is no result of the CQ series. The value of EV (=50.14) was slightly higher than PV (=49.95) and lower than TV (=60).

Discussion
As shown in the previous experimental results, the EVs of RE tended to be slightly higher than the TVs on the mean side, but it was confirmed that they had a constant value based on the TVs. On the other hand, the EVs from the previous experiments show that there is variability because the large and small cases are mixed according to ORVs based on the target value. Moreover, in terms of the standard deviation, the EVs of RE were significantly smaller than those of the CQ series. In conclusion, the conditions of the optimal IVs derived from DoE not only increased the likelihood of achieving the target quality, but also significantly reduced the variability that could occur in the process.
However, the results of this study do have limitations with regard to PV accuracy. This limitation can be explained as follows: first, this study was conducted in consideration of the difficulty and cost of the experiment.
Thus, the original 20-point design was reduced to 9 points for the optimization analysis. That is, 180 experiments were reduced to 81. This resulted in the generation of quick experimental results with minimal cost, but there was a lack of information in deriving the optimization model. Secondly, the experimental results could not be disclosed, but the experimental results of the CQ series showed that there was quite a lot of variability even within experiments that use the same natural substances. Therefore, even if the RE was conducted under optimized conditions, the inevitable variability inherent in the natural substances would be included.

Conclusions
In this study, we have proposed IVs conditions to optimize the ORVs based on a DoE approach to achieve the target quality of CKD-497 and have verified the optimization results through RE. Besides, by comparing and analyzing the results of previous experiments and RE from different aspects, it was confirmed that the inherent variability occurring in natural substances could be reduced through process optimization techniques.
Based on the previously derived optimization results, the verified contents are summarized as follows: (i) in terms of mean, RE's results were slightly above TV, but they were all satisfied. On the other hand, most PE results have a much higher value or are not satisfied with TV (see Table 8). This confirms that the optimum conditions of IVs can produce satisfactory results for TV; (ii) next, in terms of standard deviation, it was found that the results of RE have a much smaller value than that of PE, and in the case of a specific experiment (CQ 2), it was found that the standard deviation was lowered by up to 84% (see Figure 11). Through this result, it was verified that the variability of TDD and SZD, the index components of CKD-497, can be reduced as well as controllable; however, (iii) we have confirmed that there exists a difference between the RE results and the PV results due to the reduced experiments to achieve maximum effectiveness at the minimum cost (see Section 3.3). We believe that further research will be necessary to improve them.
Finally, when all the results are put together, the implications of this study are as follows: (i) optimal process conditions derived from DoE can be used to obtain results close to the TVs of the ORVs; (ii) it is possible to reduce the variation of ORVs by standardizing the manufacturing process, even with the inherent variability of natural substances; and iii) further experiments will ensure that the ORVs achieve their targets and reduce the variance.