Blending Technology Based on HPLC Fingerprint and Nonlinear Programming to Control the Quality of Ginkgo Leaves

The breadth and depth of traditional Chinese medicine (TCM) applications have been expanding in recent years, yet quality control remains a problem in practical application. It is essential to design an algorithm that provides blending ratios ensuring a high overall product similarity to the target with controlled deviations in individual ingredient content. We developed a new blending algorithm and scheme by comparing different samples of ginkgo leaves. High-consistency samples were used to establish the blending target, and qualified samples were used for blending. Principal component analysis (PCA) was used as the sample screening method. A nonlinear programming algorithm was applied to calculate the blending ratio under different blending constraints. In one set of calculation experiments, the same samples were blended under different constraints, and the relative deviation coefficients (RDCs) were controlled within ±10%. In another set of calculations, different samples were blended with more components constrained, and the RDCs were controlled within ±20%. Finally, the near-critical calculated ratio was used for the actual experiments. The experimental results met the initial setting requirements. The results show that our algorithm can flexibly control the content of TCMs. Quality control of the TCM production process was achieved by using blending to improve the content stability of raw materials. The algorithm provides a groundbreaking idea for quality control of TCMs.


Introduction
Traditional Chinese medicine (TCM) has been used in Asia for thousands of years and is beginning to be widely accepted worldwide [1][2][3]. In recent years, Chinese medicine has demonstrated unique efficacy in the treatment of various chronic diseases. Some compounds in TCMs can induce autophagy, which has been found to cure alcoholic liver and neurodegenerative diseases [4][5][6][7][8]. Some studies have reported that certain TCMs could regulate components of apolipoprotein and key molecules involved in lipid transportation and uptake, which can help in treating conditions such as obesity and coronary heart disease [9][10][11]. TCMs have also been remarkably effective in treating patients with COVID-19 [12][13][14]. the samples are qualified. Tang's target was too similar to their samples, which differed by only one herb. Secondly, the algorithms in these studies did not show flexibility for different blending constraints. In actual production, manufacturers may change the mixing constraints to suit the needs or standards of other products. The algorithms of Liu and Tang only performed single-condition blending, making it challenging to meet the actual needs of the manufacturer. Furthermore, these articles did not mention how to select blending samples for the blending process. In practice, not all selected samples can be blended to reach the desired target. For example, when the content of all selected samples is less than the target, it is not possible to obtain the mixing result with equal feeding. There is still a need to develop a blending process that establishes goals, meets multiple conditions for flexible blending, and helps users select blending samples effectively.
In this study, we developed a new blending method based on a nonlinear optimization algorithm to meet the practical needs of manufacturers. The algorithm calculates a blending recipe that meets the overall similarity and individual content deviation requirements. Samples from different years and origins were compared to illustrate the need for blending. To establish the target, high-consistency samples from the manufacturer served as the target batches, and qualified samples from the market were used as blending samples. PCA was performed on the samples and targets to choose samples for mixing. Multiple sets of computational experiments were designed to demonstrate the flexibility of the nonlinear optimization algorithm for different limits of relative deviation coefficients (RDCs) or different samples for mixing. Samples at critical conditions were chosen for the experiment, and their deviations from the computational results were analyzed. Finally, the prospect of applying the algorithm in the quality control of TCMs is discussed.

Optimization of Chromatographic Conditions
To characterize the main components, the extract of the sample was analyzed by high-performance liquid chromatography with an ultraviolet detector and evaporative light-scattering detector (HPLC-UV-ELSD). The most appropriate conditions were selected based on several chromatographic experiments covering mobile phases, gradients, flow rate, and detection wavelength. The mobile phase was composed of MeOH (A) and 1% FA (v/v) in H2O (B). The gradients of mobile phases A and B with respect to time are given in Table 1. The flow rate was 1.0 mL/min. The HPLC-UV-ELSD chromatograms are shown in Figure S1. After considering the quality standards of commercially available ginkgo biloba extracts such as Shuxuening injection [22,51,52], we chose the chromatograms from 25 min to 60 min and selected 15 peaks as the standard peaks to establish the standard fingerprints used in the production of ginkgo medicine. The fingerprints are shown in Figure 1. The qualitative attributions of the chromatographic peaks are shown in Table 2, identified by the standard compounds shown in Table S1. Fifty-three batches of ginkgo biloba were obtained from different manufacturers, and their detailed information is listed in Table S2. The area of peak 8 in the UV chromatogram was used to calibrate all ELSD peaks. However, using semiquantitative data in blending does not require accurate compound identification. In this process, peak matching and area correction are more important than compound identification.

Comparison of Ginkgo Leaves from Different Sources
We compared the samples of ginkgo biloba from different origins and years, and the results are shown in Figure S2. Their average content of components was used as the standard, and the similarity and RDC of each compound were calculated, as shown in Table S3. The average of each peak was used as the unit of measurement and the number of each peak as the axis. The radar charts are shown in Figure S3.
To compare the samples of different years or sources, the peaks and peak areas of the samples were classified by different conditions. The results are shown in Figure 2.
Supplementary S1-S2 and S3-S4 were the samples of the same year but with different origins. From Figure 2A and Table S3, the content of P5, P8, P13, and P15 differed by more than 30%.
Supplementary S5, S7, S9, S11, and S13 were samples of the same origin but with different growth years. Comparing Figure 2B and Table S3, the content of P13 and P15 also differed by more than 40%.
Supplementary S5-S6 and S7-S8 were samples from the same region and year. In Figure 2C and Table S3, samples of the same origin and the same year had similarities over 99% and RDCs within ±10%.
Supplementary S15, S16, and S17 were samples with more than ten years of growth, Supplementary S5 was a sample with two years of growth, and Supplementary S11 was a sample with five years of growth, and they all have the same origin. Figure 2D and Table S3 show that the samples with more than ten years of growth period differ significantly from the other samples in composition content. The differences in the content of P2, P5, P13, P14, and P15 ranged from 200% to 300%.
Comparing samples from different years and origins shows that the available qualified samples in the market vary greatly, which is unfavorable for quality control and subsequent quantitative research of TCMs. According to Horbowicz M. et al., this may be related to the content of methyl jasmonate in the leaves [53]. The above results also imply that it is difficult to control the quality of samples from cultivation, indicating the necessity of the blending process.

Establishing a Blending Target
High-consistency samples were used to establish the target. T1-T10, referred to as the target samples, were the high-quality batches recognized by the factory. Their chromatograms are shown in Figure S3, and radar plots are shown in Figure 3. The average value of samples T1-T10 was used as the blending target. Their similarities and RDCs in Table 3 show that the target samples have high similarities of 99.3% or more and relatively small content differences, with RDCs within ±20%. This is why the RDCs were controlled within ±20% in the following calculations.
The compositional content of the target samples was highly consistent. Thus, the constraints of our blending process were set with reference to the target samples, so that the RDCs of the blending results were consistent with (±20%) or better than (±10%) those of the target.
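The way the target and its consistency metrics are derived can be illustrated with a minimal sketch. The peak-area values below are hypothetical, and the RDC is assumed here to be the signed relative deviation of a peak area from the target value; the function names are illustrative and are not taken from the authors' code.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two fingerprint vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rdc(sample, target):
    """Assumed RDC definition: signed relative deviation per peak."""
    return (sample - target) / target

# Hypothetical peak areas: rows are target batches, columns are peaks.
target_batches = np.array([
    [100.0, 50.0, 20.0],
    [110.0, 45.0, 22.0],
    [90.0, 55.0, 18.0],
])

# The blending target is the per-peak average of the target batches.
target = target_batches.mean(axis=0)

for name, row in zip(["T1", "T2", "T3"], target_batches):
    print(name, round(cosine_similarity(row, target), 4), np.round(rdc(row, target), 3))
```

The spread of RDCs within the target batches then suggests what limit is realistic for the blended product.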

Comparing and Choosing Samples
M1-M11, referred to as the blending samples for mixing, were qualified batches purchased by the factory. Their radar plots are shown in Figure 4. The similarities and RDCs are shown in Table 4. The blending samples have similarities as low as about 91% and RDCs ranging from −50% to 60%. Compared with the target samples (T1-T10), all the blending samples but M8 have lower similarities. Even in the M8 sample, which was most similar to the target, more than half of the peaks have RDCs beyond ±20%.

Comparing the target samples with the qualified samples shows significant differences between even the samples of ginkgo biloba that meet the market standards. This illustrates the importance of controlling product quality with blending algorithms.
The goal of our blending algorithm is to blend the qualified but inconsistent samples to improve consistency across batches. Thus, the significance of the blending algorithm is to provide a blending ratio that ensures a high overall fingerprint similarity between the product and the target and controls the RDCs of component contents.
The PCA results for the blending samples (M1-M11), the target samples (T1-T10), and their blended target are shown in Figure 5. The PCA results show that the consistency of the target samples (T1-T10) was much higher than that of the blending samples (M1-M11).
One valuable use of the PCA results is the effective selection of samples for successful blending. If the chosen blending samples are distributed on only one side of the target, the mixture is less likely to meet a given set of relative-deviation content limits as defined in Equation (7). Conversely, blending samples distributed symmetrically around the target are highly likely to meet the blending requirements.
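The selection rule described above can be illustrated with a minimal sketch. It uses hypothetical peak areas, computes PCA scores by SVD of the mean-centered data (the scaling used by the authors is not stated), and applies a simple assumed heuristic: the chosen samples count as surrounding the target if their scores fall on both sides of the target along each of the first two principal components. The names `pca_scores` and `surrounds_target` are illustrative only.

```python
import numpy as np

def pca_scores(data, n_components=2):
    """Project mean-centered rows onto the leading principal axes."""
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def surrounds_target(sample_scores, target_score):
    """True if samples lie on both sides of the target on every component."""
    rel = sample_scores - target_score
    return bool(np.all((rel.min(axis=0) < 0) & (rel.max(axis=0) > 0)))

# Hypothetical peak areas (rows: samples, columns: peaks) and target.
target = np.array([100.0, 50.0, 20.0])
samples = np.array([
    [120.0, 40.0, 25.0],
    [80.0, 60.0, 15.0],
    [110.0, 55.0, 18.0],
    [90.0, 45.0, 22.0],
])

# All four samples: distributed around the target.
scores = pca_scores(np.vstack([samples, target]))
balanced = surrounds_target(scores[:-1], scores[-1])

# Only two samples, both on the same side of the target.
subset = samples[[0, 2]]
scores_subset = pca_scores(np.vstack([subset, target]))
one_sided = surrounds_target(scores_subset[:-1], scores_subset[-1])

print(balanced, one_sided)
```

A candidate set that fails this check is not necessarily unusable, but it is a reasonable first signal that the RDC limits may be hard to satisfy.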

Calculation of the Blending Method in Different Constraints with the Same Samples
To verify that our algorithm can perform blending under different constraints, three different blending constraints were set for mixing. Constraints 1, 2, and 3 are abbreviated as C1, C2, and C3. The RDCs of P2, P5, and P6 with mixed high- and medium-component contents were controlled within ±10% in C1. The RDCs of P11, P12, and P13 with mixed high- and medium-component contents were held within ±10% in C2. The RDCs of P1, P4, and P9 with low-component contents were controlled within ±10% in C3. All information is summarized in Table 5. Using the PCA results in Figure 5, qualified samples M1, M2, M5, and M8, which are uniformly distributed around the target, were selected as blending ingredients. The calculated blending-ratio results are shown in Table 5, and the projected similarity and RDCs are shown in Table 6. The radar plot of the calculation results is shown in Figure 6. The results in Figure 6 and Table 6 show that the algorithm can successfully perform blending-optimization calculations under different constraints, with similarities of 99.9% or higher and RDCs of the selected components within ±10% in all three experiments. For P4, the RDC even improved from a minimum of −45.4% to −10% (M2).
These RDCs were lower than those of the target batches. It is difficult to make the RDCs of all peaks lower than those of the target, but it is possible to strictly control selected peaks for different application scenarios. The blending results show that the blending algorithm with sample selection can bring the RDCs of chosen peaks in the blending result below those of the target. Regardless of whether the controlled components have high or low contents, with the help of the PCA results, the blending method can effectively select the blending samples and successfully calculate the blending ratio using the blending-optimization algorithm.

Calculation of the Blending Method in the Same Constraints with Different Samples
Mixing different samples under the same constraints was also verified. According to the actual needs of manufacturers, we took the similarity as the objective function and set the RDC limit to within ±20% for P2, P5, P6, P7, P8, P11, P12, P13, P14, and P15 to meet the product requirements. Using the PCA results, three sets of blending samples were selected for mixing. Constraints 4, 5, and 6 are abbreviated as C4, C5, and C6. The mixing percentages are shown in Table 7, and their RDCs and similarities are shown in Table 8. The radar plot of the calculation results is shown in Figure 7. As shown in Table 8, the blending ratios in the three mixing tests satisfy their blending constraints, with similarities higher than 98.5% and RDCs within ±20%. In C4, the RDC of P7 dropped from 96% to −7.8%, and the RDC of P14 decreased from 93% to −14.3%.

Experimental Validation of Critical Result in Calculation
Among the results in C4-C6, blending result C4 performs the best and C6 the worst. To verify our algorithm's actual mixing ability and experimental accuracy, the C6 ratio was used for the experimental verification. The experimental validation results are shown in Figure 8 and Table 9. The model predicts results with similarity above 98% and RDCs within ±20%. The experimental results were consistent with the expected results, with content ratios between 80% and 120% and prediction deviations within the practical error range. The difference between the RDCs of the calculated and experimental results was in the range of −3.6% to 12.6%.
The experiments showed that the blending results met the constraints. The RDCs of the blending results were similar to those of the high-quality samples (T1-T10). The deviation between calculations and experiments was mainly caused by the uneven sampling of samples. If the samples can be premixed before extracting, the internal homogeneity of each sample can be improved, and the blending results can be enhanced.
The results show that the nonlinear optimization algorithm can control similarity and RDCs, working well under boundary constraints.

The actual experiments used the calculated marginal results, which made the experiments representative. Admittedly, the number of actual experiments was limited. Still, more experiments were conducted after adjusting the constraints to the producer's requirements, and pilot and large-scale production were performed. All of these experiments performed well, and the results will be shown in a later report.
The above results show that the nonlinear algorithm effectively controls the similarity and the RDCs of chosen peaks in actual production.

Reagents and Chemicals
Standard compounds were purchased from various manufacturers or prepared in the laboratory with the detailed information shown in Table S1. Fifty-three batches of ginkgo biloba were obtained from different manufacturers, and their detailed information was listed in Table S2. Acetonitrile (HPLC grade), formic acid, and phosphoric acid (analytical purity) were purchased from Sigma-Aldrich Co. (St. Louis, MO, USA). Deionized water (18.2 MΩ·cm) was purified using a Milli-Q system (Millipore, Bedford, MA, USA).

Extraction
The crushed ginkgo biloba sample was accurately weighed at 4 g and passed through a 355 µm sieve. The sample was placed in a Soxhlet extractor, then petroleum ether (30-60 °C) was added and refluxed in a 70 °C water bath for one hour. The petroleum ether (30-60 °C) solution was discarded, and the residue was evaporated and filtered through a paper cylinder. The mixture was dried in an oven at 60 °C and transferred to a corked conical flask. The conical flask was sonicated (power 200 W, frequency 40 kHz) for 30 min with 40 mL of methanol. The mixture was cooled, the weight loss was made up with methanol, and the mixture was shaken well and filtered. A 25 mL portion of the filtrate was measured, placed in an evaporating dish, and evaporated to dryness. Methanol was added to dissolve the residue, which was transferred to a 5 mL measuring flask, diluted to the mark with methanol, shaken well, and filtered, and finally the subsequent filtrate was taken.
All analyses were carried out on a Waters Alliance 2695 HPLC (Waters, Milford, MA, USA) equipped with a quaternary pump, an online vacuum degasser, and a diode-array detector. Data acquisition and processing were conducted on an Empower 3.0 workstation. Sample analysis was performed on a TNature C18 column (Waters&Acchrom-Tech, Beijing, China, 4.6 × 250 mm, 5 µm). The detection wavelengths of the UV detector were set at 235 nm and 360 nm. ELSD was set as follows: gain, 100; method, heating; power, level 75%; drift tube, 75 °C; gas pressure, 30 psi.
The mobile phase was composed of MeOH (A) and 1% FA (v/v) in H2O (B) at a flow rate of 0.9 mL/min. The gradient elution program is given in Table 1.

Blending Theory
The TCM blending-optimization model was established on the chromatographic peak areas in the fingerprint. The model is based on two hypotheses: (a) the peak area is proportional to the corresponding compound concentration; (b) there are no interactions between compounds. The blending model can be expressed by Equation (1) and, in matrix form, by Equation (2), where n is the number of samples and m is the number of components, s is the given component concentration of one sample, x is the blending ratio of one sample, and t is the given component concentration of the target. S is the matrix of component concentrations of all samples, X is the blending ratio vector, and T is the target vector of blending. Under ideal conditions, solving the blending problem means finding the solution vector X. In general, however, an exact solution vector X does not exist. Therefore, we used the cosine similarity in Equation (3) to evaluate mixing results. The blending problem can thus be converted into maximizing the objective function expressed by Equation (5).
The blending-optimization model aims to maximize the overall similarity between the target and its blending result. To control the consistency of each component of the product, the RDC of content is defined as the relative deviation of each component's content in the blend from its target value. The TCM blending-optimization model is then obtained by combining the similarity objective with the RDC limits. The blending ratio is calculated by the COBYLA algorithm [54], which constructs linear approximations L(X) to the objective and constraint functions by interpolation at the vertices of simplices, as expressed by Equation (7), where ∆ is the trust region radius; ∇L is the gradient of L(X); X0 is a point of the simplex; and X is a point different from all the vertices of the current simplex.
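The relations referred to in this section can be sketched as follows. This is a reconstruction from the definitions given in the text rather than a reproduction of the typeset equations, so no equation numbers are assigned. The symbol δ_i for the RDC limit of component i, the index set C of controlled peaks, and the sum-to-one and non-negativity conditions on the blending ratios are assumptions introduced for illustration.

```latex
\begin{aligned}
&\sum_{j=1}^{n} s_{ij}\,x_j = t_i,\quad i = 1,\dots,m
  && \text{(component-wise blending model)}\\
&S X = T,\quad S \in \mathbb{R}^{m\times n},\; X \in \mathbb{R}^{n},\; T \in \mathbb{R}^{m}
  && \text{(matrix form)}\\
&f(X) = \frac{(SX)^{\top} T}{\lVert SX\rVert_2\,\lVert T\rVert_2}
  && \text{(cosine similarity)}\\
&\mathrm{RDC}_i(X) = \frac{\sum_{j=1}^{n} s_{ij}\,x_j - t_i}{t_i}
  && \text{(relative deviation of component } i\text{)}\\
&\max_{X}\; f(X)\quad \text{s.t.}\quad \lvert \mathrm{RDC}_i(X)\rvert \le \delta_i\ (i \in C),\quad
  \sum_{j=1}^{n} x_j = 1,\quad x_j \ge 0
  && \text{(blending-optimization model)}\\
&L(X) = L(X_0) + \nabla L^{\top}(X - X_0),\quad \lVert X - X_0\rVert \le \Delta
  && \text{(linear model within the trust region)}
\end{aligned}
```

Because the cosine similarity is invariant to the overall scale of SX, the RDC constraints are what tie the absolute content of each controlled peak to the target.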

Blending Platform
Our blending-optimization model was based on constrained minimization of multivariate scalar functions. The calculations and preprocessing involving multimodel statistical analysis were performed using Python® (Beaverton, OR, USA, Version 3.7.3) and SciPy® (Austin, TX, USA, Version 1.7.3). The algorithm was based on linear approximations of the objective function and RDC constraints (COBYLA) [54].
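A minimal sketch of how such a calculation might be set up with SciPy's COBYLA solver is given below. It is not the authors' implementation: the peak areas are hypothetical, the RDC is assumed to be the signed relative deviation from the target, and the ratios are assumed to be non-negative and to sum to one. Because COBYLA accepts only inequality constraints, the sum-to-one condition is handled here by eliminating the last ratio.

```python
import numpy as np
from scipy.optimize import minimize

def similarity(S, ratios, target):
    """Cosine similarity between the blended fingerprint and the target."""
    blend = S @ ratios
    return float(blend @ target / (np.linalg.norm(blend) * np.linalg.norm(target)))

def rdc(S, ratios, target):
    """Assumed RDC: signed relative deviation of each blended peak."""
    return (S @ ratios - target) / target

def optimize_blend(S, target, rdc_limit, controlled_peaks):
    """Maximize similarity subject to RDC limits on the chosen peaks."""
    n = S.shape[1]

    def full(free):
        # The last ratio is fixed by the sum-to-one condition.
        return np.append(free, 1.0 - free.sum())

    def objective(free):
        return -similarity(S, full(free), target)

    def constraints(free):
        ratios = full(free)
        dev = rdc(S, ratios, target)[controlled_peaks]
        # Every entry must be >= 0: ratios and both sides of the RDC band.
        return np.concatenate([ratios, rdc_limit - dev, rdc_limit + dev])

    x0 = np.full(n - 1, 1.0 / n)  # start from equal feeding
    result = minimize(objective, x0, method="COBYLA",
                      constraints=[{"type": "ineq", "fun": constraints}],
                      options={"rhobeg": 0.1, "maxiter": 5000}, tol=1e-8)
    return full(result.x)

# Hypothetical peak areas: rows are peaks, columns are blending samples.
S = np.array([
    [120.0, 80.0, 100.0],
    [40.0, 60.0, 50.0],
    [25.0, 15.0, 26.0],
])
target = np.array([100.0, 50.0, 20.0])

ratios = optimize_blend(S, target, rdc_limit=0.10, controlled_peaks=[0, 1, 2])
print(np.round(ratios, 3), round(similarity(S, ratios, target), 5))
```

Changing `rdc_limit` or `controlled_peaks` corresponds to switching between constraint sets such as C1-C3, while changing the columns of `S` corresponds to choosing different blending samples.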

Conclusions
In this paper, we demonstrate the ability of nonlinear optimization-blending algorithms to better control the consistency of TCMs under different constraints or with different blending samples. We established blending standards from the average of high-quality samples and used qualified samples for blending. Using PCA and radar plots to screen samples during blending experiments also allows manufacturers to select batches quickly. When the same samples were mixed under different constraints, the RDCs of the selected peaks were controlled within ±10%, which is even better than in the high-quality sample series, regardless of the peak area. When different samples were mixed under the same constraints, the RDCs of the selected peaks were kept within ±20%. Despite some deviations from the calculated results, the similarity and RDCs satisfied the limiting constraints in the actual experiments. This gives manufacturers an essential approach for controlling different samples under different constraints to improve the stability of product quality. It significantly improves efficiency and accuracy compared with the traditional process used in the industry. The method should be of great value for herbal medicines with clear pharmacological activity or toxic substances.

Patents
This work has been patented with the patent number CN114580988A.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/molecules27154733/s1, Figure S1: Ginkgo biloba HPLC-UV-ELSD chromatograms; Figure S2: Ginkgo biloba HPLC-UV-ELSD chromatograms between samples of different origins and growth years; Figure S3: Radar diagram of Ginkgo biloba of samples of different origins and growth years; Figure S4: Ginkgo biloba HPLC-UV-ELSD chromatograms of samples used in the factory for actual production; Figure S5: Ginkgo biloba HPLC-UV-ELSD chromatograms of qualified samples acquired in the market; Table S1: Standard compounds information; Table S2: Sample Information; Table S3: The similarity and RDC between samples of different origins and growth years.