The Multivariate Regression Statistics Strategy to Investigate Content-Effect Correlation of Multiple Components in Traditional Chinese Medicine Based on a Partial Least Squares Method

A multivariate regression statistics strategy was developed to clarify the multi-component content-effect correlation of Panax ginseng saponins extract and to predict its pharmacological effect from component content. In example 1, we first compared the pharmacological effects of Panax ginseng saponins extract and individual saponin combinations. Second, we examined the anti-platelet aggregation effect of seven different saponin combinations of ginsenosides Rb1, Rg1, Re, Rd, and Ra3 and notoginsenoside R1. Finally, the correlation between anti-platelet aggregation and the content of the multiple components was analyzed by a partial least squares algorithm. In example 2, 18 common peaks were first identified in ten batches of Panax ginseng saponins extract from different origins. We then investigated the effects of the ten extracts against myocardial ischemia-reperfusion injury. Finally, the correlation between the fingerprints and the cardioprotective effects was analyzed by a partial least squares algorithm. In both examples, the relationship between component content and pharmacological effect was modeled well by the partial least squares regression equations. Importantly, the predicted effect curves were close to the observed data points plotted on the partial least squares regression model. This study provides evidence that multi-component content is promising information for predicting the pharmacological effects of traditional Chinese medicine.


Introduction
Traditional Chinese medicine is becoming increasingly popular worldwide because of its safety and efficacy [1][2][3][4]. A thorough understanding of the chemical composition of a traditional Chinese medicine is essential for its comprehensive assessment. The chemical constituents of an herb and their amounts vary with growing conditions such as climate and soil, as well as with the drying process, the harvest season, and other factors. Studies of traditional Chinese medicines mainly focus on the chemical content, pharmacology, and pharmacokinetics of the reported main ingredients [5][6][7][8]. However, the relationship between the multiple components of a traditional Chinese medicine and its efficacy also deserves attention, because the lack of such a relationship strongly restricts dose-effect applications, especially in model research. Herbal medicine analysis and curative effect should therefore be linked to each other, which can help guarantee the clinical safety and effectiveness of traditional Chinese medicine. The traditional dose-effect models, such as the direct model, the effect compartment model, and the indirect response model, are not appropriate for traditional Chinese medicine. We therefore applied a statistical strategy based on a partial least squares model to predict the pharmacological effect from the content of multiple components in traditional Chinese medicine. The partial least squares model was first used in chemistry, where the wavelengths of the infrared reflection spectrum were treated as the independent variables to predict chemical composition [9]. As the method developed, it has been used in more and more fields, such as chemistry, engineering, and economics [10][11][12].
Compared with ordinary least squares regression, the partial least squares method has notable advantages when the number of observations is small and the predictors are correlated with one another. Because the method is based on the singular value decomposition of a multidimensional array, a key advantage of partial least squares regression (PLSR) is that it accommodates multiple input variables [13]. It is also possible to use a hybrid input of several types of variables, such as saponin combinations, identified fingerprints, and component content.
Panax ginseng saponins extract has been one of the most frequently used traditional Chinese medicines in Asia for more than 2000 years because of its tonic functions. It shows a wide range of pharmacological effects in cancer, diabetes, and the cardiovascular, immune, and central nervous systems [14][15][16][17][18][19][20][21][22]. In the present study, Panax ginseng saponins extract was taken as an example to investigate the correlation between its protective effects on the cardiovascular system and its content of multiple components, based on a partial least squares model. As a novel attempt, this approach is an effective complement to the traditional dose-effect models. The multivariate regression statistics strategy could assess the contribution of each component to the efficacy of an herbal medicine and, in turn, predict the potential efficacy of the herbal medicine from its component composition. This would have important application value in the research and development of traditional Chinese medicine.
Adenosine diphosphate disodium (ADP-2Na), heparin, and sodium azide were purchased from Sigma-Aldrich Co. Ltd. (Shanghai, China). Sodium chloride, potassium chloride, sodium bicarbonate, magnesium chloride, sodium dihydrogen phosphate, and calcium chloride were purchased from Nanjing Chemical Reagent Co. Ltd. (Nanjing, China). D-glucose was provided by Chinese Pharmaceutical Group Shanghai Chemical Reagent Co. Ltd. (Shanghai, China). Acetonitrile and phosphoric acid of HPLC grade were purchased from Merck Co. Inc. (Shanghai, China). All other reagents were of analytical grade.

Animals
Male SD rats (200-250 g) were purchased from Beijing Vital River Laboratory Animal Technology Co., Ltd. (Beijing, China). Animals were kept on a 12 h light/dark cycle at the animal center of China Pharmaceutical University for a minimum of three days before experiments. All of the experiments were approved by the animal ethics committee at China Pharmaceutical University.

Cell Culture and Hypoxia
The H9C2 embryonic rat heart-derived cells were obtained from the American Type Culture Collection (ATCC; VA, USA) and maintained in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% v/v fetal bovine serum and 100 mg/mL penicillin/streptomycin at 37 °C in a humidified atmosphere containing 5% CO2. The cells were fed every 2-3 days and sub-cultured once they reached 70-80% confluence. To induce hypoxia, cells were placed in a GasPak EZ Gas Generating Pouch System for 8 h and incubated with serum-free and glucose-free DMEM, as described previously. As a normoxia control, serum-free DMEM was added to the cells, which were incubated for 8 h under normoxic conditions (21% O2).

Instruments
The QX-100 platelet aggregation analyzer was obtained from Shanghai Fudan University Instrument Factory (Shanghai, China). Drug analysis was performed with a Shimadzu HPLC-UV system (Tokyo, Japan) comprising an LC-10AD pump, a UV-visible detector, and a CTO-10A column oven. Milli-Q pure water instruments were from Millipore Co. Ltd. (Hong Kong, China).

The Preparation of the Panax ginseng Saponins Extract
The plant part used in our study was the dried root of Panax ginseng. The ginseng was provided by the ginseng herbs GAP production base of Guizhou Yi Bai Pharmaceutical Co., Ltd. (Fusong County, Jilin Province, China) and was identified as the dried roots of Panax ginseng C.A. Mey. (Fam. Araliaceae) by Professor Sun Qishe of Shenyang Pharmaceutical University (China). Panax ginseng (400 g) was pulverized, mixed, and steeped in 2400 mL of drinking water for 0.5 h at room temperature before a 1 h sonication-enhanced extraction. The extract was separated by filtration, and the residue was re-extracted with 1600 mL of water. The pooled extract was concentrated under reduced pressure at 40 °C and modified with hydroxypropyl methylcellulose at 0.3% (g/mL) before water was added to 800 mL to yield the Panax ginseng saponins extract. The extract was stored at 15 °C pending use and was dissolved and mixed in pure water before use. The contents of GRb1, GRg1, GRd, NGR1, GRe, and GRa3 determined by HPLC were 34.98%, 26.36%, 8.76%, 7.93%, 4.61%, and 3.82%, respectively (Table 1). The total content of the six saponins reached 86% of the Panax ginseng saponins extract, suggesting that these six saponins are its main material basis.
NaCl 4 g, KCl 0.1 g, D-glucose 0.5 g, NaHCO3 0.75 g, MgCl2·6H2O 0.2135 g, NaH2PO4 0.0325 g, heparin 6.67 mg (1000 U), and CaCl2 0.10 g were dissolved in pure water, and then 0.5 mL of 0.1% NaN3 solution was added. The solution was transferred to a 500 mL volumetric flask and made up to the mark with pure water.

Determination of Anti-Platelet Aggregation
The QX-100 platelet aggregation analyzer was turned on 1 h before the test. The diluted plasma tubes containing stir beads were placed in the preheating hole of the analyzer and, after 5 min, moved to the test hole. The electrode was placed in the tube and the stirring key was pressed, and then 5 µL of diluted ADP-2Na solution was added to the plasma sample. Aggregation began after 15 to 30 s, and the platelet aggregation resistance value (Ω) was then recorded.
The inhibition of platelet aggregation in vitro was calculated as follows: aggregation inhibition rate (%) = (platelet aggregation resistance value in control group − platelet aggregation resistance value in test group)/(platelet aggregation resistance value in control group) × 100. The inhibition of platelet aggregation in vivo was calculated as follows: aggregation inhibition rate (%) = (platelet aggregation resistance value before administration − platelet aggregation resistance value after administration)/(platelet aggregation resistance value before administration) × 100.
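The two formulas above share one form and can be sketched in a few lines of Python. This is only an illustration: the function name `inhibition_rate` and the resistance readings in the usage line are hypothetical and are not values from the study.

```python
def inhibition_rate(reference_ohm, test_ohm):
    """Aggregation inhibition rate (%) from two resistance readings (ohms).

    reference_ohm: control-group reading (in vitro) or pre-dose reading (in vivo).
    test_ohm: test-group reading (in vitro) or post-dose reading (in vivo).
    """
    if reference_ohm <= 0:
        raise ValueError("reference resistance must be positive")
    return 100.0 * (reference_ohm - test_ohm) / reference_ohm


# Hypothetical readings, for illustration only.
print(inhibition_rate(10.0, 6.0))
```

The same function serves both settings because only the meaning of the reference reading changes between the in vitro and in vivo formulas.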

2.5.6. The Comparison of Anti-Platelet Aggregation between the Panax ginseng Saponins Extract and Individual Saponin Combinations
Accurately weighed GRb1, GRg1, GRd, NGR1, GRe, and GRa3 were mixed and dissolved in pure water to obtain the individual saponin combinations. The in vitro study was performed in seven groups: a saline group; Panax ginseng saponins extract at 50 µg/mL, 100 µg/mL, and 200 µg/mL; and individual saponin combinations at a low dose (each saponin at the same dose as in Panax ginseng saponins extract at 50 µg/mL), a middle dose (each saponin at the same dose as in the extract at 100 µg/mL), and a high dose (each saponin at the same dose as in the extract at 200 µg/mL). The Panax ginseng saponins extract and the individual saponin combinations were each added to 0.45 mL of blank plasma containing heparin at 20 U/mL. The platelet aggregation resistance value was determined and the anti-platelet aggregation rate was calculated.
In the in vivo study, forty-two male rats were randomly assigned to seven groups to receive an intravenous administration of saline; Panax ginseng saponins extract at 5 mg/kg, 10 mg/kg, or 20 mg/kg; or an individual saponin combination at a low dose (each saponin at the same dose as in Panax ginseng saponins extract at 5 mg/kg), a middle dose (each saponin at the same dose as in the extract at 10 mg/kg), or a high dose (each saponin at the same dose as in the extract at 20 mg/kg). Plasma samples were collected predose and 3 h after drug administration. The platelet aggregation resistance value was determined and the anti-platelet aggregation rate was calculated.
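As a worked example of the dose matching described above, the dose of each saponin in a combination can be obtained by multiplying the extract dose by that saponin's HPLC content. The sketch below uses the content percentages reported for the extract; the function name `matched_doses` is hypothetical, and the calculation assumes the matching was done by simple proportion, which the text implies but does not spell out.

```python
# HPLC content of the six saponins in the extract (percent of extract mass).
CONTENT_PERCENT = {
    "GRb1": 34.98, "GRg1": 26.36, "GRd": 8.76,
    "NGR1": 7.93, "GRe": 4.61, "GRa3": 3.82,
}


def matched_doses(extract_dose):
    """Dose of each saponin equal to its amount in `extract_dose` of extract.

    The units follow the input (e.g. ug/mL in vitro, mg/kg in vivo).
    """
    return {name: extract_dose * pct / 100.0
            for name, pct in CONTENT_PERCENT.items()}


for dose in (50, 100, 200):  # in vitro extract concentrations, ug/mL
    rounded = {name: round(value, 2) for name, value in matched_doses(dose).items()}
    print(dose, rounded)
```

At 50 µg/mL of extract, for instance, this gives about 17.49 µg/mL of GRb1.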

The Anti-Platelet Aggregation of Different Individual Saponin Combinations
Different individual saponin combinations were prepared by accurately weighing the saponins and dissolving them in pure water. Six major saponins of the Panax ginseng saponins extract (GRb1, GRg1, GRd, NGR1, GRe, and GRa3) were used to set up seven different individual saponin combinations based on a uniform design. The anti-platelet aggregation effect of the seven combinations was then studied in vitro (Table 2) and in vivo (Table 3). In vitro, each of the seven saponin combinations was added to 0.45 mL of blank plasma from male SD rats containing heparin at 20 U/mL. The platelet aggregation resistance value was determined and the anti-platelet aggregation rate was calculated. In vivo, forty-two male rats were randomly assigned to the seven individual saponin combinations and received an intravenous administration. Plasma samples were collected predose and 3 h after drug administration. The platelet aggregation resistance value was determined and the anti-platelet aggregation rate was calculated. To analyze the correlation between the anti-platelet aggregation effect and the individual saponins by partial least squares regression, both the total concentration and the individual saponin concentrations were varied across the seven combinations so that differences in the anti-platelet aggregation effect between groups could be observed. Each of the six saponins was tested at seven concentrations (levels 1-7) in both the in vitro and the in vivo studies, according to the uniform design U7(7^6).
The contents of the six saponins in the seven individual saponin combinations were the six independent variables, and the anti-platelet aggregation effect was the only dependent variable. The correlation between the content of the six saponins and the anti-platelet aggregation effect was analyzed by partial least squares regression. The content-effect data matrix was constructed with the observations in columns and the responses as variables in rows.
The contents of the six saponins in each saponin combination formed the X-matrix, and the anti-platelet aggregation effect formed the Y-matrix. Data processing was carried out with SIMCA-P 11 software (Umetrics, Umeå, Sweden). Because there was only one dependent variable, the single-response partial least squares regression model (PLS1) was selected.
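To make the PLS1 step concrete, the following is a minimal sketch of single-response partial least squares fitted with the NIPALS algorithm in plain Python. It is not the SIMCA-P implementation: the function names `pls1_fit` and `pls1_predict` are hypothetical, only mean centering is applied (the scaling used in SIMCA-P is not stated in the text), and the demonstration data are synthetic.

```python
def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS on mean-centred data; return (coefficients, intercept)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    R, P, Q = [], [], []  # X-to-score weights, X loadings, y loadings
    for _ in range(n_components):
        # Weight vector: direction of maximal covariance between X and y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # r maps the original centred X (not the deflated one) to the score t.
        r = w[:]
        for r_k, p_k in zip(R, P):
            c = sum(p_k[j] * w[j] for j in range(p))
            r = [r[j] - c * r_k[j] for j in range(p)]
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        R.append(r)
        P.append(load)
        Q.append(q)
    coef = [sum(q * r[j] for q, r in zip(Q, R)) for j in range(p)]
    intercept = y_mean - sum(c * m for c, m in zip(coef, x_mean))
    return coef, intercept


def pls1_predict(X, coef, intercept):
    """Predict the response for each row of X."""
    return [intercept + sum(c * v for c, v in zip(coef, row)) for row in X]


# Synthetic data (not from the study): y = 1 + 2*x1 + 3*x2.
X_demo = [[1, 0], [0, 1], [1, 1], [2, 1], [0, 3]]
y_demo = [1 + 2 * a + 3 * b for a, b in X_demo]
coef, intercept = pls1_fit(X_demo, y_demo, n_components=2)
print(coef, intercept)
```

In the study's setting, X would hold the six saponin contents for the seven combinations and y the measured anti-platelet aggregation rates, with two components retained. When as many components are kept as there are independent predictors, PLS1 reproduces the ordinary least squares fit; its benefit with few observations comes from keeping fewer components.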

Cell Treatment
The cell studies comprised 12 groups: a sham control group, an ischemia-reperfusion (IR) group, and ten IR groups each treated with a different batch of Panax ginseng saponins extract (S1-S10) at 300 µg/mL. In the sham control group, cells were maintained in Dulbecco's modified Eagle's medium supplemented with 10% v/v fetal bovine serum. The IR model group was established by hypoxia for 3 h followed by reoxygenation for 2 h. In the Panax ginseng test groups, the cells were treated with Panax ginseng saponins extract (300 µg/mL) for 2 h before the induction of hypoxia for 3 h and reoxygenation for 2 h.

MTT Assay
The MTT assay was performed to determine the effect of Panax ginseng saponins extract on the viability of the H9C2 embryonic rat heart-derived cells. First, cells (5 × 10^4 cells/mL) were seeded in 96-well plates. After two days of culture, the different batches of Panax ginseng saponins extract (S1-S10) were added and incubated for 24 h. MTT solution was then added to each well at a final concentration of 0.5 mg/mL, and incubation continued for 4 h at 37 °C. At the end of the incubation, the culture medium was discarded and 150 µL of DMSO was added to each well to dissolve the dark blue formazan crystals. The absorbance was read at 570 nm using a FLUOstar Omega plate reader (BMG LABTECH, Ortenberg, Germany).

Animal Studies
Rats were randomly divided into 12 groups of 10 rats each: a sham control group, an ischemia-reperfusion (IR) group, and ten IR groups each treated with a different batch of Panax ginseng saponins extract (S1-S10) at 100 mg/kg/d. In the sham and IR groups, rats were given saline by oral gavage for 15 days before the surgical operation. In the Panax ginseng test groups, rats were given Panax ginseng saponins extract at 100 mg/kg/d by oral gavage for 15 days before the IR operation. The myocardial ischemia-reperfusion (IR) rat model was established using the method described by Luo et al. [28] with slight modifications. Briefly, Sprague-Dawley rats were anaesthetized by intraperitoneal injection of 10% chloral hydrate (350 mg/kg), and anaesthesia was maintained by bolus injection of 10% chloral hydrate (60-80 mg/kg, i.v.) as required. The neck was dissected and a tracheostomy was performed to provide artificial ventilation (60 strokes/min at a tidal volume of 10 mL/kg). The fourth and fifth ribs on the left side of the chest were cut to perform the thoracotomy and to incise the pericardium. The heart was gently exteriorized and a 5/0 silk suture was passed around the left anterior descending coronary artery. The suture was then ligated, and the ends of the ligature were passed through a small vinyl tube to form a snare. After 30 min of ischemia, the snare was removed gently and the myocardium was reperfused for 90 min. Ischemia was confirmed by ST-segment elevation in the electrocardiogram and by color changes in the ischemic myocardial area. Rats in the sham group underwent thoracotomy, but the left anterior descending coronary artery was not ligated. After the IR surgery, the animals were sacrificed with an overdose of 10% chloral hydrate (500 mg/kg, i.v.), and blood samples were collected from the abdominal aorta. After the blood samples had stood for 30 min, the serum was separated by centrifugation at 4 °C (3000 r/min, 10 min) and stored at −80 °C for further study.

2.6.5. Determination of cTnI, CK and LDH
cTnI, CK, and LDH are three cardiac injury markers. cTnI is one of the three subunits of troponin and is a protein unique to the myocardium. CK refers to creatine kinase and LDH to lactate dehydrogenase; these two indicators are often increased in myocardial injury or necrosis. The levels of cTnI were analyzed using an ELISA kit (USCN Life Science, Inc., Wuhan, China), and the data were measured on a microplate reader (Bio-Tek Instruments, Inc., Winooski, VT, USA). The levels of CK and LDH were determined using commercial kits (Jian Cheng Bioengineering Institute, Nanjing, China). For the in vitro cell study, the culture medium was collected to measure cTnI, CK, and LDH with the kits mentioned above, according to the manufacturers' instructions. For the in vivo animal study, the serum levels of cTnI, CK, and LDH were analyzed.

The Relevance Analysis between Fingerprint and Drug Effects
The peak areas of the 18 components (P1-P18) identified in the fingerprint study of the ten batches of Panax ginseng saponins extract (S1-S10) were regarded as 18 independent component variables. For each component, the peak area was divided by the average area of the reference peak in the ten Panax ginseng saponins extract samples for non-dimensionalization (Table 4). The values of cTnI, CK, LDH, and the anti-platelet aggregation rate were the four dependent variables. The correlations between the relative contents of the eighteen independent components and the four dependent variables were analyzed by partial least squares regression. The relative content-effect data matrix was constructed with the observations in columns and the responses as variables in rows. The relative contents of the eighteen components from the ten different origins formed the X-matrix, and the four effects from the ten different origins formed the Y-matrix. Data processing was carried out with SIMCA-P 11 software.
Table 4. Dimensionless data of peak areas of eighteen identified components (P1-P18) in ten different batches of Panax ginseng saponins extracts from different origins (S1-S10) in example 2.

The Comparison of Anti-Platelet Aggregation between Panax ginseng Saponins Extract and Individual Saponins Combination
In the in vitro study of part 2.5.6, the anti-platelet aggregation effect of the individual saponin combination reached 90.3%, 92.9%, and 93.0% of that of Panax ginseng saponins extract at 50 µg/mL, 100 µg/mL, and 200 µg/mL, respectively. Similarly, in the in vivo study of part 2.5.6, the anti-platelet aggregation effect of the individual saponin combination reached 89.2%, 91.5%, and 92.3% of that of Panax ginseng saponins extract at 5 mg/kg, 10 mg/kg, and 20 mg/kg, respectively. Thus, the six saponins in the combination (NGR1, GRg1, GRd, GRe, GRb1, and GRa3) could be recognized as the pharmacological markers of Panax ginseng saponins extract both in vitro and in vivo. These results showed that the contents of NGR1, GRg1, GRd, GRe, GRb1, and GRa3 in Panax ginseng saponins extract could be used as the independent variables in the partial least squares method.

The Anti-Platelet Aggregation Effect of Different Individual Saponin Combinations
The anti-platelet aggregation effects of the seven uniform-design combinations of NGR1, GRg1, GRd, GRe, GRb1, and GRa3 are shown in Table 2 (in vitro) and Table 3 (in vivo). Based on these data sets, a partial least squares regression model was introduced to evaluate the relationship between the content of the six saponins and the anti-platelet aggregation effect of the different individual saponin combinations. In vitro, this produced a good partial least squares regression model with two principal components (R2X = 0.861, R2Y = 0.905, Q2Y = 0.883). The regression coefficients of NGR1, GRg1, GRe, GRb1, GRd, and GRa3 on anti-platelet aggregation were 0.025, 0.371, 0.113, 0.346, 0.028, and 0.218, respectively (Figure 1A). Similarly, in vivo, it produced a good partial least squares regression model with two principal components (R2X = 0.852, R2Y = 0.881, Q2Y = 0.864). The regression coefficients of NGR1, GRg1, GRe, GRb1, GRd, and GRa3 on anti-platelet aggregation were 0.014, 0.382, 0.108, 0.356, 0.017, and 0.219, respectively (Figure 2A). The greater the coefficient of an individual saponin, the more significant its effect on anti-platelet aggregation. The variable importance (VIP) values in vitro and in vivo also showed the contribution of each individual saponin to the anti-platelet aggregation effect (Figures 1B and 2B), with the same trend as the regression coefficients. The larger the VIP value of an individual saponin, the greater its contribution to the anti-platelet aggregation effect. Importantly, the predicted anti-platelet aggregation curves were close to the observed data points plotted on the partial least squares regression model (Figures 1C and 2C). These results showed that the anti-platelet aggregation effect of different individual saponin combinations could be predicted accurately by this method.
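For readers unfamiliar with the reported statistics, the definitions commonly used for a single-response model are sketched below. The text does not state the exact formulas or the cross-validation scheme used by SIMCA-P, so this should be read as the conventional formulation rather than a description of the software. Here $\hat{y}_i$ is the fitted value, $\hat{y}_{(i)}$ is the cross-validated prediction for observation $i$ from a model fitted without it, $p$ is the number of predictors, $A$ is the number of components, $w_{ja}$ is the weight of predictor $j$ in component $a$ (with each weight vector scaled to unit length), and $t_a$ and $q_a$ are the score vector and response loading of component $a$.

```latex
R^2Y = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},
\qquad
Q^2Y = 1 - \frac{\sum_i \bigl(y_i - \hat{y}_{(i)}\bigr)^2}{\sum_i (y_i - \bar{y})^2}

\mathrm{VIP}_j = \sqrt{\, p \,
  \frac{\sum_{a=1}^{A} SS_a \, w_{ja}^2}{\sum_{a=1}^{A} SS_a} \,},
\qquad
SS_a = q_a^2 \, t_a^{\mathsf{T}} t_a
```

Under this definition the squared VIP values sum to $p$, so their average is 1, which is why predictors with VIP above 1 are usually regarded as the more influential ones.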

The Fingerprint Studies of Panax ginseng Saponins Extract from Different Origins
Eighteen common peaks (P1-P18) of Panax ginseng saponins extract from ten different origins (S1-S10) were identified in the HPLC fingerprint study. P4 was identified as ginsenoside Re (GRe) by comparison with the GRe reference standard. GRe is one of the main active components in our pharmacological studies, and the peak area of P4 was relatively high, with good resolution and symmetry, so P4 was chosen as the reference peak. For each component, the peak area was divided by the average area of P4 in the ten Panax ginseng saponins extract samples for non-dimensionalization (Table 4).
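The non-dimensionalization step can be sketched as follows. The function name `relative_peak_areas` and the nested-dictionary layout are hypothetical choices for illustration, and the peak areas in the usage lines are synthetic rather than taken from Table 4.

```python
def relative_peak_areas(areas, reference_peak):
    """Divide every peak area by the mean area of the reference peak.

    areas: {batch: {peak: area}}, e.g. {"S1": {"P1": ..., "P4": ...}, ...}
    reference_peak: name of the reference peak, e.g. "P4".
    """
    ref_mean = sum(peaks[reference_peak] for peaks in areas.values()) / len(areas)
    return {
        batch: {peak: area / ref_mean for peak, area in peaks.items()}
        for batch, peaks in areas.items()
    }


# Synthetic peak areas for two batches and two peaks, for illustration only.
demo = {"S1": {"P3": 50.0, "P4": 90.0}, "S2": {"P3": 80.0, "P4": 110.0}}
print(relative_peak_areas(demo, "P4"))
```

Because a single divisor (the mean reference area across batches) is used for all samples, differences between batches in the reference peak itself are preserved, and the relative areas of the reference peak average to 1.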

The Effects of Panax ginseng Saponins Extract from Different Origins
The potential cardioprotective effects of Panax ginseng saponins extract against ischemia-reperfusion (IR) injury were estimated by determining the levels of cTnI, CK, and LDH in cell medium (in vitro) and rat serum (in vivo). As shown in Table 5, the cTnI, CK, and LDH values in the IR model group increased significantly compared with the sham control group, and they declined to varying degrees after pretreatment with the batches of Panax ginseng saponins extract from ten different origins (S1 to S10). The degree of reversal differed among the ten batches. S6 showed the strongest reversal of the cTnI value, while S8 showed the weakest. S4 showed the strongest reversal of the CK and LDH values, while S9 showed the weakest. Similarly, in the platelet aggregation inhibition assay, the platelet aggregation rate increased significantly in the IR group compared with the sham control group and decreased after Panax ginseng saponins extract was added. Among the ten batches, the platelet aggregation rate of S2 was the lowest (close to that of the sham control group), while that of S1 was the highest.
Table 5. The effects of ten different batches of Panax ginseng saponins extracts (S1-S10) in the in vitro and in vivo studies in example 2 (column groups: Groups, In Vitro Study, In Vivo Study).

The Relevance between the Fingerprints and the Effects
To evaluate the relationship between the four pharmacological markers and the eighteen components of the Panax ginseng saponins extracts, a partial least squares regression model was introduced. In vitro, this produced a good partial least squares regression model with two principal components (R2X = 0.83, R2Y = 0.87, Q2Y = 0.85), and a good model with two principal components (R2X = 0.85, R2Y = 0.89, Q2Y = 0.83) was also produced in vivo. The regression coefficients of the eighteen components on the four pharmacological markers are shown in Table 6. The VIP values in vitro and in vivo also showed the contribution of each component to the four markers (Table 7). Most importantly, the predicted effect curves were close to the observed data points plotted on the partial least squares regression model (Figures 3 and 4). These results showed that the effects of Panax ginseng saponins extract from different origins could be predicted accurately by this method.

Discussion
In this study, a partial least squares model was applied to investigate the relationship between component content and pharmacological effect in traditional Chinese medicine. In example 1, the pharmacological effects of seven different individual saponin combinations were studied with six independent variables (the six saponin doses) and one dependent variable (the anti-platelet aggregation effect). First, the number of observations (n = 7) was small relative to the six independent variables. With seven observations and six predictors plus an intercept, ordinary least squares regression has as many parameters as observations and leaves no residual degrees of freedom, so its estimates are unstable. The partial least squares model, however, fitted the data well despite the small number of observations. Second, in traditional Chinese medicine a synergistic pharmacological effect of multiple components exists alongside the individual effect of each component. Ordinary multiple regression was not able to analyze the interactions among multiple components, whereas the partial least squares model was able to analyze not only the correlations between each individual component and the pharmacological effect but also the synergistic effect of multiple components. The pharmacological effects of the different individual saponin combinations predicted by the partial least squares model were close to the observed values, indicating the strong predictive function of the model. Furthermore, this method can guide the optimization of the extraction process for traditional Chinese medicine by increasing, as far as possible, the contents of the components that contribute most to the effect. In this example, the contributions of GRg1, GRe, and GRa3 to the anti-platelet aggregation effect of the different saponin combinations were relatively large according to the VIP diagrams (Figures 1B and 2B). Thus, GRg1, GRe, and GRa3 had a high positive correlation with the pharmacological effect, indicating the strong interpretive function of the model.
Before the multivariate statistical analysis in example 1, the anti-platelet aggregation effects of Panax ginseng saponins extract and the individual saponin combination (in which the dose of each saponin was equal to that in the extract) were compared. The anti-platelet aggregation effect of the individual saponin combination amounted to 92% (in vitro) and 90% (in vivo) of that of the Panax ginseng saponins extract, suggesting that the six saponins NGR1, GRg1, GRd, GRe, GRb1, and GRa3 are the material basis of the antithrombotic effect of the extract. The total content of the six saponins reached 86% of the Panax ginseng saponins extract. The investigations were therefore designed to analyze the relationship between the content of multiple components and the pharmacological effect by changing the individual saponin content in different individual saponin combinations. Accordingly, there are two key preconditions for applying this method. First, the content of the multiple components in the traditional Chinese medicine should be known, and the selected components should be the main material basis of the herbal extract. Second, the selected components should make a major contribution to the pharmacological effect of the herbal medicine. Partial least squares regression analysis can be applied to multi-component content-effect analysis in traditional Chinese medicine if both preconditions are satisfied.
The first attempt (example 1) lacked information on the real conditions of traditional Chinese medicine. We therefore applied the partial least squares method to real combinations in example 2 to investigate the relationship between the fingerprint dataset and the pharmacological effects of Panax ginseng saponins extract. First, the HPLC fingerprints of ten batches of Panax ginseng saponins extract from different origins were studied, and eighteen common peaks, regarded as eighteen independent variables, were identified. Then, the protective effects of Panax ginseng saponins extract on the cardiovascular system were studied, and the cTnI, CK, and LDH values and the anti-platelet aggregation rate, regarded as four dependent variables, were selected as pharmacological indicators. Finally, an 18 × 10 X-matrix (fingerprints) and a 4 × 10 Y-matrix (effects) were formed from the raw data. This method can be used to quantify the content-effect relationship of each component and to predict the effect against myocardial ischemia-reperfusion injury directly from fingerprint datasets.
Partial least squares analysis of the fingerprint datasets of traditional Chinese medicines has three important advantages. First, more effective monomers of traditional Chinese medicine, such as natural drugs, could be found. The relative contribution and the potential impact of each component on the pharmacological effects can be analyzed by the partial least squares method. The monomers with high contributions can then be separated, extracted, enriched, and purified, and their pharmacological effects can be investigated separately to verify the partial least squares results. This would be an effective way to select active natural components from a large number of traditional Chinese medicines for new drug development. Second, our method could guide the optimization of the extraction process for traditional Chinese medicines: the amounts of active components could be selectively increased by improving the extraction method. Third, our method could provide useful information for clinical dosing regimens. Because the pharmacological efficacy of a traditional Chinese medicine can be estimated by our method from its fingerprint dataset, the dose regimens and dose formulations could be adjusted accordingly. For the modernization of traditional Chinese medicine, the study of the content-effect relationship has always been a key technical bottleneck, mainly because of the absence of an appropriate methodology [29]. The partial least squares analysis method effectively associates a large number of fingerprint spectra with pharmacological data to form a more comprehensive research system for clarifying multi-component content-effect relationships, which would give great impetus to research on herbal medicine.

Conclusions
In this manuscript, a multivariate regression statistics strategy based on a partial least squares model was introduced to investigate the relationship between the content of multiple components and the pharmacological effects of Panax ginseng saponins extract. For both the saponin combinations in example 1 and the fingerprint spectra in example 2, the content-effect correlation was fitted well by the partial least squares regression equations. The predicted effect curves were close to the observed data points plotted on the partial least squares regression model. This study demonstrated that the multivariate regression statistics strategy could assess the contribution of each component to the efficacy of an herbal medicine and, in turn, predict the potential efficacy of the herbal medicine from its component composition. This would have important application value in the research and development of traditional Chinese medicine.