High-Sensitivity Determination of Nutrient Elements in Panax notoginseng by Laser-induced Breakdown Spectroscopy and Chemometric Methods

High-accuracy and fast detection of nutritive elements in the traditional Chinese medicine Panax notoginseng (PN) helps assess the dietary and pharmaceutical value of PN herbs. Laser-induced breakdown spectroscopy (LIBS) was applied for high-accuracy and fast quantitative detection of six nutritive elements in PN samples from eight producing areas. More than 20,000 LIBS spectral variables were obtained to show elemental differences in PN samples. Univariate and multivariate calibrations were used to analyze the quantitative relationship between spectral variables and elements. Multivariate calibration based on full spectra and on variables selected by the least absolute shrinkage and selection operator (Lasso) weights was used to compare the prediction ability of the partial least-squares regression (PLS), least-squares support vector machines (LS-SVM), and Lasso models. More than 90 emission lines for elements in PN were found and located. Univariate analysis was adversely affected by matrix effects. For potassium, calcium, magnesium, zinc, and boron, LS-SVM models based on the selected variables obtained the best prediction performance with Rp values of 0.9546, 0.9176, 0.9412, 0.9665, and 0.9569 and root mean squared error of prediction (RMSEP) of 0.7704 mg/g, 0.0712 mg/g, 0.1000 mg/g, 0.0012 mg/g, and 0.0008 mg/g, respectively. For iron, the Lasso model based on full spectra obtained the best result with an Rp value of 0.9348 and RMSEP of 0.0726 mg/g. The results indicated that the LIBS technique coupled with proper multivariate chemometrics could be an accurate and fast method for determining PN nutritive elements in traditional Chinese medicine management and pharmaceutical analysis.


Introduction
Panax notoginseng (commonly known as Sanqi or Tianqi) is a valuable Chinese herbal medicine that is in clinical trials or in clinical practice [1,2]. The U.S. Dietary Supplement Health and Education Act advocated Panax notoginseng (PN) as a dietary supplement in 1994 [3]. Modern research has shown

Panax notoginseng Samples
All PN samples were obtained from 8 areas of Yunnan province in China, namely Xichou (1), Yongde (2), Malipo (3), Mile (4), Gejiu (5), Gengma (6), Shizong (7), and Qiubei (8). All samples were purchased in the 2017 harvest season. Thirteen pieces of PN were sampled randomly from the PN set of each origin, giving a total of 104 PN samples for further analysis. Each piece of PN was dried in an oven at 40 °C for approximately 3 h and ground into powder by a tissue milling machine for consistent measurement. Two hundred milligrams of each PN powder were placed into a square die set and pressed into a pellet at 700 MPa for 30 s. A total of 104 pellets was obtained, each of which was square with a side length of 10 mm.

Spectral Acquisition
A self-assembled LIBS setup was used for LIBS spectral acquisition [28]. Laser pulses at 532 nm with a maximum energy of 200 mJ and an 8-ns pulse width were generated by a Q-switched Nd:YAG pulse laser (Vlite 200, Beamtech, Beijing, China). The green laser passed through our self-made optical system and was focused 2 mm below the surface of a PN pellet through a plano-convex lens (f = 100 mm). The laser ablated the PN and generated a high-temperature plasma, which contained atoms, ions, electrons, and molecules. During the outward diffusion of the plasma, each element in the plasma ionized to form continuous spectral lines. The elemental emission was collected by a light collector and received by the spectrometer (ME5000, Andor Technology, Belfast, U.K.) combined with an ICCD camera (DH334T-18F-03, Andor Technology, Belfast, U.K.). Spectra between 230.77 and 883.24 nm were collected with high resolution (λ/∆λ = 5000). A delay generator (DG645, Stanford Research Systems, Sunnyvale, CA, USA) was used to regulate the delay time between the ICCD camera and the laser Q-switch that controlled the laser generation. The experimental parameters were optimized to a laser energy of 60 mJ, a delay time of 2.827 µs, and a gate width of 20 µs. Each PN pellet was placed on the sample stage of an x-y-z motorized positioning system, which controlled the laser ablation path in a 4 × 4 array. Therefore, 4 × 4 craters appeared on the pellet surface, and each crater received five accumulated laser pulses. The spectrum for each sample was recorded as the average of the 80 spectra (4 × 4 × 5) to reduce point-to-point laser fluctuation. The distance between adjacent hits was 2 mm.
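The per-sample averaging described above can be illustrated with a minimal sketch. It assumes each of the 80 spectra is stored as a list of intensities on a common wavelength grid; the function name and the three-channel toy data are hypothetical and are not part of the authors' acquisition software.

```python
from statistics import fmean

def average_spectrum(spectra):
    """Average a list of equal-length spectra channel by channel."""
    if not spectra:
        raise ValueError("need at least one spectrum")
    n_channels = len(spectra[0])
    if any(len(s) != n_channels for s in spectra):
        raise ValueError("all spectra must have the same number of channels")
    # zip(*spectra) groups the intensities recorded at the same wavelength channel
    return [fmean(channel) for channel in zip(*spectra)]

# Toy data (hypothetical): 4 x 4 craters x 5 accumulations, 3 channels per spectrum
GRID_ROWS, GRID_COLS, SHOTS_PER_CRATER = 4, 4, 5
n_spectra = GRID_ROWS * GRID_COLS * SHOTS_PER_CRATER
toy_spectra = [[100.0 + i, 200.0 - i, 50.0] for i in range(n_spectra)]
mean_spectrum = average_spectrum(toy_spectra)
```

With real data, each inner list would hold one intensity per spectral variable instead of three toy channels.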

Reference Method for Nutrient Elements' Content Determination
The reference method for determining the content of the nutrient elements K, Ca, Mg, Fe, zinc (Zn), and boron (B) in PN samples mainly relied on an inductively coupled plasma optical emission spectrometer (ICP-OES) [29]. Before the ICP-OES determination, the PN samples underwent microwave digestion and acid-removal pretreatment. Each pellet after LIBS acquisition was weighed and placed into a modified polytetrafluoroethylene vessel with 5 mL of 65% HNO3 and 1 mL of 30% H2O2 for microwave digestion at 185 °C. Then, the vessels with digested liquid were placed in a 165 °C furnace to drive off the acid until a drop of fuming digested liquid remained. The acid removal was carried out in a fume cupboard. The remaining digested liquid was diluted to a volume of 30 mL with high-purity water by the weighing method. The final dilution was used to determine elemental content by ICP-OES.

Data Preprocessing
To reduce systematic and random errors during the experiment, wavelet transform (WT) was used to preprocess the raw spectra. As an efficient denoising method, WT uses a set of wavelet basis functions to remove invalid information (noise) and preserve valid sharp peaks or spikes [30]. Daubechies 8 with a decomposition scale of 3 was selected in our paper. After WT preprocessing, the sample-partitioning method of [29], which could avoid bias in sample selection, was used to split the 104 PN samples into a calibration set and a prediction set. Seventy-two PN samples were selected for the calibration set, while the remaining 32 samples formed the prediction set. Univariate analysis and multivariate analysis were initially done using PN samples in the calibration set.

Multivariate Analysis Methods
Multivariate analysis employed PLS, LS-SVM, and Lasso to establish calibration models, and the model parameters were selected using cross-validation. Lasso was also used to select effective variables to improve multivariate analysis performance.
Partial least-squares regression (PLS) is an established tool for multivariate data analysis [31]. A PLS model expresses the relation between the LIBS spectral information X and the element content Y in PN samples. The spectral matrix X and the content matrix Y were decomposed, and the invalid information in the two matrices was handled simultaneously. Principal component information was calculated after matrix decomposition. When calculating the principal components, PLS favored components with larger variance to extract more useful information and also made the principal components (latent variables (LVs)) more relevant to the element concentration Y, maximizing the linear relationship between spectral variables and concentrations [32,33]. Five-fold cross-validation was used, and the number of LVs was selected at the first minimum or the knee point of the curve of root mean squared error of cross-validation (RMSECV) in the calibration set vs. the number of LVs.
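The "first minimum" rule for choosing the number of LVs can be sketched as follows. This is only an illustration under stated assumptions: the RMSECV values are hypothetical, the function name is ours, and the knee-point alternative mentioned above is not implemented.

```python
def first_minimum_index(rmsecv):
    """Return the 0-based index of the first local minimum of an RMSECV curve."""
    if not rmsecv:
        raise ValueError("RMSECV curve is empty")
    for i in range(len(rmsecv) - 1):
        if rmsecv[i] <= rmsecv[i + 1]:  # the error stops decreasing here
            return i
    return len(rmsecv) - 1  # curve decreases throughout: take the last point

# Hypothetical RMSECV values for models with 1, 2, ..., 7 latent variables
rmsecv_curve = [1.90, 1.42, 1.10, 0.95, 0.97, 0.93, 0.99]
n_lvs = first_minimum_index(rmsecv_curve) + 1  # convert index to number of LVs
```

Stopping at the first minimum rather than the global one (here the sixth value is lower) favors a smaller model, which is one common way to limit overfitting.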
Least-squares support vector machine (LS-SVM) is an improvement of the standard SVM, which is based on the structural risk minimization (SRM) approach proposed by Vapnik et al. [34]. LS-SVM uses a least-squares loss function and solves a set of linear equations in place of the complicated quadratic programming adopted by the standard SVM. LS-SVM reduces the computational complexity, speeds up the solution, and improves the generalization ability of the model [35]. Compared with PLS, LS-SVM can handle not only linear but also nonlinear regression problems. The radial basis function (RBF) kernel and five-fold cross-validation were utilized to establish the LS-SVM calibration model. The penalty parameter (c) and kernel function parameter (g) of LS-SVM were optimized by a grid-search procedure in the range of 10^3-10^10, and the best c and g were those giving the minimal RMSECV.
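To make the "set of linear equations" concrete, the sketch below fits an LS-SVM regressor in its usual dual form, where the bias b and dual weights α solve [[0, 1ᵀ], [1, K + I/c]] [b; α] = [0; y]. It is a minimal illustration, not the MATLAB implementation used in the paper. The kernel convention K(x, z) = exp(-g·||x - z||²), the function names, and the toy data are assumptions.

```python
import math

def rbf_kernel(x, z, g):
    """RBF kernel, assuming the convention exp(-g * squared distance)."""
    return math.exp(-g * sum((a - b) ** 2 for a, b in zip(x, z)))

def solve_linear_system(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def lssvm_fit(X, y, c, g):
    """Return (b, alpha) from the bordered LS-SVM linear system."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(X[i], X[j], g) for j in range(n)]
        row[i + 1] += 1.0 / c  # ridge term from the penalty parameter c
        A.append(row)
    solution = solve_linear_system(A, [0.0] + list(y))
    return solution[0], solution[1:]

def lssvm_predict(X_train, b, alpha, g, x_new):
    """Predict with f(x) = b + sum_i alpha_i * K(x, x_i)."""
    return b + sum(a * rbf_kernel(x_new, xi, g) for a, xi in zip(alpha, X_train))

# Toy nonlinear data (hypothetical); c and g would normally come from the grid search
X_train = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y_train = [0.0, 1.0, 4.0, 9.0, 16.0]
c, g = 1000.0, 0.5
b, alpha = lssvm_fit(X_train, y_train, c, g)
```

Because every training sample receives a dual weight, the system has one unknown per sample plus the bias, which is why a linear solve can replace the quadratic program of the standard SVM.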
The least absolute shrinkage and selection operator (Lasso) is a penalized shrinkage regression method [36]. Lasso is also a dimensionality reduction method for both linear and nonlinear cases. The L1 penalty was introduced for the case where the spectral matrix X was not of full column rank. Lasso selected variables from the spectral data according to the L1 penalty [27]. The L1 penalty shrank the original coefficients β, and some small coefficients were shrunk exactly to 0. The variables with β = 0 were regarded as non-significant and discarded directly. The value that minimized the penalized likelihood function was taken as the estimate of the regression coefficients [27,37]. Five-fold cross-validation was also applied to establish the Lasso calibration model and to determine the best model parameter, the boundary value t, which is analogous to the number of LVs for PLS and governs the number of non-zero β; it is expressed either as t in the optimization equation or as the number of steps in the step-wise procedure.
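The way the L1 penalty drives small coefficients exactly to zero can be seen in a small coordinate-descent sketch. Note the assumptions: it uses the Lagrangian form, minimizing (1/2n)·||y - Xβ||² + λ·||β||₁ with a penalty weight λ instead of the bound t described above (the two forms are equivalent for matching values). The columns are assumed to be centered so that no intercept is needed, and the data and names are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/2n)||y - X b||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # correlation of column j with the residual that excludes beta_j
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta

# Toy centered data (hypothetical): y depends only on the first column
X_toy = [[-2.5, 1.0, 1.0], [-1.5, -1.0, 1.0], [-0.5, 1.0, -2.0],
         [0.5, -1.0, -2.0], [1.5, 1.0, 1.0], [2.5, -1.0, 1.0]]
y_toy = [-5.0, -3.0, -1.0, 1.0, 3.0, 5.0]
beta = lasso_coordinate_descent(X_toy, y_toy, lam=0.5)
selected = [j for j, b_j in enumerate(beta) if b_j != 0.0]  # variables kept by Lasso
```

In this toy case only the first column keeps a non-zero weight, and its coefficient is shrunk below the true value of 2. This illustrates both the selection and the shrinkage effects of the penalty.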

Performance Evaluation
The correlation coefficient (R), RMSECV, and root mean squared error of the prediction set (RMSEP) were used to evaluate the performance of the quantitative models for nutritive element content detection [38]. R measures the correlation between the target element content obtained by ICP-OES and that detected by LIBS. Rc is the correlation coefficient of the calibration set, and Rp is the correlation coefficient of the prediction set. The closer R is to 1 and the smaller the root mean squared error, the better the performance and detectability of the LIBS variables and calibration models.
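As a minimal sketch of these metrics, the functions below compute a correlation coefficient and a root mean squared error from paired reference and predicted values. It is assumed here that R is the Pearson correlation and that the squared errors are averaged over the number of samples; the paper does not spell out either formula, and the toy values are hypothetical.

```python
import math

def pearson_r(reference, predicted):
    """Pearson correlation between reference and predicted contents."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    cov = sum((r - mean_ref) * (p - mean_pred) for r, p in zip(reference, predicted))
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov / math.sqrt(var_ref * var_pred)

def rmse(reference, predicted):
    """Root mean squared error; called RMSECV or RMSEP depending on the data set."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)

# Hypothetical contents in mg/g: reference (ICP-OES) vs. model prediction (LIBS)
icp_values = [1.0, 2.0, 3.0, 4.0]
libs_values = [1.1, 1.9, 3.2, 3.8]
r_value = pearson_r(icp_values, libs_values)
rmse_value = rmse(icp_values, libs_values)
```

Applied to the prediction set these give Rp and RMSEP. Applied to the held-out folds of the calibration set, the second function gives RMSECV.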

Software Tools
LIBS spectra acquisition was carried out with Andor SOLIS for Imaging (v4.26, Andor Technology, Belfast, U.K.). Data analysis was executed in MATLAB R2017a (The MathWorks, Inc., Natick, MA, USA). Origin Pro 2015 (OriginLab Corporation, Northampton, MA, USA) was used for graph design.

Nutritive Elements Content of Panax Notoginseng
The contents of the nutritive elements K, Ca, Mg, Fe, Zn, and B in PN from the eight producing areas, as detected by ICP-OES, are shown in Table 1. For PN samples from all regions, the contents of B and Zn were much smaller than that of Fe, which is also a microelement. Although K, Ca, and Mg are all nutritive elements, the K content was nearly ten times that of Ca and Mg in all PN samples. The K, Ca, Mg, Fe, and Zn contents of PN in Malipo (Group 3) were higher than in the other regions, and the B content was second only to that of PN in Xichou (Group 1), which indicated that the quality of PN in Malipo is the best. Compared with other regions, the contents of these elements in PN samples from Gejiu (Group 5) and Gengma (Group 6) were relatively small. The content of these nutritive elements could reflect the quality of PN goods from different producing areas. A method for rapid and simultaneous detection of element content would therefore be conducive to the regulation of the PN commodity market.

LIBS Spectra Analysis
The average spectrum of each area in the range of 230.77-883.24 nm is shown in Figure 1. The LIBS spectra of PN samples from the eight areas showed a similar tendency, which signified that PN from different producing areas had the same element species and similar matrix compositions. The CN molecular band and the H and O I atomic lines, which commonly appear in the LIBS spectra of organic samples, can be observed in Figure 1. The emission lines of K, Ca, Mg, and Fe in PN samples from the eight areas are also obvious in Figure 2 and differ in intensity. For example, PN in Gengma (Group 6) had the lowest Mg emission line and PN in Malipo (Group 3) had the highest, which is consistent with the results in Table 1. The intensity difference of element emission lines indicated that different habitats significantly changed the element content of PN and formed a specific proportion of elements. Based on the Kurucz database and the National Institute of Standards and Technology (NIST) Atomic Spectra Database (ASD), more than 90 emission lines whose intensity values were above 1000 a.u. were identified, and the elements are shown in Table 2. The nutritive element Ca had 35 emission lines, and Ca II 393.37 nm had the maximum intensity of all element emission lines in the PN samples. That the emission line for carbon (C) did not have the highest intensity may be due to the relatively long delay time, since the intensity of emission lines changed quickly over time [18]. The observed emission lines for oxygen and nitrogen may come from both the PN sample and air. Table 2 presents the capability of the LIBS technique to detect different elements in PN samples.

Univariate Analysis
Univariate analysis is a calibration curve method based on the assumption that the intensity of an emission line is proportional to the content of the target element in the samples [39]. In common LIBS quantitative analysis, sensitive emission lines of the target element without self-absorption and overlapping peaks are the preferred choice for univariate analysis [18]. According to the NIST database, the sensitive emission lines K I 766.49 nm, K I 769.90 nm, Ca II 393.37 nm, Ca II 396.85 nm, Ca I 422.67 nm, Mg I 517.27 nm, Mg I 518.36 nm, Fe I 373.71 nm, and Fe I 371.99 nm were selected to build univariate analysis models for Panax notoginseng. As shown in Figure 2, the emission lines of these nutritive elements are smooth without interference peaks. The peak intensity of K, Ca, Mg, and Fe in Groups 2, 5, and 6 was significantly lower than in other groups, which is consistent with the difference in element content in Table 1. Because of their low content, the sensitive emission lines of B and Zn were difficult to discriminate for univariate analysis.
The univariate calibration and prediction results for the selected emission lines of nutritive elements are shown in Table 3. For univariate analysis of K content, K I 769.90 nm performed better than K I 766.49 nm, with Rc of 0.8413 and RMSECV of 1.370 mg/g in calibration and Rp of 0.7836 and RMSEP of 1.610 mg/g in prediction. The results of Ca I 422.67 nm, Mg I 517.27 nm, and Fe I 371.99 nm were all better than those of the other emission lines of the same element. Overall, the univariate analysis model for Fe I 371.99 nm performed best, with Rc of 0.8944 and RMSECV of 0.082 mg/g in calibration and Rp of 0.8577 and RMSEP of 0.097 mg/g in prediction. However, the performance of univariate analysis in Table 3 is not sufficient to generate a robust and accurate predictive model for the content of K, Ca, Mg, and Fe in PN samples.
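A univariate calibration curve of this kind amounts to an ordinary least-squares line through intensity and content pairs. The sketch below is an illustration under assumptions: it regresses content on line intensity (the direction is our choice, not stated in the text), and the intensities and contents are made-up values.

```python
def fit_calibration_line(intensity, content):
    """Least-squares line content = slope * intensity + intercept."""
    n = len(intensity)
    mean_x = sum(intensity) / n
    mean_y = sum(content) / n
    sxx = sum((x - mean_x) ** 2 for x in intensity)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(intensity, content))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_content(slope, intercept, intensity):
    """Convert the intensity of one emission line to an element content."""
    return slope * intensity + intercept

# Hypothetical calibration data: line intensity (a.u.) vs. reference content (mg/g)
line_intensity = [1000.0, 2000.0, 3000.0, 4000.0]
reference_content = [5.0, 10.0, 15.0, 20.0]
slope, intercept = fit_calibration_line(line_intensity, reference_content)
estimated = predict_content(slope, intercept, 2500.0)
```

Only one spectral variable enters the model. Any change in line intensity caused by the sample matrix and not by the element content is therefore passed directly into the prediction.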

Multivariate Analysis
Multivariate analysis attempted to relate more variables of the LIBS spectra to the content of the nutrient elements K, Ca, Mg, Fe, Zn, and B obtained by ICP-OES. Full spectra or several selected variables combined with the chemometric methods PLS, LS-SVM, and Lasso were used to establish calibration models for the target nutritive element content.

Modeling Using Full Spectra
The results for multivariate analysis based on full spectra from 230.77 nm to 883.24 nm are shown in Table 4. For K and Ca, the multivariate analysis by Lasso and PLS achieved good performance with Rp higher than 0.9500. For Mg and Fe, the multivariate analysis based on full spectra by Lasso achieved the best performance with Rp values of 0.9207 and 0.9348 and RMSEP of 0.7740 mg/g and 0.0722 mg/g, respectively. For Zn and B, PLS models obtained the best performance with Rp values of 0.9460 and 0.9475 and RMSEP of 0.0016 mg/g and 0.0010 mg/g, respectively. For all the target elements in the PN samples, PLS and Lasso performed well in multivariate analysis based on LIBS full spectra. This may be because PLS and Lasso are better suited to multi-collinearity problems and make full use of the useful element information in the 22,036 variables when fitting the relationship between LIBS spectra and element content. LS-SVM performed well in the calibration sets for all tested elements but poorly in the prediction sets, which indicates that overfitting occurred. The large gap between the number of variables and the number of samples may have caused this.

Modeling Using Selected Variables
During the Lasso modeling process, irrelevant or insignificant coefficients of corresponding variables were penalized to zero and discarded directly [41]. Therefore, relevant or significant variables could be selected by comparing the regression coefficients (weights) of each variable. The weights plots of the Lasso models for the nutrient elements are shown in Figure 3. ... (Figure 3), which is due to the complex matrix components of PN and random noise. There were high weights for multiple lines such as N and O in the models for K, Ca, Mg, Zn, and B, which indicated that these elements contributed significantly to quantitative analysis of the target elements, and the interactions among all the elements in PN herbs could not be ignored. The target elements may combine with N and O to compose compounds such as aerobic compounds or amino acids. The multiple lines like N and O may also come from air ablation and reflect noise from environmental fluctuation. Compared with the spectrum of one PN sample (Figure 3d,h), we also found that background signals without emission lines contributed to the quantification of nutrient elements, such as the unmarked weight lines in the Lasso weights plot for Mg (Figure 3c). The background information with high weights may be related to the matrix effects of the PN samples.
The LIBS variables with non-zero weights in the Lasso models were selected for multivariate analysis by PLS, LS-SVM, and Lasso, and the results are shown in Table 5. The number of LIBS variables selected by Lasso for K, Ca, Mg, Fe, Zn, and B was 64, 73, 61, 66, 73, and 62, respectively. For K, Ca, Mg, Zn, and B content predictions, LS-SVM models based on the variables selected by Lasso achieved the best performance with Rp values of 0.9546, 0.9176, 0.9412, 0.9665, and 0.9569 and RMSEP of 0.7704 mg/g, 0.0712 mg/g, 0.1000 mg/g, 0.0012 mg/g, and 0.0008 mg/g, respectively. For Fe content predictions, PLS models achieved the best performance with an Rp value of 0.9169 and an RMSEP of 0.0724 mg/g. On the whole, LS-SVM models could effectively predict the content of elements, followed by PLS models. Lasso quantitative analysis based on the selected variables performed worse than the other methods based on the same variables and worse than Lasso quantitative analysis based on full spectra with 22,036 variables. This may be because a second round of L1 penalization over-screens the already selected variables and discards some valid ones.

Discussion
For better univariate analysis, emission lines without self-absorption and overlapping peaks should be chosen (Figure 2). The spectral emission line intensity is related to the element content. For elements with lower content, the emission line is often too weak, which makes it difficult to distinguish the sensitive emission line and may cause misjudgment. Therefore, content prediction of Zn and B, which are trace elements with relatively low amounts, was performed with multivariate analysis. However, univariate analysis for the other nutritive elements also obtained poor results. The matrix effect in complex PN samples is largely responsible for these poor results.
Matrix effects come from the complex plant tissue, including differences in chemical composition and in physical properties of the tissue such as hardness, roughness, porosity, and density [28]. Matrix effects also relate to optical and plasma properties that influence the ratio of a given emission line to the abundance of the element producing that line [27]. Table 2 shows more than 90 emission lines with intensity values above 1000 a.u. The LIBS spectra contain not only sensitive emission lines of the target elements, but also a large amount of information on other elements, which carries matrix effect information. Figure 3 also indicates that multiple emission lines of other correlated elements and background variables may contribute to the prediction of target elements. Univariate analysis considers only the intensity of the sensitive emission line and loses the information related to matrix effects. Therefore, univariate analysis is not an appropriate way to determine the nutrient element content of PN herbs.
Unlike univariate calibration, multivariate analysis performed well in Tables 4 and 5 and was able to exploit more variables, including not only the useful information from sensitive emission lines of the target element, but also the complex information from matrix effects, continuous background, and shot-to-shot fluctuation of the laser [17].
The effective variables are the critical point in choosing between univariate and multivariate analysis. Full spectra had 22,036 variables, including effective variables and a vast number of ineffective variables, which inevitably resulted in model complexity and instability. After Lasso selection, the effective LIBS variables were reduced from 22,036 to 64, 73, 61, 66, 73, and 62 for K, Ca, Mg, Fe, Zn, and B, respectively. The variable screening process could significantly reduce the number of variables in LIBS spectra for model input and mitigate noise or irrelevant information from background interference and matrix effects. The results of LS-SVM models based on the selected variables clearly demonstrated the merit of variable screening by Lasso. After the variable screening, the Rp and RMSEP of the LS-SVM models were greatly improved. Comparing Tables 4 and 5, LS-SVM models based on the variables selected by Lasso weights obtained the best prediction performance for K, Ca, Mg, Zn, and B (Rp = 0.9546, 0.9176, 0.9412, 0.9665, and 0.9569, respectively). For Fe content prediction, Lasso based on full spectra gave the best result with Rp = 0.9348. The fitting plots for the best results for K, Ca, Mg, Fe, Zn, and B are shown in Figure 4. PLS yielded comparable accuracy with both the full spectra and the variables selected by Lasso weights. Therefore, PLS models could simultaneously account for changes in all these variables under changing environments and infinitely-variable sample chemistries.
For K, Ca, Mg, Fe, Zn, and B content detection of PN herbs, PLS may be better suited for this purpose. Lasso may be better for Mg and Fe content detection of PN herbs without further variable screening, because the L1 penalty selected variables from the full spectra and the weights showed the key emission lines used to quantify each element. The performance of LS-SVM models was limited by overfitting when confronted with full spectra. This may be a limitation of "large" numbers of LIBS variables and "small" sample sizes. Except for Fe content analysis, the overfitting of LS-SVM models was alleviated after the number of variables was reduced by approximately 99.67% or more using Lasso weights, which showed the merit of the LS-SVM method in dealing with the nonlinear relationships in the matrix effects and background interference. The LS-SVM models based on variables selected by Lasso are the most suitable methods for prediction of K, Ca, Mg, Zn, and B content in PN herbs.
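The size of the variable reduction can be checked directly from the counts reported in the text (22,036 full-spectrum variables and the per-element numbers of Lasso-selected variables). The short calculation below is only an arithmetic check, and the variable names are ours.

```python
TOTAL_VARIABLES = 22036  # variables in the full LIBS spectrum

# Number of variables with non-zero Lasso weights, as reported for each element
selected_counts = {"K": 64, "Ca": 73, "Mg": 61, "Fe": 66, "Zn": 73, "B": 62}

# Percentage of variables removed by the Lasso screening for each element
reduction_pct = {
    element: 100.0 * (1.0 - count / TOTAL_VARIABLES)
    for element, count in selected_counts.items()
}
smallest_reduction = min(reduction_pct.values())  # Ca and Zn keep the most variables
largest_reduction = max(reduction_pct.values())   # Mg keeps the fewest variables
```

Rounded to two decimals, the smallest reduction (Ca and Zn, 73 variables kept) is 99.67%, and the other elements are slightly higher.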
The quantitative analysis demonstrated the accuracy of LIBS combined with chemometrics, and our method also showed the ability to rapidly detect the nutritive element content of PN herbs. The ICP-OES procedure needs more than 150 min of pretreatment, including weighing, adding reagents, digesting, discharging acid, and diluting. In contrast, the LIBS procedure needs less than 360 s, because grinding and tableting a PN sample takes less than 250 s and LIBS acquisition from the pellet takes about 60 s.

Conclusions
In this experiment, we demonstrated the rapid and accurate analysis of K, Ca, Mg, Fe, Zn, and B content using our LIBS system based on 104 PN samples from eight origins. More than 90 emission lines whose intensity values were above 1000 a.u. were identified for PN samples. Univariate analysis based on sensitive emission lines could not meet the accuracy requirements for detection of nutritive elements. For multivariate analysis, LS-SVM models based on the variables selected by Lasso weights for K, Ca, Mg, Zn, and B content detection obtained the best prediction performance with Rp values of 0.9546, 0.9176, 0.9412, 0.9665, and 0.9569 and RMSEP of 0.7704 mg/g, 0.0712 mg/g, 0.1000 mg/g, 0.0012 mg/g, and 0.0008 mg/g, respectively. The Lasso model based on full spectra obtained the best result for Fe content detection, with an Rp value of 0.9348 and an RMSEP of 0.0726 mg/g. PLS models performed well with both the full spectra and the variables selected by Lasso weights. The Lasso weight plots provided direct information about the contribution of effective variables to the quantitative analysis and can help researchers better understand the PN matrix.
LIBS technology combined with appropriate chemometric methods provided a fast, simple, and precise method for effective quantitative detection of nutrient elements in PN. However, further work with more pharmaceutical analysis and other chemometric methods is still needed to build on our study. Fewer and more effective variables should be identified for the development of portable detection devices, ultimately to provide a fast and accurate technique for laboratory analysis and online monitoring of Chinese herbal medicine quality and pharmaceutical analysis.