Quantitative Analysis of Pig Iron from Steel Industry by Handheld Laser-Induced Breakdown Spectroscopy and Partial Least Square (PLS) Algorithm

One of the main objectives in the steel production process is to obtain a blast furnace pig iron of good quality and at the lowest possible cost. In general, the quality of pig iron is evaluated on the basis of its chemical composition determined by X-ray fluorescence laboratory equipment. In the present study, the performance of a handheld (h) laser-induced breakdown spectroscopy (LIBS) instrument in the identification and the quantification of the relevant elements C, Mn, P, Si, and Ti in forty-six blast furnace pig iron samples was tested successfully. The application of two different models, i.e., univariate and multivariate partial least square (PLS) calibration and validation, to the whole LIBS data set showed that the latter approach was much more efficient than the former one in quantifying all elements considered, especially Si and Ti.


Introduction
The main metallic materials and intermediate products involved in the iron and steel industry include pig iron, cast iron, carbon steel, stainless steel, and tool steel, all of which require continuous, accurate, and precise analyses to keep the entire production process under control. In particular, pig iron is a carbon-rich intermediate product obtained by burning a mixture of iron ore, coke, and limestone in a blast furnace, and it is then refined into steel in an oxygen furnace [1,2]. The main goal of the production process is to obtain pig iron of good quality at the lowest possible cost. Many factors affect the quality of pig iron, among which are the quality of ferrous burden materials and coke as well as the fuels and the methods used in the blast furnace process [3]. In particular, the contents of the elements Si, Mn, P, and S, which depend on their abundance in ferrous burden materials [3], and of Ti, which critically affects its precipitation with saturated carbon [4], are of great importance for the quality of pig iron. According to relevant customer requirements and standards, the chemical composition of pig iron should include, besides the basic element, Fe, and a high C content, a number of other alloying elements, namely Si at 0.50-0.80% wt., Mn at >0.20% wt., P at <0.11% wt., and S at <0.030% wt. for the production of rail steel and 0.14% wt. for other kinds of steel [3,5]. However, every steel plant around the world requires a different elemental composition. For example, the best performance of the blast furnace operating at the ArcelorMittal Italia S.P.A. steel plant located in Taranto, Italy, where the samples examined in this work were collected, is achieved at an optimal Si content in pig iron of 0.60% wt.

Handheld LIBS Instrumentation and Analytical Procedure
The B&W Tek (Newark, DE, USA) NanoLIBS hLIBS instrument has been described in detail by Senesi et al. [20]. Briefly, the instrument consisted of a miniature diode-pumped, solid-state, short-pulsed laser emitting at the wavelength of 1064 nm and operating at a maximum pulse energy of 150 µJ with a pulse duration of 500 ps and a high repetition rate between 1 and 5 kHz. The system was equipped with a low-resolution compact spectrometer operating in the non-gated mode and covering the range from 180 to 800 nm with an overall resolution of 0.4 nm. The whole setup was enclosed in a lightweight handheld body with mass and dimensions of about 1.8 kg and 26 × 10 × 30 cm, respectively. The instrument was provided with a rechargeable Li-ion battery that allowed up to 8 h autonomy for field operations. The system was equipped with a rastering scan that provided spatial averaging. The measurements were performed by placing the instrument against the sample surface, and the analysis was started via a trigger. A sensor installed in the instrument head automatically controlled the laser output so that the laser could operate only when the head was in contact with the sample. Ten different positions were analysed on the sample, and a total of 150 spectra corresponding to 1800 laser shots were acquired and averaged to obtain a single spectrum for each position (Figure 1). Then, the spectra were analysed qualitatively to identify the emission lines using the National Institute of Standards and Technology (NIST) database [21].

Pig Iron Samples
Forty-six pig iron samples were collected from the blast furnaces operating at the industrial and manufacturing steel plant of ArcelorMittal Italia S.P.A. located in Taranto, Italy. Each sample was analysed in the form of a circular disc of 35 mm diameter and 7 mm thickness with one face flat and the other rough and presenting small shrinkages. The matrix element in all samples was Fe at a content of over 90%.

XRF Analysis
The reference concentrations of C, Si, Mn, P, and Ti in the pig iron samples were determined by a MagiX FAST simultaneous XRF spectrometer (PANalytical B.V., Almelo, The Netherlands) equipped with a 300 µm brass filter, a 1 mm Pb beam-stop, and a 4 kW Rh Super Sharp end-window tube set at 60 kV and 66 mA to obtain the optimal measurement of heavier elements.

Data Pre-Treatment
In order to reduce the noise from experimental fluctuations and matrix effects, a pre-treatment was carried out on each acquired spectrum. The first step consisted in applying the standard normal variate (SNV) procedure [22], i.e., an algebraic manipulation of each spectrum that sets its mean to zero (so that the transformed intensities sum to zero) and its standard deviation to one (Equation (1)):
$$ y_i' = \frac{y_i - \bar{y}}{\mathrm{std}} \tag{1} $$
where $y_i'$ is the transformed intensity at wavelength $i$, $y_i$ is the original one, and the constants $\bar{y}$ and std are, respectively, the mean and the standard deviation of the spectrum. In the second step, a Savitzky-Golay (SG) filter was applied to smooth the data, reduce the noise, and enhance the performance of the PLS model [23]. In particular, the polynomial SG smoothing method was applied, which consists in selecting a window of N points in the spectrum and fitting to it a polynomial function of order P. The fitted values were used as the data points of a new, smoothed spectrum. This process was repeated by sweeping the window over the entire spectrum, and the values of N and P were optimized to obtain the smallest validation error.
In the third step, the spectrum was cut into the selected ranges covering the emission lines of the elements C, Mn, P, Si, and Ti of interest. The emission lines were chosen based on their intensity in the spectrum and on the absence or minimization of interferences due to other transitions. The whole process was performed manually using the averaged spectra and the NIST database reference [21].
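The three pre-treatment steps can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function names, the edge-padding strategy, and the example window are assumptions made here for illustration, and the default N = 7 and P = 3 are simply the SG settings reported as best in this study.

```python
import numpy as np

def snv(spectrum):
    """Standard normal variate: zero mean and unit standard deviation."""
    spectrum = np.asarray(spectrum, dtype=float)
    return (spectrum - spectrum.mean()) / spectrum.std()

def savgol_smooth(spectrum, n_points=7, order=3):
    """Savitzky-Golay smoothing by a least-squares polynomial fit in a sliding window."""
    if n_points % 2 == 0 or order >= n_points:
        raise ValueError("n_points must be odd and larger than order")
    half = n_points // 2
    # Design matrix of the polynomial on window offsets -half..half.
    offsets = np.arange(-half, half + 1, dtype=float)
    design = np.vander(offsets, order + 1, increasing=True)
    # First row of the pseudo-inverse gives the fitted value at the window centre.
    coeffs = np.linalg.pinv(design)[0]
    # Assumption: repeat the edge values so the output keeps the input length.
    padded = np.pad(np.asarray(spectrum, dtype=float), half, mode="edge")
    return np.convolve(padded, coeffs[::-1], mode="valid")

def crop(wavelengths, spectrum, windows):
    """Keep only the points falling inside the (low, high) wavelength windows."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    mask = np.zeros(wavelengths.shape, dtype=bool)
    for low, high in windows:
        mask |= (wavelengths >= low) & (wavelengths <= high)
    return wavelengths[mask], np.asarray(spectrum)[mask]

# Synthetic example on a 180-800 nm axis with 0.4 nm steps.
wl = np.linspace(180.0, 800.0, 1551)
raw = np.random.default_rng(0).random(wl.size) + 5.0
treated = savgol_smooth(snv(raw))
# Hypothetical window around the Si I line at 288.158 nm.
wl_si, spec_si = crop(wl, treated, [(287.0, 289.0)])
```

In practice N and P would be scanned over a small grid and the pair giving the smallest validation error retained, as described above.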

Statistical Analysis
To assess the quality of calibration curves, which directly compare the intensity obtained from the spectrum with the reference elemental concentration of the sample measured by XRF, the Pearson's R coefficient (Equation (2)) and the root mean square error, RMSE (Equation (3)), were used:
$$ R = \frac{1}{N-1}\sum_{i=1}^{N}\left(\frac{x_i-\bar{x}}{\sigma_x}\right)\left(\frac{y_i-\bar{y}}{\sigma_y}\right) \tag{2} $$
where $x_i$ and $y_i$ are the wavelength and the LIBS intensity values, respectively, $\bar{x}$, $\bar{y}$ and $\sigma_x$, $\sigma_y$ are the means and standard deviations of the distributions, and N is the number of points; and
$$ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i^{ref}-y_i^{pred}\right)^2} \tag{3} $$
where $y_i^{ref}$ and $y_i^{pred}$ are, respectively, the reference and the predicted concentration values for the model, and N is the number of points. The R value measures the degree of correlation between two variables and ranges from −1 (perfect negative correlation) to +1 (perfect positive correlation). The RMSE measures the difference between predicted and measured values. Thus, the values of R and RMSE provide complementary information for model evaluation.
Thus, the analytical performance of each calibration method used for the quantification of a given element can be evaluated by the R value of the best fitting line between predicted and nominal values (R = 1 in the case of perfect agreement), the intercept value (ideally equal to zero), and the RMSE of the prediction (equal to zero in case of perfect agreement).
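The three figures of merit named above (R, intercept of the best-fit line, and RMSE) can be computed with a few lines of standard-library Python. This is a sketch under the assumption that the sample standard deviation (N − 1 denominator) is used in Pearson's R; the function names and the example concentrations are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson's R using sample standard deviations (N - 1 denominator)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    std_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / (n - 1))
    std_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / (n - 1))
    total = sum((a - mean_x) / std_x * (b - mean_y) / std_y for a, b in zip(x, y))
    return total / (n - 1)

def rmse(y_ref, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_ref)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(y_ref, y_pred)) / n)

def fit_line(x, y):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up reference (XRF) and predicted (LIBS) concentrations in % wt.
reference = [0.50, 0.60, 0.70, 0.80]
predicted = [0.52, 0.58, 0.71, 0.79]
r_value = pearson_r(reference, predicted)
slope, intercept = fit_line(reference, predicted)
error = rmse(reference, predicted)
```

A perfect model would give `r_value` equal to 1, `intercept` equal to 0, and `error` equal to 0.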
Further, to evaluate the validation curve, the MAPE value was calculated (Equation (4)), which measures the average percentage error of the value predicted by the model compared to the reference value:
$$ \mathrm{MAPE} = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{y_i^{ref}-y_i^{pred}}{y_i^{ref}}\right| \tag{4} $$
where N is the total number of samples, and $y_i^{ref}$ and $y_i^{pred}$ are, respectively, the reference and the predicted concentration values used in the model. The MAPE value provides the quantification error expected for tested samples.
Appl. Sci. 2020, 10, 8461
The LOD values were calculated using the calibration curve according to Equation (5), in which the LOD is proportional to the ratio σ/α, where σ is the error of the linear coefficient, and α represents the slope of the calibration curve. The LOD represents the minimum concentration value of the analyte that can be measured by the technique used, i.e., it reflects the sensitivity of the technique, as concentrations below the LOD cannot be detected.
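The validation error and the detection limit described above can be sketched as follows. The MAPE is written in its usual percentage form; for the LOD, the numerical factor multiplying σ/α is not stated in the text here, so the default `factor=3.0` below is an assumption (a common convention), not the value used in the study. Function names and numbers are illustrative only.

```python
def mape(y_ref, y_pred):
    """Mean absolute percentage error of predictions relative to references."""
    n = len(y_ref)
    return 100.0 / n * sum(abs(r - p) / abs(r) for r, p in zip(y_ref, y_pred))

def lod(sigma, slope, factor=3.0):
    """Limit of detection as factor * sigma / slope.

    sigma  : error of the linear coefficient of the calibration curve
    slope  : slope (alpha) of the calibration curve
    factor : ASSUMED multiplier; replace with the value required by the protocol
    """
    return factor * sigma / abs(slope)

# Made-up example values.
validation_error = mape([1.0, 2.0], [1.1, 1.8])   # percent
detection_limit = lod(sigma=0.02, slope=4.0)      # same unit as the concentration axis
```

Because the LOD scales linearly with the chosen factor, results obtained with different conventions can be converted by a simple ratio.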
In order to avoid possible overfitting, which occurs when a model fits the training data very well but exhibits a low predictive capacity, the models were validated by the leave-one-out cross-validation (LOO-CV) method, by which one sample is taken out from the training set and used later for testing. In this way, a concentration can be predicted for the external sample and compared with the reference value. The process was repeated by leaving each sample out of the training in turn so that all samples could be validated by the model trained with the remaining N-1 samples. Although this validation process is computationally more demanding, it can achieve a more reliable result than that obtained by the k-fold validation method, as the former is independent of the random selection of the training set. As the test sample does not participate in the training set, it works as an external measure. The only relationship between the test sample and the training data set was that the measurements were carried out in a similar way (same day and equipment). Although the ideal procedure would be to validate the protocol using in-situ measurements performed on different days, this is strictly required only in the final stages of implementation of the equipment for industrial applications. To demonstrate the potential of applications of spectroscopic techniques, however, the cross-validation approach is widely used and is considered an excellent evaluation parameter.
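The leave-one-out loop described above can be sketched generically: each sample is removed in turn, a model is trained on the remaining N-1 samples, and the held-out sample is predicted. For brevity the example plugs in a univariate straight-line model rather than PLS; the function names are hypothetical and any `fit`/`predict` pair with the same shape could be substituted.

```python
def fit_linear(xs, ys):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_linear(model, x):
    slope, intercept = model
    return slope * x + intercept

def leave_one_out(samples, targets, fit, predict):
    """Predict every sample with a model trained on all the other samples."""
    predictions = []
    for i in range(len(samples)):
        # Training set without sample i.
        train_x = samples[:i] + samples[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        model = fit(train_x, train_y)
        # Sample i acts as an external test sample.
        predictions.append(predict(model, samples[i]))
    return predictions

# Made-up line areas (arbitrary units) and reference concentrations (% wt.).
areas = [1.0, 2.0, 3.0, 4.0, 5.0]
concentrations = [0.31, 0.52, 0.69, 0.91, 1.10]
loo_predictions = leave_one_out(areas, concentrations, fit_linear, predict_linear)
```

The list `loo_predictions` can then be compared with `concentrations` to obtain the validation error of the model.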

Univariate and Multivariate Analyses
The first stage of analysis was performed by calculating the area of specific emission lines for each element of interest in the raw spectra. The area below each line was fitted to a Gaussian function by subtracting the background [24], whereas the area of interfered emission lines was calculated by fitting multiple Gaussian functions [25]. A curve was constructed for each emission line, and the R coefficient and the RMSE values were calculated for the calibration curve and MAPE values for the validation curve. Only the emission lines that showed the best R coefficients in the calibration are discussed in Section 3. For calibration curves with very low R (<0.60), the validation process was not performed, as the associated error was very high.
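As an illustration of how a background-subtracted Gaussian line area can be obtained, the sketch below uses a swapped-in technique: a straight-line background estimated from the window edges, followed by a closed-form Gaussian fit through a parabola fitted to the logarithm of the net intensity. This is not necessarily the fitting procedure of refs. [24,25], it handles a single isolated line only, and the function name and the parameters `n_edge` and `min_fraction` are assumptions.

```python
import numpy as np

def gaussian_line_area(wavelengths, intensities, n_edge=3, min_fraction=0.1):
    """Return (area, centre, sigma) of one emission line after background removal."""
    x = np.asarray(wavelengths, dtype=float)
    y = np.asarray(intensities, dtype=float)
    # Straight-line background through the mean of the first and last n_edge points.
    x0, y0 = x[:n_edge].mean(), y[:n_edge].mean()
    x1, y1 = x[-n_edge:].mean(), y[-n_edge:].mean()
    net = y - (y0 + (y1 - y0) * (x - x0) / (x1 - x0))
    # Use only points well above the noise floor so the logarithm is defined.
    keep = net > min_fraction * net.max()
    shift = x[keep].mean()
    # ln(net) of a Gaussian is a parabola: c2*x^2 + c1*x + c0.
    c2, c1, c0 = np.polyfit(x[keep] - shift, np.log(net[keep]), 2)
    if c2 >= 0:
        raise ValueError("window does not contain a peak-shaped line")
    sigma = np.sqrt(-1.0 / (2.0 * c2))
    centre = shift - c1 / (2.0 * c2)
    amplitude = np.exp(c0 - c1 ** 2 / (4.0 * c2))
    # Analytical area of a Gaussian: A * sigma * sqrt(2*pi).
    return amplitude * sigma * np.sqrt(2.0 * np.pi), centre, sigma

# Synthetic line near Si I 288.158 nm on a sloping background.
wl = np.linspace(287.0, 289.3, 61)
line = 1000.0 * np.exp(-((wl - 288.158) ** 2) / (2 * 0.15 ** 2))
area, centre, sigma = gaussian_line_area(wl, line + 50.0 + 10.0 * (wl - 287.0))
```

For interfered lines, a sum of several Gaussian functions would have to be fitted instead, which requires an iterative least-squares routine rather than this closed form.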
The multivariate analysis of data was performed on the spectra after SNV normalization and SG filtering by the PLS method [26,27], which takes into account any possible interdependence of the emission lines of each element of interest with both the background, thus allowing one to reduce possible matrix effects [28], and the emission lines of other elements [29]. In this study, the system was optimized to operate only within the spectral windows around the emission lines of the elements of interest, which also included variables related to the background and, occasionally, emissions of other elements close to the transition. SG parameters were optimized to reduce the validation error. To evaluate the quality of analysis, the coefficient R and RMSE values of the calibration curves and MAPE values of the validation curves were calculated.
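To make the PLS step concrete, the following is a minimal NumPy sketch of single-response PLS regression (PLS1, NIPALS-style deflation). It is an illustration under stated assumptions, not the software used in the study: the function names are hypothetical, only mean-centring is applied, and the synthetic data merely stand in for pre-treated spectra and reference concentrations.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS1; returns (x_mean, y_mean, coef) for later prediction."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                      # weight: covariance of variables with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                   # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                         # scores
        tt = t @ t
        p = Xk.T @ t / tt                  # X loadings
        qa = (yk @ t) / tt                 # y loading
        Xk = Xk - np.outer(t, p)           # deflate X
        yk = yk - qa * t                   # deflate y
        W.append(w); P.append(p); q.append(qa)
    if not W:
        coef = np.zeros(X.shape[1])
    else:
        W, P, q = np.array(W).T, np.array(P).T, np.array(q)
        coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector
    return x_mean, y_mean, coef

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return y_mean + (np.asarray(X, dtype=float) - x_mean) @ coef

# Synthetic data: 25 "spectra" of 40 variables driven by two latent factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(25, 2))
X = latent @ rng.normal(size=(2, 40)) + 0.01 * rng.normal(size=(25, 40))
y = latent @ np.array([0.5, -0.3]) + 0.6
model = pls1_fit(X, y, n_components=2)
y_cal = pls1_predict(model, X)
```

The number of components would normally be chosen by repeating the fit inside a cross-validation loop and keeping the value with the smallest validation error.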
As the emission line(s) of each element of interest selected after data pre-treatment procedure presented a slight deviation (in the order of 0.1 nm), probably due to the limited spectral resolution, from those in the NIST database, in multivariate analysis, the point of highest intensity of the line was selected as the centre of the transition. The selected spectral width of the emission line was optimized also to minimize the validation error. Thus, the model included part of the background and possible neighbouring transitions and interferences, which is a procedure desirable in multivariate analysis, as the inclusion of extra variables can help the model to circumvent laser fluctuation problems, matrix effects, and interferences [29]. A workflow summarizing the analytical steps performed is shown in Figure 2.
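The window selection described above, i.e., centring on the point of highest intensity near the tabulated wavelength and keeping a band around it, might be sketched as follows. The search tolerance `search_nm` and the half-width `half_width_nm` are hypothetical parameters introduced here; in the study the width was tuned to minimize the validation error.

```python
import numpy as np

def line_window(wavelengths, spectrum, nist_nm, search_nm=0.5, half_width_nm=1.0):
    """Return (centre, window_wavelengths, window_intensities) for one line."""
    wl = np.asarray(wavelengths, dtype=float)
    spec = np.asarray(spectrum, dtype=float)
    # Candidate points close to the tabulated line position.
    near = np.flatnonzero(np.abs(wl - nist_nm) <= search_nm)
    if near.size == 0:
        raise ValueError("no spectral points near the requested line")
    # The most intense candidate is taken as the centre of the transition.
    centre = wl[near[np.argmax(spec[near])]]
    # Keep the line plus part of the neighbouring background.
    mask = np.abs(wl - centre) <= half_width_nm
    return centre, wl[mask], spec[mask]

# Synthetic spectrum whose peak is slightly shifted from the tabulated value.
wl = np.round(np.arange(280.0, 296.0, 0.1), 1)
spec = np.exp(-((wl - 288.3) ** 2) / (2 * 0.2 ** 2))
centre, wl_win, spec_win = line_window(wl, spec, nist_nm=288.158)
```

The windows obtained for all selected lines can then be concatenated into the variable matrix passed to the multivariate model.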

Results and Discussion
The surface morphology of the craters formed after two consecutive laser shots on a typical pig iron sample (Figure 3a) was investigated by optical microscopy. Images obtained (Figure 3b) indicated that the energy density used was sufficient to provide sample ablation and crater formation only after two laser shots. Further, the laser-impact areas in Figure 3b showed a central intensely ablated region and a rim around it where some droplets and cracks were present. Some black spots could be also observed with pinholes appearing on the surface. Further, the surface colour changed to black, and the surface reflectivity decreased significantly. The crater diameters were estimated to be about 300 µm (Figure 3b).
The full broadband LIBS emission spectra (one example is shown in Figure 4), acquired directly within a few seconds by a point-and-shoot operation without any sample preparation, were used to identify the elements present in the pig iron samples. Although all spectra featured an intense background, especially in the wavelength range from 200 to 300 nm where multiple lines of the major matrix element Fe appeared, the main elemental components of the samples, i.e., C, Mn, P, Si, and Ti, could be revealed. The emission lines chosen for further analysis as a result of data pre-treatment procedure were: Si I (251.432 nm, 263.128

Univariate Analysis
The univariate calibration curves obtained using the emission areas of the elements C, Mn, and P featured poor linearity, with correlation coefficients that were too low (R < 0.90), which affected the accuracy and the precision of the prediction and the robustness of the model. In contrast, the univariate calibration and cross-validation curves constructed using the areas of the emission lines of Si I at 288.158 nm and Ti II at 334.903 nm and 334.940 nm, which were those that presented the best results in the calibration, versus the reference XRF concentrations were more satisfactory (Figure 5). In the case of Ti, the two emission lines considered were overlapping, and a single area including both lines was calculated. An R coefficient of 0.95 for Si and 0.86 for Ti and an RMSE of 237.05 for Si and 126.10 for Ti were achieved for calibration, with a MAPE of 18.04% for Si and 19.40% for Ti for validation. Although these values could be considered satisfactory, they would have limited application in real cases, i.e., univariate calibration curves could be used only for a rapid estimation of some elements in the samples but not for an accurate estimation of unknown samples.
Furthermore, building univariate calibration curves by plotting LIBS emission line areas vs. percentage of weight concentration might introduce matrix effect errors because the same numerical concentration in different matrices might correspond to a different weight concentration depending on the stoichiometry of the sample. Thus, the univariate model yielded an unsatisfactory outcome and was therefore discarded, and the PLS multivariate calibration approach was attempted to improve the analysis of LIBS spectra in both the broad spectral range and the selected emission lines, so as to possibly allow the technique to be used in industrial applications. In multivariate analysis, the choice of weight or number concentration for the reference samples does not make a difference, because all emission lines are analysed simultaneously.

Multivariate Analysis
Before applying the PLS model, the LIBS spectra were normalized using the SNV method and then smoothed with an SG filter to reduce noise. In multivariate analysis, all the elemental emission lines selected were used simultaneously together with a portion of the lateral background, and the number of components used in the model was optimized to achieve the smallest error. Figure 6 shows raw LIBS spectra for all the samples compared with detailed SNV and SNV + SG spectra in a specific spectral region (310-365 nm) and illustrates how the filter directly affects the transitions and the background by attenuating fluctuations. As a verification criterion to estimate the noise reduction achieved by the applied filters, three spectral configurations of the Fe II transition at 266.663 nm were considered, i.e., raw without any treatment, SNV normalized, and SNV-SG corrected (Figure 7). This transition was chosen because it was apparently free from interferences, and the concentration of Fe in the samples was very high and practically constant. To ensure an identical analytical condition for the three configurations, a baseline correction and normalization was performed to guarantee that the set of spectra remained within a minimum of zero and a maximum of one in intensity. This process did not affect the fluctuation among spectra (Figure 7), as it was applied to the set of spectra as a whole and not individually to each spectrum. The pixel intensities were then summed, which resulted in a transition intensity/area value (Riemann sum) for each spectrum. Then, the mean of these intensities and the standard deviation were calculated. The standard deviation expressed as a percentage of the mean value was 31.5% for the raw spectra set, 3.7% for the SNV set, and 1.9% for the SNV-SG set. These results show that the process as a whole was able to reduce global spectral fluctuations and was probably a contributing factor in improving the multivariate model.
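The noise check described above (global scaling of the spectra set to the range 0-1, a sum of pixel intensities per spectrum, and the standard deviation expressed as a percentage of the mean) can be sketched as below. The function name is hypothetical and the use of the sample standard deviation (`ddof=1`) is an assumption, as the text does not specify it.

```python
import numpy as np

def line_rsd_percent(spectra):
    """Relative standard deviation (%) of the summed line intensity over a set.

    spectra: array of shape (n_spectra, n_pixels) restricted to one transition.
    """
    spectra = np.asarray(spectra, dtype=float)
    # Scale the WHOLE set to [0, 1] so relative fluctuations between spectra are kept.
    low, high = spectra.min(), spectra.max()
    scaled = (spectra - low) / (high - low)
    # Riemann-type sum of pixel intensities for each spectrum.
    sums = scaled.sum(axis=1)
    # Assumption: sample standard deviation (ddof=1).
    return 100.0 * sums.std(ddof=1) / sums.mean()

# Two made-up spectra of the same line differing by a factor of two in intensity.
example_rsd = line_rsd_percent([[0.0, 1.0, 2.0], [0.0, 2.0, 4.0]])
```

Applying the same function to the raw, SNV, and SNV-SG versions of one transition gives directly comparable percentages, since each set is scaled as a whole.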
The calibration and the validation curves of the elements C, Mn, P, Si, and Ti constructed by applying PLS to the areas of the corresponding elemental lines versus the certified concentrations are shown in Figure 8. The R and the RMSE values achieved for the calibration curves and MAPE and LOD for the validation curves using univariate and multivariate models (Table 1) indicated a significant improvement achieved by PLS and PLS + SG data treatment. In particular, the R (above 0.93) and the RMSE (below 0.07, with the exception of C) values of the calibration curves constructed using the PLS approach were very satisfactory for all elements, and especially for Si and Ti. Additionally, the MAPE percentage values achieved in the validation by PLS (below 15% for all elements except Mn) were also satisfactory.
Table 1 notes: The best SG results were obtained with seven points and a third-order polynomial. R: Pearson's correlation coefficient; RMSE: root mean square error; LOD: limit of detection; MAPE: mean absolute prediction error.
Although the hLIBS approach might present intrinsic laser fluctuations leading to an unpredictable interaction between radiation and matter, overfitting could be avoided by performing an efficient cross-validation procedure. Thus, the error due to equipment and laser-matter interaction could be minimized, but the error associated with the reference technique could not.

Conclusions
The primary qualitative factor of pig iron, which specifies its class and features required by the steel plant, is its chemical composition. In particular, one of the most important indicators of the technical level and the quality of steel production is the Si content achieved in the intermediate pig iron produced in the blast furnace. This study demonstrates that broadband LIBS analysis performed by an hLIBS instrument associated with PLS treatment of spectral data is able to identify and quantify different elements, specifically C, Mn, P, Si, and Ti, in pig iron samples. Thus, the capacity of an hLIBS instrument to perform spatially resolved, multi-element, in-situ, online analysis of element concentrations in open air without any sample pre-treatment provides an obvious advantage with respect to the use of a traditional laboratory bench-top apparatus.
Although hLIBS cannot yet be considered totally satisfactory in terms of accuracy, the time reduction from sampling to result delivery and the limited sample preparation no doubt contribute to a more rapid and efficient control of the steel-making process. The use of the hLIBS instrument is thus expected to streamline operations in the blast furnace, provide real-time information on sample quality, and allow a rapid decision on which samples to consider for further laboratory analyses, thus saving time and costs and minimizing transportation to the laboratory. Further, testing of industrial materials by portable instruments is often performed in harsh field conditions; thus, easy portability, high robustness, and fast data acquisition are required for these instruments to be easily incorporated into the production and the processing workflows. Finally, a variety of industrial applications of hLIBS systems can be expected with the progress of spectral data evaluation tools.