Residual Skewness Monitoring-Based Estimation Method for Laser-Induced Breakdown Spectroscopy

Zhu, Bin; Shen, Xiangcheng; Liu, Tao; Wang, Sirui; Hang, Yuhua; Mo, Jianhua; Shao, Lei; Wang, Ruizhi

doi:10.3390/electronics14173343

by Bin Zhu ¹, Xiangcheng Shen ², Tao Liu ¹,*, Sirui Wang ², Yuhua Hang ¹, Jianhua Mo ², Lei Shao ² and Ruizhi Wang ¹

¹ Suzhou Nuclear Power Research Institute, Suzhou 215004, China

² School of Electronic and Information Engineering, Soochow University, Suzhou 215031, China

* Author to whom correspondence should be addressed.

Electronics 2025, 14(17), 3343; https://doi.org/10.3390/electronics14173343

Submission received: 6 July 2025 / Revised: 10 August 2025 / Accepted: 11 August 2025 / Published: 22 August 2025

Abstract

To address the challenges of narrow peak characteristics and low signal-to-noise ratio (SNR) detection in Laser-Induced Breakdown Spectroscopy (LIBS), in this paper, we combine the Sparse Bayesian Learning–Baseline Correction (SBL-BC) algorithm with residual skewness monitoring to propose a spectral estimation method tailored for LIBS. In LIBS spectra, discrete peaks are susceptible to baseline fluctuations and noise, while the Gaussian dictionary modeling and fixed convergence criterion of the existing SBL-BC lead to the inaccurate characterization of narrow peaks and high computational complexity. To overcome these limitations, we introduce a residual skewness dynamic tracking mechanism to mitigate residual negative skewness accumulation caused by positivity constraints under high noise levels, preventing traditional convergence criterion failure. Simultaneously, by eliminating the dictionary matrix and directly modeling the spectral peak vector, we transform matrix operations into vector computations, better aligning with LIBS’s narrow peak features and high-channel-count processing requirements. Through simulated and real spectral experiments, the results demonstrate that this method outperforms the SBL-BC algorithm in terms of spectral peak fitting accuracy, computational speed, and convergence performance across various SNRs. It effectively separates spectral peaks, baseline, and noise, providing a reliable approach for both quantitative and qualitative analysis of LIBS spectra.

Keywords: laser-induced breakdown spectroscopy; baseline correction; spectral peak estimation; variational inference; residual skewness monitoring

1. Introduction

LIBS uses laser pulses to excite plasma emission and enables rapid multi-element spectral acquisition, offering in situ, non-contact detection in metallurgy [1], environmental monitoring [2], biomedicine [3], and food safety [4]. A LIBS spectrum can be decomposed into discrete spectral lines (peaks) that carry elemental information, a slowly varying low-frequency baseline, and additive noise. The spectral peaks are the core signal for quantitative and qualitative analysis and appear as spikes distributed discretely along the wavelength axis. These characteristic lines originate from atomic energy-level transitions and occupy isolated, localized regions of the spectrum; their positions and intensities directly reflect the elemental species and contents and therefore constitute the key fingerprint information for elemental identification. However, in industrial field detection, LIBS spectra are subject to multiple interferences. Internally, plasma evolution is accompanied by a nonlinear, high baseline background caused by blackbody radiation, bremsstrahlung, and recombination radiation [5], and matrix effects [6] induce inter-element interference, leading to drastic baseline fluctuations. Externally, factors such as laser energy fluctuations [7], uneven sample surfaces, and detector dark current noise [6] introduce random noise, further degrading the signal-to-noise ratio. As a result, spectral peaks are submerged in noise and severe baseline drift, which significantly reduces the accuracy of subsequent analysis and may even lead to overfitting in the analytical model [8]. Therefore, effectively separating spectral peaks, baseline, and noise is a crucial prerequisite for quantitative and qualitative analysis of LIBS spectra.

Existing methods for addressing complex spectral interference are primarily categorized into two strategies: step-by-step processing and joint estimation [9,10,11,12]. Traditional stepwise strategies typically process data in stages, such as performing baseline correction, denoising, and spectral peak extraction sequentially. The parameters for each step are often determined by independent local optimization criteria. For instance, penalty coefficients in baseline correction methods like ASLS [13], airPLS [14], arPLS [15], and asPLS [16], along with decomposition levels and adjustment parameters in the improved compromise threshold wavelet denoising method [17], each belong to distinct optimization frameworks. These methods have significant drawbacks: staged parameters are difficult to jointly optimize and residual noise from baseline correction can mislead subsequent denoising, leading to missed detection of weak peaks or introduction of false peaks, with errors accumulating progressively through multiple processing steps. In recent years, joint estimation methods such as BEADS [10] and SSFBCFP [11] have leveraged sparsity based on the L1 norm [18] to simultaneously handle baseline and spectral peaks through a unified model. Although these approaches require the setting of hyperparameters, the parameters are co-optimized within the same optimization framework, effectively avoiding parameter incompatibility issues that often arise between stages in stepwise strategies. SBL-BC [12] further demonstrates the advantages of the joint estimation method. Based on a hierarchical Bayesian framework, it determines the hyperparameters through data-driven approaches without manually presetting regularization terms, thereby exhibiting more robust adaptability.

Although SBL-BC demonstrates excellent performance for broad peak spectra such as chromatography and Raman spectroscopy spectra, its application in LIBS faces limitations: the Gaussian prior for the spectral peak weight, w, differs from L1 regularization, resulting in limited sparsity. Consequently, under low-SNR conditions, spectral peak misidentification becomes inevitable. In this paper, a residual skewness monitoring mechanism is introduced to effectively prevent the asymmetric accumulation of residual distributions caused by spectral peak positivity constraints in high-noise environments, thereby avoiding the failure of traditional convergence criteria. Furthermore, considering the narrow spectral peaks in LIBS with pulse-like characteristics that occupy fewer data points, in this paper, the dictionary matrix is removed, the weight vector is treated as the spectral peak vector, and the matrix inversion operation is transformed into vector operations. This improvement enables the method to efficiently handle the channel scale of LIBS spectra.

2. Algorithm Introduction

2.1. Problem Modeling

First, we assume that the observed spectral data x ∈ ℝ^N can be decomposed into a pure spectral signal s ∈ ℝ^N, a baseline b ∈ ℝ^N, and noise n ∈ ℝ^N:

x = s + b + n

(1)

To prevent local noise fluctuations from causing overfitting or underfitting, and to simplify the algorithm, the components of the noise vector n are assumed to be independent and identically distributed, following a Gaussian distribution with a mean of 0 and a precision of α. The conditional distribution of x can then be expressed as follows:

p (x | s, b, α) = N (x | s + b, α^{- 1} I)

(2)

For the baseline b, the local curvature of a smooth curve approaches zero, so the absolute values of its second-order differences are generally small. The second-order difference matrix D ∈ ℝ^{(N−2)×N} is therefore used to characterize the smoothness of the baseline. Assuming that the elements of Db are independent and Gaussian with a mean of 0 and a precision of β, we have the following:

p(D b | β) = \frac{|β I|^{\frac{1}{2}}}{(2π)^{\frac{N}{2}}} \exp\left\{-\frac{1}{2} (D b)^{T} (β I) (D b)\right\}

(3)

According to the definition of the Gaussian distribution, it can be concluded that

p (b | β) = N (b | 0, {(β D^{T} D)}^{- 1}) .

(4)

Since rank(βD^TD) = N − 2, the N × N matrix βD^TD is singular and cannot be inverted. To make the prior well defined, a regularization term κI is added, which gives the corrected form:

p (b | β) = N (b | 0, {(β D^{T} D + κ I)}^{- 1})

(5)

κ can describe the inherent fluctuations of the baseline itself. However, during the actual computation of the posterior covariance of baseline b, the noise n automatically introduces a regularization term, so we ignore this term in subsequent derivations.

In SBL-BC, the spectral peak signal s is represented by a Gaussian dictionary matrix A ∈ ℝ^{N×N}; i.e., s = Aw, where w ∈ ℝ^N is a sparse weight vector. Partitioning A column-wise, each column vector represents a Gaussian peak with a distinct fixed mean:

A = (a_{1}, a_{2} \dots a_{N})

(6)

If a_{j} (j = 1, 2, …, N) are N-dimensional column vectors, then s is a linear combination of the a_{j}:

s = w_{1} a_{1} + w_{2} a_{2} + \dots + w_{N} a_{N}

(7)

The calculation formula for the specific elements is as follows, where σ_j² represents the variance in the Gaussian function:

a_{j}(i) = \exp\left(-\frac{(i - j)^{2}}{2 σ_{j}^{2}}\right)

(8)
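
As a small illustration of Equations (6)–(8), the following sketch builds a Gaussian dictionary and synthesizes a peak signal from a sparse weight vector. It is not the SBL-BC implementation: the function names are hypothetical, indices are 0-based, and a single variance shared by all columns is assumed for brevity.

```python
import math

def gaussian_dictionary(n, sigma2):
    """Return the n x n matrix A with A[i][j] = exp(-(i - j)^2 / (2 * sigma2)).

    Assumption: one variance `sigma2` is shared by all columns, whereas
    Equation (8) allows a separate sigma_j^2 for each column.
    """
    return [[math.exp(-((i - j) ** 2) / (2.0 * sigma2)) for j in range(n)]
            for i in range(n)]

def synthesize(a, w):
    """Compute s = A w, the weighted sum of dictionary columns (Equation (7))."""
    n = len(w)
    return [sum(a[i][j] * w[j] for j in range(n)) for i in range(n)]

A = gaussian_dictionary(7, sigma2=0.5)
w = [0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0]  # a single active atom centred at index 3
s = synthesize(A, w)
```

With a small variance, each column is concentrated on a few neighbouring samples, so one non-zero weight produces one narrow peak.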

To impose a sparse prior on w, we assume that its components are independent zero-mean Gaussian variables, where the n-th component has its own precision γ_n and γ ∈ ℝ^N is the precision vector:

p (w | γ) = \prod_{n = 1}^{N} N (w_{n} | 0, γ_{n}^{- 1})

(9)

To address the limitations of the SBL-BC method in LIBS, in this paper, the spectral peak s is not represented using the Gaussian dictionary matrix A and sparse weight vector w. Instead, s is directly modeled. The physical rationale lies in the following: the collected LIBS spectral peaks are relatively low in intensity and narrow in width; coupled with instrument resolution constraints, each peak occupies only a small number of data points. This characteristic makes it more appropriate to model the spectral peak s directly, without relying on the Gaussian dictionary matrix A and sparse weight vector w. Note that the “peak width” of s here is a computational unit, distinct from the actual physical peak width of LIBS peaks. The data width corresponding to this computational peak width retains the necessary information to support calculations of electron density and temperature.

p (s | γ) = \prod_{n = 1}^{N} N (s_{n} | 0, γ_{n}^{- 1})

(10)

In the equivalent optimization framework of this model, the update equations for s and b can also be derived by taking partial derivatives. The corresponding objective function is as follows, where Γ = diag{γ}, which is used to achieve point-wise localization of peak positions.

\min_{s, b} \; α \|x - s - b\|^{2} + β \|D b\|^{2} + s^{T} Γ s

(11)

α, β, and γ are all unknown parameters that are difficult to determine in the optimization framework and need to be updated through parameter estimation in the Bayesian framework. They are typically assumed to follow a non-informative prior Gamma distribution, where parameters c and d are set to extremely small values.

\begin{cases} p(α) = Γ(α | c, d) \\ p(β) = Γ(β | c, d) \\ p(γ) = \prod_{n = 1}^{N} Γ(γ_{n} | c, d) \end{cases}

(12)

2.2. Variational Inference Solution and Parameter Optimization

The observed spectrum x is treated as the observed variable. The quantities to be estimated and the hyperparameters introduced in the previous section are collected as the latent variables Θ = {s, b, α, β, γ}. The core objective of the algorithm is to infer the latent variables Θ from the observed variable x and to use the means of s and b under the posterior distribution p(Θ|x) as the final estimates.

According to the Bayesian formula, the computation of the posterior probability involves multiple integrals over the denominator with respect to each random variable. When dealing with spectral data involving thousands of variables, direct computation becomes infeasible. Therefore, variational inference [19] is employed to approximate the posterior distribution using a variational distribution q. To measure the similarity between distributions, the Kullback–Leibler (KL) divergence [20] is utilized. This transforms the original integration problem into an optimization problem:

q^{*}(Θ) = \underset{q(Θ)}{\arg\min} \; \mathrm{KL}(q(Θ) \,\|\, p(Θ | x))

(13)

In the variational Bayesian framework, based on the mean-field variational assumption [21], it is possible to decompose the approximate distribution q(Θ) as a product of the following five factors:

q (Θ) = q (s) q (b) q (α) q (β) q (γ)

(14)

According to the hierarchical Bayesian model, the joint distribution can be written as follows:

p (x, Θ) = p (x | s, b, α) p (s | γ) p (b | β) p (α) p (β) p (γ)

(15)

Since the observed spectrum is fixed, the log marginal likelihood (evidence) ln p(x) is a constant. Moreover, ln p(x) is the sum of the KL divergence and the Evidence Lower Bound (ELBO) [22], so minimizing the KL divergence is equivalent to maximizing the ELBO, and Equation (13) can be transformed into the following:

q^{*}(Θ) = \underset{q(Θ)}{\arg\max} \; \mathrm{ELBO}(q(Θ))

(16)
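
For reference, the equivalence between Equations (13) and (16) rests on the standard decomposition of the log evidence used in variational inference. It is written out here as a worked identity and is a general result rather than something specific to this method:

```latex
\ln p(x)
  = \underbrace{\int q(\Theta) \ln \frac{p(x, \Theta)}{q(\Theta)} \, d\Theta}_{\mathrm{ELBO}(q)}
  + \underbrace{\int q(\Theta) \ln \frac{q(\Theta)}{p(\Theta \mid x)} \, d\Theta}_{\mathrm{KL}(q \,\|\, p)}
```

Because the left-hand side does not depend on q and the KL term is non-negative, increasing the ELBO necessarily decreases the KL divergence by the same amount.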

To solve the distribution of a certain latent variable, we first fix the distributions of other latent variables and iteratively solve for q(s), q(b), q(α), q(β), and q(γ) sequentially using the principle of coordinate ascent [21], as shown in the following formula:

\ln q^{*}(θ_{g}) \propto \langle \ln p(x, Θ) \rangle_{q(Θ) / q(θ_{g})}, \quad g = 1, 2, \dots, 5

(17)

where ⟨ln p(x, Θ)⟩_{q(Θ)/q(θ_g)} denotes the expectation of ln p(x, Θ) with respect to all factors of q(Θ) except q(θ_g), and ∝ is used to indicate that the two sides are equal up to an additive constant independent of θ_g (equivalently, the distributions themselves differ by a constant normalizing factor).

Since a conjugate prior [23] is used, the form of the posterior distribution is known, and only the parameters of these distributions need to be determined. The specific derivation of the latent variable distribution is as follows:

The spectral peaks appear in the joint distribution only through the terms p(x|s, b, α) and p(s|γ). Substituting into Equation (17), we obtain the following:

\begin{aligned}
\ln q(s) &\propto \langle \ln p(x | s, b, α) + \ln p(s | γ) \rangle_{q(b) q(α) q(γ)} \\
&\propto \langle -\frac{α}{2} \|x - s - b\|^{2} \rangle_{q(b) q(α)} + \langle -\frac{1}{2} s^{T} Γ s \rangle_{q(γ)} \\
&\propto -\frac{\hat{α}}{2} (\|x - s - μ_{b}\|^{2} + \mathrm{tr}(Σ_{b})) - \frac{1}{2} s^{T} \hat{Γ} s \\
&\propto -\frac{1}{2} s^{T} (\hat{α} I + \hat{Γ}) s + \hat{α} s^{T} (x - μ_{b}) \\
&\propto \ln N(s | μ_{s}, Σ_{s})
\end{aligned}

Here, the identity ⟨‖z‖²⟩ = ‖μ_z‖² + tr(Σ_z), valid for any Gaussian variable z with mean μ_z and covariance Σ_z, is applied to b; the resulting tr(Σ_b) term does not depend on s and is absorbed into the constant. The approximate posterior of s therefore remains a Gaussian distribution, with the following covariance and mean:

Σ_{s} = (\hat{α} I + \hat{Γ})^{-1}

(18)

μ_{s} = \hat{α} Σ_{s} (x - μ_{b})

(19)

It can be observed that the covariance matrix Σ_s ∈ ℝ^{N×N} is diagonal, so it can be fully characterized by its diagonal vector σ_s² ∈ ℝ^N. Consequently, the inversion reduces to an element-wise reciprocal:

σ_{s}^{2} = \frac{1}{\hat{α} + \hat{γ}}

(20)

The update of vector s can also be translated into the following expression:

μ_{s} = \hat{α} \, \mathrm{diag}\{σ_{s}^{2} (x - μ_{b})^{T}\} = \hat{α} \, σ_{s}^{2} \odot (x - μ_{b})

(21)

Generally, only positive spectral peaks possess practical physical significance. Therefore, it is necessary to introduce a non-negativity constraint after updating μ_s to distinguish noise, which serves as an essential step for enhancing the “spectral peak structure” following the removal of matrix A.

μ_{s} = \frac{μ_{s} + | μ_{s} |}{2}

(22)
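
The element-wise character of Equations (20)–(22) can be sketched as follows. This is a minimal illustration in which the function name and the toy inputs are assumptions and plain Python lists stand in for vectors:

```python
def update_peaks(x, mu_b, alpha, gamma):
    """Element-wise peak update following Equations (20)-(22).

    x, mu_b and gamma are equal-length lists; alpha is the scalar noise
    precision. Returns (mu_s, sigma2_s).
    """
    # Eq. (20): the diagonal posterior variance is an element-wise reciprocal.
    sigma2_s = [1.0 / (alpha + g) for g in gamma]
    # Eq. (21): posterior mean, computed without any matrix inversion.
    mu_s = [alpha * v * (xi - bi) for v, xi, bi in zip(sigma2_s, x, mu_b)]
    # Eq. (22): non-negativity constraint, equivalent to max(mu, 0).
    mu_s = [(m + abs(m)) / 2.0 for m in mu_s]
    return mu_s, sigma2_s

mu_s, sigma2_s = update_peaks(
    x=[1.0, 5.0, 0.5], mu_b=[1.0, 1.0, 1.0], alpha=1.0, gamma=[1.0, 1.0, 3.0])
```

In this toy input, the third sample lies below the baseline, so its raw mean is negative and Equation (22) sets it to zero.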

For the update of baseline b, we focus only on p(x|s,b,α) and p(b|β):

\begin{aligned}
\ln q(b) &\propto \langle \ln p(x | s, b, α) + \ln p(b | β) \rangle_{q(s) q(α) q(β)} \\
&\propto \langle -\frac{α}{2} \|x - s - b\|^{2} \rangle_{q(s) q(α)} + \langle -\frac{β}{2} \|D b\|^{2} \rangle_{q(β)} \\
&\propto -\frac{1}{2} b^{T} (\hat{α} I + \hat{β} D^{T} D) b + \hat{α} b^{T} (x - μ_{s}) \\
&\propto \ln N(b | μ_{b}, Σ_{b})
\end{aligned}

From this, the covariance and mean of b can also be obtained:

Σ_{b} = {(\hat{α} I + \hat{β} D^{T} D)}^{- 1}

(23)

μ_{b} = \hat{α} Σ_{b} (x - μ_{s})

(24)

All hyperparameter priors are Gamma-distributed, where c and d are configured as 10⁻⁶:

\begin{aligned}
\ln q(α) &\propto \langle \ln p(x | s, b, α) + \ln p(α) \rangle_{q(s) q(b)} \\
&\propto \frac{N}{2} \ln α - \frac{α}{2} \langle \|x - s - b\|^{2} \rangle_{q(s) q(b)} + (c - 1) \ln α - d α \\
&\propto (c + \frac{N}{2} - 1) \ln α - α (d + \frac{1}{2} (\|x - μ_{s} - μ_{b}\|^{2} + \mathrm{tr}(Σ_{s} + Σ_{b}))) \\
&\propto \ln Γ(α | c + N/2, d + d_{α}/2)
\end{aligned}

\begin{aligned}
\ln q(β) &\propto \langle \ln p(b | β) + \ln p(β) \rangle_{q(b)} \\
&\propto \frac{N}{2} \ln β - \frac{β}{2} \langle \|D b\|^{2} \rangle_{q(b)} + (c - 1) \ln β - d β \\
&\propto (c + \frac{N}{2} - 1) \ln β - β (d + \frac{1}{2} (\|D μ_{b}\|^{2} + \mathrm{tr}(D^{T} D Σ_{b}))) \\
&\propto \ln Γ(β | c + N/2, d + d_{β}/2)
\end{aligned}

d_{α} = \|x - μ_{s} - μ_{b}\|^{2} + \mathrm{tr}(Σ_{s} + Σ_{b})

(25)

d_{β} = \|D μ_{b}\|^{2} + \mathrm{tr}(D^{T} D Σ_{b})

(26)

For γ, since its components are independent, each component can be solved individually:

\begin{aligned}
\ln q(γ_{j}) &\propto \langle \ln p(s_{j} | γ_{j}) + \ln p(γ_{j}) \rangle_{q(s_{j})} \\
&\propto \frac{1}{2} \ln γ_{j} - \frac{γ_{j}}{2} \langle s_{j}^{2} \rangle_{q(s_{j})} + (c - 1) \ln γ_{j} - d γ_{j} \\
&\propto (c + \frac{1}{2} - 1) \ln γ_{j} - γ_{j} (d + \frac{1}{2} (μ_{s_{j}}^{2} + Σ_{s, jj})) \\
&\propto \ln Γ(γ_{j} | c + 1/2, d + d_{γ_{j}}/2)
\end{aligned}

Expressed in vector form, we have the following:

d_{γ} = \mathrm{diag}\{μ_{s} μ_{s}^{T}\} + σ_{s}^{2}

(27)

Thus, with a Gamma prior, the approximate posterior of each precision remains a Gamma distribution, and each hyperparameter is updated to its posterior mean, i.e., the ratio of the first (shape) parameter to the second (rate) parameter. For γ, the division is element-wise and the shape parameter is c + 1/2, as obtained in the derivation of q(γ_j):

\begin{cases} \hat{α} = (c + N/2) / (d + d_{α}/2) \\ \hat{β} = (c + N/2) / (d + d_{β}/2) \\ \hat{γ} = (c + 1/2) / (d + d_{γ}/2) \end{cases}

(28)

Since the noise precision α tends to infinity when the noise level is extremely low, which affects the update of other parameters, we set an upper limit for α.

\hat{α} = \min\{\hat{α}, 10^{5}\}

(29)
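
A compact sketch of the precision updates in Equations (25) and (27)–(29) is given below. The function name and arguments are hypothetical; in particular, tr(Σ_b) is passed in by the caller because computing it is outside the scope of this illustration, and the shape parameter c + 1/2 for γ follows the derivation of q(γ_j) above.

```python
def update_precisions(x, mu_s, mu_b, sigma2_s, trace_sigma_b,
                      c=1e-6, d=1e-6, alpha_max=1e5):
    """Posterior-mean updates for alpha (scalar) and gamma (vector).

    trace_sigma_b is tr(Sigma_b), supplied by the caller (an assumption of
    this sketch). tr(Sigma_s) is the sum of the diagonal vector sigma2_s.
    """
    n = len(x)
    resid_sq = sum((xi - si - bi) ** 2 for xi, si, bi in zip(x, mu_s, mu_b))
    d_alpha = resid_sq + sum(sigma2_s) + trace_sigma_b         # Eq. (25)
    alpha = (c + n / 2.0) / (d + d_alpha / 2.0)                # Gamma mean
    alpha = min(alpha, alpha_max)                              # Eq. (29)
    d_gamma = [m * m + v for m, v in zip(mu_s, sigma2_s)]      # Eq. (27)
    gamma = [(c + 0.5) / (d + dg / 2.0) for dg in d_gamma]     # Gamma mean
    return alpha, gamma

alpha, gamma = update_precisions(
    x=[2.0, 4.0], mu_s=[0.0, 1.0], mu_b=[1.0, 1.0],
    sigma2_s=[0.5, 0.5], trace_sigma_b=1.0)

# With a perfect fit and zero variances, alpha would diverge and is capped.
alpha_capped, _ = update_precisions(
    x=[1.0, 1.0], mu_s=[0.0, 0.0], mu_b=[1.0, 1.0],
    sigma2_s=[0.0, 0.0], trace_sigma_b=0.0)
```

Entries of γ stay small where μ_s is large (a peak is present) and grow where μ_s is near zero, which is what drives the sparsity of the estimate.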

In practical applications, the baseline smoothing parameter β is fixed rather than updated. The appropriate value of β can span many orders of magnitude (from 10³ to 10¹²), and variational updates rarely change it by an order of magnitude within the maximum number of iterations. In this paper, β is therefore replaced by the smoothness parameter ρ = β/α, which modifies the update formula for μ_b to the following:

μ_{b} = {(I + ρ D^{T} D)}^{- 1} (x - μ_{s})

(30)

Since I + ρD^TD is a symmetric positive-definite matrix, the solve in Equation (30) can be accelerated using the Cholesky decomposition [24]:

U = \mathrm{chol}(I + ρ D^{T} D)

(31)

Finally, Equation (30) can be evaluated with two triangular solves, where \ denotes left division (the solution of a linear system), as in MATLAB:

μ_{b} = U \ (U^{T} \ (x - μ_{s}))

(32)
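
The two-stage solve of Equations (31) and (32) can be sketched in plain Python as follows. This is an illustration only: the helper names are hypothetical, the matrix is stored densely for clarity (a practical implementation would exploit its banded structure), and U is taken to be the upper-triangular factor with UᵀU = I + ρDᵀD, which is the convention of MATLAB's chol.

```python
import math

def smoothing_matrix(n, rho):
    """Dense I + rho * D^T D, with D the (n-2) x n second-order difference matrix."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 2):
        row = {k: 1.0, k + 1: -2.0, k + 2: 1.0}  # row k of D: (1, -2, 1)
        for i, di in row.items():
            for j, dj in row.items():
                a[i][j] += rho * di * dj
    return a

def cholesky_upper(a):
    """Return upper-triangular U with U^T U = a (a symmetric positive definite)."""
    n = len(a)
    u = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            acc = a[i][j] - sum(u[k][i] * u[k][j] for k in range(i))
            u[i][j] = math.sqrt(acc) if i == j else acc / u[i][i]
    return u

def solve_baseline(u, rhs):
    """Solve U^T U z = rhs by forward then backward substitution (Equation (32))."""
    n = len(rhs)
    y = [0.0] * n
    for i in range(n):                 # forward substitution: U^T y = rhs
        y[i] = (rhs[i] - sum(u[k][i] * y[k] for k in range(i))) / u[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):       # backward substitution: U z = y
        z[i] = (y[i] - sum(u[i][k] * z[k] for k in range(i + 1, n))) / u[i][i]
    return z

n = 8
line = [2.0 + 0.5 * i for i in range(n)]   # stands in for x - mu_s
U = cholesky_upper(smoothing_matrix(n, rho=1e5))
mu_b_demo = solve_baseline(U, line)
```

A straight line has zero second-order differences, so the smoothness penalty leaves it unchanged and the solve returns the input. Because ρ is fixed, U needs to be computed only once and can be reused in every iteration.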

In SBL-BC, the algorithm terminates when the relative change in μ_s between successive iterations is less than ε, as in the following equation:

\frac{\|μ_{s}^{t} - μ_{s}^{t - 1}\|}{\|μ_{s}^{t}\|} < ε

(33)

The proposed algorithm retains the traditional convergence criterion while addressing the persistent negative skewness of the residuals under strong noise, which is caused by insufficient sparsity in the original method. A dynamic termination condition based on residual skewness monitoring is proposed: iteration is terminated when the absolute value of the skewness (the standardized third-order moment) of the residual distribution reaches its minimum. At this point, the residual r = x − μ_s − μ_b approximates the noise, and by Equation (1), given accurate baseline correction, the spectral peaks are correctly estimated. The skewness is defined as follows:

\mathrm{skew}(r) = \frac{\frac{1}{N} \sum_{n = 1}^{N} (r_{n} - \bar{r})^{3}}{\left(\frac{1}{N} \sum_{n = 1}^{N} (r_{n} - \bar{r})^{2}\right)^{\frac{3}{2}}}

(34)
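
Equation (34) translates directly into a few lines of code. The sketch below uses the biased (1/N) moments exactly as written in the equation; the function name is an assumption.

```python
def skewness(r):
    """Sample skewness of Equation (34), using 1/N moments."""
    n = len(r)
    mean = sum(r) / n
    m2 = sum((v - mean) ** 2 for v in r) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in r) / n   # third central moment
    return m3 / m2 ** 1.5

symmetric = skewness([-2.0, -1.0, 0.0, 1.0, 2.0])
right_tailed = skewness([0.0, 0.0, 0.0, 4.0])
left_tailed = skewness([0.0, 0.0, 0.0, -4.0])
```

A symmetric sample gives zero, a sample with one large positive outlier gives a positive value, and mirroring the sample flips the sign.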

Because the residual skewness decreases gradually over the iterations, the improved algorithm employs a dynamic termination criterion: iteration stops when skew(r^(t−1)) > 0 and skew(r^(t)) < 0, and the iteration step with the smaller absolute skewness is selected as the final solution. This mechanism monitors the sign change of the skewness to capture the state in which the residual is most symmetric, thereby avoiding the non-convergence issues inherent in traditional criteria.

The complete procedure for processing LIBS spectral data is summarized in Algorithm 1 (SBL-BC for LIBS):

Algorithm 1: SBL-BC for LIBS
1.	Input spectral data x, maximum iterations maxiter, and error tolerance ε.
2.	Normalize x and save the normalization parameter scale_x. Initialize α, γ, ρ, and set the iteration counter t = 0.
3.	Let μ_s = 0. Compute U using Equation (31), and initialize μ_b using Equation (32).
4.	Set t = t + 1. Update μ_s using Equations (20)–(22), update α, γ using Equations (25), (27)–(29), and update μ_b using Equation (32).
5.	Check whether skew(r^(t−1)) > 0 and skew(r^(t)) < 0 are both satisfied. If not, go to step 7.
6.	Check whether abs(skew(r^(t−1))) < abs(skew(r^(t))) is satisfied. If so, go to step 8; otherwise, go to step 9.
7.	Check whether Equation (33) or t == maxiter is satisfied. If not, go to step 4; otherwise, go to step 9.
8.	Output μ_s = scale_x ∗ μ_s^(t−1), μ_b = scale_x ∗ μ_b^(t−1), and terminate.
9.	Output μ_s = scale_x ∗ μ_s^(t), μ_b = scale_x ∗ μ_b^(t).
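
The control flow of steps 4–9 can be sketched as a small driver that wraps an arbitrary update step. This is a hedged illustration, not the authors' code: `step` is a hypothetical callable standing in for step 4, normalization by scale_x is omitted, and the residual is taken as r = x − μ_s − μ_b.

```python
import math

def skewness(r):
    """Sample skewness (Equation (34)); returns 0.0 for a constant vector."""
    n = len(r)
    mean = sum(r) / n
    m2 = sum((v - mean) ** 2 for v in r) / n
    m3 = sum((v - mean) ** 3 for v in r) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0

def residual(x, mu_s, mu_b):
    return [xi - si - bi for xi, si, bi in zip(x, mu_s, mu_b)]

def norm(v):
    return math.sqrt(sum(a * a for a in v))

def run_with_skewness_stop(x, mu_s, mu_b, step, max_iter=100, eps=0.01):
    """Steps 4-9 of Algorithm 1 around a caller-supplied update.

    `step(mu_s, mu_b)` is a hypothetical callable that performs step 4 and
    returns the new (mu_s, mu_b). Returns (mu_s, mu_b, t), where t is the
    iteration whose estimates were selected. Assumes max_iter >= 1.
    """
    skew_prev = skewness(residual(x, mu_s, mu_b))
    for t in range(1, max_iter + 1):
        new_s, new_b = step(mu_s, mu_b)                     # step 4
        skew_new = skewness(residual(x, new_s, new_b))
        if skew_prev > 0 and skew_new < 0:                  # step 5: sign change
            if abs(skew_prev) < abs(skew_new):              # step 6
                return mu_s, mu_b, t - 1                    # step 8: keep t-1
            return new_s, new_b, t                          # step 9: keep t
        diff = [a - b for a, b in zip(new_s, mu_s)]
        change = norm(diff) / max(norm(new_s), 1e-12)       # Eq. (33)
        mu_s, mu_b, skew_prev = new_s, new_b, skew_new
        if change < eps:                                    # step 7
            break
    return mu_s, mu_b, t
```

Passing the update as a callable keeps the stopping logic separate from the variational updates, so the same driver can wrap either the dictionary-based or the dictionary-free update.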

3. Experimentation and Analysis

3.1. Simulation of Spectrum Generation

To generate realistic LIBS spectral peaks for the simulation experiments, a simulated spectrum generation procedure was designed to match the characteristics of the actual remote LIBS spectra collected, including the total number of data points N, the number of peaks M, the peak width, and the distribution of peak heights. First, the basic parameters of the simulated spectra were determined: N = 2000 was set to provide sufficient spectral detail, and M = 100 peaks were used, with each peak spanning 5 to 7 data points, in line with the narrow peak widths of actual LIBS spectra. Then, the linspace function of MATLAB was used to construct a linear grid of 1000 points, on which a Gaussian function was evaluated to produce a continuous unit Gaussian peak that serves as the template for subsequent peak sampling. At the same time, a zero matrix of N rows and M columns was created to initialize the spectral matrix, which holds the generated peaks before they are combined.

Peaks were then generated one at a time. For each peak, a width of 5 to 10 data points and a starting position within the N data points were selected at random. The numbers of sampling points to the left and right of the peak center were determined from the parity of the peak width so that they differ by at most 1. The initial random range of the sampling points was set according to the peak width to enhance randomness, and a suitable sampling interval was computed so that the Gaussian template was sampled at equal spacing with positive integer indices. The sampled peak was then extended to the full N data points, with the values outside the peak set close to 0.

Finally, a weight column vector was generated based on the statistical properties of LIBS peaks. The peaks were divided into three categories: 2% with heights in the range of 3000–5000, 15% in the range of 1000–3000, and the remainder in the range of 200–1000. The lower limit of 200 on the peak height closely matches the peak height distribution of the actual spectra. The spectral matrix was multiplied by the weight column vector to obtain the final simulated peak spectrum, which was then saved. The specific algorithm flow is shown in Figure 1.
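
The three-tier height distribution can be sketched as follows. The text specifies only the proportions and ranges, so the uniform draw within each range, the function name, and the seed are assumptions made for illustration.

```python
import random

def draw_peak_heights(m, rng):
    """Draw m peak heights: 2% in 3000-5000, 15% in 1000-3000, rest in 200-1000.

    Assumption: heights are uniform within each range (not stated in the text).
    """
    n_high = round(0.02 * m)
    n_mid = round(0.15 * m)
    n_low = m - n_high - n_mid
    heights = ([rng.uniform(3000.0, 5000.0) for _ in range(n_high)]
               + [rng.uniform(1000.0, 3000.0) for _ in range(n_mid)]
               + [rng.uniform(200.0, 1000.0) for _ in range(n_low)])
    rng.shuffle(heights)  # assign the height classes to peaks in random order
    return heights

heights = draw_peak_heights(100, random.Random(0))
```

For M = 100 this yields 2 strong peaks, 15 medium peaks, and 83 weak peaks.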

The baseline was constructed using a form similar to SBL-BC, represented by the superposition of trigonometric functions, exponential functions, and constants to simulate complex baseline components. The specific formula is presented as follows, and the resulting baseline and residuals are shown in Figure 2, wherein the “Intensity” on the ordinate of all subsequent spectral figures (starting from Figure 2) is dimensionless, corresponding to ADC counts reflecting relative optical signal strength.

b(i) = 300 \left(\sin\left(\frac{i}{500}\right) + \exp\left(-\frac{3 i}{500}\right) + 1\right)

(35)
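
Equation (35) can be evaluated directly, as in the short sketch below. The 1-based index (i = 1, …, N, following MATLAB conventions) and the function name are assumptions.

```python
import math

def simulated_baseline(n):
    """Evaluate Equation (35) for i = 1..n (1-based indexing is assumed)."""
    return [300.0 * (math.sin(i / 500.0) + math.exp(-3.0 * i / 500.0) + 1.0)
            for i in range(1, n + 1)]

baseline = simulated_baseline(2000)
```

Over N = 2000 points the argument of the sine runs from 0 to 4 rad, so the baseline rises, falls, and stays positive throughout.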

Noise was generated using MATLAB’s built-in randn() function and saved. The size of the simulated spectral peak was fixed, while the noise level was adjusted to achieve a final SNR of 20 dB, 10 dB, or 5 dB. The resulting simulated spectral diagrams are shown in Figure 3.
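
One way to scale a fixed noise realization to a target SNR is sketched below. The SNR definition is not spelled out in the text, so this sketch assumes SNR = 10 log₁₀(P_signal / P_noise) with power measured as the mean square over all data points of the peak spectrum; the function names and the toy signal are hypothetical.

```python
import math
import random

def mean_power(v):
    return sum(a * a for a in v) / len(v)

def measured_snr_db(signal, noise):
    """SNR in dB under the mean-square power definition assumed here."""
    return 10.0 * math.log10(mean_power(signal) / mean_power(noise))

def scale_noise_to_snr(signal, noise, snr_db):
    """Rescale `noise` so that measured_snr_db(signal, result) equals snr_db."""
    target_noise_power = mean_power(signal) / 10.0 ** (snr_db / 10.0)
    gain = math.sqrt(target_noise_power / mean_power(noise))
    return [gain * a for a in noise]

rng = random.Random(0)
signal = [0.0] * 200
signal[50], signal[120] = 1000.0, 400.0      # toy peak spectrum
noise = [rng.gauss(0.0, 1.0) for _ in signal]
scaled = scale_noise_to_snr(signal, noise, snr_db=10.0)
```

Keeping the saved noise realization and changing only the gain means that the 20 dB, 10 dB, and 5 dB spectra differ in noise level but not in noise shape.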

3.2. Real Spectrum Acquisition

The real spectra were acquired with the telescopic LIBS system shown in Figure 4, which consists of a focusing system, a laser emission and spectral acquisition system, and a control system. The core component of the focusing system is the lens set, which is mounted on a displacement platform equipped with a motor controller and connected to a computer via the RS485 communication protocol, allowing precise control of its position. In the laser emission and spectral acquisition system, a Nd:YAG solid-state laser, model Dawa-200 (Beijing Beamtech Optronics Co., Ltd., Beijing, China), was used as the light source, with the following parameters: a wavelength of 1064 nm, a repetition frequency of 10 Hz, a single-pulse energy of 200 mJ, and a pulse width of about 7 ns. The Q-switching signal (a trigger pulse) has a high-level duration of 10 μs, and its rising edge is delayed by 150 μs relative to the rising edge of the CLK signal. This delay occurs before LIBS signal generation and thus does not affect spectral acquisition, consistent with the laser manual’s recommendation for optimal pulse intensity. The optical path of the system is illustrated in Figure 5. The telescope is a Cassegrain telescope, which is connected through a one-to-six fiber bundle to six AvaSpec NEXOS spectrometers (Avantes; Chinese branch: Beijing Avantes Technology Co., Ltd., Beijing, China) covering different wavelength bands, which together detect spectral signals in the range of 180–885 nm. In the control system, the ARM-FPGA acquisition card handles the timing control of the laser and spectrometers and also receives commands from the host computer through the USB protocol, coordinating the modules of the system.

The sample selected for this experiment is stainless steel 316, with the specific specimen number YSB S 353242-2019, as shown in Figure 6. Its main elements include iron (the matrix element), chromium, nickel, molybdenum, etc. Since the spectrum of stainless steel exhibits prominent spectral lines in the near-ultraviolet range, data from the spectrometer covering the wavelength range of 306.530–417.205 nm were selected for processing.

3.3. Spectral Peak Fitting Error Experiment

Based on the assumption that peak widths within the same spectrum exhibit minimal variation, the SBL-BC algorithm initializes all Gaussian functions in set A with identical variance parameters. Updating these parameters relies on the gradient of the ELBO with respect to precision. To avoid oscillations during parameter updates, a fixed-step gradient method with single-step updates can be employed for variance optimization. However, due to the small step size, this approach may fail to converge to the optimal peak width within the maximum number of iterations. Therefore, appropriate initialization of the initial variance parameters is crucial.

In this section, the SBL-BC algorithm with different initialization variance parameters is employed to process real spectral data at a distance of 60 cm, comparing their estimated baseline and residuals. The experiment sets σ_j² to 0.1, 0.3, 0.5, 1, 5, 8, and 15, corresponding to peak width data points of 1, 3, 5, 7, 13, 17, and 23, respectively. The final results are shown in Figure 7 and Figure 8.

From the figures, it can be observed that the correction results of SBL-BC exhibit significant variations under different initialization variance parameters. The ideal residual should be independent and identically distributed Gaussian noise with a uniform intensity across different spectral wavelength regions. The ideal baseline should exhibit undulations internally while maintaining overall continuity without abrupt changes. When the spectral peak width is relatively broad, the peak width in the dictionary may exceed the actual LIBS peak width, causing the dictionary’s peaks to fail in representing the true LIBS spectral peaks, thereby introducing errors in baseline and spectral peak estimation. In the extreme case of σ_j² = 15, significant outliers appear in the residuals within strong peak regions, and the baseline even exhibits localized concave deformation. However, when the initial Gaussian peak width in the dictionary is narrower (e.g., σ_j² = 0.1, 0.3, 0.5, 1), it aligns with the aforementioned ideal characteristics. Specifically, under the condition of an initial peak width of 1 (σ_j² = 0.1), not only are excellent correction results achieved, but the final dictionary matrix A also achieves the highest sparsity. Theoretically, this configuration allows for the fastest algorithm convergence speed, which indirectly validates the rationality of the dictionary matrix pruning method proposed in this work.

3.4. Residual Skewness Experiment

To verify the necessity and advantages of monitoring residual skewness, in this section, the following experiments are detailed: comparisons of the residuals res1 (generated from 5 dB Gaussian noise) and res2 (obtained from 5 dB low-SNR spectra processed by SBL-BC with σ_j² = 0.5 and SBL-BC-Simple algorithms) and observations of the evolutionary patterns of residuals as iterations progress. Here, the three methods are related as follows: SBL-BC-Simple is a simplified version of the original SBL-BC, which omits both the dictionary matrix and the residual skewness monitoring mechanism; SBL-BC for LIBS is an improved version of SBL-BC-Simple, further incorporating an iteration termination condition.

To comprehensively present the residual evolution patterns, the iteration termination conditions for both methods were disabled by explicitly ignoring Equation (33).

Figure 9 shows the correction results of SBL-BC and SBL-BC-Simple obtained via the above two methods. Compared with Figure 2, the sparsity deteriorates significantly, with many noisy features misidentified as spectral peaks. As seen in Figure 10, although Gaussian noise exhibits a symmetric distribution, both residuals res1 and res2 show distinct negative skewness: the maximum positive residual is significantly smaller than the absolute value of the maximum negative residual. Using Equation (34), the skewness values of noise, res1, and res2 were calculated as 0.0137, −0.9748, and −0.7785, respectively. This leads to the conclusion that the spectral peak structure introduced by the dictionary matrix cannot fully mitigate the negative skewness phenomenon. Furthermore, in-depth analysis of the distribution characteristics of noise and residual res1 reveals a high degree of similarity in their distribution patterns within the negative value regions. Based on this observation, it can be inferred that if residual res1 satisfies symmetric distribution conditions, its positive residual portion may achieve an ideal state through transitivity, thereby serving as an effective approximate estimate for positive noise.

Since the baseline converges rapidly, most iterations consist primarily of an energy exchange between noise and spectral peaks. Figure 11 and Figure 12 illustrate the variations in res1 and res2 at the 1st, 3rd, 5th, 7th, 10th, 15th, 20th, and 100th iterations. Figure 11 shows that in early iterations, the spectral peak estimates approach zero because of the large initial value of the hyperparameter γ (a strong sparsity constraint), resulting in residuals r ≈ x and a positively skewed distribution. As the iterations proceed, the residual energy gradually flows into the spectral peaks. Positive residual energy is progressively absorbed by the signal, while negative residuals remain unchanged because of the non-negativity constraint, causing the residual skewness to transition from positive to negative. At the critical iteration point (skewness ≈ 0), the residual distribution is at its most symmetric and aligns most closely with the noise distribution. Beyond this point, the spectral peaks continue absorbing positive noise, further exacerbating the negative skewness. Traditional convergence criteria do not respond to this, so the algorithm does not terminate at the optimal solution; at the maximum number of iterations, the intensity of the positive noise is significantly weaker than that of the negative noise. This demonstrates the feasibility of the proposed residual skewness monitoring mechanism. Figure 12 exhibits similar variation patterns but reveals residual outliers at fine spectral peaks (as shown in Figure 7), because the initial spectral width is larger than the actual peak width at certain positions. This indicates that the suppression of negative skewness and the reduction in fitting errors cannot be simultaneously optimized through the initial variance parameter settings, further highlighting the necessity of the residual skewness monitoring mechanism.
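The positive-to-negative transition of the residual skewness, and the idea of stopping near the zero crossing, can be sketched with a toy iteration. This is a hedged illustration only: the update below (a growing fraction of the positive part of x being absorbed) is a hypothetical stand-in and not the paper's variational update, and `first_nonpositive_skew` is an assumed helper name rather than the termination condition defined in the paper.

```python
import numpy as np

def skewness(r):
    """Moment-based sample skewness (assumed form)."""
    d = np.asarray(r, dtype=float)
    d = d - d.mean()
    return np.mean(d ** 3) / np.mean(d ** 2) ** 1.5

def first_nonpositive_skew(residuals):
    """Return the index of the first residual whose skewness is <= 0, or None."""
    for k, r in enumerate(residuals):
        if skewness(r) <= 0.0:
            return k
    return None

rng = np.random.default_rng(1)
n = 2000
peaks = np.zeros(n)
peaks[rng.choice(n, size=40, replace=False)] = rng.uniform(5.0, 20.0, size=40)
x = peaks + rng.normal(0.0, 1.0, size=n)  # sparse narrow peaks plus noise

# Toy iteration (not the paper's update rule): at step k the non-negative
# estimate absorbs a growing fraction of the positive part of x, so the
# residual starts at r = x and loses only its positive side over time.
fractions = np.linspace(0.0, 0.99, 100)
residuals = [x - a * np.clip(x, 0.0, None) for a in fractions]

k_stop = first_nonpositive_skew(residuals)
print("skewness at first step:", round(skewness(residuals[0]), 3))
print("skewness at last step: ", round(skewness(residuals[-1]), 3))
print("first step with skewness <= 0:", k_stop)
```

In this toy setting the skewness starts positive (the residual still contains the peaks), ends negative, and the monitored index marks where the residual is closest to symmetric.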

3.5. Algorithm Comparison Experiment

To verify the performance of the proposed algorithm, simulated spectral data were used for comparison with the SBL-BC algorithm. For the SBL-BC for LIBS algorithm proposed in this paper, the initial parameters were set as follows: α = 1, γ = 1_N × 10⁶, ρ = 10⁵, ε = 0.01, and a maximum of 100 iterations. For the traditional SBL-BC algorithm, the above parameters were kept unchanged, and the initial variance of the Gaussian peak function was set to σ_j² = 0.1, 0.3, 0.5, and 1, corresponding to Gaussian peak spectral widths of 1, 3, 5, and 7, respectively. The programs were run on a computer equipped with an Intel Core i7-1360P CPU (2.20 GHz base frequency) and 16 GB of onboard RAM (4800 MT/s), running Windows 11 and MATLAB R2022a.

The simulation experiment used the normalized mean square error (NMSE), speed (measured by runtime), the number of iterations, and the sparsity of the estimated peaks as metrics. The NMSE is defined below, where s_m denotes the true original signal and ŝ_m denotes the reconstructed signal.

\mathrm{NMSE} = \frac{1}{M} \sum_{m=1}^{M} \frac{\|\hat{s}_{m} - s_{m}\|^{2}}{\|s_{m}\|^{2}}

(36)

The “sparsity of the estimated peaks” refers to the ratio of the number of non-zero data points to the total number of data points. The smaller this value, the sparser the estimated spectral peaks, and the less likely the algorithm is to misjudge noise as spectral peaks.
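The two accuracy metrics can be written compactly as follows. This is a minimal sketch under stated assumptions: `nmse` follows Equation (36) with each row of the input treated as one signal s_m, and `nonzero_fraction` takes the fraction of non-zero points, which is the reading under which a smaller value corresponds to a sparser estimate. Both function names, the `tol` parameter, and the toy arrays are hypothetical and do not come from the paper's MATLAB code.

```python
import numpy as np

def nmse(s_true, s_hat):
    """Mean over M signals of ||s_hat_m - s_m||^2 / ||s_m||^2 (rows are signals)."""
    s_true = np.atleast_2d(np.asarray(s_true, dtype=float))
    s_hat = np.atleast_2d(np.asarray(s_hat, dtype=float))
    err = np.sum((s_hat - s_true) ** 2, axis=1)  # squared error per signal
    ref = np.sum(s_true ** 2, axis=1)            # squared norm of each true signal
    return float(np.mean(err / ref))

def nonzero_fraction(peaks, tol=0.0):
    """Fraction of points whose magnitude exceeds tol (assumed sparsity measure)."""
    peaks = np.asarray(peaks, dtype=float)
    return float(np.count_nonzero(np.abs(peaks) > tol) / peaks.size)

# Toy data: two true peak vectors and their estimates.
s_true = np.array([[0.0, 4.0, 0.0, 3.0],
                   [2.0, 0.0, 0.0, 0.0]])
s_hat = np.array([[0.0, 3.0, 0.0, 3.0],
                  [2.0, 0.0, 1.0, 0.0]])

print("NMSE:", nmse(s_true, s_hat))  # per-signal ratios are 1/25 and 1/4
print("non-zero fraction of first estimate:", nonzero_fraction(s_hat[0]))
```

A spurious non-zero point, such as the third entry of the second estimate, raises both the NMSE and the non-zero fraction, which is why the two metrics are reported together.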

The experiment compared the speed and correction effectiveness of SBL-BC for LIBS and SBL-BC on simulated spectra with different SNRs. The number of iterations and convergence behavior of each method were analyzed, with the final results presented in Table 1.

As shown in Table 1, SBL-BC for LIBS demonstrates significant superiority over the original SBL-BC algorithm with initial σ_j² values of 0.3, 0.5, and 1 at all tested SNRs. Specifically, the NMSE of the former is consistently lower than that of the latter, and the former runs in milliseconds, while the latter consistently takes more than one second. SBL-BC for LIBS also converges within the specified iteration limit, whereas the original SBL-BC algorithm requires significantly more iterations and may even fail to converge. For the SBL-BC algorithm with an initial σ_j² = 0.1, although its NMSE is lower than that of SBL-BC for LIBS under medium-SNR conditions (10 dB) and comparable under both high- and low-SNR conditions, SBL-BC for LIBS runs faster and does not fail to converge in the low-SNR scenario (5 dB), where SBL-BC with σ_j² = 0.1 reaches the 100-iteration limit. Additionally, the sparsity value of SBL-BC for LIBS is consistently lower than that of the original SBL-BC algorithm, indicating that SBL-BC for LIBS is better at avoiding the misjudgment of noise as spectral peaks. Overall, SBL-BC for LIBS performs better across a broader operating range.
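The runtime and NMSE comparisons in this paragraph can be checked directly against the values in Table 1. The short script below is only a convenience sketch: the numbers are copied from Table 1, while the dictionary layout and variable names are assumptions made for illustration.

```python
# (runtime in ms, NMSE) for SBL-BC for LIBS, copied from Table 1.
libs = {"5 dB": (2.47, 0.1062), "10 dB": (2.42, 0.0330), "20 dB": (8.65, 0.0089)}

# (runtime in ms, NMSE) for SBL-BC, keyed by the initial variance sigma_j^2.
sbl_bc = {
    "5 dB":  {0.1: (229.29, 0.1058), 0.3: (1410.47, 0.1374),
              0.5: (1641.65, 0.1697), 1.0: (1833.95, 0.3269)},
    "10 dB": {0.1: (66.79, 0.0209), 0.3: (1368.40, 0.0521),
              0.5: (1601.67, 0.1143), 1.0: (1948.39, 0.2894)},
    "20 dB": {0.1: (48.55, 0.0088), 0.3: (1433.54, 0.0406),
              0.5: (1710.76, 0.0998), 1.0: (1937.75, 0.2775)},
}

speedup = {}     # runtime ratio SBL-BC / SBL-BC for LIBS
nmse_lower = {}  # True where SBL-BC for LIBS has the lower NMSE
for snr, (t_libs, e_libs) in libs.items():
    for var, (t_ref, e_ref) in sbl_bc[snr].items():
        speedup[(snr, var)] = t_ref / t_libs
        nmse_lower[(snr, var)] = e_libs < e_ref
        print(f"{snr:>5}  sigma^2={var:<3}  speedup={speedup[(snr, var)]:7.1f}x  "
              f"LIBS NMSE lower: {nmse_lower[(snr, var)]}")
```

The flags confirm that SBL-BC for LIBS has the lower NMSE for σ_j² = 0.3, 0.5, and 1 at every SNR, while σ_j² = 0.1 is the one setting with a marginally lower NMSE, at a much higher runtime.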

The baseline correction and peak estimation results of SBL-BC for LIBS on simulated spectra are shown in Figure 13.

3.6. Real Spectrum Test

This section analyzes the measured spectra of 316 stainless steel samples to validate the algorithm’s applicability under different signal-to-noise ratios and signal-to-background ratios. Using the experimental setup described in Section 3.2, LIBS measurements were conducted at a distance of 60 cm. The correction results for multiple excitations at the same sample point are presented in Figure 14 and Figure 15, demonstrating reliable outcomes.

The large intensity variation between the raw LIBS spectra in Figure 14 and Figure 15 can be explained as follows. The strong signal in the first laser shot likely arises from surface contaminants, such as stains and oxide layers, which have a low ablation threshold and are easily fully ionized, resulting in prominent emission. The second laser pulse interacts with the stainless steel substrate; because of the substrate’s higher ablation threshold and the abrupt change in surface state (for example, a transition layer formed after contaminant removal), the signal intensity drops sharply. As pre-ablation proceeds, the contaminants are completely removed and the ablation crater takes on a regular morphology, so the signal gradually recovers and stabilizes.

Furthermore, it can be observed from Figure 14 and Figure 15 that during the first excitation, the two Ca spectral lines at 393.361 nm and 396.821 nm are relatively prominent, while the spectral lines near 340 nm are weaker. In the subsequent excitations, these two Ca lines gradually weaken, whereas the lines around 340 nm become progressively stronger. This indicates that the former correspond to surface contamination, while the latter represent the constituent elements of the stainless steel after contamination removal.

4. Summary and Outlook

This paper focuses on the challenges in LIBS analysis. To address the limitations of traditional methods, we propose an improved LIBS estimation approach based on residual skewness monitoring. The algorithm’s design features a reconstructed spectral peak model, eliminating the dictionary matrix and streamlining the computational process. Additionally, a residual skewness monitoring mechanism is introduced to optimize iteration termination criteria. Simulated spectral experiments generate realistic spectral data, verifying the algorithm’s effectiveness across varying SNRs. Compared to the traditional SBL-BC algorithm, our method demonstrates advantages in accuracy, speed, and convergence. Real spectral experiments on stainless steel samples using a telescope-based LIBS system achieve spectral correction and enable analysis of surface compositional variations.

However, this study still has room for improvement. On the one hand, although most parameters of the algorithm can be automatically learned and updated based on the data after initialization, the smoothing parameters β or ρ remain fixed and require manual adjustment through cross-validation. Future work could focus on developing adaptive mechanisms for these smoothing parameters, potentially by incorporating variable-step gradient descent algorithms [25] or heuristic optimization methods [26] to achieve automatic tuning across orders of magnitude. On the other hand, the experiments were conducted only on single-material samples. Given the significant characteristic spectral variations across different materials, future studies should expand the range of sample types to validate the algorithm’s adaptability in more complex real-world scenarios. Furthermore, integrating complementary spectral analysis techniques or advanced machine learning algorithms could enhance the accuracy and reliability of LIBS spectral analysis, thereby promoting the broader application of this technology in various fields.

Author Contributions

Conceptualization, B.Z. and X.S.; methodology, B.Z. and X.S.; software, B.Z. and X.S.; validation, B.Z., X.S., and S.W.; formal analysis, Y.H.; investigation, J.M.; resources, T.L.; data curation, S.W.; writing—original draft preparation, B.Z. and X.S.; writing—review and editing, B.Z. and X.S.; visualization, R.W.; supervision, L.S.; project administration, B.Z. and T.L.; funding acquisition, B.Z. and T.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key Research and Development Program of China (2022YFB3707203).

Data Availability Statement

The data are not publicly available due to privacy or ethical restrictions.

Conflicts of Interest

Author B. Z. was employed by the company Suzhou Nuclear Power Research Institute. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The authors declare that this study received funding from the National Key Research and Development Program of China (2022YFB3707203). The funder was not involved in the study design, collection, analysis, interpretation of data, the writing of this article or the decision to submit it for publication.

Abbreviations

The following abbreviations are used in this manuscript:

LIBS	Laser-Induced Breakdown Spectroscopy
SBL-BC	Sparse Bayesian Learning–Baseline Correction
SNR	Signal-to-Noise Ratio
NMSE	Normalized Mean Square Error

References

1. Myakalwar, A.K. LIBS as a Spectral Sensor for Monitoring Metallic Molten Phase in Metallurgical Applications—A Review. Minerals 2021, 11, 1073.
2. Pathak, A.K.; Kumar, R.; Singh, V.K.; Agrawal, R.; Rai, S.; Rai, A.K. Assessment of LIBS for Spectrochemical Analysis: A Review. Appl. Spectrosc. Rev. 2011, 47, 14–40.
3. Rehse, S.J. A review of the use of laser-induced breakdown spectroscopy for bacterial classification, quantification, and identification. Spectrochim. Acta Part B At. Spectrosc. 2019, 154, 50–69.
4. Sezer, B.; Bilge, G.; Boyaci, I.H. Capabilities and limitations of LIBS in food analysis. TrAC Trends Anal. Chem. 2017, 97, 345–353.
5. Huang, X.H. Element Detection in Scrap Steel Using Portable LIBS and Sparrow Search Algorithm-Kernel Extreme Learning Machine (SSA-KELM). Spectrosc. Spectr. Anal. 2024, 44, 2412–2419.
6. Yan, H.Y. Adaptive Baseline Correction Method for Laser-induced Breakdown Spectroscopy. Acta Photonica Sin. 2024, 53, 271–281.
7. Bai, W.Y. Fine Classification Method of Stainless Steel Based on LIBS Technology. Laser Optoelectron. Prog. 2022, 59, 270–274.
8. Tao, L. Using Laser Induced Breakdown Spectroscopy and Machine Learning to Identify Jiangxi Spring Tea Harvesting Periods. Laser Optoelectron. Prog. 2024, 61, 491–497.
9. Yu, S. A New Approach for Spectra Baseline Correction Using Sparse Representation. In Proceedings of the IASTED International Conference, Phuket, Thailand, 10–12 April 2013.
10. Ning, X.; Selesnick, I.W.; Duval, L. Chromatogram baseline estimation and denoising using sparsity (BEADS). Chemom. Intell. Lab. Syst. 2014, 139, 156–167.
11. Han, Q. Simultaneous spectrum fitting and baseline correction using sparse representation. Analyst 2017, 142, 2460–2468.
12. Li, H. Sparse Bayesian learning approach for baseline correction. Chemom. Intell. Lab. Syst. 2020, 204, 104088.
13. Eilers, P.H.; Boelens, H.F. Baseline correction with asymmetric least squares smoothing. Leiden Univ. Med. Cent. Rep. 2005, 1, 5.
14. Zhang, Z.; Chen, S.; Liang, Y. Baseline correction using adaptive iteratively reweighted penalized least squares. Analyst 2010, 135, 1138–1146.
15. Baek, S. Baseline correction using asymmetrically reweighted penalized least squares smoothing. Analyst 2015, 140, 250–267.
16. Zhang, F. Baseline correction for infrared spectra using adaptive smoothness parameter penalized least squares method. Spectrosc. Lett. 2020, 53, 222–233.
17. Xie, S. Accuracy improvement of quantitative LIBS analysis using wavelet threshold de-noising. J. Anal. At. Spectrom. 2017, 32, 629–637.
18. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 1996, 58, 267–288.
19. Jaakkola, T.S.; Jordan, M.I. Bayesian parameter estimation via variational methods. Stat. Comput. 2000, 10, 25–37.
20. Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Stat. 1951, 22, 79–86.
21. Jordan, M.I. An introduction to variational methods for graphical models. Mach. Learn. 1999, 37, 183–233.
22. Blei, D.M.; Kucukelbir, A.; McAuliffe, J.D. Variational inference: A review for statisticians. J. Am. Stat. Assoc. 2017, 112, 859–877.
23. Murphy, K.P. Conjugate Bayesian analysis of the Gaussian distribution. arXiv 2022, arXiv:2212.13612.
24. Krishnamoorthy, A.; Menon, D. Matrix inversion using Cholesky decomposition. In Proceedings of the 2013 IEEE Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA), Poznan, Poland, 26–28 September 2013.
25. Ma, S. A LIBS spectrum baseline correction method based on the non-parametric prior penalized least squares algorithm. Anal. Methods 2024, 16, 4360–4372.
26. Yan, H.Y. Explosive Residue Identification Study by Remote LIBS Combined With GA-arPLS. Spectrosc. Spectr. Anal. 2024, 44, 3199–3205.

Figure 1. Peak spectrum generation method.

Figure 2. Simulated spectral peaks and simulated baselines.

Figure 3. Simulated spectra with different signal-to-noise ratios.

Figure 4. Experimental setup.

Figure 5. The optical path of the system.

Figure 6. Stainless steel 316 sample (manufactured by Shenyang Northern Standard Sample Distribution Center, located in Shenyang, Liaoning Province, China).

Figure 7. SBL-BC residuals with different initial variance parameters.

Figure 8. Baseline of SBL-BC estimation with different initial variance parameters and raw spectrum.

Figure 9. Correction results of SBL-BC and SBL-BC-Simple.

Figure 10. Comparison of 5 dB noise and calibrated residuals.

Figure 11. The variation in residual res1 with the number of iterations.

Figure 12. The variation in residual res2 with the number of iterations.

Figure 13. Estimated baseline and spectral peaks of SBL-BC for LIBS under different signal-to-noise ratios. The red lines represent the estimated baselines, and the black lines denote the corrected spectral peaks.

Figure 14. First and second excitation correction effects.

Figure 15. Third and fourth excitation correction effects.

Table 1. Performance of algorithm at different signal-to-noise ratios.

SNR	Metric	SBL-BC for LIBS	SBL-BC (σ_j² = 0.1)	SBL-BC (σ_j² = 0.3)	SBL-BC (σ_j² = 0.5)	SBL-BC (σ_j² = 1.0)
5 dB	Speed/ms	2.47	229.29	1410.47	1641.65	1833.95
	Iterations	12	100	51	89	100
	NMSE	0.1062	0.1058	0.1374	0.1697	0.3269
	Sparsity	0.4955	0.8025	0.8475	0.5790	0.6225
10 dB	Speed/ms	2.42	66.79	1368.40	1601.67	1948.39
	Iterations	12	38	44	89	100
	NMSE	0.0330	0.0209	0.0521	0.1143	0.2894
	Sparsity	0.4785	0.8880	0.7370	0.6290	0.6995
20 dB	Speed/ms	8.65	48.55	1433.54	1710.76	1937.75
	Iterations	30	30	41	90	100
	NMSE	0.0089	0.0088	0.0406	0.0998	0.2775
	Sparsity	0.5190	0.9240	0.8475	0.7905	0.7195

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

