Modeling of Chromium, Copper, Zinc, Arsenic and Lead Using Portable X-ray Fluorescence Spectrometer Based on Discrete Wavelet Transform

A modeling method based on discrete wavelet transform (DWT) was introduced to analyze the concentration of chromium, copper, zinc, arsenic and lead in soil with a portable X-ray fluorescence (XRF) spectrometer. A total of 111 soil samples were collected and observed. Denoising and baseline correction were performed on each spectrum before modeling. The optimum conditions for pre-processing were denoising with Coiflet 3 on the 3rd level and baseline correction with Coiflet 3 on the 9th level. Calibration curves were established for the five heavy metals (HMs). The detection limits were compared before and after the application of DWT, the qualitative detection limits and the quantitative detection limits were calculated to be three and ten times as high as the standard deviation with silicon dioxide (blank), respectively. The results showed that the detection limits of the instrument using DWT were lower, and that they were below national soil standards; the determination coefficients (R2) based on DWT-processed spectra were higher, and ranged from 0.990 to 0.996, indicating a high degree of linearity between the contents of the HMs in soil and the XRF spectral characteristic peak intensity with the instrument measurement.


Introduction
Rapid population growth and urbanization, together with expansion of industrial production, have resulted in serious soil contamination issues globally. Heavy metal (HM) pollution is a major type of soil pollution. HMs in soil mainly derive from atmospheric dust, sewage irrigation, mining and smelting, and the application of pesticides and fertilizers [1]. Heavy metal pollution in soil deteriorates air and water quality, causes a decline in the yield and quality of crops, and threatens human health through the food chain [2]. HM pollution is difficult to identify because of the range of different contaminants and attempts to conceal pollution events, and difficult to remediate owing to the complex chemical behavior of HMs and their ecological effects [3]. The main detection methods of HMs in soil involve the use of strong acid to digest soil samples, which are then tested by methods such as atomic absorption spectroscopy (AAS) [4], inductively coupled plasma atomic emission spectrometry (ICP-AES) [5], and inductively coupled plasma mass spectrometry (ICP-MS) (ICP-MS) [6]. These methods are highly accurate and have good precision; however, the pre-processing steps are tedious, time-consuming, expensive, and the strong acid used in the experiments may also contribute to contamination. A rapid detection method with simple pre-processing and accurate detection results is needed. X-ray fluorescence (XRF) spectroscopy has the potential to meet these requirements, owing to its ability to rapidly analyze samples, the simplicity of its sample pre-treatment and operation, its

Collection of Soil Samples, Equipment and Measurement Conditions
All measurements were made with a portable XRF spectrometer (SX-100S, Beijing Research Center for Agricultural Standards and Testing, Beijing, China) fitted with an Ag anode X-ray tube, Al + Mo filter, and silicon drift detectors ( Figure 1). The instrument was operated at a voltage of 30 kV, current of 30 µA, and detection time of 100 s. Soil samples were placed into an ethylene sample cup (D × H: 30 mm × 10 mm, NCS Testing Technology Co., Beijing, China) with a fixed Mylar film (Premier Lab Supply Co., Woburn, MA, USA; special film for X-ray analysis, thickness: 6 µm) collar.
A total of 111 samples were collected, including 46 national standard soil samples (National Standard Material Research Center, Beijing, China), 30 standard addition soil samples (collected from Heilongjiang, Yunnan, Jiangsu and Xinjiang provinces from typical soil, namely, black soil, paddy soil, red soil, and brown soil), and 35 natural soil samples (collected from around Beijing) from the soil surface of farmland at a depth of 0-20 cm. The samples were preserved after air-drying, ground, and sieved with a nylon mesh indoors. The standard additional soil samples were mixed with the appropriate metal salts over a range of concentrations by serial dilution. All the tools used during the processing were made of ceramic or agate to ensure no interfering XRF signals at the tested heavy metal peak energies. Analysis of the HMs in the soil samples was performed in accordance with national standards, including the national standard soil samples (GSS-4) for quality control; three spectra were collected for each sample. The content of chromium (Cr), copper (Cu), zinc (Zn), lead (Pb) were determined with a Solaar-M (Thermo Fisher Scientific Inc., Waltham, MA, USA) atomic absorption graphite furnace. Arsenic (As) content was tested with a AFS-830 (Jitian Instrument Inc., Beijing, China) atomic fluorescence analyzer. A total of 111 samples were collected, including 46 national standard soil samples (National Standard Material Research Center, Beijing, China), 30 standard addition soil samples (collected from Heilongjiang, Yunnan, Jiangsu and Xinjiang provinces from typical soil, namely, black soil, paddy soil, red soil, and brown soil), and 35 natural soil samples (collected from around Beijing) from the soil surface of farmland at a depth of 0-20 cm. The samples were preserved after air-drying, ground, and sieved with a nylon mesh indoors. The standard additional soil samples were mixed with the appropriate metal salts over a range of concentrations by serial dilution. All the tools used during the processing were made of ceramic or agate to ensure no interfering XRF signals at the tested heavy metal peak energies. Analysis of the HMs in the soil samples was performed in accordance with national standards, including the national standard soil samples (GSS-4) for quality control; three spectra were collected for each sample. The content of chromium (Cr), copper (Cu), zinc (Zn), lead (Pb) were determined with a Solaar-M (Thermo Fisher Scientific Inc., Waltham, MA, USA) atomic absorption graphite furnace. Arsenic (As) content was tested with a AFS-830 (Jitian Instrument Inc., Beijing, China) atomic fluorescence analyzer.

Wavelet Transform
Wavelet transform [22] is a localized analysis of time (space) frequency by stretching shift operation on the signal (function) multiscale refined gradually, and ultimately achieves time segmentation at high frequencies and frequency segmentation at low frequencies [23,24]. Since most signals in nature are non-stationary, the characteristics of WT make it superior for signal analysis compared to Fourier transform and short-time Fourier transform. Fourier transform can only capture the frequency components contained in a signal, but cannot calculate when the components appear, so it is inherently defective for dealing with non-stationary signals. Short-time Fourier transform (STFT) adds windows during the decomposition process to obtain time components, but it is hard to find the appropriate width of the window when dealing with non-stationary signals. The narrow window has a high time resolution and a low frequency resolution, the wide window has a low time resolution and a high frequency resolution. For time-varying non-steady signals, high frequencies are suitable for small windows, low frequencies are suitable for large windows. However, the STFT window is fixed and the width does not change during one STFT calculation. So STFT is still unable to meet the need to analyze the frequency change of an unsteady signal. The WT can automatically adapt to the requirements of time-frequency signal analysis, and thus to focus on any details of a signal. The wavelet is a function family produced by panning or stretching a form function ( ) that satisfy certain conditions [25]:

Wavelet Transform
Wavelet transform [22] is a localized analysis of time (space) frequency by stretching shift operation on the signal (function) multiscale refined gradually, and ultimately achieves time segmentation at high frequencies and frequency segmentation at low frequencies [23,24]. Since most signals in nature are non-stationary, the characteristics of WT make it superior for signal analysis compared to Fourier transform and short-time Fourier transform. Fourier transform can only capture the frequency components contained in a signal, but cannot calculate when the components appear, so it is inherently defective for dealing with non-stationary signals. Short-time Fourier transform (STFT) adds windows during the decomposition process to obtain time components, but it is hard to find the appropriate width of the window when dealing with non-stationary signals. The narrow window has a high time resolution and a low frequency resolution, the wide window has a low time resolution and a high frequency resolution. For time-varying non-steady signals, high frequencies are suitable for small windows, low frequencies are suitable for large windows. However, the STFT window is fixed and the width does not change during one STFT calculation. So STFT is still unable to meet the need to analyze the frequency change of an unsteady signal. The WT can automatically adapt to the requirements of time-frequency signal analysis, and thus to focus on any details of a signal. The wavelet is a function family produced by panning or stretching a form function ψ(t) that satisfy certain conditions [25]: where ψ(t) is the wavelet basis (WB), and a and b are the scale and translation parameters, respectively. Many wavelets have been developed, including Haar, Daubechies, Coiflet, Symlets, Mexican Hat. The DWT is often used instead of WT in computation by Equation (2) [26].
where j, k ∈ Z. The discrete wavelet coefficients of a function f (t) or a signal are computed according to Equation (3) [27].

Mallat Algorithm
The Mallat [28] algorithm is often performed for DWT. Combined with multi-resolution analysis, Mallat proposes a method with a sub-band structure to achieve a DWT algorithm, unified computing sub-band filter and WT. The L 2 (R) space produces two subspaces with multi-resolution analysis (MRA)-scale space {V j } j∈Z and wavelet space {W j } j∈Z -and {ϕ j,k } j,k∈Z and {ψ j,k } j,k∈Z are the orthonormal bases of the two spaces, respectively. The process of the Mallat decomposition and reconstruction algorithm is shown in Figure 2. It is assumed that a conjugate mirror filter is produced by the orthogonal scaling function and wavelet function [29,30]. The scaling coefficients and wavelet coefficients of the WT as c j,k and d j,k , and the recursive formulas are shown in Equations (4) and (5), which enable the calculation {c j,k , d j,k } [31]. Thus, with the initial sequence {c j,k } k∈Z in space V j , we can calculate all the scale coefficients and wavelet coefficients of any spaces V j (j < J). Equations (4) and (5) are known as the DWT decomposition formula. The Mallat reconstruction formula is given in Equation (6): Mexican Hat. The DWT is often used instead of WT in computation by eq 2 [26].
where , ∈ Z . The discrete wavelet coefficients of a function ( ) or a signal are computed according to eq 3 [27].

Mallat Algorithm
The Mallat [28] algorithm is often performed for DWT. Combined with multi-resolution analysis, Mallat proposes a method with a sub-band structure to achieve a DWT algorithm, unified computing sub-band filter and WT. The L 2 (R) space produces two subspaces with multi-resolution analysis (MRA)-scale space {Vj}jZ and wavelet space {Wj}jZ-and {j,k}j,kZ and { j,k}j,kZ are the orthonormal bases of the two spaces, respectively. The process of the Mallat decomposition and reconstruction algorithm is shown in Figure 2. It is assumed that a conjugate mirror filter is produced by the orthogonal scaling function and wavelet function [29,30]. The scaling coefficients and wavelet coefficients of the WT as cj,k and dj,k, and the recursive formulas are shown in eq 4 and 5, which enable the calculation {cj,k, dj,k} [31]. Thus, with the initial sequence {cj,k}kZ in space Vj, we can calculate all the scale coefficients and wavelet coefficients of any spaces Vj (j < J). Equation 4 and 5 are known as the DWT decomposition formula. The Mallat reconstruction formula is given in eq 6:

Wavelet Transform Processing
Wavelet denoising mainly includes three steps: Signal decomposition, high-frequency coefficient threshold quantization and signal reconstruction. The most critical step in the process is how to select and quantify the threshold, which directly relates to the quality of denoising.

Wavelet Transform Processing
Wavelet denoising mainly includes three steps: Signal decomposition, high-frequency coefficient threshold quantization and signal reconstruction. The most critical step in the process is how to select and quantify the threshold, which directly relates to the quality of denoising.

Determination and Validation of Calibration Curves
The characteristic Kα X-ray line, peak position and absorption band of five HMs are listed in Table 1. After processed by DWT, calibration curves were constructed with mean values of counts and standard values. The measured spectra were re-transferred back to the software to obtain detection values calculated from the calibration curves. Detection values and standard values were contrasted to verify the accuracy of the instrument. To determine the detection limits of the instrument and investigate the reproducibility and accuracy further, soil samples with different compositions together with silicon dioxide (blank, Aladdin industrial Co., Shanghai, China) were chosen for retesting. Each sample was tested 11 times.

Evaluation Criteria of Denoising Results
To ensure the consistency between the denoising and original spectrum, we examined the signal to noise ratio (SNR, Equation (7)), mean square error (MSE, Equation (8)) and information entropy (H, Equation (9)) [32,33]. For SNR and H, a larger value is better, for MSE a lower value is better.
where N is the channel number, y i is the original value, x i is the noise-free value after processing, p(x i ) is the probability of information appears at a certain point. To comprehensively evaluate the effect of denoising, the coefficient α was created, which is proportional to the denoising effect:

Selection of WB
All the signal processes were performed on a desktop (Hewlett-Packard, P6-1499CN, Palo Alto, CA, USA) using MATLAB version 2014a software (MathWorks Inc., Natick, MA, USA). All programs for noise reduction and baseline correction were written locally in the lab. Some commonly used wavelets were chosen to denoise the XRF signal at a decomposition level of 4 with soft thresholding mode ( Table 2, Coiflet (coif) 2-5, Daubechies (db) 5-10, Symmlet (sym) 5-8). Coif 3 was chosen as the optimized wavelet and given the lowest value of α. The optimal decomposition level was determined by coif 3 from level 3 to 10 ( Table 3). Level 3 proved to be the optimum.  The detected spectra of 111 samples were processed by DWT. To show the results clearly, a representative spectrum is shown in Figure 3. The denoising results in the figure in the channel range 700-2000 are expanded to enable observation of signal processing effects on the HMs peaks. The peaks of Cr, Cu, Zn, As, and Pb appear around the channels 816-856, 1224-1264, 1315-1355, 1609-1649, and 1926-1969, respectively. Figure 3c,d show that with coif 3 at level 3, the noise level was successfully reduced without affecting the peak shape.  The detected spectra of 111 samples were processed by DWT. To show the results clearly, a representative spectrum is shown in Figure 3  To effectively resolve the spectrum and improve the operation speed of the algorithm, a "speed-up" method of peeling peaks was used. The wavelet decomposition was used for the XRF spectrum f 0 (n) To effectively resolve the spectrum and improve the operation speed of the algorithm, a "speed-up" method of peeling peaks was used. The wavelet decomposition was used for the XRF spectrum f 0 (n) to extract the approximate part f j (n) from a higher scale; if f 0 (n) > f j (n), then f 0 (n) = f j (n). In this way, peak pealing was accelerated; the same process was then applied to the new f 0 (n), and the cycle was repeated several times, in order to obtain a relatively gentle curve. Wavelet decomposition was used on the relatively flat curve, and extracted an appropriate level as an approximation of the XRF spectra baseline to achieve better results.
Through many experiments, the optimized decomposition level was 9, which achieved a good approximation of the baseline (Figure 4), and the decomposition result was shown in a9. The spectrum between 1500-4000 channels was intercepted to show the result of calculation by DWT clearly. The signal at a9 was regarded to be the baseline, and was used for background deduction.
Baseline corrections (coif 3 at level 9) were performed based on denoising (coif 3 at level 3). The baseline was successfully removed without losing peak information ( Figure 5). The spectrum between 800-2000 channels was magnified to show the calculation's effects on the HM peaks more clearly; the baseline and peak positions were unaffected by the processing.
repeated several times, in order to obtain a relatively gentle curve. Wavelet decomposition was used on the relatively flat curve, and extracted an appropriate level as an approximation of the XRF spectra baseline to achieve better results.
Through many experiments, the optimized decomposition level was 9, which achieved a good approximation of the baseline (Figure 4), and the decomposition result was shown in a9. The spectrum between 1500-4000 channels was intercepted to show the result of calculation by DWT clearly. The signal at a9 was regarded to be the baseline, and was used for background deduction. Baseline corrections (coif 3 at level 9) were performed based on denoising (coif 3 at level 3). The baseline was successfully removed without losing peak information ( Figure 5). The spectrum between 800-2000 channels was magnified to show the calculation's effects on the HM peaks more clearly; the baseline and peak positions were unaffected by the processing.   repeated several times, in order to obtain a relatively gentle curve. Wavelet decomposition was used on the relatively flat curve, and extracted an appropriate level as an approximation of the XRF spectra baseline to achieve better results.
Through many experiments, the optimized decomposition level was 9, which achieved a good approximation of the baseline (Figure 4), and the decomposition result was shown in a9. The spectrum between 1500-4000 channels was intercepted to show the result of calculation by DWT clearly. The signal at a9 was regarded to be the baseline, and was used for background deduction. Baseline corrections (coif 3 at level 9) were performed based on denoising (coif 3 at level 3). The baseline was successfully removed without losing peak information ( Figure 5). The spectrum between 800-2000 channels was magnified to show the calculation's effects on the HM peaks more clearly; the baseline and peak positions were unaffected by the processing.

Instrument Calibration Curves
The concentration ranges of Cr, Cu, Zn, As, and Pb were 32-749, 11.4-577, 31-680, 4.4-806, and 13.4-644 mg/Kg. The calibration curves were drawn with the raw spectral data and processed spectral data, which were obtained from each of the processed lines by the wavelet analysis using the method described in 2.3. Comparisons were made between the modeling results with the data before and after 8 of 12 processing. The values of R 2 for Cr, Cu, Zn, As, and Pb based on the raw spectral data were 0.9886, 0.9836, 0.9833, 0.9846 and 0.9846, respectively. These values were lower than the R 2 values processed with DWT ( Figure 6), indicating that the calibration curves fit better after the DWT processing. Thus, the data obtained after processing were better suited to quantitative analysis, with the DWT method also improving the stability and accuracy of the data. A high degree of linearity was found between the HM contents in the soil and the XRF spectral characteristic peak intensities from measurements within the appropriate range. A model was established to determine a calibration curve for the instrument.

Instrument Calibration Curves
The concentration ranges of Cr, Cu, Zn, As, and Pb were 32-749, 11.4-577, 31-680, 4.4-806, and 13.4-644 mg/Kg. The calibration curves were drawn with the raw spectral data and processed spectral data, which were obtained from each of the processed lines by the wavelet analysis using the method described in 2.3. Comparisons were made between the modeling results with the data before and after processing. The values of R 2 for Cr, Cu, Zn, As, and Pb based on the raw spectral data were 0.9886, 0.9836, 0.9833, 0.9846 and 0.9846, respectively. These values were lower than the R 2 values processed with DWT ( Figure 6), indicating that the calibration curves fit better after the DWT processing. Thus, the data obtained after processing were better suited to quantitative analysis, with the DWT method also improving the stability and accuracy of the data. A high degree of linearity was found between the HM contents in the soil and the XRF spectral characteristic peak intensities from measurements within the appropriate range. A model was established to determine a calibration curve for the instrument.

Detection Results and Detection Limits
The standard values of HMs in soil samples were tested by chemical methods, the results were compared to the XRF testing results with original data and processed data using DWT (Figure 7). This showed that, within their standard deviation range, the detected values were closer to the standard values, indicating that the calibration line was sufficiently accurate. A confirmatory test was performed according to the method described in 2.3. The instrument's qualitative detection limit (QDL) is three times the standard deviation of the silicon dioxide (blank), while the quantitative detection limit (QNDL) is ten times the standard deviation of the blank. The detection limits of the raw spectra and the spectra processed by DWT were calculated. The results are shown in Table 4. The table shows that both the QDL and QNDL values were considerably reduced. Thus, the DWT method has potential for practical applications.

Detection Results and Detection Limits
The standard values of HMs in soil samples were tested by chemical methods, the results were compared to the XRF testing results with original data and processed data using DWT (Figure 7). This showed that, within their standard deviation range, the detected values were closer to the standard values, indicating that the calibration line was sufficiently accurate. A confirmatory test was performed according to the method described in 2.3. The instrument's qualitative detection limit (QDL) is three times the standard deviation of the silicon dioxide (blank), while the quantitative detection limit (QNDL) is ten times the standard deviation of the blank. The detection limits of the raw spectra and the spectra processed by DWT were calculated. The results are shown in Table 4. The table shows that both the QDL and QNDL values were considerably reduced. Thus, the DWT method has potential for practical applications.

Conclusions
Denoising and baseline correction by the DWT method as a pre-processing procedure show good effectiveness for handling results of XRF spectroscopy. The Coiflet 3 wavelet base was chosen as an optimized filter for DWT. Good denoising effects were achieved at decomposition level 3 and an approximation of the baseline was achieved at decomposition level 9. A comparison of the calibration curves and detection limits between the raw spectral data and processed spectral data with DWT was performed. The results show that the accuracy is higher and the detection limits are lower for the processed data, which means that better modeling results can be obtained using DWT. The portable XRF spectrometer is capable of rapid detection of Cr, Cu, Zn, As, and Pb in soil within 100 s. The detection limits of the instrument were below the national secondary standard level (Environmental Quality Standard for Soils, GB15618-1995) in processed spectra, and the instrument showed good precision and accuracy, indicating potential for use in rapid in-situ screening of HM contamination in soil.