Research on Enhanced Detection of Benzoic Acid Additives in Liquid Food Based on Terahertz Metamaterial Devices

It is very important for human health to supervise the use of food additives, because excessive use of food additives will cause harm to the human body, especially lead to organ failures and even cancers. Therefore, it is important to realize high-sensibility detection of benzoic acid, a widely used food additive. Based on the theory of electromagnetism, this research attempts to design a terahertz-enhanced metamaterial resonator, using a metamaterial resonator to achieve enhanced detection of benzoic acid additives by using terahertz technology. The absorption peak of the metamaterial resonator is designed to be 1.95 THz, and the effectiveness of the metamaterial resonator is verified. Firstly, the original THz spectra of benzoic acid aqueous solution samples based on metamaterial are collected. Secondly, smoothing, multivariate scattering correction (MSC), and smoothing combined with first derivative (SG + 1 D) methods are used to preprocess the spectra to study the better spectral pretreatment methods. Then, Uninformative Variable Elimination (UVE) and Competitive Adaptive Reweighted Sampling (CARS) are used to explore the optimal terahertz band selection method. Finally, Partial Least Squares (PLS) and Least square support vector machine (LS-SVM) models are established, respectively, to realize the enhanced detection of benzoic acid additives. The LS-SVM model combined with CARS has the best effect, with the correlation coefficient of prediction set (Rp) is 0.9953, the root mean square error of prediction set (RMSEP) is 7.3 × 10−6, and the limit of detection (LOD) is 2.3610 × 10−5 g/mL. The research results lay a foundation for THz spectral analysis of benzoic acid additives, so that THz technology-based detection of benzoic acid additives in food can reach requirements stipulated in the national standard. This research is of great significance for promoting the detection and analysis of trace additives in food, whose results can also serve as a reference to the detection of antibiotic residues, banned additives, and other trace substances.


Introduction
Benzoic acid and benzoate are widely used in food as a kind of preservatives. With the improvement of people's living standards, more attention has been paid to food safety. The categories and limits of food additives in dairy products are clearly stipulated in National Food Safety Standard: Standard for Uses of Food Additives (GB-2760-2014), in which the limited dosage of benzoic acid is no more than 1.0 g/kg, in which the limited dosage of benzoic acid is no more than 1.0 g/kg [1]. The above food additives can be added to food under the dosage limited by the national standard, but excessive intake of which can be harmful to the human body, especially lead to liver failure, kidney failure, and even cancers [2]. The commonly adopted detection methods of food additives at home and abroad can be divided into two categories: The bioassay method and physicochemical analysis whose change caused by covering different samples is reflected in the position and amplitude of the THz resonance peak frequency, thus enhancing the ability to capture the micro change in the environment. This characteristic enables THz time-domain spectroscopy (THz-TDS) equipment to conduct more sensitive and convenient analysis and quantitative determination of additives.
Taking four kinds of tetracycline as research objects, Chen Min in 2017 [13] explored the feasibility of THz-TDS technology for the qualitative identification of these four kinds of substances by studying their characteristic THz spectra and analyzing their characteristic absorption peaks. Based on Asymmetric Dual-Wire Resonator (ADWR) and THz-TDS technology, Chen established a method system for detecting tetracycline hydrochloride (TCH) and its degradation products aqueous solution, providing a new idea for the rapid determination of tetracycline and its degradation products residues, and other antibiotic residues in liquid food. In 2018, Xu Wendao [14] established a quantitative detection model based on THz metamaterials and nanotechnology. With the assistance of metamaterials, THz-TDS technology was used in combination with nanomaterials (gold nanoparticles and graphene). This model realizes the rapid detection of protein and pesticide, providing new methods for pesticide residue detection and agro-products and biosensing technology. In 2018, Zhang Huo [15] preliminarily studied the possibility of applying the metamaterial resonator in detecting traditional Chinese medicine (TCM), given that the position of resonant peak generated by the metamaterial resonator in its THz transmission spectrum varies with the surface dielectric. By comparing the red shift of samples with different concentrations, Zhang successfully classified several borneols and sensitively obtained the variation of chlorpheniramine maleate with five concentrations ranging from 2‰ to 10‰.
Food preservatives play a vital role in preventing food from spoiling and prolonging the shelf life, but an excessive dosage of food additives can also cause serious hazards to the human body. Based on the theory of electromagnetics, this research designs a terahertzenhanced metamaterial resonator that can be used to improve the sensitivity of terahertz technology to the detection of benzoic acid additives. The absorption peak of the terahertz metamaterial resonator was designed to be 1.95 THz. Firstly, the original terahertz spectra of the aqueous solution samples of basic benzoic acid were collected. Secondly, smoothing, MSC, and SG + 1 st D methods were adopted to preprocess the spectra and to study the better spectral pretreatment methods. Then, UVE and CARS will be used to explore the optimal terahertz band selection method. Finally, PLS and LS-SVM models are established, respectively, to realize the enhanced detection of benzoic acid additives. This research is of great significance for promoting the detection and analysis of trace additives in food, whose results can also serve as a reference to the detection of antibiotic residues, banned additives, and other trace substances.

Numerical Simulation
In this study, the Finite Difference Time Domain (FDTD) Solutions software has been used to design the structural parameters of the metamaterial resonator. FDTD Solutions is a micro-nano photonic simulation software based on finite difference time domain method to solve the vector Maxwell equation. The software can simulate the interaction between electromagnetic waves (from the ultraviolet to terahertz and microwave bands) with typical and complex sub-wavelength structures; it can also be used in the design, analysis, and optimization of micro-nano optical materials, micro-nano photonic devices, etc. Considering that metamaterials are composed of periodic structural units, the transmission or reflection properties of a periodic structural unit in metamaterials can represent the spectral properties of the whole metamaterial. Therefore, in the simulation process, a periodic structural unit of the simulated metamaterial can be used to improve the simulation speed. Full-wave simulation of micro-structure is performed by commercial software FDTD, Lumerical FTDT Solutions 2020. Because of the good conductivity of gold at the terahertz (THz) band, it was regarded as a perfect electrical conductor (PEC) in simulation. The periodic boundary conditions are employed in x and y direction and perfect matched layer (PML) in z direction. A plane wave source is set at 100 µm above the metal structure. To reduce the reflection signal from the back of the quartz substrate, the power monitor is set inside the substrate, 100 µm below the metal structure. The refractive index of quartz substrate is obtained from reference [16].
The absorption peak of benzoic acid at the terahertz is found to be 1.95 THz through the previous study [10]. Therefore, FDTD is used in this paper to design a metamaterial structure that makes the absorption peak close to 1.95 THz. Specific dimensional parameters are shown in Figure 1a: The period (p) is 32 µm; the silicon substrate is adopted (the refractive index is 3.335); the coating metal is gold, with cross length (L 1 ) of 21 µm; length of the cross frame (L 2 ) is10 µm; width of the cross frame (W) is 3 µm. Figure 1b is the spatial distribution of electric field intensity around the "X" shaped array calculated using FDTD, where X and Y are the position coordinates of a unit cell in a metamaterial array. The color from blue to red corresponds to the change of electric field intensity from small to large-that is, the dark red with the largest wavelength represents the largest electric field intensity at this position. The largest electric field is located in the "X" shaped intersection region. This location, also known as a terahertz hotspot, tends to produce a large signal enhancement in terahertz studies. The analysis of the simulation results can provide a theoretical basis for selecting the optimal excitation frequency. Figure 1c is the transmittance spectrum of the metamaterial. The black line is the transmittance spectrum line of the metamaterial calculated using the FDTD method, and the red line is the transmittance spectrum line of the metamaterial by the experimental method, which shows a good agreement with simulation.

Device Fabrication
The 500 µm quartz substrates are cleaned by acetone, methanol, and isopropyl alcohol sonication bath each for 10 min and baked on a hot plate for 2 min. Then, the ultraviolet (UV) photoresist AZ5214 is spin coated on the substrate at 4000 rpm for 40 s and baked at 90 • C for 6 min. Next, the micro-structures were patterned on the AZ5214 by standard UV lithography. After the development process, a 100 nm-thick Au is deposited on the samples by electron beam evaporation tool. Finally, a lift-off process is performed by acetone sonication bath for 20 min to get the metal patterns. Figure 2 is the optical microscopy of metamaterial structure.

Sample Preparation
Benzoic acid was purchased from Aladdin (Shanghai, China) and is delivered in the shape of a white crystal. Deionized water was used as a solvent to reduce the interference of external substances in the experimental results. For the rapid and even dissolution of benzoic acid crystals, the crystals were first fully ground in a mortar, and then benzoic acid aqueous solution samples were prepared with deionized water in turn according to the sample ratio table for preparing an aqueous solution. The concentration ratios are as follows: 0 g/L, 0.055 g/L, 0.11 g/L, 0.215 g/L, 0.635 g/L, 1.115 g/L, 1.985 g/L. Six groups per concentration gradient of the total seven were prepared, and the prepared sample solution was placed on the scroll oscillator for mixing and vibration for 3 min so that benzoic acid can be fully dissolved in the deionized water. Since water has strong absorption of terahertz waves, it is necessary to dry the sample dropped on the metamaterial device. Firstly, the sample solution of 20 µL was absorbed by pipetting gun and dropped on the metamaterial device. Then the device was put into the drying box for 20 min, where the temperature was 50 • C to ensure that the solution on the metamaterial sheet could be fully dried. As the drying was completed, the device was put into the THz system for measurement using the transmission mode, and the THz spectrum of the corresponding sample was collected. All samples were operated in the same way in turn. section region. This location, also known as a terahertz hotspot, tends to produce a large signal enhancement in terahertz studies. The analysis of the simulation results can provide a theoretical basis for selecting the optimal excitation frequency. Figure 1c is the transmittance spectrum of the metamaterial. The black line is the transmittance spectrum line of the metamaterial calculated using the FDTD method, and the red line is the transmittance spectrum line of the metamaterial by the experimental method, which shows a good agreement with simulation. (c) Figure 1. The metamaterial structure and the electromagnetic field distribution of the metamaterial structure based on FDTD method: (a) Size structure of the "X" shaped metamaterial; (b) electromagnetic field distribution of the "X" shaped metamaterial structure; (c) The transmittance spectrum of the "X" shaped metamaterial.

Device Fabrication
The 500 μm quartz substrates are cleaned by acetone, methanol, and isopropyl alcohol sonication bath each for 10 min and baked on a hot plate for 2 min. Then, the ultraviolet (UV) photoresist AZ5214 is spin coated on the substrate at 4000 rpm for 40 s and baked at 90 °C for 6 min. Next, the micro-structures were patterned on the AZ5214 by standard UV lithography. After the development process, a 100 nm-thick Au is deposited on the samples by electron beam evaporation tool. Finally, a lift-off process is performed by acetone sonication bath for 20 min to get the metal patterns. Figure 2 is the optical microscopy of metamaterial structure.   The metamaterial structure and the electromagnetic field distribution of the metamaterial structure based on FDTD method: (a) Size structure of the "X" shaped metamaterial; (b) electromagnetic field distribution of the "X" shaped metamaterial structure; (c) The transmittance spectrum of the "X" shaped metamaterial. (c) Figure 1. The metamaterial structure and the electromagnetic field distribution of the metamaterial structure based on FDTD method: (a) Size structure of the "X" shaped metamaterial; (b) electromagnetic field distribution of the "X" shaped metamaterial structure; (c) The transmittance spectrum of the "X" shaped metamaterial.

Device Fabrication
The 500 μm quartz substrates are cleaned by acetone, methanol, and isopropyl alcohol sonication bath each for 10 min and baked on a hot plate for 2 min. Then, the ultraviolet (UV) photoresist AZ5214 is spin coated on the substrate at 4000 rpm for 40 s and baked at 90 °C for 6 min. Next, the micro-structures were patterned on the AZ5214 by standard UV lithography. After the development process, a 100 nm-thick Au is deposited on the samples by electron beam evaporation tool. Finally, a lift-off process is performed by acetone sonication bath for 20 min to get the metal patterns. Figure 2 is the optical microscopy of metamaterial structure.

Sample Preparation
Benzoic acid was purchased from Aladdin (www.aladdin-e.com) and is delivered in the shape of a white crystal. Deionized water was used as a solvent to reduce the interference of external substances in the experimental results. For the rapid and even dissolution of benzoic acid crystals, the crystals were first fully ground in a mortar, and then benzoic acid aqueous solution samples were prepared with deionized water in turn according to the sample ratio table for preparing an aqueous solution. The concentration ratios are as

THz Spectral Detection System
The detection device adopted in the experiment is the TAS7500 THz time-domain spectrometer produced by Advantest company (Tokyo, Japan). The schematic diagram of the device used in the experiment is shown in Figure 3. The device consists of two ultra-short pulse fiber lasers with a pulse center wavelength of 1550 nm, a maximum output power of 50 mW, and a system scan sampling rate of 8 ms /time. The measuring range of the spectrum is 0.1-5.0 THz; the resolution is 7.6 GHz; scanning frequency is 4048 points per sample; in the range of 1.0-3.0 THz, the light spot diameter is ranged from 2200 µm to 733 µm. During the experiment, the humidity was kept below 10%, and the temperature was 25 • C. In order to reduce the impact of random errors on the experimental results, the measurements were repeated 5 times for each sample, and the average spectrum was obtained for the establishment of subsequent models. The THz spectra of all samples collected were divided into calibration sets and prediction set by K-S (Kennard-Stone) algorithm [17] in a ratio of about 3:1.
was collected. All samples were operated in the same way in turn.

THz Spectral Detection System
The detection device adopted in the experiment is the TAS7500 THz time-domain spectrometer produced by Advantest company (Tokyo, Japan). The schematic diagram of the device used in the experiment is shown in Figure 3. The device consists of two ultrashort pulse fiber lasers with a pulse center wavelength of 1550 nm, a maximum output power of 50 mW, and a system scan sampling rate of 8 ms /time. The measuring range of the spectrum is 0.1-5.0 THz; the resolution is 7.6 GHz; scanning frequency is 4048 points per sample; in the range of 1.0-3.0 THz, the light spot diameter is ranged from 2200 μm to 733 μm. During the experiment, the humidity was kept below 10%, and the temperature was 25 °C. In order to reduce the impact of random errors on the experimental results, the measurements were repeated 5 times for each sample, and the average spectrum was obtained for the establishment of subsequent models. The THz spectra of all samples collected were divided into calibration sets and prediction set by K-S (Kennard-Stone) algorithm [17] in a ratio of about 3:1.

Extraction of THz Spectral Parameters
A fast Fourier transform (FFT) was adopted to acquire the spectral distribution of the THz pulse in the frequency, as shown in Equation (1): E ω are amplitude, phase of the electric field, timedomain waveform, and frequency-domain waveform, respectively [18].

Extraction of THz Spectral Parameters
A fast Fourier transform (FFT) was adopted to acquire the spectral distribution of the THz pulse in the frequency, as shown in Equation (1): where A(ω), ϕ(ω), E(t) and E(ω) are amplitude, phase of the electric field, time-domain waveform, and frequency-domain waveform, respectively [18]. As shown in Equation (2), ω represents the frequency; ϕ(ω) is the phase difference between reference signal and sample signal; d is the sample thickness, and c is the speed of light in vacuum. Equation (2) is the calculation of the refractive index of the test sample, while Equation (3) is the formula for sample absorption coefficient, where A S and A R were the amplitude of the sample and reference signals, respectively.
Then the transmission spectrum could be extracted by comparing the sample spectrum with the reference spectrum.
where A S and A R were the amplitude of the sample and reference signals, respectively. The averaged transmission spectrum from three measurements for each sample was used for further analysis [19]. The limit of detection (LOD) is calculated according to the standard error of predictive concentration, and the slope of the fitting curve between the true value and the predicted value of the model built by terahertz spectrum. The calculation result shows that the LOD has a confidence interval of 99.86% [20]. In Equation (4), σ refers to the standard error of Through comparison of Correlation Coefficient of the Calibration (R c ), correlation coefficient of the prediction (R p ), Root Mean Square Error of Calibration (RMSEC), and Root Mean Square Error of Prediction (RMSEP), the models were evaluated. The value of correlation coefficient of the model is inversely proportional to that of the RMSEP, but is in direct proportion to the model's accuracy. Besides, if values of RMSEC and RMSEP get closer, then the model will be more stable correspondingly.

Spectra Analysis
Terahertz spectrum is rich in information, including many THz optical parameters, such as absorption coefficient, dielectric constant, refractive index, phase angle, etc., which can reflect the internal information of substances in multiple dimensions. Figure 4 shows the THz absorption coefficient spectra of benzoic acid solutions with different concentrations. The spectra were measured by metamaterials. Given that both the front end and the back end have seen more noise interference than other places, the THz spectra with the frequency range of 1.0-3.0 THz were intercepted for the convenience of later data processing. It is shown in Figure 4 that as the concentration of benzoic acid increases, the peak has a tendency to shift to the low frequency and also decreases. The reason may be that the increase of the concentration improves the surface reflection, thereby reducing the absorption peak. The metamaterial resonator has a strong absorption at a specific frequency of the THz waveband (the absorption frequency of the metamaterial resonator used in this paper is 1.95 THz). And the THz metamaterial resonator has a strong interaction with the incident THz wave, which can form a strong electric field. Compared with the direct detection of samples by terahertz waves, the interaction between the electric field which is generated by the metamaterial resonator and the sample is obviously enhanced, which can amplify the sample signal. After the sample was added to the surface of the metamaterial resonator, the surface dielectric environment of the metamaterial resonator was changed, and the impedance matching relationship between the air and the metamaterial resonator structure was also changed, which leads to a red shift in the resonance peak frequency. Therefore, the interaction between the THz wave and the sample can be strengthened through a metamaterial resonator; the frequency shift signal of the resonant peak in the metamaterial resonator can be used for quantitative detection and analyzing the sample.

Validation by FDTD Calculations
Simulation of metamaterial gives us the theoretical analysis of the interaction between THz wave and matter, which is the theoretical guide when we are doing an experiment. The unit cell of metamaterial can be regarded as a micro-resonator comprised of capacitive and inductive response. It will excite a strong resonance, which will be very sensitive to the background change when illuminating the electromagnetic wave at the resonance frequency. The simulation is utilized to propose the scheme of concentration detection and the experiment is for demonstrating and verifying the scheme feasible.
The concentration of the solution can be represented by the thickness of benzoic acid, or the distribution densities of solute depends on the concentration of the solution. If the benzoic acid aqueous solution has a high concentration, its solute after drying can fully cover the metamaterial. On the other hand, the solution with a low concentration will form a discontinuous layer of solute after drying. Based on the two situations mentioned above, two solute film models (1. Various thickness with certain refractive index; 2. Fixed thickness with different refractive index) are made to analyze the influence of concentration on transmission spectrum. The FDTD software was used to conduct these two kinds of simulations. The thickness of the coating was fixed at T = 2 µm, and the variation of the transmission spectrum with the refractive index of the coating is shown in Figure 5a. The refractive index of the coating was fixed at n = 1.7, and the variation of transmission spectrum with the thickness of the coating is shown in Figure 5b. metamaterial resonator structure was also changed, which leads to a red shift in the resonance peak frequency. Therefore, the interaction between the THz wave and the sample can be strengthened through a metamaterial resonator; the frequency shift signal of the resonant peak in the metamaterial resonator can be used for quantitative detection and analyzing the sample.

Validation by FDTD Calculations
Simulation of metamaterial gives us the theoretical analysis of the interaction between THz wave and matter, which is the theoretical guide when we are doing an experiment. The unit cell of metamaterial can be regarded as a micro-resonator comprised of capacitive and inductive response. It will excite a strong resonance, which will be very sensitive to the background change when illuminating the electromagnetic wave at the resonance frequency. The simulation is utilized to propose the scheme of concentration detection and the experiment is for demonstrating and verifying the scheme feasible.
The concentration of the solution can be represented by the thickness of benzoic acid, or the distribution densities of solute depends on the concentration of the solution. If the benzoic acid aqueous solution has a high concentration, its solute after drying can fully cover the metamaterial. On the other hand, the solution with a low concentration will form a discontinuous layer of solute after drying. Based on the two situations mentioned above, two solute film models (1. Various thickness with certain refractive index; 2. Fixed thickness with different refractive index) are made to analyze the influence of concentration on transmission spectrum. The FDTD software was used to conduct these two kinds of simulations. The thickness of the coating was fixed at T = 2 μm, and the variation of the transmission spectrum with the refractive index of the coating is shown in Figure 5a. The refractive index of the coating was fixed at n = 1.7, and the variation of transmission spectrum with the thickness of the coating is shown in Figure 5b.
Here, the LC resonance model = 1/(2 √ ) is used to qualitatively analyze the change of metamaterial resonance response with the background change with the above two models. In our paper, each unit of metamaterial can be regarded as a micro-resonator comprised of capacitive and inductive response. The metal strip of the cross can be seen as micro-inductance, which will generate induced current, and the gap between the metal  5a shown, the increase of the refractive index of the cladding layer will lead to the resonance peak shift to low frequency.
In the second situation, the various concentration of the solution transforms into the different thickness of solute. Although the metal structure is fully covered by the solute, the thin film cannot cover all evanescent wave fields around the structure, some displacement current between metal strips will still cover the solute film and air. With the film thickness increase, the ratio of film to air in evanescent wave region will be increased, and also the equivalent refractive index around the metal structure which will lead to the capacitance of structure enlarged and resonance peak red shifted, as shown in Figure 5b.
It is worth noting that the increase of the refractive index of the thin film in the first situation will cause the almost linear red shift of the resonance peak, as shown in Figure  5a. In contrast to this situation, the equal step of increment of thickness with the certain refractive index will lead to a minor difference when the film is thick. That is because of the evanescent wave is exponential decay along the z axis (perpendicular to the metamaterial plane), the film closes to the metal structure has more impact on resonance. With fixed thickness, the background of metal structure has the same impact on the evanescent wave, and the equal refractive index difference has an almost equal influence on resonance. However, the impact on capacitance will decrease with the equal step increment of film thickness with fixed refractive index. When increasing the thickness further, the film will cover all evanescent fields around the metal structure, and the capacity response will not be changed, and the frequency of resonance peak will be fixed. The concentration we utilized in our experiment is likely to be the first situation; the equal increment of concentration has a similar impact on resonance peak. The theoretical guide is given by the simulation that the metamaterial composed by the micro-resonator array is more sensitive to concentration change of solution with a low concentration than that with a high concentration. Here, the LC resonance model f = 1/ 2π √ LC is used to qualitatively analyze the change of metamaterial resonance response with the background change with the above two models. In our paper, each unit of metamaterial can be regarded as a micro-resonator comprised of capacitive and inductive response. The metal strip of the cross can be seen as micro-inductance, which will generate induced current, and the gap between the metal strips forms micro-capacitance, which will generate displacement current under THz wave illuminance. When the cladding of metal micro-structure changes, the cladding's permeability and permittivity will be changed, which will lead to the inductive and capacitive response change, respectively. Except for some magnetic materials, like ferromagnetic materials, most of the materials have a minor differences in permeability, including air and benzoic acid used in this work. The same permeability of different cladding materials (air and benzoic acid) will lead to a similar inductive response of the micro-structures. However, the difference of permittivity (permittivity is the square of the refractive index, ε = n 2 ) of distinct materials is usually obvious. Thus, we only analyze the capacitive response in our micro-structures.
In the first situation, as Figure 5a shown, different concentrations of the solution will be transformed into the different distribution densities of solute. The discontinuous layer of benzoic acid can be seen as a uniform continuous film with a refractive index between air and itself. With the refractive index of the background increase, capacitance of micro-structure will be enlarged (C∝ε), and the resonance frequency will be red shift. As Figure 5a shown, the increase of the refractive index of the cladding layer will lead to the resonance peak shift to low frequency.
In the second situation, the various concentration of the solution transforms into the different thickness of solute. Although the metal structure is fully covered by the solute, the thin film cannot cover all evanescent wave fields around the structure, some displacement current between metal strips will still cover the solute film and air. With the film thickness increase, the ratio of film to air in evanescent wave region will be increased, and also the equivalent refractive index around the metal structure which will lead to the capacitance of structure enlarged and resonance peak red shifted, as shown in Figure 5b.
It is worth noting that the increase of the refractive index of the thin film in the first situation will cause the almost linear red shift of the resonance peak, as shown in Figure 5a. In contrast to this situation, the equal step of increment of thickness with the certain refractive index will lead to a minor difference when the film is thick. That is because of the evanescent wave is exponential decay along the z axis (perpendicular to the metamaterial plane), the film closes to the metal structure has more impact on resonance. With fixed thickness, the background of metal structure has the same impact on the evanescent wave, and the equal refractive index difference has an almost equal influence on resonance. However, the impact on capacitance will decrease with the equal step increment of film thickness with fixed refractive index. When increasing the thickness further, the film will cover all evanescent fields around the metal structure, and the capacity response will not be changed, and the frequency of resonance peak will be fixed. The concentration we utilized in our experiment is likely to be the first situation; the equal increment of concentration has a similar impact on resonance peak. The theoretical guide is given by the simulation that the metamaterial composed by the micro-resonator array is more sensitive to concentration change of solution with a low concentration than that with a high concentration.

Preprocessing of THz Spectra of Benzoic Acid Aqueous Solution Based on Metamaterials
In order to eliminate the influence of factors, such as uneven sample mixing, stray light, experimental environment, and noise during sampling, preprocessing methods of SG (Savitzky-Golay) convolution smoothing, multiple scattering correction (MSC), and SG + 1 st D were selected in this paper to preprocess spectral data [21]. Noises can be reduced by using S-G smoothing, to increase the SNR (Signal to Noise Ratio); 1 st D can significantly remove baseline interference and other background interferences, resolve overlapping peaks and improve resolution and sensitivity. The preprocessed spectral data and the benzoic acid content in aqueous solution were taken as the input variable and the output variable of the model, respectively, to establish the PLS model. The optimal preprocessing method was then picked out with the reference of the R p and RMSEP in the model. Figure 6 shows the THz spectra of the samples after being preprocessed by SG + 1 st D.
Results of the PLS model of THz spectra for the samples established by different preprocessing methods are counted in Table 1. The results show that the PLS model established after preprocessing by SG + 1stD has the best results, and the R p and RMSEP of the model are 0.9791 and 1.03×10 −4 , respectively. Therefore, the data preprocessed by SG + 1 st D are used for further analysis. SG + 1 st D were selected in this paper to preprocess spectral data [21]. Noises can be reduced by using S-G smoothing, to increase the SNR (Signal to Noise Ratio); 1 st D can significantly remove baseline interference and other background interferences, resolve overlapping peaks and improve resolution and sensitivity. The preprocessed spectral data and the benzoic acid content in aqueous solution were taken as the input variable and the output variable of the model, respectively, to establish the PLS model. The optimal preprocessing method was then picked out with the reference of the Rp and RMSEP in the model. Figure 6 shows the THz spectra of the samples after being preprocessed by SG + 1 st D. Results of the PLS model of THz spectra for the samples established by different preprocessing methods are counted in Table 1. The results show that the PLS model established after preprocessing by SG + 1stD has the best results, and the Rp and RMSEP of the model are 0.9791 and 1.03×10 −4 , respectively. Therefore, the data preprocessed by SG + 1 st D are used for further analysis.   The selection results by UVE of THz spectral variables of benzoic acid samples are shown in Figure 7. The green vertical bar in Figure 7 represents the wavelength separation line, on the left of which is the distribution curve of the wavelength stability. The two horizontal points and lines correspond with two values that represent the threshold values screened by the UVE wavelength method, and the upper and lower threshold values are negative to each other, +9.1851 and −9.1851, respectively. Only those values of stability beyond the range between two threshold values can be used as the input variables of the model. After UVE selection, 141 out of the 340 spectral variables (41.47%) were left, thus reducing the input variables of the model to some extent [22].

CARS Algorithm-Based Wavelength Variable Selection
Competitive adaptive weighted sampling (CARS) refers to an emerging but effective variable selection method. Its principle is identical to that of "the fittest survive" in the evolution theory initiated by Darwin. Each wavelength variable is considered as one individual, while the unsuitable individuals are gradually eliminated. During the selection of wavelength, the wavelength variables with the larger absolute value of regression coefficient in the PLS model were selected by the adaptive reweighted sampling (ARS), and those with smaller weight were removed. Finally, the subset of wavelength variable whose RMSECV value was the lowest was selected by cross-test (CV). This algorithm can be used to effectively single out the best wavelength combination that is relative to detection indexes [23,24].

UVE Algorithm-Based Wavelength Variable Selection
The selection results by UVE of THz spectral variables of benzoic acid samples are shown in Figure 7. The green vertical bar in Figure 7 represents the wavelength separation line, on the left of which is the distribution curve of the wavelength stability. The two horizontal points and lines correspond with two values that represent the threshold values screened by the UVE wavelength method, and the upper and lower threshold values are negative to each other, +9.1851 and -9.1851, respectively. Only those values of stability beyond the range between two threshold values can be used as the input variables of the model. After UVE selection, 141 out of the 340 spectral variables (41.47%) were left, thus reducing the input variables of the model to some extent [22].

CARS Algorithm-Based Wavelength Variable Selection
Competitive adaptive weighted sampling (CARS) refers to an emerging but effective variable selection method. Its principle is identical to that of "the fittest survive" in the evolution theory initiated by Darwin. Each wavelength variable is considered as one individual, while the unsuitable individuals are gradually eliminated. During the selection of wavelength, the wavelength variables with the larger absolute value of regression coefficient in the PLS model were selected by the adaptive reweighted sampling (ARS), and those with smaller weight were removed. Finally, the subset of wavelength variable whose RMSECV value was the lowest was selected by cross-test (CV). This algorithm can be used to effectively single out the best wavelength combination that is relative to detection indexes [23,24].
The CARS algorithm was adopted to select variables on the full THz spectra of flourbenzoic acid mixed samples, and the selection results are shown in Figure 8. The CARS algorithm was adopted to select variables on the full THz spectra of flourbenzoic acid mixed samples, and the selection results are shown in Figure 8. From Figure 8a, as the number of sampling increases, the number of retained wavelengths decreases in lockstep. At the outset, the number of retained wavelengths decreases rapidly as the number of sampling increases. Then the reduction gradually slows down, and the number of retained wavelengths ends up remaining unchanged with the increase of the number of sampling. This reflects the process from rough selection to careful selection of wavelength variables by the CARS algorithm. Figure 8b shows the tendency figure of the RMSECV value changing with the increase of the number of sampling in selecting wavelength variables. It can be seen from Figure 8b that RMSECV value gradually decreases when the number of sampling increases from 1 to 28. When the number of sampling reaches 28, RMSECV is at the minimum value of 1.6547 × 10 −4 ; when the number of sampling exceeds 28, the RMSECV value gradually increases. The above process indicates that when the number of sampling was below 24, spectral information unrelated to benzoic acid content was weeded out by the CARS algorithm; when the number of sampling was above 24, important information related to benzoic acid content was weeded out. Figure 8c shows the wavelength of the variable regression coefficient changing with the increase of the number of sampling in the wavelength selection process. "*" mark in Figure 8c represents the corresponding number of sampling with the minimum RMSECV value. It can be seen from Figure 8c that when the number of sampling is 28, RMSECV has the minimum value. Correspondingly, the number of wavelength variables is 20.

PLS and LS-SVM Models Establishment and Comparison for THz Spectra of Benzoic Acid
Samples in Solution Based on Metamaterials 3.5.1. Establishment of PLS Model for THz Spectra of the Samples Partial least squares (PLS) perfectly integrates multiple linear regression, principal component analysis, and canonical correlation analysis. When the spectral array and concentration array are decomposed simultaneously, the mutual relationship between them is also considered in the decomposition. With an enhanced corresponding calcula-tion relationship, the optimal correction model can be obtained [25]. Table 2 compares the modeling effect of the PLS model through adopting different methods for selecting wavelength variables. The wavelength variables selected by UVE and CARS were used as input and the benzoic acid content in flour as output to establish the PLS model. To better compare the influence on modeling using various methods to select wavelengths, data of variables of wavelength relatively selected by UVE and CARS were used to establish the PLS models. length variables. It can be seen from Figure 8b that RMSECV value gradually dec when the number of sampling increases from 1 to 28. When the number of sam reaches 28, RMSECV is at the minimum value of 1.6547 × 10 −4 ; when the number o pling exceeds 28, the RMSECV value gradually increases. The above process indicat when the number of sampling was below 24, spectral information unrelated to b acid content was weeded out by the CARS algorithm; when the number of samplin above 24, important information related to benzoic acid content was weeded out. F 8c shows the wavelength of the variable regression coefficient changing with the in of the number of sampling in the wavelength selection process. "*" mark in Figure 8 resents the corresponding number of sampling with the minimum RMSECV value. be seen from Figure 8c that when the number of sampling is 28, RMSECV has the mum value. Correspondingly, the number of wavelength variables is 20.

Establishment of PLS Model for THz Spectra of the Samples
Partial least squares (PLS) perfectly integrates multiple linear regression, pri component analysis, and canonical correlation analysis. When the spectral array an centration array are decomposed simultaneously, the mutual relationship between is also considered in the decomposition. With an enhanced corresponding calculati lationship, the optimal correction model can be obtained [25]. Table 2 compares the eling effect of the PLS model through adopting different methods for selecting wave variables. The wavelength variables selected by UVE and CARS were used as inpu the benzoic acid content in flour as output to establish the PLS model. To better co the influence on modeling using various methods to select wavelengths, data of var

Establishment of the LS-SVM Model for THz Spectra of Samples
Least squares support vector machine [26,27] (LS-SVM) refers to a machine learning method that emerged from the basis of statistical learning theory. Its key parameter indexes are input vector, type of kernel function, and the corresponding parameters of this function. The radial basis function (RBF) and linear kernel function (Lin) represent the two most typical kernel functions in LS-SVM. Table 3 compares the performance of the model established by different wavelength variable selection methods and the LS-SVM method. The LS-SVM model was established in which the inputs were wavelength variables selected by the methods of UVE and CARS, and those variables singled out by the methods of UVE and CARS in combination with the principal component analysis method, respectively.  Figure 9 shows the optimal predictive value effect of the PLS model and LS-SVM model for benzoic acid concentration of mixed samples. For the PLS model, wavelength selection by CARS not only reduces the calculation amount, but also improves the accuracy of the model. In the establishment of the LS-SVM model, wavelength selection using the CARS method can simplify the model while improving the accuracy of the model. In contrast, the prediction accuracy of the LS-SVM model was higher, and the prediction effect of the CARS-LS-SVM model established through CARS wavelength selection was better, with the prediction correlation coefficient (R p ) of 0.9953, the prediction root mean square error (RMSEP) of 7.3 × 10 −6 , and the LOD of 2.3610 × 10 −5 g/mL

Conclusions
In this paper, the absorption peak of 1.95 THz was designed according to the electromagnetic theory, which is used for the effective detection of benzoic acid samples. The original THz spectra of benzoic acid aqueous solution samples based on metamaterial were preprocessed by SG + 1 st D, and then the effective spectral wavebands were selected by CARS to establish the model. The results show that the CARS-LS-SVM model established by the method of CARS has an optimal prediction effect, with the prediction correlation coefficient (Rp) of 0.9953, the prediction root mean square error (RMSEP) of 7.3 × 10 −6 , and the LOD of 2.3610 × 10 −5 g/ mL. Compared with the traditional compression method, the detection accuracy is improved by about 1400 times [10]. The results provide a basis for terahertz spectral analysis of benzoic acid additives, which will promote the rapid detection of benzoic acid additives in food. This research is of great significance for promoting the detection and analysis of trace additives in food, whose results can also serve as a reference to the detection of antibiotic residues, banned additives, and other

Conclusions
In this paper, the absorption peak of 1.95 THz was designed according to the electromagnetic theory, which is used for the effective detection of benzoic acid samples. The original THz spectra of benzoic acid aqueous solution samples based on metamaterial were preprocessed by SG + 1 st D, and then the effective spectral wavebands were selected by CARS to establish the model. The results show that the CARS-LS-SVM model established by the method of CARS has an optimal prediction effect, with the prediction correlation coefficient (R p ) of 0.9953, the prediction root mean square error (RMSEP) of 7.3 × 10 −6 , and the LOD of 2.3610 × 10 −5 g/ mL. Compared with the traditional compression method, the detection accuracy is improved by about 1400 times [10]. The results provide a basis for terahertz spectral analysis of benzoic acid additives, which will promote the rapid detection of benzoic acid additives in food. This research is of great significance for promoting the detection and analysis of trace additives in food, whose results can also serve as a reference to the detection of antibiotic residues, banned additives, and other trace substances.