Echo-Signal De-Noising of CO2-DIAL Based on the Ensemble Empirical Mode Decomposition

The carbon dioxide (CO2) differential absorption lidar echo signal is susceptible to noise and must satisfy the high demand for signal-retrieval precision. Thus, a proper de-noising method should be selected to improve the inversion result. In this paper, we simultaneously decompose three signal pairs into different intrinsic mode functions (IMFs) using the method of ensemble empirical mode decomposition (EEMD). Further, the correlation coefficients of the IMFs with the same temporal scale are regarded as the criterion to determine the components that need removal. This method not only retains the useful information effectively but also removes the noise component. A significant improvement in the $R^2$ of the differential absorption optical depth (DAOD) of the de-noised signals is obtained. The results of the simulated and observed analysis signal demonstrated improvement both in the SNR and in the


Introduction
Since the Industrial Revolution, the concentration of atmospheric carbon dioxide (CO2) has continued to rise due to the massive combustion of fossil fuels [1,2]. Research has shown that part of the CO2 generated by human activities has inexplicably disappeared, which indicates that some unknown carbon sinks may exist [3]. To determine the distribution of carbon sources and sinks, many CO2 observation methods have been presented [4-9]. Differential absorption lidar (DIAL) is considered the most promising method for observing CO2 concentration [10]. In contrast to passive remote sensing, DIAL can provide better precision to retrieve atmospheric CO2 concentration [11]. Moreover, a ground-based DIAL can provide vertical profiles of CO2, which can be of great significance to the research of the carbon cycle [12-14].
The echo signal of a ground-based CO2-DIAL mainly relies on the scattering of aerosol and atmospheric molecules [15]. Therefore, the signal intensity is relatively weak. Because a lidar signal is susceptible to noise such as background light, dark current, and thermal noise, even a small perturbation can result in poor signal inversion [16]. Moreover, because the signal-to-noise ratio (SNR) rapidly decreases with increasing range, useful information is overwhelmed by noise over long distances [17]. De-noising the CO2-DIAL signal therefore extends the effective detection range and plays an important role in the signal inversion precision. Consequently, a proper de-noising method must be selected to achieve good retrieval performance.
Several methods have been developed, and some are still under development, for application in lidar signal de-noising [18-21]. The multi-pulse average method has been widely used in signal de-noising by calculating the mean of multiple sets of profiles. However, averaging over multiple pulses smooths out the fast-changing components of aerosols. The wavelet transform (WT) has been proven quite useful in extracting signals from data generated in noisy, nonlinear, and nonstationary processes. Zhou et al. applied it to reduce lidar echo signal noise and improve the system SNR [22]. However, WTs rely on predefined basis functions, which limits their adaptability. Mao et al. put forward an alternative approach to obtain a de-noised signal as a by-product of combining the ensemble Kalman filter and the Fernald method [23]. Wu et al. and Tian et al. applied the empirical mode decomposition (EMD) to lidar signal de-noising [24,25]. The EMD developed by Huang et al. decomposes the signal on the basis of the local characteristic time scale of the signal itself and dispenses with any basis functions [26]. As discussed by Huang et al., a major drawback of the original EMD is the frequent appearance of mode mixing, which makes the physical meaning of the individual intrinsic mode functions (IMFs) unclear. To overcome this problem, a noise-assisted data analysis (NADA) method was proposed, namely the ensemble EMD (EEMD), which adds white noise with finite amplitude to the signal and then performs EMD [27,28].
In the present study, we assume that the signals in a short time have the same useful information and that their differences are only caused by noise in stable atmospheric conditions. We apply the EEMD method to decompose the signals into different IMFs with frequencies ranging from high to low. Generally, we consider the noise existing in the high-frequency parts and implement signal de-noising by removing them. The correlation coefficients of the IMFs with the same temporal scale are regarded as the criterion to determine the components that need removal. Finally, we regard the linear fitting parameters of the differential absorption optical depth (DAOD) as the criterion for evaluating the de-noising effect.
In the next section, the system and theoretical foundation of the CO2-DIAL and the EEMD method are presented in detail. We then study the feasibility of the method by processing simulated signals. Two range segments were selected for the analysis, and an obvious improvement in the result was obtained. Moreover, we apply the above-mentioned method to the observed signal, for which the r-square increased from 0.444 to 0.755, which benefits the inversion precision. In conclusion, the method turns out to be applicable to both simulated and observed signals.

System and Theoretical Foundation of CO2-DIAL
Our group is devoted to the development of a ground-based CO2-DIAL system. The system, located at Wuhan University, involves four parts, namely, the laser-emitting, frequency-stabilization, optical-receiving, and signal-acquisition systems. The system was designed to work at approximately 1.57 µm with a pulse repetition frequency of 20 Hz. Two close wavelengths were selected at the peak and wing of the absorption line, namely, the on-wavelength (λ_on) and off-wavelength (λ_off), respectively. The sampling frequency is 20 MHz, which gives a vertical range resolution of 7.5 m. The configuration of the lidar system is shown in Figure 1.
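As a quick consistency check on the stated sampling parameters, the range resolution follows from the two-way travel time of the pulse. This is a worked calculation under the assumption $c \approx 3 \times 10^{8}$ m/s; the symbol $f_s$ for the sampling frequency is introduced here for illustration only.

$$\Delta R = \frac{c}{2 f_s} = \frac{3 \times 10^{8}\ \mathrm{m/s}}{2 \times 20 \times 10^{6}\ \mathrm{Hz}} = 7.5\ \mathrm{m}$$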
The foundation of the CO2-DIAL is the backscatter lidar equation, as expressed in the following:

$$P(R) = P_0\,\eta\,\frac{A}{R^2}\,\frac{c\tau}{2}\,\beta(R)\,\exp\left\{-2\int_0^R \left[\alpha(r) + N(r)\,\sigma\right]\mathrm{d}r\right\}$$

where R is the detection range, P is the received power from range R, P_0 is the transmitted power, A is the effective receiver area, c is the speed of light, τ is the laser pulse width, η is the receiver efficiency, β(R) is the atmospheric backscatter coefficient, α(R) is the atmospheric extinction coefficient, N is the number density of CO2, and σ is the absorption cross-section of CO2. Because the on-line and off-line wavelengths are very close, β(R) and α(R) can be considered equivalent at the two wavelengths. Following Han et al., we can ignore the effect of water vapor through strict wavelength selection [15]. Given that the inversion of the CO2 concentration is mainly based on the absorption difference between the on-line and off-line wavelengths, even slight noise may cause a poor result. The signal intensity attenuates exponentially with range. Therefore, the relative amplitude of the noise in the signal varies considerably at different ranges.
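To make the role of each term concrete, the following is a minimal sketch that evaluates the standard single-scattering form of the lidar equation for an on-line and an off-line wavelength. All numerical values (cross-sections, backscatter and extinction coefficients, number density) are illustrative assumptions and are not taken from the system described here; the function name `lidar_return` is hypothetical.

```python
import numpy as np

def lidar_return(R, N, sigma, beta, alpha, P0=1.0, A=1.0, eta=1.0, tau=1e-8):
    """Received power on a uniform range grid R (m), single-scattering form."""
    c = 3.0e8                      # speed of light, m/s
    dR = R[1] - R[0]               # range-gate spacing, m
    # One-way optical depth from extinction plus CO2 absorption,
    # accumulated with a simple rectangle rule from the first range gate.
    od = np.cumsum((alpha + N * sigma) * np.ones_like(R)) * dR
    return P0 * eta * A / R**2 * (c * tau / 2.0) * beta * np.exp(-2.0 * od)

# Illustrative (assumed) values, not from the paper.
R = np.arange(300.0, 3000.0, 7.5)   # range gates, m
N = 1.0e22                          # CO2 number density, m^-3 (roughly 400 ppm near the surface)
sigma_on, sigma_off = 6.0e-27, 1.0e-27   # absorption cross-sections, m^2
beta, alpha = 1.0e-6, 1.0e-4        # backscatter (m^-1 sr^-1) and extinction (m^-1)

P_on = lidar_return(R, N, sigma_on, beta, alpha)
P_off = lidar_return(R, N, sigma_off, beta, alpha)
```

Because β and α are shared between the two wavelengths, the log ratio ln(P_off / P_on) in this sketch grows linearly with range at a rate of 2·N·(σ_on − σ_off), which is the quantity the differential absorption retrieval relies on.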

EEMD
The EMD proposed by Huang et al. is a non-linear, non-stationary, and self-adaptive decomposition technology that, unlike the traditional methods (Fourier transform, WT, etc.), dispenses with any basis functions and can decompose the signal into a series of IMFs:

$$y(t) = \sum_{i} c_i(t) + r(t)$$

where $y$ is the signal, $c_i$ represents the IMFs that include different frequency bands ranging from high to low, and $r$ is the final residue that indicates the trend of the signal. Compared with previous methods, EMD requires no preliminary knowledge and possesses better adaptation. However, the method still suffers from the drawback of mode mixing, which makes the physical meaning of the IMFs unclear. To solve this problem, Wu and Huang [28] proposed a new NADA method named EEMD, which adds a white noise series with finite amplitude to the targeted data and then decomposes the data into IMFs. By repeating this process with a different white noise series every time, we can obtain multiple IMFs with the same temporal scale. Ultimately, we treat the mean of the corresponding IMFs as the final result. The EEMD algorithm can be directly defined by the following steps:

(1) Add random white noise $n_i(t)$ with magnitude ε to the original signal $s(t)$ to generate a new signal $y_i(t)$ as

$$y_i(t) = s(t) + n_i(t)$$

(2) Use the EMD method to decompose the newly generated signal $y_i(t)$ as

$$y_i(t) = \sum_{j=1}^{m} c_{i,j}(t) + r_{i,m}(t)$$

(3) Calculate the mean of the corresponding IMFs and the residues as the final result as

$$c_j(t) = \frac{1}{n}\sum_{i=1}^{n} c_{i,j}(t), \qquad r_m(t) = \frac{1}{n}\sum_{i=1}^{n} r_{i,m}(t)$$

where $y_i(t)$ is the newly generated signal, $s(t)$ is the original signal, $n_i(t)$ stands for the different white noise series, $c_{i,j}$ represents the IMFs, $c_j$ is the mean of the corresponding IMFs of the decompositions, $n$ is the ensemble number, $r_{i,m}$ is the corresponding residue, and $r_m$ is the mean of the residues. By calculating the means of the IMFs and the residues, the added white noise series finally cancel each other, and the problem of mode mixing is largely solved.

As explained above, a series of IMFs with frequency ranging from high to low can be obtained. We generally consider the noise to exist in the high-frequency components. By removing the first few IMFs, we can achieve signal de-noising. The key problem is then how many IMFs should be removed.
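The three steps can be sketched in a few lines of NumPy. This is a deliberately simplified illustration and not the implementation used in this work: the envelopes are built by linear interpolation through the local extrema (standard EMD uses cubic splines), a fixed number of sifting iterations replaces a proper stopping criterion, and the function names and default parameters (`n_ensemble`, `noise_scale`, `n_imfs`) are assumptions made for the example.

```python
import numpy as np

def envelope_mean(x):
    """Mean of upper and lower envelopes; None if x has too few extrema."""
    t = np.arange(len(x))
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    if len(maxima) < 2 or len(minima) < 2:
        return None
    # Simplification: linear interpolation instead of cubic splines.
    upper = np.interp(t, maxima, x[maxima])
    lower = np.interp(t, minima, x[minima])
    return 0.5 * (upper + lower)

def emd(y, n_imfs=6, n_sift=10):
    """Return exactly n_imfs rows (zero-padded if needed) and the residue."""
    imfs = np.zeros((n_imfs, len(y)))
    residue = np.asarray(y, dtype=float).copy()
    for j in range(n_imfs):
        if envelope_mean(residue) is None:
            break                      # residue has too few extrema: stop
        h = residue.copy()
        for _ in range(n_sift):        # fixed number of sifting passes
            m = envelope_mean(h)
            if m is None:
                break
            h = h - m
        imfs[j] = h
        residue = residue - h
    return imfs, residue

def eemd(s, n_ensemble=50, noise_scale=0.2, n_imfs=6, seed=0):
    rng = np.random.default_rng(seed)
    eps = noise_scale * np.std(s)      # noise magnitude relative to the signal
    imf_sum = np.zeros((n_imfs, len(s)))
    res_sum = np.zeros(len(s))
    for _ in range(n_ensemble):
        y_i = s + eps * rng.standard_normal(len(s))   # step (1)
        imfs_i, res_i = emd(y_i, n_imfs)              # step (2)
        imf_sum += imfs_i
        res_sum += res_i
    return imf_sum / n_ensemble, res_sum / n_ensemble  # step (3)

t = np.linspace(0.0, 1.0, 500)
s = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
imfs, res = eemd(s, n_ensemble=20)
```

Summing the averaged IMFs and the averaged residue returns the original signal plus the ensemble mean of the added noise, which shrinks as the ensemble number grows. For real analyses a maintained EMD/EEMD library is likely a better choice than this sketch.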
In this work, we consider that the signals within a short time are similar to one another. In other words, the useful information in these signals is the same. Therefore, we simultaneously select several echo signals to decompose by EEMD and calculate the correlation coefficient of the IMFs with the same temporal scale. In the absence of noise, the IMFs with the same temporal scale are strongly correlated; otherwise, their correlation is weaker. According to the correlation coefficient, we can determine the number of IMFs to remove.
The methodology shown in Figure 2 consists of two parts: signal de-noising and performance evaluation. The input matrix Y represents the original data, which includes three signal pairs. The first part starts by decomposing each segment into different time scales, $C_s$ (s = 1, 2, ..., S), through EEMD and then calculating the correlation coefficient, $CR_s$. Finally, the signal is reconstructed on the basis of the threshold. The second part mainly evaluates the de-noising performance by calculating the DAOD.
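The correlation step can be illustrated as follows. This is a minimal sketch that assumes the IMFs of the selected signals are stacked in one array of shape (signals, scales, samples) and that the Pearson coefficient is used; the function name `scale_correlations` and the synthetic data are hypothetical.

```python
import numpy as np
from itertools import combinations

def scale_correlations(imfs):
    """Pairwise Pearson correlation between signals at each temporal scale.

    imfs: array of shape (n_signals, n_scales, n_samples).
    Returns an array of shape (n_scales, n_pairs).
    """
    n_signals, n_scales, _ = imfs.shape
    pairs = list(combinations(range(n_signals), 2))
    cr = np.empty((n_scales, len(pairs)))
    for s in range(n_scales):
        for k, (a, b) in enumerate(pairs):
            cr[s, k] = np.corrcoef(imfs[a, s], imfs[b, s])[0, 1]
    return cr

# Synthetic illustration: scale 0 is independent noise in each signal,
# scale 1 is a shared oscillation with a little independent noise.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 400)
shared = np.sin(2 * np.pi * 3 * t)
demo = np.stack([
    np.stack([rng.standard_normal(400),
              shared + 0.1 * rng.standard_normal(400)])
    for _ in range(3)
])
cr = scale_correlations(demo)
```

In this synthetic case the noise-dominated scale yields coefficients near zero for every pair, whereas the scale carrying shared information yields coefficients close to one, which is the contrast the criterion exploits.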

Simulated Signal Analysis
To verify the feasibility of the method, we analyzed the simulated CO2-DIAL signal. Figure 3a shows the pure signal of the on-line and off-line wavelengths. The initial CO2 concentration was assumed to be 400 parts per million, and the temperature and pressure profiles were obtained by interpolating the sounding data of the temperature and pressure at Wuhan in March 2015. Compared with the off-line wavelength, the on-line wavelength fell even more rapidly due to CO2 absorption. We introduced a white noise into the simulated pure signal and considered their combination as the observed signal (Figure 3b). Because of the influence of noise, the intensity of the on-line wavelength echo signal may be stronger than that of the off-line wavelength, which can mask the absorption phenomenon. Thus, this problem can influence CO2 retrieval precision. Hence, we introduced the above-mentioned method to process the echo signal.
We proposed the false SNR (FSNR) to approximately estimate the real SNR (RSNR) of the signal:

$$\mathrm{FSNR} = 10\log_{10}\left(\frac{\sum_t r(t)^2}{\sum_t C_1(t)^2}\right)$$

and

$$\mathrm{RSNR} = 10\log_{10}\left(\frac{\sum_t S_p(t)^2}{\sum_t S_n(t)^2}\right)$$

where $r$ is the final residue and can be viewed as the useful information of the signal, $C_1$ is the first IMF and can approximately represent the noise of the signal, and $S_p$ and $S_n$ represent the pure signal and the noise of a signal, respectively. As Figure 4 shows, we have calculated the FSNRs and RSNRs of ten simulated signals. The difference between FSNR and RSNR is small, so we can substitute FSNR for RSNR. The FSNRs of ten measured signals have also been calculated, and they differ only slightly between any two adjacent signals (Table 1). Therefore, three similar simulated signal groups with different SNRs (99, 100, 101) were selected as our research objects (Figure 5).
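A short sketch of how the two quantities could be computed is given below. It assumes the usual power-ratio definition of SNR in decibels (sum of squares of the signal part over sum of squares of the noise part); this exact form, and the function names, are assumptions made for illustration.

```python
import numpy as np

def snr_db(signal_part, noise_part):
    """10*log10 of the power ratio between a signal part and a noise part."""
    return 10.0 * np.log10(np.sum(signal_part**2) / np.sum(noise_part**2))

def fsnr_db(residue, first_imf):
    """False SNR: the residue stands in for the signal, the first IMF for the noise."""
    return snr_db(residue, first_imf)

def rsnr_db(pure, noise):
    """Real SNR: only available when the pure signal is known, e.g. in simulation."""
    return snr_db(pure, noise)

# Tiny check with hand-computable numbers: power ratio 400/4 = 100, i.e. 20 dB.
example = snr_db(np.full(4, 10.0), np.ones(4))
```

The FSNR needs only quantities that the decomposition itself provides, which is why it can be evaluated for measured signals where the pure signal is unknown.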
Atmosphere 2022, 13, x FOR PEER REVIEW 5 of 16

They can be considered as observed signals collected within a short time in stable atmospheric conditions. We segmented the signals into two parts to verify the performance of the method under different signal intensities. One part was approximately from 300 to 1500 m, and the other was from 1500 to 3000 m. Figure 6 shows the IMFs of the selected simulated signals of the on-line wavelength from 300 to 1500 m decomposed by EEMD. The frequency of the IMFs systematically decreased. We generally consider that IMFs with high frequency contain little useful information and are mostly occupied by noise.
Therefore, we assumed the IMFs with high frequency to be completely submerged in noise. We adopted the method of removing high-frequency IMFs to realize CO2-DIAL signal de-noising. The key to the problem was therefore how to determine the IMFs that need to be removed. To solve this problem, the correlation coefficients of the IMFs with the same temporal scale were calculated. We determined the IMFs that need to be removed and reconstructed the signal according to the following equations:

$$y_j(t) = \sum_{s=1}^{S} K_s(t)\,C_{js}(t)$$

and

$$K_s(t) = \begin{cases} 1, & CR_s \geq 0.5 \\ 0, & CR_s < 0.5 \end{cases}$$

Signal $y_j$ is reconstructed from several IMFs on individual scales $C_{js}(t)$ (s = 1, 2, ..., S) with a relevance factor $K_s(t)$, where CR represents the correlation coefficients of the IMFs with the same temporal scale. We considered that an IMF contains more noise than useful information on the condition that half of all its correlation coefficients are less than 0.5.

The 0.5 correlation threshold was determined by a statistical experiment. The main idea of the experiment is to find the threshold at which the de-noised signal reaches the maximum SNR and to count how often each threshold is optimal. To verify the selection of the threshold for various signal qualities, we simulated 50 signals with various SNRs (51, 52, ..., 100). We used the method proposed in this paper to analyze the signals in sets of three, which gives 48 groups in total.
The result showed that the signals obtained the maximum SNR when 0.3, 0.4, 0.6, or 0.7 was selected as the criterion in some cases (Figure 7). Although several thresholds can sometimes reach the highest SNR, only the 0.5 correlation threshold applies to almost all of the cases (45 of the 48 groups). Therefore, a 0.5 correlation threshold is optimal.

The following table (Table 2) lists the correlation coefficients of the IMFs with the same temporal scale of the on-line and off-line wavelengths. The correlation coefficients of the first IMF were comparatively low (i.e., below 0.5), whereas those of IMFs 2-7 reached more than 0.5. Therefore, we removed the first IMF and reconstructed the remaining components.
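The selection rule and the reconstruction can be sketched as follows. The sketch assumes that a scale is discarded when at least half of its pairwise correlation coefficients fall below the threshold, and that the final residue is always kept in the reconstruction; both the handling of the residue and the function names are assumptions made for this illustration.

```python
import numpy as np

def relevance_factors(cr, threshold=0.5):
    """cr: (n_scales, n_pairs) correlation coefficients.

    Returns K_s = 0 for scales where at least half of the coefficients
    are below the threshold, and K_s = 1 otherwise.
    """
    frac_low = np.mean(cr < threshold, axis=1)
    return np.where(frac_low >= 0.5, 0.0, 1.0)

def reconstruct(imfs, residue, k):
    """Weighted sum of IMFs (n_scales, n_samples) plus the residue."""
    return np.tensordot(k, imfs, axes=1) + residue

# Hand-checkable example with three scales and three signal pairs.
cr_demo = np.array([[0.10, 0.20, 0.30],    # all low  -> removed
                    [0.70, 0.80, 0.40],    # one low  -> kept
                    [0.90, 0.95, 0.99]])   # none low -> kept
k_demo = relevance_factors(cr_demo)
imfs_demo = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
recon_demo = reconstruct(imfs_demo, np.array([10.0, 10.0]), k_demo)
```

In the example the first scale is dropped, so each reconstructed sample equals 2 + 3 + 10 = 15.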
The ultimate purpose of de-noising is to improve the inversion accuracy; thus, we simultaneously analyzed the off-line wavelength. Figure 8 shows the linear regression of the de-noised signal as a function of the pure signal. The good linear relationship demonstrates that the signals have a high degree of similarity. Meanwhile, the SNR and mean square error (MSE) were calculated (Figure 9). A better result was obtained after removing the first IMF.

From the above discussion, we can conclude that the method works well for on-line or off-line wavelength echo signals. However, the log ratio of the two de-noised signals matters more, because it directly determines the inversion precision. The following equation is the inversion formula for the CO2 concentration [29,30]:

$$N_{CO_2}(R) = \frac{1}{2\,(\sigma_{on} - \sigma_{off})\,\Delta R}\,\ln\left[\frac{P_{off}(R+\Delta R)\,P_{on}(R)}{P_{on}(R+\Delta R)\,P_{off}(R)}\right] \tag{11}$$

where $N_{CO_2}$ represents the concentration of CO2, σ is the absorption cross-section of CO2, and ΔR is the range resolution. In this paper, DAOD refers to the difference between the on-line and off-line wavelengths in the CO2-DIAL system, which is used to represent the difference between the two laser echo signals caused by CO2 absorption. Then, Equation (11) can also be expressed in terms of the DAOD as

$$N_{CO_2}(R) = \frac{\mathrm{DAOD}(R+\Delta R) - \mathrm{DAOD}(R)}{2\,(\sigma_{on} - \sigma_{off})\,\Delta R}, \qquad \mathrm{DAOD}(R) = \ln\frac{P_{off}(R)}{P_{on}(R)} \tag{12}$$

Figure 10 shows the DAOD of the pure, original, and de-noised signals and directly shows that the data quality significantly improves and has a better fitting effect. The r-square listed in Table 3 is consistent with this. According to Equation (12), the slope of the DAOD is the ratio of the DAOD increment to ΔR, which is important for the inversion of the CO2 concentration. By comparing the slopes of the three data types, the slope of the de-noised signal is closer to the expected value, which demonstrates the validity of the de-noising method.
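The evaluation step can be sketched as a straight-line fit of the DAOD against range. The sketch takes the DAOD as the log ratio of the off-line to the on-line signal and converts the fitted slope to a number density with the differential cross-section; the cross-section values, the synthetic signals, and the function names are illustrative assumptions only.

```python
import numpy as np

def daod(p_on, p_off):
    """Log ratio of the off-line to the on-line echo signal."""
    return np.log(p_off / p_on)

def fit_daod(R, d):
    """Least-squares line through DAOD(R); returns slope, intercept, r-square."""
    slope, intercept = np.polyfit(R, d, 1)
    fitted = slope * R + intercept
    ss_res = np.sum((d - fitted) ** 2)
    ss_tot = np.sum((d - np.mean(d)) ** 2)
    return slope, intercept, 1.0 - ss_res / ss_tot

def co2_number_density(slope, sigma_on, sigma_off):
    """Range-averaged number density from the DAOD slope."""
    return slope / (2.0 * (sigma_on - sigma_off))

# Noise-free synthetic pair with a known DAOD slope (assumed values).
R = np.arange(300.0, 1500.0, 7.5)
true_slope = 1.0e-4                        # per metre
p_off = 1.0 / R**2
p_on = p_off * np.exp(-true_slope * R)
slope, intercept, r2 = fit_daod(R, daod(p_on, p_off))
n_co2 = co2_number_density(slope, 6.0e-27, 1.0e-27)
```

With noisy signals the same fit yields a lower r-square and a biased slope, and any range gate where the on-line signal exceeds the off-line signal produces a negative DAOD value.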
The above simulation results indicate that the method performed well in the range from 300 to 1500 m, which can be considered to have little noise influence. Further analysis should be implemented to assess the capability of the de-noising method. We performed the same steps specified above to analyze the range from 1500 to 3000 m, which is more readily affected by noise. Seven IMFs were obtained after EEMD, and we carried out signal reconstruction by removing the first four IMFs that did not meet the criteria (Table 4). Compared with the pure signal, the de-noised signal had good consistency (Figure 11). Therefore, we can deduce that the de-noising process destroys little useful information from the original signal. The highest SNR and the lowest MSE were obtained when the first four IMFs were removed (Figure 12). These results suggest that the de-noising method works well for single signals.

Table 4. Linear fitting parameters of the DAOD for simulated signal from 1500 to 3000 m.

Pure    Original    De-Noised

Because of the influence of noise, the intensities of the on-line and off-line wavelength signals were strongly perturbed; thus, the large changes in the log ratio of the two signals produced many negative values (Figure 13, left). These values impaired the validity and credibility of the linear fitting and led to a poor result. Nevertheless, a good result was obtained after implementing the de-noising method (Figure 13, right). The figure shows that no negative value was obtained, which illustrates that the errors of the two echo signals were greatly corrected. Because the DAOD represents the log ratio of two signals that are significantly smoothed after de-noising, some smooth fluctuations in the DAOD are acceptable. Moreover, the slope of the de-noised data was closer to the slope of the pure signal than that of the original signal (Table 4), which improved the inversion precision.
To further demonstrate the advantage of the proposed method, we compared it with several other de-noising methods (Figure 14). The multi-pulse average method contributed little to the result, and the EMD and WT methods improved the performance less than the EEMD method did. The data listed in Table 5 indicate that the EEMD method performs better than the other methods.
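For a simulated signal, where the pure signal is known, de-noising methods can be compared by SNR and MSE. The sketch below uses common textbook definitions, which are assumptions here because this section does not state the exact formulas; the function names `mse` and `snr_db` are hypothetical.

```python
import numpy as np


def mse(pure, estimate):
    """Mean squared error between the pure signal and an estimate of it."""
    pure = np.asarray(pure, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.mean((pure - estimate) ** 2))


def snr_db(pure, estimate):
    """SNR in dB: pure-signal power over the power of the remaining error."""
    pure = np.asarray(pure, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    signal_power = np.sum(pure ** 2)
    noise_power = np.sum((pure - estimate) ** 2)
    return float(10.0 * np.log10(signal_power / noise_power))


# Illustrative use: a de-noised estimate that stays closer to the pure
# signal gives a lower MSE and a higher SNR than the noisy original.
rng = np.random.default_rng(0)
pure = np.exp(-np.linspace(0.0, 3.0, 400))
noisy = pure + rng.normal(scale=0.05, size=pure.size)
print(mse(pure, noisy), snr_db(pure, noisy))
```

Under these definitions, a method is ranked higher when it gives a larger SNR and a smaller MSE on the same noisy input.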

Observed Signal Analysis
The CO2-DIAL echo signals for analysis were collected on 29 December 2015 under relatively stable atmospheric conditions without acute changes. Thus, we assumed that signals collected within a 30 s window carry the same useful information and that any difference among them is due to noise. Three signal pairs sequentially collected within 30 s were selected as our research objects; each pair includes on-line and off-line wavelength echo signals.
In our CO2-DIAL system, two acquisition modes were used, namely, photon-counting and analog modes. The former has a better SNR, and the latter compensates for the limitation of the former in collecting the near-range echo signal. We chose the photon-counting signals of the three signal pairs in the range from 1100 to 3000 m for presentation and named them T1, T2, and T3. We de-noised only T2; T1 and T3 served as reference signals. Figure 15 shows the IMFs of the off-line wavelength echo signals of T1, T2, and T3. We calculated the correlation coefficients of the IMFs with the same temporal scale. Table 6 shows that the correlation coefficients of the first two IMFs are lower than 0.5, whereas those of the remaining IMFs, which are not listed, are higher than 0.5. Thus, we removed the first two IMFs and reconstructed the signal from the remaining components.

Figure 15. IMFs of the off-line wavelength echo signals of T1, T2, and T3 (C1, C2, …, C7 represent various temporal scales).

The de-noising result is shown in Figure 16. The de-noised signal became smoother while retaining the characteristics of the original signal. To validate the effect of the de-noising method on the observed signal, we calculated the DAOD of the original and de-noised signals, and a significant improvement was obtained. The DAOD of the original signal showed negative values and large scatter (Figure 17). In contrast, the DAOD of the de-noised signal had good linearity, and the R² of the linear fit was 0.31 higher than that of the original signal (Table 7). Moreover, the interference of the negative values was eliminated, which enhances the data reliability. The slope of the DAOD also improved to some extent.
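The IMF selection step in this section can be sketched as follows. The sketch assumes the EEMD decompositions of the three signals are already available as arrays with the same number of components, and that an IMF of the target signal is kept when the mean of its Pearson correlations with the same-scale IMFs of the reference signals is at least 0.5. How the pairwise coefficients are combined is an assumption, and the function names `select_imfs` and `reconstruct` are hypothetical.

```python
import numpy as np


def select_imfs(target_imfs, reference_imfs_list, threshold=0.5):
    """Boolean mask of target IMFs to keep.

    target_imfs: array (n_imfs, n_samples), e.g. the IMFs of T2.
    reference_imfs_list: arrays of the same shape, e.g. the IMFs of T1, T3.
    Assumption: IMF k is kept when its mean correlation with IMF k of
    every reference signal is at least `threshold`.
    """
    target_imfs = np.asarray(target_imfs, dtype=float)
    keep = np.zeros(len(target_imfs), dtype=bool)
    for k, imf in enumerate(target_imfs):
        corrs = [
            np.corrcoef(imf, np.asarray(ref, dtype=float)[k])[0, 1]
            for ref in reference_imfs_list
        ]
        keep[k] = np.mean(corrs) >= threshold
    return keep


def reconstruct(target_imfs, keep):
    """Sum the retained components to form the de-noised signal."""
    return np.asarray(target_imfs, dtype=float)[keep].sum(axis=0)


# Illustrative stand-in for real decompositions: one noise-like component
# that differs between signals and one slow component shared by all three.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 500)
slow = np.sin(2.0 * np.pi * 3.0 * t)
imfs_t1, imfs_t2, imfs_t3 = (
    np.vstack([rng.normal(size=t.size), slow]) for _ in range(3)
)

keep = select_imfs(imfs_t2, [imfs_t1, imfs_t3])
denoised_t2 = reconstruct(imfs_t2, keep)
print(keep)
```

The idea behind the criterion is that noise is uncorrelated between signals collected a few seconds apart, whereas the atmospheric return is shared, so low-correlation scales are treated as noise.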

Conclusions
For the CO2-DIAL, CO2 inversion is an important process, but noise severely limits the inversion precision. In this paper, we used the EEMD algorithm to de-noise the echo signals of a ground-based CO2-DIAL system. The threshold was set on the correlation coefficients of the same-scale components of multiple groups of adjacent signals decomposed by EEMD, which determines the number of IMFs to be removed. Specifically, we decomposed three adjacent signal pairs into IMFs using EEMD, removed IMFs according to their correlation coefficients at the same temporal scale, and reconstructed the remaining components to complete the signal de-noising. This method retains the useful information effectively while removing the noise component.
To verify the feasibility of the method, we analyzed simulated CO2-DIAL signals. Three similar simulated signal groups with different SNRs were selected as our research objects; they can be regarded as observed signals collected within a short time under stable atmospheric conditions. We segmented the signals into two parts to verify the performance of the method under different signal intensities. First, the de-noised signal was compared with the pure signal; the good linear relationship between them demonstrates a high degree of similarity. The SNR and MSE were also calculated to confirm the effectiveness of the proposed method. Then, the DAOD of the de-noised signal was calculated. The R² of the DAOD reaches 0.841 and 0.835 for the detection ranges of 300-1500 m and 1500-3000 m, respectively, indicating high signal quality, which is conducive to the high-precision inversion of CO2. We also compared the proposed method with several other de-noising methods, and the results show that our method performs better than the other methods in CO2-DIAL de-noising. Finally, the method was applied to de-noise observed CO2-DIAL signals. The de-noised signal became smoother while retaining the characteristics of the original signal. Compared with the original signal, the de-noised signal had good linearity, and the R² of the DAOD improved from 0.444 for the original signal to 0.735 for the de-noised signal, which is beneficial to the inversion precision.

This work has some shortcomings. For example, the threshold for the correlation coefficients was determined from many experiments but lacks a theoretical foundation. In addition, we discussed the effect of the method only under stable atmospheric conditions; other cases were not considered. These deficiencies require further research, and the proposed method requires continued improvement in the future.

Author Contributions:
The study was completed with cooperation between all authors. C.X. and Y.Z. designed the research topic; C.X., Y.Z. and R.L. conducted the experiment; A.L. checked and analyzed the experimental results; C.X. wrote the paper. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.