Long Short-Term Memory Recurrent Network Architectures for Electromagnetic Field Reconstruction Based on Underground Observations

Abstract: Deep underground laboratories offer advantages for conducting high-precision observations of weak geophysical signals, benefiting from a low background noise level. Enhancing strong, noisy ground electromagnetic (EM) field data using synchronously recorded underground EM signals, which typically exhibit a high signal-to-noise ratio, is both valuable and feasible. In this study, we propose an EM field reconstruction method employing a Long Short-Term Memory (LSTM) recurrent neural network with referenced deep underground EM observations. Initially, a deep learning model was developed to capture the time-varying features of underground multi-component EM fields using the LSTM recurrent neural network. Subsequently, this model was applied to process synchronously observed strong, noisy data from other conventional observation systems, such as those at the surface, to achieve noise suppression through signal reconstructions. Both the theoretical analysis and the practical observational data suggest that the proposed method effectively suppresses noise and reconstructs clean EM signals. This method is efficient and time-saving, representing an effective approach to fully utilizing the advantages of deep underground observation data. Furthermore, this method could be extended to the processing and analysis of other geophysical data.


Introduction
Deep underground spaces are characterized by low background noise, minimal cosmic ray intensity, and limited electromagnetic radiation. These features make underground laboratories crucial locations for research across microphysics, astrophysics, cosmology, and geoscience [1]. Over the past decades, numerous international underground laboratories have been constructed and continuously developed, including SNO in Canada [2], Kamioka in Japan [3], Modane and LSBB in France [4,5], Boulby in the UK [6], Baksan in Russia [7,8], and SarGrav in Italy [9,10], among others. These underground laboratories range in volume from a few hundred to thousands of cubic meters, with vertical rock cover thicknesses from a few hundred meters to over two thousand meters. The thick cover layer offers an ideal low-background environment for experimental research across various disciplines [11]. Some underground laboratories have conducted multi-geophysical field observations. For example, LSBB in France employed superconducting magnetometers to observe magnetic field signals, demonstrating the inherent advantages of deep underground environments in capturing weak electromagnetic signals [12]. Since 2020, collaborative observations of multi-geophysical fields have been conducted at the Huainan underground laboratory (HNLab) in China. These observations highlight the advantages and potential of the underground environment for sustained, stable geophysical field measurements [13][14][15][16][17][18][19]. However, geophysical field observations in HNLab are still in the early stages [16], and numerous challenges remain, including experimental method design, underground lab infrastructure renovation, and instrument research and development. Additionally, it is crucial to fully utilize observational data from these 'ultra-silent' and 'ultra-clean' environments to explore their potential in denoising traditional surface observation signals.
The magnetotelluric method (MT) is a geophysical method used to observe natural time-varying electromagnetic fields for probing subsurface geo-electrical structures. MT has been widely employed to investigate crust and mantle structures, subsurface deformations, and deep processes. However, weak natural EM field signals are susceptible to contamination by various anthropogenic noises, significantly limiting the practical effectiveness of MT. To obtain stable EM signals with high signal-to-noise ratios, improving MT technology has become a key focus [20,21]. A typical MT data processing algorithm aims to optimize superimposed partial transfer functions or power spectral density matrices across all frequency domain windows. This process involves identifying, eliminating, and weighting outliers in the time and/or frequency domains. However, the measured data often deviate from an ideal Gaussian distribution, rendering traditional least-squares methods less effective. Manual labeling of time slices or partial power spectral density matrices is a direct but time-consuming and inefficient method for enhancing superposition. Robust estimations based on various regression algorithms can mitigate the impact of outliers induced by random impulse noise [22][23][24][25]. However, achieving robust results is challenging when noise persists throughout the entire observation period. Another effective approach is the remote reference method, which suppresses uncorrelated noise between local and remote stations by utilizing an additional 'clean' remote station [26][27][28]. This method is commonly applied using magnetic channels, leveraging strong correlations among regional signals. In cases where array datasets are available, frequency domain principal component analysis is a reliable processing approach using Fourier coefficient vectors from multiple channels across multiple stations [29][30][31][32]. This method reduces the dimensionality of high-dimensional datasets to lower dimensions, enhancing processing efficiency. However, the multiple-channel algorithm requires specific data quality and quantity criteria to achieve optimal results. The aforementioned methods often involve a short-time Fourier transform. Additionally, MT data processing utilizes other transforms and digital signal processing algorithms, such as the wavelet transform [33], the Hilbert-Huang Transform [34], Empirical Mode Decomposition [35], compressive sensing technology [36], Wiener filtering [37], independent component analysis [38], and the blind source separation method [39], among others. These methods demonstrate effectiveness in improving processing outcomes when dealing with typically distributed noise, provided that specific parameters are appropriately defined. However, they can be challenging to apply effectively in scenarios involving complex and strong noise.
Processing initial time series data, including noise recognition, removal, and signal reconstruction, represents a direct and effective approach to enhance the signal-to-noise ratio. In recent years, owing to continuous advancements in computer technology, deep learning algorithms have provided unique advantages in signal analysis and data processing, becoming crucial tools in geoscience [40]. These algorithms have been successfully applied to optimize and process EM field data. Manoj and Nagarajan [41] utilized artificial neural networks to identify and remove noise segments from EM data. This method requires manual identification and labeling of features, such as data quality and signal correlation, in the training dataset. Li et al. [42] employed a convolutional neural network (CNN) to construct a nonlinear mapping model between noisy data and noise contours. Li et al. [43] proposed a denoising algorithm combining CNNs and long short-term memory (LSTM) recurrent neural networks. Initially, signal-to-noise separation is conducted using CNNs, and then the LSTM model is trained with denoised data from the CNNs to predict clean data from noisy signals. However, significant misfits may occur when continuous noise exists in time series or when the length of noisy data is too long. Han et al. [44] developed a noise suppression method using LSTM networks, and the reconstructed data were employed in 2D inversion modeling, effectively suppressing strong noise with typical morphological features. Additionally, other improved convolutional networks were proposed in the denoising application of EM data [45,46]. However, lower accuracy can result when noisy and clean datasets exhibit similar morphological features. Most typical deep learning-based noise suppression methods are designed for single-channel data. Zhang et al. [47] proposed a multi-channel approach using CNNs and applied the method in audio magnetotelluric data processing, although longer-period data remain ambiguous. The efficacy of supervised learning-based methods depends on available training datasets that encompass the principal signal and noise morphological features. Additionally, unsupervised data-driven deep learning techniques based on sparse coding algorithms are employed in the noise identification, separation, and suppression of EM data, such as K-SVD dictionary learning [48][49][50][51][52][53], improved shift-invariant sparse coding [54], and adaptive sparse representation [55], among others. However, appropriately defining specified coefficients is essential for these methods, often requiring multiple experiments to achieve optimal results. Furthermore, two key dataset conditions are necessary to ensure efficacy: a sufficiently high signal-to-noise ratio and strong regularity characteristics [48].
Robust training models play crucial roles in ensuring successful time series denoising outcomes. Generally, supervised learning algorithms use labeled training datasets to generalize training models [56], and having a rich and representative training dataset is crucial to ensure good generalization ability of the model in practical applications. Traditionally, typical EM noises, such as charge and discharge triangle waves, square waves, peak noise, and step noise, are superimposed on clean measured and/or theoretical data to construct the training datasets [42][43][44][46]. However, constructing sufficient sample libraries that encompass various complex noise features of practical data can be challenging. Although data-driven algorithms such as dictionary learning based on sparse coding [48][49][50][54][55] can be used to obtain atoms that match the target signal directly from the observed data, the applicability of such methods when faced with persistent and strong noise interference still requires further investigation.
In this study, we propose a signal reconstruction method based on LSTM neural networks using underground EM observation data. The LSTM model was utilized for its sensitivity to time and its robust ability to learn features of time series. The method eliminates the need to pre-construct a database comprising a substantial number of synthesized signals and noises. Instead, it utilizes clean and stable multi-channel data from underground observations as the model training set in a data-driven manner. This approach allows the method to fully learn regional EM field variations and characteristics. Subsequently, noise suppression or 'cleaning' is achieved through the process of signal reconstruction using noisy segments from synchronously observed ground data.

Recurrent Neural Network
Recurrent neural networks (RNNs) excel in processing time series by capturing correlations between data points that are close in the sequence [57]. Unlike traditional neural networks, RNNs maintain a 'state vector' in their hidden units that implicitly retains information about the history of all past elements in the sequence [58]. This capability allows RNNs to overcome the challenges related to backward and forward dependencies encountered by traditional neural networks.
Figure 1 illustrates a schematic diagram depicting the process of unfolding an RNN into a complete network. At time step t, the input is denoted by X_t, the output of the hidden layer neuron is denoted by S_t, and the output is denoted by O_t. The corresponding weight matrices are represented by U, V, and W. The calculation of S_t involves the input (X_t) at the current time and the output (S_t−1) of the previous hidden layer neuron (Equation (1)):

S_t = f(U X_t + W S_t−1 + b_s)    (1)

The activation function f for the hidden layer is typically either tanh or ReLU. The weight matrices are represented by U and W, and the bias vector is denoted by b_s.
Equation (2) illustrates the relationship between the neuron input S_t and the output O_t:

O_t = g(V S_t + b_o)    (2)

The activation function of the output layer, typically SoftMax, is denoted by g. The weight matrix is denoted by V, and the bias vector is denoted by b_o.
Recurrent neural networks exhibit a memory effect when processing time series data. This structure enables RNNs to store, remember, and process past complex signals over extended time periods [59].
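The recurrence of Equations (1) and (2) can be sketched in a few lines of NumPy. The dimensions, random weights, and the tanh/softmax choices below are illustrative placeholders, not values from this study:

```python
import numpy as np

def rnn_step(x_t, s_prev, U, W, V, b_s, b_o):
    """One unrolled RNN step: Equation (1) then Equation (2).

    All weights (U, W, V) and biases (b_s, b_o) are illustrative
    placeholders, not parameters from the paper."""
    s_t = np.tanh(U @ x_t + W @ s_prev + b_s)    # Eq. (1), tanh activation
    logits = V @ s_t + b_o
    o_t = np.exp(logits) / np.exp(logits).sum()  # Eq. (2), softmax output
    return s_t, o_t

# Toy dimensions: 2 input channels, 4 hidden units, 3 outputs
rng = np.random.default_rng(0)
U = rng.normal(size=(4, 2))
W = rng.normal(size=(4, 4))
V = rng.normal(size=(3, 4))
b_s, b_o = np.zeros(4), np.zeros(3)

s = np.zeros(4)                    # initial hidden state
for x in rng.normal(size=(5, 2)):  # five time steps of 2-channel input
    s, o = rnn_step(x, s, U, W, V, b_s, b_o)
```

Because the same U, W, and V are reused at every step, the hidden state s carries information forward through the whole sequence, which is the memory effect described above.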


Long Short-Term Memory Neural Network
An RNN can be considered as a deep feed-forward neural network in which all layers share the same weights. However, during the model training phase, issues such as exploding gradients or vanishing gradients can arise, making it challenging to learn long-term dependencies in the data [59,60]. LSTM networks are a variant of RNNs designed to overcome the limitations of traditional RNNs in capturing long-term dependencies within sequences and to address challenges such as exploding gradients and vanishing gradients [61,62]. LSTM has the capability to address a wide range of tasks, such as text recognition, natural language processing, image and video description, sentiment analysis, and computer vision. Additionally, it has demonstrated significant efficacy in processing time series data [63].
As depicted in Figure 2, LSTM incorporates cell states to retain long-term information. The cell state functions akin to a conveyor belt, running directly through the entire sequence with minimal linear interactions [64]. Controlling this cell state is pivotal to LSTM, and the concept of gates was introduced to manage this process. These gates regulate information flow within hidden neurons, enabling the preservation of extracted features from previous time steps [65].
The forget gate (f_t) determines the extent to which the previous cell state (C_t−1) at the previous time step is retained in the current cell state (C_t) at the current time step:

f_t = σ(W_f · [h_t−1, X_t] + b_f)    (3)

The input gate (i_t) determines the extent to which the current input (X_t) at the current time step is retained in the current cell state (C_t):

i_t = σ(W_i · [h_t−1, X_t] + b_i)    (4)

The candidate cell state (C′_t) describing the current input (X_t) is expressed as follows:

C′_t = tanh(W_C · [h_t−1, X_t] + b_C)    (5)

The current cell state (C_t) is then determined as follows:

C_t = f_t ⊙ C_t−1 + i_t ⊙ C′_t    (6)

The output gate (O_t) regulates the transfer of the cell state (C_t) to the current output (h_t):

O_t = σ(W_o · [h_t−1, X_t] + b_o)    (7)

Ultimately, the output of the LSTM (h_t) is determined by combining the output gate (O_t) with the cell state (C_t):

h_t = O_t ⊙ tanh(C_t)    (8)

In the above equations, W represents the weight matrices, b represents the bias vectors, and h_t−1 and h_t denote the hidden states of the previous and current time steps, respectively. The function σ represents the sigmoid function, and ⊙ denotes the Hadamard product. In summary, the LSTM recurrent neural network introduces the mechanism of cell states and 'gates'
and offers advantages in capturing long-term dependencies. Challenges such as vanishing gradients or exploding gradients in traditional RNNs are overcome in the LSTM, enabling the neural network to preserve information over extended durations [66]. The LSTM network features a weighted self-connection to the next time step, which allows the network to copy its own real-valued state and accumulate the external signal. Moreover, this self-connection is gated by another unit, which learns to determine when to clear the content of the memory [58]. The LSTM model is sensitive to time and can effectively learn features present in time series data. This capability proves advantageous in tasks like time series prediction and signal processing [61], making it suitable for operations such as EM signal reconstruction and noise suppression.
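The gate computations described above can be condensed into a single LSTM step. The sketch below uses NumPy with random placeholder weights; the hidden size, input size, and initialization are illustrative assumptions, not the configuration used in the paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step following the forget/input/output gate equations.

    W is a dict of weight matrices and b a dict of bias vectors (random
    placeholders here); [h_prev, x_t] denotes vector concatenation."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])      # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])      # input gate
    c_cand = np.tanh(W["c"] @ z + b["c"])   # candidate cell state
    c_t = f_t * c_prev + i_t * c_cand       # new cell state (Hadamard products)
    o_t = sigmoid(W["o"] @ z + b["o"])      # output gate
    h_t = o_t * np.tanh(c_t)                # hidden state / output
    return h_t, c_t

rng = np.random.default_rng(1)
n_in, n_hid = 2, 4  # 2 input channels, 4 hidden units (toy sizes)
W = {k: rng.normal(scale=0.5, size=(n_hid, n_hid + n_in)) for k in "fico"}
b = {k: np.zeros(n_hid) for k in "fico"}

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(10, n_in)):  # ten time steps of 2-channel input
    h, c = lstm_step(x, h, c, W, b)
```

The cell state c is updated only through elementwise scaling and addition, which is what lets gradients flow across many time steps without vanishing.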

Reconstruction of Electromagnetic Data
The EM signal reconstruction method based on underground data and an LSTM recurrent neural network involves two main phases: (1) Model training: utilizing underground EM data as the training dataset, the LSTM recurrent neural network is trained. (2) Signal reconstruction: the noisy sections of synchronously observed ground EM data are reconstructed to enhance the data quality. Figure 3 depicts a flowchart of this data reconstruction method, illustrating the underground environment and the station deployment with EM observation experiments conducted in HNLab. The LSTM network model consists of an input layer, one or multiple hidden layers, and an output layer. The input layer receives multi-channel EM time series, which are then processed by the hidden layers to perform nonlinear transformations and extract feature information. The output layer is connected to the hidden layers, and the final output is obtained through linear transformation by the fully connected layer. The hidden layers comprise two layers, with a hidden state dimension of 64.
It is necessary to preprocess the raw observation data through data normalization [67] to accelerate the gradient descent:

x̄_i = (x_i − x_min) / (x_max − x_min)    (9)

where x_i denotes the input data, x_max and x_min represent the maximum and minimum values of the input data, and x̄_i denotes the normalized data.
The loss function and the optimizer are two crucial components for training an LSTM recurrent neural network. The loss function calculates the misfit between the labeled and predicted values to assess the model's accuracy. The optimizer adjusts the model's parameters to minimize the loss function [68]. We employed the Adam optimization algorithm, which is a gradient-based, stochastic objective function optimization algorithm combining the advantages of the AdaGrad algorithm for sparse gradient processing and RMSprop for nonstationary target processing. The Adam algorithm is easy to implement, efficient, and requires minimal memory [69]. The Mean Square Error (MSE) loss function is defined as follows:

MSE = (1/n) Σ_{i=1}^{n} (x_i − y_i)²    (10)

where x represents the measured data, and y represents the output data. The MSE is a commonly used loss function in regression problems due to its differentiability and computational efficiency. During neural network training, model parameters are updated based on the loss function. The backpropagation algorithm calculates the gradient of the loss function with respect to the model parameters, facilitating their adjustment to minimize the loss function and achieve optimal model performance.
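The min-max normalization and MSE loss described above are straightforward to implement. The sketch below is a minimal NumPy version with toy values; the array contents are illustrative, not observation data:

```python
import numpy as np

def minmax_normalize(x):
    """Min-max normalization: maps the series into [0, 1] to
    accelerate gradient descent during training."""
    return (x - x.min()) / (x.max() - x.min())

def mse_loss(y_pred, y_true):
    """Mean Square Error between predicted and target data."""
    return np.mean((y_pred - y_true) ** 2)

x = np.array([2.0, 4.0, 6.0, 10.0])   # toy input series
x_norm = minmax_normalize(x)
print(x_norm)  # [0.   0.25 0.5  1.  ]
print(mse_loss(x_norm, np.array([0.0, 0.25, 0.5, 1.0])))  # 0.0
```

Note that the same x_min and x_max used for training must be stored and reused to normalize the ground data at reconstruction time, so that both datasets live on the same scale.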

Synthetic Experiments
Four types of simulated noise, including charge and discharge triangle waves, Gaussian noise, square waves, and peak noise, were randomly superimposed onto the synthetic time series with a sampling rate of 1000 Hz and a length of 10 s. Figure 4 illustrates the reconstruction results from the time series contaminated with these different types of noise. The reconstructed data effectively suppress the four typical types of EM noise, exhibiting a high degree of agreement between the reconstructed and original clean data in terms of waveform and amplitude.
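The four noise types can be synthesized as follows. This is a minimal sketch: the clean signal, noise amplitudes, and peak count below are illustrative assumptions and do not reproduce the SNR levels used in the paper's experiments:

```python
import numpy as np

fs, dur = 1000, 10                   # 1000 Hz sampling, 10 s window (as in the text)
t = np.arange(fs * dur) / fs
clean = np.sin(2 * np.pi * 0.5 * t)  # stand-in "clean" signal (not the paper's series)

rng = np.random.default_rng(42)
# 1 Hz charge/discharge triangle wave
triangle = 2 * np.abs(2 * (t % 1.0) - 1) - 1
# 1 Hz square wave
square = np.sign(np.sin(2 * np.pi * 1.0 * t))
# Gaussian noise
gauss = rng.normal(scale=0.2, size=t.size)
# Sparse peak (impulse) noise at 20 random samples
peaks = np.zeros_like(t)
peaks[rng.choice(t.size, 20, replace=False)] = 5.0

noisy = clean + triangle + square + gauss + peaks  # randomly combined noise
```

Superimposing the noise types at random, rather than in ordered, separate segments, is what makes the combined-noise test below harder than earlier CNN/RNN benchmarks.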

To further assess the robustness of the reconstruction technique, the four types of noise were randomly combined and added to the original time series. The signal-to-noise ratios were −5 dB, −15 dB, −13 dB, and 5 dB for the charge and discharge triangle waves, square waves, peak noise, and Gaussian noise, respectively. Both the triangle waves and square waves had a frequency of 1 Hz.
Figure 5 illustrates the data reconstruction results for the time series contaminated with this complex noise combination. The reconstructed data showed good agreement with the original clean data in terms of waveform shape and developmental trend, even under the influence of complex random noise. However, obvious differences in amplitude were present in the details, suggesting limitations of the approach, especially for high-frequency components, given the significant distortion characteristics associated with noise-containing data.
The mean absolute percentage error (MAPE, Equation (11)) and the symmetric mean absolute percentage error (SMAPE, Equation (12)) were employed as evaluation metrics to assess the denoising effect:

MAPE = (100%/n) Σ_{i=1}^{n} |(y_i − ȳ_i) / y_i|    (11)

SMAPE = (100%/n) Σ_{i=1}^{n} |y_i − ȳ_i| / ((|y_i| + |ȳ_i|)/2)    (12)

where n represents the number of samples, y denotes the original data, and ȳ indicates the reconstructed data. MAPE measures the mean percentage error between the original and reconstructed data, while SMAPE incorporates the relative error into the assessment. Lower values of both MAPE and SMAPE indicate better denoising performance.
The quantitative evaluation results for the reconstructed data depicted in Figures 4 and 5 are summarized in Table 1. The findings indicate that for the four typical single-noise datasets, the MAPE was approximately 2%, with the SMAPE being less than 1%. Additionally, for the randomly combined noise datasets, the MAPE was only 3.85%, with a SMAPE of 1.09%. These results indicate that the LSTM network model effectively suppresses noise components, whether dealing with single-type noise data or complex multi-noise interference, highlighting the method's efficacy and reliability. Similar synthetic experiments utilizing the four typical types of noise were conducted in previous studies using different CNN and/or RNN models [42][43][44][45]. The results of data contaminated by single noises showed good agreement between clean data and denoised data, as shown in our results in Figure 4. Additionally, Li et al. [42] and Li et al. [43] employed combined noise experiments, where different noises were added to clean data in an orderly manner and distributed separately. In Figure 5, we also designed a dataset with multiple types of noise, but the noise was randomly stacked, with various noises superimposed on each other. The more complicated noise combination in our dataset increased the difficulty of signal reconstruction to some extent, which led to relatively large curve misfits in Figure 5 and large MAPE and SMAPE values in Table 1.
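Both metrics are simple to compute. Note that SMAPE has several variants in the literature; the form below, which divides by the mean of the absolute values, is a common convention and an assumption here, and the toy arrays are illustrative:

```python
import numpy as np

def mape(y, y_hat):
    """Mean absolute percentage error (in %)."""
    return 100.0 * np.mean(np.abs((y - y_hat) / y))

def smape(y, y_hat):
    """Symmetric mean absolute percentage error (in %),
    using the common mean-of-absolute-values denominator."""
    return 100.0 * np.mean(np.abs(y - y_hat) / ((np.abs(y) + np.abs(y_hat)) / 2))

y = np.array([1.0, 2.0, 4.0])       # toy "original" data
y_hat = np.array([1.1, 1.8, 4.0])   # toy "reconstructed" data
print(round(mape(y, y_hat), 2))   # 6.67
print(round(smape(y, y_hat), 2))  # 6.68
```

MAPE is undefined where y contains zeros, which is one reason SMAPE is often reported alongside it for zero-crossing signals such as EM time series.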

Data
The EM observation data utilized for method validation were obtained from HNLab in China. HNLab was repurposed from a large closed coal mine and is situated over 100 km away from the TanLu fault zone in the east and approximately 200 km away from the junction of the South China Block and the North China Block in the south [16]. The underground environment at HNLab is characterized by an 870 m thick sedimentary cover, which effectively attenuates high-frequency anthropogenic EM noises. This attenuation facilitates high-quality, stable, and weak EM field observations underground. In contrast, ground-level observations are subject to complex and time-varying noise characteristics [19].
We utilized synchronous observations from underground point G (−848 m below sea level) and ground point Q (+22 m above sea level) (Figure 3) for model learning and signal reconstruction. Data were observed on 5 November 2022 at a sampling rate of 1000 Hz. Due to ongoing infrastructure renovations at HNLab, improvements in electrode deployments are still necessary, so only magnetic field data were used in this study. However, the approach is sensitive to time and exploits the strong correlations present between electrical and magnetic fields; thus, similar results should be obtainable for electrical fields observed under appropriate conditions. Both horizontal components, Hx (north-south) and Hy (east-west), were employed as multi-dimensional inputs for LSTM network model training. Subsequently, the noisy segments of the synchronous ground data were reconstructed using the trained LSTM network model. Larger datasets allow for the utilization of more hidden layers and neurons during neural network model training, reducing the likelihood of overfitting [70]. A sufficiently large and high-quality dataset is essential for a robust training model. We used nighttime data for processing and analysis to avoid the impact of human-induced noise during daytime hours.

Processing and Results
A total of 10 min of normalized magnetic field data were partitioned into training, validation, and test datasets, accounting for 80%, 10%, and 10% of the data, respectively. We employed a multivariate and multi-step sliding time window approach for model training. This method introduces a lag effect in trend perception, causing a phase shift in the reconstructed data. However, this phase shift can be easily corrected in the time domain by referencing the original data.
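The multivariate sliding-window construction can be sketched as follows. The window length, prediction horizon, and synthetic two-channel series below are illustrative assumptions; the paper does not state its exact window settings:

```python
import numpy as np

def make_windows(series, n_lags, horizon=1):
    """Multivariate sliding windows: each sample holds n_lags past time
    steps of every channel; the target is the sample 'horizon' steps ahead."""
    X, y = [], []
    for i in range(len(series) - n_lags - horizon + 1):
        X.append(series[i : i + n_lags])          # past window, all channels
        y.append(series[i + n_lags + horizon - 1])  # future target, all channels
    return np.array(X), np.array(y)

# Two magnetic channels (Hx, Hy) of 100 synthetic samples
series = np.random.default_rng(3).normal(size=(100, 2))
X, y = make_windows(series, n_lags=10)
print(X.shape, y.shape)  # (90, 10, 2) (90, 2)
```

Because each target sits n_lags samples past the start of its window, the reconstructed series is shifted relative to the input, which is the phase shift noted above; it can be removed by realigning the output against the original time axis.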
Figure 6 compares the original and reconstructed ground Hx time series spanning one minute, sampled at 1000 Hz.Additionally, the synchronous underground data used in the model training are also shown.Previous studies have indicated an approximate difference of one to two orders of magnitude in the high-frequency band (>1 Hz) between underground and ground magnetic field data [19].Thanks to the attenuation effects of the conductive cover layer, underground data are not contaminated with significant noise disturbances present at the ground level.The ground data near the 27 and 50 s marks showed obvious impulse noises that were absent in the underground data (Figure 6).The results exhibited good agreement in amplitudes between the reconstructed and observed ground data, with both sets showing roughly consistent waveforms.In addition, the noise in the observed data as significantly suppressed in the reconstructed signals, particularly regarding the impulse noise near 27 s and 50 s (Figure 6).Notably, the reconstructed data exhibited less high-frequency information compared to the observed data.Xie et al. 
[19] conducted a comparative analysis between original underground data and low-pass-filtered ground data, indicating that the attenuation of the cover layer at HNLab can be approximated as a low-pass filter with a cutoff frequency of 1 Hz. This finding explains the inconsistencies in local details between the original and reconstructed data (Figure 6). We conducted model training with underground data and performed data reconstruction with the original ground data. The difference in spectral content between the two datasets in the high-frequency band resulted in the absence of high-frequency information in the reconstructed ground signals. Additionally, it is notable that the amplitude of EM fields attenuates by a factor of 1/e after penetration through the sedimentary layer, meaning that high-frequency components are significantly weakened in amplitude but not entirely eliminated [19]. Therefore, high-frequency signals are present in both the ground and underground data in Figure 6.
We further applied a fifth-order Chebyshev Type I low-pass filter to the ground data. Figure 7 illustrates the reconstruction of the ground Hx component using 100 Hz low-pass-filtered data. Compared to the results using the original data (Figure 6), Figure 7 shows better agreement in amplitude and waveform between the filtered observation data and the corresponding reconstructions. The reconstructed data exhibited substantial suppression of the high-frequency noise present in the observation data, notably at two specific instances (around 27 s and 50 s). However,
nearly sinusoidal signals with an approximate frequency of 2 Hz were present in both the filtered ground data and the reconstructed data (Figure 7b). Previous studies have indicated the presence of periodic noise with a frequency of 2 Hz on the ground [19]. This suggests that the approach has limitations in handling cases where strong periodic noise is present in the input dataset. Digital signal processing techniques, such as notch filtering, are still required when the frequencies of certain noise sources are known a priori.
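The two filtering operations discussed here, a fifth-order Chebyshev Type I low-pass filter and a notch filter for noise of known frequency, can be reproduced with standard tools. The sketch below uses SciPy on a synthetic series; the passband ripple (0.5 dB) and notch quality factor (Q = 5) are assumed values, since the text does not state them.

```python
import numpy as np
from scipy import signal

fs = 1000.0                                    # sampling rate (Hz)

# 5th-order Chebyshev Type I low-pass, 100 Hz cutoff; the 0.5 dB
# passband ripple is an assumed parameter, not taken from the paper.
sos_lp = signal.cheby1(5, 0.5, 100, btype='low', fs=fs, output='sos')

# Notch filter for noise of known frequency (e.g. the 2 Hz periodic
# noise reported in [19]); the quality factor Q is assumed.
b_n, a_n = signal.iirnotch(2.0, Q=5.0, fs=fs)

t = np.arange(10 * int(fs)) / fs
clean = np.sin(2 * np.pi * 0.5 * t)            # 0.5 Hz "signal"
noisy = (clean
         + 0.5 * np.sin(2 * np.pi * 2 * t)     # 2 Hz periodic noise
         + 0.3 * np.sin(2 * np.pi * 200 * t))  # high-frequency noise

low = signal.sosfiltfilt(sos_lp, noisy)        # removes the 200 Hz tone
out = signal.filtfilt(b_n, a_n, low)           # suppresses the 2 Hz tone
```

Zero-phase (forward-backward) filtering is used so the filters do not introduce an additional phase shift into the series.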
Xie et al. [19] suggest that ground and underground data are roughly at the same level in the low-frequency band of <1 Hz. Figure 8 depicts the reconstruction results using the 1 Hz low-pass-filtered ground data. The results indicated good agreement in both amplitude and waveform between the filtered ground data and the corresponding reconstructed data. Additionally, the noise around 27 s was significantly suppressed in the reconstructed data.
Figure 9 illustrates comparisons of the power spectral density (PSD) between the ground data and the corresponding reconstructed data. Both the original and low-pass-filtered data were used in the processing. The reconstruction based on the LSTM model trained on the underground data showed that spectral components in the high-frequency band were absent or significantly attenuated in the reconstructed results. This was clearly indicated in the PSD of the original data, where the reconstructed signals exhibited lower amplitudes than the observed data in the high-frequency band of >10 Hz (Figure 9a). Similar results are suggested in Figure 9b, where both PSDs in the frequency band of >100 Hz have a low amplitude of less than −100 dB due to the 100 Hz low-pass filtering.
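A Welch-style PSD comparison of the kind shown in Figure 9 can be sketched as follows. The synthetic "ground" and "reconstructed" series, window length, and noise levels are illustrative assumptions chosen to mimic the reported behavior: agreement at low frequency and attenuation of the reconstruction above roughly 10 Hz.

```python
import numpy as np
from scipy import signal

fs = 1000.0
t = np.arange(60 * int(fs)) / fs

# Stand-ins: "ground" data with broadband noise, and a "reconstruction"
# that keeps the low-frequency content but little high-frequency energy.
rng = np.random.default_rng(1)
ground = np.sin(2 * np.pi * 0.2 * t) + 0.2 * rng.standard_normal(t.size)
recon = np.sin(2 * np.pi * 0.2 * t) + 0.002 * rng.standard_normal(t.size)

# Welch PSD: ~4 s Hann segments, 50% overlap, averaged over segments.
f, p_ground = signal.welch(ground, fs=fs, nperseg=4096)
f, p_recon = signal.welch(recon, fs=fs, nperseg=4096)

psd_ground_db = 10 * np.log10(p_ground)
psd_recon_db = 10 * np.log10(p_recon)
```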
For the 1 Hz filtered data, the coincident reconstructed and filtered data in the frequency band of >10 Hz were lower than −100 dB, while the reconstructed data showed slightly higher amplitudes in the frequency band of 1-10 Hz. Although the conductive cover at HNLab is considered a low-pass filter with a cutoff frequency of 1 Hz, this natural low-pass filter is an approximation based on the definition of skin depth [19]. Therefore, in the stopband (>1 Hz), the reconstructed data showed slightly higher amplitudes compared to data filtered through a digital low-pass filter defined with a stronger attenuation coefficient. Focusing on the low-frequency band, all three cases of PSD indicated good agreement between the observed data and the reconstructed data. These results suggest that the reconstructed ground data effectively retained the essential spectral information in the low-frequency band.
The PSDs illustrate stacked results derived from different windows. We further calculated the power spectral probability density functions (PDFs) [71] to check variations in the power spectral density during the entire period. Figure 10 presents comparisons of the PDFs between the ground data and the corresponding reconstructed data. In the frequency band of <100 Hz, the results derived from the original and 100 Hz low-pass-filtered data were similar, with the reconstructed data above 10 Hz presenting lower PDF values (Figure 10a,b). In contrast, both the 1 Hz filtered data and the reconstructed data exhibited PDF amplitudes less than −50 dB in the stopband (>1 Hz). In the lower frequency band, the PDFs of the reconstructed data were in agreement with the original and/or filtered ground data (Figure 10c). Additionally, the PDFs of the reconstructed data presented more focused distributions in the low-frequency band, indicating that random strong noise is effectively suppressed using the LSTM model. Both the PSD and the PDF results demonstrate that our signal reconstruction method is robust in noise suppression while preserving the effective components of the original data. However, a limitation of this approach is the absence of high-frequency information in the reconstructed data, caused by the LSTM model being trained with the attenuated underground data.
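The power spectral PDFs [71] can be computed by histogramming per-window PSD estimates (in dB) at each frequency bin. A sketch on synthetic data, with assumed window and bin parameters:

```python
import numpy as np
from scipy import signal

def spectral_pdf(x, fs, nperseg, db_edges):
    """Probability density of per-window PSD estimates (in dB) at each
    frequency bin, in the spirit of the PDF spectra of reference [71]."""
    f, t_seg, S = signal.spectrogram(x, fs=fs, nperseg=nperseg,
                                     noverlap=nperseg // 2)
    S_db = 10 * np.log10(S + 1e-30)            # shape: (n_freq, n_windows)
    pdf = np.stack([np.histogram(row, bins=db_edges, density=True)[0]
                    for row in S_db])
    return f, pdf                               # pdf shape: (n_freq, n_bins)

fs = 1000.0
rng = np.random.default_rng(2)
n = 120 * int(fs)                               # 2 min of synthetic data
x = (np.sin(2 * np.pi * 5 * np.arange(n) / fs)
     + 0.1 * rng.standard_normal(n))
edges = np.arange(-300.0, 21.0, 1.0)            # 1 dB bins
f, pdf = spectral_pdf(x, fs, nperseg=4096, db_edges=edges)
```

A concentrated column in the resulting PDF indicates a stable spectral level over time, which is how the "more focused distributions" of the reconstructed data show up in Figure 10.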
Figure 11 illustrates the comparisons derived from a one-dimensional continuous wavelet transform between the ground data and the corresponding reconstructed data using a Morse wavelet. Both the original and reconstructed data were employed for analysis. Notably, both the original and filtered data exhibited obvious interference noise around 2:19:10, which was significantly suppressed in the reconstructed data. Consistent with the results of the PSDs and PDFs (Figures 9 and 10), the wavelet spectra of the reconstructed signals aligned with the original and/or filtered data in a certain lower frequency band, demonstrating more stable and noise-free features in the reconstructed data. Furthermore, Figure 11 highlights that due to inherent differences in frequency components between the model training data (underground data) and the target processing data (ground data), there was a certain loss of high-frequency components in the reconstructed ground data. This consideration is important for practical applications.
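The text specifies a Morse wavelet for Figure 11; as a self-contained illustration we use a complex Morlet wavelet instead (an assumption), implemented directly in NumPy. The centre-frequency parameter w0, the truncation, and the normalization are all illustrative choices, so amplitudes will differ from a Morse-wavelet analysis, but the time-frequency ridge behaves the same way.

```python
import numpy as np

def morlet_cwt(x, fs, freqs, w0=6.0):
    """Naive continuous wavelet transform with a complex Morlet wavelet.
    Returns coefficients of shape (len(freqs), len(x))."""
    n = x.size
    out = np.empty((len(freqs), n), dtype=complex)
    for k, fc in enumerate(freqs):
        s = w0 * fs / (2 * np.pi * fc)          # scale giving centre freq fc
        half = int(min(4 * s, (n - 1) // 2))    # truncated wavelet support
        tt = np.arange(-half, half + 1) / s
        psi = np.exp(1j * w0 * tt) * np.exp(-tt ** 2 / 2)
        psi /= np.sqrt(s)                       # crude scale normalization
        out[k] = np.convolve(x, np.conj(psi[::-1]), mode='same')
    return out

fs = 100.0
t = np.arange(int(10 * fs)) / fs
x = np.sin(2 * np.pi * 3.0 * t)                 # 3 Hz test tone
freqs = np.array([1.0, 2.0, 3.0, 5.0, 8.0])
power = np.abs(morlet_cwt(x, fs, freqs)) ** 2   # scalogram
```

The scalogram's energy concentrates along the row whose centre frequency matches the tone, which is the kind of ridge comparison underlying Figure 11.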

Discussion and Conclusions
This study presents an EM data reconstruction method utilizing LSTM recurrent neural networks trained on data observed in deep underground environments known for their low-background noise advantages. The LSTM network model is specifically tailored to capture the complex coupling relationships and dynamic characteristics inherent in multi-channel EM field components. The method focuses on reconstructing strong noise segments synchronously observed at noisy stations, particularly typical ground stations. Key characteristics of this approach include the following: (1) Typical MT noise suppression methods utilizing CNNs or RNNs are primarily supervised learning algorithms. The effectiveness and robustness of both model training and denoising processing hinge on the availability of diverse and extensive training datasets encompassing both signals and noise. Training datasets often include synthetic data representing typical types of noise and/or high-quality measured signals, aiding in noise identification, separation, and reconstruction for specific observed signals [42][43][44]. However, EM field signals exhibit strong spatial and weak temporal correlations and are susceptible to contamination from random and complex noise sources. As a result, signal and noise patterns derived from synthetic data may not fully encapsulate the diverse characteristics of observed data, limiting the applicability and reliability of these methods. The inherent complexities and variability of real-world EM field data necessitate novel approaches that can effectively adapt and generalize to diverse noise conditions and signal characteristics.
In this study, both the training dataset and the processing dataset were synchronously observed, indicating strong inherent correlations and coupling between the two datasets within a specific spatial range, particularly with respect to magnetic field components. Unlike approaches that have focused solely on characterizing the morphological contours of noise, our method leverages the robust spatial correlation of EM fields and the high signal-to-noise ratio of underground data. It utilizes the intrinsic characteristics of synchronous observations and the coupling between different EM components obtained through deep learning to achieve data reconstruction. Importantly, this process does not require prior knowledge of noise types, and it reduces the necessary sample size for the training dataset compared to traditional methods, thereby offering advantages in processing efficiency. Moreover, traditional methods typically involve independent training and processing for specific single components. In contrast, our multidimensional model reconstruction approach not only comprehensively learns the spatiotemporal characteristics of individual components but also establishes a robust network mapping based on the couplings among different EM components. This holistic approach enables more effective and efficient data reconstruction while capturing complex interdependencies among various EM field parameters.
(2) The methodology employed in this study shares formal similarities with the remote reference method, as both approaches utilize reference station data observed in low-EM-noise environments and leverage the strong spatial correlation exhibited by EM fields in regional areas. However, there are fundamental distinctions between them. Firstly, traditional remote reference methods, such as those proposed by Gamble et al. [27] and Clarke et al. [28], along with variants such as the remote reference with 'magnetic control' (RRMT) introduced by Varentsov [72,73], operate in the frequency domain. These methods capitalize on the lack of noise correlation between remote and local channels to suppress noise levels in the power spectral density matrix. In contrast, the method presented in this study utilizes deep learning technology to establish network mapping for time series reconstructions of noise-affected data segments in the time domain. Secondly, traditional remote reference methods focus on signal correlation between local and remote stations, calculating correlations independently for each component. In contrast, our method considers correlations among all channels from both local and remote stations. Varentsov [72] introduces the horizontal magnetic tensor into the traditional remote reference method, which also incorporates constraints arising from the correlation between different magnetic field components.
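The signal correlation between local and remote channels that remote-reference processing relies on can be quantified per frequency with magnitude-squared coherence. A toy sketch on synthetic data; the shared "regional" signal and the local noise level are assumptions, chosen so that the expected coherence is S/(S+N) = 1/(1+4) = 0.2.

```python
import numpy as np
from scipy import signal

fs = 1000.0
rng = np.random.default_rng(3)
n = 60 * int(fs)
common = rng.standard_normal(n)        # shared regional EM field (stand-in)

remote = common                        # quiet underground reference channel
local = common + 2.0 * rng.standard_normal(n)   # ground channel + local noise

# Magnitude-squared coherence between the two channels, per frequency.
f, coh = signal.coherence(local, remote, fs=fs, nperseg=4096)
```

High coherence marks frequency bands where the reference channel carries usable information about the local signal; uncorrelated local noise drives coherence down.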
(3) Deep learning has found widespread application in the analysis of seismic data [74,75], gravity data [76], and various other geophysical field observations. These applications have demonstrated highly effective processing and denoising capabilities for time series data. The LSTM network model training and signal reconstruction method proposed in this study, which is based on underground EM observation data, can be extended to encompass other deep underground observation experiments involving multiple geophysical fields. This extension would entail adapting the model training and processing for individual geophysical fields or multiple coupled geophysical fields, thereby serving as a preprocessing tool to capture weak signals and address scientific challenges. It is important to recognize that time series data from different geophysical field observations exhibit diverse patterns and are subject to varying types of noise. Consequently, adjustments to model parameters will be necessary when applying this method to process other multi-geophysical field data. Each field may require tailored approaches to optimize performance and achieve accurate signal reconstruction and denoising effects.
(4) The complex noise in the reconstructed data is effectively suppressed with the LSTM network model, benefiting from the attenuation of high-frequency interference noise during penetration through the conductive cover layer. However, this attenuation also leads to differences in frequency components between the training data and the target reconstructed data, resulting in an amplitude loss of high-frequency information in the reconstructed signal. Therefore, when the processing targets high-frequency information, the method is no longer directly applicable, and further amplitude correction of the effective signal becomes necessary.
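One possible form of the amplitude correction mentioned here (a sketch of ours, not a procedure taken from the paper) is to estimate a frequency-dependent gain from the PSD ratio between observed and reconstructed data on a quiet calibration segment, then re-scale the reconstruction's spectrum. All parameters and the synthetic example below are illustrative assumptions.

```python
import numpy as np
from scipy import signal

def spectral_gain_correction(recon, observed_quiet, recon_quiet, fs,
                             nperseg=4096, max_gain=100.0):
    """Re-scale the reconstruction's amplitude spectrum by the square
    root of the PSD ratio estimated on a quiet calibration segment.
    Rough sketch only: real data would need gain smoothing and care
    near the noise floor (hence the max_gain cap)."""
    f, p_obs = signal.welch(observed_quiet, fs=fs, nperseg=nperseg)
    f, p_rec = signal.welch(recon_quiet, fs=fs, nperseg=nperseg)
    gain = np.sqrt(np.minimum(p_obs / (p_rec + 1e-30), max_gain ** 2))
    X = np.fft.rfft(recon)
    fx = np.fft.rfftfreq(recon.size, d=1.0 / fs)
    X *= np.interp(fx, f, gain)                 # amplitude-only correction
    return np.fft.irfft(X, n=recon.size)

fs = 1000.0
t = np.arange(int(30 * fs)) / fs
observed_quiet = np.sin(2 * np.pi * 50 * t)     # "true" amplitude 1.0
recon_quiet = 0.1 * observed_quiet              # reconstruction lost 20 dB
recon = 0.1 * np.sin(2 * np.pi * 50 * t)
corrected = spectral_gain_correction(recon, observed_quiet, recon_quiet, fs)
```

Because only amplitudes are corrected, phase information from the reconstruction is preserved; the approach presumes the quiet segment is representative of the attenuation, which is itself an assumption.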
(5) It is notable that transfer functions and/or sounding curves are not available due to the absence of robust electric field data in this study. Therefore, further observations, supported by appropriate infrastructure that enables stable electric field recordings in underground environments, are required to fully demonstrate the robustness of this approach.

Figure 1.
Figure 1. Recurrent neural networks (RNNs) and unfolding in time. The input is denoted by Xt, the output of the hidden layer neuron is denoted by St, and the output is denoted by Ot. The corresponding weight matrices are represented by U, V, and W.

Figure 2 .
Figure 2. Structure of Long Short-Term Memory unit.

Figure 3 .
Figure 3. Flowchart of electromagnetic data reconstruction using LSTM neural network based on underground observations.


Figure 4.
Figure 4. Data reconstruction from clean time series superimposed with different types of noise: (a) charge and discharge triangle waves; (b) square waves; (c) Gaussian noise; and (d) peak noise. Schematic diagram of different noise types. Red lines represent the noisy data, blue lines represent the clean data, and black lines represent the reconstructed data. SNR: signal-to-noise ratio.

Figure 5.
Figure 5. Data reconstruction from time series superimposed with four types of noise. Red lines represent the noisy data, blue lines represent the clean data, and black lines represent the reconstructed data. MAPE: mean absolute percentage error; SMAPE: symmetric mean absolute percentage error.

Figure 6.
Figure 6. Reconstruction of ground Hx component using original data. (a) One-minute data; (b) five-second data. Light orange lines represent the original ground data, orange lines represent the reconstructed data, and black lines represent the underground data.

Figure 7.
Figure 7. Reconstruction of ground Hx component using 100 Hz low-pass-filtered data. (a) One-minute data; (b) five-second data. Light orange lines represent the ground data, and orange lines represent the reconstructed data.

Figure 8.
Figure 8. Reconstruction of ground Hx component using 1 Hz low-pass-filtered data. Light orange lines represent the ground data, and orange lines represent the reconstructed data.

Figure 9.
Figure 9. Power spectral densities of ground and reconstructed data. The ground data used are the original data (a), the 100 Hz low-pass-filtered data (b), and the 1 Hz low-pass-filtered data (c).

Figure 10.
Figure 10. Probability density functions of ground and reconstructed data. The ground data used are the original data (a), the 100 Hz low-pass-filtered data (b), and the 1 Hz low-pass-filtered data (c).

Figure 11.
Figure 11. Comparisons of one-dimensional continuous wavelet transforms (Morse wavelet) between the ground data and the corresponding reconstructed data.

Table 1 .
Evaluation metrics for different noise suppression effects.
