1. Introduction
Power transformers are critical components of power grids, and their operational condition directly impacts the safety and stability of the entire power system. With the increasing adoption of high-voltage direct current (HVDC) transmission and the effects of geomagnetic disturbances, the potential at the transformer neutral point changes, resulting in DC bias in the transformer core [1,2]. The DC bias shifts the static magnetic operating point of the core, thereby driving the magnetic flux into saturation in one half-cycle. Consequently, vibration and noise levels increase, and in severe cases the operation of the equipment is put at risk [3]. Existing studies show that transformer core vibration originates mainly from the magnetostriction of silicon steel sheets, that is, the slight dimensional change of a ferromagnetic material during magnetization [4]. It has been established that the magnetostriction effect amplifies vibration in large motor cores by more than 17% [5]. Therefore, an in-depth investigation into the magnetostriction properties of silicon steel sheets under DC bias conditions is of great significance for understanding the mechanisms behind transformer vibration and noise.
The magnetostriction phenomenon has been widely investigated. The classic Jiles–Atherton model provides a theoretical basis for the hysteresis characteristics of ferromagnetic materials [6]. Magnetostriction varies significantly with the magnetization method. For example, the magnitude of magnetostriction under rotating magnetization is generally greater than that under alternating magnetization [7]. Moreover, harmonic components significantly influence the spectral structure of magnetostriction [8], and local structural features (such as T-joints) can also lead to a notable increase in strain [9]. Under DC bias conditions, previous studies have demonstrated that the bias distorts the symmetry of the magnetostriction curve and considerably alters its amplitude and spectral characteristics [10,11,12]. Based on a hysteresis model, Yan et al. analyzed the effects of DC bias on magnetostriction properties and found that such bias worsens waveform distortion while intensifying nonlinearity [13]. Furthermore, from a magneto-acoustic coupling perspective, Liu et al. examined transformer vibration and noise characteristics, revealing that operating conditions substantially affect the vibration response [14].
The precise identification of the DC bias level from the magnetostriction waveform remains a challenge. Wang et al. employed a backpropagation neural network (BPNN) to model the dynamic hysteresis curves derived from magnetic field amplitude and phase [15]. Guo et al. proposed a wavelet convolutional neural network architecture for DC bias prediction, utilizing features extracted from the wavelet transform of vibration signals [16]. However, harmonic amplitudes can be significantly affected by measurement noise and spectral leakage, which limits the reliability of identification in non-ideal conditions. Therefore, both the Fourier and wavelet transform methods are limited by their reliance on accurate amplitude estimation and may fail to capture the interdependence within time sequences. In our previous study [17], a physics-informed neural network (PINN) was proposed to calculate the exact magnetostriction value for a given magnetic field excitation. Nevertheless, the effectiveness of the PINN is constrained by its reliance on accurate underlying physical models.
To address these limitations, this study proposes a DC bias identification framework that integrates time-domain statistical features with frequency-domain physical features. In the time domain, multiscale mutual information features are extracted directly from the magnetostriction series [18,19,20]. These features capture the nonlinear statistical dependencies induced by DC bias without requiring transformation to the frequency domain or prior knowledge of the physical model. In the frequency domain, a feature set consisting of a small number of key harmonic amplitudes is also provided to enhance discriminative capability. Furthermore, long short-term memory (LSTM) networks have exhibited strong performance in modeling sequential data [21]. Previous studies have demonstrated the potential of deep learning in addressing DC bias identification [22]. However, most existing machine learning approaches are predominantly data-driven, which limits insight into the underlying physical mechanisms of DC bias. To bridge this gap, this study explores the integration of physical mechanism analysis with data-driven methods.
The remainder of this paper is organized as follows. Section 2 introduces the underlying magnetostriction theory, the feature extraction methodology, and the proposed DC bias identification model. Section 3 describes the experimental system, which is designed with six voltage levels (2.0–4.0 V) and seven DC bias ratios (0–30%) to collect comprehensive magnetostriction strain datasets. Section 4 presents an analysis of the magnetostriction response under DC bias, examining both time-domain and frequency-domain characteristics. Section 5 evaluates the prediction performance of an LSTM model that combines multiscale mutual information and frequency-domain features. Finally, the conclusions are summarized in Section 6.
2. Background and Methodology
2.1. Magnetostriction Mechanism
For ferromagnetic materials, such as grain-oriented silicon steel sheets, the magnetostriction phenomenon refers to the dimensional changes that accompany the magnetization process of a material. The magnetostriction strain λ is defined as the fractional change in length along the direction of magnetization:

$$\lambda = \frac{\Delta l}{l} \tag{1}$$

where Δl represents the length change, and l is the static length of the ferromagnetic material. The unit of λ is μm/m.
For silicon steel sheets, there is a nonlinear relationship between the magnetostriction strain λ and the magnetization M (or magnetic flux density B). Based on the ferromagnetic theory of Jiles [6], the magnetostriction strain λ exhibits a nonlinear dependence on the applied magnetic field strength H, which can be expressed as an even-order polynomial of H:

$$\lambda(H) = \alpha_1 H^2 + \alpha_2 H^4 \tag{2}$$

where $\alpha_1$ and $\alpha_2$ are material-dependent magnetostriction coefficients. Consequently, under a sinusoidal magnetic field at 50 Hz, the fundamental frequency of the magnetostriction response (100 Hz) is twice the excitation frequency.
2.2. Nonlinear Modeling Under DC Bias
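When a DC bias is superimposed on an AC excitation, the magnetic field strength acting on the silicon steel sheet can be expressed as:

$$H(t) = H_{dc} + H_{ac}\cos(\omega t) \tag{3}$$

where $H_{dc}$ is the DC bias field, $H_{ac}$ is the amplitude of the AC field, ω = 2πf, and f = 50 Hz. In this study, we focus on the relationship between λ and H. To investigate the effect of the DC component $H_{dc}$ on the total magnetostriction, we expand λ(H) as a Taylor series around the DC bias $H_{dc}$, writing $h = H - H_{dc}$ for the AC deviation:

$$\lambda(H_{dc} + h) = \lambda(H_{dc}) + \sum_{n=1}^{\infty} k_n h^n, \qquad k_n = \frac{1}{n!}\left.\frac{d^n \lambda}{dH^n}\right|_{H = H_{dc}} \tag{4}$$

Substituting $h(t) = H_{ac}\cos(\omega t)$ into Equation (4):

$$\lambda(t) = \lambda(H_{dc}) + \sum_{n=1}^{\infty} k_n H_{ac}^n \cos^n(\omega t) \tag{5}$$

where the coefficient $k_n$ is the n-th order nonlinear coefficient at the DC bias $H_{dc}$, which varies with $H_{dc}$ and reflects the modulation of the material’s nonlinear characteristics by the DC bias.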
Using trigonometric identities, $h^n(t)$ can be expanded as a combination of harmonics. The linear term $k_1 h(t)$ produces the 50 Hz component. The quadratic term $k_2 h^2(t)$ produces the DC and 100 Hz components. The cubic term $k_3 h^3(t)$ produces 50 Hz and 150 Hz components. The quartic term $k_4 h^4(t)$ produces the DC, 100 Hz, and 200 Hz components, and so on. Neglecting minor higher-order and cross terms, the amplitudes $A_1$, $A_2$, and $A_3$ of the 50 Hz, 100 Hz, and 150 Hz components are given by:

$$A_1 \approx k_1 H_{ac} + \frac{3}{4} k_3 H_{ac}^3 \tag{6}$$

$$A_2 \approx \frac{1}{2} k_2 H_{ac}^2 + \frac{1}{2} k_4 H_{ac}^4 \tag{7}$$

$$A_3 \approx \frac{1}{4} k_3 H_{ac}^3 + \frac{5}{16} k_5 H_{ac}^5 \tag{8}$$

Equations (6)–(8) show that the second harmonic (100 Hz) is primarily contributed by the even-order nonlinear coefficients ($k_2$, $k_4$, …), and the third harmonic (150 Hz) is primarily contributed by the odd-order nonlinear coefficients ($k_3$, $k_5$, …). In the following cases, the harmonic variations reflect how the coefficient $k_n$ varies with $H_{dc}$:
- (1) Monotonic decrease in the 100 Hz amplitude: As shown in Equation (7), $A_2 \propto k_2$ (neglecting higher-order terms). The experiment will show that $A_2$ decreases monotonically as the DC bias increases, indicating that $k_2$ decreases monotonically as $H_{dc}$ increases. Physically, $k_2$ characterizes the curvature of the λ-H curve at $H_{dc}$. In symmetric magnetostriction curves (such as the butterfly curve), the curvature is greatest near the origin, so shifting the operating point away from the origin reduces $k_2$.
- (2) 150 Hz amplitude first increases then decreases: From Equation (8), $A_3 \propto k_3$. The experimental results will show that $A_3$ peaks at a 10–15% bias, indicating that $k_3$ first increases and then decreases with $H_{dc}$, exhibiting a maximum value. $k_3$ characterizes the degree of asymmetry of the λ-H curve.
- (3) Non-monotonic variation in the 50 Hz amplitude: In Equation (6), $A_1$ is jointly influenced by $k_1$ and $k_3$. $k_1$ represents the slope of the λ-H curve, which typically increases with rising $H_{dc}$. However, the peak value of $k_3$ affects the fundamental component through interaction terms, causing $A_1$ to exhibit a complex behavior of first decreasing and then increasing under high voltage.
- (4) Complex evolution of higher-order harmonics: This evolution is primarily determined by the coefficients $k_4$, $k_5$, and so on. These coefficients are sensitive to $H_{dc}$ and may exhibit non-monotonic behavior, leading to complex “shaping” effects in the spectrum, such as harmonics in the 200–500 Hz range peaking near a 20% bias.
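The harmonic bookkeeping above can be checked numerically. The sketch below is illustrative only: it truncates the expansion at third order, uses made-up values for $k_1$, $k_2$, $k_3$, and $H_{ac}$, and measures each harmonic with a single-bin discrete Fourier sum over an integer number of periods. It then compares the measured amplitudes with the closed-form leading terms obtained from the identities $\cos^2\theta = (1 + \cos 2\theta)/2$ and $\cos^3\theta = (3\cos\theta + \cos 3\theta)/4$.

```python
import math

def harmonic_amplitude(signal, fs, freq):
    """Amplitude of the `freq` component via a single-bin discrete Fourier sum."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq * k / fs) for k, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * k / fs) for k, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

# Hypothetical coefficients and AC amplitude (illustrative values, not measured data)
k1, k2, k3 = 0.5, 1.0, 0.2
h_ac = 1.5

fs, f0 = 10_000, 50          # sampling rate and excitation frequency in Hz
n_samples = fs // 5          # 0.2 s, i.e. exactly 10 excitation periods

h = [h_ac * math.cos(2 * math.pi * f0 * i / fs) for i in range(n_samples)]
lam = [k1 * x + k2 * x**2 + k3 * x**3 for x in h]   # expansion truncated at n = 3

a1 = harmonic_amplitude(lam, fs, 50)
a2 = harmonic_amplitude(lam, fs, 100)
a3 = harmonic_amplitude(lam, fs, 150)

# Closed-form leading terms for the truncated expansion
a1_expected = k1 * h_ac + 0.75 * k3 * h_ac**3
a2_expected = 0.5 * k2 * h_ac**2
a3_expected = 0.25 * k3 * h_ac**3

print(a1, a1_expected)
print(a2, a2_expected)
print(a3, a3_expected)
```

With the expansion truncated at third order, the 100 Hz amplitude depends only on $k_2$ and the 150 Hz amplitude only on $k_3$, which is the separation that the four cases rely on.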
The time-domain waveform distortion can also be explained by the above mechanism. Two indicators are used to describe it. The waveform asymmetry factor η is computed from λ+ and λ−, which represent the absolute values of the positive and negative peaks, respectively. A larger η value indicates poorer waveform symmetry. The crest factor CF is the ratio of the peak value to the root mean square value:

$$CF = \frac{\lambda_{peak}}{\lambda_{rms}}$$

where $\lambda_{peak}$ is the peak value, and $\lambda_{rms}$ is the root mean square value. For a standard sine wave, CF ≈ 1.414. The greater the deviation from this value, the more severe the waveform distortion. Under DC bias, the changes in $k_2$ alter the amplitude and phase of the 100 Hz component, resulting in waveform asymmetry. Additionally, the peak-to-peak value reflects the maximum dynamic amplitude of the magnetostriction strain, defined by the following formula:

$$\lambda_{pp} = \lambda_{max} - \lambda_{min}$$

where $\lambda_{max}$ and $\lambda_{min}$ are the maximum and minimum values of the strain waveform.
In the frequency domain, to quantify the overall effect of DC bias on the degree of nonlinearity, the total harmonic distortion (THD) is defined as:

$$THD = \frac{\sqrt{\sum_{n \ge 2} A_{2n}^2}}{A_2}$$

where $A_2$ is the amplitude of the 100 Hz component, and $A_{2n}$ represents the amplitude at 100n Hz. In this study, the dominant frequency of the magnetostriction under pure AC excitation is 100 Hz. Therefore, 100 Hz is used as the “fundamental frequency”.
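The crest factor, peak-to-peak value, and THD referenced to 100 Hz can be computed along the following lines. This is a minimal sketch under stated assumptions: the function names are hypothetical, the harmonic sum is truncated at 500 Hz (`n_max = 5`, chosen here only because the later analysis discusses harmonics up to 500 Hz), and the test waveform is synthetic rather than measured.

```python
import math

def crest_factor(x):
    """Peak absolute value divided by the RMS value."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return max(abs(v) for v in x) / rms

def peak_to_peak(x):
    """Maximum minus minimum of the waveform."""
    return max(x) - min(x)

def amplitude_at(x, fs, freq):
    """Amplitude of one spectral component via a single-bin Fourier sum."""
    n = len(x)
    re = sum(v * math.cos(2 * math.pi * freq * k / fs) for k, v in enumerate(x))
    im = sum(v * math.sin(2 * math.pi * freq * k / fs) for k, v in enumerate(x))
    return 2.0 * math.hypot(re, im) / n

def thd_100hz(x, fs, n_max=5):
    """THD with 100 Hz as the reference; harmonics at 200, 300, ..., 100*n_max Hz.

    n_max is an assumed truncation, not a value taken from the paper.
    """
    fundamental = amplitude_at(x, fs, 100)
    harmonics = [amplitude_at(x, fs, 100 * n) for n in range(2, n_max + 1)]
    return math.sqrt(sum(a * a for a in harmonics)) / fundamental

fs = 10_000
t = [k / fs for k in range(1000)]   # 0.1 s, an integer number of 100 Hz periods

pure = [math.cos(2 * math.pi * 100 * s) for s in t]
distorted = [math.cos(2 * math.pi * 100 * s) + 0.1 * math.cos(2 * math.pi * 200 * s)
             for s in t]

cf_pure = crest_factor(pure)
pp_pure = peak_to_peak(pure)
thd_distorted = thd_100hz(distorted, fs)
print(cf_pure, pp_pure, thd_distorted)
```

For the pure cosine the crest factor equals √2, and adding a 200 Hz component with one tenth of the 100 Hz amplitude gives a THD of 0.1.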
2.3. Feature Extraction
Given the highly nonlinear characteristics of the magnetostriction strains from silicon steel sheets, this paper employs a feature fusion method combining multiscale mutual information and frequency-domain features. The former is used to characterize the nonlinear statistical dependencies of the signal across different time scales, while the latter is used to characterize key harmonic components, thereby enabling joint modeling of time-domain and frequency-domain information to enhance the feature discriminability.
The mutual information is defined as:

$$I(\tau) = \sum_{x}\sum_{y} p(x, y)\,\log\frac{p(x, y)}{p(x)\,p(y)}$$

where τ represents the delay time, and p(⋅) is the corresponding probability density function. The sequences x and y represent the magnetostriction strain λ before and after the delay, respectively, that is, x = λ(t) and y = λ(t + τ). This feature comprehensively reflects the statistical characteristics of the magnetostriction strain across different time scales.
Multiscale mutual information feature extraction focuses on the nonlinear dynamic characteristics of magnetostriction strain signals and enables fast computation through sliding window segmentation and histogram statistics. In this study, a one-second signal (sampled at 10 kHz) is divided into 50×i segments based on the multiscale parameter i, and the average mutual information between the first segment and all other segments is calculated:

$$E(i) = \frac{1}{50i - 1}\sum_{j=2}^{50i} I(s_1, s_j)$$

where $s_j$ denotes the j-th segment, each segment contains T/(50i) sampling points, and T represents the total number of sampling points within one second. In this procedure, the first segment is selected as a reference to quantify the statistical dependence between it and its time-delayed versions. Averaging the mutual information across all subsequent segments improves the feature’s robustness to local signal fluctuations. The probability density functions p(x), p(y), and p(x, y) required for the mutual information calculation are estimated via a two-dimensional histogram-based method with a fixed number of bins (e.g., 50). For a one-second signal segment containing 10,000 data points, this approach offers a trade-off between computational efficiency and estimation accuracy. The multiscale parameter i was chosen as 1, 2, …, 8 (yielding eight features). An upper limit of 8 was chosen because a larger i would result in too few data points per segment and render the mutual information more noise-sensitive. By varying the value of the segmentation parameter i, a feature vector containing multiscale information is constructed:

$$[E(1), E(2), \ldots, E(8)]$$

This feature vector quantifies the changes in statistical dependence induced by DC bias, compensating for the shortcomings of traditional linear features in characterizing nonlinearity, and provides effective input for state identification.
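The segmentation and histogram procedure described above can be sketched as follows. This is an illustrative reading of the text, not the authors' implementation: the function names are hypothetical, the mutual information is reported in nats, each sequence is binned over its own value range, and segments are truncated to an equal integer length when 10,000 is not divisible by 50·i. The input is a synthetic 100 Hz waveform with a small amount of noise.

```python
import math
import random
from collections import Counter

def mutual_information(x, y, bins=50):
    """Histogram (plug-in) estimate of I(x; y) in nats for equal-length sequences."""
    def bin_index(v, lo, hi):
        if hi == lo:                      # constant sequence: a single bin
            return 0
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    lo_x, hi_x, lo_y, hi_y = min(x), max(x), min(y), max(y)
    ix = [bin_index(v, lo_x, hi_x) for v in x]
    iy = [bin_index(v, lo_y, hi_y) for v in y]
    n = len(x)
    joint, px, py = Counter(zip(ix, iy)), Counter(ix), Counter(iy)
    # p(x,y) * log( p(x,y) / (p(x) p(y)) ), written with raw counts
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())

def multiscale_mi(signal, scales=range(1, 9), base_segments=50, bins=50):
    """Average MI between the first segment and every other segment, per scale i."""
    features = []
    for i in scales:
        n_seg = base_segments * i
        seg_len = len(signal) // n_seg    # assumption: drop any leftover samples
        segments = [signal[j * seg_len:(j + 1) * seg_len] for j in range(n_seg)]
        reference = segments[0]
        values = [mutual_information(reference, s, bins) for s in segments[1:]]
        features.append(sum(values) / len(values))
    return features

# Synthetic one-second record at 10 kHz (stand-in for a measured strain signal)
random.seed(0)
fs = 10_000
signal = [math.cos(2 * math.pi * 100 * k / fs) + 0.01 * random.gauss(0, 1)
          for k in range(fs)]

features = multiscale_mi(signal)
print(len(features), features)
```

At i = 8 each segment holds only 25 samples, which illustrates why larger values of i make the histogram estimate increasingly noise-sensitive.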
In the frequency domain, 100 Hz, 150 Hz, and 350 Hz are selected as feature frequencies, corresponding to even-order, odd-order, and higher-order nonlinear responses, respectively. The spectral amplitude is calculated via FFT to construct a frequency-domain feature [V100, V150, V350], which is used to characterize the harmonic energy distribution and the spectral reconstruction process.
Finally, a combined vector containing both mutual information and frequency-domain features is used as the input to the LSTM model. The combined feature vector integrates statistically and physically complementary information. Thus, it achieves a more comprehensive characterization of the nonlinear modulation in the magnetostriction response under DC bias.
2.4. LSTM-Based Identification Model
Long short-term memory (LSTM) networks are widely used for state prediction and recognition due to their advantages in processing time series data. The LSTM network takes a sequence of the extracted 11-D feature vectors as input and outputs the predicted DC bias ratio sequence. In this study, a sequence of 30 samples was constructed through interval sampling. The LSTM neural network architecture is shown in Figure 1.
The designed LSTM network architecture consists of two LSTM layers with 16 and 8 hidden units, respectively, followed by a Dropout layer (dropout rate of 0.2) to prevent overfitting, and finally outputs the predicted values through a fully connected layer. The specific structure is shown in Table 1.
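To give a sense of the model size implied by this description, the sketch below counts trainable parameters. It assumes the standard LSTM formulation with four gate blocks and one bias vector per block, and a fully connected layer with a single output unit; neither detail is stated explicitly in the text, and some frameworks use two bias vectors per block, which would change the totals.

```python
def lstm_params(input_size, hidden_size):
    """Parameters of one standard LSTM layer.

    Four blocks (input gate, forget gate, cell candidate, output gate), each with
    input weights, recurrent weights, and one bias vector (assumed convention).
    """
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)

layers = {
    "lstm_1 (11 -> 16)": lstm_params(11, 16),
    "lstm_2 (16 -> 8)": lstm_params(16, 8),
    "fully_connected (8 -> 1)": 8 * 1 + 1,   # assumed single regression output
}
total = sum(layers.values())

for name, count in layers.items():
    print(f"{name}: {count}")
print("total:", total)
```

Under these assumptions the network has a few thousand parameters, which is small relative to typical deep models and in keeping with the modest dataset size.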
3. Experimental Setup
The magnetostriction characteristics were measured using a dedicated platform presented in Figure 2. The system is composed of three main parts: an excitation source, a strain measurement system, and a data acquisition system. A closed magnetic circuit was constructed using a B30P105 grain-oriented silicon steel sheet (400 mm × 100 mm × 0.28 mm), an excitation coil (100 turns), and an induction coil (150 turns). The steel sheet was clamped with adjustable tensile stress (0–15 MPa) in the rolling direction (RD). The voltage excitation source, consisting of a Tektronix AFG31000 waveform generator and a power amplifier, supplied a composite signal of AC with a superimposed DC bias to the excitation coil. The resulting magnetostriction was measured at the sheet end by a Keyence LK-G5001 laser displacement sensor with an LK-H008 head (Keyence, Osaka, Japan), which has a resolution of 0.005 μm. In addition, the strain signal was amplified before acquisition. An NI USB-6002 DAQ card (National Instruments, Austin, TX, USA), synchronized and controlled via MATLAB 2024b, sampled all signals at 10 kHz per channel.
To simulate transformer DC bias, the composite excitation signal was employed. The AC frequency was fixed at 50 Hz at six RMS voltage levels (2.0–4.0 V). For each AC level, seven DC bias ratios (0–30% of the AC amplitude) were applied, creating 42 test conditions. Each condition was sampled for 15 s and repeated six times to ensure reliability.
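The test matrix and the composite excitation can be sketched as below. The text gives only the ranges and the number of levels, so the evenly spaced grids (0.4 V and 5% steps) are assumptions, as are the function name, the sine phase, and the short 40 ms duration used for the example.

```python
import math
from itertools import product

# Assumed evenly spaced levels; the text specifies only ranges and counts.
V_RMS_LEVELS = [round(2.0 + 0.4 * k, 1) for k in range(6)]   # 2.0 ... 4.0 V
BIAS_RATIOS = [k * 5 / 100 for k in range(7)]                # 0 ... 0.30

def excitation(v_rms, bias_ratio, fs=10_000, f=50, duration=0.04):
    """Composite drive signal: 50 Hz AC plus a DC offset.

    The DC offset is taken as bias_ratio times the AC amplitude (sqrt(2) * RMS).
    """
    amplitude = math.sqrt(2) * v_rms
    v_dc = bias_ratio * amplitude
    n = int(round(fs * duration))
    return [v_dc + amplitude * math.sin(2 * math.pi * f * k / fs) for k in range(n)]

conditions = list(product(V_RMS_LEVELS, BIAS_RATIOS))
print(len(conditions))            # number of (voltage, bias) combinations

wave = excitation(2.0, 0.30)      # two full periods at 10 kHz
mean_value = sum(wave) / len(wave)
print(mean_value)                 # mean over whole periods equals the DC offset
```

Averaging the waveform over whole periods recovers the DC offset, which is a quick sanity check that the bias ratio is applied relative to the AC amplitude rather than the RMS value.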
When a current passes through the excitation coil, an excitation magnetic field is generated around it, which produces a magnetic flux in the test sample. The strength of this excitation field can be calculated as:

$$H = \frac{N_1 i_1}{l_m}$$

where $N_1$ is the number of turns of the excitation coil, $i_1$ is the measured current in the excitation coil, and $l_m$ denotes the magnetic path length. During the magnetization process of the steel sheet, an induced voltage $u_2$ is generated in the induction coil. According to Faraday’s law of electromagnetic induction, the AC component of the magnetic flux density can be expressed as:

$$\frac{dB_{ac}}{dt} = \frac{u_2}{N_2 A_c}$$

where $N_2$ is the number of turns of the induction coil and $A_c$ is the cross-sectional area of the test sample. Integrating the above expression yields:

$$B_{ac}(t) = \frac{1}{N_2 A_c}\int u_2\,dt$$

The total magnetic flux density is defined as $B = B_{ac} + B_{dc}$, where $B_{dc}$ is evaluated based on the static B-H curve at the point $H_{dc}$.
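The two relations above translate into a short post-processing step. The sketch below is a hedged illustration: the turn counts come from the setup description, the cross-section is assumed to be the sheet width times its thickness, the magnetic path length `L_M` is a hypothetical placeholder because it is not given in the text, and trapezoidal integration with mean removal is one possible way to handle the unknown integration constant.

```python
import math

N1, N2 = 100, 150                 # coil turns from the setup description
A_C = 100e-3 * 0.28e-3            # assumed cross-section: 100 mm x 0.28 mm, in m^2
L_M = 0.4                         # hypothetical magnetic path length in m

def field_strength(i1):
    """H = N1 * i1 / l_m for each current sample (A/m)."""
    return [N1 * i / L_M for i in i1]

def flux_density_ac(u2, fs):
    """Integrate u2 / (N2 * A_c) with the trapezoidal rule, then remove the mean.

    Mean removal fixes the integration constant so that the AC part averages
    to zero over the record (assumes an integer number of periods).
    """
    dt = 1.0 / fs
    b, acc = [0.0], 0.0
    for prev, cur in zip(u2, u2[1:]):
        acc += 0.5 * (prev + cur) * dt
        b.append(acc / (N2 * A_C))
    mean = sum(b) / len(b)
    return [v - mean for v in b]

# Synthetic check: the voltage that a 1.5 T, 50 Hz sinusoidal flux density would induce
fs, f, b0 = 10_000, 50, 1.5
omega = 2 * math.pi * f
u2 = [N2 * A_C * b0 * omega * math.cos(omega * k / fs) for k in range(400)]

b_ac = flux_density_ac(u2, fs)
h = field_strength([0.01])
print(max(b_ac), h[0])
```

The recovered peak flux density is close to the 1.5 T used to generate the synthetic voltage, with a small error from the discrete integration.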
4. Experimental Results
4.1. Time-Domain Results
Figure 3 compares the time-domain magnetostriction waveforms under different DC bias levels at 2.0 V. As shown in the figure, the positive and negative half-cycles of the waveform are approximately symmetrical without DC bias. As the DC bias increases, the positive peak gradually decreases, while the absolute value of the negative peak first decreases and then increases, resulting in a clearly asymmetrical waveform.
Figure 4 presents the hysteresis loops (B–H curves) under typical operating conditions. As shown in Figure 4a,b, in the absence of DC bias, the hysteresis loops maintain good central symmetry under both low excitation (2.0 V) and high excitation (4.0 V). The high excitation voltage leads to a significant increase in the maximum magnetic field strength, driving the magnetization process into the saturation regime. When a 30% DC bias is superimposed, as illustrated in Figure 4c,d, the B–H loops undergo severe deformation and shift. The DC bias introduces a constant DC magnetic field component ($H_{dc}$), forcing the operating point of magnetization to deviate from the origin. In this case, the forward magnetization enters the flat saturation regime, while the reverse demagnetization is weakened, resulting in severe unilateral shift and asymmetric deformation of the loop. Physically, the asymmetric shift in the magnetization trajectory corresponds to a change in the operating point $H_{dc}$ in Equation (4), causing drastic variations in the even-order ($k_2$) and odd-order ($k_3$) nonlinear coefficients. This asymmetric transfer in the magnetization process leads to the significant difference in the amplitudes of the positive and negative half-cycles of the strain waveforms.
To quantify the variability of the experimental measurements, each operating condition was measured six times, and the standard deviation was computed for all quantities. Figure 5 illustrates the variation in the asymmetry factor η with the DC bias ratio, with explicit error bands shown in the figure. η increases monotonically at all voltages, indicating that DC bias continuously enhances waveform asymmetry by introducing even-order nonlinear terms. At the same bias level, higher voltages result in lower η values, indicating that the magnetization process is dominated by the AC component under high AC excitation, thereby reducing the modulation effect of even-order nonlinearity on the waveform. At a low bias ratio (<10%), all voltages exhibit a similar η value, reflecting weak nonlinearity. At high bias ratios (>10%), the curves exhibit clear divergence. This behavior is attributed to a significant shift in the operating point and a corresponding strengthening of the voltage–bias coupling effect.
Figure 6 illustrates the variation in the peak-to-peak value of magnetostriction with DC bias under different AC voltages. Because the observed standard deviations are very small (typically <3% of the mean value), error bands are not shown. Under all operating conditions, the overall trend follows a pattern of initial decrease and subsequent increase: a rapid decrease from 0% to 25% is followed by a slight recovery from 25% to 30%. The initial decrease stems from the DC bias shifting the operating point away from the symmetrical center. The recovery at high bias is related to the enhancement of higher-order harmonics in the strong nonlinear region. At the same bias level, higher voltages result in larger peak-to-peak values and a more gradual change, indicating that high voltage enhances the magnetization drive and weakens the relative modulation effect of the DC bias. The waveform distortion at high bias reflects the enhancement of coupling nonlinearity.
Figure 7 shows the distribution of the crest factor (CF) under different voltages and DC biases. CF is used to characterize harmonic distortion (for a sine wave, CF ≈ 1.414). Under all operating conditions, CF increases monotonically with DC bias, and the rate of increase accelerates after 10%, indicating that DC bias enhances higher-order harmonics, causing the waveform to evolve from a smooth waveform to a sharply distorted waveform. At the same bias level, the higher the voltage, the lower the CF. This indicates that at high voltages, the fundamental component becomes dominant at the expense of higher harmonics, thereby diminishing their role in shaping the spikes.
4.2. Frequency-Domain Results
The frequency-domain analysis focuses on the spectrum of the magnetostriction signal. Through analysis of the 100 Hz fundamental, 150 Hz third harmonic, and the resulting THD, this work quantitatively elucidates how DC bias modulates magnetostriction nonlinearity. In nonlinear terms, these harmonics are manifestations of even-order, odd-order, and higher-order nonlinearities. Their evolution under an applied DC bias directly reflects changes in the respective nonlinear coefficients.
Figure 8 illustrates the variation in the 100 Hz amplitude with DC bias at different voltages. The overall trend is downward, with a slight rebound occurring at high bias (approximately >25%). DC bias shifts the operating point away from the symmetrical center of the curve, thereby effectively attenuating the even-order nonlinear response associated with the generation of second harmonics. The slight rebound observed at extremely high bias levels may be related to the enhancement of higher-order nonlinearity in the deep saturation region. At the same bias level, higher voltages result in larger amplitudes and slower decay, indicating that high voltage enhances the magnetization while weakening the modulation effect of the bias.
Table 2 details how the amplitude of the 50 Hz component varies with bias ratios ranging from 10% to 30%. At all voltage levels, the amplitude generally increases with increasing bias. The DC bias disrupts the symmetry of the magnetostriction, introducing odd-order nonlinearities that generate the 50 Hz component. The rise in amplitude with higher bias reflects the strengthening of these odd-order nonlinear effects, as represented by coefficients such as $k_1$ in Equation (5). Notably, at higher excitation voltages (3.2–4.0 V) and a low bias of 10%, the 50 Hz amplitude drops significantly (e.g., 0.996 μm/m at 3.6 V), which may be related to the nonlinear response not yet being fully established. Once the bias ratio exceeds 15%, the odd-order nonlinearity becomes fully activated, and the amplitude increases steadily.
At the same bias level, higher voltages result in larger amplitudes, indicating that enhanced magnetization can improve dynamic response.
Table 3 analyzes the 150 Hz component amplitude as a function of bias ratio (10–30%). Overall, it displays a pronounced single-peak (rise-then-fall) behavior, especially at lower voltages, indicating that a moderate bias maximizes the third harmonic by enhancing magnetostriction asymmetry and the third-order coefficient ($k_3$). As the bias continues to increase, magnetization approaches saturation, and the amplitude decreases. At the same bias level, the lower the voltage, the more pronounced the peak, indicating that nonlinear modulation is more sensitive at lower voltages. Although a single-peak behavior is evident across all voltages, it weakens at higher levels (e.g., 4.0 V). This reflects more complex interactions between third-order and higher-order nonlinearities under strong AC drive.
Figure 9 shows the higher harmonics (200–500 Hz) under different DC bias conditions. Their amplitude remains low in the absence of DC bias, rises significantly as bias is applied, and peaks at approximately 20%. This demonstrates that DC bias enhances higher-order nonlinearity, increasing coefficients (e.g., $k_4$, $k_5$) and shifting the spectrum from a low-order to a multi-harmonic distribution. Under high bias, the resulting spectral broadening and more intricate energy distribution indicate that the magnetization has entered a strongly nonlinear regime.
Figure 10 shows the variation in THD with DC bias. At all voltage levels, THD first increases and then decreases. An increase in bias strengthens the third-order coefficient ($k_3$), causing a significant rise at 150 Hz, while simultaneously exciting higher-order terms ($k_4$, $k_5$), which increases high-frequency harmonics and drives a rapid rise in THD. As the bias increases further, magnetization approaches saturation, some nonlinear effects weaken, and THD declines. Moreover, THD is higher at lower voltages, indicating that the system exhibits greater sensitivity to DC bias under these conditions.
The above results indicate a strong coupling effect between the AC voltage and the DC bias. Under low-voltage conditions, nonlinear modulation is stronger, manifested as higher harmonic distortion and waveform asymmetry. By comparison, high voltage can attenuate the nonlinear response to some extent. A comprehensive analysis of both the time and frequency domains indicates that the third-order and higher-order nonlinearities are most pronounced at moderate bias ratios: the 150 Hz harmonic peaks near 10–15%, while THD and higher-order harmonics (200–500 Hz) peak near 20–25%. This suggests that the 10–25% range represents a critical stage in the evolution of magnetostriction nonlinearity.
5. Prediction Performance
To further validate the effectiveness of the proposed feature extraction method and the LSTM model, the model’s prediction performance under various operating conditions is evaluated, focusing on its convergence characteristics, prediction accuracy, and generalization under different DC bias ratios and excitation voltage conditions.
During model training, the Adam optimizer was used with an initial learning rate of 0.01 and a stepwise decay strategy (the learning rate was multiplied by 0.1 every 200 epochs). The maximum number of training epochs was set to 1000, and the L2 regularization coefficient was 0.001. The loss function was the mean squared error (MSE). The 42 sets of operational condition data were first segmented using a sliding window of 1 s width and 0.5 s step, yielding about 30 samples per condition and 1260 samples in total.
To rigorously evaluate the model’s generalization and avoid data leakage due to temporal correlation between adjacent windows from the same recording, a stratified 5-fold cross-validation scheme was employed. Specifically, the 42 independent conditions (defined by different AC voltage and DC bias-ratio combinations) were randomly split into five mutually exclusive groups. In each fold, the samples from four groups (about 33–34 conditions) formed the training set, while the remaining samples (about 8–9 conditions) were used for testing. This ensures that data from the same operating condition never appear in both the training and test sets. The procedure was repeated five times, and the final performance metrics were averaged. To prevent information leakage from the test set, the normalization parameters (minimum and maximum values for scaling to the [−1, 1] range) were computed solely on the training data of each cross-validation fold and subsequently applied to the corresponding test split within the same fold using the mapminmax function in MATLAB.
During training, as the number of iterations increased, the losses on both the training and validation sets gradually decreased and stabilized. The model’s RMSE on the test set was 0.0218, close to the 0.0214 on the training set, indicating that the model achieved a good balance between fitting accuracy and generalization ability.
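The windowing, condition-wise splitting, and train-only normalization described in this section can be sketched as follows. This is a simplified illustration rather than the authors' MATLAB code: the helper names are hypothetical, conditions are identified by integer labels, the random assignment of conditions to folds is only one possible realization of the split, and constant feature columns are mapped to 0, which may differ from how mapminmax treats them.

```python
import random

def sliding_windows(n_samples, fs=10_000, width_s=1.0, step_s=0.5):
    """(start, stop) sample indices of full windows over one recording."""
    width, step = int(width_s * fs), int(step_s * fs)
    return [(s, s + width) for s in range(0, n_samples - width + 1, step)]

def grouped_kfold(condition_ids, k=5, seed=0):
    """Yield (train_idx, test_idx) so that no condition appears on both sides."""
    unique = sorted(set(condition_ids))
    random.Random(seed).shuffle(unique)
    folds = [unique[i::k] for i in range(k)]
    for test_conditions in folds:
        held_out = set(test_conditions)
        train_idx = [j for j, c in enumerate(condition_ids) if c not in held_out]
        test_idx = [j for j, c in enumerate(condition_ids) if c in held_out]
        yield train_idx, test_idx

def fit_minmax(rows):
    """Per-feature (min, max), computed on training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]

def apply_minmax(rows, params):
    """Scale each feature to [-1, 1] using previously fitted parameters."""
    return [[2 * (v - lo) / (hi - lo) - 1 if hi > lo else 0.0
             for v, (lo, hi) in zip(row, params)]
            for row in rows]

# A 15 s recording at 10 kHz yields 29 full windows, in line with "about 30"
windows = sliding_windows(15 * 10_000)
print(len(windows))

# 42 conditions, one label per window
condition_ids = [c for c in range(42) for _ in windows]
folds = list(grouped_kfold(condition_ids))
for train_idx, test_idx in folds:
    print(len({condition_ids[j] for j in train_idx}),
          len({condition_ids[j] for j in test_idx}))

# Normalization parameters come from the training rows and are reused on test rows
params = fit_minmax([[0.0, 10.0], [2.0, 20.0]])
scaled_test = apply_minmax([[1.0, 15.0]], params)
print(scaled_test)
```

Splitting by condition rather than by window is what keeps overlapping windows from the same recording out of both sets at once.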
As shown in Figure 11, the predictions of the LSTM model under various voltage conditions align well with the set values in the moderate bias range (10–25%). However, relatively large errors are observed at both the low (5%) and high (30%) bias ratios. This observation is consistent with the theory presented in Section 2, which indicates that the magnetization process exhibits stronger nonlinearity and complexity at the two extreme conditions. It is worth noting that the trained model (a single instance) exhibits robust prediction performance under varying AC voltages. By learning voltage-invariant features, the model demonstrates effective generalization across different operating conditions. Compared to high-voltage conditions, the prediction error in the 15–20% bias range is slightly higher at low voltages (2.0–2.8 V). This is consistent with the conclusion in Section 4, where stronger nonlinear distortion is observed under low-voltage conditions.
The error boxplot in Figure 12 further illustrates the model’s error distribution across different DC bias ratios. The “+” markers in the figure represent outliers. The narrow boxes and medians near zero indicate that the model’s predictions exhibit low random error. This shows that the constructed features and the model can stably capture the dominant nonlinear characteristics under DC bias, demonstrating predictive stability and reliability.
To assess the contribution of different feature sets and benchmark the performance of the proposed LSTM model against other algorithms, a comprehensive ablation study was conducted. Specifically, three feature configurations were compared: (1) mutual information features only (MI-only), represented by an 8-D multiscale mutual information vector [E(1), …, E(8)]; (2) frequency-domain features only (FD-only), consisting of the three amplitudes V100, V150, and V350; and (3) the proposed method, formed by a fusion of the two sets, resulting in an 11-D vector. For each feature set, four regression models were implemented and evaluated: the proposed LSTM network, an extreme learning machine (ELM) with 100 hidden neurons and a sigmoid activation function, a support vector regression (SVR) model with a radial basis function kernel, and a random forest (RF) with 100 trees. Model performance is evaluated using three standard metrics: the mean absolute error (MAE), the root mean square error (RMSE), and the coefficient of determination (R²). Their definitions are provided in [23].
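For reference, the three metrics in their commonly used form can be computed as shown below. This sketch uses the textbook definitions, which are assumed to match those in [23]; the sample values are made up for illustration.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Illustrative bias ratios (fractions), not experimental data
y_true = [0.0, 0.1, 0.2, 0.3]
y_pred = [0.0, 0.1, 0.2, 0.2]

print(mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```

Because RMSE squares the residuals, a single large error raises it more than it raises MAE, which is why two models can have similar MAE but different RMSE.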
The results of the ablation study and the model comparison under stratified 5-fold cross-validation are presented in Table 4. The fused feature set (11-D) delivers the best performance, achieving an RMSE of 0.0336 and an R² of 0.8810 under the LSTM model. This represents an approximately 41% reduction in RMSE compared to using MI-only features (8-D), and an approximately 5% reduction compared to FD-only features (3-D), demonstrating the complementary nature of time-domain statistical and frequency-domain physical features. Furthermore, the LSTM model outperforms the other three regression models. On the fused features, it has the lowest RMSE (0.0336) and the highest R² (0.8810) among all models. Although the random forest achieves a similar MAE (0.0278), its RMSE is slightly higher (0.0378).
When the input is an 11-D static feature vector, the gating mechanisms of the LSTM (input, forget, and output gates) enable it to perform complex nonlinear transformations on these features. This gated architecture allows the LSTM to capture higher-order feature interactions and dependencies, which may explain its advantage over the other regression models in Table 4 in capturing these inter-feature couplings.