Article

Transformer Oil Acid Value Prediction Method Based on Infrared Spectroscopy and Deep Neural Network

1
State Grid Weifang Power Supply Company, State Grid Shandong Electric Power Company, Weifang 261000, China
2
School of Electrical Engineering, Shandong University, Jinan 250061, China
*
Author to whom correspondence should be addressed.
Energies 2025, 18(13), 3345; https://doi.org/10.3390/en18133345
Submission received: 26 May 2025 / Revised: 21 June 2025 / Accepted: 24 June 2025 / Published: 26 June 2025

Abstract

The traditional method for determining the acid value of transformer oil has limitations, such as a long detection period and toxic reagents, while traditional spectral analysis struggles to efficiently extract the key features related to the acid value content. Early detection of rising acid levels is critical to prevent transformer insulation degradation, corrosion, and failure. Conversely, delayed detection accelerates aging and can cause costly repairs or unplanned outages. To address this need, this paper proposes a new method for predicting the acid value content of transformer oil based on its infrared spectra and a deep neural network (DNN). The infrared spectral data of the transformer oil are acquired with an ALPHA II FT-IR spectrometer, and the high-frequency noise in the spectra is reduced by wavelet packet decomposition (WPD). The bootstrapping soft shrinkage (BOSS) algorithm is then used to extract the feature parameters in the spectrum with the highest correlation with the acid value content, and a DNN prediction model is established to realize the fast prediction of the acid value content of the transformer oil. In comparison with the traditional infrared spectral preprocessing method and regression model, the proposed prediction model has a coefficient of determination (R2) of 97.12% and 95.99% for the prediction set and validation set, respectively, which is 4.96% higher than that of the traditional model. In addition, the accuracy is 5.45% higher than the traditional model, and the R2 of the proposed prediction model is 95.04% after complete external data validation, indicating that it has good accuracy.
The results show that the infrared spectral analysis method combining WPD noise reduction, BOSS feature extraction, and DNN modeling can realize the rapid prediction of the acid value content of the transformer oil based on infrared spectroscopy technology, and the prediction model can be used to realize the analytical study of transformer oils. The model can be further applied to the monitoring field of the transformer oil characteristic parameter to realize the rapid monitoring of the transformer oil parameters based on a portable infrared spectrometer.

1. Introduction

In oil-immersed power transformers, transformer oil serves as a fundamental dielectric medium [1]. It is a mineral oil that plays a critical role in electrical insulation and thermal management. The acid value, a key chemical indicator of oil degradation [2], reflects the cumulative concentration of short-chain acids, including formic acid, acetic acid, and naphthenic acids [3]. Progressive acid accumulation during transformer operation accelerates insulation deterioration, significantly compromising both oil service life and equipment reliability [4]. Early detection of rising acid content is crucial to prevent insulation degradation, corrosion, and potential transformer failure. Conversely, delayed or absent prediction can lead to accelerated equipment aging, costly repairs, and unplanned outages. This necessitates precise acid value monitoring through advanced analytical techniques.
Conventional acid quantification methods, notably potentiometric titration [5] and bromothymol blue (BTB) colorimetry [6], demonstrate analytical precision but suffer from operational constraints, including prolonged analysis time and hazardous reagent requirements. By contrast, infrared spectroscopy emerges as a non-destructive analytical technique offering rapid measurement capabilities and enhanced operational safety through reagent-free analysis. Recent applications have validated its efficacy in determining multiple oil quality parameters, including acid value [7], furfural content [8], moisture levels [9], and antioxidant concentrations [10], establishing it as a versatile analytical platform for transformer oil characterization [11].
Infrared spectral analysis of transformer oils encounters three principal interference factors: light scattering artifacts, baseline drift, and stochastic noise [12]. These perturbations introduce extraneous high-frequency components that compromise quantitative accuracy. Current preprocessing strategies encompass first-derivative (D1) and second-derivative (D2) transformations, baseline correction, and Savitzky–Golay (SG) smoothing. While Shi and Yu [13] achieved partial success using D1/D2 with SG smoothing in partial least squares (PLS) modeling, predictive performance for low-concentration samples remained suboptimal. Huang et al. [14] demonstrated wavelet transform-enhanced SG smoothing with parametric optimization of window size and polynomial order, whereas Ge et al. [15] developed higher-order derivative processing to enhance spectral resolution through peak narrowing and selectivity improvement. However, conventional preprocessing techniques exhibit limited high-frequency noise suppression capabilities, necessitating advanced signal refinement approaches.
Wavelet packet decomposition (WPD) addresses these limitations through multiscale signal deconstruction, offering superior frequency band resolution compared to standard wavelet transforms [16,17]. This methodology effectively segregates high-frequency noise components while preserving critical spectral features, achieving simultaneous noise reduction and signal-to-noise ratio enhancement. The resultant smoothed spectral profiles facilitate improved quantitative analysis reliability.
Feature extraction represents another critical preprocessing step to mitigate spectral redundancy and prevent model overfitting [18]. An et al. [19] employed PLS regression on derivative-processed spectra to establish predictive models for acid value and flash point (standard errors: 0.46 mg/100 mL and 2.77 °C, respectively), albeit with subjective wavelength selection requirements. Jiang et al. [20] implemented principal component analysis (PCA) for dimensionality reduction, achieving >99% model accuracy despite potential chemical information loss in low-variance components. Yin et al. [21] advanced the field through bootstrapping soft shrinkage (BOSS) algorithm implementation, successfully preventing overfitting in neural network-based acid value prediction models.
This study establishes a novel integrated framework coupling WPD denoising, the BOSS algorithm, and DNN regression to advance infrared spectral prediction methodologies for the transformer oil acid values. The primary objective is to achieve rapid early prediction of the transformer oil acid value content and enhance the predictive capability and robustness of the transformer oil condition assessment. Comprehensive evaluation against conventional chemometric approaches demonstrates significant improvements in both accuracy and model robustness. Section 2 delineates the experimental methodology and elucidates the theoretical foundations of the algorithms employed. Section 3 details the model development process and the configuration of critical parameters. Section 4 analyzes the prediction model’s performance in denoising, feature extraction, and acid value prediction, alongside comparative accuracy validation. The conclusions are synthesized in Section 5. Beyond refining the transformer oil diagnostics, this work contributes a systematic framework transferable to infrared spectral quantification challenges in dielectric fluids.

2. Basic Theory

To enhance the predictive accuracy of infrared spectroscopic models for transformer oil acid value quantification, we implement a three-phase analytical framework. First, spectral preprocessing addresses baseline drift and high-frequency noise artifacts through a comparative evaluation of three techniques, including WPD, D1, and D2 transformations. Subsequent feature optimization employs BOSS algorithms to extract maximally correlated spectral descriptors while eliminating redundant variables, thereby improving model training efficiency and mitigating overfitting risks. Finally, we establish a DNN architecture demonstrating superior predictive performance over conventional PLS regression through rigorous comparative validation.

2.1. Principles of Infrared Spectroscopy and Acid Value Detection Methods

Infrared spectroscopy is an analytical method for determining the molecular structure of substances and identifying compounds based on the relative vibration of atoms within a molecule and the rotation of the molecules. Different functional groups have characteristic absorption peaks in different wave number bands; for example, the carbon-oxygen double bond (C=O) absorbs in the wave number range of 1600~2000 cm−1 of the infrared spectrum. Further research has found that the carboxyl functional group (-COOH), which is related to the acid value, has a strong absorption peak within the range of 1600~1850 cm−1. The characteristic information in this range can be used to establish a prediction method for the acid value content of the transformer oil and to realize the quantitative analysis of the acid value content. Meanwhile, the hydroxyl group (O-H) has strong and broad absorption peaks in the wave number range of 3200~3650 cm−1, and the methyl and methylene groups in saturated hydrocarbons have absorption peaks in the range of 2850~2960 cm−1 [22]. The positions of the absorption peaks can be used to determine the type of functional groups, and the characteristic information of the absorption peaks can be used for the quantitative analysis of specific substances, so infrared spectroscopy is widely used both to identify compounds and to quantify specific substances.
In this paper, the actual acid value of the transformer oil was determined using the BTB method, the traditional method for testing acid content in electrical tests. The procedure can be outlined as follows: boiling ethanol is used to extract the acidic components of the oil, and the extract is titrated with a potassium hydroxide ethanol solution until the BTB indicator turns from yellow to blue-green. The volume of potassium hydroxide ethanol solution consumed at this end point gives the number of milligrams of potassium hydroxide required to neutralize the acidic components in 1 g of oil sample, from which the acid value of the oil is calculated.
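The final calculation step of the titration can be illustrated with a short sketch. It assumes the usual acid-value relation (titrant volume times titrant concentration times the molar mass of KOH, divided by the oil mass) and a blank correction; the blank correction, the function name `acid_value`, and the example numbers are illustrative assumptions rather than details given in this paper.

```python
KOH_MOLAR_MASS = 56.1  # g/mol, molar mass of potassium hydroxide


def acid_value(v_sample_ml, v_blank_ml, koh_molarity, oil_mass_g):
    """Return the acid value in mg KOH per g of oil.

    v_sample_ml  : titrant volume consumed by the oil extract (mL)
    v_blank_ml   : titrant volume consumed by a blank (mL); assumed correction
    koh_molarity : concentration of the KOH ethanol solution (mol/L)
    oil_mass_g   : mass of the oil sample (g)
    """
    if oil_mass_g <= 0:
        raise ValueError("oil mass must be positive")
    # mL * mol/L = mmol of KOH; mmol * g/mol = mg of KOH
    mg_koh = (v_sample_ml - v_blank_ml) * koh_molarity * KOH_MOLAR_MASS
    return mg_koh / oil_mass_g


# Hypothetical titration: 0.30 mL consumed, 0.05 mL blank, 0.05 mol/L KOH, 10 g oil
example = acid_value(0.30, 0.05, 0.05, 10.0)
```

With these hypothetical inputs, 0.25 mL of 0.05 mol/L titrant corresponds to 0.0125 mmol of KOH, or about 0.70 mg, spread over 10 g of oil.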

2.2. WPD Algorithm Preprocessing Principle

Infrared spectral analysis of the transformer oils inherently suffers from the following three principal interference factors: (1) Mie scattering artifacts, (2) baseline drift, and (3) stochastic noise contamination [23,24]. These perturbations necessitate rigorous spectral preprocessing to enhance signal fidelity prior to predictive modeling. Conventional preprocessing techniques—including baseline correction, SG smoothing, D1, D2, and vector normalization—provide partial solutions but exhibit limited high-frequency noise suppression capabilities.
WPD is a multilevel decomposition of the signal that builds on the wavelet transform and provides a more detailed frequency band division than the wavelet transform alone [25]. By performing successive wavelet decompositions of the signal, WPD can effectively separate the signal from the noise: the noise usually behaves as a high-frequency component, while the signal is mainly contained in the low-frequency component. The wavelet packet decomposition formula is as follows:
$$f(t) = \sum_{n} cA(n)\,\Phi(n - t) + \sum_{n} d_k(n)\,\Psi_k(n - t)$$
where $cA(n)$ is the approximation coefficient, representing the low-frequency part of the function, $d_k(n)$ is the detail coefficient, representing the high-frequency part of the signal, $f(t)$ is the original infrared spectral signal, $\Phi(n - t)$ is the scale function, and $\Psi_k(n - t)$ is the wavelet basis function.
The dual-tree complex wavelet (DT-CW) [26] was selected as our wavelet basis function to address the critical limitations inherent in discrete wavelet transforms, including poor direction selectivity, lack of translation invariance, and spectral aliasing. Kingsbury [27] proposed the DT-CW method, which uses a dual-tree filter structure designed according to certain rules; it retains the advantages of general complex wavelets while still allowing complete reconstruction. The complex wavelet can be expressed as follows:
$$\Psi(t) = \Psi_r(t) + j\,\Psi_i(t)$$
where $\Psi_r(t)$ and $\Psi_i(t)$ are the real and imaginary parts of the complex wavelet, respectively, and they are both real functions, so that the dual-tree complex wavelet transform can be expressed as two independent wavelet transforms which contain two mutually parallel decomposition trees: tree A and tree B.
Figure 1 shows a schematic diagram of the DT-CW decomposition, where $h_0(n)$ and $h_1(n)$ are the low-pass and high-pass filters of tree A, and $g_0(n)$ and $g_1(n)$ are the low-pass and high-pass filters of tree B.
In the first layer of the transformation, tree A is delayed by one sampling period relative to tree B, so that the downsampling in tree B retains exactly the samples discarded by the downsampling in tree A, which preserves the continuity of the signal. The preprocessed infrared spectral data of the transformer oil can be obtained by signal reconstruction after soft-thresholding the high-frequency coefficients for noise reduction.
The WPD algorithm used in this paper is realized as follows: a three-layer wavelet packet decomposition of the spectral signal yields the low-frequency and high-frequency coefficients; the threshold is obtained from the median amplitude of the highest-frequency sub-band, and the high-frequency coefficients are denoised by soft-thresholding with this threshold; the processed coefficients are then used for signal reconstruction to obtain the preprocessed infrared spectral data.
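The three steps above (decompose, soft-threshold, reconstruct) can be sketched in plain Python. This is a minimal illustration, not the implementation used in this paper: it swaps in the simple Haar wavelet for the DT-CW basis, takes the threshold directly as the median absolute coefficient of the highest-frequency sub-band, and leaves only the lowest-frequency sub-band untouched. All function names are hypothetical.

```python
import math
from statistics import median

SQRT2 = math.sqrt(2.0)


def haar_split(x):
    """One Haar analysis step: return (approximation, detail) half-length lists."""
    a = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    d = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return a, d


def haar_merge(a, d):
    """Inverse of haar_split."""
    x = []
    for ai, di in zip(a, d):
        x.append((ai + di) / SQRT2)
        x.append((ai - di) / SQRT2)
    return x


def wpd(signal, levels=3):
    """Wavelet packet decomposition; sub-bands are returned in frequency order."""
    nodes = [list(signal)]
    for _ in range(levels):
        nxt = []
        for i, node in enumerate(nodes):
            a, d = haar_split(node)
            # Children of odd-indexed parents are swapped to keep frequency order.
            nxt.extend([a, d] if i % 2 == 0 else [d, a])
        nodes = nxt
    return nodes


def iwpd(nodes):
    """Reconstruct the signal from the sub-bands produced by wpd."""
    nodes = [list(n) for n in nodes]
    while len(nodes) > 1:
        prev = []
        for i in range(0, len(nodes), 2):
            first, second = nodes[i], nodes[i + 1]
            a, d = (first, second) if (i // 2) % 2 == 0 else (second, first)
            prev.append(haar_merge(a, d))
        nodes = prev
    return nodes[0]


def soft_threshold(c, thr):
    """Shrink a coefficient toward zero by thr, keeping its sign."""
    return math.copysign(max(abs(c) - thr, 0.0), c)


def wpd_denoise(signal, levels=3):
    """Denoise a signal whose length is a multiple of 2**levels."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    nodes = wpd(signal, levels)
    # Threshold from the median amplitude of the highest-frequency sub-band.
    thr = median([abs(c) for c in nodes[-1]])
    cleaned = [nodes[0]] + [[soft_threshold(c, thr) for c in n] for n in nodes[1:]]
    return iwpd(cleaned), thr
```

Because the Haar transform is orthonormal, reconstructing the unmodified sub-bands returns the input, and soft-thresholding can only reduce the signal energy; a real spectrum would first need padding or trimming to a suitable length.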

2.3. Principles of BOSS Algorithm Feature Extraction

The infrared spectrum contains a large amount of information. If the entire spectral data is used to train the prediction model, problems such as overfitting and slow learning easily arise. The feature extraction method used in this paper is the BOSS algorithm [28], a variable selection method that combines the bootstrap sampling technique with the concept of soft shrinkage. It screens out representative and highly relevant feature wavelength parameters, so that the model is built on these rather than on all wavelengths of the spectrum, and it optimizes the data structure and improves the learning speed while maintaining the stability and accuracy of the model.
The core idea of the BOSS algorithm is to generate K subsets in the whole spectral data space through bootstrap sampling and to use all subsets to establish K partial least squares regression sub-models. The infrared spectral data used in this paper contains a total of 1750 data points; with 7 data points collected per subset, a total of 250 subsets are generated, and a regression sub-model is established on each. The wavelength variables in all the subsets are given equal initial weights to establish the initial weighted model. The weights are then updated according to the magnitude of the regression coefficient of each variable: the weights of variables with large regression coefficients are increased, and the weights of variables with small regression coefficients are decreased. Based on the updated weights, a new round of sampling is conducted, which preferentially retains the high-weight wavelength variables in the modeling and softly shrinks the variable space. When the iteration ends, the variable subset whose model has the lowest root mean square error (RMSE) in leave-one-out cross-validation is selected as the best dataset [29].
The formula for calculating the RMSE is as follows:
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (y_r - y_p)^2}{N}}$$
where $N$ is the number of samples, $y_p$ is the predicted value from the BOSS algorithm model, and $y_r$ is the actual content of acid value in the transformer oil.
In this paper, the BOSS algorithm is used to extract features from the infrared spectral data of the transformer oil and to reduce the dimensionality of the infrared spectra, which avoids the problems of overfitting and slow learning speed during the training process of the subsequent prediction model and improves the accuracy of the prediction model.
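The weight update and resampling at the heart of the BOSS iteration can be sketched as follows. This is a simplified illustration under stated assumptions: the regression coefficients of each sub-model are taken as given (fitting the PLS sub-models is omitted), each coefficient vector is normalized to unit length and its absolute values are summed into the new weights, which is one common formulation of the update, and the names `update_weights` and `weighted_bootstrap` are hypothetical.

```python
import math
import random


def update_weights(submodels, n_vars):
    """Compute new variable weights from sub-model regression coefficients.

    submodels : list of (variable_indices, coefficients) pairs, one per sub-model
    n_vars    : total number of wavelength variables
    """
    new_w = [0.0] * n_vars
    for var_idx, coefs in submodels:
        norm = math.sqrt(sum(c * c for c in coefs))
        if norm == 0.0:
            continue  # a sub-model with all-zero coefficients carries no information
        for v, c in zip(var_idx, coefs):
            new_w[v] += abs(c) / norm
    total = sum(new_w)
    if total == 0.0:
        return [1.0 / n_vars] * n_vars  # fall back to equal weights
    return [w / total for w in new_w]


def weighted_bootstrap(weights, n_draws, rng):
    """Draw variables with replacement in proportion to their weights; keep unique ones."""
    drawn = rng.choices(range(len(weights)), weights=weights, k=n_draws)
    return sorted(set(drawn))


# Hypothetical example with 4 variables and 2 sub-models
weights = update_weights([([0, 1], [3.0, 4.0]), ([1, 2], [0.0, -2.0])], n_vars=4)
picked = weighted_bootstrap(weights, n_draws=10, rng=random.Random(0))
```

In this toy example the fourth variable never receives any weight, so it can no longer be drawn; this is the mechanism by which uninformative wavelengths are gradually shrunk out of the variable space.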

2.4. Deep Neural Network Modeling

The infrared spectral information of the transformer oil after WPD noise reduction and BOSS dimensionality reduction is used as input, and a deep neural network (DNN) is used for modeling and prediction in this paper [30,31]; a prediction model that can quickly and accurately predict the acid value content of the transformer oil can be obtained after good training. The DNN is an artificial neural network with two or more hidden layers, including input layer, hidden layer and output layer [32]. The mapping relationship between its input and output is as follows:
$$y_{out} = f_{\theta_n}\left(\cdots f_{\theta_1}(x_{in}) \cdots\right)$$
$$f_{\theta_i}(x_i) = R(W_i x_i + b_i)$$
where $y_{out}$ denotes the output of the neural network (the output of the last hidden layer), $f_{\theta_i}$ denotes the feed-forward transfer function of the neurons in layer $i$, $\theta = \{W, b\}$ denotes the parameters to be optimized by the neural network, $W_i$ and $b_i$ denote the weights and offsets between the neurons in layer $i$ and layer $i + 1$, respectively, $R$ represents the activation function, and $x_i$ denotes the inputs of the neurons in layer $i$, $i = 1, 2, \ldots, n$, where $n$ denotes the number of layers in the hidden layer of the neural network.
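The layer-by-layer mapping can be made concrete with a small forward-pass sketch. It assumes the ReLU activation for the hidden layers and, as is usual for regression, a linear final layer; the linear output, the function names, and the toy weights are assumptions for illustration only.

```python
def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def dense(weights, biases, x):
    """Compute W x + b, where weights is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def forward(layers, x):
    """Propagate x through a list of (W, b) layers.

    Hidden layers use ReLU; the last layer is left linear (assumed here).
    """
    for i, (weights, biases) in enumerate(layers):
        z = dense(weights, biases, x)
        x = z if i == len(layers) - 1 else relu(z)
    return x


# Toy network: 2 inputs -> 2 hidden neurons -> 1 output
toy_layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.5]),
]
toy_output = forward(toy_layers, [2.0, 1.0])
```

For the input (2, 1), the hidden pre-activations are (1, -1), ReLU zeroes the negative one, and the output neuron returns 1 + 0 + 0.5.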
In this paper, the adaptive moment estimation (Adam) algorithm [33] is used as the optimizer. It combines the momentum method with the idea of an adaptive learning rate and computes an adaptive learning rate for each parameter, which gives it advantages such as fast convergence and robustness to hyperparameter settings.
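A single Adam update can be sketched as follows. The decay rates and the small constant `eps` are the commonly used defaults and are assumptions here, since this paper reports only the learning rate of 0.0001; the function name `adam_step` is hypothetical.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a flat list of parameters.

    theta, grad : current parameters and their gradients
    m, v        : running first and second moment estimates
    t           : 1-based iteration counter, used for bias correction
    Returns the updated (theta, m, v).
    """
    new_theta, new_m, new_v = [], [], []
    for p, g, mi, vi in zip(theta, grad, m, v):
        mi = beta1 * mi + (1.0 - beta1) * g        # momentum term
        vi = beta2 * vi + (1.0 - beta2) * g * g    # per-parameter scale term
        m_hat = mi / (1.0 - beta1 ** t)            # bias-corrected estimates
        v_hat = vi / (1.0 - beta2 ** t)
        new_theta.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(mi)
        new_v.append(vi)
    return new_theta, new_m, new_v


# First step on a single parameter with gradient 2.0
theta1, m1, v1 = adam_step([1.0], [2.0], [0.0], [0.0], t=1)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate regardless of the gradient's magnitude; this is what makes the step size adaptive per parameter.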
The rectified linear unit (ReLU) activation function is selected for the DNN hidden layers primarily to overcome the gradient vanishing challenges in deep architectures while maintaining computational efficiency. The ReLU’s non-saturating property preserves gradient magnitude during backpropagation, as its derivative remains one for positive inputs rather than approaching zero like saturating functions (e.g., sigmoid/tanh). This characteristic is particularly critical for our transformer oil acid value prediction model, where the DNN must propagate gradients through >15 layers to extract linear relationships from feature parameters without signal degradation. Compared with other activation functions, the ReLU activation function can improve convergence speed and prediction performance.
The parameters of the input layer are set as wave number and absorbance size, and the parameters of the output layer are set as acid content. The number of hidden layers with the highest prediction accuracy was obtained through experiments as three layers, the number of neurons in each layer was 20, the learning rate was 0.0001, and the number of iterations was 2000. Finally, the established DNN model is shown in Figure 2. In this paper, the DNN is optimized by constructing the loss function L and minimizing it during the DNN training process, and the accuracy of the prediction model is verified using the RMSE and the coefficient of determination (R2).
$$L(\theta) = \frac{1}{n} \sum_{n} (y_p - y_r)^2$$
$$R^2 = 1 - \frac{\sum (y_r - y_p)^2}{\sum (y_r - \bar{y}_p)^2}$$
where $n$ is the number of samples used for the DNN, which together form the training set of the DNN, $y_p$ is the predicted value of the DNN, $\bar{y}_p$ is the average of the predicted values, and $y_r$ is the actual content of the transformer oil acid value.
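The loss and the two evaluation metrics can be computed with a few lines of Python. This sketch makes one labeled assumption: the denominator of R2 is taken about the mean of the measured values, which is the usual convention and may differ from the notation in the text. The function names and sample values are hypothetical.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error, the training loss L."""
    return sum((p - r) ** 2 for r, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(mse(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination (assumed: total variance about the measured mean)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((r - p) ** 2 for r, p in zip(y_true, y_pred))
    ss_tot = sum((r - mean_true) ** 2 for r in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted values
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
example_rmse = rmse(measured, predicted)
example_r2 = r_squared(measured, predicted)
```

For these hypothetical values the residual sum of squares is 0.10 and the total sum of squares is 5, giving an R2 of 0.98 and an RMSE of about 0.158.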
The overall flow of the prediction model established in this paper is shown in Figure 3. The infrared spectral data of the transformer oil collected by the infrared spectrometer is preprocessed using three preprocessing methods, namely the WPD algorithm, D1 processing, and D2 processing. After feature extraction by the BOSS algorithm, the preprocessed spectra are input into the DNN prediction model and the PLS regression model, respectively. The root mean square error and coefficient of determination of the prediction results are then derived and compared to verify the advancement of the prediction model composed of WPD noise reduction, BOSS feature extraction, and DNN prediction proposed in this paper.

3. Transformer Oil Infrared Spectroscopy Detection and Prediction Modeling Methods

3.1. Sample and Data Acquisition

The instrument used for the infrared test and analysis is the ALPHA II FT-IR spectrometer with the Platinum ATR accessory, which measures the mid-infrared band from 400 to 4000 cm−1 with a resolution of 4 cm−1 and 32 scans. Each group of samples is collected three times, and the average value is taken as the final spectrum. The transformer oil samples used in this paper are Karamay 25# transformer oil from 89 220 kV transformers in actual operation. The samples fall into two categories, namely samples from the same transformer at different sampling times and samples from different transformers, which together constitute 143 groups of samples. The acid value of each sample was measured by the traditional BTB method: titration with a potassium hydroxide ethanol solution determines the number of milligrams of potassium hydroxide required to neutralize the acidic components in 1 g of oil sample, from which the acid content of the oil sample is calculated. By matching the infrared spectra with the measured acid value data, the prediction of the transformer oil acid value based on artificial intelligence technology is realized.
During sample measurement, the sample cell material and the ambient air also absorb infrared light, so the acquired sample spectrum contains the spectral information of this environmental background. In order to eliminate the influence of the background spectrum on the transformer oil spectrum, a background spectrum is collected before the spectrum of each oil sample is collected, using the same acquisition settings as for the transformer oil spectrum in the infrared spectroscopy software (OPUS 8.0). After the background spectrum is collected, OPUS automatically removes the background spectral information from the sample spectrum to avoid data errors caused by the environmental background. At the same time, collecting a background spectrum for every test avoids the negative impact on the accuracy of the sample spectra of slight changes in the environmental background caused by the experimental operation.
The infrared spectral preprocessing used in this paper includes WPD, D1 processing, and D2 processing. D1 and D2 reduce the overlap of the characteristic peaks of different components of the samples and amplify information such as the rate of change of absorbance or the curvature of the characteristic peaks in the infrared spectrum of the transformer oil. The BOSS feature extraction method is used to reduce the dimensionality of the infrared spectral data of the transformer oil and to avoid overfitting. The reduction from 1750 to 7 variables, as shown in Section 4.4, is accompanied by reduced overfitting: the gap between training and testing accuracy decreased from 12.13% to 1.13% after optimization. The DNN model and the PLS model are used as the prediction models. Finally, a total of nine combined modeling methods are formed, and their predictive performance is compared through the prediction accuracy of the nine groups, which further verifies the advancement of the infrared spectral analysis method combining WPD noise reduction, BOSS feature extraction, and DNN modeling proposed in this paper.

3.2. Spectral Preprocessing and Feature Parameter Extraction

The infrared spectral signal contains a large amount of high-frequency noise. By thresholding the high-frequency sub-bands produced by the WPD algorithm, the high-frequency noise in the signal can be removed and the signal-to-noise ratio of the infrared spectra improved. In this paper, a three-layer wavelet packet decomposition is used with soft thresholding, and the calculated threshold value is 0.0005. After noise reduction of the transformer oil infrared spectral data with this threshold, the WPD-preprocessed infrared spectrum is obtained by selecting the sub-bands that contain useful information for signal reconstruction.
The infrared spectrum contains a large amount of redundant information that is not useful for model training. The BOSS algorithm is based on the bootstrap sampling technique and the concept of soft shrinkage, and realizes the feature extraction of the infrared spectra of the transformer oil by constantly updating the weights of the variables in each subset. The transformer oil infrared spectrum obtained from the experiment initially has 1750 wave number data points. The full spectrum is divided into 250 subsets, so that each subset contains 7 feature parameters; the number of iterations of the BOSS algorithm is set to 50; and the initial weight of every variable is set to 1/1750, so that all variables start with the same weight. The optimal wave number range is determined from the RMSE output after each iteration. By taking the feature extraction result as the input parameter of the DNN model and the transformer oil acid content as the output parameter, accurate prediction of the transformer oil acid content can be realized after good training.

3.3. Data Set Partitioning and Predictive Modeling

A total of 143 sets of spectral data were used in this paper, and the spectral data were divided according to the ratio of 8:2:1, i.e., 104 sets of data were used as the training set, 26 sets of data were used as the validation set, and the remaining 13 were used as the complete external validation set. The division of the dataset and its range of acid value content are shown in Table 1.
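The 8:2:1 partition works out exactly because 143 = 11 × 13, so the three parts contain 8 × 13 = 104, 2 × 13 = 26, and 13 samples. A minimal sketch of such a split is given below; the random shuffling with a fixed seed and the function name `split_8_2_1` are assumptions, since the text does not state how samples were assigned to each set.

```python
import random


def split_8_2_1(n_samples, seed=0):
    """Split sample indices into training, validation, and external sets (8:2:1)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed: random assignment
    n_external = n_samples // 11
    n_validation = 2 * n_samples // 11
    n_train = n_samples - n_validation - n_external
    train = indices[:n_train]
    validation = indices[n_train:n_train + n_validation]
    external = indices[n_train + n_validation:]
    return train, validation, external


train_idx, val_idx, ext_idx = split_8_2_1(143)
```

Keeping the external set disjoint from both other sets is what allows it to serve as a fully independent check of the trained model.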
The infrared spectral data of the transformer oil after WPD noise reduction and BOSS dimensionality reduction are used as the inputs to the input layer neurons, and the acid content of the transformer oil is used as the output to establish a DNN prediction model. The activation function of the hidden layers of the DNN model is set as the ReLU function, the number of neuron nodes is 20, the learning rate is 0.0001, and the number of training iterations is 2000. The accuracy of the prediction model is verified using the RMSE and R2. At the same time, this paper also applies the traditional quantitative regression method PLS for comparison with the accuracy of the DNN model, to further verify the superiority of the neural network model for predicting the acid value content from the transformer oil spectrum.

4. Results and Discussion

4.1. Transformer Oil Infrared Spectral Data Analysis

The spectra of the 143 transformer oil samples were collected using an ALPHA II FT-IR spectrometer, and each sample was scanned three times; the average infrared spectra were taken as the final infrared ATR spectra of the transformer oils, as shown in Figure 4.
Figure 4 shows the absorbance profiles of the 143 selected transformer oil samples across wave numbers from 400 to 4000 cm−1. The x-axis represents the infrared light wave number range, while the y-axis shows the absorbance (Abs) of the transformer oil infrared spectra. As seen in the figure, the absorption peak positions in the mid-infrared spectral curves of these samples are essentially identical. This consistency occurs because the transformer oil is primarily composed of compounds like alkanes, cycloalkanes (saturated hydrocarbons), and aromatic unsaturated hydrocarbons. Importantly, these inherent compositional characteristics exert a much stronger influence on the spectra than the characteristic parameters associated with the transformer oil faults. Consequently, the spectral information among the oil samples is very similar, and their absorption peak positions align closely. At the same time, the absorbance intensity of the absorption peaks varies across the samples due to differences in the concentration of the contained substances.
The presence of the aryl ring C-H stretching vibration absorption peaks at a wave number of 2920 cm−1, the aryl ring C=C vibration absorption peaks at a wave number of 1600 cm−1, and a combination of C-H (-CH3 and -CH2) stretching vibration peaks at 1450 cm−1 indicates that there is a certain degree of correlation between the absorption peaks and the chemical molecular groups in the transformer oil. The mid-infrared spectra of the transformer oil can therefore be analyzed to obtain information about its chemical characteristic parameters, thus realizing the analysis of the oil quality. The acid value of the transformer oil correlates with the information from the carboxylic acid (-COOH) functional group within the 1600~1850 cm−1 range. However, because the content of the acidic compounds is very low compared to the multi-carbon hydrocarbons, a direct analysis based on peak height or peak area to build a standard curve is not feasible. Therefore, feature extraction algorithms are employed. These algorithms extract specific parameters within the transformer oil spectrum that exhibit the strongest correlation with the acid value. Applying these identified parameters to a prediction model then enables a rapid prediction of the transformer oil’s acid value.

4.2. Comparison of Results Between the WPD Algorithm and Traditional Preprocessing Methods

Figure 5 compares the original spectral data with the data after noise reduction by the WPD algorithm. The selected wave number range is a region of low absorbance, which is strongly affected by noise. As Figure 5 shows, after WPD processing with soft thresholding, the high-frequency noise is largely removed and the overall curve becomes smoother, while the effective features at the absorption peaks are retained. This provides spectral data with a higher signal-to-noise ratio for the subsequent feature extraction and model prediction.
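The soft-thresholding idea can be illustrated with a minimal sketch. It assumes a Haar wavelet, a depth-2 wavelet packet tree, and a fixed threshold `t`; none of these settings is stated in this section, the function names are hypothetical, and the step-shaped test signal is synthetic rather than a measured spectrum.

```python
import math
import random

SQRT2 = math.sqrt(2.0)

def haar_split(x):
    """One Haar analysis step: (approximation, detail), each half the length of x."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_merge(approx, detail):
    """Inverse of haar_split (perfect reconstruction)."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out

def soft_threshold(coeffs, t):
    """Shrink every coefficient toward zero by t; |c| <= t becomes 0."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def wpd_denoise(x, level, t):
    """Wavelet packet denoising; len(x) must be a multiple of 2**level.

    Unlike a plain wavelet transform, both the approximation and the
    detail band are split again at every level. All leaf bands except
    the lowest-frequency one are soft-thresholded before reconstruction.
    """
    def recurse(band, depth, is_lowest):
        if depth == 0:
            return band if is_lowest else soft_threshold(band, t)
        approx, detail = haar_split(band)
        return haar_merge(recurse(approx, depth - 1, is_lowest),
                          recurse(detail, depth - 1, False))
    return recurse(list(x), level, True)

# Synthetic low-absorbance segment (assumption): flat steps plus Gaussian noise.
random.seed(0)
clean = [0.0] * 8 + [1.0] * 8 + [0.25] * 16
noisy = [c + random.gauss(0.0, 0.05) for c in clean]
denoised = wpd_denoise(noisy, level=2, t=0.1)

def squared_error(y):
    """Sum of squared deviations from the noise-free signal."""
    return sum((a - b) ** 2 for a, b in zip(y, clean))

print(squared_error(noisy), squared_error(denoised))
```

Because each Haar split is orthonormal, shrinking the noise-only packet coefficients can only reduce the reconstruction error for this synthetic signal. On real spectra the threshold has to be chosen so that the peak features are not shrunk away as well.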
Figure 6 shows the infrared spectra of the transformer oil after three different preprocessing methods. Derivative processing of the infrared spectrum can eliminate the effect of baseline drift in the spectral data and amplify the infrared spectral features to improve the spectral signal-to-noise ratio. The first-order derivative (D1) provides the slope and peak information of the spectral curve, while the second-order derivative (D2) provides its concavity and curvature information; derivative spectra can therefore support functions such as peak localization and the separation of overlapping peaks. However, Figure 6 shows that the spectral curve after noise reduction by the WPD algorithm is smoother, whereas the curves processed by D1 and D2 still contain more high-frequency noise. This indicates that the WPD algorithm preprocesses the infrared spectra better than the traditional preprocessing methods and effectively removes the adverse effects of noise on the spectral data.
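The baseline-removal property of derivative preprocessing can be checked with a small sketch. D1 and D2 are approximated here with central finite differences on a uniform wave number grid; this is an assumption, since the section does not say how the derivatives were computed (Savitzky–Golay filtering is a common alternative), and the function names and sample values are hypothetical.

```python
def first_derivative(y, dx):
    """Central-difference D1 at the interior points (length len(y) - 2)."""
    return [(y[i + 1] - y[i - 1]) / (2.0 * dx) for i in range(1, len(y) - 1)]

def second_derivative(y, dx):
    """Central-difference D2 at the interior points (length len(y) - 2)."""
    return [(y[i + 1] - 2.0 * y[i] + y[i - 1]) / (dx * dx)
            for i in range(1, len(y) - 1)]

# Hypothetical absorbance curve y = 3 + 2x + x^2 sampled at x = 0..5.
spectrum = [3 + 2 * x + x * x for x in range(6)]
# The same curve with a constant baseline offset of 0.5 added.
offset_spectrum = [v + 0.5 for v in spectrum]

print(first_derivative(spectrum, 1.0))         # slope 2 + 2x at x = 1..4
print(first_derivative(offset_spectrum, 1.0))  # identical: the offset cancels
print(second_derivative(spectrum, 1.0))        # constant curvature
```

A constant offset vanishes in D1 and a linear baseline vanishes in D2. The same differencing also subtracts neighbouring noisy points, which is consistent with the residual high-frequency noise seen in the D1 and D2 curves.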
Since the mid-infrared spectral region coincides with the vibrational absorption region of the C=O, C=C, and O-H functional groups in organic molecules, scanning the mid-infrared spectra of the transformer oil samples yields characteristic information about the carboxyl, aldehyde, and hydroxyl groups, among others, contained in the samples. The preprocessed infrared spectral data and the acid value content of the transformer oil are used as inputs to the BOSS feature extraction algorithm. Feature extraction yields the feature parameters with the highest correlation with the acid value content, and the prediction model is then trained on these parameters.

4.3. Analysis of Feature Extraction Results

Table 2 lists the ranges of the feature parameters and their lowest RMSE values obtained by applying the BOSS algorithm to the transformer oil infrared spectral data preprocessed by WPD, D1, and D2, respectively. Figure 7 shows the wave number positions of the feature parameters on the spectral curves.
The highlighted bands in Figure 7 are the characteristic bands extracted by the BOSS algorithm; Figure 7a–c show the distributions for the spectral data preprocessed by the WPD algorithm, D1, and D2, respectively. The characteristic variables are marked and enlarged in the figure so that the specific locations of the characteristic bands can be observed. As Figure 7 shows, the parameters extracted by WPD noise reduction and the BOSS algorithm lie in the vicinity of 1600 cm−1 (1604~1666 cm−1 in Table 2), which is consistent with the characteristic peaks of the carboxyl functional group ranging from 1600 to 1850 cm−1. This indicates that, in the infrared spectral data obtained in the present experiments, this band is closely related to the acid value content of the transformer oil. The characteristic parameters extracted by D1 and D2 preprocessing and the BOSS algorithm are distributed roughly near 1450 cm−1 (1428~1446 cm−1 and 1390~1403 cm−1, respectively, in Table 2), where the characteristic functional groups are essentially the combined stretching vibration peaks of C-H (-CH3 and -CH2). It is thus speculated that D1 and D2 preprocessing may reduce the correlation of the parameter information that directly responds to the acid value content, or that noise in the derivative spectra obscures this characteristic information. As a result, the feature parameters selected by the BOSS algorithm fall at the characteristic positions of the methyl and methylene groups instead of the characteristic peak positions of the carboxyl functional group, which suggests that the preprocessing method influences the selection of the feature parameters.
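The variable-selection loop behind the BOSS algorithm can be hinted at with a partial sketch of two of its steps: turning sub-model regression coefficients into variable weights, and drawing the next variable subset by weighted bootstrap sampling. This follows the general description of BOSS in [28] as understood here and is not the authors' implementation; the sub-model fitting (PLS with cross-validation) is omitted, and the coefficient values and function names are hypothetical.

```python
import math
import random

def update_weights(coef_vectors):
    """Weight per variable: sum over sub-models of |coefficient| after
    scaling each sub-model's coefficient vector to unit length.

    Each vector has one entry per spectral variable; variables that a
    sub-model did not use carry a coefficient of 0.
    """
    n_vars = len(coef_vectors[0])
    weights = [0.0] * n_vars
    for coefs in coef_vectors:
        norm = math.sqrt(sum(c * c for c in coefs)) or 1.0
        for j, c in enumerate(coefs):
            weights[j] += abs(c) / norm
    return weights

def weighted_bootstrap_variables(weights, n_draws, rng):
    """Draw variable indices with replacement, proportionally to weight,
    and keep the unique indices (sorted)."""
    indices = list(range(len(weights)))
    drawn = rng.choices(indices, weights=weights, k=n_draws)
    return sorted(set(drawn))

# Hypothetical coefficients of two retained sub-models over four variables.
best_models = [[3.0, 0.0, 4.0, 0.0],
               [0.0, 0.0, 1.0, 0.0]]
weights = update_weights(best_models)
subset = weighted_bootstrap_variables(weights, n_draws=10, rng=random.Random(0))
print(weights, subset)
```

Variables with small weights are drawn less often rather than being removed outright, which is the "soft shrinkage" aspect; variables whose weight reaches zero are no longer drawn at all.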

4.4. Analysis of Prediction Model Accuracy

In this paper, the PLS algorithm, a traditional statistical modeling method for regression problems among multiple variables that is commonly used for infrared spectral regression [34], is compared with the proposed DNN prediction model. Comparison with the PLS prediction results further validates the improvement in prediction accuracy achieved by the proposed DNN model. The PLS and DNN models are both trained with the feature parameters extracted by the BOSS algorithm as inputs and the acid value content of the transformer oil as the output. Table 3 and Figure 8 compare the performance of the prediction models in terms of the prediction accuracies on the training set and the validation set. Combining the different preprocessing methods and prediction models yields a total of nine combinations.
Three comparisons are made. First, the different preprocessing methods are compared under the same prediction model to verify the superiority of the WPD algorithm. Second, the same preprocessing method and prediction model are compared with and without BOSS feature extraction to verify the effectiveness of the BOSS algorithm in improving prediction accuracy. Third, the different prediction models are compared under the same preprocessing method to verify the proposed DNN model against the traditional PLS regression model.
Comparing the prediction accuracies of the different preprocessing methods for the PLS model (No. 1–3) and the DNN model (No. 4–6) in Table 3 and Figure 8 shows that the WPD algorithm effectively improves the accuracy of the infrared spectral prediction model, with a coefficient of determination closer to 1 and a smaller root mean square error. Further calculations on the validation data in Table 3 show that, compared with the D1 processing method, WPD noise reduction raises the R2 of the PLS model by 0.0311 (about 3.54%) and the R2 of the DNN model by 0.0221 (about 2.36%); compared with the D2 processing method, it raises the R2 of the PLS model by 0.0280 (about 3.17%) and the R2 of the DNN model by 0.0192 (about 2.04%). Comparing the DNN model and the PLS model under the same processing method (e.g., No. 1 and No. 4) shows that the DNN model proposed in this paper effectively improves the prediction accuracy of the acid value of the transformer oil: its validation R2 is higher than that of the traditional regression model by 0.0496, an improvement of 5.45%. To verify the effect of the BOSS algorithm on the final accuracy of the prediction model, the DNN prediction model was also trained on the dataset without feature extraction (No. 7–9). The results show that the BOSS algorithm improves the accuracy of model training: with WPD preprocessing, it raises the training set R2 from 0.7758 to 0.9712, an increase of 0.1954, i.e., 25.19%.
Meanwhile, in the models without feature extraction, the validation set error is larger than the training set error and the validation R2 is markedly lower (e.g., 0.6545 versus 0.7758 for WPD), which suggests overfitting: good accuracy on the training set does not carry over to the validation set. This indicates that the BOSS algorithm can, to a certain extent, prevent the DNN model from overfitting during training. Table 3 and Figure 8 show that the proposed prediction model for the acid content of the transformer oil, which consists of WPD noise reduction, BOSS dimensionality reduction, and DNN prediction, achieves an R2 above 0.95, closer to 1 than the traditional preprocessing and regression methods, together with a smaller error. Its prediction performance is therefore superior, and it can accurately predict the acid content of the transformer oil.
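The percentage figures quoted in this subsection can be reproduced directly from the R2 values in Table 3. The short sketch below does that arithmetic; the dictionary layout and function names are illustrative choices, not part of the paper.

```python
# R2 values copied from Table 3:
# (model, processing method) -> (training R2, validation R2).
TABLE3_R2 = {
    ("PLS", "WPD + BOSS"): (0.9275, 0.9103),
    ("PLS", "D1 + BOSS"): (0.8901, 0.8792),
    ("PLS", "D2 + BOSS"): (0.9028, 0.8823),
    ("DNN", "WPD + BOSS"): (0.9712, 0.9599),
    ("DNN", "D1 + BOSS"): (0.9203, 0.9378),
    ("DNN", "D2 + BOSS"): (0.9324, 0.9407),
    ("DNN", "WPD"): (0.7758, 0.6545),
}

def gain(new, old):
    """Return (absolute difference, relative improvement in percent)."""
    diff = new - old
    return round(diff, 4), round(100.0 * diff / old, 2)

def validation_gain(model, new_method, old_method):
    """Gain in validation R2 when switching preprocessing for one model."""
    return gain(TABLE3_R2[(model, new_method)][1],
                TABLE3_R2[(model, old_method)][1])

print(validation_gain("PLS", "WPD + BOSS", "D1 + BOSS"))  # WPD vs D1, PLS
print(validation_gain("DNN", "WPD + BOSS", "D1 + BOSS"))  # WPD vs D1, DNN
print(validation_gain("PLS", "WPD + BOSS", "D2 + BOSS"))  # WPD vs D2, PLS
print(validation_gain("DNN", "WPD + BOSS", "D2 + BOSS"))  # WPD vs D2, DNN

# DNN vs PLS on the validation set (No. 4 vs No. 1).
print(gain(TABLE3_R2[("DNN", "WPD + BOSS")][1],
           TABLE3_R2[("PLS", "WPD + BOSS")][1]))
# Effect of BOSS on the DNN training set (No. 4 vs No. 7).
print(gain(TABLE3_R2[("DNN", "WPD + BOSS")][0],
           TABLE3_R2[("DNN", "WPD")][0]))
```

Each percentage is the R2 difference divided by the R2 of the method being compared against, so it is a relative improvement rather than a difference in percentage points.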

4.5. Full External Validation

Figure 9 shows the scatter plot of the predicted versus actual values of the transformer oil acid content for the 13 sets of infrared spectral data in the complete external validation set, as predicted by the DNN model. The samples in the complete external validation set come from the same transformers sampled at different collection times and from different transformers. Complete external validation further demonstrates the accuracy of the model in predicting the acid value content of the transformer oil and indicates that the model has good robustness and generalization ability. As Figure 9 shows, the scatter points are essentially clustered around the 45° line; the calculated coefficient of determination R2 is 0.9504 and the RMSE is 0.0067, indicating that the predicted values are close to the actual values. The predictive performance of the proposed model therefore remains high on completely external data, which further verifies its accuracy.
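For reference, the two metrics reported for the external validation set can be computed as in the sketch below. It assumes the usual definitions, R2 = 1 − SS_res/SS_tot and RMSE as the square root of the mean squared residual; the section does not spell out the formulas, and the acid values used here are made-up numbers, not the paper's data.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical acid values in mgKOH/g (illustration only).
measured = [0.03, 0.05, 0.07, 0.09]
predicted = [0.04, 0.06, 0.08, 0.10]  # every prediction 0.01 too high

print(rmse(measured, predicted), r_squared(measured, predicted))
```

Points lying exactly on the 45° line give R2 = 1 and RMSE = 0; a constant bias, as in the example, lowers R2 even though the points remain parallel to that line.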

5. Conclusions

This paper adopts a method combining mid-infrared spectral data of transformer oil with specific analysis techniques, including WPD noise reduction, BOSS feature extraction, and DNN modeling. This approach establishes the correlation between the transformer oil’s infrared spectral data and its acid value content. This correlation is compared with that obtained from traditional methods. Ultimately, a prediction model capable of accurately predicting the transformer oil’s acid value content is established. The main conclusions of this work are as follows:
(1) The mid-infrared ATR spectral data of transformer oil samples from transformers in operation were obtained with the ALPHA II FT-IR mid-infrared spectrometer, and the actual acid value content of the transformer oil was measured using the traditional BTB method. The WPD algorithm, D1, and D2 were used to preprocess the transformer oil spectra in order to reduce the influence of noise in the spectral data, smooth the spectral curves, and improve the accuracy of the prediction model.
(2) Applying the BOSS algorithm to the full infrared spectral data of the transformer oil reduces the number of input variables of the prediction model from 1750 to 7. This greatly reduces the dimensionality of the feature variables, improves the computing speed of the model, helps avoid overfitting, and thus improves the accuracy of the model prediction.
(3) After the DNN model is built and trained, a neural network model with good predictive performance is obtained, with a training set RMSE of 0.0054 and R2 of 0.9712 and a validation set RMSE of 0.0066 and R2 of 0.9599 (Table 3). Compared with the traditional PLS regression model, the model performance improves by 4.71%, which verifies the model’s advancement. The prediction accuracy of the model is further verified on a fully external validation set, for which the coefficient of determination (R2) is 0.9504, indicating that the model can accurately determine the acid value content of the transformer oil.
The results of this paper demonstrate that WPD preprocessing reduces the influence of noise in the spectral information, and that the feature parameters extracted by the BOSS algorithm reduce the dimensionality of the spectral data and improve the prediction accuracy of the model. The combination of WPD, the BOSS algorithm, and the DNN model provides a new research idea for the quantitative analysis of the acid value based on transformer oil infrared spectral data. This paper also has limitations. Experimental errors may be introduced by the instruments, the ambient temperature and humidity, or manual operation during the determination of the acid value content and the collection of the infrared spectra. In addition, the limited number of transformers in operation leads to a small number of transformer oil infrared spectra samples available for training. In subsequent studies, the training set will be continuously expanded to further improve the prediction capability of the model. On this basis, future research will be oriented toward portable infrared spectrometers with embedded artificial intelligence algorithms that monitor the characteristic parameters of the transformer oil from its infrared spectral data, providing a simpler technical means for transformer oil detection and for the development of small, portable transformer oil monitoring devices.

Author Contributions

Conceptualization, L.F., C.Z. and Z.P.; methodology, L.F., Y.T., X.H. and S.Z.; validation, Z.P., Y.T., X.H. and Y.Z.; formal analysis, L.F., C.Z. and S.Z.; investigation, X.H. and Y.Z.; data curation, Y.Z. and S.Z.; writing—original draft preparation, L.F., C.Z. and Z.P.; writing—review and editing, X.W.; visualization, Y.T. and S.Z.; supervision, X.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science and Technology Project of State Grid Shandong Electric Power Company (Research on Portable Transformer Oil Intelligent Monitoring and Fault Early Warning Technology Based on Multi-feature Parameter Sensing Technology), grant number 520604240007.

Data Availability Statement

Data available upon request due to restrictions of privacy and legal reasons.

Acknowledgments

The authors would like to thank State Grid Shandong Electric Power Company for providing experimental samples and experimental data. The authors also thank Shandong University for technical assistance.

Conflicts of Interest

Authors Linjie Fang, Chuanshuai Zong, Zhenguo Pang, Ye Tian, Xuezeng Huang, Yining Zhang are employed by the State Grid Shandong Electric Power Company. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The authors declare that this study received funding from State Grid Shandong Electric Power Company. The funder had the following involvement with the study: provision of data and textual materials, formal analysis, investigation, writing—original draft preparation, and validation.

Abbreviations

The following abbreviations are used in this manuscript:
DNN: Deep neural network
WPD: Wavelet packet decomposition
BOSS: Bootstrapping soft shrinkage
R2: Coefficient of determination
BTB: Bromothymol blue
D1: 1st derivative
D2: 2nd derivative
SG: Savitzky–Golay
PLS: Partial least squares
Abs: Absorbance
DT-CW: Dual-tree complex wavelet
Adam: Adaptive moment estimation
RMSE: Root mean square error

References

  1. Chen, Z.M. A review of transformer oils. Synth. Lubr. Mater. 2018, 45, 28.
  2. Xiong, Y.; He, D.-L.; Feng, Y.; Chang, X.-Y.; Liu, F.-R. A rapid method for determination of acid value in transformer oil by PPy modified electrode. J. Cent. South Univ. 2014, 21, 2202–2207.
  3. Mehmood, M.A.; Nazir, M.T.; Li, J.; Zhou, Z.; Wang, F.; Azam, M.M. Comprehensive investigation on service aged power transformer insulating oil after decades of effective performance in field. Arab. J. Sci. Eng. 2020, 45, 6517–6528.
  4. Liao, R.J.; Hao, J.; Liang, S.W.; Zhu, M.Z.; Yang, L.J. Effects of moisture and acid on thermal aging of mineral oil and natural ester hybrid oil-paper insulation. J. Electrotechnol. 2010, 25, 31–37.
  5. NB/SH/T 0836-2010; Determination of Acidity in Oil Insulation Automatically Potentiometric Titration. The National Energy Bureau: Beijing, China, 2010.
  6. GB/T 28552-2012; Determination of Acid Value of Transformer Oil and Turbine Oil Acid (BTB Method). China Standard Press: Beijing, China, 2012.
  7. Wang, Y.; Wang, H.M.; Qian, Y.H.; Wang, Q. Study on the acid value of insulating oil by infrared spectroscopy. Transformer 2022, 59, 55–60.
  8. Jiang, F.Y.; Wu, X.; Gao, S.T.; Wang, S.Y.; Zhang, R.R.; Li, Q.; Liu, X. Determination of furfural content in transformer oil by infrared spectroscopy. China Test 2022, 48, 72–77.
  9. Sun, J.S.; Liu, Y.J.; Zhang, S.N.; Lang, H.; Han, Z.Y. Research on rapid detection technology of turbine oil water content. Contemp. Chem. Ind. 2018, 47, 1310–1313.
  10. Lu, X.L.; Fan, X.L.; Ma, L.H. Comparative study of test conditions for detecting antioxidant content of transformer oil by infrared spectroscopy. Lubr. Oil 2014, 29, 42–49.
  11. Wang, J.; Fu, L.F.; Wang, X.W. Infrared spectroscopy detection technology for power oil. Therm. Power Gener. 2024, 53, 177–183.
  12. Gerretzen, J.; Szymańska, E.; Bart, J.; Davies, A.N.; van Manen, H.-J.; Heuvel, E.R.v.D.; Jansen, J.J.; Buydens, L.M. Boosting model performance and interpretation by entangling preprocessing selection and variable selection. Anal. Chim. Acta 2016, 938, 44–52.
  13. Shi, H.; Yu, P. Using Molecular Spectroscopic Techniques (NIR and ATR-FT/MIR) Coupling with Various Chemometrics to Test Possibility to Reveal Chemical and Molecular Response of Cool-Season Adapted Wheat Grain to Ergot Alkaloids. Toxins 2023, 15, 151.
  14. Huang, T.; Bi, W.; Song, Y.; Yu, X.; Wang, L.; Sun, J.; Jiang, C. DMC-LIBSAS: A Laser-Induced Breakdown Spectroscopy Analysis System with Double-Multi Convolutional Neural Network for Accurate Traceability of Chinese Medicinal Materials. Sensors 2025, 25, 2104.
  15. Ge, C.H.; Liu, Y.J.; Chen, M.S.; Yang, C.; Liang, P.P.; Yao, Z.X.; Zhang, K. Least-squares prediction of wind turbine lubricant acid value by higher-order derivatives of attenuated total reflection-Fourier infrared spectra combined with angular volume. Anal. Chem. 2024, 52, 1254–1269.
  16. Sun, H.Y.; Zhang, H.G.; Xue, M.C.; Lu, F.K. Research on fault diagnosis of solenoid valve of braking system based on wavelet packet decomposition and BP neural network. Railw. Roll. Stock. 2024, 44, 39–45.
  17. Ye, R.L.; Guo, Z.Z.; Liu, R.Y.; Liu, J.N. Wind speed and wind power prediction for wind farms based on wavelet packet decomposition and improved Elman neural network. J. Electrotechnol. 2017, 32, 103–111.
  18. Mu, W.Z.; Zhang, G.Y.; Zhang, W.; Yao, R.; Fu, N. Optimization of quantitative model for near-infrared spectra of yellow water starch based on CARS-SPA feature extraction. Food Sci. 2024, 45, 8–14.
  19. An, M.; Yang, X.Y.; He, L.N.; Zhang, J.Y.; Wang, Y.B. Study on infrared spectroscopy for rapid determination of acidity and other properties of automotive diesel fuel. Pet. Refin. Chem. Ind. 2020, 51, 87–92.
  20. Jiang, J.J.; Jin, X.K.; Li, W.S.; Zhuang, L.; Yuan, X.Z.; Zhu, C.Y. Identification of hemp fiber based on PCA-LDA statistical analysis of infrared spectra. Silk 2024, 61, 102–108.
  21. Yin, Y.; Wang, S.; Liu, C. Quantitative analysis of acid value and peroxide value of edible oil based on near infrared spectroscopy. J. Food Saf. Qual. Test. 2023, 14, 68–76.
  22. Yuan, H.; Lu, L.J.; Wang, S.; Zhao, W.H.; Qiu, L.M.; Xu, G.T. Application of infrared spectroscopy from structural analysis to in situ characterization in petroleum refining and chemical catalyst research. J. Pet. (Pet. Process.) 2024, 40, 1420–1429.
  23. Pu, H.; Kamruzzaman, M.; Sun, D.-W. Selection of feature wavelengths for developing multispectral imaging systems for quality, safety and authenticity of muscle foods—A review. Trends Food Sci. Technol. 2015, 45, 86–104.
  24. Liu, G.; Gong, Y.Q.; Zhang, H.; Liang, H.B. Infrared spectral noise reduction algorithm based on wavelet transform optimized EEMD combined with SG. Infrared Technol. 2024, 46, 1453–1458.
  25. Cai, J.H.; Yin, S.J.; Chen, C.; Wang, S.M.; Liang, S.W. Study of Fourier transform attenuated total reflectance infrared spectroscopy screening model for phenylketonuria based on wavelet and wavelet packet transforms. J. Anal. Sci. 2016, 32, 458–462.
  26. Shi, H.L.; Hu, B. A review of dual-tree complex wavelet transform and its applications. Inf. Electron. Eng. 2007, 3, 229–234.
  27. Kingsbury, N. Complex Wavelets for Shift Invariant Analysis and Filtering of Signals. Appl. Comput. Harmon. Anal. 2001, 10, 234–253.
  28. Deng, B.-C.; Yun, Y.-H.; Cao, D.-S.; Yin, Y.-L.; Wang, W.-T.; Lu, H.-M.; Luo, Q.-Y.; Liang, Y.-Z. A bootstrapping soft shrinkage approach for variable selection in chemical modeling. Anal. Chim. Acta 2016, 908, 63–74.
  29. Yan, H.; Song, X.; Tian, K.; Chen, Y.; Xiong, Y.; Min, S. Quantitative determination of additive chlorantraniliprole in abamectin preparation: Investigation of bootstrapping soft shrinkage approach by mid-infrared spectroscopy. Spectrochim. Acta A 2018, 191, 296–302.
  30. Jiang, D.Y. Analysis of Tobacco Near-Infrared Spectral Data Based on Deep Learning. Master’s thesis, Chongqing University of Posts and Telecommunications, Chongqing, China, 2021.
  31. Wang, Z.; Li, J. Prediction model of SBS content in DNN-based modified asphalt. J. Constr. Mater. 2021, 24, 630–636.
  32. Abiodun, O.I.; Jantan, A.; Omolara, A.E.; Dada, K.V.; Mohamed, N.A.; Arshad, H. State-of-the-art in artificial neural network applications: A survey. Heliyon 2018, 4, e00938.
  33. Kim, H.; Wang, C.; Byun, H.; Hu, W.; Kim, S.; Jiao, Q.; Lee, T.H. Variable three-term conjugate gradient method for training artificial neural networks. Neural Netw. 2023, 159, 125–136.
  34. Xiao, Z.L.; Yuan, R.Y.; Fu, Z.; Liu, C.; Yin, B.L.; Xiao, M.Z.; Zhao, T.T.; Kuang, Y.J.; Song, L.B. Research on the aging behavior of transformer oil based on machine learning and infrared spectroscopy. Spectrosc. Spectr. Anal. 2025, 45, 434–442.
Figure 1. Schematic diagram of dual-tree complex wavelet decomposition.
Figure 2. Deep neural network model figure.
Figure 3. General flow chart of the transformer oil acid value prediction modeling.
Figure 4. Transformer oil infrared spectra raw curve: (a) enlarged infrared spectrum in the wavenumber range of 1300~1700 cm−1; (b) enlarged infrared spectrum in the wavenumber range of 2700~3200 cm−1. The different colors in the figure represent different transformer oil samples.
Figure 5. Comparison of infrared spectra of the transformer oil before and after noise reduction by the WPD algorithm.
Figure 6. Infrared spectral curve after preprocessing: (a) WPD processing; (b) D1 processing; (c) D2 processing.
Figure 7. Distribution of Acid Content Characterization Results Extracted by Different Pretreatment Methods: (a,a1) distribution of feature extraction after WPD + BOSS processing; (b,b1) distribution of feature extraction after D1 + BOSS processing; (c,c1) distribution of feature extraction after D2 + BOSS processing.
Figure 8. Comparison of the R2 and RMSE of the training and validation sets for the different methods.
Figure 9. Figure of predicted versus actual values for the full external validation set.
Table 1. Data set division and acid content range.
Sample Dataset Name | Number of Samples | Acid Value Content Range (mgKOH/g)
infrared spectral data sets | 143 | 0.002~0.152
training set | 104 | 0.002~0.152
validation sets | 26 | 0.030~0.124
full external validation set | 13 | 0.033~0.105
Table 2. Distribution of Characteristic Parameter and minimum RMSE under different preprocessing methods.
Processing Methods | Characteristic Parameter Range (cm−1) | Minimum RMSE
WPD + BOSS | 1604~1666 | 0.0006611
D1 + BOSS | 1428~1446 | 0.0010242
D2 + BOSS | 1390~1403 | 0.0008651
Table 3. Comparison of predictive performance of different methods and models.
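Modeling Method | Serial Number | Processing Method | Training Set R2 | Training Set RMSE | Validation Set R2 | Validation Set RMSE
PLS | 1 | WPD + BOSS | 0.9275 | 0.0085 | 0.9103 | 0.0106
PLS | 2 | D1 + BOSS | 0.8901 | 0.0112 | 0.8792 | 0.0148
PLS | 3 | D2 + BOSS | 0.9028 | 0.0100 | 0.8823 | 0.0122
DNN | 4 | WPD + BOSS | 0.9712 | 0.0054 | 0.9599 | 0.0066
DNN | 5 | D1 + BOSS | 0.9203 | 0.0085 | 0.9378 | 0.0098
DNN | 6 | D2 + BOSS | 0.9324 | 0.0072 | 0.9407 | 0.0085
DNN | 7 | WPD | 0.7758 | 0.0227 | 0.6545 | 0.0244
DNN | 8 | D1 | 0.7361 | 0.0271 | 0.6275 | 0.0304
DNN | 9 | D2 | 0.7693 | 0.0257 | 0.6452 | 0.0271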
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Fang, L.; Zong, C.; Pang, Z.; Tian, Y.; Huang, X.; Zhang, Y.; Wang, X.; Zhang, S. Transformer Oil Acid Value Prediction Method Based on Infrared Spectroscopy and Deep Neural Network. Energies 2025, 18, 3345. https://doi.org/10.3390/en18133345
