Article

Research on the State of Charge Estimation of Electric Forklift Batteries Based on an Improved Transformer Model

1
School of Automotive Engineering, Suzhou University of Technology, Changshu 215500, China
2
Shanghai Yiri Technology Co., Ltd., Shanghai 201805, China
*
Author to whom correspondence should be addressed.
Batteries 2026, 12(2), 41; https://doi.org/10.3390/batteries12020041
Submission received: 6 December 2025 / Revised: 19 January 2026 / Accepted: 20 January 2026 / Published: 23 January 2026

Abstract

The state of charge (SoC) is one of the critical parameters in battery management systems, as it directly determines the operational safety and reliability of batteries. To accurately predict the SoC of an electric forklift under varying operating conditions, two surrogate models, an improved Transformer and an improved Transformer 2, are developed. The experimental data obtained through real-vehicle tests are multi-dimensional and contain multiple sources of noise, resulting in poor prediction accuracy when only a single preprocessing algorithm is used. Therefore, this paper first discusses the effect of the preprocessing algorithms on SoC estimation. Compared with using the original experimental data or the Kalman filter alone, the Kalman filter–principal component analysis (PCA) method is more suitable for preprocessing the original electric forklift data. The mean absolute error (MAE) and root mean square error (RMSE) of the improved Transformer model obtained using the Kalman filter–PCA method are reduced by 26.32% and 27.73%, respectively, compared to the single Kalman filter method. Then, this study investigates the impact of data with different dimensions on the prediction performance of the improved Transformer model. The results show that five-dimensional data train the improved Transformer model more effectively: the MAE decreases by 14.63% and 19.54%, and the RMSE decreases by 14.85% and 20.37%, compared to three-dimensional and seven-dimensional data, respectively. Through the analysis of the improved Transformer model, an improved Transformer 2 model with higher prediction accuracy is obtained. The improved Transformer 2 model is then compared with the LSTM and CNN algorithms. The results indicate that the improved Transformer 2 model can predict SoC more stably and accurately than the single LSTM and CNN algorithms. Specifically, compared with the LSTM model, the improved Transformer 2 model reduces the MAE by 77.16% and the RMSE by 91.75%. In comparison with the CNN model, the MAE is reduced by 71.81% and the RMSE by 80%.

1. Introduction

With the continuous development and exploration of renewable energy, lithium-ion batteries have occupied a dominant position in energy storage systems owing to their advantages of large energy storage capacity. As one of the main parameters of batteries, the state of charge (SoC) can accurately reflect the remaining capacity and provide key data for the operation of electric vehicles. Since the SoC must be indirectly calculated through measurable physical quantities such as current and voltage, rapid and accurate SoC estimation is of great significance for the large-scale application of lithium-ion batteries [1].
Direct test methods, model-based methods, and data-driven methods can be used to estimate SoC. Direct test methods need extensive testing and are always susceptible to the influence of test conditions [2]. Model-based methods use domain knowledge to establish complex mathematical models, and the parameter adjustment process of these methods is quite complex, and their convergence is also poor [3]. Data-driven methods have become an effective SoC estimation approach in recent years. Data-driven methods use various artificial intelligence models to analyze the laws between battery-related parameters and SoC [4].
Kim and Lee [5] trained a long short-term memory (LSTM) model using battery data under multi-temperature and multi-load conditions, achieving a significant decrease in mean absolute error (MAE) and root mean square error (RMSE) with an increase in training epochs. Franzese et al. [6] utilized an LSTM layer combined with a fully connected layer to predict the residual capacity of a 40 Ah lithium iron phosphate (LFP) battery. The improved model exhibited excellent generalization in scenarios across battery aging cycles and chemical systems, enabling low-error estimation with a small amount of training data. Xie et al. [1] proposed a physics-constrained informer-LSTM model. This model utilizes a second-order equivalent circuit model and a temperature-aware weighted loss function to evaluate SoC. It achieves high-precision estimation with an average MAE of 0.782% and RMSE of 0.974% within a wide temperature range (0–50 °C). The proposed model is significantly superior to traditional data-driven models. Ding et al. [7] combined convolutional neural network (CNN), LSTM, self-attention mechanisms, and squeeze-and-excitation attention mechanisms to predict SoC. The research results show that the prediction RMSE under the Urban Dynamometer Driving Schedule (UDDS) condition is only 1.73%. Compared with gated recurrent unit (GRU), LSTM, and CNN-LSTM models, its error is reduced by 15–31.9%. Jafari et al. [4] used the PSO algorithm to optimize the parameters of the CNN-LSTM-ConvLSTM model. Then, a fusion model was utilized to predict SoC, achieving a prediction result with an R2 of 99%. Song et al. [8] proposed a CNN-LSTM algorithm for SoC estimation of lithium-ion batteries. In the algorithm, the CNN-LSTM model was trained on data from different operating conditions. Van et al. [9] employed the LSTM network to estimate the SoC of lithium-ion batteries used in electric vehicles. 
Shen and Ge [10] used an LSTM with an attention mechanism to predict the SoC of a lead–acid battery. From the above literature, the frameworks based on LSTM and CNN are the main approaches for SoC estimation. With the development of artificial intelligence technology, the application of Transformer models in data prediction has become feasible. The Transformer algorithm, with its attention mechanism, offers a promising method for accurately capturing multi-dimensional features of the SoC. However, the above literature reveals that the existing studies have not explored the feasibility of the Transformer model for SoC estimation of an electric forklift under actual variable operating conditions. Therefore, this paper uses the Transformer algorithm to study its feasibility in SoC prediction for an electric forklift.
For traditional vehicles, the coulomb counting algorithm is commonly used to estimate SoC, with subsequent modifications through integration with traditional methods such as the Kalman filter. However, forklifts often operate under long-term variable load conditions. This makes traditional algorithms unsuitable for calculating the SoC due to large estimation errors. In contrast, neural networks can find the intrinsic relationships within data, thereby accurately estimating the SoC. Some studies have combined neural networks with traditional methods to evaluate battery parameters. Leonori et al. [11] integrated an equivalent circuit model with a neural network to achieve accurate modeling of battery cells, but this study did not investigate the feasibility of this developed algorithm in SoC prediction. Estimation of SoC needs to capture complex nonlinear electrochemical features of batteries, presenting greater challenges than voltage prediction. Lyu et al. [12] adopted an LSTM model to characterize the dynamic performance of batteries and used the unscented Kalman filter algorithm to predict SoC. In the algorithm, a proportional integral derivative (PID) controller was established to improve SoC estimation accuracy under different aging states. Paul et al. [13] used an optimized relevant LSTM-squared gain extended Kalman filter model to investigate the influence of training and testing working conditions on SoC estimation. Zhang et al. [14] used an LSTM, attention mechanism, and Kalman filter to estimate the SoC of lithium-ion batteries. This method was trained and tested under different conditions with different temperatures and initial errors. The Kalman filter method was used to update the prediction results.
Noisy data cannot effectively train neural networks. To better capture the intrinsic relationships within the data, Hu et al. [15] employed a non-linear state space reconstruction LSTM method to estimate the SoC of LFP battery cells for electric wheelchairs. The Pearson correlation coefficient (PCC) was used to simplify the key parameters. Then the feasibility of the non-linear state space reconstruction LSTM algorithm was verified by predicting SoC at different cycle numbers. The PCC method is used in this paper to process the original data in order to obtain the critical parameters for training the improved LSTM network. The battery data exhibit strong nonlinear relationships. The original data obtained from real-vehicle tests are high-dimensional and contain noise. The existing studies typically use a single data preprocessing method to process the original experimental data. These studies lack a detailed discussion of the prediction performance combining denoising and dimensionality reduction. In addition, previous research lacks the analysis on the impact of data dimensions on SoC prediction accuracy under varying operating conditions of an electric forklift. Therefore, this paper discusses the performance of different preprocessing algorithms by comparing three data preprocessing schemes (original experimental data, Kalman filter, and Kalman filter–principal component analysis). The 12-dimensional data are reduced to 3, 5, and 7 dimensions for SoC prediction, and the optimal dimension is determined, thereby filling a research gap in this field. Then, an improved Transformer model is developed to estimate the SoC of an electric forklift using the optimal sample data. The main contributions of the proposed method are summarized as follows:
(1)
A Kalman filter–principal component analysis (PCA) algorithm is used in this paper to process the original experimental data. The Kalman filter is first used to suppress noise, and then PCA is employed to reduce the dimensionality of the filtered data, thereby lowering computational complexity. By evaluating the performance of different data preprocessing methods for training the surrogate model, an optimal data processing strategy is found, which improves the accuracy of SoC estimation.
(2)
The proposed initial improved Transformer algorithm incorporates a convolution layer, an LSTM layer, and two-layer attention mechanisms. By evaluating the performance of different algorithms (LSTM, CNN, and some improved Transformer models) in battery SoC estimation, an optimal SoC estimation method suitable for an electric forklift under variable operating conditions is proposed.
The structure of this paper is as follows: Section 2 introduces the main mathematical methods adopted in this paper. Section 3 describes the experimental methods for the lithium iron phosphate battery of an electric forklift. Section 4 provides a detailed discussion of the prediction results of different algorithms for SoC evaluation, and Section 5 presents the conclusions of this paper.

2. Methodology

2.1. PCA

Due to the large volume of testing data obtained from each individual battery and the entire battery pack through experiments, the PCA method is employed in this study to reduce the dimensionality of the original experimental data. This method can effectively project high-dimensional data onto a low-dimensional subspace by finding an optimal projection matrix R using [16,17]:
$$\underset{R^{T}R=W}{\arg\min}\;\sum_{j=1}^{m}\left\|x_{j}-RR^{T}x_{j}\right\|_{2}^{2}$$
where $\|\cdot\|_{2}^{2}$ denotes the squared norm, and $W$ represents the identity matrix.
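The projection described above can be sketched with NumPy. This is a minimal illustration, not the authors' implementation: it assumes the data are mean-centred before projection and obtains R from a singular value decomposition, and the function names `pca_projection` and `reconstruction_error` as well as the synthetic data are hypothetical.

```python
import numpy as np

def pca_projection(X, d):
    """Return (R, Z): an orthonormal projection matrix R and the projected data Z."""
    Xc = X - X.mean(axis=0)                  # centre each feature (assumption)
    # Right singular vectors of the centred data are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    R = Vt[:d].T                             # shape (n_features, d), with R^T R = I
    Z = Xc @ R                               # low-dimensional representation
    return R, Z

def reconstruction_error(X, R):
    """Sum over samples of ||x_j - R R^T x_j||_2^2 for the centred data."""
    Xc = X - X.mean(axis=0)
    return float(np.sum((Xc - Xc @ R @ R.T) ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))               # synthetic stand-in for 12-dimensional data
R, Z = pca_projection(X, 5)
print(Z.shape, reconstruction_error(X, R))
```

Among all matrices with orthonormal columns, the R returned here gives the smallest reconstruction error, which is the objective written above.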

2.2. LSTM

LSTM is a type of recurrent neural network that improves upon the standard RNN architecture by introducing gating mechanisms. The LSTM network consists of three gate structures: the forget gate, the input gate, and the output gate. The outputs of the forget gate are given by [18]:
$$f_{s}=\sigma\left(w_{l}\cdot\left[h_{s-1},x_{s}\right]+b_{l}\right)$$
where $f_{s}$ represents the output of the forget gate, $\sigma$ denotes the Sigmoid activation function, $w_{l}$ is the weight matrix, $x_{s}$ is the input at the current time step $s$, $h_{s-1}$ is the hidden state from the previous time step $s-1$, and $b_{l}$ is the bias term.
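The memory cell state $c_{s}$, which is updated through the input gate, can be expressed as [19]:
$$c_{s}=f_{s}\odot c_{s-1}+i_{s}\odot\tilde{c}_{s}$$
where $\tilde{c}_{s}$ is the candidate memory cell state at time step $s$, $i_{s}$ is the activation value of the input gate, and $\odot$ denotes element-wise multiplication.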
The final output of the output gate $h_{s}$ can be calculated as [19]:
$$h_{s}=o_{s}\odot\tanh\left(c_{s}\right)$$
where $o_{s}$ is the activation value of the output gate.
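The three gate equations above can be combined into a single time step, sketched here in NumPy. Several details are assumptions: the text gives only the forget-gate weights, so the input gate, output gate, and candidate state are given the same standard form, and all four are stacked into one hypothetical weight matrix `W`. The sizes (5 inputs, 50 hidden units) are borrowed from the model settings in Section 2.5 purely for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_s, h_prev, c_prev, W, b):
    """One LSTM time step; W maps [h_prev, x_s] to four stacked gate pre-activations."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_s]) + b    # shape (4H,)
    f_s = sigmoid(z[0:H])                        # forget gate
    i_s = sigmoid(z[H:2 * H])                    # input gate
    o_s = sigmoid(z[2 * H:3 * H])                # output gate
    c_tilde = np.tanh(z[3 * H:4 * H])            # candidate memory cell state
    c_s = f_s * c_prev + i_s * c_tilde           # cell state update
    h_s = o_s * np.tanh(c_s)                     # hidden state output
    return h_s, c_s

rng = np.random.default_rng(0)
n_in, H = 5, 50
W = rng.normal(scale=0.1, size=(4 * H, H + n_in))   # random weights, illustration only
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(10, n_in)):                # ten synthetic time steps
    h, c = lstm_step(x, h, c, W, b)
print(h.shape, c.shape)
```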

2.3. CNN

CNN is a deep learning algorithm inspired by visual neurons, which is currently widely applied in fields such as image classification and data prediction. A traditional CNN model mainly consists of convolutional layer, pooling layer, and fully connected layer. The CNN model constructed in this paper is different from the traditional CNN model. Specifically, a batch normalization layer and a ReLU layer are added after the convolutional layer. In addition, after the pooling layer, another convolutional layer is used, followed by a batch normalization layer and a ReLU layer. A dropout layer is added before the fully connected layer.

2.4. Attention Mechanism

The attention mechanism has been widely used in fields such as data prediction and fault identification. This model can focus on critical information of the input data while filtering out irrelevant noise. The multi-head attention mechanism includes three matrices: Q (query), K (key), and V (value). The multi-head attention mechanism independently learns the transformations for Q, K, and V matrices, and then obtains j groups of linear projections through these transformations [20]. The formula can be written as [21]:
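$$\mathrm{MultiHeadSA}(Q,K,V)=\mathrm{Concat}\left(\mathrm{head}_{1},\ldots,\mathrm{head}_{k},\ldots,\mathrm{head}_{j}\right)R^{O}$$
$$\mathrm{head}_{k}=\mathrm{Attention}\left(QR_{k}^{Q},KR_{k}^{K},VR_{k}^{V}\right)$$
where $R_{k}^{Q}$, $R_{k}^{K}$, and $R_{k}^{V}$ are the parameter matrices for the query matrix, key matrix, and value matrix, respectively, and $R^{O}$ is the output projection matrix applied to the concatenated heads.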
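The two formulas above can be sketched in NumPy as follows. The text does not define the Attention function itself, so scaled dot-product attention is assumed here. The optional `mask` argument, the random weights, and the choice of splitting a 128-dimensional attention space across 4 heads (32 per head) are also illustrative assumptions.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)      # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V, mask=None):
    """Scaled dot-product attention; mask is True where attending is allowed."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    if mask is not None:
        scores = np.where(mask, scores, -1e9)    # blocked positions get near-zero weight
    return softmax(scores) @ V

def multi_head_sa(X, RQ, RK, RV, RO, mask=None):
    """Self-attention (Q = K = V = X) with one projection triple per head."""
    heads = [attention(X @ rq, X @ rk, X @ rv, mask)
             for rq, rk, rv in zip(RQ, RK, RV)]
    return np.concatenate(heads, axis=-1) @ RO

rng = np.random.default_rng(0)
T, d_model, n_heads = 6, 128, 4
d_head = d_model // n_heads
X = rng.normal(size=(T, d_model))
RQ, RK, RV = (rng.normal(scale=0.1, size=(n_heads, d_model, d_head)) for _ in range(3))
RO = rng.normal(scale=0.1, size=(n_heads * d_head, d_model))
out = multi_head_sa(X, RQ, RK, RV, RO)
print(out.shape)
```

Passing a lower-triangular boolean mask turns this into the masked variant, in which each time step can only attend to itself and earlier steps.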

2.5. Improved Transformer Model

This paper develops an improved Transformer model for battery SoC estimation, integrating convolution layer, LSTM layer, and multi-head attention mechanism algorithms. Figure 1 is the flowchart of the improved Transformer method constructed in this study. The calculation steps can be summarized as follows:
(1)
The processed experimental data are set as the input layer. In this study, the original experimental data are handled with three different schemes (original experimental data, Kalman filter, and Kalman filter–principal component analysis), yielding three corresponding datasets. Each of these three datasets is then used in turn as the input to this algorithm.
(2)
The position embedding layer is utilized to obtain position information.
(3)
The addition layer is employed to fuse the data from steps (1) and (2).
(4)
The convolution layer is used to extract local features of the data. The kernel size is set to 3 and the number of filters is selected as 64.
(5)
Two layers of attention mechanisms, namely a masked multi-head attention mechanism and a multi-head attention mechanism without masking, are used to adaptively capture the internal correlations. The number of attention heads in each of these two layers is set to 4, and the attention dimension of each layer is set to 128.
(6)
The LSTM layer is used to re-extract data features. The number of hidden units in the LSTM layer is selected as 50.
(7)
The fully connected layer is selected to integrate data features.
(8)
The regression layer is set as the output layer, which defines the loss function (mean squared error) to optimize the model parameters during training.
Figure 1. The flowchart of improved Transformer model.
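Steps (2) to (4) of the list above, together with the mask used by the masked attention layer in step (5), can be sketched in NumPy. This is only an illustration of the data flow under stated assumptions: the position embedding is represented by a random table standing in for learned values, the convolution uses zero padding so that the sequence length is preserved, and the sequence length of 20 and the helper names `conv1d_same` and `causal_mask` are hypothetical.

```python
import numpy as np

def conv1d_same(x, kernels, bias):
    """1-D convolution over time. x: (T, C_in); kernels: (K, C_in, C_out)."""
    K = kernels.shape[0]
    pad = K // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))          # zero padding keeps the length T
    out = np.stack([
        np.einsum("kc,kco->o", xp[t:t + K], kernels)
        for t in range(x.shape[0])
    ])
    return out + bias

def causal_mask(T):
    """Boolean mask that is True where time step t may attend to step u <= t."""
    return np.tril(np.ones((T, T), dtype=bool))

rng = np.random.default_rng(0)
T, n_features = 20, 5                              # assumed sequence length, 5-D input
x = rng.normal(size=(T, n_features))               # step (1): preprocessed input
pos_embed = 0.02 * rng.normal(size=(T, n_features))   # step (2): stand-in embedding table
fused = x + pos_embed                              # step (3): addition layer
kernels = 0.1 * rng.normal(size=(3, n_features, 64))  # kernel size 3, 64 filters
features = conv1d_same(fused, kernels, np.zeros(64))  # step (4): local features
mask = causal_mask(T)                              # used by the masked attention layer
print(features.shape, mask.shape)
```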

3. Experimental Results

Figure 2 illustrates the battery testing platform setup. The platform consists of a host computer equipped with a working condition setting interface and experimental testing monitoring software, a high-rate charge–discharge tester for batteries, two temperature test chambers, a lithium iron phosphate battery cell and a lithium iron phosphate battery pack, and an electric forklift. The manufacturer of temperature-controlled Chamber 1 is Chongqing Harding Environmental Test Technology Co., Ltd. (Chongqing, China), with model number V0-40A1A. The manufacturer of temperature-controlled Chamber 2 is Jufu Instrument Industry Co., Ltd. (Dongguan, China), with model number ECT-360DU-40-SSP-AR. The manufacturer of the battery cell charge–discharge test system is Shenzhen Neware Electronics Co., Ltd. (Shenzhen, China), with model number 5V300A-4CH.
The procedures for obtaining the experimental data of batteries of an electric forklift can be summarized as follows:
(1)
The first step is to conduct a performance test on the single battery. The main equipment includes two temperature-controlled chambers and a cell charge–discharge test system. The testing pipeline is as follows: The host computer transmits preset working conditions of batteries to the tester via a TCP/IP communication interface. The tester then executes the corresponding configuration for the batteries. The lithium iron phosphate batteries used for experimental testing are placed in two independent temperature chambers to compare the impact of different temperatures on the battery’s charge–discharge performance. The experimental test monitor displays and records the battery’s real-time operating parameters, such as current, voltage, and battery temperature.
(2)
The second step is to test the battery pack on a real vehicle. The individual battery cells are assembled into a battery pack, which is then installed on an electric forklift. All experimental data are collected by the Battery Management System (BMS) collector. In this paper, 3.2 V 280 Ah lithium iron phosphate batteries are connected in a 2-parallel and 24-series configuration to form a 76.8 V 560 Ah battery pack, with an energy capacity of 43 kWh when fully charged. This battery pack is mounted on a 3.5-ton forklift, enabling the equipment to operate continuously for 6 h. In Step (1), a single battery is tested under high- and low-temperature environments as well as high-current charge–discharge conditions, so as to determine the boundary conditions for the overall operation of the battery pack after assembly. According to the design parameters of the forklift’s electric drive system, the maximum power of its traction motor is 8 kW and that of the lifting motor is 12 kW. Based on an estimate under full-load lifting conditions, the maximum discharge rate of the battery is less than 1 C. Therefore, the battery pack designed and developed through Step (1) and Step (2) meets the requirements for the vehicle’s operation under all working conditions.
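The pack figures quoted in Step (2) can be checked with a few lines of arithmetic. The sketch below is a rough estimate under simplifying assumptions that are not stated in the text: the nominal pack voltage is used, conversion losses are ignored, and the traction and lifting motors are taken to draw their maximum power at the same time as a worst case.

```python
cell_voltage_v = 3.2
cell_capacity_ah = 280.0
n_parallel, n_series = 2, 24

pack_voltage_v = n_series * cell_voltage_v             # nominally 76.8 V
pack_capacity_ah = n_parallel * cell_capacity_ah       # 560 Ah
pack_energy_kwh = pack_voltage_v * pack_capacity_ah / 1000.0   # about 43 kWh

# Assumed worst case: both motors at their maximum power simultaneously.
peak_power_kw = 8.0 + 12.0
peak_current_a = peak_power_kw * 1000.0 / pack_voltage_v
c_rate = peak_current_a / pack_capacity_ah              # discharge rate in units of C

print(f"{pack_voltage_v:.1f} V, {pack_capacity_ah:.0f} Ah, {pack_energy_kwh:.1f} kWh")
print(f"peak current {peak_current_a:.0f} A, discharge rate {c_rate:.2f} C")
```

Even under this conservative assumption the discharge rate stays below 0.5 C, which is consistent with the statement that the maximum discharge rate is less than 1 C.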
Through experimental tests, 12-dimensional input experimental data is obtained, which includes the battery pack inner voltage, the battery pack outer voltage, the battery system current, the cumulative charging energy, the cumulative discharging capacity, the cumulative discharging energy, the cumulative charging capacity, the single charging energy, the total voltage of individual cells, the maximum individual cell voltage, the minimum individual cell voltage, and the average individual cell voltage.

4. Results and Discussion

4.1. Evaluation Metrics

To clearly compare the performance of different estimation models, MAE, RMSE, and R-squared (R2) are adopted to evaluate the prediction accuracy of different algorithms. The calculation formulas can be expressed as follows:
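$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|SM_{i}-E_{i}\right|$$
$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(SM_{i}-E_{i}\right)^{2}}$$
$$R^{2}=1-\frac{\sum_{i=1}^{n}\left(SM_{i}-E_{i}\right)^{2}}{\sum_{i=1}^{n}\left(E_{i}-\bar{E}\right)^{2}}$$
where $SM_{i}$ denotes the i-th estimation result obtained by a given estimation model, $E_{i}$ represents the i-th experimental result, $\bar{E}$ is the average value of the experimental results, and $n$ is the number of samples.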
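The three metrics translate directly into code. The following sketch uses plain Python; the function names and the four sample values are made up for illustration and are not taken from the experiments.

```python
import math

def mae(sm, e):
    """Mean absolute error between estimates sm and experimental values e."""
    return sum(abs(s - x) for s, x in zip(sm, e)) / len(e)

def rmse(sm, e):
    """Root mean square error between estimates sm and experimental values e."""
    return math.sqrt(sum((s - x) ** 2 for s, x in zip(sm, e)) / len(e))

def r2(sm, e):
    """Coefficient of determination of estimates sm against experimental values e."""
    e_bar = sum(e) / len(e)
    ss_res = sum((s - x) ** 2 for s, x in zip(sm, e))
    ss_tot = sum((x - e_bar) ** 2 for x in e)
    return 1.0 - ss_res / ss_tot

experimental = [90.0, 80.0, 70.0, 60.0]    # illustrative SoC values in percent
estimated = [91.0, 79.0, 72.0, 60.0]       # illustrative model outputs
print(mae(estimated, experimental), rmse(estimated, experimental), r2(estimated, experimental))
```

Because the errors are squared before averaging, RMSE penalizes a few large deviations more heavily than MAE does, which is why the two metrics can rank models differently.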

4.2. Comparison with Different Algorithms

4.2.1. Data Preprocessing

Accurate processing of experimental data enables effective training of deep learning models. In this section, three different methods are adopted to process the original experimental data. The first method directly uses the original experimental data as input to train the improved Transformer model, where a total of 12 dimensions of experimental data are selected. The second method employs the Kalman filter method to process the experimental data and then trains the improved Transformer model with the filtered data. The third method first processes the experimental data using the Kalman filter, then performs dimensionality reduction via the PCA method, and finally trains the improved Transformer model with the five-dimensional data. 70% of the experimental data are used as the training set to train the neural network, while the remaining 30% are employed to verify the accuracy of the model.
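The third scheme (Kalman filter, then PCA, then a 70/30 split) can be sketched as follows. Many details are assumptions because the text does not specify them: each channel is filtered independently with a scalar Kalman filter using a random-walk state model, the noise variances `q` and `r` are arbitrary, the split is chronological, and the PCA projection is fitted on the training part only. The data are synthetic.

```python
import numpy as np

def kalman_smooth_1d(z, q=1e-4, r=1e-2):
    """Scalar Kalman filter with a random-walk state model (q and r are assumed)."""
    x_hat, p = z[0], 1.0
    out = np.empty(len(z))
    for k, zk in enumerate(z):
        p = p + q                          # predict: state unchanged, variance grows
        g = p / (p + r)                    # Kalman gain
        x_hat = x_hat + g * (zk - x_hat)   # correct with the new measurement
        p = (1.0 - g) * p
        out[k] = x_hat
    return out

def preprocess(X, d=5, train_frac=0.7):
    """Filter every column, split chronologically, then project onto d components."""
    filtered = np.column_stack(
        [kalman_smooth_1d(X[:, j]) for j in range(X.shape[1])]
    )
    n_train = int(round(train_frac * len(filtered)))
    train, test = filtered[:n_train], filtered[n_train:]
    mean = train.mean(axis=0)              # PCA statistics from the training part only
    _, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
    R = Vt[:d].T
    return (train - mean) @ R, (test - mean) @ R

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 1000)[:, None]
X = (1.0 - t) * rng.normal(size=(1, 12)) + 0.05 * rng.normal(size=(1000, 12))
train, test = preprocess(X)
print(train.shape, test.shape)
```

Fitting the projection on the training part alone keeps information from the held-out 30% out of the preprocessing step; whether the original study did this is not stated.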
Figure 3 shows a comparison of the prediction performance of the improved Transformer model with these three different data processing methods. The x-axis in Figure 3 represents the sequence number of the predicted samples. As shown in Figure 3, when the improved Transformer model is trained on the original data, there are clear discrepancies between the SoC prediction results and the experimental values. The experimental SoC decreases continuously as usage time increases, while the predicted value remains at about 92% throughout, as shown in Figure 3a. The improved Transformer model therefore struggles to identify the inherent patterns of the original experimental data. After the original experimental data are processed by the Kalman filter and Kalman filter–PCA methods, the improved Transformer model successfully learns the inherent patterns of the data, with its prediction results showing good consistency with the experimental values, as shown in Figure 3b,c. This indicates that the model effectively captures the decay behavior of battery SoC. It can be concluded that these two data processing methods, namely the standalone Kalman filter and the Kalman filter–PCA method, are effective in preprocessing experimental data for battery SoC estimation.
It is difficult to determine from Figure 3b,c which algorithm is better. Therefore, the MAE, RMSE, and R2 are calculated for the different processing methods to compare their performance clearly. Table 1 shows the results (MAE, RMSE, and R2) for the Kalman filter and Kalman filter–PCA methods. As shown in the table, the MAE and RMSE of the improved Transformer model trained on data from the Kalman filter–PCA method are both lower than those of the model trained on data from the single Kalman filter method. Compared to the single Kalman filter method, the Kalman filter–PCA algorithm yields better SoC prediction performance, reducing MAE and RMSE by 26.32% and 27.73%, respectively.
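The percentage reductions quoted here are presumably relative changes with respect to the baseline method. A minimal sketch of that calculation is given below; the function name is hypothetical and the numbers are placeholders, not values from Table 1.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline

# Placeholder error values, not taken from Table 1.
print(relative_reduction(2.0, 1.5))
```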
Next, the PCA method is employed to reduce the dimensionality of the original 12-dimensional experimental data to different target dimensions. Then the preprocessed experimental data of different dimensions are used to train the improved Transformer model. Table 2 presents the quantitative analysis results of the corresponding predictions. As indicated in the table, there are significant differences in the model’s prediction performance depending on the different dimensionalities of the input data. In terms of prediction metrics in Table 2, the MAE and RMSE from training with five-dimensional data are lower than those obtained with three-dimensional and seven-dimensional data. Compared with the three-dimensional and seven-dimensional data, the MAE for the five-dimensional data decreases by 14.63% and 19.54%, respectively, while the RMSE decreases by 14.85% and 20.37%, respectively. In conclusion, five-dimensional data are superior to the three-dimensional and seven-dimensional data for battery SoC prediction.

4.2.2. Analysis of the Impact of Single-Layer Removal on Model Prediction Performance

The improved Transformer model proposed in this paper consists of multiple layers, and this section focuses on discussing the impact of different layers on SoC estimation performance. Table 3 presents the quantitative analysis results of SoC prediction using different model structures. It can be observed from the table that the R2 of each model is greater than 0.9, indicating that all these different models can effectively predict SoC. In terms of training efficiency, the training time is the shortest when the convolution layer is removed. It can be found that the convolution layer accounts for the longest training time in the original model. Removing one layer from the original improved algorithm does not lead to significant changes in SoC prediction for models No. 1, No. 3, and No. 4. However, compared with No. 1, No. 3, and No. 4, the MAE and RMSE of the No. 2 algorithm are both significantly lower. It can be concluded that the removal of the LSTM layer enables the model to predict SoC more accurately. Therefore, the No. 2 algorithm is adopted for the subsequent research and is named the improved Transformer 2 method.

4.2.3. Discussion for Different Estimation Models

The LSTM, CNN, and improved Transformer 2 models are used to predict SoC using the same five-dimensional data set. Figure 4 compares the prediction performance of the different estimation models. The x-axis in Figure 4 represents the sequence number of the predicted samples. The figures show that the prediction accuracy of the three methods differs. Compared with the improved Transformer 2 model, the predictions of the LSTM and CNN models fluctuate at certain points. The LSTM algorithm shows the worst prediction performance in the first 18 samples. After sample No. 688, the fluctuation in the CNN algorithm’s predictions is more obvious than that in the LSTM algorithm’s predictions. The prediction results of the improved Transformer 2 model show hardly any significant deviation from the experimental values. It can be concluded that the improved Transformer 2 model is more accurate than the LSTM and CNN algorithms in SoC estimation.
Figure 5 shows the deviation between the experimental values and the prediction results of the different estimation models. Figure 5a shows that the LSTM model has high accuracy for most data, but there is an obvious deviation in the high SoC range. Figure 5b shows that the CNN model can effectively capture the overall trend of SoC change, but its accuracy at local points is poor, resulting in certain deviations at some points. Compared with the models in Figure 5a,b, the improved Transformer 2 model in Figure 5c has the best agreement with the experimental values, achieving accurate prediction across the entire range with very small errors and an almost linear relationship between the predicted and experimental values.
Table 4 presents a quantitative comparison of the prediction results obtained by the LSTM, CNN, and improved Transformer 2 models on the same data set. It can be observed from the table that the improved Transformer 2 model achieves the smallest MAE and RMSE, while the LSTM model yields the largest values. This is primarily because the LSTM model exhibits a large deviation in predicting high SoC values. The improved Transformer 2 model provides a more accurate SoC estimation than the other algorithms. Compared with the LSTM model, the improved Transformer 2 model reduces the MAE by 77.16% and the RMSE by 91.75%. In contrast to the CNN model, it decreases the MAE by 71.81% and the RMSE by 80%.

5. Conclusions

This paper proposes a state of charge (SoC) estimation method based on the improved Transformer model. First, a SoC data set of an electric forklift is obtained through experimental testing. The Kalman filter is adopted to filter the experimental data, then the PCA method is used to reduce the dimensionality of the filtered data. A convolutional layer is employed to extract local features. The data correlations are captured using the masked multi-head attention mechanism and unmasked multi-head attention mechanism. Two improved Transformer models, namely the improved Transformer algorithm and the improved Transformer 2 algorithm, are developed in this study. The difference between the two algorithms is that the former incorporates the LSTM layer. The conclusions can be summarized as follows:
(1)
Since the data preprocessing methods can affect the prediction accuracy of different estimation models, three methods are employed in this paper to compare the prediction performance on SoC estimation. Training the improved Transformer model using original experimental data proves ineffective in learning the laws of SoC. Compared to the single Kalman filter method, the Kalman filter–PCA algorithm achieves a higher prediction accuracy, where the MAE and RMSE are decreased by 26.32% and 27.73%, respectively. It can be found that compared with the single Kalman filter method, the Kalman filter–PCA algorithm can more effectively improve the prediction accuracy of the improved Transformer model in SoC prediction.
(2)
PCA is employed to reduce the dimensionality of the original 12-dimensional experimental data. The results indicate that different dimensionalities have an impact on the prediction performance of the improved Transformer model. Compared to three-dimensional and seven-dimensional data, the five-dimensional data reduce MAE by 14.63% and 19.54%, and RMSE by 14.85% and 20.37%, respectively. It is evident that the selection of data dimensions affects the prediction performance of neural networks, and that appropriate dimensions can effectively train the surrogate models, resulting in high prediction accuracy.
(3)
Three different models are derived from the initial improved Transformer model. These three models are obtained by removing the convolution layer, LSTM layer, and attention layer, respectively. The results show that different Transformer network structures achieve varying prediction accuracy for SoC estimation. The improved Transformer 2 model with its LSTM layer removed achieves the highest SoC prediction accuracy compared to the other three models.
(4)
The Kalman filter–PCA method is used to preprocess data for training the LSTM, CNN, and improved Transformer 2 models, which are then utilized to predict SoC. Compared to the single LSTM model, the improved Transformer 2 model achieves a significant reduction in the MAE (77.16%) and RMSE (91.75%) of the prediction results. Compared with the single CNN model, the proposed improved Transformer 2 model can decrease the MAE and RMSE by 71.81% and 80%, respectively. It can be observed that the improved Transformer 2 model developed in this study can significantly improve the prediction accuracy of the SoC for the current electric forklift under real operating conditions.

Author Contributions

Conceptualization, J.W., S.Z. and X.H.; methodology, S.Z.; validation, S.Z.; formal analysis, S.Z.; investigation, J.W., S.Z. and X.H.; resources, J.W.; data curation, J.W. and X.H.; writing—original draft preparation, S.Z. and X.H.; writing—review and editing, J.W.; visualization, S.Z.; supervision, J.W.; project administration, J.W.; funding acquisition, J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Research Funds of Suzhou University of Technology, grant number KYZ2019006Q.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

Author Xia Hu was employed by the company Shanghai Yiri Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Xie, Y.; Han, Z.; Zhang, J.; Liu, L. A physics-constrained Informer-LSTM network for battery state-of-charge estimation. J. Energy Storage 2025, 136, 118381. [Google Scholar] [CrossRef]
  2. Chen, S. State of Power Prediction for Parallel Battery System Considering Cell Inconsistency Degree; Yancheng Institute of Technology: Yancheng, China, 2025. [Google Scholar]
  3. Oyewole, I.; Chehade, A.; Kim, Y. A controllable deep transfer learning network with multiple domain adaptation for battery state-of-charge estimation. Appl. Energy 2022, 312, 118726. [Google Scholar] [CrossRef]
  4. Jafari, S.; Kim, J.; Byun, Y. A novel fusion-based deep learning approach with PSO and explainable AI for batteries State of Charge estimation in Electric Vehicles. Energy Rep. 2024, 12, 3364–3385. [Google Scholar] [CrossRef]
  5. Kim, S.; Lee, S. Application of deep learning techniques for the state of charge prediction of lithium-Ion batteries. Appl. Sci. 2024, 14, 8077. [Google Scholar] [CrossRef]
  6. Franzese, P.; Iannuzzi, D.; Merolla, R.; Spina, I. Artificial neural networks for residual capacity estimation of cycle-aged cylindric LFP batteries. Batteries 2025, 11, 260. [Google Scholar] [CrossRef]
  7. Ding, Z.; Hu, D.; Jing, Y.; Ma, M.; Xie, Y.; Yin, Q.; Zeng, X.; Zhang, C.; Peng, T.; Ji, J. Research on precise lithium battery state of charge estimation method based on CALSE-LSTM model and pelican algorithm. Heliyon 2024, 10, e36232. [Google Scholar] [CrossRef]
  8. Song, X.; Yang, F.; Wang, D.; Tsui, K. Combined CNN-LSTM network for state-of-charge estimation of lithium-ion batteries. IEEE Access 2019, 7, 88894–88902. [Google Scholar] [CrossRef]
  9. Van, C.N.; Ngo, M.D.; Duc, C.D.; Thao, L.Q.; Ahn, S.J. State-of-charge estimation of lithium-ion battery integrated in electrical vehicle using a long short-term memory network. IEEE Access 2024, 12, 165472–165481. [Google Scholar]
  10. Shen, Y.; Ge, Y. Prediction of state of charge for lead-acid battery based on LSTM-attention and LightGBM. J. Comput. Inf. Sci. Eng. 2024, 24, 9. [Google Scholar] [CrossRef]
  11. Leonori, S.; Mostacciuolo, E.; Baccari, S.; Luzio, F.D. An advanced Li-ion cell equivalent circuit model using a neuro-physical approach. J. Energy Storage 2025, 139, 118623. [Google Scholar] [CrossRef]
  12. Lyu, L.; Jiang, B.; Zhu, J.; Wei, X.; Dai, H. An adaptive combined method for lithium-ion battery state of charge estimation using long short-term memory network and unscented kalman filter considering battery aging. Batter. Supercaps 2024, 7, e202400441. [Google Scholar] [CrossRef]
  13. Paul, T.; Wang, S.; Zhang, H.; Yang, X.; Fernandez, C. An optimized long short-term memory-weighted fading extended Kalman filtering model with wide temperature adaptation for the state of charge estimation of lithium-ion batteries. Appl. Energy 2022, 326, 120043. [Google Scholar]
  14. Zhang, X.; Huang, Y.; Zhang, Z.; Lin, H.; Zeng, Y.; Gao, M. A hybrid method for state-of-charge estimation for lithium-ion batteries using a long short-term memory network combined with attention and a kalman filter. Energies 2022, 15, 6745. [Google Scholar]
  15. Hu, P.; Tsang, C.; Lu, X.; Li, C.; Lee, C. Enhancing electric wheelchair safety via battery state of charge estimation with PCC–NSSR–LSTM method. Electron. Lett. 2025, 61, e70228. [Google Scholar] [CrossRef]
  16. Turk, M.; Pentland, A. Eigenfaces for recognition. J. Cogn. Neurosci. 1991, 3, 71–86. [Google Scholar] [CrossRef] [PubMed]
  17. Wang, Q. Study of Obust Principal Component Analysis and Its Applications. Ph.D. Thesis, Xidian University, Xi’an, China, 2019. [Google Scholar]
  18. Sun, L.; Qin, H.; Przystupa, K.; Majka, M.; Kochan, O. Individualized short-term electric load forecasting using data-driven meta-heuristic method based on LSTM network. Sensors 2022, 22, 7900. [Google Scholar] [CrossRef] [PubMed]
  19. Yu, T.; Zeng, X.; Feng, E.; Huang, J.; Zhang, G. A joint estimation of SOC-SOH for lithium batteries based on LSTM-Transformer multi-channel feature fusion. J. Railw. Sci. Eng. 2025, 43, 1423. [Google Scholar]
  20. Wang, M.; Gao, Q.; Feng, Y.; Zhao, N.; Chen, J.; Lü, C. Research on Bearing Fault Diagnosis via Multi-Feature Fusion Transformer-LSTM. Mech. Sci. Technol. Aerosp. Eng. 2025, 12, 1–11. [Google Scholar]
  21. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
Figure 2. Experimental setup diagram.
Figure 3. Comparison of prediction effect using different data preprocessing methods. (a) Prediction effect based on original experimental values. (b) Prediction effect based on Kalman filter method. (c) Prediction effect based on Kalman filter–PCA method.
Figure 4. Comparison of prediction performance using different evaluation models. (a) LSTM model. (b) CNN model. (c) Improved Transformer 2 model.
Figure 5. The deviation graph of experimental values and prediction results using different evaluation models. (a) LSTM model. (b) CNN model. (c) Improved Transformer 2 model.
Table 1. Comparison of performance using different data preprocessing methods.
| Evaluation Metrics | Kalman Filter | Kalman Filter–PCA |
|---|---|---|
| MAE | 0.095 | 0.070 |
| RMSE | 0.119 | 0.086 |
| R² | 0.999532 | 0.999754 |
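
For reference, the three evaluation metrics used in the tables can be computed as in the following Python sketch. It assumes the standard definitions of MAE, RMSE, and the coefficient of determination R²; the function name `regression_metrics` and the sample SoC values are hypothetical and are not taken from the experimental data.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R² under their standard definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    mae = np.mean(np.abs(residual))
    rmse = np.sqrt(np.mean(residual**2))
    # R² compares the residual sum of squares with the total sum of squares.
    ss_res = np.sum(residual**2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "R2": r2}


# Hypothetical measured and predicted SoC values, for illustration only.
soc_measured = [80.0, 79.5, 79.0, 78.4, 78.0]
soc_predicted = [80.1, 79.4, 79.1, 78.5, 77.9]

metrics = regression_metrics(soc_measured, soc_predicted)
print(metrics)
```

Because RMSE squares the residuals before averaging, it penalizes occasional large deviations more strongly than MAE, which is why the two metrics can improve by different percentages for the same model.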
Table 2. Comparison of performance using different data dimensions.
| Evaluation Metrics | Three Dimensions | Five Dimensions | Seven Dimensions |
|---|---|---|---|
| MAE | 0.082 | 0.070 | 0.087 |
| RMSE | 0.101 | 0.086 | 0.108 |
| R² | 0.999665 | 0.999754 | 0.999614 |
Table 3. Comparison of performance using different Transformer structures.
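| No. | Models | MAE | RMSE | R² | Computational Time |
|---|---|---|---|---|---|
| 1 | Delete convolution layer | 0.077 | 0.095 | 0.999701 | 75 s |
| 2 | Delete LSTM layer | 0.053 | 0.064 | 0.999864 | 197 s |
| 3 | Delete attention layer | 0.075 | 0.092 | 0.999722 | 142 s |
| 4 | Improved Transformer | 0.070 | 0.086 | 0.999754 | 300 s |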
Table 4. Comparison of performance using different models.
| Evaluation Metrics | LSTM | CNN | Improved Transformer 2 |
|---|---|---|---|
| MAE | 0.232 | 0.188 | 0.053 |
| RMSE | 0.776 | 0.320 | 0.064 |
| R² | 0.980110 | 0.996613 | 0.999864 |
| Computational time | 45 s | 111 s | 197 s |
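
The percentage reductions quoted for the improved Transformer 2 model can be reproduced from the MAE and RMSE values in Table 4. The short Python check below assumes the reduction is computed as (baseline − proposed) / baseline; the helper name `relative_reduction` is introduced here only for illustration.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline * 100.0


# MAE and RMSE values as reported in Table 4.
table4 = {
    "LSTM": {"MAE": 0.232, "RMSE": 0.776},
    "CNN": {"MAE": 0.188, "RMSE": 0.320},
    "Improved Transformer 2": {"MAE": 0.053, "RMSE": 0.064},
}

proposed = table4["Improved Transformer 2"]
reductions = {
    baseline: {
        metric: round(relative_reduction(table4[baseline][metric], proposed[metric]), 2)
        for metric in ("MAE", "RMSE")
    }
    for baseline in ("LSTM", "CNN")
}

for baseline, values in reductions.items():
    print(f"vs {baseline}: MAE -{values['MAE']}%, RMSE -{values['RMSE']}%")
```

The results agree with the reductions stated in the conclusions (77.16% and 91.75% against LSTM, 71.81% and 80% against CNN). Table 4 also shows that this accuracy gain comes with a longer computational time (197 s against 45 s for LSTM and 111 s for CNN).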
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Wang, J.; Zhang, S.; Hu, X. Research on the State of Charge Estimation of Electric Forklift Batteries Based on an Improved Transformer Model. Batteries 2026, 12, 41. https://doi.org/10.3390/batteries12020041
